Deep learning for coders with fastai and pytorch PDF
Neural Networks
Now let's look into the brain of our AI.
Like an animal's brain, the brain of our AI is made up of neurons. In diagrams they are usually drawn as circles, with lines showing how the neurons are interconnected.
Neurons are grouped into three different types of layers:
Input layer.
Hidden layer(s).
Output layer.
The input layer receives the raw input data. In our case, it has four neurons: origin airport, destination airport, departure date, and airline. The input layer passes these inputs to the first hidden layer.
The hidden layers perform mathematical calculations on our inputs. One of the challenges in creating neural networks is deciding how many hidden layers to use and how many neurons to put in each one. The "deep" in Deep Learning refers to having more than one hidden layer.
The output layer returns a result (output data). In our case it is a prediction of the ticket price.
So, how does it predict prices?
This is where the magic of deep learning begins. Each connection between neurons is associated with a weight. This weight determines the importance of the input value. The initial weights are set randomly.
When predicting airfare, the departure date is one of the more significant factors. Consequently, the connections coming from the departure date neuron will carry a large weight.
Each neuron has an activation function. These functions are difficult to understand without mathematical reasoning. Simply put, one of their purposes is to "standardize" the output of a neuron. Once a set of input data has passed through all the layers of the neural network, the result comes out through the output layer.
Nothing complicated, right?
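To make the flow from inputs to output more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how a real library does it: the numeric encoding of the four inputs, the three hidden neurons, the sigmoid activation, and the bias terms are all choices made for this example and are not specified in the text above.

```python
import math
import random

def sigmoid(x):
    """Activation function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def identity(x):
    """No activation: used on the output neuron so the price is unbounded."""
    return x

def dense_layer(inputs, weights, biases, activation):
    """Each neuron takes a weighted sum of all inputs, adds a bias
    (an assumption of this sketch), and applies the activation function."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs

def random_layer(n_inputs, n_neurons, rng):
    """Initial weights are set randomly; biases start at zero."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical numeric encoding of the four inputs:
# origin airport, destination airport, departure date, airline.
inputs = [0.2, 0.7, 0.5, 0.1]

hidden_w, hidden_b = random_layer(4, 3, rng)  # input layer -> hidden layer
out_w, out_b = random_layer(3, 1, rng)        # hidden layer -> output layer

hidden = dense_layer(inputs, hidden_w, hidden_b, sigmoid)
prediction = dense_layer(hidden, out_w, out_b, identity)[0]
print(prediction)  # meaningless for now: the weights are still random
```

Because the weights are random, the printed number has nothing to do with a real ticket price yet. Making it meaningful is the job of training.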
Neural Network Training
Training the AI is the hardest part of Deep Learning. Why?
You need a large set of data.
You need a lot of processing power.
To estimate airfare, you need to find historical data on ticket prices. Because of the large number of possible airports and combinations of departure dates, you will need a huge list of prices.
To train the AI, we feed it the inputs from our dataset and compare its outputs with the known outputs in the dataset. Since the AI is not yet trained, its outputs will be wrong. Once the entire dataset has been traversed, we can create a function that shows how much the AI's outputs differ from the real outputs. This function is called a cost function (also known as a loss function).
Ideally, we want the cost function to be zero: that is when the outputs of our AI match the outputs in the dataset exactly.
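As a rough illustration, one common choice of cost function is the mean squared error. The text above does not say which cost function is used, so treat this choice, and the made-up prices below, as assumptions of the sketch.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and real values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Made-up ticket prices, for illustration only.
actual_prices = [120.0, 340.0, 95.0]
untrained_outputs = [100.0, 300.0, 130.0]  # an untrained AI guesses badly
perfect_outputs = [120.0, 340.0, 95.0]     # the ideal case

print(mean_squared_error(untrained_outputs, actual_prices))  # 1075.0
print(mean_squared_error(perfect_outputs, actual_prices))    # 0.0
```

The further the outputs are from the real prices, the larger the cost. It reaches zero only when every output matches.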
How do we reduce the cost function?
We change the weights. We could change them randomly until the cost function stops decreasing, but that is not a very efficient approach.
Instead, we will use the gradient descent method.
Gradient descent is a method for finding a minimum of a function, in our case the cost function. It works by changing the weights slightly after each pass over the dataset. By calculating the derivative (or gradient) of the cost function at the current set of weights, we can see which direction leads downhill toward the minimum, and we nudge the weights in that direction.
To minimize the cost function, you have to iterate through the data many times. That's why you need a lot of processing power. Updating the weights with gradient descent is done automatically. This is exactly the magic of Deep Learning!
Once we've trained our airfare estimator, we can use it to predict future prices.
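The idea can be sketched on a deliberately tiny model with a single weight. Everything here is an assumption made for illustration: the model (output = weight * input), the toy data, the mean squared error cost, the learning rate, and the number of passes. A real network has many weights, but each is updated in the same spirit.

```python
def cost(weight, xs, ys):
    """Mean squared error of the one-weight model: output = weight * x."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cost_gradient(weight, xs, ys):
    """Derivative of the cost with respect to the weight:
    d/dw mean((w*x - y)^2) = mean(2 * x * (w*x - y))."""
    return sum(2 * x * (weight * x - y) for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, weight=0.0, learning_rate=0.05, epochs=200):
    """Repeatedly move the weight a small step against the gradient."""
    for _ in range(epochs):
        weight -= learning_rate * cost_gradient(weight, xs, ys)
    return weight

# Toy data generated by the rule y = 3 * x, so the best weight is 3.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 6.0, 9.0]

trained_weight = gradient_descent(xs, ys)
print(trained_weight)                # very close to 3
print(cost(0.0, xs, ys))             # cost before training
print(cost(trained_weight, xs, ys))  # cost after training, close to 0
```

The learning rate controls the size of each step. If it is too large, the updates can overshoot the minimum. If it is too small, many more passes over the data are needed.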
More Information
There are many other types of neural networks, for example convolutional networks for computer vision (image recognition) and recurrent networks for text or speech recognition.
If you want to learn the technical aspect of Deep Learning, you can take an online course. Currently, one of the best courses for Deep Learning is a set of lessons from Andrew Ng. If you're not interested in getting certified, you don't need to pay for this course. Read more about it in our article.