Artificial Neural Networks: Mathematics of Backpropagation (Part 4)

Up until now, we haven't used any of the expressive non-linear power of neural networks: all of our simple one-layer models corresponded to a linear model such as multinomial logistic regression. These one-layer models had a simple derivative. We only had one set of weights that fed directly to our output, and it was easy to compute the derivative with respect to these weights. However, what happens when we want to use a deeper model? What happens when we start stacking layers?


Artificial Neural Networks: Linear Multiclass Classification (Part 3)

In the last section, we went over how to use a linear neural network to perform classification. We covered using both the perceptron algorithm and gradient descent with a sigmoid activation function to learn the placement of the decision boundary in our feature space. However, we only covered binary classification. What if we instead want to classify a point as belonging to one of $K$ classes?
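As a rough illustration of the $K$-class setting, the sketch below turns $K$ raw scores into class probabilities with a softmax and picks the most probable class. The softmax is an assumption here (it is the usual choice for multinomial logistic regression, not something this excerpt spells out), and the names `softmax`, `scores`, and `predicted_class` are hypothetical.

```python
import math

def softmax(scores):
    """Map K raw scores to K probabilities that sum to 1."""
    # Subtracting the max score avoids overflow in exp() without changing the result.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a single point and K = 3 classes.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted class is the index with the largest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

With $K = 2$ this reduces to the sigmoid used for binary classification, which is why the multi-class case can be seen as a direct generalization.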


Artificial Neural Networks: Linear Classification (Part 2)

So far we've covered using neural networks to perform linear regression. What if we want to perform classification using a single-layer network?  In this post, I will cover two methods: the perceptron algorithm and using a sigmoid activation function to generate a likelihood. I will not cover the delta rule because it is a special case of the more general backpropagation algorithm, which will be covered in detail in Part 4.
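The two methods can be sketched in a few lines. This is a minimal sketch, assuming labels in {-1, +1}, a fixed number of passes over the data, and a tiny made-up dataset; `perceptron_train`, `sigmoid`, and `predict` are hypothetical names, not code from the post.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into (0, 1) so it can be read as a likelihood."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron_train(points, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: update the weights only on misclassified points."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # wrong side of (or on) the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy, linearly separable data (made up for illustration).
points = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

w, b = perceptron_train(points, labels)
predictions = [predict(w, b, x) for x in points]
```

The perceptron only reports a hard decision, whereas passing the same linear score through `sigmoid` gives a value between 0 and 1 that can be trained with gradient descent.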


Artificial Neural Networks: Linear Regression (Part 1)

Artificial neural networks (ANNs) were originally devised in the mid-20th century as a computational model of the human brain. Their use waned because of the limited computational power available at the time, and because of some theoretical issues that weren't solved for several decades (which I will detail at the end of this post). However, they have experienced a resurgence with the recent interest and hype surrounding Deep Learning. One of the more famous examples of Deep Learning is the "YouTube Cat" paper by Quoc Le, Andrew Ng, and colleagues.

It is theorized that because of their biological inspiration, ANN-based learners will be able to emulate how a human learns to recognize concepts or objects without the time-consuming feature engineering step. Whether or not this is true (or even provides an advantage in terms of development time) remains to be seen, but currently it's important that we machine learning researchers and enthusiasts have a familiarity with the basic concepts of neural networks.

This post covers the basics of ANNs, namely single-layer networks. We will cover three applications: linear regression, two-class classification using the perceptron algorithm, and multi-class classification.
