Whoever you are and whatever you do, it is worth learning at least the basics of machine learning (ML). It is growing more popular by the day, and you can see it applied in many things we do in everyday life. Here you will learn the very basics of machine learning so you can start to get a good idea of where it is already used and what problems it can solve.
Let’s start with Wikipedia’s definition of machine learning: “Machine Learning (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence.”
To understand ML better, we need to break it down into simpler parts, define each one, and see how they all come together as a whole. Before we go deeper, I’m gonna tell you what ML is in a nutshell.
Machine learning in a nutshell
- First, it finds patterns in data.
- Second, it captures those patterns in a trained model that is able to recognize them.
- Then, applications supply new data to see if it matches the known patterns.
- Finally, the model uses those patterns to predict the future.
Basically, it “learns” by looking at lots of data that contains patterns, and it finds those patterns by using a machine learning algorithm. The trained model is the output of that learning phase, and it is then used to make predictions. Easy, right? Let’s now dive a bit deeper.
The machine learning workflow
Pick your problem
The first thing you need to do is identify the type of problem you want to solve. Most types of problems in ML can be classified into one of these categories:
- Classification
- Regression
- Clustering
- Recommendation
You need to settle on the type of problem (and therefore the algorithm you will use) first, since the data must be represented in a specific format depending on the algorithm.
Next, you need lots of historical data in order to train your machine learning model to predict accurately. Once you have the data, you then need to be able to convert it into meaningful numeric attributes that represent the data. This is a crucial step in the machine learning process.
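To make "numeric attributes" a bit more concrete, here is a minimal sketch in Python. The records, the field names, and the `encode` helper are all hypothetical and made up for illustration. It gives each city its own 0/1 column (often called one-hot encoding) and turns the remaining fields into plain numbers.

```python
# Hypothetical raw records; field names and values are made up for illustration.
records = [
    {"city": "Paris", "rooms": 3, "has_garden": True},
    {"city": "Lima", "rooms": 2, "has_garden": False},
    {"city": "Paris", "rooms": 1, "has_garden": False},
]

# Collect the distinct categories so each one gets its own 0/1 column.
cities = sorted({r["city"] for r in records})


def encode(record):
    """Turn one record into a list of numbers (a feature vector)."""
    one_hot = [1.0 if record["city"] == c else 0.0 for c in cities]
    return one_hot + [float(record["rooms"]), float(record["has_garden"])]


feature_rows = [encode(r) for r in records]
for row in feature_rows:
    print(row)
```

Real projects usually lean on a library for this step, but the idea is the same: every row ends up as a list of numbers of the same length.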
Apply an algorithm
At this point you should have chosen the algorithm that best fits your problem and prepared the historical data. You now “feed” the data to the algorithm, which finds patterns that define the relationship between a set of features in the dataset and the value you want to predict. It then creates a set of rules (the model) that is used to make predictions.
The end result is a model that can be used to make predictions based on the historical data used to train it.
The machine learning process is repeated so that the model can get better and better at making accurate predictions.
Algorithms are usually plug and play. Many programming languages already have libraries with built-in ML algorithms that you can use to train your models. As a result, choosing an algorithm can be a matter of experimentation to see which one gives you the best results.
On the other hand, setting up the problem and representing/preparing the data correctly is crucial, and a lot of thoughtful choices, wisdom, and experience go into getting these right.
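Here is a rough sketch of what that experimentation can look like, in plain Python. The data and both toy "algorithms" (`fit_mean` and `fit_nearest`) are hypothetical stand-ins, not something from a particular library. The idea is to hold some data back, train each candidate on the rest, and keep whichever makes the smallest error on the held-back rows.

```python
# Hypothetical (feature, target) pairs, roughly following target = 2 * feature.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.1),
        (5, 9.8), (6, 12.2), (7, 13.9), (8, 16.1)]

# Keep every fourth row out of training so we can test on unseen data.
train = [row for i, row in enumerate(data) if i % 4 != 3]
test = [row for i, row in enumerate(data) if i % 4 == 3]


def fit_mean(rows):
    """Baseline 'algorithm': always predict the average training target."""
    mean = sum(y for _, y in rows) / len(rows)
    return lambda x: mean


def fit_nearest(rows):
    """1-nearest-neighbor: predict the target of the closest training feature."""
    return lambda x: min(rows, key=lambda row: abs(row[0] - x))[1]


def mean_abs_error(model, rows):
    """Average size of the prediction error on the given rows."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


scores = {
    name: mean_abs_error(fit(train), test)
    for name, fit in [("mean baseline", fit_mean), ("nearest neighbor", fit_nearest)]
}
best = min(scores, key=scores.get)
print(scores, "-> best:", best)
```

With a real library you would swap the two toy functions for its built-in algorithms; the compare-on-held-out-data loop stays the same.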
Now, here’s an overview of each problem type mentioned above:
Classification is used when we can sort our training data into a fixed number of distinct classes and assign a label to each class.
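As a small illustration of classification, here is a sketch of a nearest-centroid classifier in plain Python. This is just one simple technique picked for the example, and the measurements and labels are hypothetical. Training averages the points of each class, and a new point gets the label of the closest average.

```python
from collections import defaultdict
from math import dist

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
training = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]


def train_centroids(rows):
    """Learn one average point (centroid) per class."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*points))
        for label, points in grouped.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train_centroids(training)
print(classify(centroids, (24.0, 4.2)))
print(classify(centroids, (58.0, 28.0)))
```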
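Regression models are used to predict a continuous value when you know that the value depends on certain inputs. For example, the price of a house might be predicted from inputs such as its size and location.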
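To show regression in its simplest form, here is a sketch that fits a straight line with ordinary least squares in plain Python. The sizes and prices are hypothetical numbers chosen for the example, and `fit_line` is an illustrative helper, not a library function.

```python
# Hypothetical data: house size in square meters -> price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(sizes, prices)

# Use the fitted line to predict the price of an unseen 100 m2 house.
predicted = slope * 100.0 + intercept
print(f"predicted price for 100 m2: {predicted:.1f}")
```

The output is a continuous number rather than a label, which is what separates regression from classification.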
Unlike classification and regression, clustering is a type of unsupervised learning method. It’s the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to each other than to the data points in other groups. In other words, it groups objects on the basis of how similar or dissimilar they are.
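One common way to do clustering is k-means. Below is a minimal one-dimensional sketch in plain Python. The data points and the hand-picked starting centers are hypothetical, and real implementations handle initialization and stopping more carefully.

```python
# Hypothetical one-dimensional data, e.g. customer spend; no labels are given.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]


def k_means(values, centers, rounds=10):
    """Repeatedly assign points to the nearest center, then re-average."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Starting centers are picked by hand here (an assumption for the sketch).
centers, groups = k_means(points, centers=[points[0], points[-1]])
print(centers)
print(groups)
```

Notice that nobody told the algorithm which group each point belongs to. It found the two groups from the data alone, which is what "unsupervised" means here.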
Recommendation systems use data from past user behavior to suggest similar content, predicting what the user will like based on that historical data. ML recommendation systems are typically classified into two categories: content-based and collaborative filtering methods. Modern recommendation systems often combine both approaches.
Now, let’s quickly look at what the training data looks like when it is displayed in a table.
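Here is a small sketch of the collaborative filtering idea in plain Python. The users, items, and ratings are hypothetical, and this user-based variant with cosine similarity is only one of several ways to do it. Items a user has not rated are scored using the ratings of other users, weighted by how similar their tastes are.

```python
from math import sqrt

# Hypothetical ratings (1-5); a missing item means the user has not rated it.
ratings = {
    "ana": {"scifi_a": 5, "thriller_b": 4, "cartoon_c": 1},
    "ben": {"scifi_a": 5, "thriller_b": 5, "cartoon_d": 1, "drama_e": 5},
    "cleo": {"cartoon_c": 5, "cartoon_d": 5, "scifi_a": 1},
}


def cosine(a, b):
    """Cosine similarity between two users, using only items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user):
    """Rank unseen items by similarity-weighted ratings from other users."""
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for item, rating in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(ranked, key=ranked.get, reverse=True)


print(recommend("ana"))
```

A content-based system would instead compare the items themselves (genre, description, and so on) rather than comparing users.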
The “Features” are used to predict the “Target Value”.
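In code, such a table might look like the sketch below. The column names and values are hypothetical, made up for this example. Each row is one historical example, the last column is the target value, and everything before it is a feature.

```python
# Hypothetical training table: the last column is the target value.
header = ["size_m2", "rooms", "age_years", "price"]
table = [
    [50.0, 2, 30, 150.0],
    [70.0, 3, 12, 210.0],
    [90.0, 4, 5, 270.0],
]

feature_names, target_name = header[:-1], header[-1]
features = [row[:-1] for row in table]  # what the model sees
targets = [row[-1] for row in table]    # what the model should predict

for x, y in zip(features, targets):
    print(dict(zip(feature_names, x)), "->", target_name, "=", y)
```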
Every good thing comes to an end
I’m gonna leave it here for now. This was a quick introduction to machine learning. I hope you learned something and can now have good conversations with your colleagues about ML.