If you’ve landed on this article, chances are that you’ve been wondering what Machine Learning is all about, or perhaps how to get started. Don’t worry, we will get all of this covered in the next few minutes!

Image Source: Thermo Fisher Scientific — Machine Learning is a subset of AI

Lately, it seems that every time you open your browser or casually scroll through your news feed, someone is writing about machine learning, its impact on humankind, or the latest advancements in AI. What’s all this buzz about? Have you ever wondered how technologies ranging from virtual assistants to self-driving cars* and robots actually work?

*For more on Self-Driving…


In the next few minutes, we shall get ‘Pandas’ covered: an extremely popular Python library that provides high-level data structures and a wide range of data-analysis tools, and one that every Machine Learning practitioner must be familiar with!

Image Source: Pinterest

“Pandas aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python” — Pandas’ Mission Statement

Salient Features of the Library (a short sketch of a few of these follows the list) —

  • Fast and efficient data manipulation with integrated indexing
  • Integrated tools for reading/writing in various formats — CSV, text files, MS Excel, SQL, HDF5 etc.
  • Smart data-alignment, integrated handling of missing values
  • Flexible in terms of reshaping/pivoting datasets
  • Supports slicing…
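To make a few of these features concrete, here is a minimal sketch; the tiny inline dataset, column names, and file name are made up purely for illustration, not taken from any example in this article:

import pandas as pd
import numpy as np

# A tiny made-up dataset to demonstrate a few of the features above
df = pd.DataFrame({
    "city":  ["Delhi", "Delhi", "Mumbai", "Mumbai"],
    "year":  [2019, 2020, 2019, 2020],
    "sales": [250.0, np.nan, 300.0, 320.0],   # one missing value
})

# Integrated handling of missing values
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Reshaping/pivoting the dataset
pivot = df.pivot(index="city", columns="year", values="sales")
print(pivot)

# Reading/writing in various formats (CSV shown here)
df.to_csv("sales.csv", index=False)
df_again = pd.read_csv("sales.csv")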


In the next few minutes, we shall get NumPy covered: an extremely popular core scientific-computing Python library that every Machine Learning practitioner must be familiar with!

Image Source: https://github.com/numpy/numpy.org/issues/37

NumPy (Numerical Python) is a popular Python library for processing large multi-dimensional arrays and matrices, backed by a large collection of high-level mathematical functions that operate on them.

It is very useful for fundamental scientific computations in Machine Learning, particularly for its linear algebra, Fourier transform, and random number capabilities. High-end libraries like TensorFlow use NumPy internally for the manipulation of tensors.

NumPy employs something called ‘Vectorization’.

Vectorization is a powerful ability…
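As a quick illustration of what vectorization buys you (the array size here is arbitrary), compare an explicit Python loop with the equivalent NumPy expression:

import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Loop version: one Python-level operation per element
result_loop = np.empty_like(a)
for i in range(len(a)):
    result_loop[i] = a[i] * b[i]

# Vectorized version: a single array expression,
# the loop runs in optimized C inside NumPy
result_vec = a * b

print(np.allclose(result_loop, result_vec))  # True

The vectorized form is both shorter and dramatically faster, because the per-element work happens inside NumPy rather than in the Python interpreter.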


Basics of Reinforcement Learning with Real-World Analogies and a Tutorial to Train a Self-Driving Cab to pick up and drop off passengers at the right destinations using Python, from scratch.

Most of you have probably heard of AI learning to play computer games on its own; a very popular example is DeepMind, which hit the news and took the world by storm when its AlphaGo program defeated the South Korean Go world champion in 2016.

So what’s the secret behind this major breakthrough? Hold your horses! You’ll understand this in a couple of minutes.

An Analogy of Reinforcement Learning

Let’s consider the analogy of teaching…
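Under the hood, the self-driving-cab tutorial boils down to tabular Q-learning. Here is a minimal, self-contained sketch of the update rule on a made-up one-dimensional toy environment; the environment, rewards, and hyperparameters are invented for illustration and are not taken from the tutorial:

import numpy as np

# Toy environment: 5 positions in a row, the goal is at position 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else -0.1
    done = next_state == GOAL
    return next_state, reward, done

alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
Q = np.zeros((N_STATES, N_ACTIONS))

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = np.random.randint(N_ACTIONS)
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a')
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)   # learned action-values; "move right" should dominate everywhere

The cab tutorial uses exactly this kind of table of state-action values, just over a much larger state space (positions, passenger locations, and destinations).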


Crime pattern analysis uncovers the underlying interactive process between crime events by discovering where, when, and why particular crimes are likely to occur. The outcomes improve our understanding of the dynamics of unlawful activities and can enhance predictive policing.

For more on K-means Clustering: Everything you need to know about K-Means Clustering

Fetch the required data using wget:

!wget https://raw.githubusercontent.com/namanvashistha/doctor_strange/master/crime.csv

Import libraries:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

Read and Display data:

data = pd.read_csv("crime.csv")
data

K-means from Scratch:

np.random.seed(42)

def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2)**2))

class KMeans():
    def __init__(self…
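The snippet above is truncated; since the rest of the author's class is not shown here, the following is only a minimal sketch of how a from-scratch KMeans along those lines might look (the constructor arguments and method names beyond those visible above are assumptions):

import numpy as np

np.random.seed(42)

def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2)**2))

class KMeans:
    def __init__(self, k=3, max_iters=100):
        self.k = k                 # number of clusters (assumed parameter name)
        self.max_iters = max_iters

    def fit_predict(self, X):
        # start from k randomly chosen samples as the initial centroids
        idx = np.random.choice(len(X), self.k, replace=False)
        self.centroids = X[idx]
        for _ in range(self.max_iters):
            # assign each sample to its nearest centroid
            labels = np.array([
                np.argmin([euclidean_distance(x, c) for c in self.centroids])
                for x in X
            ])
            # recompute each centroid as the mean of its assigned samples
            new_centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else self.centroids[j]
                for j in range(self.k)
            ])
            if np.allclose(new_centroids, self.centroids):
                break
            self.centroids = new_centroids
        return labels

On the crime data above you would call fit_predict on a NumPy array of the coordinate columns; the exact column names depend on the CSV.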


KNN (K-Nearest Neighbours) is one of the most basic classification algorithms in machine learning. It belongs to the supervised learning category of machine learning. KNN is often used in search applications where you are looking for “similar” items.

The way we measure similarity is by creating a vector representation of the items and then comparing the vectors using an appropriate distance metric (the Euclidean distance, for example).
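As a minimal illustration of that idea (the points and labels below are made up), a nearest-neighbour lookup with the Euclidean distance can be written in a few lines of NumPy:

import numpy as np

def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2) ** 2))

def knn_predict(X_train, y_train, x_query, k=3):
    # distance from the query point to every training point
    distances = [euclidean_distance(x_query, x) for x in X_train]
    # indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # majority vote among their labels
    values, counts = np.unique(y_train[nearest], return_counts=True)
    return values[np.argmax(counts)]

# Made-up 2-D points and binary labels
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([7, 8])))   # expected: 1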


For more on KNN: A Beginner’s Guide to KNN and MNIST Handwritten Digits Recognition using KNN from Scratch

Dataset used:
We used the haarcascade_frontalface_default.xml file (OpenCV’s pre-trained frontal-face Haar cascade), which can easily be downloaded from this link.
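For context, that XML file is loaded through OpenCV's CascadeClassifier; a minimal sketch looks like the following (the image path is just a placeholder for any photo containing faces):

import cv2

# Load the pre-trained frontal-face Haar cascade
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

# "photo.jpg" is a placeholder path for any image containing faces
img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces; each detection is an (x, y, w, h) bounding box
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces.jpg", img)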


While many classifiers, such as logistic regression, can handle linearly separable data, Support Vector Machines (SVMs) can tackle highly non-linear problems using the kernel trick, which implicitly maps the input vectors to higher-dimensional feature spaces.

Let’s get into the depth of this in the next few minutes!

Image Source: Google Images

The transformation we were talking about rearranges the dataset in such a way that it becomes linearly separable in the new feature space.
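As an illustration of that idea (using scikit-learn's make_circles toy dataset rather than any dataset from this article), an RBF-kernel SVM separates data that no straight line could:

from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in 2-D
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A linear kernel struggles, while the RBF kernel implicitly maps
# the points into a higher-dimensional space where they separate
linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))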

In this article, we are going to look at how SVM works, learn about kernel functions, hyperparameters and pros and cons of SVM along with some of the real-life applications of…


Naive Bayes is a probabilistic machine learning algorithm based on the Bayes Theorem, used in a wide variety of classification tasks.

In this article, we shall work through the Naive Bayes algorithm and its essential concepts so that there is no room for doubt.

Image Source: Becoming.ai

Naive Bayes is a simple but surprisingly powerful probabilistic machine learning algorithm used for predictive modeling and classification tasks.

Some typical applications of Naive Bayes are spam filtering, sentiment prediction, classification of documents, etc. …
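As a small example of the spam-filtering use case (the handful of messages below are invented purely for illustration), scikit-learn's multinomial Naive Bayes can be trained in a few lines:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny, made-up corpus: 1 = spam, 0 = not spam
messages = [
    "win a free prize now", "limited offer, claim your free reward",
    "meeting rescheduled to friday", "lunch tomorrow at noon?",
]
labels = [1, 1, 0, 0]

# Bag-of-words counts feed the multinomial Naive Bayes model
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["claim your free prize", "see you at the meeting"]))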


In statistics, Naive Bayes classifiers are a family of simple “probabilistic classifiers” based on applying Bayes’ theorem with strong independence assumptions between the features. Source: Wikipedia
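Written out, the strong independence assumption means the joint likelihood factorizes over the features, so the classifier reduces to (a standard statement of the model, not specific to this article):

P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)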

Image Source: Machine Learning Mastery

For the conceptual overview of Naive Bayes, refer — A Machine Learning Roadmap to Naive Bayes

We shall now go through the code walkthrough for the implementation of the Naive Bayes algorithm from scratch:

import numpy as np

class NaiveBayes:
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self._classes = np.unique(y)
        n_classes = len(self._classes)
        # calculate mean, var, and prior for each class
        self._mean = np.zeros((n_classes, n_features), dtype=np.float64)
        self._var = np.zeros((n_classes, n_features), dtype=np.float64)
        self._priors…
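The snippet is truncated here; since the rest of the author's class is not shown, the following is only a minimal sketch of how such a Gaussian Naive Bayes might be completed (the helper names beyond those visible above are assumptions):

import numpy as np

class NaiveBayes:
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self._classes = np.unique(y)
        n_classes = len(self._classes)
        # mean, variance, and prior P(y) for each class
        self._mean = np.zeros((n_classes, n_features), dtype=np.float64)
        self._var = np.zeros((n_classes, n_features), dtype=np.float64)
        self._priors = np.zeros(n_classes, dtype=np.float64)
        for idx, c in enumerate(self._classes):
            X_c = X[y == c]
            self._mean[idx] = X_c.mean(axis=0)
            self._var[idx] = X_c.var(axis=0)
            self._priors[idx] = X_c.shape[0] / float(n_samples)

    def predict(self, X):
        return np.array([self._predict(x) for x in X])

    def _predict(self, x):
        # pick the class with the highest log-posterior:
        # log P(y) + sum_i log P(x_i | y), with Gaussian P(x_i | y)
        posteriors = []
        for idx in range(len(self._classes)):
            prior = np.log(self._priors[idx])
            likelihood = np.sum(np.log(self._gaussian_pdf(idx, x)))
            posteriors.append(prior + likelihood)
        return self._classes[np.argmax(posteriors)]

    def _gaussian_pdf(self, class_idx, x):
        mean, var = self._mean[class_idx], self._var[class_idx]
        return np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)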


Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms, which is used for Classification as well as Regression problems. However, primarily, it is used for Classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes so that we can easily put the new data point in the correct category in the future. This best decision boundary is called a hyperplane.

Source: Javatpoint
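To make the hyperplane idea concrete, here is a small sketch (on a synthetic blob dataset, not any data from this article) that fits a linear SVM and reads off the coefficients defining the separating hyperplane:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic clusters in 2-D
X, y = make_blobs(n_samples=200, centers=2, random_state=42)

clf = SVC(kernel="linear").fit(X, y)

# For a linear kernel the decision boundary is the hyperplane w·x + b = 0
w, b = clf.coef_[0], clf.intercept_[0]
print("hyperplane: %.3f*x1 + %.3f*x2 + %.3f = 0" % (w[0], w[1], b))
print("support vectors:", clf.support_vectors_.shape[0])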

Image Source: R-bloggers

For the conceptual overview of SVM, refer — A Beginner’s Introduction to SVM

We shall now…

Tanvi Penumudy

CS Undergrad at Bennett University
