
Updated: Mar 29, 2022

Convolutional Neural Networks (CNNs) are used for computer vision, which means detecting patterns in visual data. For example:

  • Classifying whether a picture of food shows pizza or bread.

  • Detecting specific objects in a security camera feed.

In this blog, we'll learn how to build a CNN to detect a visual object.


Get Data


Preparing data is the most important part of any deep learning project. We are going to work with the Food-101 dataset, a collection of 101,000 real-world images of food dishes across 101 categories (1,000 images per category). To simplify the scenario, we'll choose two of the categories, pizza and steak, to build a binary classifier. We are thankful to Daniel Bourke for preparing the pizza and steak subset.


Import the Data


First, we need to download the data from storage:

import zipfile

# Download zip file of pizza_steak images
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip 

# Unzip the downloaded file
zip_ref = zipfile.ZipFile("pizza_steak.zip", "r")
zip_ref.extractall()
zip_ref.close()

Inspect the Data


A crucial step at the beginning of a machine learning project is to inspect and visualize the data. Let's inspect the data we've just downloaded.

!ls pizza_steak

!ls pizza_steak/train/

!ls pizza_steak/train/steak/


There are many images; now let's find out how many there are for train and test.

import os

# Walk through pizza_steak directory and list number of files
for dirpath, dirnames, filenames in os.walk("pizza_steak"):
  print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")

Get the class names programmatically (this is much more helpful with a longer list of classes):

import pathlib
import numpy as np
data_dir = pathlib.Path("pizza_steak/train/") # turn our training path into a Python path
class_names = np.array(sorted([item.name for item in data_dir.glob('*')])) # created a list of class_names from the subdirectories
print(class_names)


So, we have 750 training images and 250 test images per class for pizza and steak. Now let's create a function to visualize random images:

# View an image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import random

def view_random_image(target_dir, target_class):
  # Setup target directory (we'll view images from here)
  target_folder = target_dir+target_class

  # Get a random image path
  random_image = random.sample(os.listdir(target_folder), 1)

  # Read in the image and plot it using matplotlib
  img = mpimg.imread(target_folder + "/" + random_image[0])
  plt.imshow(img)
  plt.title(target_class)
  plt.axis("off");

  print(f"Image shape: {img.shape}") # show the shape of the image

  return img

Using this function, let's visualize an image from a target class (steak or pizza):

# View a random image from the training dataset
img = view_random_image(target_dir="pizza_steak/train/",
                        target_class="steak")

img = view_random_image(target_dir="pizza_steak/train/",
                        target_class="pizza")

We can repeat this to view more images and get an idea of what we're working with. Now let's see the image in the form of a big array/tensor and view its shape.

# View the img (actually just a big array/tensor) and its shape (width, height, colour channels)
img, img.shape


Now look at the image shape; it's in the form (width, height, colour channels). We can notice all the values of the image array are between 0 and 255, because that's the possible range of red, green, and blue values. So when we build a model to differentiate between our images of pizza and steak, it will be finding patterns in these different pixel values which determine what each class looks like.

As we discussed before, machine learning models prefer values between 0 and 1, so one of the most common preprocessing steps for working with images is to scale the pixel values by dividing them by 255.

# Get all the pixel values between 0 & 1
img/255.


The architecture of a convolutional neural network (typical)


Why typical? Convolutional neural networks can be created in many different ways; here we'll discuss the more traditional style of CNN.


Hyperparameter/layer type, what it does, and typical values:

  • Input image(s): the target images you'd like to discover patterns in. Typically a photo or video.

  • Input layer: takes in the target image and preprocesses it. Typical value: input_shape = [batch_size, image_height, image_width, color_channels].

  • Convolution layer: extracts/learns the most important features from target images. Typically multiple, created with tf.keras.layers.ConvXD (X can be multiple values).

  • Hidden activation: adds non-linearity to learned features (non-straight lines). Usually ReLU (tf.keras.activations.relu).

  • Pooling layer: reduces the dimensionality of learned image features. Typically average (tf.keras.layers.AvgPool2D) or max (tf.keras.layers.MaxPool2D).

  • Fully connected layer: further refines learned features from convolution layers. tf.keras.layers.Dense.

  • Output layer: takes learned features and outputs them in the shape of the target labels. output_shape = [number_of_classes].

  • Output activation: adds non-linearity to the output layer. Typically tf.keras.activations.sigmoid (binary classification) or tf.keras.activations.softmax (multiclass classification).


Resource: The architecture we're using below is a scaled-down version of VGG-16, a convolutional neural network that came 2nd in the 2014 ImageNet classification competition.


Binary classification


We're going to go through the following whirlwind of steps:

  1. Become one with the data (visualize, visualize, visualize...)

  2. Preprocess the data (prepare it for a model)

  3. Create a model (start with a baseline)

  4. Fit the model

  5. Evaluate the model

  6. Adjust different parameters and improve the model (try to beat your baseline)

  7. Repeat until satisfied

Let's step through each.



Prepare Data


Let's prepare data for our convolutional neural network (CNN) experiments.

One of the most important steps for a machine learning project is creating training and test sets. In our case, our data is already split into training and test sets. Another option here might be to create a validation set as well, but we'll leave that for now. For an image classification project, it's standard to have your data separated into train and test directories, with a subfolder for each class.


A batch is a small subset of the dataset a model looks at during training. For example, rather than looking at 10,000 images at one time and trying to figure out the patterns, a model might only look at 32 images at a time.

It does this for a couple of reasons:

  • 10,000 images (or more) might not fit into the memory of your processor (GPU).

  • Trying to learn the patterns in 10,000 images in one hit could result in the model not being able to learn very well.

Why 32?

There are many different batch sizes you could use but 32 has proven to be very effective in many different use cases and is often the default for many data preprocessing functions.

To turn our data into batches, we'll first create an instance of ImageDataGenerator for each of our datasets.


import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Set the seed
tf.random.set_seed(42)

# Preprocess data (get all of the pixel values between 0 and 1, also called scaling/normalization)
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

# Setup the train and test directories
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"

# Import data from directories and turn it into batches
# Turn it into batches
train_data = train_datagen.flow_from_directory(directory=train_dir,
                                               target_size=(224, 224),
                                               class_mode='binary',
                                               batch_size=32)

test_data = test_datagen.flow_from_directory(directory=test_dir,
                                             target_size=(224, 224),
                                             class_mode='binary',
                                             batch_size=32)


Looks like our training dataset has 1500 images belonging to 2 classes (pizza and steak) and our test dataset has 500 images also belonging to 2 classes.

Some things to note here:

  • Due to how our directories are structured, the classes get inferred by the subdirectory names in train_dir and test_dir.

  • The target_size parameter defines the input size of our images in (height, width) format.

  • The class_mode value of 'binary' defines our classification problem type. If we had more than two classes, we would use 'categorical'.

  • The batch_size defines how many images will be in each batch, we've used 32 which is the same as the default.

We can take a look at our batched images and labels by inspecting the train_data object.


# Get a sample of the training data batch 
images, labels = train_data.next() # get the 'next' batch of images/labels
len(images), len(labels)


It seems our images and labels are in batches of 32.

How about the labels?

# View the first batch of labels
labels


Due to the class_mode parameter being 'binary' our labels are either 0 (pizza) or 1 (steak).
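We can confirm this mapping by checking the generator's class_indices attribute:

# Which label belongs to which class?
train_data.class_indices # {'pizza': 0, 'steak': 1}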


Create Model


A simple heuristic for computer vision models is to use the model architecture performing best on ImageNet (a large collection of diverse images used to benchmark computer vision models). However, to begin with, it's good to build a smaller model to establish a baseline result. Let's create a small model to get that baseline and then try to improve upon it.

# Make the creating of our model a little easier
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Activation
from tensorflow.keras import Sequential

# Create the model (this can be our baseline, a 3 layer Convolutional Neural Network)
model_1 = Sequential([
  Conv2D(filters=10, 
         kernel_size=3, 
         strides=1,
         padding='valid',
         activation='relu', 
         input_shape=(224, 224, 3)), # input layer (specify input shape)
  Conv2D(10, 3, activation='relu'),
  Conv2D(10, 3, activation='relu'),
  Flatten(),
  Dense(1, activation='sigmoid') # output layer (specify output shape)
])

Let's define the components of the Conv2D layer:

  • The "2D" means our inputs are two-dimensional (height and width), even though they have 3 color channels, the convolutions are run on each channel individually.

  • filters - these are the number of "feature extractors" that will be moving over our images.

  • kernel_size - the size of our filters, for example, a kernel_size of (3, 3) (or just 3) will mean each filter will have the size 3x3, meaning it will look at a space of 3x3 pixels each time. The smaller the kernel, the more fine-grained features it will extract.

  • strides - the number of pixels a filter moves as it crosses the image. A stride of 1 means the filter moves across 1 pixel at a time; a stride of 2 means it moves 2 pixels at a time.

  • padding - this can be either 'same' or 'valid'. 'same' pads the outside of the image with zeros so the output of the convolutional layer is the same size as the input, whereas 'valid' (the default) only applies the filter where it fully fits, so the output shrinks (e.g. a 3x3 kernel over a 224-pixel-wide image produces a 222-pixel-wide output).
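To see the difference, here's a quick sketch comparing the two padding modes on a dummy image-sized batch (using the tf imported above):

# Compare output sizes for 'valid' vs 'same' padding
dummy = tf.random.normal([1, 224, 224, 3]) # a fake batch of one image
print(tf.keras.layers.Conv2D(10, 3, padding='valid')(dummy).shape) # (1, 222, 222, 10)
print(tf.keras.layers.Conv2D(10, 3, padding='same')(dummy).shape)  # (1, 224, 224, 10)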


Compile and Fit Model

# Compile the model
model_1.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])
# Fit the model
history_1 = model_1.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps=len(test_data))

We'll notice two new parameters used here:

  • steps_per_epoch - this is the number of batches a model will go through per epoch, in our case, we want our model to go through all batches so it's equal to the length of train_data (1500 images in batches of 32 = 1500/32 = ~47 steps)

  • validation_steps - same as above, except for the validation_data parameter (500 test images in batches of 32 = 500/32 = ~16 steps)
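We can sanity-check these step counts directly, since the data generators know how many batches they hold:

# Number of batches the model will step through each epoch
len(train_data), len(test_data) # -> (47, 16)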

Let's create a function to investigate the model's training performance (separate accuracy and loss curves):

# Plot the validation and training data separately
def plot_loss_curves(history):
  """
  Returns separate loss curves for training and validation metrics.
  """ 
  loss = history.history['loss']
  val_loss = history.history['val_loss']

  accuracy = history.history['accuracy']
  val_accuracy = history.history['val_accuracy']

  epochs = range(len(history.history['loss']))

  # Plot loss
  plt.plot(epochs, loss, label='training_loss')
  plt.plot(epochs, val_loss, label='val_loss')
  plt.title('Loss')
  plt.xlabel('Epochs')
  plt.legend()

  # Plot accuracy
  plt.figure()
  plt.plot(epochs, accuracy, label='training_accuracy')
  plt.plot(epochs, val_accuracy, label='val_accuracy')
  plt.title('Accuracy')
  plt.xlabel('Epochs')
  plt.legend(); 
# Check out the loss curves of model_1
plot_loss_curves(history_1)

Repeat until satisfied


After many model iterations, it's time to dig into our bag of tricks and try another method of overfitting prevention: data augmentation.

Data augmentation is the process of altering our training data, leading to it having more diversity and in turn allowing our models to learn more generalizable patterns. Altering might mean adjusting the rotation of an image, flipping it, cropping it or something similar.


Doing this simulates the kind of data a model might be used on in the real world.


If we're building a pizza vs. steak application, not all of the images our users take will be in setups similar to our training data. Using data augmentation gives us another way to prevent overfitting and in turn make our model more generalizable. Let's create augmented and non-augmented training data, plus unchanged test data.

# Create ImageDataGenerator training instance with data augmentation
train_datagen_augmented = ImageDataGenerator(rescale=1/255.,
                                             rotation_range=20, # rotate the image slightly between 0 and 20 degrees (note: this is an int not a float)
                                             shear_range=0.2, # shear the image
                                             zoom_range=0.2, # zoom into the image
                                             width_shift_range=0.2, # shift the image width ways
                                             height_shift_range=0.2, # shift the image height ways
                                             horizontal_flip=True) # flip the image on the horizontal axis

# Create ImageDataGenerator training instance without data augmentation
train_datagen = ImageDataGenerator(rescale=1/255.) 

# Create ImageDataGenerator test instance without data augmentation
test_datagen = ImageDataGenerator(rescale=1/255.)

# Import data and augment it from training directory
print("Augmented training images:")
train_data_augmented = train_datagen_augmented.flow_from_directory(train_dir,
                                                                   target_size=(224, 224),
                                                                   batch_size=32,
                                                                   class_mode='binary',
                                                                   shuffle=False) # Don't shuffle for demonstration purposes, usually a good thing to shuffle

# Create non-augmented data batches
print("Non-augmented training images:")
train_data = train_datagen.flow_from_directory(train_dir,
                                               target_size=(224, 224),
                                               batch_size=32,
                                               class_mode='binary',
                                               shuffle=False) # Don't shuffle for demonstration purposes

print("Unchanged test images:")

Let's visualize what data augmentation actually does to an image.


# Get data batch samples
images, labels = train_data.next()
augmented_images, augmented_labels = train_data_augmented.next() # Note: labels aren't augmented, they stay the same

# Show original image and augmented image
random_number = random.randint(0, 31) # we're making batches of size 32, so valid indices are 0-31
plt.imshow(images[random_number])
plt.title(f"Original image")
plt.axis(False)
plt.figure()
plt.imshow(augmented_images[random_number])
plt.title(f"Augmented image")
plt.axis(False);

After going through a sample of original and augmented images, you can start to see some of the example transformations on the training images.


Notice how some of the augmented images look like slightly warped versions of the original image. This means our model will be forced to try and learn patterns in less-than-perfect images, which is often the case when using real-world images.


We keep creating and tweaking models until we're satisfied with the result.


Let's see what happens when we shuffle the augmented training data.



# Import data and augment it from directories
train_data_augmented_shuffled = train_datagen_augmented.flow_from_directory(train_dir,
                                                                            target_size=(224, 224),
                                                                            batch_size=32,
                                                                            class_mode='binary',
                                                                            shuffle=True) # Shuffle data (default)


Since we've already beaten our baseline, there are a few things we could try to continue to improve our model:

  • Increase the number of model layers (e.g. add more convolutional layers).

  • Increase the number of filters in each convolutional layer (e.g. from 10 to 32, 64, or 128, these numbers aren't set in stone either, they are usually found through trial and error).

  • Train for longer (more epochs).

  • Find an ideal learning rate.

  • Get more data (give the model more opportunities to learn).

Adjusting each of these settings (except for the last one) during model development is usually referred to as hyperparameter tuning.


You can think of hyperparameter tuning as similar to adjusting the settings on your oven to cook your favorite dish. Although your oven does most of the cooking for you, you can help it by tweaking the dials. Here is our final model:

# Create a CNN model (same as Tiny VGG but for binary classification - https://poloclub.github.io/cnn-explainer/ )
model_final = Sequential([
  Conv2D(10, 3, activation='relu', input_shape=(224, 224, 3)), # same input shape as our images
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Conv2D(10, 3, activation='relu'),
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Flatten(),
  Dense(1, activation='sigmoid')
])

# Compile the model
model_final.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["accuracy"])

# Fit the model
history_final = model_final.fit(train_data_augmented_shuffled,
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented_shuffled),
                        validation_data=test_data,
                        validation_steps=len(test_data))

Now let's check out our TinyVGG model's performance.

plot_loss_curves(history_final)

Now our training curves are looking good; however, we could likely improve further by training a little longer.
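For example, Keras lets us resume training with the initial_epoch argument; a sketch, reusing the model and data generators from above:

# Continue training model_final for 5 more epochs (a sketch)
history_continued = model_final.fit(train_data_augmented_shuffled,
                                    epochs=10, # target total number of epochs
                                    initial_epoch=5, # resume where the first fit stopped
                                    steps_per_epoch=len(train_data_augmented_shuffled),
                                    validation_data=test_data,
                                    validation_steps=len(test_data))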


Making Predictions with Our Trained Model


What good is a trained model if you can't make predictions with it?

To really test it out, we'll upload a couple of our own images and see how the model goes; you can test with your own images as well.


The first test image we're going to use is a delicious steak.

# View our example image
!wget https://raw.githubusercontent.com/sumitdeyonline/machinelearning/main/03-steak.jpeg 
steak = mpimg.imread("03-steak.jpeg")
plt.imshow(steak)
plt.axis(False);

Check the shape of the image

# Check the shape of our image
steak.shape


Since our model takes in images of shape (224, 224, 3), we've got to reshape our custom image to use it with our model.


To do so, we can import and decode our image using tf.io.read_file (for reading files) and tf.image (for resizing our image and turning it into a tensor). Let's create a function to handle this image preparation.

# Create a function to import an image and resize it to be able to be used with our model
def load_and_prep_image(filename, img_shape=224):
  """
  Reads an image from filename, turns it into a tensor
  and reshapes it to (img_shape, img_shape, colour_channel).
  """
  # Read in target file (an image)
  img = tf.io.read_file(filename)

  # Decode the read file into a tensor & ensure 3 colour channels 
  # (our model is trained on images with 3 colour channels and sometimes images have 4 colour channels)
  img = tf.image.decode_image(img, channels=3)

  # Resize the image (to the same size our model was trained on)
  img = tf.image.resize(img, size = [img_shape, img_shape])

  # Rescale the image (get all values between 0 and 1)
  img = img/255.
  return img

Now we've got a function to load our custom image, let's load it in.

# Load in and preprocess our custom image
steak = load_and_prep_image("03-steak.jpeg")
steak


There's one more problem. Although our image is in the same shape as the images our model has been trained on, we're still missing a dimension. Remember how our model was trained in batches? Well, the batch size becomes the first dimension.

So in reality, our model was trained on data in the shape of (batch_size, 224, 224, 3).

We can fix this by adding an extra dimension to our custom image tensor using tf.expand_dims.

# Add an extra axis
print(f"Shape before new dimension: {steak.shape}")
steak = tf.expand_dims(steak, axis=0) # add an extra dimension at axis 0
#steak = steak[tf.newaxis, ...] # alternative to the above, '...' is short for 'every other dimension'
print(f"Shape after new dimension: {steak.shape}")
steak

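With the batch dimension in place, we can finally make a prediction on our custom image (a sketch using the model_final trained above):

# Make a prediction on our custom image (in batched form)
pred = model_final.predict(steak)
pred # a prediction probability between 0 and 1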

The predictions come out in prediction probability form. In other words, this means how likely the image is to be one class or another.

Since we're working with a binary classification problem, if the prediction probability is over 0.5, according to the model, the prediction is most likely to be a positive class (class 1).

And if the prediction probability is under 0.5, according to the model, the predicted class is most likely to be the negative class (class 0)
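For instance, we can round the probability to recover a class name (reusing pred and class_names from above):

# Round the prediction probability to get a predicted class name
class_names[int(tf.round(pred)[0][0])]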


Let's create a function that makes a prediction on an input image using the trained model and plots the image with the predicted class as the title.

def pred_and_plot(model, filename, class_names):
  """
  Imports an image located at filename, makes a prediction on it with
  a trained model and plots the image with the predicted class as the title.
  """
  # Import the target image and preprocess it
  img = load_and_prep_image(filename)

  # Make a prediction
  pred = model.predict(tf.expand_dims(img, axis=0))

  # Get the predicted class
  pred_class = class_names[int(tf.round(pred)[0][0])]

  # Plot the image and predicted class
  plt.imshow(img)
  plt.title(f"Prediction: {pred_class}")
  plt.axis(False);

Finally, let's test our model on the custom image:

# Test our model on a custom image
pred_and_plot(model_final, "03-steak.jpeg", class_names)

Wow, our prediction is right; our model is working. You can predict more images using the pred_and_plot function. Please try it yourself.


In the next part, we'll discuss multi-class classification with Convolutional Neural Networks (CNNs) (Part 2). Stay tuned.




Now we are moving from regression problems to classification problems. Generally, classification problems predict whether something is one thing or another.


We can describe classification problems in the following ways:

  • Predict whether or not someone has cancer based on their health parameters. This is called binary classification since there are only two options.

  • Decide whether a photo is of food, a person, or a cat. This is called multi-class classification since there are more than two options.

  • Predict what categories should be assigned to a Blog article. This is called multi-label classification since a single article could have more than one category assigned.

The architecture of a classification neural network (typical)

Why typical? There are many ways to write a neural network, depending on the type of problem you're working on. But there are some fundamentals all deep neural networks contain:

  • An input layer.

  • Some hidden layers.

  • An output layer.

Following are some standard values we'll often use in our classification neural networks.


Hyperparameter, with typical values for binary and multiclass classification:

  • Input layer shape: number of features (binary); same as binary classification (multiclass).

  • Hidden layer(s): problem specific, min = 1, max = unlimited (binary); same as binary classification (multiclass).

  • Neurons per hidden layer: problem specific, generally 10 to 100 (binary); same as binary classification (multiclass).

  • Output layer shape: 1, i.e. one class or the other (binary); 1 per class (multiclass).

  • Hidden activation: usually ReLU, the rectified linear unit (binary); same as binary classification (multiclass).

  • Output activation: sigmoid (binary); softmax (multiclass).

  • Loss function: cross entropy, tf.keras.losses.BinaryCrossentropy in TensorFlow (binary); cross entropy, tf.keras.losses.CategoricalCrossentropy in TensorFlow (multiclass).

  • Optimizer: SGD (stochastic gradient descent) or Adam (binary); same as binary classification (multiclass).

Multiclass classification with a larger example


In this section we'll experiment with multiclass classification: we'll build a neural network to predict whether a piece of clothing is a shoe, a shirt, a jacket, or something else.


To start, we'll need some data. The good thing for us is TensorFlow has a multiclass classification dataset known as Fashion MNIST built-in. Meaning we can get started straight away. We can import it using the tf.keras.datasets module.

Resource: The following multiclass classification problem has been adapted from Daniel Bourke's TensorFlow course materials.


Load data (train, test) from the Fashion MNIST dataset


import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist

# The data has already been sorted into training and test sets for us

(train_data, train_labels), (test_data, test_labels) = fashion_mnist.load_data()

In deep learning, it's important to check the shape of our data.

# Check the shape of our data
train_data.shape, train_labels.shape, test_data.shape, test_labels.shape

There are 60,000 training examples, each with shape (28, 28) and a label, as well as 10,000 test examples of shape (28, 28). Let's visualize a single example.

# Plot a single example
import matplotlib.pyplot as plt
plt.imshow(train_data[7]);

Now check the sample's label:

# Check our sample's label
train_labels[7]

It looks like the labels are numeric. That's good for a neural network, but we'll want a human-readable format too.

Let's create a small list of the class names (we can find them on the dataset's GitHub page):


class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 
               'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Class names and how many classes there are (this'll be our output shape)
class_names, len(class_names)

Let's plot another example with a human-readable class name

# Plot an example image and its label
plt.imshow(train_data[17], cmap=plt.cm.binary) # change the colours to B&W 
plt.title(class_names[train_labels[17]]);

Wow! It's a T-shirt/top. Let's try a few random images from Fashion MNIST:



# Plot multiple random images of fashion MNIST
import random
plt.figure(figsize=(7, 7))
for i in range(8):
  ax = plt.subplot(4, 4, i + 1)
  rand_index = random.choice(range(len(train_data)))
  plt.imshow(train_data[rand_index], cmap=plt.cm.binary)
  plt.title(class_names[train_labels[rand_index]])
  plt.axis(False)

Create Model


Now it's time to build a model to figure out the relationship between the pixel values and their labels. Here are the input and output shapes:


The input shape will be 28x28 tensors (the height and width of the image).

The output shape will be 10 (one prediction probability per class).


After working through many modelling experiments, we came up with the following model, which uses a close-to-ideal learning rate and performed pretty well.



# Set random seed
tf.random.set_seed(42)
# Create the model
model = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)), # input layer (reshapes 28x28 to 784)
  tf.keras.layers.Dense(4, activation="relu"),
  tf.keras.layers.Dense(4, activation="relu"),
  tf.keras.layers.Dense(10, activation="softmax") # output shape is 10
])
  
# Compile the model
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), # ideal learning rate
              metrics=["accuracy"])

# Fit the model
history = model.fit(train_data,
                    train_labels,
                    epochs=20,
                    validation_data=(test_data, test_labels))


Now let's evaluate the model by making predictions:



# Make predictions with the most recent model
y_probs = model.predict(test_data) # "probs" is short for probabilities

# View the first 5 predictions
y_probs[:5]

These predictions aren't human-readable yet, so let's convert them into something we can interpret.
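Each row of y_probs holds 10 probabilities, one per class, so taking the argmax of each row recovers a label (a quick sketch using class_names from above):

# Convert the first 5 prediction probabilities into class names
[class_names[i] for i in y_probs[:5].argmax(axis=1)]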

Let's create a function to get a prediction about an input image.


  
# Create a function that takes an input image and its index, then plots the
# image with its prediction

def plot_random_image(model, image, index, true_labels, classes):
  """Picks the image at `index`, plots it and labels it with a predicted and truth label.

  Args:
    model: a trained model (trained on data similar to what's in images).
    image: a set of images.
    index: index inside the tensor.
    true_labels: array of ground truth labels for images.
    classes: array of class names for images.

  Returns:
    A plot of an image with a predicted class label from `model`
    as well as the truth class label from `true_labels`.
  """
  # Create predictions and targets
  target_image = image[index]
  pred_probs = model.predict(target_image.reshape(1, 28, 28)) # reshape to the input size the model expects
  pred_label = classes[pred_probs.argmax()]
  true_label = classes[true_labels[index]]

  # Plot the target image
  plt.imshow(target_image, cmap=plt.cm.binary)

  # Change the colour of the title depending on whether the prediction is right or wrong
  if pred_label == true_label:
    color = "green"
  else:
    color = "red"

  # Add xlabel information (prediction/true label)
  plt.xlabel("Prediction: {} {:2.0f}% (Actual label: {})".format(pred_label,
                                                                 100*tf.reduce_max(pred_probs),
                                                                 true_label),
             color=color) # set the colour to green or red

Our function is ready to use; now we pick any image by its index number and pass it to the function to get a prediction for that image.


Let's try image index 18 from the tensor

# Plot an example image and its label from the test data
plt.imshow(test_data[18], cmap=plt.cm.binary) # change the colours to B&W 
plt.title(class_names[test_labels[18]]);

So, index 18 of the test tensor is a Bag. Let's pass this image to our function:



# Check out the image as well as its prediction
plot_random_image(model=model, 
                  image=test_data,
                  index=18, 
                  true_labels=test_labels, 
                  classes=class_names)

Wow! The model predicts "Bag" with 88% confidence, which is correct. Now let's try a negative scenario with index 17 of the test tensor.

# Plot an example image and its label from the test data
plt.imshow(test_data[17], cmap=plt.cm.binary) # change the colours to B&W 
plt.title(class_names[test_labels[17]]); 

The actual label of this image is Coat. Let's pass this image to our function:

# Check out a random image as well as its prediction
plot_random_image(model=model, 
                  image=test_data,
                  index=17, 
                  true_labels=test_labels, 
                  classes=class_names)

It came up with a wrong prediction (Pullover); the actual label is Coat.


Did you figure out which predictions the model gets confused on?


It seems to mix up Coat and Pullover, or Sneaker and Ankle boot. The overall shapes of a Coat and a Pullover, or a Sneaker and an Ankle boot, are similar. Overall shape might be one of the patterns the model has learned, so when two images have a similar shape, their predictions get mixed up. This is very common behavior for any deep learning model.
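One way to check this systematically is a confusion matrix; here's a sketch (scikit-learn is an extra dependency not used elsewhere in this post):

# Which classes get mixed up? Rows are true labels, columns are predictions
from sklearn.metrics import confusion_matrix
pred_labels = model.predict(test_data).argmax(axis=1)
print(confusion_matrix(test_labels, pred_labels))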




Continued from Part 1


Create a little bit of a bigger dataset and a new model


We'll create a slightly bigger dataset (using NumPy) and build new models to compare their predictions.


Let's create the bigger Input Dataset


import tensorflow as tf
import numpy as np
# Make a bigger dataset
X = np.arange(-100, 100, 4)
X




Bigger Output Dataset


# Make labels for the dataset (adhering to the same pattern as before)
y = np.arange(-90, 110, 4)
y



Since y = X + 10, we can make the labels like so:

# Same result as above
y = X + 10
y



Split Dataset into training/test dataset

One of the other most common and important steps in a machine learning project is creating a training and test set (and, when required, a validation set). Each set serves a different purpose:

  • Training set - the model learns from this data, which is typically 70-80% of the total data available (like the course materials you study during the semester).

  • Validation set - the model gets tuned on this data, which is typically 10-15% of the total data available (like the practice exam you take before the final exam).

  • Test set - the model gets evaluated on this data to test what it has learned, it's typically 10-15% of the total data available (like the final exam you take at the end of the semester).

Now let's create training and test datasets by splitting the X and y arrays.

# Check how many samples we have
len(X), len(y)


# Split data into train and test sets
X_train = X[:40] # first 40 examples (80% of data)
y_train = y[:40]
X_test = X[40:] # last 10 examples (20% of data)
y_test = y[40:]

len(X_train), len(X_test)


Visualizing the data


Now we have to visualize our data; let's create a nice colorful plot using the matplotlib.pyplot library.



import matplotlib.pyplot as plt
plt.figure(figsize=(10, 7))
# Plot training data in blue
plt.scatter(X_train, y_train, c='b', label='Training data')
# Plot test data in green
plt.scatter(X_test, y_test, c='g', label='Testing data')
# Show the legend
plt.legend();

Anytime we can visualize our data, model, or predictions, it's a good idea to do so.


Time to build a model

With this graph in mind, what we'll be trying to do is build a model which learns the pattern in the blue dots (X_train) to draw the green dots (X_test).

# Set random seed
tf.random.set_seed(42)

# Create a model 
model = tf.keras.Sequential([
   tf.keras.layers.Dense(1, input_shape=[1]) 
   # define the input_shape to our model
])

# Compile model (same as above)  
model.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=["mae"])
# Fit the model to the training data
model.fit(tf.expand_dims(X_train, axis=-1), y_train, epochs=100, verbose=0) 
# verbose controls how much gets output

Visualizing the predictions


Now that we have a trained model, let's visualize some predictions. To visualize predictions, it's always a good idea to plot them against the ground truth labels.


Often you'll see this in the form of y_test vs. y_pred (ground truth vs. predictions).

First, we'll make some predictions on the test data (X_test), remember the model has never seen the test data.

# Make predictions
y_preds = model.predict(X_test)
# View the predictions
y_preds

Let's create a plotting function to visualize the data:

def plot_predictions(train_data=X_train,
                     train_labels=y_train,
                     test_data=X_test,
                     test_labels=y_test,
                     predictions=y_preds):
  """
  Plots training data, test data and compares predictions.
  """
  plt.figure(figsize=(10, 7))
  # Plot training data in blue
  plt.scatter(train_data, train_labels, c="b", label="Training data")
  # Plot test data in green
  plt.scatter(test_data, test_labels, c="g", label="Testing data")
  # Plot the predictions in red (predictions were made on the test data)
  plt.scatter(test_data, predictions, c="r", label="Predictions")
  # Show the legend
  plt.legend();

Now let's generate a plot using this function:

plot_predictions(train_data=X_train,
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test,
                 predictions=y_preds)

We can see our predictions aren't totally correct; let's run more experiments to improve the result.

Running experiments to improve a model


There are many ways to improve your model; here are the most common:

  1. Get more data - get more examples for your model to train on (more opportunities to learn patterns).

  2. Make your model larger (use a more complex model) - this might come in the form of more layers or more hidden units in each layer.

  3. Train for longer - give your model more of a chance to find the patterns in the data.

In a real-world situation, we often can't simply get more data, so let's experiment with options 2 and 3. We'll create three models with the following scenarios:

  1. model_1 - same as the original model, trained for 100 epochs.

  2. model_2 - 2 layers, trained for 100 epochs.

  3. model_3 - 2 layers, trained for 500 epochs.

Build model_1

# Set random seed
tf.random.set_seed(42)

# Replicate original model 
model_1 = tf.keras.Sequential([
  tf.keras.layers.Dense(1)
])

# Compile model 
model_1.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=["mae"])
              
# Fit the model to the training data
model_1.fit(tf.expand_dims(X_train, axis=-1), y_train, epochs=100) 


Let's make predictions for model_1

# Make and plot predictions for model_1
y_preds_1 = model_1.predict(X_test)
plot_predictions(predictions=y_preds_1)

Not much improvement from the previous model. Let's build model_2.


Build model_2

This time we are adding an extra dense layer, keeping everything else the same


# Set random seed
tf.random.set_seed(42)

# Replicate model_1 and add an extra layer 
model_2 = tf.keras.Sequential([
  tf.keras.layers.Dense(1),
  tf.keras.layers.Dense(1) # add a second layer,
])
# Compile model 
model_2.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=["mae"])
              
# Fit the model to the training data
model_2.fit(tf.expand_dims(X_train, axis=-1), y_train, epochs=100, verbose=0) # set verbose to 0 for less output 


Let's make predictions with model_2:

# Make and plot predictions for model_2
y_preds_2 = model_2.predict(X_test)
plot_predictions(predictions=y_preds_2)

It's looking far better than model_1 after adding an extra layer. Let's build the third model, keeping everything the same as model_2 except training for longer (500 epochs instead of 100).


Build model_3

Train this model for 500 epochs

# Set random seed
tf.random.set_seed(42)

# Replicate model_2 
model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(1),
  tf.keras.layers.Dense(1) # add a second layer,
])
# Compile model 
model_3.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.SGD(),
              metrics=["mae"])
              
# Fit the model to the training data
model_3.fit(tf.expand_dims(X_train, axis=-1), y_train, epochs=500, verbose=0) # set verbose to 0 for less output


Let's make predictions with model_3:

# Make and plot predictions for model_3
y_preds_3 = model_3.predict(X_test)
plot_predictions(predictions=y_preds_3)

Strangely, the model performed worse when trained for longer. As it turns out, when you train a model for too long, the results can get worse.
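One common guard against over-training is early stopping, which halts training once a monitored metric stops improving; a sketch using Keras' EarlyStopping callback (not part of the original experiment):

# Stop training once the loss hasn't improved for 10 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=10,
                                              restore_best_weights=True)
model_3.fit(tf.expand_dims(X_train, axis=-1), y_train, epochs=500,
            verbose=0, callbacks=[early_stop])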


Comparing the results


Now that we have the results of three similar models, let's compare them.

Before we compare the three models, let's create two functions that calculate the mean absolute error and mean squared error between the test labels and predictions.


def mae(y_test, y_pred):
  """
  Calculates mean absolute error between y_test and y_preds.
  """
  return tf.metrics.mean_absolute_error(y_test,
                                        y_pred)

def mse(y_test, y_pred):
  """
  Calculates mean squared error between y_test and y_preds.
  """
  return tf.metrics.mean_squared_error(y_test,
                                       y_pred)

Now let's calculate MAE and MSE for all three models.

# Calculate model_1 metrics
mae_1 = mae(y_test, y_preds_1.squeeze()).numpy()
mse_1 = mse(y_test, y_preds_1.squeeze()).numpy()
mae_1, mse_1


# Calculate model_2 metrics
mae_2 = mae(y_test, y_preds_2.squeeze()).numpy()
mse_2 = mse(y_test, y_preds_2.squeeze()).numpy()
mae_2, mse_2


# Calculate model_3 metrics
mae_3 = mae(y_test, y_preds_3.squeeze()).numpy()
mse_3 = mse(y_test, y_preds_3.squeeze()).numpy()
mae_3, mse_3


Let's compare the models:

import pandas as pd
model_results = [["model_1", mae_1, mse_1],
                 ["model_2", mae_2, mse_2],
                 ["model_3", mae_3, mse_3]]
all_results = pd.DataFrame(model_results, columns=["model", "mae", "mse"])
all_results               

The result of our experiments is that model_2 performs the best of the three. Comparing models is tedious; here we compared three, but in a real-world scenario we may need to compare many more.


But this is part of what machine learning modeling is about, trying many different combinations of models and seeing which performs best.


Another thing you'll also find is what you thought may work (such as training a model for longer) may not always work and the exact opposite is also often the case.


 
 
 
