Predict Sales Using a Neural Network

In this 20-minute tutorial, you will learn how to build your own neural network from scratch that can predict lemonade sales! You will train the network on a synthetic dataset and code the entire project yourself in Python with the popular A.I. library PyTorch, following along with the instructor.

Level:  Intermediate
Time:  20 minutes
Equipment:  Google Chrome Browser

Overview

In this free tutorial, you will build a neural network from scratch and train it to make sales predictions using a simple, synthetic dataset. You will be introduced to PyTorch, a deep learning library originally developed by Meta's AI group that powers many A.I. applications around the world today. We will work with a synthetic dataset that records the number of lemonades sold each day at a lemonade stand. After training, the neural network will be able to predict how many lemonades are likely to be sold on a given day.

Setup Option 1 – Use Google Colab (Simplest Way)

In order to complete this tutorial using Google Colab:

Watch our free tutorial on Google Colab Overview (optional)
Head over to Google Colab and load the starter notebook by clicking on the link below
https://colab.research.google.com/github/LeakyAI/BirdDetector/blob/main/BirdDetector%20-%20START%20HERE.ipynb
Follow along with the video tutorial above to complete the notebook

Setup Option 2 – Run Notebook Directly on Your Own Machine

If you are using your own laptop or desktop to run the notebook locally, we recommend you complete the tutorial on how to configure your PC for A.I. and then complete this tutorial. Follow the steps below:

Complete the How to Configure your PC for A.I. (20 Minutes)
https://www.leaky.ai/configure-pc-for-ai-20-minutes
Return to this tutorial and open up the notebook FirstNeuralNetwork - Start Here.ipynb
Follow along with the video

Step 1 - Set Up our Environment

Let's start by importing the software libraries we will need to build our neural network. We will import PyTorch and check the version of PyTorch that has been imported. You will usually want to run the latest version. You can always check the latest version by heading over to PyTorch.org: https://pytorch.org/

# Import PyTorch libraries
import torch
from torch import nn 

# Import visualization library
import matplotlib.pyplot as plt 

# Verify PyTorch version
torch.__version__

Check Our Processing Capability (CPU vs. GPU)

When developing A.I. projects, it will help to have a powerful GPU. While this project does not require one, the code below will detect if one is present in your environment and use it during the training process.

# Check to see if we have a GPU to use for training
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('A {} device was detected.'.format(device)) 

# Print the name of the cuda device, if detected
if device=='cuda':
  print (torch.cuda.get_device_name(device=device))

Step 2 - Download and Prepare our Dataset

When training a neural network from scratch, you will usually need a lot of data. Here we will load one year of lemonade stand data (365 rows), which is a small, simple synthetic dataset. Each row describes one day: whether it was a weekend, whether it was sunny, whether it was warm, whether a big sign was up to advertise the stand, and the price charged. The last column is the number of lemonades sold that day. Our neural network will be trained to predict the number of lemonades sold (output) from the other attributes (inputs).

# Use Pandas to do our data processing on the dataset, start by downloading the dataset
import pandas as pd
url = 'https://raw.githubusercontent.com/LeakyAI/FirstNeuralNet/main/lemons.csv'
df = pd.read_csv(url)

# Explore the first 10 rows of the dataset
df.head(10)

# Check the size/shape of our dataset
df.shape

Create our Inputs and Outputs for Training our Neural Network

The data has been collected in a table with the following columns: Weekend, Sunny, Warm, BigSign, Price and NumberSold. While the dataset is more or less ready to be used, two fields (Price and NumberSold) contain real values. Neural networks are usually easier to train when their input and output values are small and centered around zero. To accomplish this, we will standardize both columns: subtract the column's mean and divide by its standard deviation. The result has a mean of 0 and a standard deviation of 1, so most values land within a few units of zero rather than strictly inside -1..1.

# Calculate the mean and standard deviation of the price column, then standardize the price column
priceMean = df['Price'].mean()
priceStd = df['Price'].std()
df['Price'] = (df['Price']-priceMean)/priceStd

# Calculate the mean and standard deviation of the NumberSold column, then standardize NumberSold
numSoldMean = df['NumberSold'].mean()
numSoldStd = df['NumberSold'].std()
df['NumberSold'] = (df['NumberSold']-numSoldMean)/numSoldStd
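
To see what standardizing does, here is a minimal sketch using only Python's standard library. The five prices are made-up values for illustration, not rows from lemons.csv. Note that statistics.stdev computes the sample standard deviation, which is also what pandas' .std() uses by default.

```python
import statistics

# Hypothetical prices, invented for this example (not taken from lemons.csv)
prices = [3.0, 4.0, 5.0, 6.0, 7.0]

price_mean = statistics.mean(prices)
price_std = statistics.stdev(prices)   # sample standard deviation

# Standardize: subtract the mean, then divide by the standard deviation
standardized = [(p - price_mean) / price_std for p in prices]

# Undo it later with the same two numbers
restored = [z * price_std + price_mean for z in standardized]

print([round(z, 3) for z in standardized])
print(restored)
```

The standardized values are centered on zero, and some of them can fall a little outside -1..1. Keeping the mean and standard deviation around matters because the same two numbers are needed later to convert predictions back into real units.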

Create our Input (x) and Output (y) to Train our Neural Network

Here you will create the input (x) and output (y) data needed to train our network. The number we want our neural network to predict is the column called 'NumberSold'. This will be the output (y). We will need to separate our input (Weekend, Sunny, Warm, BigSign, Price) from the output (NumberSold). Here we will use PyTorch tensors which are just multi-dimensional arrays where all values must be of the same type (usually floats).

# Create our PyTorch tensors and move to CPU or GPU if available
# Extract the inputs and create a PyTorch tensor x (inputs)
inputs = ['Weekend','Sunny','Warm','BigSign','Price']
x = torch.tensor(df[inputs].values,dtype=torch.float, device=device)

# Extract the outputs and create a PyTorch tensor y (outputs)
outputs = ['NumberSold']
y = torch.tensor(df[outputs].values,dtype=torch.float, device=device)

# Explore the first 5 inputs
x[0:5]
# Explore the first 5 outputs
y[0:5]
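
To make the idea of a tensor concrete, the sketch below builds a tiny one by hand. The three rows are made-up values in the same column order as the inputs list (Weekend, Sunny, Warm, BigSign, Price), and x_demo is a name chosen for this example only.

```python
import torch

# Hypothetical rows, invented for this example.
# Column order: Weekend, Sunny, Warm, BigSign, Price (standardized)
rows = [
    [1, 1, 0, 1, -0.5],
    [0, 1, 1, 0, 0.3],
    [1, 0, 0, 0, 1.2],
]

# Every value is stored as the same type (32-bit float here)
x_demo = torch.tensor(rows, dtype=torch.float)

print(x_demo.shape)   # number of rows, number of columns
print(x_demo.dtype)   # torch.float is an alias for torch.float32
print(x_demo[0])      # a single row, which is one input sample
```

With the real dataset, x should have one row per day and 5 columns, and y should have one row per day and a single column.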

Step 3 - Build your Neural Network

Below you will build a simple neural network that will take as input the 5 input values above ('Weekend', 'Sunny', 'Warm', 'BigSign', 'Price') and produce a single value as an output. This network has a single hidden layer of 100 units.

# Define your PyTorch neural network
# Number of Inputs: 5
# Number of Hidden Units: 100
# Number of Hidden Layers: 1
# Activation Function:  ReLU
# Number of Outputs: 1

model = nn.Sequential(
            nn.Linear(5,100),
            nn.ReLU(),
            nn.Linear(100,1)
        )

# Move it to either the CPU or GPU depending on what we have available
model.to(device)
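
One way to check that a network matches its description is to count its trainable parameters. The sketch below rebuilds the same architecture under the hypothetical name demo_model so that it does not touch your model. The first layer has 5 x 100 weights plus 100 biases, and the second has 100 x 1 weights plus 1 bias, for 701 in total.

```python
import torch
from torch import nn

# Same layout as the tutorial's network, under a separate name
demo_model = nn.Sequential(
    nn.Linear(5, 100),
    nn.ReLU(),
    nn.Linear(100, 1),
)

# List each weight and bias tensor with its shape
for name, param in demo_model.named_parameters():
    print(name, tuple(param.shape))

# Count every trainable value in the network
total_params = sum(p.numel() for p in demo_model.parameters())
print("Total parameters:", total_params)

# The network also accepts several samples at once (here, 4 rows of 5 inputs)
batch = torch.zeros(4, 5)
out = demo_model(batch)
print(out.shape)
```

Each row of the batch produces one output value, so 4 input rows give 4 predictions.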

Step 4 - Train your Neural Network

Here we will train our neural network on the dataset (our set of inputs and outputs above). During training, the loop adjusts the weights within our neural network so that its predictions move closer to the actual number of lemonades sold.

import torch.optim as optim

# Measure our neural network by mean square error
criterion = torch.nn.MSELoss()

# Train our network with a simple SGD optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Train our network using the entire dataset 5 times
for epoch in range(5):
    totalLoss = 0
    for i in range(len(x)):
        # Single Forward Pass
        ypred = model(x[i])

        # Measure how well the model predicted vs the actual value
        loss = criterion(ypred, y[i])

        # Track how well the model predicted (called loss)
        totalLoss+=loss.item()

        # Update the neural network
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Print out our loss after each epoch (one full pass over the dataset)
    print ("Total Loss: ", totalLoss)
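
The three calls at the end of the loop (zero_grad, backward, step) can feel like magic, so here is a rough sketch of what an SGD step with momentum does, written in plain Python for a toy "network" with a single weight. It assumes the update rule velocity = momentum * velocity + gradient, followed by weight = weight - lr * velocity, which is how PyTorch documents its SGD momentum update with default settings. The target value 3.0 and the number of steps are arbitrary choices for illustration.

```python
# Toy problem: a single weight w should learn to output a target value.
target = 3.0       # arbitrary value chosen for this illustration
w = 0.0            # the one "weight" in our toy model
velocity = 0.0     # momentum buffer, starts at zero
lr = 0.01          # same learning rate as the tutorial's optimizer
momentum = 0.9     # same momentum as the tutorial's optimizer

for step in range(500):
    pred = w                          # forward pass
    loss = (pred - target) ** 2       # squared error for one sample
    grad = 2 * (pred - target)        # d(loss)/dw, what backward() would compute
    velocity = momentum * velocity + grad
    w = w - lr * velocity             # what optimizer.step() would apply

print("learned w:", w)
print("final loss:", (w - target) ** 2)
```

The weight overshoots and swings back a few times before settling near the target, which is typical behavior for momentum. In the real network, the same kind of update is applied to every weight and bias at once.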

Step 5 - Analyze the Network's Performance

Next, we want to analyze the network's performance. That is, how well is our network able to make predictions? We can find out with a simple scatter plot: the horizontal axis shows the actual number of lemonades sold on a particular day and the vertical axis shows the number predicted by our neural network. If the network makes good predictions, the dots should fall close to a straight diagonal line!

# Plot predictions vs. true values
@torch.no_grad()
def graphPredictions(model, x, y , minValue, maxValue):
    model.eval()                               # Set the model to inference mode

    predictions=[]                             # Track predictions
    actual=[]                                  # Track the actual labels

    x = x.to(device)                           # .to() returns a new tensor, so reassign
    y = y.to(device)
    model.to(device)

    for i in range(len(x)):
        # Single forward pass
        pred = model(x[i])                               
        
        # Un-normalize our prediction
        pred = pred*numSoldStd+numSoldMean
        act = y[i]*numSoldStd+numSoldMean

        # Save prediction and actual label
        predictions.append(pred.item())
        actual.append(act.item())

    # Plot actuals vs predictions
    plt.scatter(actual, predictions)
    plt.xlabel('Actual Lemonades Sold')
    plt.ylabel('Predicted Lemonades Sold')
    plt.plot([minValue,maxValue], [minValue,maxValue])
    plt.xlim(minValue, maxValue)
    plt.ylim(minValue, maxValue)

    # Make the display equal in both dimensions
    plt.gca().set_aspect('equal', adjustable='box')
    plt.show()

graphPredictions(model, x, y, 0, 300)

Wow, our neural network did a really good job learning how to predict the number of lemonades sold based on all the inputs. With a chart like the one above, the closer the dots are to the line, the better the neural network predicted the number of lemonades sold compared to the actual number for that day.
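
A chart is a good first look, but a single number makes it easier to compare training runs. One common choice is the mean absolute error: the average gap between predicted and actual sales, measured in lemonades. The sketch below is an optional addition to the tutorial, and the sample values are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values, invented for this example
actual_demo = [100, 130, 260]
predicted_demo = [98.0, 135.0, 250.0]

mae = mean_absolute_error(actual_demo, predicted_demo)
print("Mean absolute error:", round(mae, 2), "lemonades")
```

To use it on the real model, collect the un-normalized actual and predicted values as plain Python numbers and pass the two lists in. A smaller value means the dots sit closer to the diagonal line.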

Step 6 - Test with Your Own Predictions

Since our network is now trained, we can use it to make new predictions by passing in new input values. Because the dataset is synthetic, we know the formula that generated it, so we can compute the true answer and check how close the prediction is.

# Below we use the synthetic data generator formula to
# determine what the actual result should have been.
def datasetGenerator(weekend, sunny, warm, bigsign, price):
    numlemonssold = 0
    if weekend:
        numlemonssold = (sunny*5  + int(500 / price))
        if bigsign:
            numlemonssold = 1.3 * numlemonssold
        if warm:
            numlemonssold = 2 * numlemonssold
        if sunny:
            numlemonssold = 1.25 * numlemonssold
    numlemonssold = int(numlemonssold)   

    return numlemonssold

# Data that affects the number of lemonades sold in one day
weekend = 1
sunny = 0
warm = 0   
bigsign = 1
price = 5

# Calculate what would have been the actual result using
# the synthetic dataset's algorithm
actual = datasetGenerator(weekend, sunny, warm, bigsign, price)

# Use the CPU as we just need to do a single pass
model.to('cpu')

# Normalize our inputs using the same values for our training
price = (price - priceMean) / priceStd

# Create our input tensor
x1 = torch.tensor([weekend, sunny, warm, bigsign, price], dtype=torch.float)

# Pass the input into the neural network
y1 = model(x1)

# Un-normalize our output y1
y1 = y1*numSoldStd+numSoldMean

# Compare what your network predicted to the actual
print ("Neural Network Predicts: ", y1.item())
print ("Actual Result: ", actual)
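
For the inputs chosen above, the expected answer can be worked out by hand from datasetGenerator. The short sketch below just repeats that arithmetic step by step for weekend = 1, sunny = 0, warm = 0, bigsign = 1 and price = 5. The variable names are chosen for illustration only.

```python
# It is a weekend day, so the formula applies: sunny*5 + int(500 / price)
base = 0 * 5 + int(500 / 5)       # sunny = 0 and price = 5 give 100

# A big sign multiplies sales by 1.3; warm and sunny are both 0,
# so no other multipliers apply
with_sign = 1.3 * base

# The generator truncates the result to a whole number
expected_actual = int(with_sign)
print("Expected actual result:", expected_actual)   # 130
```

A well-trained network should print a prediction reasonably close to this value. Try changing the inputs and repeating the calculation to see where the network does better or worse.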

Solution

If you run into any trouble, have a look at the solution notebook below:

https://colab.research.google.com/github/LeakyAI/FirstNeuralNet/blob/main/FirstNeuralNetworkSolution.ipynb

Congratulations!

This is just the beginning, but hopefully you can see the power of neural networks! There are lots of additional things you could do to make this project a lot better, including:

Breaking the dataset up into training, validation and testing sets
Attempting the same approach using a real-world dataset
Tuning the training process to produce more accurate predictions
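
As a starting point for the first suggestion, here is one possible way to split the 365 rows into training, validation and testing sets using only Python's standard library. The function name split_indices, the 70/15/15 proportions and the seed are assumptions made for this sketch, not part of the original notebook.

```python
import random

def split_indices(n, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the row indices 0..n-1 and cut them into three groups."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)    # seeded so the split is repeatable

    n_train = int(train_frac * n)
    n_val = int(val_frac * n)

    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]    # whatever remains
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(365)
print(len(train_idx), len(val_idx), len(test_idx))
```

PyTorch tensors accept a list of indices, so x[train_idx] and y[train_idx] would select the training rows. You would then train only on those rows, use the validation rows to tune settings, and keep the testing rows for a final check.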

To learn more about the techniques above, you can always check out our Complete A.I. Programming Course, where you can learn everything you need to know to start building your own A.I. projects!

Happy Learning!

leaky.ai team
