This is an automated email from the ASF dual-hosted git repository. damccorm pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push: new c1a5ce7bf0a Add notebook for image processing using beam (#27034) c1a5ce7bf0a is described below commit c1a5ce7bf0aee88ee8f827a72a15c4ce4b5b14d3 Author: Reeba Qureshi <64488642+reeba...@users.noreply.github.com> AuthorDate: Tue Jun 27 21:33:52 2023 +0530 Add notebook for image processing using beam (#27034) * Add notebook for image processing using beam * Delete image_processing_tensorflow.ipynb * add image processing use case using tensorflow * Delete image_processing_tensorflow.ipynb * Add image processing use case after suggestions * Delete image_processing_beam.ipynb * Add image processing with implemented suggestions --- .../beam-ml/image_processing_tensorflow.ipynb | 951 +++++++++++++++++++++ 1 file changed, 951 insertions(+) diff --git a/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb b/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb new file mode 100644 index 00000000000..0914653f1de --- /dev/null +++ b/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb @@ -0,0 +1,951 @@ +{ + "cells": [ + { + "cell_type": "code", + "source": [ + "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n", + "\n", + "# Licensed to the Apache Software Foundation (ASF) under one\n", + "# or more contributor license agreements. See the NOTICE file\n", + "# distributed with this work for additional information\n", + "# regarding copyright ownership. The ASF licenses this file\n", + "# to you under the Apache License, Version 2.0 (the\n", + "# \"License\"); you may not use this file except in compliance\n", + "# with the License. You may obtain a copy of the License at\n", + "#\n", + "# http://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing,\n", + "# software distributed under the License is distributed on an\n", + "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n", + "# KIND, either express or implied. See the License for the\n", + "# specific language governing permissions and limitations\n", + "# under the License" + ], + "metadata": { + "id": "NsNImDL8TGM1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Image Processing using Apache Beam\n", + "\n", + "<table align=\"left\">\n", + " <td>\n", + " <a target=\"_blank\" href=\"https://colab.sandbox.google.com/github/apache/beam/blob/master/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb\"><img src=\"https://raw.githubusercontent.com/google/or-tools/main/tools/colab_32px.png\" />Run in Google Colab</a>\n", + " </td>\n", + " <td>\n", + " <a target=\"_blank\" href=\"https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb\"><img src=\"https://raw.githubusercontent.com/google/or-tools/main/tools/github_32px.png\" />View source on GitHub</a>\n", + " </td>\n", + "</table>\n", + "\n" + ], + "metadata": { + "id": "SwN0Rj4cJSg5" + } + }, + { + "cell_type": "markdown", + "source": [ + "Image processing is a machine learning technique for reading, analyzing, and extracting meaningful information from images. It involves multiple steps, such as applying various preprocessing functions, getting predictions from a model, and storing the predictions in a useful format. Apache Beam is a suitable tool for handling these tasks and building a structured workflow.
This notebook demonstrates the use of Apache Beam in image processing and performs the following:\n", + "* Import and preprocess the CIFAR-10 dataset\n", + "* Train a TensorFlow model to classify images\n", + "* Store the model in Google Cloud and create a model handler\n", + "* Build a Beam pipeline to:\n", + " 1. Create a [PCollection](https://beam.apache.org/documentation/programming-guide/#pcollections) of input images\n", + " 2. Perform preprocessing [transforms](https://beam.apache.org/documentation/programming-guide/#transforms)\n", + " 3. Use RunInference to get predictions from the previously trained model\n", + " 4. Store the results\n", + "\n", + "For more information on using Apache Beam for machine learning, have a look at [AI/ML Pipelines using Beam](https://beam.apache.org/documentation/ml/overview/)." + ], + "metadata": { + "id": "yxLoBQxocAOv" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OSZrRmHl9NQY" + }, + "source": [ + "## Installing Apache Beam" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MO7iNmvkBdA5", + "outputId": "6c76e29d-3c70-4c3e-aca2-7cc1dcd167a1" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m14.3/14.3 MB\u001b[0m \u001b[31m28.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m89.7/89.7 kB\u001b[0m \u001b[31m8.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m137.0/137.0 kB\u001b[0m \u001b[31m15.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m152.0/152.0 kB\u001b[0m \u001b[31m14.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.6/2.6 MB\u001b[0m \u001b[31m44.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m648.9/648.9 kB\u001b[0m \u001b[31m46.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.7/2.7 MB\u001b[0m \u001b[31m84.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m283.7/283.7 kB\u001b[0m \u001b[31m27.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Building wheel for crcmod (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Building wheel for dill (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Building wheel for docopt (setup.py) ...
\u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m307.5/307.5 kB\u001b[0m \u001b[31m7.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m152.8/152.8 kB\u001b[0m \u001b[31m16.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m138.3/138.3 kB\u001b[0m \u001b[31m13.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m798.7/798.7 kB\u001b[0m \u001b[31m24.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m46.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.1/2.1 MB\u001b[0m \u001b[31m60.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Building wheel for timeloop (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", + "google-colab 1.0.0 requires ipykernel==5.5.6, but you have ipykernel 6.23.2 which is incompatible.\n", + "google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.14.0 which is incompatible.\u001b[0m\u001b[31m\n", + "\u001b[0m" + ] + } + ], + "source": [ + "!pip install apache_beam --quiet\n", + "!pip install apache-beam[interactive] --quiet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "45mf7oHu9XbI" + }, + "source": [ + "## Importing necessary libraries\n", + "Here is a brief overview of the uses of each library imported:\n", + "* **NumPy**: Multidimensional numpy arrays are used to store images, and the library also allows performing various operations on them.\n", + "* **Matplotlib**: Displays images stored in numpy array format.\n", + "* **TensorFlow**: Trains a machine learning model.\n", + "* **TFModelHandlerNumpy**: Defines the configuration used to load/use the model that we train. We use `TFModelHandlerNumpy` because the model was trained with TensorFlow and takes numpy arrays as input.\n", + "* **RunInference**: Loads the model and obtains predictions as part of the Apache Beam pipeline. For more information, see [docs on prediction and inference](https://beam.apache.org/documentation/ml/inference-overview/).\n", + "* **Apache Beam**: Builds a pipeline for Image Processing." 
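As an optional sanity check after installation (not part of the original pipeline), you can confirm which versions the Colab runtime picked up; both `apache_beam` and `tensorflow` expose the standard `__version__` attribute:

```python
import apache_beam as beam
import tensorflow as tf

# Confirm the freshly installed packages were picked up by the runtime.
print("Apache Beam:", beam.__version__)
print("TensorFlow:", tf.__version__)
```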
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "z5_PUeZgOygU" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import tensorflow as tf\n", + "from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy\n", + "from apache_beam.ml.inference.base import RunInference\n", + "import apache_beam as beam" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x3tSAqP7R2rZ" + }, + "source": [ + "## CIFAR-10 Dataset\n", + "CIFAR-10 is a popular dataset used for multiclass object classification.\n", + "It has 60,000 images of the following 10 categories:\n", + "\n", + "* airplane\n", + "* automobile\n", + "* bird\n", + "* cat\n", + "* deer\n", + "* dog\n", + "* frog\n", + "* horse\n", + "* ship\n", + "* truck\n", + "\n", + "The dataset can be directly imported from the TensorFlow library." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MqylmjBhPCOW", + "outputId": "9d9f5854-80f2-4a4f-a52b-2b81d6295639" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz\n", + "170498071/170498071 [==============================] - 4s 0us/step\n" + ] + } + ], + "source": [ + "(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()" + ] + }, + { + "cell_type": "code", + "source": [ + "x_test.shape" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "pfzkgryZUV8P", + "outputId": "79bc798f-f93b-4d7b-8783-c5defa6a2322" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(10000, 32, 32, 3)" + ] + }, + "metadata": {}, + "execution_count": 4 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "The labels in y_train and y_test are numeric, with each number representing a class. The labels list defined below contains the various classes, and their positions in the list represent the corresponding number used to refer to them." + ], + "metadata": { + "id": "6hEHIHPsVxw4" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3uImFIBXv0My" + }, + "outputs": [], + "source": [ + "labels = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse','Ship', 'Truck']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "zeE81PNOcGfZ", + "outputId": "d2a08cb5-4fdc-47af-c2b2-5602e7600f09" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f441be49840>" + ] + }, + "metadata": {}, + "execution_count": 6 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAvsklEQVR4nO3df3Bc5Xn3/8/Z1e5KsqSVZVuShWVjG2Pzy85TBxwNCSXYxXanDARPB5LM1KR8YaAyU3DTJO4kEGg7SslMQpJxzB+luHkmhoQ+MQx8GyiYWDStTWsHPw5QHOwYbGJLBtv6rf2hPff3D76oFdhwX7bk2xLv18zOWNrLl+5zzu5eOtrdz0bOOScAAM6wROgFAAA+nhhAAIAgGEAAgCAYQACAIBhAAIAgGEAAgCAYQACAIBhAAIAgykIv4P3iONahQ4dUXV2tKIpCLwcAYOScU29vr5qampRInP [...] 
+ }, + "metadata": {} + } + ], + "source": [ + "plt.imshow(x_train[800])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "4arvJDYwfsAj", + "outputId": "a355a4e2-c1a7-461e-bff9-059daaa6a9f7" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(32, 32, 3)" + ] + }, + "metadata": {}, + "execution_count": 7 + } + ], + "source": [ + "x_train[0].shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ndeZ_RH32Upu" + }, + "source": [ + "(32, 32, 3) represents an image of size 32x32 in the RGB scale" + ] + }, + { + "cell_type": "markdown", + "source": [ + "### Preprocessing" + ], + "metadata": { + "id": "L2pg1uxSXPHn" + } + }, + { + "cell_type": "markdown", + "source": [ + "**Standardization** is the process of transforming the pixel values of an image to have zero mean and unit variance. This brings the pixel values to a similar scale and makes them easier to work with." + ], + "metadata": { + "id": "Hwwm-EHhW0rC" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZlInmab9MD-N" + }, + "outputs": [], + "source": [ + "x_train = x_train/255.0" + ] + }, + { + "cell_type": "markdown", + "source": [ + "**Normalization** is the process of scaling the pixel values to a specified range, typically between 0 and 1. This improves the consistency of images." + ], + "metadata": { + "id": "6GFdU-HZWztg" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "TLmsgV9_Wij5", + "outputId": "03fb00c5-efb9-421c-ef55-bbd36679dfbe" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f4412adeb30>" + ] + }, + "metadata": {}, + "execution_count": 9 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAvsklEQVR4nO3df3Bc5Xn3/8/Z1e5KsqSVZVuShWVjG2Pzy85TBxwNCSXYxXanDARPB5LM1KR8YaAyU3DTJO4kEGg7SslMQpJxzB+luHkmhoQ+MQx8GyiYWDStTWsHPw5QHOwYbGJLBtv6rf2hPff3D76oFdhwX7bk2xLv18zOWNrLl+5zzu5eOtrdz0bOOScAAM6wROgFAAA+nhhAAIAgGEAAgCAYQACAIBhAAIAgGEAAgCAYQACAIBhAAIAgykIv4P3iONahQ4dUXV2tKIpCLwcAYOScU29vr5qampRInP [...] + }, + "metadata": {} + } + ], + "source": [ + "x_train = (x_train - np.min(x_train)) / (np.max(x_train) - np.min(x_train))\n", + "plt.imshow(x_train[800])" + ] + }, + { + "cell_type": "markdown", + "source": [ + "**Grayscale Conversion** refers to the conversion of a colored image in RGB scale into a grayscale image. It represents the pixel intensities without considering colors, which makes calculations easier." 
+ ], + "metadata": { + "id": "bfgy0Z_gX_lH" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "P2oPvZkbfEPo" + }, + "outputs": [], + "source": [ + "grayscale = []\n", + "for i in x_train:\n", + " grayImage = 0.07 * i[:,:,2] + 0.72 * i[:,:,1] + 0.21 * i[:,:,0]\n", + " grayscale.append(grayImage)\n", + "x_train_gray = np.asarray(grayscale)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jzQ2Zulg99NU" + }, + "source": [ + "## Defining DoFns for Image Preprocessing\n", + "\n", + "[DoFn](https://beam.apache.org/releases/typedoc/current/interfaces/transforms_pardo.DoFn) stands for \"Do Function\". In Apache Beam, it is a set of operations that can be applied to individual elements of a PCollection (a collection of data). It is similar to a function in Python, except that it is used in Beam Pipelines to apply various transformations. DoFns can be used in various Apache Beam transforms, such as ParDo, Map, Filter, and FlatMap." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "cqm-m0cONsZS" + }, + "outputs": [], + "source": [ + "class StandardizeImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = element/255.0\n", + " return [element]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "mZhFCgPxPEwm" + }, + "outputs": [], + "source": [ + "class NormalizeImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = (element-element.min())/(element.max()-element.min())\n", + " return [element]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "gv23KPt5NyXT" + }, + "outputs": [], + "source": [ + "class GrayscaleImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = 0.07 * element[:,:,2] + 0.72 * element[:,:,1] + 0.21 * element[:,:,0]\n", + " return [element]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8gz7_SvN-P2L" + }, + "source": [ + "## Training a Convolutional Neural Network\n", + "\n", + "A Convolutional Neural Network (CNN) is one of the most popular model types for image processing. Here is a brief description of the convolutional layers used in the model.\n", + "* **Reshape**: Changes the shape of the input data to the desired size.\n", + "The CIFAR-10 images are of 32x32 pixels in grayscale. We will train our model using these images and thus, all images fed into the model need to be reshaped to the required size, that is (32,32,1).\n", + "* **Conv2D**: Applies a set of filters to extract features from the input image, producing a feature map as the output.\n", + "This layer is used as it is an essential component of a CNN, and does the major task of finding patterns in images.\n", + "* **MaxPooling2D**: Reduces the spatial dimensions of the input while retaining the most prominent features.\n", + "We use this layer to downsample the images and preserve only the important features.\n", + "* **Flatten**: Flattens the input data or feature maps into a 1-dimensional vector.\n", + "The input images are 2-dimensional. However in the end we require our results in a 1-D array. Flatten layer is used for this.\n", + "* **Dense**: Connects every neuron in the current layer to every neuron in the subsequent layer.\n", + "The CIFAR-10 dataset contains images belonging to 10 different classes. 
This is why the last dense layer gives 10 outputs, where each output is a raw score (logit) indicating how strongly the image matches one of the 10 classes." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "chYD9y7cH4Td", + "outputId": "b91eed4e-c383-40b8-96dc-8cf330d11a09" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model: \"sequential\"\n", + "_________________________________________________________________\n", + " Layer (type) Output Shape Param # \n", + "=================================================================\n", + " reshape (Reshape) (None, 32, 32, 1) 0 \n", + " \n", + " conv2d (Conv2D) (None, 30, 30, 32) 320 \n", + " \n", + " max_pooling2d (MaxPooling2D (None, 15, 15, 32) 0 \n", + " ) \n", + " \n", + " conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 \n", + " \n", + " max_pooling2d_1 (MaxPooling (None, 6, 6, 64) 0 \n", + " 2D) \n", + " \n", + " conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 \n", + " \n", + " flatten (Flatten) (None, 1024) 0 \n", + " \n", + " dense (Dense) (None, 64) 65600 \n", + " \n", + " dense_1 (Dense) (None, 10) 650 \n", + " \n", + "=================================================================\n", + "Total params: 121,994\n", + "Trainable params: 121,994\n", + "Non-trainable params: 0\n", + "_________________________________________________________________\n" + ] + } + ], + "source": [ + "def create_model():\n", + " model = tf.keras.Sequential([\n", + " tf.keras.layers.Reshape((32,32,1),input_shape=x_train_gray.shape[1:]),\n", + " tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),\n", + " tf.keras.layers.MaxPooling2D((2, 2)),\n", + " tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),\n", + " tf.keras.layers.MaxPooling2D((2, 2)),\n", + " tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),\n", + " tf.keras.layers.Flatten(),\n", + " tf.keras.layers.Dense(64, activation='relu'),\n", + " tf.keras.layers.Dense(10)\n", + " ])\n", + " # The last layer has no activation, so the model outputs logits;\n", + " # tell the loss to expect logits.\n", + " model.compile(optimizer='adam',\n", + " loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " metrics=['accuracy'])\n", + " return model\n", + "\n", + "model = create_model()\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "source": [ + "The input shape is changed to (32,32,1) as our input images are 32 x 32 pixels and 1 represents the single grayscale channel. The final dense layer has 10 outputs because there are 10 possible classes in the CIFAR-10 dataset."
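Because the final `Dense(10)` layer has no activation, the model emits raw logits rather than probabilities. If probabilities are needed, a softmax can be applied after prediction; a small sketch using standard TensorFlow, to be run once the model has been fit:

```python
import tensorflow as tf

# Raw logits for the first five grayscale training images.
logits = model.predict(x_train_gray[:5])

# Softmax rescales the 10 logits per image into class probabilities summing to 1.
probabilities = tf.nn.softmax(logits, axis=-1).numpy()
print(probabilities.round(3))
```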
+ ], + "metadata": { + "id": "UpO090vNbHir" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3D1SdCC94DQi" + }, + "source": [ + "## Fitting the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "KuMP46UVXkof", + "outputId": "5340424c-7a33-45c9-b773-3ff625c65290" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch 1/10\n", + "1563/1563 [==============================] - 87s 55ms/step - loss: 1.6511 - accuracy: 0.4054\n", + "Epoch 2/10\n", + "1563/1563 [==============================] - 84s 54ms/step - loss: 1.2737 - accuracy: 0.5540\n", + "Epoch 3/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 1.1204 - accuracy: 0.6095\n", + "Epoch 4/10\n", + "1563/1563 [==============================] - 79s 51ms/step - loss: 1.0184 - accuracy: 0.6461\n", + "Epoch 5/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.9430 - accuracy: 0.6724\n", + "Epoch 6/10\n", + "1563/1563 [==============================] - 81s 52ms/step - loss: 0.8810 - accuracy: 0.6946\n", + "Epoch 7/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.8299 - accuracy: 0.7135\n", + "Epoch 8/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.7904 - accuracy: 0.7248\n", + "Epoch 9/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.7504 - accuracy: 0.7385\n", + "Epoch 10/10\n", + "1563/1563 [==============================] - 84s 54ms/step - loss: 0.7150 - accuracy: 0.7498\n" + ] + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<keras.callbacks.History at 0x7f4412ead3c0>" + ] + }, + "metadata": {}, + "execution_count": 22 + } + ], + "source": [ + "model.fit(x_train_gray, y_train, epochs=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3obkK5fQ4KG6" + }, + "source": [ + "## Authenticating from Google Cloud\n", + "\n", + "We need to store our trained model in Google Cloud. For running inferences, we will load our model from cloud into the notebook using a Model Handler." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Jj_NX568mhsE" + }, + "outputs": [], + "source": [ + "from google.colab import auth\n", + "auth.authenticate_user()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7jn7HYPi4hH9" + }, + "source": [ + "Saving the trained model in a Google Cloud Storage bucket" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "CDqLPkzLfrt1" + }, + "outputs": [], + "source": [ + "save_model_dir = '' # Add the link to you GCS bucket here\n", + "model.save(save_model_dir)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "A model handler is used to save, load and manage trained ML models. Here we used TFModelHandlerNumpy as our input images are in the form of numpy arrays." + ], + "metadata": { + "id": "ilYg15uOcZSY" + } + }, + { + "cell_type": "code", + "source": [ + "model_handler = TFModelHandlerNumpy(save_model_dir)" + ], + "metadata": { + "id": "RqefS6I_c1kc" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Saving predictions\n", + "RunInference returns the predictions for each class. In the below DoFn, the maximum predicion is selected (which refers to the class the input image most probably belongs to) and is stored in a list of predictions." 
+ ], + "metadata": { + "id": "4vJZEXFOboUi" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9T_YW19P46dW" + }, + "outputs": [], + "source": [ + "from tensorflow.python.ops.numpy_ops import np_config\n", + "np_config.enable_numpy_behavior()\n", + "predictions = []\n", + "class SavePredictions(beam.DoFn):\n", + " def process(self, element, *args, **kwargs):\n", + " list_of_predictions = element.inference.tolist()\n", + " highest_prediction = max(list_of_predictions)\n", + " ans = labels[list_of_predictions.index(highest_prediction)]\n", + " predictions.append(ans)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "c5x1HD_V4-RS" + }, + "source": [ + "## Building a Beam Pipeline\n", + "\n", + "A Pipeline represents the workflow of a series of computations. Here we are performing the following tasks in our pipeline:\n", + "* Creating a PCollection of the data on which we need to run inference\n", + "* Appying the Image Preprocessing DoFns we defined earlier <br>\n", + " These include:\n", + " 1. Standardization\n", + " 2. Normalization\n", + " 3. Converting to grayscale\n", + "* Running Inference by using the trained model stored in Google Cloud.\n", + "* Displaying the output of the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "KxYdNXefTFcD" + }, + "outputs": [], + "source": [ + "with beam.Pipeline() as p:\n", + " _ = (p | beam.Create(x_test)\n", + " | beam.ParDo(StandardizeImage())\n", + " | beam.ParDo(NormalizeImage())\n", + " | beam.ParDo(GrayscaleImage())\n", + " | RunInference(model_handler)\n", + " | beam.ParDo(SavePredictions())\n", + " )" + ] + }, + { + "cell_type": "markdown", + "source": [ + "So we got our predictions! Let us verify one of them." + ], + "metadata": { + "id": "yCADuo_yk1sK" + } + }, + { + "cell_type": "code", + "source": [ + "index = 5000\n", + "#You can change this index value to see and verify any image\n", + "predictions[index]" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "Zj2p_V2qmYMM", + "outputId": "f45c4f9f-3fa2-49f9-8471-602eea8faf68" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "'Horse'" + ], + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + } + }, + "metadata": {}, + "execution_count": 40 + } + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "kd5fXda1yr6G", + "outputId": "51589888-2986-4d99-a6a3-c89ff24ebd9e" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f4418bb7ac0>" + ] + }, + "metadata": {}, + "execution_count": 30 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAxWElEQVR4nO3dfXTU9Z33/9fMZGZynxBC7iQgNwqigIqCuawuAstNr+NPK2d/2na72Hr06EavVbbblp5Wq7t7xbVnW9teiNfvrCvb31W0ulfR6rZaRQl1BSpUimiLgEFAkiA3uZtkksnM9/rDy7RRkM8bEj5JfD7OmXMk8/adz3e+M9/3fDMzrwkFQRAIAIAzLOx7AQCATycGEADACwYQAMALBhAAwAsGEADACwYQAMALBhAAwAsGEADAiyzfC/ioTCajgwcPqqCgQKFQyPdyAABGQRCovb1dVVVVCodPfJ [...] 
+ }, + "metadata": {} + } + ], + "source": [ + "plt.imshow(x_test[index])" + ] + }, + { + "cell_type": "code", + "source": [ + "labels[y_test[index][0]]" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "O32stbnNkTLh", + "outputId": "c6dc9b40-290d-4029-e626-da88efc6e4e3" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "'Horse'" + ], + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + } + }, + "metadata": {}, + "execution_count": 32 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "Let us make a dictionary to see how many predictions belong to each class" + ], + "metadata": { + "id": "qT00-He7pfNi" + } + }, + { + "cell_type": "code", + "source": [ + "aggregate_results = dict()\n", + "for i in range(len(predictions)):\n", + " if predictions[i] in aggregate_results:\n", + " aggregate_results[predictions[i]] += 1\n", + " else:\n", + " aggregate_results[predictions[i]] = 1" + ], + "metadata": { + "id": "x744tmPnowMr" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "aggregate_results" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "evQnjvDdqbuG", + "outputId": "69c6d8d6-7986-444b-c66b-76c2fda98553" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{'Dog': 641,\n", + " 'Automobile': 3387,\n", + " 'Deer': 793,\n", + " 'Horse': 1030,\n", + " 'Truck': 392,\n", + " 'Frog': 290,\n", + " 'Airplane': 179,\n", + " 'Cat': 3175,\n", + " 'Bird': 91,\n", + " 'Ship': 22}" + ] + }, + "metadata": {}, + "execution_count": 37 + } + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file