This is an automated email from the ASF dual-hosted git repository. damccorm pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push: new c1a5ce7bf0a Add notebook for image processing using beam (#27034) c1a5ce7bf0a is described below commit c1a5ce7bf0aee88ee8f827a72a15c4ce4b5b14d3 Author: Reeba Qureshi <64488642+reeba...@users.noreply.github.com> AuthorDate: Tue Jun 27 21:33:52 2023 +0530 Add notebook for image processing using beam (#27034) * Add notebook for image processing using beam * Delete image_processing_tensorflow.ipynb * add image processing use case using tensorflow * Delete image_processing_tensorflow.ipynb * Add image processing use case after suggestions * Delete image_processing_beam.ipynb * Add image processing with implemented suggestions --- .../beam-ml/image_processing_tensorflow.ipynb | 951 +++++++++++++++++++++ 1 file changed, 951 insertions(+) diff --git a/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb b/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb new file mode 100644 index 00000000000..0914653f1de --- /dev/null +++ b/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb @@ -0,0 +1,951 @@ +{ + "cells": [ + { + "cell_type": "code", + "source": [ + "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n", + "\n", + "# Licensed to the Apache Software Foundation (ASF) under one\n", + "# or more contributor license agreements. See the NOTICE file\n", + "# distributed with this work for additional information\n", + "# regarding copyright ownership. The ASF licenses this file\n", + "# to you under the Apache License, Version 2.0 (the\n", + "# \"License\"); you may not use this file except in compliance\n", + "# with the License. You may obtain a copy of the License at\n", + "#\n", + "# http://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing,\n", + "# software distributed under the License is distributed on an\n", + "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n", + "# KIND, either express or implied. See the License for the\n", + "# specific language governing permissions and limitations\n", + "# under the License" + ], + "metadata": { + "id": "NsNImDL8TGM1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Image Processing using Apache Beam\n", + "\n", + "<table align=\"left\">\n", + " <td>\n", + " <a target=\"_blank\" href=\"https://colab.sandbox.google.com/github/apache/beam/blob/master/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb\"><img src=\"https://raw.githubusercontent.com/google/or-tools/main/tools/colab_32px.png\" />Run in Google Colab</a>\n", + " </td>\n", + " <td>\n", + " <a target=\"_blank\" href=\"https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/image_processing_tensorflow.ipynb\"><img src=\"https://raw.githubusercontent.com/google/or-tools/main/tools/github_32px.png\" />View source on GitHub</a>\n", + " </td>\n", + "</table>\n", + "\n" + ], + "metadata": { + "id": "SwN0Rj4cJSg5" + } + }, + { + "cell_type": "markdown", + "source": [ + "Image processing is a machine learning technique for reading, analyzing, and extracting meaningful information from images. It involves multiple steps, such as applying various preprocessing functions, getting predictions from a model, and storing the predictions in a useful format. Apache Beam is a suitable tool for handling these tasks and building a structured workflow.
This notebook demonstrates the use of Apache Beam in image processing and performs the following:\n", + "* Import and preprocess the CIFAR-10 dataset\n", + "* Train a TensorFlow model to classify images\n", + "* Store the model in Google Cloud and create a model handler\n", + "* Build a Beam pipeline to:\n", + " 1. Create a [PCollection](https://beam.apache.org/documentation/programming-guide/#pcollections) of input images\n", + " 2. Perform preprocessing [transforms](https://beam.apache.org/documentation/programming-guide/#transforms)\n", + " 3. Use RunInference to get predictions from the previously trained model\n", + " 4. Store the results\n", + "\n", + "For more information on using Apache Beam for machine learning, have a look at [AI/ML Pipelines using Beam](https://beam.apache.org/documentation/ml/overview/)." + ], + "metadata": { + "id": "yxLoBQxocAOv" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OSZrRmHl9NQY" + }, + "source": [ + "## Installing Apache Beam" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MO7iNmvkBdA5", + "outputId": "6c76e29d-3c70-4c3e-aca2-7cc1dcd167a1" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m14.3/14.3 MB\u001b[0m \u001b[31m28.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m89.7/89.7 kB\u001b[0m \u001b[31m8.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m137.0/137.0 kB\u001b[0m \u001b[31m15.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m152.0/152.0 kB\u001b[0m \u001b[31m14.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.6/2.6 MB\u001b[0m \u001b[31m44.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m648.9/648.9 kB\u001b[0m \u001b[31m46.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.7/2.7 MB\u001b[0m \u001b[31m84.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m283.7/283.7 kB\u001b[0m \u001b[31m27.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Building wheel for crcmod (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Building wheel for dill (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Building wheel for docopt (setup.py) ...
\u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m307.5/307.5 kB\u001b[0m \u001b[31m7.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m152.8/152.8 kB\u001b[0m \u001b[31m16.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m138.3/138.3 kB\u001b[0m \u001b[31m13.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m798.7/798.7 kB\u001b[0m \u001b[31m24.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m46.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.1/2.1 MB\u001b[0m \u001b[31m60.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h Building wheel for timeloop (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", + "google-colab 1.0.0 requires ipykernel==5.5.6, but you have ipykernel 6.23.2 which is incompatible.\n", + "google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.14.0 which is incompatible.\u001b[0m\u001b[31m\n", + "\u001b[0m" + ] + } + ], + "source": [ + "!pip install apache_beam --quiet\n", + "!pip install apache-beam[interactive] --quiet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "45mf7oHu9XbI" + }, + "source": [ + "## Importing necessary libraries\n", + "Here is a brief overview of the uses of each library imported:\n", + "* **NumPy**: Multidimensional numpy arrays are used to store images, and the library also allows performing various operations on them.\n", + "* **Matplotlib**: Displays images stored in numpy array format.\n", + "* **TensorFlow**: Trains a machine learning model.\n", + "* **TFModelHandlerNumpy**: Defines the configuration used to load/use the model that we train. We use `TFModelHandlerNumpy` because the model was trained with TensorFlow and takes numpy arrays as input.\n", + "* **RunInference**: Loads the model and obtains predictions as part of the Apache Beam pipeline. For more information, see [docs on prediction and inference](https://beam.apache.org/documentation/ml/inference-overview/).\n", + "* **Apache Beam**: Builds a pipeline for Image Processing." 
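As an optional sanity check after installation (not part of the original pipeline), you can confirm which versions the Colab runtime picked up; both `apache_beam` and `tensorflow` expose the standard `__version__` attribute:

```python
import apache_beam as beam
import tensorflow as tf

# Confirm the freshly installed packages were picked up by the runtime.
print("Apache Beam:", beam.__version__)
print("TensorFlow:", tf.__version__)
```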
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "z5_PUeZgOygU" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import tensorflow as tf\n", + "from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy\n", + "from apache_beam.ml.inference.base import RunInference\n", + "import apache_beam as beam" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x3tSAqP7R2rZ" + }, + "source": [ + "## CIFAR-10 Dataset\n", + "CIFAR-10 is a popular dataset used for multiclass object classification.\n", + "It has 60,000 images of the following 10 categories:\n", + "\n", + "* airplane\n", + "* automobile\n", + "* bird\n", + "* cat\n", + "* deer\n", + "* dog\n", + "* frog\n", + "* horse\n", + "* ship\n", + "* truck\n", + "\n", + "The dataset can be directly imported from the TensorFlow library." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MqylmjBhPCOW", + "outputId": "9d9f5854-80f2-4a4f-a52b-2b81d6295639" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz\n", + "170498071/170498071 [==============================] - 4s 0us/step\n" + ] + } + ], + "source": [ + "(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()" + ] + }, + { + "cell_type": "code", + "source": [ + "x_test.shape" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "pfzkgryZUV8P", + "outputId": "79bc798f-f93b-4d7b-8783-c5defa6a2322" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(10000, 32, 32, 3)" + ] + }, + "metadata": {}, + "execution_count": 4 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "The labels in y_train and y_test are numeric, with each number representing a class. The labels list defined below contains the various classes, and their positions in the list represent the corresponding number used to refer to them." + ], + "metadata": { + "id": "6hEHIHPsVxw4" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3uImFIBXv0My" + }, + "outputs": [], + "source": [ + "labels = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse','Ship', 'Truck']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "zeE81PNOcGfZ", + "outputId": "d2a08cb5-4fdc-47af-c2b2-5602e7600f09" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f441be49840>" + ] + }, + "metadata": {}, + "execution_count": 6 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAvsklEQVR4nO3df3Bc5Xn3/8/Z1e5KsqSVZVuShWVjG2Pzy85TBxwNCSXYxXanDARPB5LM1KR8YaAyU3DTJO4kEGg7SslMQpJxzB+luHkmhoQ+MQx8GyiYWDStTWsHPw5QHOwYbGJLBtv6rf2hPff3D76oFdhwX7bk2xLv18zOWNrLl+5zzu5eOtrdz0bOOScAAM6wROgFAAA+nhhAAIAgGEAAgCAYQACAIBhAAIAgGEAAgCAYQACAIBhAAIAgykIv4P3iONahQ4dUXV2tKIpCLwcAYOScU29vr5qampRInP [...] 
+ }, + "metadata": {} + } + ], + "source": [ + "plt.imshow(x_train[800])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "4arvJDYwfsAj", + "outputId": "a355a4e2-c1a7-461e-bff9-059daaa6a9f7" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(32, 32, 3)" + ] + }, + "metadata": {}, + "execution_count": 7 + } + ], + "source": [ + "x_train[0].shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ndeZ_RH32Upu" + }, + "source": [ + "(32, 32, 3) represents an image of size 32x32 in the RGB scale" + ] + }, + { + "cell_type": "markdown", + "source": [ + "### Preprocessing" + ], + "metadata": { + "id": "L2pg1uxSXPHn" + } + }, + { + "cell_type": "markdown", + "source": [ + "**Standardization** is the process of transforming the pixel values of an image to have zero mean and unit variance. This brings the pixel values to a similar scale and makes them easier to work with." + ], + "metadata": { + "id": "Hwwm-EHhW0rC" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZlInmab9MD-N" + }, + "outputs": [], + "source": [ + "x_train = x_train/255.0" + ] + }, + { + "cell_type": "markdown", + "source": [ + "**Normalization** is the process of scaling the pixel values to a specified range, typically between 0 and 1. This improves the consistency of images." + ], + "metadata": { + "id": "6GFdU-HZWztg" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "TLmsgV9_Wij5", + "outputId": "03fb00c5-efb9-421c-ef55-bbd36679dfbe" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f4412adeb30>" + ] + }, + "metadata": {}, + "execution_count": 9 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAvsklEQVR4nO3df3Bc5Xn3/8/Z1e5KsqSVZVuShWVjG2Pzy85TBxwNCSXYxXanDARPB5LM1KR8YaAyU3DTJO4kEGg7SslMQpJxzB+luHkmhoQ+MQx8GyiYWDStTWsHPw5QHOwYbGJLBtv6rf2hPff3D76oFdhwX7bk2xLv18zOWNrLl+5zzu5eOtrdz0bOOScAAM6wROgFAAA+nhhAAIAgGEAAgCAYQACAIBhAAIAgGEAAgCAYQACAIBhAAIAgykIv4P3iONahQ4dUXV2tKIpCLwcAYOScU29vr5qampRInP [...] + }, + "metadata": {} + } + ], + "source": [ + "x_train = (x_train - np.min(x_train)) / (np.max(x_train) - np.min(x_train))\n", + "plt.imshow(x_train[800])" + ] + }, + { + "cell_type": "markdown", + "source": [ + "**Grayscale Conversion** refers to the conversion of a colored image in RGB scale into a grayscale image. It represents the pixel intensities without considering colors, which makes calculations easier." 
+ ], + "metadata": { + "id": "bfgy0Z_gX_lH" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "P2oPvZkbfEPo" + }, + "outputs": [], + "source": [ + "grayscale = []\n", + "for i in x_train:\n", + " grayImage = 0.07 * i[:,:,2] + 0.72 * i[:,:,1] + 0.21 * i[:,:,0]\n", + " grayscale.append(grayImage)\n", + "x_train_gray = np.asarray(grayscale)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jzQ2Zulg99NU" + }, + "source": [ + "## Defining DoFns for Image Preprocessing\n", + "\n", + "[DoFn](https://beam.apache.org/releases/typedoc/current/interfaces/transforms_pardo.DoFn) stands for \"Do Function\". In Apache Beam, it is a set of operations that can be applied to individual elements of a PCollection (a collection of data). It is similar to a function in Python, except that it is used in Beam Pipelines to apply various transformations. DoFns can be used in various Apache Beam transforms, such as ParDo, Map, Filter, and FlatMap." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "cqm-m0cONsZS" + }, + "outputs": [], + "source": [ + "class StandardizeImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = element/255.0\n", + " return [element]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "mZhFCgPxPEwm" + }, + "outputs": [], + "source": [ + "class NormalizeImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = (element-element.min())/(element.max()-element.min())\n", + " return [element]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "gv23KPt5NyXT" + }, + "outputs": [], + "source": [ + "class GrayscaleImage(beam.DoFn):\n", + " def process(self, element: np.ndarray):\n", + " element = 0.07 * element[:,:,2] + 0.72 * element[:,:,1] + 0.21 * element[:,:,0]\n", + " return [element]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8gz7_SvN-P2L" + }, + "source": [ + "## Training a Convolutional Neural Network\n", + "\n", + "A Convolutional Neural Network (CNN) is one of the most popular model types for image processing. Here is a brief description of the convolutional layers used in the model.\n", + "* **Reshape**: Changes the shape of the input data to the desired size.\n", + "The CIFAR-10 images are of 32x32 pixels in grayscale. We will train our model using these images and thus, all images fed into the model need to be reshaped to the required size, that is (32,32,1).\n", + "* **Conv2D**: Applies a set of filters to extract features from the input image, producing a feature map as the output.\n", + "This layer is used as it is an essential component of a CNN, and does the major task of finding patterns in images.\n", + "* **MaxPooling2D**: Reduces the spatial dimensions of the input while retaining the most prominent features.\n", + "We use this layer to downsample the images and preserve only the important features.\n", + "* **Flatten**: Flattens the input data or feature maps into a 1-dimensional vector.\n", + "The input images are 2-dimensional. However in the end we require our results in a 1-D array. Flatten layer is used for this.\n", + "* **Dense**: Connects every neuron in the current layer to every neuron in the subsequent layer.\n", + "The CIFAR-10 dataset contains images belonging to 10 different classes. 
This is why the last dense layer gives 10 outputs, where each output is a raw score (logit) indicating how strongly the image matches one of the 10 classes." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "chYD9y7cH4Td", + "outputId": "b91eed4e-c383-40b8-96dc-8cf330d11a09" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model: \"sequential\"\n", + "_________________________________________________________________\n", + " Layer (type) Output Shape Param # \n", + "=================================================================\n", + " reshape (Reshape) (None, 32, 32, 1) 0 \n", + " \n", + " conv2d (Conv2D) (None, 30, 30, 32) 320 \n", + " \n", + " max_pooling2d (MaxPooling2D (None, 15, 15, 32) 0 \n", + " ) \n", + " \n", + " conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 \n", + " \n", + " max_pooling2d_1 (MaxPooling (None, 6, 6, 64) 0 \n", + " 2D) \n", + " \n", + " conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 \n", + " \n", + " flatten (Flatten) (None, 1024) 0 \n", + " \n", + " dense (Dense) (None, 64) 65600 \n", + " \n", + " dense_1 (Dense) (None, 10) 650 \n", + " \n", + "=================================================================\n", + "Total params: 121,994\n", + "Trainable params: 121,994\n", + "Non-trainable params: 0\n", + "_________________________________________________________________\n" + ] + } + ], + "source": [ + "def create_model():\n", + " model = tf.keras.Sequential([\n", + " tf.keras.layers.Reshape((32,32,1),input_shape=x_train_gray.shape[1:]),\n", + " tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),\n", + " tf.keras.layers.MaxPooling2D((2, 2)),\n", + " tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),\n", + " tf.keras.layers.MaxPooling2D((2, 2)),\n", + " tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),\n", + " tf.keras.layers.Flatten(),\n", + " tf.keras.layers.Dense(64, activation='relu'),\n", + " tf.keras.layers.Dense(10)\n", + " ])\n", + " # The last layer has no activation, so the model outputs logits;\n", + " # tell the loss to expect logits.\n", + " model.compile(optimizer='adam',\n", + " loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " metrics=['accuracy'])\n", + " return model\n", + "\n", + "model = create_model()\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "source": [ + "The input shape is changed to (32,32,1) as our input images are 32 x 32 pixels and 1 represents the single grayscale channel. The final dense layer has 10 outputs because there are 10 possible classes in the CIFAR-10 dataset."
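Because the final `Dense(10)` layer has no activation, the model emits raw logits rather than probabilities. If probabilities are needed, a softmax can be applied after prediction; a small sketch using standard TensorFlow, to be run once the model has been fit:

```python
import tensorflow as tf

# Raw logits for the first five grayscale training images.
logits = model.predict(x_train_gray[:5])

# Softmax rescales the 10 logits per image into class probabilities summing to 1.
probabilities = tf.nn.softmax(logits, axis=-1).numpy()
print(probabilities.round(3))
```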
+ ], + "metadata": { + "id": "UpO090vNbHir" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3D1SdCC94DQi" + }, + "source": [ + "## Fitting the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "KuMP46UVXkof", + "outputId": "5340424c-7a33-45c9-b773-3ff625c65290" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch 1/10\n", + "1563/1563 [==============================] - 87s 55ms/step - loss: 1.6511 - accuracy: 0.4054\n", + "Epoch 2/10\n", + "1563/1563 [==============================] - 84s 54ms/step - loss: 1.2737 - accuracy: 0.5540\n", + "Epoch 3/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 1.1204 - accuracy: 0.6095\n", + "Epoch 4/10\n", + "1563/1563 [==============================] - 79s 51ms/step - loss: 1.0184 - accuracy: 0.6461\n", + "Epoch 5/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.9430 - accuracy: 0.6724\n", + "Epoch 6/10\n", + "1563/1563 [==============================] - 81s 52ms/step - loss: 0.8810 - accuracy: 0.6946\n", + "Epoch 7/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.8299 - accuracy: 0.7135\n", + "Epoch 8/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.7904 - accuracy: 0.7248\n", + "Epoch 9/10\n", + "1563/1563 [==============================] - 80s 51ms/step - loss: 0.7504 - accuracy: 0.7385\n", + "Epoch 10/10\n", + "1563/1563 [==============================] - 84s 54ms/step - loss: 0.7150 - accuracy: 0.7498\n" + ] + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<keras.callbacks.History at 0x7f4412ead3c0>" + ] + }, + "metadata": {}, + "execution_count": 22 + } + ], + "source": [ + "model.fit(x_train_gray, y_train, epochs=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3obkK5fQ4KG6" + }, + "source": [ + "## Authenticating from Google Cloud\n", + "\n", + "We need to store our trained model in Google Cloud. For running inferences, we will load our model from cloud into the notebook using a Model Handler." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Jj_NX568mhsE" + }, + "outputs": [], + "source": [ + "from google.colab import auth\n", + "auth.authenticate_user()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7jn7HYPi4hH9" + }, + "source": [ + "Saving the trained model in a Google Cloud Storage bucket" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "CDqLPkzLfrt1" + }, + "outputs": [], + "source": [ + "save_model_dir = '' # Add the link to you GCS bucket here\n", + "model.save(save_model_dir)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "A model handler is used to save, load and manage trained ML models. Here we used TFModelHandlerNumpy as our input images are in the form of numpy arrays." + ], + "metadata": { + "id": "ilYg15uOcZSY" + } + }, + { + "cell_type": "code", + "source": [ + "model_handler = TFModelHandlerNumpy(save_model_dir)" + ], + "metadata": { + "id": "RqefS6I_c1kc" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Saving predictions\n", + "RunInference returns the predictions for each class. In the below DoFn, the maximum predicion is selected (which refers to the class the input image most probably belongs to) and is stored in a list of predictions." 
+ ], + "metadata": { + "id": "4vJZEXFOboUi" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9T_YW19P46dW" + }, + "outputs": [], + "source": [ + "from tensorflow.python.ops.numpy_ops import np_config\n", + "np_config.enable_numpy_behavior()\n", + "predictions = []\n", + "class SavePredictions(beam.DoFn):\n", + " def process(self, element, *args, **kwargs):\n", + " list_of_predictions = element.inference.tolist()\n", + " highest_prediction = max(list_of_predictions)\n", + " ans = labels[list_of_predictions.index(highest_prediction)]\n", + " predictions.append(ans)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "c5x1HD_V4-RS" + }, + "source": [ + "## Building a Beam Pipeline\n", + "\n", + "A Pipeline represents the workflow of a series of computations. Here we are performing the following tasks in our pipeline:\n", + "* Creating a PCollection of the data on which we need to run inference\n", + "* Appying the Image Preprocessing DoFns we defined earlier <br>\n", + " These include:\n", + " 1. Standardization\n", + " 2. Normalization\n", + " 3. Converting to grayscale\n", + "* Running Inference by using the trained model stored in Google Cloud.\n", + "* Displaying the output of the model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "KxYdNXefTFcD" + }, + "outputs": [], + "source": [ + "with beam.Pipeline() as p:\n", + " _ = (p | beam.Create(x_test)\n", + " | beam.ParDo(StandardizeImage())\n", + " | beam.ParDo(NormalizeImage())\n", + " | beam.ParDo(GrayscaleImage())\n", + " | RunInference(model_handler)\n", + " | beam.ParDo(SavePredictions())\n", + " )" + ] + }, + { + "cell_type": "markdown", + "source": [ + "So we got our predictions! Let us verify one of them." + ], + "metadata": { + "id": "yCADuo_yk1sK" + } + }, + { + "cell_type": "code", + "source": [ + "index = 5000\n", + "#You can change this index value to see and verify any image\n", + "predictions[index]" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "Zj2p_V2qmYMM", + "outputId": "f45c4f9f-3fa2-49f9-8471-602eea8faf68" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "'Horse'" + ], + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + } + }, + "metadata": {}, + "execution_count": 40 + } + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 447 + }, + "id": "kd5fXda1yr6G", + "outputId": "51589888-2986-4d99-a6a3-c89ff24ebd9e" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "<matplotlib.image.AxesImage at 0x7f4418bb7ac0>" + ] + }, + "metadata": {}, + "execution_count": 30 + }, + { + "output_type": "display_data", + "data": { + "text/plain": [ + "<Figure size 640x480 with 1 Axes>" + ], + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAGdCAYAAABU0qcqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAxWElEQVR4nO3dfXTU9Z33/9fMZGZynxBC7iQgNwqigIqCuawuAstNr+NPK2d/2na72Hr06EavVbbblp5Wq7t7xbVnW9teiNfvrCvb31W0ulfR6rZaRQl1BSpUimiLgEFAkiA3uZtkksnM9/rDy7RRkM8bEj5JfD7OmXMk8/adz3e+M9/3fDMzrwkFQRAIAIAzLOx7AQCATycGEADACwYQAMALBhAAwAsGEADACwYQAMALBhAAwAsGEADAiyzfC/ioTCajgwcPqqCgQKFQyPdyAABGQRCovb1dVVVVCodPfJ [...] 
+ }, + "metadata": {} + } + ], + "source": [ + "plt.imshow(x_test[index])" + ] + }, + { + "cell_type": "code", + "source": [ + "labels[y_test[index][0]]" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "O32stbnNkTLh", + "outputId": "c6dc9b40-290d-4029-e626-da88efc6e4e3" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "'Horse'" + ], + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + } + }, + "metadata": {}, + "execution_count": 32 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "Let us make a dictionary to see how many predictions belong to each class" + ], + "metadata": { + "id": "qT00-He7pfNi" + } + }, + { + "cell_type": "code", + "source": [ + "aggregate_results = dict()\n", + "for i in range(len(predictions)):\n", + " if predictions[i] in aggregate_results:\n", + " aggregate_results[predictions[i]] += 1\n", + " else:\n", + " aggregate_results[predictions[i]] = 1" + ], + "metadata": { + "id": "x744tmPnowMr" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "aggregate_results" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "evQnjvDdqbuG", + "outputId": "69c6d8d6-7986-444b-c66b-76c2fda98553" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{'Dog': 641,\n", + " 'Automobile': 3387,\n", + " 'Deer': 793,\n", + " 'Horse': 1030,\n", + " 'Truck': 392,\n", + " 'Frog': 290,\n", + " 'Airplane': 179,\n", + " 'Cat': 3175,\n", + " 'Bird': 91,\n", + " 'Ship': 22}" + ] + }, + "metadata": {}, + "execution_count": 37 + } + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file