This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 58ac71f5994 Update TF notebook example to auto-restart the runtime after dependency installation (#28132)
58ac71f5994 is described below

commit 58ac71f59944d134580d229845fe0745c2c6d527
Author: Jack McCluskey <34928439+jrmcclus...@users.noreply.github.com>
AuthorDate: Thu Aug 24 09:15:12 2023 -0400

    Update TF notebook example to auto-restart the runtime after dependency installation (#28132)
---
 .../beam-ml/run_inference_tensorflow.ipynb         | 383 ++++++++++-----------
 1 file changed, 180 insertions(+), 203 deletions(-)

diff --git a/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb b/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
index 2cb0eac860d..ad5bb671cce 100644
--- a/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
@@ -1,22 +1,13 @@
 {
-  "nbformat": 4,
-  "nbformat_minor": 0,
-  "metadata": {
-    "colab": {
-      "provenance": []
-    },
-    "kernelspec": {
-      "name": "python3",
-      "display_name": "Python 3"
-    },
-    "language_info": {
-      "name": "python"
-    },
-    "accelerator": "GPU"
-  },
   "cells": [
     {
       "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "cellView": "form",
+        "id": "fFjof1NgAJwu"
+      },
+      "outputs": [],
       "source": [
         "# @title ###### Licensed to the Apache Software Foundation (ASF), 
Version 2.0 (the \"License\")\n",
         "\n",
@@ -36,13 +27,7 @@
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
         "# under the License"
-      ],
-      "metadata": {
-        "cellView": "form",
-        "id": "fFjof1NgAJwu"
-      },
-      "execution_count": null,
-      "outputs": []
+      ]
     },
     {
       "cell_type": "markdown",
@@ -64,6 +49,9 @@
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "HrCtxslBGK8Z"
+      },
       "source": [
         "This notebook shows how to use the Apache Beam 
[RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference)
 transform for [TensorFlow](https://www.tensorflow.org/).\n",
         "Apache Beam has built-in support for two TensorFlow model handlers: 
[`TFModelHandlerNumpy`](https://github.com/apache/beam/blob/ca0787642a6b3804a742326147281c99ae8d08d2/sdks/python/apache_beam/ml/inference/tensorflow_inference.py#L91)
 and 
[`TFModelHandlerTensor`](https://github.com/apache/beam/blob/ca0787642a6b3804a742326147281c99ae8d08d2/sdks/python/apache_beam/ml/inference/tensorflow_inference.py#L184).\n",
@@ -86,84 +74,84 @@
         "  *   saved weights\n",
         "\n",
         "For more information about using RunInference, see [Get started with 
AI/ML pipelines](https://beam.apache.org/documentation/ml/overview/) in the 
Apache Beam documentation."
-      ],
-      "metadata": {
-        "id": "HrCtxslBGK8Z"
-      }
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "gVCtGOKTHMm4"
+      },
       "source": [
         "## Before you begin\n",
         "Set up your environment and download dependencies."
-      ],
-      "metadata": {
-        "id": "gVCtGOKTHMm4"
-      }
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "YDHPlMjZRuY0"
+      },
       "source": [
         "### Install Apache Beam\n",
         "To use RunInference with the built-in Tensorflow model handler, 
install Apache Beam version 2.46.0 or later."
-      ],
-      "metadata": {
-        "id": "YDHPlMjZRuY0"
-      }
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": null,
       "metadata": {
         "id": "jBakpNZnAhqk"
       },
+      "outputs": [],
       "source": [
         "!pip install protobuf --quiet\n",
-        "!pip install apache_beam==2.46.0 --quiet"
-      ],
-      "execution_count": null,
-      "outputs": []
+        "!pip install apache_beam==2.46.0 --quiet\n",
+        "\n",
+        "# To use the newly installed versions, restart the runtime.\n",
+        "exit()"
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "X80jy3FqHjK4"
+      },
       "source": [
         "### Authenticate with Google Cloud\n",
         "This notebook relies on saving your model to Google Cloud. To use 
your Google Cloud account, authenticate this notebook."
-      ],
-      "metadata": {
-        "id": "X80jy3FqHjK4"
-      }
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 2,
       "metadata": {
         "id": "Kz9sccyGBqz3"
       },
+      "outputs": [],
       "source": [
         "from google.colab import auth\n",
         "auth.authenticate_user()"
-      ],
-      "execution_count": 2,
-      "outputs": []
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "40qtP6zJuMXm"
+      },
       "source": [
         "### Import dependencies and set up your bucket\n",
         "Use the following code to import dependencies and to set up your 
Google Cloud Storage bucket.\n",
         "\n",
-        "Replace `PROJECT_ID` and `BUCKET_NAME` with the ID of your project 
and the name of your bucket.\n",
-        "\n",
-        "**Important**: If an error occurs, restart your runtime."
-      ],
-      "metadata": {
-        "id": "40qtP6zJuMXm"
-      }
+        "Replace `PROJECT_ID` and `BUCKET_NAME` with the ID of your project 
and the name of your bucket."
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 22,
       "metadata": {
         "id": "eEle839_Akqx"
       },
+      "outputs": [],
       "source": [
         "import argparse\n",
         "from typing import Dict, Text, Any, Tuple, List\n",
@@ -185,20 +173,18 @@
         "\n",
         "save_model_dir_multiply = 
f'gs://{bucket}/tf-inference/model/multiply_five/v1/'\n",
         "save_weights_dir_multiply = 
f'gs://{bucket}/tf-inference/weights/multiply_five/v1/'\n"
-      ],
-      "execution_count": 22,
-      "outputs": []
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "YzvZWEv-1oiK"
+      },
       "source": [
         "## Create and test a simple model\n",
         "\n",
         "This step creates and tests a model that predicts the 5 times table."
-      ],
-      "metadata": {
-        "id": "YzvZWEv-1oiK"
-      }
+      ]
     },
     {
       "cell_type": "markdown",
@@ -212,6 +198,7 @@
     },
     {
       "cell_type": "code",
+      "execution_count": 16,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/";
@@ -219,29 +206,10 @@
         "id": "SH7iq3zeBBJ-",
         "outputId": "5a3d3ce4-f9d8-4d87-a1bc-05afc3c9b06e"
       },
-      "source": [
-        "# Create training data that represents the 5 times multiplication 
table for the numbers 0 to 99.\n",
-        "# x is the data and y is the labels.\n",
-        "x = numpy.arange(0, 100)   # Examples\n",
-        "y = x * 5                  # Labels\n",
-        "\n",
-        "# Use create_model to build a simple linear regression model.\n",
-        "# Note that the model has a shape of (1) for its input layer and 
expects a single int64 value.\n",
-        "def create_model():\n",
-        "  input_layer = keras.layers.Input(shape=(1), dtype=tf.float32, 
name='x')\n",
-        "  output_layer= keras.layers.Dense(1)(input_layer)\n",
-        "  model = keras.Model(input_layer, output_layer)\n",
-        "  model.compile(optimizer=tf.optimizers.Adam(), 
loss='mean_absolute_error')\n",
-        "  return model\n",
-        "\n",
-        "model = create_model()\n",
-        "model.summary()"
-      ],
-      "execution_count": 16,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "Model: \"model_1\"\n",
             
"_________________________________________________________________\n",
@@ -258,21 +226,40 @@
            "_________________________________________________________________\n"
           ]
         }
+      ],
+      "source": [
+        "# Create training data that represents the 5 times multiplication 
table for the numbers 0 to 99.\n",
+        "# x is the data and y is the labels.\n",
+        "x = numpy.arange(0, 100)   # Examples\n",
+        "y = x * 5                  # Labels\n",
+        "\n",
+        "# Use create_model to build a simple linear regression model.\n",
+        "# Note that the model has a shape of (1) for its input layer and 
expects a single int64 value.\n",
+        "def create_model():\n",
+        "  input_layer = keras.layers.Input(shape=(1), dtype=tf.float32, 
name='x')\n",
+        "  output_layer= keras.layers.Dense(1)(input_layer)\n",
+        "  model = keras.Model(input_layer, output_layer)\n",
+        "  model.compile(optimizer=tf.optimizers.Adam(), 
loss='mean_absolute_error')\n",
+        "  return model\n",
+        "\n",
+        "model = create_model()\n",
+        "model.summary()"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "O_a0-4Gb19cy"
+      },
       "source": [
         "### Test the model\n",
         "\n",
         "This step tests the model that you created."
-      ],
-      "metadata": {
-        "id": "O_a0-4Gb19cy"
-      }
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 17,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/";
@@ -280,20 +267,10 @@
         "id": "5XkIYXhJBFmS",
         "outputId": "ad2ff8a9-522c-41f4-e5d8-dbdcb53b0ded"
       },
-      "source": [
-        "model.fit(x, y, epochs=500, verbose=0)\n",
-        "test_examples =[20, 40, 60, 90]\n",
-        "value_to_predict = numpy.array(test_examples, dtype=numpy.float32)\n",
-        "predictions = model.predict(value_to_predict)\n",
-        "\n",
-        "print('Test Examples ' + str(test_examples))\n",
-        "print('Predictions ' + str(predictions))"
-      ],
-      "execution_count": 17,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "1/1 [==============================] - 0s 38ms/step\n",
             "Test Examples [20, 40, 60, 90]\n",
@@ -303,32 +280,44 @@
             " [91.544655]]\n"
           ]
         }
+      ],
+      "source": [
+        "model.fit(x, y, epochs=500, verbose=0)\n",
+        "test_examples =[20, 40, 60, 90]\n",
+        "value_to_predict = numpy.array(test_examples, dtype=numpy.float32)\n",
+        "predictions = model.predict(value_to_predict)\n",
+        "\n",
+        "print('Test Examples ' + str(test_examples))\n",
+        "print('Predictions ' + str(predictions))"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "Y3BC2aY8cIMI"
+      },
       "source": [
         "### Save the model\n",
         "\n",
         "This step shows how to save your model."
-      ],
-      "metadata": {
-        "id": "Y3BC2aY8cIMI"
-      }
+      ]
     },
     {
       "cell_type": "code",
-      "source": [
-        "model.save(save_model_dir_multiply)"
-      ],
+      "execution_count": 18,
       "metadata": {
         "id": "2JbE7WkGcAkK"
       },
-      "execution_count": 18,
-      "outputs": []
+      "outputs": [],
+      "source": [
+        "model.save(save_model_dir_multiply)"
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "g_qVtXPeUcMS"
+      },
       "source": [
         "Instead of saving the entire model, you can [save the model weights 
for 
inference](https://www.tensorflow.org/guide/keras/save_and_serialize#saving_loading_only_the_models_weights_values).
 You can use this method when you need the model for inference but don't need 
any compilation information or optimizer state. In addition, when using 
transfer learning applications, you can use this method to load the weights 
with new models.\n",
         "\n",
@@ -340,49 +329,32 @@
         "model_handler = TFModelHandlerNumpy(path_to_weights, 
model_type=ModelType.SAVED_WEIGHTS, create_model_fn=build_tensorflow_model)\n",
         "```\n",
         "\n"
-      ],
-      "metadata": {
-        "id": "g_qVtXPeUcMS"
-      }
+      ]
     },
     {
       "cell_type": "code",
-      "source": [
-        "model.save_weights(save_weights_dir_multiply)"
-      ],
+      "execution_count": 19,
       "metadata": {
         "id": "Kl1C_NwaUbiv"
       },
-      "execution_count": 19,
-      "outputs": []
+      "outputs": [],
+      "source": [
+        "model.save_weights(save_weights_dir_multiply)"
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "0a1zerXycQ0z"
+      },
       "source": [
         "## Run the pipeline\n",
         "Use the following code to run the pipeline by specifying path to the 
trained TensorFlow model."
-      ],
-      "metadata": {
-        "id": "0a1zerXycQ0z"
-      }
+      ]
     },
     {
       "cell_type": "code",
-      "source": [
-        "class FormatOutput(beam.DoFn):\n",
-        "  def process(self, element, *args, **kwargs):\n",
-        "     yield \"example is {example} prediction is 
{prediction}\".format(example=element.example, prediction=element.inference)\n",
-        "\n",
-        "\n",
-        "examples = numpy.array([20, 40, 60, 90], dtype=numpy.float32)\n",
-        "model_handler = TFModelHandlerNumpy(save_model_dir_multiply)\n",
-        "with beam.Pipeline() as p:\n",
-        "    _ = (p | beam.Create(examples)\n",
-        "           | RunInference(model_handler)\n",
-        "           | beam.ParDo(FormatOutput())\n",
-        "           | beam.Map(print)\n",
-        "        )"
-      ],
+      "execution_count": 20,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/";,
@@ -391,49 +363,24 @@
         "id": "St07XoibcQSb",
         "outputId": "d36f77f2-d07e-4868-f4cb-d120ab54e653"
       },
-      "execution_count": 20,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stderr",
+          "output_type": "stream",
           "text": [
             
"WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies 
required for Interactive Beam PCollection visualization are not available, 
please use: `pip install apache-beam[interactive]` to install necessary 
dependencies to enable all data visualization features.\n"
           ]
         },
         {
-          "output_type": "display_data",
           "data": {
-            "application/javascript": [
-              "\n",
-              "        if (typeof window.interactive_beam_jquery == 
'undefined') {\n",
-              "          var jqueryScript = 
document.createElement('script');\n",
-              "          jqueryScript.src = 
'https://code.jquery.com/jquery-3.4.1.slim.min.js';\n",
-              "          jqueryScript.type = 'text/javascript';\n",
-              "          jqueryScript.onload = function() {\n",
-              "            var datatableScript = 
document.createElement('script');\n",
-              "            datatableScript.src = 
'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';\n",
-              "            datatableScript.type = 'text/javascript';\n",
-              "            datatableScript.onload = function() {\n",
-              "              window.interactive_beam_jquery = 
jQuery.noConflict(true);\n",
-              "              
window.interactive_beam_jquery(document).ready(function($){\n",
-              "                \n",
-              "              });\n",
-              "            }\n",
-              "            document.head.appendChild(datatableScript);\n",
-              "          };\n",
-              "          document.head.appendChild(jqueryScript);\n",
-              "        } else {\n",
-              "          
window.interactive_beam_jquery(document).ready(function($){\n",
-              "            \n",
-              "          });\n",
-              "        }"
-            ]
+            "application/javascript": "\n        if (typeof 
window.interactive_beam_jquery == 'undefined') {\n          var jqueryScript = 
document.createElement('script');\n          jqueryScript.src = 
'https://code.jquery.com/jquery-3.4.1.slim.min.js';\n          
jqueryScript.type = 'text/javascript';\n          jqueryScript.onload = 
function() {\n            var datatableScript = 
document.createElement('script');\n            datatableScript.src = 
'https://cdn.datatables.net/1.10.20/j [...]
           },
-          "metadata": {}
+          "metadata": {},
+          "output_type": "display_data"
         },
         {
-          "output_type": "stream",
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "example is 20.0 prediction is [21.896107]\n",
             "example is 40.0 prediction is [41.795692]\n",
@@ -441,32 +388,37 @@
             "example is 90.0 prediction is [91.544655]\n"
           ]
         }
+      ],
+      "source": [
+        "class FormatOutput(beam.DoFn):\n",
+        "  def process(self, element, *args, **kwargs):\n",
+        "     yield \"example is {example} prediction is 
{prediction}\".format(example=element.example, prediction=element.inference)\n",
+        "\n",
+        "\n",
+        "examples = numpy.array([20, 40, 60, 90], dtype=numpy.float32)\n",
+        "model_handler = TFModelHandlerNumpy(save_model_dir_multiply)\n",
+        "with beam.Pipeline() as p:\n",
+        "    _ = (p | beam.Create(examples)\n",
+        "           | RunInference(model_handler)\n",
+        "           | beam.ParDo(FormatOutput())\n",
+        "           | beam.Map(print)\n",
+        "        )"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "0lbPamYGV8E6"
+      },
       "source": [
         "Use the following code to run the pipeline with the saved weights of 
a TensorFlow model.\n",
         "\n",
         "To load the model with saved weights, the `TFModelHandlerNumpy` class 
requires a `create_model` function that builds and returns a TensorFlow model 
that is compatible with the saved weights."
-      ],
-      "metadata": {
-        "id": "0lbPamYGV8E6"
-      }
+      ]
     },
     {
       "cell_type": "code",
-      "source": [
-        "from apache_beam.ml.inference.tensorflow_inference import 
ModelType\n",
-        "examples = numpy.array([20, 40, 60, 90], dtype=numpy.float32)\n",
-        "model_handler = TFModelHandlerNumpy(save_weights_dir_multiply, 
model_type=ModelType.SAVED_WEIGHTS, create_model_fn=create_model)\n",
-        "with beam.Pipeline() as p:\n",
-        "    _ = (p | beam.Create(examples)\n",
-        "           | RunInference(model_handler)\n",
-        "           | beam.ParDo(FormatOutput())\n",
-        "           | beam.Map(print)\n",
-        "        )"
-      ],
+      "execution_count": 21,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/";
@@ -474,11 +426,10 @@
         "id": "QQam0O4cWG42",
         "outputId": "d2ab8603-cdc7-4cd6-b909-e6edfeaa5422"
       },
-      "execution_count": 21,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "example is 20.0 prediction is [21.896107]\n",
             "example is 40.0 prediction is [41.795692]\n",
@@ -486,10 +437,24 @@
             "example is 90.0 prediction is [91.544655]\n"
           ]
         }
+      ],
+      "source": [
+        "from apache_beam.ml.inference.tensorflow_inference import 
ModelType\n",
+        "examples = numpy.array([20, 40, 60, 90], dtype=numpy.float32)\n",
+        "model_handler = TFModelHandlerNumpy(save_weights_dir_multiply, 
model_type=ModelType.SAVED_WEIGHTS, create_model_fn=create_model)\n",
+        "with beam.Pipeline() as p:\n",
+        "    _ = (p | beam.Create(examples)\n",
+        "           | RunInference(model_handler)\n",
+        "           | beam.ParDo(FormatOutput())\n",
+        "           | beam.Map(print)\n",
+        "        )"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "tRLArcjOcYuO"
+      },
       "source": [
         "## Use a keyed model handler\n",
         "To use a keyed model handler, use `KeyedModelHandler` with TensorFlow 
by using `TFModelHandlerNumpy`.\n",
@@ -498,30 +463,11 @@
         "\n",
         "* If you know that keys are associated with your examples, use 
`beam.KeyedModelHandler` to wrap the model handler.\n",
         "* If you don't know whether keys are associated with your examples, 
use `beam.MaybeKeyedModelHandler`."
-      ],
-      "metadata": {
-        "id": "tRLArcjOcYuO"
-      }
+      ]
     },
     {
       "cell_type": "code",
-      "source": [
-        "class FormatOutputKeyed(FormatOutput):\n",
-        "  # To simplify, inherit from FormatOutput.\n",
-        "  def process(self, tuple_in: Tuple):\n",
-        "    key, element = tuple_in\n",
-        "    output = super().process(element)\n",
-        "    yield \"{} : {}\".format(key, [op for op in output])\n",
-        "\n",
-        "examples = numpy.array([(1,20), (2,40), (3,60), (4,90)], 
dtype=numpy.float32)\n",
-        "keyed_model_handler = 
KeyedModelHandler(TFModelHandlerNumpy(save_model_dir_multiply))\n",
-        "with beam.Pipeline() as p:\n",
-        "    _ = (p | 'CreateExamples' >> beam.Create(examples)\n",
-        "           | RunInference(keyed_model_handler)\n",
-        "           | beam.ParDo(FormatOutputKeyed())\n",
-        "           | beam.Map(print)\n",
-        "        )"
-      ],
+      "execution_count": null,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/";
@@ -529,11 +475,10 @@
         "id": "P6l9RwL2cAW3",
         "outputId": "03459fea-7d0a-4501-93cb-18bbad915d13"
       },
-      "execution_count": null,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "1.0 : ['example is 20.0 prediction is [51.815357]']\n",
             "2.0 : ['example is 40.0 prediction is [101.63492]']\n",
@@ -541,7 +486,39 @@
             "4.0 : ['example is 90.0 prediction is [226.18384]']\n"
           ]
         }
+      ],
+      "source": [
+        "class FormatOutputKeyed(FormatOutput):\n",
+        "  # To simplify, inherit from FormatOutput.\n",
+        "  def process(self, tuple_in: Tuple):\n",
+        "    key, element = tuple_in\n",
+        "    output = super().process(element)\n",
+        "    yield \"{} : {}\".format(key, [op for op in output])\n",
+        "\n",
+        "examples = numpy.array([(1,20), (2,40), (3,60), (4,90)], 
dtype=numpy.float32)\n",
+        "keyed_model_handler = 
KeyedModelHandler(TFModelHandlerNumpy(save_model_dir_multiply))\n",
+        "with beam.Pipeline() as p:\n",
+        "    _ = (p | 'CreateExamples' >> beam.Create(examples)\n",
+        "           | RunInference(keyed_model_handler)\n",
+        "           | beam.ParDo(FormatOutputKeyed())\n",
+        "           | beam.Map(print)\n",
+        "        )"
       ]
     }
-  ]
-}
\ No newline at end of file
+  ],
+  "metadata": {
+    "accelerator": "GPU",
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
