This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push: new ec312f00788 Copy editing ML notebooks for DevSite import (#30759) ec312f00788 is described below commit ec312f00788d8d7dc76deb191360e70e3a219b20 Author: Rebecca Szper <98840847+rsz...@users.noreply.github.com> AuthorDate: Wed Mar 27 11:02:34 2024 -0700 Copy editing ML notebooks for DevSite import (#30759) * Copy editing ML notebooks for DevSite import * Update configuration parameter text --- .../notebooks/beam-ml/run_inference_gemma.ipynb | 10 ++++---- .../vertex_ai_feature_store_enrichment.ipynb | 30 +++++++++------------- 2 files changed, 17 insertions(+), 23 deletions(-) diff --git a/examples/notebooks/beam-ml/run_inference_gemma.ipynb b/examples/notebooks/beam-ml/run_inference_gemma.ipynb index 6af1bd07c14..489f01c4c9a 100644 --- a/examples/notebooks/beam-ml/run_inference_gemma.ipynb +++ b/examples/notebooks/beam-ml/run_inference_gemma.ipynb @@ -153,7 +153,7 @@ "id": "1FQdEMq8GEpl" }, "source": [ - "The pipeline defined below automatically pulls the model weights from Kaggle. Please go to https://www.kaggle.com/models/keras/gemma and accept the terms of usage for Gemma models, then generate an API token using the instructions at https://www.kaggle.com/docs/api and provide your username and token here." + "The pipeline defined here automatically pulls the model weights from Kaggle. First, accept the terms of use for Gemma models on the Keras [Gemma](https://www.kaggle.com/models/keras/gemma) page. Next, generate an API token by following the instructions in [How to use Kaggle](https://www.kaggle.com/docs/api). Provide your username and token." ] }, { @@ -231,7 +231,7 @@ "## Import dependencies and provide a model preset\n", "Use the following code to import dependencies.\n", "\n", - "Replace the `model_preset` variable with the name of the Gemma preset to use. For example, if you want to use the default English weights, set the preset to \"gemma_2b_en\". 
For this demo, we wil use the instruction-tuned preset \"gemma_instruct_2b_en\". We also optionally use keras to run the model at half-precision to reduce GPU memory usage." + "Replace the value for the `model_preset` variable with the name of the Gemma preset to use. For example, to use the default English weights, use the value `gemma_2b_en`. This example uses the instruction-tuned preset `gemma_instruct_2b_en`. Optionally, to run the model at half-precision and reduce GPU memory usage, use Keras." ] }, { @@ -269,8 +269,8 @@ "To run the pipeline, use a custom model handler.\n", "\n", "### Provide a custom model handler\n", - "To simplify model loading, this notebook defines a custom model handler that will load the model by pulling the model weights directly from Kaggle presets. Implementing `load_model()`, `validate_inference_args()`, and `share_model_across_processes()` allows us to customize the behavior of the handler. The Keras implementation of the Gemma models has a `generate()` method\n", - "that generates text based on a prompt. Using this function in `run_inference()` routes the prompts properly." + "To simplify model loading, this notebook defines a custom model handler that loads the model by pulling the model weights directly from Kaggle presets. To customize the behavior of the handler, implement `load_model`, `validate_inference_args`, and `share_model_across_processes`. The Keras implementation of the Gemma models has a `generate` method\n", + "that generates text based on a prompt. To route the prompts properly, use this function in the `run_inference` method." 
] }, { @@ -281,7 +281,7 @@ }, "outputs": [], "source": [ - "# Define `GemmaModelHandler` to load the model and perform the inference.\n", + "# To load the model and perform the inference, define `GemmaModelHandler`.\n", "\n", "from apache_beam.ml.inference.base import ModelHandler\n", "from apache_beam.ml.inference.base import PredictionResult\n", diff --git a/examples/notebooks/beam-ml/vertex_ai_feature_store_enrichment.ipynb b/examples/notebooks/beam-ml/vertex_ai_feature_store_enrichment.ipynb index c8ae558a1ba..ebfcca34b94 100644 --- a/examples/notebooks/beam-ml/vertex_ai_feature_store_enrichment.ipynb +++ b/examples/notebooks/beam-ml/vertex_ai_feature_store_enrichment.ipynb @@ -53,7 +53,7 @@ "id": "HrCtxslBGK8Z" }, "source": [ - "This notebook shows how to enrich data by using the Apache Beam [enrichment transform](https://beam.apache.org/documentation/transforms/python/elementwise/enrichment/) with [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs). The enrichment transform is a turnkey transform in Apache Beam that lets you enrich data using a key-value lookup. This transform has the following features:\n", + "This notebook shows how to enrich data by using the Apache Beam [enrichment transform](https://beam.apache.org/documentation/transforms/python/elementwise/enrichment/) with [Vertex AI Feature Store](https://cloud.google.com/vertex-ai/docs/featurestore/latest/overview). The enrichment transform is an Apache Beam turnkey transform that lets you enrich data by using a key-value lookup. 
This transform has the following features:\n", "\n", "- The transform has a built-in Apache Beam handler that interacts with Vertex AI to get precomputed feature values.\n", "- The transform uses client-side throttling to manage rate limiting the requests.\n", @@ -72,8 +72,8 @@ "\n", "* Use a stream of online transactions from [Pub/Sub](https://cloud.google.com/pubsub/docs/guides) that contains the following fields: `product_id`, `user_id`, and `sale_price`.\n", "* Deploy a pretrained model on Vertex AI based on the features `product_id`, `user_id`, `sale_price`, `age`, `gender`, `state`, and `country`.\n", - "* Precompute the feature values for the pretrained model, and store the values in the Vertex AI Feature Store.\n", - "* Enrich the stream of transactions from Pub/Sub with feature values from Vertex AI Feature Store by using the `Enrichment` transform.\n", + "* Precompute the feature values for the pretrained model, and store the values in Vertex AI Feature Store.\n", + "* Enrich the stream of transactions from Pub/Sub with feature values from Vertex AI Feature Store by using the enrichment transform.\n", "* Send the enriched data to the Vertex AI model for online prediction by using the `RunInference` transform, which predicts the product recommendation for the user." ] }, @@ -1139,7 +1139,7 @@ "source": [ "Deploy the model to the Vertex AI endpoint.\n", "\n", - "**Note:** This step is a Long Running Operation (LRO). Depending on the size of the model, it might take more than five minutes to complete." + "**Note:** This step is a long running operation (LRO). Depending on the size of the model, it might take more than five minutes to complete." 
] }, { @@ -1186,7 +1186,7 @@ "id": "ouMQZ4sC4zuO" }, "source": [ - "### Set up the Vertex AI Feature Store for online serving\n" + "### Set up Vertex AI Feature Store for online serving\n" ] }, { @@ -1544,7 +1544,7 @@ "id": "Mm-HCUaa3ROZ" }, "source": [ - "Create a BigQuery dataset to use as the source for the Vertex AI Feature Store." + "Create a BigQuery dataset to use as the source for Vertex AI Feature Store." ] }, { @@ -2148,7 +2148,7 @@ "source": [ "## Use the Vertex AI Feature Store enrichment handler\n", "\n", - "The [`VertexAIFeatureStoreEnrichmentHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment_handlers.vertex_ai_feature_store.html#apache_beam.transforms.enrichment_handlers.vertex_ai_feature_store.VertexAIFeatureStoreEnrichmentHandler) is a built-in handler included in the Apache Beam SDK versions 2.55.0 and later." + "The [`VertexAIFeatureStoreEnrichmentHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment_handlers.vertex_ai_feature_store.html#apache_beam.transforms.enrichment_handlers.vertex_ai_feature_store.VertexAIFeatureStoreEnrichmentHandler) is a built-in handler in the Apache Beam SDK versions 2.55.0 and later." 
] }, { @@ -2157,16 +2157,16 @@ "id": "K41xhvmA5yQk" }, "source": [ - "Configure the `VertexAIFeatureStoreEnrichmentHandler` with the following required parameters:\n", + "Configure the `VertexAIFeatureStoreEnrichmentHandler` handler with the following required parameters:\n", "\n", "* `project`: the Google Cloud project ID for the feature store\n", "* `location`: the region of the feature store, for example `us-central1`\n", "* `api_endpoint`: the public endpoint of the feature store\n", - "* `feature_store_name`: the name of the Vertex AI Feature Store\n", - "* `feature_view_name`: the name of the feature view within the Vertex AI Feature Store\n", + "* `feature_store_name`: the name of the Vertex AI feature store\n", + "* `feature_view_name`: the name of the feature view within the Vertex AI feature store\n", "* `row_key`: The field name in the input row containing the entity ID for the feature store. This value is used to extract the entity ID from each element. The entity ID is used to fetch feature values for that specific element in the enrichment transform.\n", "\n", - "Optionally, to provide more configuration values to connect with the Vertex AI client, the `VertexAIFeatureStoreEnrichmentHandler` accepts a keyword argument (kwargs). For more information, see [`FeatureOnlineStoreServiceClient`](https://cloud.google.com/php/docs/reference/cloud-ai-platform/latest/V1.FeatureOnlineStoreServiceClient).\n", + "Optionally, to provide more configuration values to connect with the Vertex AI client, the `VertexAIFeatureStoreEnrichmentHandler` handler accepts a keyword argument (kwargs). 
For more information, see [`FeatureOnlineStoreServiceClient`](https://cloud.google.com/php/docs/reference/cloud-ai-platform/latest/V1.FeatureOnlineStoreServiceClient).\n", "\n", "**Note:** When exceptions occur, by default, the logging severity is set to warning ([`ExceptionLevel.WARN`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment_handlers.utils.html#apache_beam.transforms.enrichment_handlers.utils.ExceptionLevel.WARN)). To configure the severity to raise exceptions, set `exception_level` to [`ExceptionLevel.RAISE`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment_handlers.utils.html#apa [...] "\n", @@ -2208,13 +2208,7 @@ "source": [ "## Use the enrichment transform\n", "\n", - "To use the [enrichment transform](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment.html#apache_beam.transforms.enrichment.Enrichment), the [`EnrichmentHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment.html#apache_beam.transforms.enrichment.EnrichmentSourceHandler) parameter is required. You can also use a configuration parameter to specify a `lambda` for a join function, a timeout, a throttler, and a repeat [...] - "\n", - "\n", - "* `join_fn`: A lambda function that takes dictionaries as input and returns an enriched row (`Callable[[Dict[str, Any], Dict[str, Any]], beam.Row]`). The enriched row specifies how to join the data fetched from the API. Defaults to a [cross-join](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment.html#apache_beam.transforms.enrichment.cross_join).\n", - "* `timeout`: The number of seconds to wait for the request to be completed by the API before timing out. Defaults to 30 seconds.\n", - "* `throttler`: Specifies the throttling mechanism. 
The only supported option is default client-side adaptive throttling.\n", - "* `repeater`: Specifies the retry strategy when errors like `TooManyRequests` and `TimeoutException` occur. Defaults to [`ExponentialBackOffRepeater`](https://beam.apache.org/releases/pydoc/current/apache_beam.io.requestresponse.html#apache_beam.io.requestresponse.ExponentialBackOffRepeater).\n", + "To use the [enrichment transform](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment.html#apache_beam.transforms.enrichment.Enrichment), the [`EnrichmentHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.enrichment.html#apache_beam.transforms.enrichment.EnrichmentSourceHandler) parameter is required. You can also use configuration parameters to specify a `lambda` for a join function, a timeout, a throttler, and a repeate [...] "\n", "\n", "To use the Redis cache, apply the `with_redis_cache` hook to the enrichment transform. The coders for encoding and decoding the input and output for the cache are optional and are internally inferred."
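
Editor's illustration (not part of the commit): the enrichment transform text in the last hunk mentions a join function that takes two dictionaries and returns an enriched row. The following is a minimal sketch of that idea in plain Python, assuming the left dictionary is the incoming transaction and the right dictionary holds the fetched feature values. The helper name `join_keep_left` and the sample data are hypothetical. In a real pipeline, a function passed as `join_fn` would likely return `beam.Row(**merged)` rather than a plain dict.

```python
from typing import Any, Dict


def join_keep_left(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Merge fetched feature values into the request row without overwriting it.

    Hypothetical helper for illustration only. A Beam join function would
    return beam.Row(**merged) instead of a plain dict.
    """
    merged = dict(left)  # copy so the input row is not mutated
    for key, value in right.items():
        if key not in merged:  # keep the values that came with the request
            merged[key] = value
    return merged


# Made-up sample data that reuses the field names from the notebook.
transaction = {"product_id": 101, "user_id": 7, "sale_price": 19.99}
features = {"user_id": 7, "age": 34, "gender": "F", "state": "CA", "country": "US"}

enriched = join_keep_left(transaction, features)
print(enriched)
```

The sketch keeps the request's own values when both dictionaries share a key, which is one reasonable policy. A pipeline that trusts the feature store more than the incoming event could reverse that preference.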