[GitHub] [beam] kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395146160

File path: website/src/documentation/patterns/ai-platform.md

@@ -0,0 +1,87 @@
+---
+layout: section
+title: "AI Platform integration patterns"
+section_menu: section-menu/documentation.html
+permalink: /documentation/patterns/ai-platform/
+---
+
+# AI Platform integration patterns
+
+This page describes common patterns in pipelines with Google Cloud AI Platform transforms.
+
+Adapt for:
+Java SDK
+Python SDK
+
+## Getting predictions
+
+This section shows how to use [Google Cloud AI Platform Prediction](https://cloud.google.com/ai-platform/prediction/docs/overview) to make predictions about new data from a cloud-hosted machine learning model.
+
+[tfx_bsl](https://github.com/tensorflow/tfx-bsl) is a library with a Beam PTransform called `RunInference`. `RunInference` is able to perform an inference that can use a service endpoint. When using a service endpoint, the transform takes a PCollection of type `tf.train.Example` and, for every bunch of elements, sends a request to AI Platform Prediction. The transform produces a PCollection of type `PredictLog`, which contains predictions.

Review comment: I believe the `BatchElements` (the transform responsible for batching here) doc is sufficient. Here's a [link](https://github.com/apache/beam/blob/42d79c29949725b3afabfb3754bfb394be594460/sdks/python/apache_beam/transforms/util.py#L540).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395118686

File path: website/src/documentation/patterns/ai-platform.md

> Before getting started, deploy a machine learning model to the cloud. The cloud service manages the infrastructure needed to handle prediction requests in both efficient and scalable way. Only Tensorflow models are supported.

Review comment: In this case it's for receiving data. In other words, AI Platform Prediction exposes a service endpoint that can receive data. I'll clarify it.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395113820

File path: website/src/documentation/patterns/ai-platform.md

> When using a service endpoint, the transform takes a PCollection of type `tf.train.Example` and, for every bunch of elements, sends a request to AI Platform Prediction.

Review comment:

> Is there a way to configure how many elements are in the batch?

No, in this case Beam handles the size of a batch on its own.

> I think the previous version of this paragraph stated that one request is sent per element.

That's right. After the discussion on batching, it occurred to me that this is not entirely correct.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395071873

File path: website/src/documentation/patterns/overview.md

@@ -38,6 +38,9 @@
 **Custom window patterns** - Patterns for windowing functions
 * [Using data to dynamically set session window gaps]({{ site.baseurl }}/documentation/patterns/custom-windows/#using-data-to-dynamically-set-session-window-gaps)
+**AI Platform integration patterns** - Patterns for Google AI Platform transforms

Review comment: Looks good. One minor comment: there are more Google Cloud AI Platform transforms. This pull request is just about Prediction; more patterns could be added later. So I'd stick with `Patterns for Google Cloud AI Platform transforms`.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395049430

File path: website/src/documentation/patterns/ai-platform.md

> Before getting started, deploy a machine learning model to the cloud. The cloud service manages the infrastructure needed to handle prediction requests in both efficient and scalable way. Only Tensorflow models are supported.

Review comment: Alright, but I'd rather keep the sentence "only TensorFlow models are supported by the transform" anyway. AI Platform, unlike `RunInference`, doesn't support only TensorFlow models. It's important to put an emphasis on that.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395037728

File path: website/src/documentation/patterns/ai-platform.md

> `RunInference` is a PTransform able to perform two types of inference. One of them can use a service endpoint.

Review comment: Fair enough; this is further evidence that this mention causes confusion. I'll remove it.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r395033918

File path: website/src/documentation/patterns/ai-platform.md

> This section shows how to use a cloud-hosted machine learning model to make predictions about new data using Google Cloud AI Platform Prediction within Beam's pipeline.

Review comment: No, AI Platform is a service that manages cloud-hosted machine learning models. How about something like this?

> This section shows how to use Google Cloud AI Platform Prediction to make predictions about new data from a cloud-hosted machine learning model.

I'll also add a link to an overview of AI Platform Prediction.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r393764367

File path: website/src/documentation/patterns/ai-platform.md

> The example creates `tf.train.BytesList` instances, thus it expects byte-like strings as input, but other data types, like `tf.train.FloatList` and `tf.train.Int64List`, are also supported by the transform. To send binary data, make sure that the name of an input ends in `_bytes`.

Review comment: Basically, the input format is very simple: the transform expects a PCollection of `tf.train.Example` objects, which are pretty well known in the TensorFlow world. But I agree that the information about sending binary data is not visible enough. I'll try to improve it.

> do you mean that we need to change l74 to something like `feature = {name + '_bytes': value}` for sending binary data to the endpoint?

Yes. It's a sign for the transform that the data should be b64-encoded. In a request we'd have something like this:

`"instances": [{"input_bytes": {"b64": "xxx"}}]`
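The b64 request shape described in the comment above can be sketched in plain Python. The helper name below is illustrative only; it is not part of tfx_bsl or the AI Platform client library:

```python
import base64
import json

def make_binary_instance(name, data):
    """Build one AI Platform Prediction instance for binary input.

    The `_bytes` suffix in the feature name signals that the value is
    base64-encoded binary data, matching the request shape above.
    """
    assert name.endswith('_bytes'), "binary inputs must be named *_bytes"
    return {name: {'b64': base64.b64encode(data).decode('ascii')}}

# Body of a projects.predict request carrying one binary instance.
body = {'instances': [make_binary_instance('input_bytes', b'the quick brown')]}
print(json.dumps(body))
# → {"instances": [{"input_bytes": {"b64": "dGhlIHF1aWNrIGJyb3du"}}]}
```

With `RunInference` the user never builds this JSON by hand; the transform assembles it from the `tf.train.Example` features.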
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r393007213

File path: website/src/documentation/patterns/ai-platform.md

> Before getting started, deploy a machine learning model to the cloud. The cloud service manages the infrastructure needed to handle prediction requests in both efficient and scalable way. Only Tensorflow models are supported.

Review comment: I'm curious what others think about this. @aaltay @Ardagan
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r393005385

File path: website/src/documentation/patterns/ai-platform.md

> Before getting started, deploy a machine learning model to the cloud. The cloud service manages the infrastructure needed to handle prediction requests in both efficient and scalable way. Only Tensorflow models are supported.

Review comment: Yes, the transform expects that the specified model is up and running. Users should also un-deploy redundant models manually.

It is theoretically possible to implement such initial setup and clean-up operations; however, I wonder whether this is a good use case for Beam. Those are long-running operations, and we'd have to wait until they finish. As a result, Dataflow workers (or those of any other runner) would be idle for some time.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r392991557

File path: website/src/documentation/patterns/ai-platform.md

> | beam.io.ReadFromText('gs://my-bucket/samples.json')

Review comment: `RunInference` uses Beam's built-in transform `BatchElements`, which batches elements (it consumes a PCollection of `tf.train.Example` and produces a PCollection of `List[tf.train.Example]`). Thanks to that, multiple input items can be sent in a single HTTP request.
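The batching behavior described above can be illustrated with a plain-Python sketch. This is not Beam's `BatchElements` (which additionally tunes batch sizes dynamically based on throughput); it only mimics the observable effect of grouping a stream into lists:

```python
def batch_elements(iterable, max_batch_size=3):
    """Rough sketch of what Beam's BatchElements does conceptually:
    turn a stream of elements into a stream of lists, each holding up
    to max_batch_size elements. (Illustrative; the real transform
    chooses batch sizes adaptively.)"""
    batch = []
    for element in iterable:
        batch.append(element)
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:  # flush the trailing partial batch
        yield batch

print(list(batch_elements(range(7))))  # → [[0, 1, 2], [3, 4, 5], [6]]
```

Downstream, `RunInference` would turn each such list into one HTTP request against the prediction service.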
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r392972669

File path: website/src/documentation/patterns/ai-platform.md

> | RunInference(

Review comment: Yes. The transform uses this client library: https://github.com/googleapis/google-api-python-client

> is there a plan to support gRPC in the future?

As far as I know, there is no such plan at the moment.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r392966450

File path: website/src/documentation/patterns/ai-platform.md

> `RunInference` is a PTransform able to perform two types of inference. One of them can use a service endpoint.

Review comment: Offline inference from a SavedModel instance. I decided it's not worth mentioning here, since the article is focused on AI Platform patterns.
kamilwu commented on a change in pull request #11075: [BEAM-9421] Website section that describes getting predictions using AI Platform Prediciton
URL: https://github.com/apache/beam/pull/11075#discussion_r392306737

File path: website/src/documentation/patterns/ai-platform.md

> // Getting predictions is not available for Java.

Review comment: Done.