[GitHub] [beam] ryanthompson591 commented on a change in pull request #16917: [BEAM-13972] Add RunInference interface

GitBox Thu, 03 Mar 2022 07:01:22 -0800


ryanthompson591 commented on a change in pull request #16917:
URL: https://github.com/apache/beam/pull/16917#discussion_r818729200




##########
File path: sdks/python/apache_beam/ml/inference/api.py
##########
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from dataclasses import dataclass
+from enum import Enum
+from typing import Tuple, TypeVar, Union
+
+import apache_beam as beam
+
+
+class PyTorchDevice(Enum):
+  CPU = 1
+  GPU = 2
+
+
+class SklearnSerializationType(Enum):
+  PICKLE = 1
+  JOBLIB = 2
+
+
+@dataclass
+class BaseModelSpec:
+  model_url: str
+
+
+@dataclass
+class PyTorchModelSpec(BaseModelSpec):
+  device: PyTorchDevice
+
+
+@dataclass
+class SklearnModelSpec(BaseModelSpec):
+  serialization_type: SklearnSerializationType
+
+
+_K = TypeVar('_K')
+_INPUT_TYPE = TypeVar('_INPUT_TYPE')
+_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')
+
+
+@dataclass
+class PredictionResult:
+  example: _INPUT_TYPE
+  inference: _OUTPUT_TYPE
+
+
+@beam.ptransform_fn
+@beam.typehints.with_input_types(Union[_INPUT_TYPE, Tuple[_K, _INPUT_TYPE]])
+@beam.typehints.with_output_types(
+    Union[PredictionResult, Tuple[_K, PredictionResult]])
+def RunInference(
+    examples: beam.pvalue.PCollection,
+    model: BaseModelSpec) -> beam.pvalue.PCollection:
+  """Run inference with a model.
+
+  There one type of inference you can perform using this PTransform:
+    1. In-process inference from a SavedModel instance.

Review comment:
       I'm not sure this will still be true once we start implementing this
interface. Ideally, one of the first things we should do is wrap the TFX
implementation, which will use a service like Vertex AI.

##########
File path: sdks/python/apache_beam/ml/inference/api.py
##########
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from dataclasses import dataclass
+import apache_beam as beam
+from typing import Tuple, TypeVar, Union
+# TODO: implement RunInferenceImpl
+# from apache_beam.ml.inference.base import RunInferenceImpl
+
+
+@dataclass
+class BaseModelSpec:
+  model_url: str
+
+
+@dataclass
+class PyTorchModelSpec(BaseModelSpec):

Review comment:
       It's fine, though in general I prefer shorter names where they are
adequately precise.

##########
File path: sdks/python/apache_beam/ml/inference/api.py
##########
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from dataclasses import dataclass
+from enum import Enum
+from typing import Tuple, TypeVar, Union
+
+import apache_beam as beam
+
+
+class PyTorchDevice(Enum):
+  CPU = 1
+  GPU = 2
+
+
+class SklearnSerializationType(Enum):
+  PICKLE = 1
+  JOBLIB = 2
+
+
+@dataclass
+class BaseModelSpec:
+  model_url: str
+
+
+@dataclass
+class PyTorchModelSpec(BaseModelSpec):
+  device: PyTorchDevice
+
+
+@dataclass
+class SklearnModelSpec(BaseModelSpec):
+  serialization_type: SklearnSerializationType
+
+
+_K = TypeVar('_K')
+_INPUT_TYPE = TypeVar('_INPUT_TYPE')
+_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')
+
+
+@dataclass
+class PredictionResult:
+  example: _INPUT_TYPE
+  inference: _OUTPUT_TYPE
+
+
+@beam.ptransform_fn
+@beam.typehints.with_input_types(Union[_INPUT_TYPE, Tuple[_K, _INPUT_TYPE]])
+@beam.typehints.with_output_types(
+    Union[PredictionResult, Tuple[_K, PredictionResult]])
+def RunInference(
+    examples: beam.pvalue.PCollection,
+    model: BaseModelSpec) -> beam.pvalue.PCollection:
+  """Run inference with a model.

Review comment:
       Tying this API doc to the details of what is implemented at any given
time will probably let it go out of date and cause confusion.

   I suggest instead that we clearly define what this transform does, without
going into what is supported yet or what our plans are.

   Here's a suggestion:

   A transform that takes a PCollection of examples (or features) to be used on
an ML model. It outputs inferences (or predictions) for those examples in a
PCollection of PredictionResults, each containing the input example and the
output inference.

   If examples are paired with keys, it outputs a tuple (key,
PredictionResult) for each (key, example) input.

   Models for supported frameworks can be loaded via a URI. Supported services
can also be used.

   TODO: link to a help page that shows what is and isn't supported.
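
   To make the keyed/unkeyed contract in the suggested docstring concrete,
here is a minimal plain-Python sketch. It is not Beam code and not the
implementation in this PR: `run_inference_sketch` and its `predict` callable
are hypothetical names used only for illustration, and treating any 2-tuple as
a (key, example) pair is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Tuple, Union


@dataclass
class PredictionResult:
  # Mirrors the dataclass in the diff: the input example and its inference.
  example: Any
  inference: Any


def run_inference_sketch(
    examples: Iterable[Any],
    predict: Callable[[Any], Any],
) -> Iterator[Union[PredictionResult, Tuple[Any, PredictionResult]]]:
  """Hypothetical stand-in for the transform, operating on a plain iterable."""
  for element in examples:
    # Assumption: a 2-tuple is a (key, example) pair. A real transform would
    # need an unambiguous way to tell keyed input from tuple-valued examples.
    if isinstance(element, tuple) and len(element) == 2:
      key, example = element
      yield key, PredictionResult(example, predict(example))
    else:
      yield PredictionResult(element, predict(element))
```

   With an unkeyed input each element comes back as a PredictionResult; with
(key, example) input each element comes back as (key, PredictionResult), so
callers can join inferences back to their source records.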



