chamikaramj commented on a change in pull request #13283:
URL: https://github.com/apache/beam/pull/13283#discussion_r521540557
##########
File path: sdks/python/apache_beam/transforms/ptransform.py
##########
@@ -695,9 +695,9 @@ def from_runner_api(cls,
# type: (...) -> Optional[PTransform]
if proto is None or proto.spec is None or not proto.spec.urn:
return None
- parameter_type, constructor = cls._known_urns[proto.spec.urn]
try:
+ parameter_type, constructor = cls._known_urns[proto.spec.urn]
Review comment:
Ditto. Not sure why we need this update given that tests should already
pass for Dataflow.
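For context on what the hunk above changes: moving the `_known_urns` lookup inside the `try` means a missing URN is handled by the `except` clause instead of raising a `KeyError` straight out of `from_runner_api`. The sketch below is only an illustration. The registry, the function names, and the holder fallback in the `except` branch are hypothetical stand-ins, since the body of the real `except` clause is not shown in this diff.

```python
# Hypothetical stand-ins for illustration; these are not Beam's real internals.
KNOWN_URNS = {"example:urn:known": (bytes, lambda payload: ("built", payload))}


def build_lookup_outside_try(urn, payload, allow_holder):
    # The lookup runs before the try block, so an unknown URN raises KeyError
    # directly and the fallback below never gets a chance to run.
    parameter_type, constructor = KNOWN_URNS[urn]
    try:
        return constructor(parameter_type(payload))
    except Exception:
        if allow_holder:
            return ("holder", urn)
        raise


def build_lookup_inside_try(urn, payload, allow_holder):
    # The lookup runs inside the try block, so an unknown URN is treated like
    # any other construction failure and can fall back to a placeholder.
    try:
        parameter_type, constructor = KNOWN_URNS[urn]
        return constructor(parameter_type(payload))
    except Exception:
        if allow_holder:
            # Assumed fallback: return a placeholder instead of failing.
            return ("holder", urn)
        raise
```

Under these assumptions the two variants behave the same for registered URNs and differ only for unregistered ones, which is the case the question above is asking about.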
##########
File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
##########
@@ -542,19 +593,32 @@ def run_pipeline(self, pipeline, options):
# TODO(chamikara): remove following pipeline and pipeline proto recreation
# after portable job submission path is fully in place.
from apache_beam import Pipeline
- pipeline = Pipeline.from_runner_api(
+ pipeline, src_context = Pipeline.from_runner_api(
self.proto_pipeline,
pipeline.runner,
options,
+ return_context=True,
allow_proto_holders=True)
# Pipelines generated from proto do not have output set to PDone set for
# leaf elements.
pipeline.visit(self._set_pdone_visitor(pipeline))
+ from apache_beam.runners import pipeline_context
+ dst_context = pipeline_context.PipelineContext(
+ component_id_map=pipeline.component_id_map,
+ default_environment=self._default_environment)
+
+ # Copy external environments to prevent dangling environment ids
+ pipeline.visit(
Review comment:
Why do we need to update dataflow_runner for the test?
Existing tests should already work for dataflow_runner without these updates.
Duplicate environments are handled here when submitting the Dataflow job
request:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L303