[ 
https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=411471&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411471
 ]

ASF GitHub Bot logged work on BEAM-8019:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Mar/20 23:07
            Start Date: 27/Mar/20 23:07
    Worklog Time Spent: 10m 
      Work Description: robertwb commented on pull request #11185: [BEAM-8019] Updates Python SDK to handle remote SDK coders and preserve tags added by remote SDKs
URL: https://github.com/apache/beam/pull/11185#discussion_r399570138
 
 

 ##########
 File path: sdks/python/apache_beam/pipeline.py
 ##########
 @@ -1128,29 +1136,67 @@ def from_runner_api(proto,  # type: beam_runner_api_pb2.PTransform
                       context  # type: PipelineContext
                      ):
     # type: (...) -> AppliedPTransform
-    def is_side_input(tag):
+    def is_python_side_input(tag):
       # type: (str) -> bool
       # As per named_inputs() above.
-      return tag.startswith('side')
+      return re.match(SIDE_INPUT_REGEX, tag)
+
+    side_input_tags = []
+    if common_urns.primitives.PAR_DO.urn == proto.spec.urn:
+      # Preserving side input tags.
+      from apache_beam.utils import proto_utils
+      from apache_beam.portability.api import beam_runner_api_pb2
+      payload = (
+          proto_utils.parse_Bytes(
+              proto.spec.payload, beam_runner_api_pb2.ParDoPayload))
+      for tag, si in payload.side_inputs.items():
+        side_input_tags.append(tag)
 
     main_inputs = [
         context.pcollections.get_by_id(id) for tag,
-        id in proto.inputs.items() if not is_side_input(tag)
+        id in proto.inputs.items() if tag not in side_input_tags
     ]
 
-    # Ordering is important here.
-    indexed_side_inputs = [
-        (get_sideinput_index(tag), context.pcollections.get_by_id(id)) for tag,
-        id in proto.inputs.items() if is_side_input(tag)
-    ]
+    # Using a list here so that we can pass this into a function
+    # TODO: use nonlocal after fully migrated to Python3.
+    next_index = [0]
+
+    def _get_sideinput_index(tag, next_index):
 
 Review comment:
   It feels like this could result in duplicate indices if someone has tags 
named ['tag', 'side0', ...]. But maybe in that case it's OK? 
   
   Please reference (create?) a JIRA about making side inputs always key-valued 
rather than having this kind of logic (here and elsewhere).
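
   To make the concern above concrete, here is a minimal sketch of how a
collision could arise. The body of _get_sideinput_index is not shown in the
quoted hunk, so everything here is an assumption: the pattern
ASSUMED_SIDE_INPUT_REGEX, the helper names, and the rule that Python-style
tags keep their embedded index while any other tag takes the next sequential
index. It is not the PR's actual implementation.

```python
import re

# Assumed pattern; the real SIDE_INPUT_REGEX is not shown in the quoted diff.
ASSUMED_SIDE_INPUT_REGEX = r'side([0-9]+)(-.*)?$'


def assign_side_input_indices(tags):
    """Hypothetical index assignment, for illustration only.

    Tags matching the assumed Python pattern keep the index embedded in the
    tag; every other tag receives the next sequential index.
    """
    next_index = 0
    indices = {}
    for tag in tags:
        match = re.match(ASSUMED_SIDE_INPUT_REGEX, tag)
        if match:
            indices[tag] = int(match.group(1))
        else:
            indices[tag] = next_index
            next_index += 1
    return indices


def duplicate_indices(indices):
    """Return {index: [tags]} for every index claimed by more than one tag."""
    by_index = {}
    for tag, index in indices.items():
        by_index.setdefault(index, []).append(tag)
    return {index: tags for index, tags in by_index.items() if len(tags) > 1}


# A remote SDK tag ('tag') mixed with a Python-style tag ('side0').
assigned = assign_side_input_indices(['tag', 'side0'])
collisions = duplicate_indices(assigned)
```

   Under these assumptions both 'tag' and 'side0' end up at index 0, which is
the duplicate described above. Keying side inputs by tag instead of by a
parsed position would avoid the ambiguity.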
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 411471)
    Time Spent: 12h  (was: 11h 50m)

> Support cross-language transforms for DataflowRunner
> ----------------------------------------------------
>
>                 Key: BEAM-8019
>                 URL: https://issues.apache.org/jira/browse/BEAM-8019
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chamikara Madhusanka Jayalath
>            Assignee: Chamikara Madhusanka Jayalath
>            Priority: Major
>          Time Spent: 12h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
