[ 
https://issues.apache.org/jira/browse/BEAM-9322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047920#comment-17047920
 ] 

Rui Wang commented on BEAM-9322:
--------------------------------

[~rohdesam] per some discussion that happened in the PR (or somewhere else), 
I have moved this Jira to 2.21.0. Please let me know if you disagree.

> Python SDK ignores manually set PCollection tags
> ------------------------------------------------
>
>                 Key: BEAM-9322
>                 URL: https://issues.apache.org/jira/browse/BEAM-9322
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Sam Rohde
>            Assignee: Sam Rohde
>            Priority: Critical
>             Fix For: 2.21.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The Python SDK currently ignores tags that are manually set on PCollections 
> when it adds a PCollection to the PTransform outputs while applying a 
> PTransform:
> https://github.com/apache/beam/blob/688a4ea53f315ec2aa2d37602fd78496fca8bb4f/sdks/python/apache_beam/pipeline.py#L595
> In the add_output method:
> https://github.com/apache/beam/blob/688a4ea53f315ec2aa2d37602fd78496fca8bb4f/sdks/python/apache_beam/pipeline.py#L872
> the tag is set to None for all PValues, so the output tags are set to an 
> enumeration index over the PCollection outputs. The tags are not propagated 
> correctly, which is a problem for anyone relying on the output PCollection 
> tags to match the user-set values.
> The fix is to resolve BEAM-1833 and always pass in the tags. However, that 
> doesn't fix the problem for nested PCollections. If you have a dict of lists 
> of PCollections, what should their tags be? To fix this, first propagate the 
> correct tags, then discuss the best auto-generated tags with the community.
> Some users may rely on the old implementation, so a flag, 
> "force_generated_pcollection_output_ids", will be added and set to False by 
> default. If True, the SDK falls back to the old implementation and 
> generates tags for PCollections.
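
The difference between the two behaviors described in the issue can be sketched with a toy model. This is not Beam's code: FakePCollection and assign_output_tags are hypothetical names, plain integer indices stand in for the generated tags, and only the parameter name is taken from the flag named in the issue. It only illustrates, under those assumptions, how user-set tags might be kept by default and replaced by generated ones when the flag is set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class FakePCollection:
    """Hypothetical stand-in for a PCollection: just a name and an optional tag."""
    name: str
    tag: Optional[str] = None


def assign_output_tags(
    outputs: List[FakePCollection],
    force_generated_pcollection_output_ids: bool = False,
) -> Dict[Union[str, int], FakePCollection]:
    """Toy model of keying a transform's outputs by tag.

    With the flag set (the "old" behavior in this sketch), every output is
    keyed by its enumeration index and user-set tags are ignored. Otherwise a
    user-set tag is used, falling back to the index when no tag was set.
    """
    tagged: Dict[Union[str, int], FakePCollection] = {}
    for index, pcoll in enumerate(outputs):
        if force_generated_pcollection_output_ids or pcoll.tag is None:
            key: Union[str, int] = index  # generated tag (assumption: plain index)
        else:
            key = pcoll.tag  # propagate the tag the user set
        if key in tagged:
            raise ValueError(f"duplicate output tag: {key!r}")
        tagged[key] = pcoll
    return tagged


outputs = [FakePCollection("evens_pc", tag="evens"),
           FakePCollection("odds_pc", tag="odds")]
propagated = assign_output_tags(outputs)
generated = assign_output_tags(
    outputs, force_generated_pcollection_output_ids=True)
```

Here propagated is keyed by "evens" and "odds", while generated is keyed by 0 and 1. The sketch handles only a flat list of outputs; what tags nested structures such as a dict of lists should receive is the open question the issue leaves to community discussion.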



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
