[ 
https://issues.apache.org/jira/browse/BEAM-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated BEAM-8088:
----------------------------
    Status: Triage Needed  (was: Resolved)

> PCollection boundedness should be tracked and propagated in python sdk
> ----------------------------------------------------------------------
>
>                 Key: BEAM-8088
>                 URL: https://issues.apache.org/jira/browse/BEAM-8088
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: P2
>             Fix For: 2.16.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> As far as I can tell, the Python SDK does not care about the boundedness of 
> PCollections, even in streaming mode, but external transforms _do_.  In my 
> ongoing effort to get the PubsubIO external transforms working, I discovered 
> that I could not generate an unbounded write. 
> My pipeline looks like this:
> {code:python}
>     (
>         pipe
>         | 'PubSubInflow' >> external.pubsub.ReadFromPubSub(
>             subscription=subscription, with_attributes=True)
>         | 'PubSubOutflow' >> external.pubsub.WriteToPubSub(
>             topic=OUTPUT_TOPIC, with_attributes=True)
>     )
> {code}
> The PCollections returned from the external Read are Unbounded, as expected, 
> but the Python SDK is responsible for creating the intermediate PCollection, 
> which is always Bounded, and thus the external Write is always Bounded as well. 
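
The propagation the issue title asks for could be sketched roughly as follows. This is a minimal standalone illustration, not Beam's implementation: `SketchPCollection` and `derive_output` are hypothetical names introduced here, and the rule that an output is unbounded whenever any input is unbounded is an assumption about the intended behavior.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SketchPCollection:
    """Hypothetical stand-in for a PCollection; not Beam's class."""
    name: str
    is_bounded: bool = True


def derive_output(name: str, inputs: Iterable[SketchPCollection]) -> SketchPCollection:
    """Create an output collection whose boundedness follows its inputs.

    Assumed rule: the output is bounded only if every input is bounded.
    With no inputs (a root transform), this defaults to bounded.
    """
    bounded = all(p.is_bounded for p in inputs)
    return SketchPCollection(name, is_bounded=bounded)


# An unbounded source, as returned by the external Read in the issue.
read_out = SketchPCollection("PubSubInflow.out", is_bounded=False)

# The intermediate collection now inherits unboundedness instead of
# always being created as bounded.
intermediate = derive_output("Intermediate", [read_out])

# Mixing bounded and unbounded inputs yields an unbounded output.
batch = SketchPCollection("Batch.out")
merged = derive_output("Merged", [read_out, batch])
```

Under this rule, the collection handed to the external Write would be marked unbounded, which is the outcome the description says is currently impossible.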



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
