[ https://issues.apache.org/jira/browse/BEAM-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anonymous updated BEAM-8088:
----------------------------
Status: Triage Needed (was: Resolved)
> PCollection boundedness should be tracked and propagated in python sdk
> ----------------------------------------------------------------------
>
> Key: BEAM-8088
> URL: https://issues.apache.org/jira/browse/BEAM-8088
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Chad Dombrova
> Assignee: Chad Dombrova
> Priority: P2
> Fix For: 2.16.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> As far as I can tell, the Python SDK does not track the boundedness of
> PCollections, even in streaming mode, but external transforms _do_. In my
> ongoing effort to get the PubsubIO external transforms working, I discovered
> that I could not generate an unbounded write.
> My pipeline looks like this:
> {code:python}
> (
>     pipe
>     | 'PubSubInflow' >> external.pubsub.ReadFromPubSub(
>         subscription=subscription, with_attributes=True)
>     | 'PubSubOutflow' >> external.pubsub.WriteToPubSub(
>         topic=OUTPUT_TOPIC, with_attributes=True)
> )
> {code}
> The PCollections returned from the external Read are Unbounded, as expected,
> but Python is responsible for creating the intermediate PCollection, which is
> always Bounded, so the external Write is always Bounded as well.
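The propagation rule the issue asks for can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Beam implementation: `ToyPCollection`, `propagate_boundedness`, and `apply_transform` are hypothetical names invented for illustration, and the rule assumed here is that an output is bounded only if every input is bounded.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ToyPCollection:
    """Hypothetical stand-in for a PCollection; not the Beam class."""
    name: str
    is_bounded: bool


def propagate_boundedness(inputs: Iterable[ToyPCollection]) -> bool:
    """Assumed rule: the output is bounded only if every input is bounded.

    With no inputs (a root transform) this returns True, i.e. the toy
    model defaults to bounded unless a source says otherwise.
    """
    return all(pc.is_bounded for pc in inputs)


def apply_transform(name: str, inputs: Iterable[ToyPCollection]) -> ToyPCollection:
    """Create an output collection that inherits boundedness from its inputs."""
    inputs = list(inputs)
    return ToyPCollection(name, propagate_boundedness(inputs))


# The external read yields an unbounded collection.
read_out = ToyPCollection("PubSubInflow.out", is_bounded=False)

# An intermediate collection created on the Python side now stays unbounded
# instead of silently defaulting to bounded.
intermediate = apply_transform("Intermediate", [read_out])

# Mixing bounded and unbounded inputs yields an unbounded output.
side = ToyPCollection("Side", is_bounded=True)
merged = apply_transform("Merged", [side, read_out])
```

Under this rule, the collection handed to the external write in the pipeline above would be reported as unbounded, which is the behavior the issue requests.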
--
This message was sent by Atlassian Jira
(v8.20.10#820010)