Returning PDone is an anti-pattern that should be avoided, but changing it now would be backwards incompatible. PRs to add non-PDone returning variants (probably as another option to the builders) that compose well with Wait, etc. would be welcome.
On Wed, Mar 24, 2021 at 11:14 AM Alexey Romanenko <aromanenko....@gmail.com> wrote: > In this way, I think “Wait” PTransform should work for you but, as it was > mentioned before, it doesn’t work with PDone, only with PCollection as a > signal. > > Since you already adjusted your own writer for that, it would be great to > contribute it back to Beam in the way as it was done for other IOs (for > example, JdbcIO [1] or BigtableIO [2]) > > In general, I think we need to have it for all IOs, at least to use with > “Wait” because this pattern it's quite often required. > > [1] > https://github.com/apache/beam/blob/ab1dfa13a983d41669e70e83b11f58a83015004c/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L1078 > [2] > https://github.com/apache/beam/blob/ab1dfa13a983d41669e70e83b11f58a83015004c/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L715 > > On 24 Mar 2021, at 18:01, Vincent Marquez <vincent.marq...@gmail.com> > wrote: > > No, it only needs to ensure that one record seen on Pubsub has > successfully written to a database. So "record by record" is fine, or even > "bundle". > > *~Vincent* > > > On Wed, Mar 24, 2021 at 9:49 AM Alexey Romanenko <aromanenko....@gmail.com> > wrote: > >> Do you want to wait for ALL records are written for Cassandra and then >> write all successfully written records to PubSub or it should be performed >> "record by record"? >> >> On 24 Mar 2021, at 04:58, Vincent Marquez <vincent.marq...@gmail.com> >> wrote: >> >> I have a common use case where my pipeline looks like this: >> CassandraIO.readAll -> Aggregate -> CassandraIO.write -> PubSubIO.write >> >> I do NOT want my pipeline to look like the following: >> >> CassandraIO.readAll -> Aggregate -> CassandraIO.write >> | >> -> >> PubsubIO.write >> >> Because I need to ensure that only items written to Pubsub have >> successfully finished a (quorum) write. >> >> Since CassandraIO.write is a PTransform<A, PDone> I can't actually use it >> here so I often roll my own 'writer', but maybe there is a recommended way >> of doing this? >> >> Thanks in advance for any help. >> >> *~Vincent* >> >> >> >