On Wed, Mar 24, 2021 at 10:04 AM Vincent Marquez <[email protected]> wrote:
> > *~Vincent* > > > On Wed, Mar 24, 2021 at 10:01 AM Reuven Lax <[email protected]> wrote: > >> Does that work if cassandra returns a PDone? >> > > No, it doesn't work. I wrote my own CassandraIO.Write that is a > PTransform<PCollection<A>, PCollection<A>> instead. > > I'm just asking if there's a better way of doing this because I'm having > to do this with multiple types of Writers, and don't want to have to hand > roll my own Write for each IO type I need this pattern for. > Ah OK. Thanks for the correction. I think we should update the documentation of the Wait transform to clarify this. I'm afraid there's no good solution for PTransforms that return PDone then :( There were discussion elsewhere to update some of the sinks to return some sort of a result PCollection but we haven't done this for all sinks. Any contributions related to this are welcome. Thanks, Cham > >> >> On Wed, Mar 24, 2021 at 10:00 AM Chamikara Jayalath <[email protected]> >> wrote: >> >>> If you want to wait for all records are written (per window) to >>> Cassandra before writing that window to PubSub, you should be able to use >>> the Wait transform: >>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Wait.java >>> >>> Thanks, >>> Cham >>> >>> On Wed, Mar 24, 2021 at 9:49 AM Alexey Romanenko < >>> [email protected]> wrote: >>> >>>> Do you want to wait for ALL records are written for Cassandra and then >>>> write all successfully written records to PubSub or it should be performed >>>> "record by record"? >>>> >>>> On 24 Mar 2021, at 04:58, Vincent Marquez <[email protected]> >>>> wrote: >>>> >>>> I have a common use case where my pipeline looks like this: >>>> CassandraIO.readAll -> Aggregate -> CassandraIO.write -> PubSubIO.write >>>> >>>> I do NOT want my pipeline to look like the following: >>>> >>>> CassandraIO.readAll -> Aggregate -> CassandraIO.write >>>> | >>>> -> >>>> PubsubIO.write >>>> >>>> Because I need to ensure that only items written to Pubsub have >>>> successfully finished a (quorum) write. >>>> >>>> Since CassandraIO.write is a PTransform<A, PDone> I can't actually use >>>> it here so I often roll my own 'writer', but maybe there is a recommended >>>> way of doing this? >>>> >>>> Thanks in advance for any help. >>>> >>>> *~Vincent* >>>> >>>> >>>>
