Re: Write to multiple IOs in linear fashion

2021-03-25 Thread Alexey Romanenko
I think you are right, since "writer.close()” contains a business logic, it must be moved to @FinishBundle. The same thing about DeleteFn. I’ll create a Jira for that. > On 25 Mar 2021, at 00:49, Kenneth Knowles wrote: > > Alex's idea sounds good and like what Vincent maybe implemented. I am j

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Kenneth Knowles
The reason I was checking out the code is that sometimes a natural thing to output would be a summary of what was written. So each chunk of writes and the final chunk written in @FinishBundle. This is, for example, what SQL engines do (output # of rows written). You could output both the summary a

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Robert Bradshaw
Yeah, the entire input is not always what is needed, and can generally be achieved via input -> wait(side input of write) -> do something with the input Of course one could also do entire_input_as_output_of_wait -> MapTo(KV.of(null, null)) -> CombineGlobally(TrivialCombineFn) to reduce this

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Reuven Lax
I believe that the Wait transform turns this output into a side input, so outputting the input PCollection might be problematic. On Wed, Mar 24, 2021 at 4:49 PM Kenneth Knowles wrote: > Alex's idea sounds good and like what Vincent maybe implemented. I am just > reading really quickly so sorry i

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Vincent Marquez
On Wed, Mar 24, 2021 at 4:49 PM Kenneth Knowles wrote: > Alex's idea sounds good and like what Vincent maybe implemented. I am just > reading really quickly so sorry if I missed something... > > Checking out the code for the WriteFn I see a big problem: > > @Setup > public void setup() {

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Kenneth Knowles
Alex's idea sounds good and like what Vincent maybe implemented. I am just reading really quickly so sorry if I missed something... Checking out the code for the WriteFn I see a big problem: @Setup public void setup() { writer = new Mutator<>(spec, Mapper::saveAsync, "writes");

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Robert Bradshaw
On Wed, Mar 24, 2021 at 2:36 PM Ismaël Mejía wrote: > +dev > > Since we all agree that we should return something different than > PDone the real question is what should we return. > My proposal is that one returns a PCollection that consists, internally, of something contentless like nulls. Thi

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Ismaël Mejía
+dev Since we all agree that we should return something different than PDone the real question is what should we return. As a reminder we had a pretty interesting discussion about this already in the past but uniformization of our return values has not happened. This thread is worth reading for Vi

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Alexey Romanenko
I thought that was said about returning a PCollection of write results as it’s done in other IOs (as I mentioned as examples) that have _additional_ write methods, like “withWriteResults()” etc, that return PTransform<…, PCollection>. In this case, we keep backwards compatibility and just add n

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Robert Bradshaw
Returning PDone is an anti-pattern that should be avoided, but changing it now would be backwards incompatible. PRs to add non-PDone returning variants (probably as another option to the builders) that compose well with Wait, etc. would be welcome. On Wed, Mar 24, 2021 at 11:14 AM Alexey Romanenko

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Alexey Romanenko
In this way, I think “Wait” PTransform should work for you but, as it was mentioned before, it doesn’t work with PDone, only with PCollection as a signal. Since you already adjusted your own writer for that, it would be great to contribute it back to Beam in the way as it was done for other IO

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Steve Niemitz
This has been a common problem I've run into with lots of built-in IOs, I've generally submitted PRs for them to add support for emitting something once writed are completed. On Wed, Mar 24, 2021 at 1:04 PM Vincent Marquez wrote: > > *~Vincent* > > > On Wed, Mar 24, 2021 at 10:01 AM Reuven Lax

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Chamikara Jayalath
On Wed, Mar 24, 2021 at 10:04 AM Vincent Marquez wrote: > > *~Vincent* > > > On Wed, Mar 24, 2021 at 10:01 AM Reuven Lax wrote: > >> Does that work if cassandra returns a PDone? >> > > No, it doesn't work. I wrote my own CassandraIO.Write that is a > PTransform, PCollection> instead. > > I'm ju

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Vincent Marquez
*~Vincent* On Wed, Mar 24, 2021 at 10:01 AM Reuven Lax wrote: > Does that work if cassandra returns a PDone? > No, it doesn't work. I wrote my own CassandraIO.Write that is a PTransform, PCollection> instead. I'm just asking if there's a better way of doing this because I'm having to do this

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Vincent Marquez
No, it only needs to ensure that one record seen on Pubsub has successfully written to a database. So "record by record" is fine, or even "bundle". *~Vincent* On Wed, Mar 24, 2021 at 9:49 AM Alexey Romanenko wrote: > Do you want to wait for ALL records are written for Cassandra and then > wri

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Reuven Lax
Does that work if cassandra returns a PDone? On Wed, Mar 24, 2021 at 10:00 AM Chamikara Jayalath wrote: > If you want to wait for all records are written (per window) to Cassandra > before writing that window to PubSub, you should be able to use the Wait > transform: > https://github.com/apache/

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Chamikara Jayalath
If you want to wait for all records are written (per window) to Cassandra before writing that window to PubSub, you should be able to use the Wait transform: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Wait.java Thanks, Cham On Wed, Mar 2

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Alexey Romanenko
Do you want to wait for ALL records are written for Cassandra and then write all successfully written records to PubSub or it should be performed "record by record"? > On 24 Mar 2021, at 04:58, Vincent Marquez wrote: > > I have a common use case where my pipeline looks like this: > CassandraIO

Write to multiple IOs in linear fashion

2021-03-23 Thread Vincent Marquez
I have a common use case where my pipeline looks like this: CassandraIO.readAll -> Aggregate -> CassandraIO.write -> PubSubIO.write I do NOT want my pipeline to look like the following: CassandraIO.readAll -> Aggregate -> CassandraIO.write