Re: Beam Proposal: Pipeline Drain

2017-06-13 Thread Reuven Lax
Thanks Ismaël, I think the SDK portions of the Drain proposal are completely runner-independent. Some parts of Drain (e.g. advancing watermarks) will have to be done by the runners, of course. I'm working on the snapshot and update proposal. I hope to have time to send it out soon! Reuven On Mon

Re: Beam Proposal: Pipeline Drain

2017-06-12 Thread Ismaël Mejía
Hello Reuven, I finally took the time to read the Drain proposal. Thanks a lot for bringing this up; it looks like a nice fit with the current APIs, and it would be great if it could be implemented as much as possible in a runner-independent way. I am now eager to see the snapshot and update propo

Re: Beam Proposal: Pipeline Drain

2017-06-06 Thread Reuven Lax
I believe so, but it looks like the dispenser is an interface to the user for such features. We still need to define what the semantics of features like Drain are, and how they affect the pipeline. On Tue, Jun 6, 2017 at 12:06 PM, Jean-Baptiste Onofré wrote: > Hi Reuven, > > In the "Apache Beam:

Re: Beam Proposal: Pipeline Drain

2017-06-06 Thread Jean-Baptiste Onofré
Hi Reuven, In the "Apache Beam: Technical vision" document (dating from the incubation) (https://docs.google.com/document/d/1UyAeugHxZmVlQ5cEWo_eOPgXNQA1oD-rGooWOSwAqh8/edit?usp=sharing), I added a section named "Beam Pipelines Dispenser". The idea is to be able to bootstrap, run and control

Beam Proposal: Pipeline Drain

2017-06-06 Thread Reuven Lax
Hi all, Beam is a great programming model, but in order to really run pipelines (especially streaming pipelines, which are "always on") in a production setting, a set of features is necessary. Dataflow has a couple of those features built in (Drain and Update), and inspired by those I'll be se