Thanks Ismaël, I think the SDK portions of the Drain proposal are completely runner independent. Some parts of Drain (e.g. advancing watermarks) will have to be done by the runners of course.
I'm working on the snapshot and update proposal. I hope to have time to send it out soon!

Reuven

On Mon, Jun 12, 2017 at 11:49 PM, Ismaël Mejía <ieme...@gmail.com> wrote:

> Hello Reuven,
>
> I finally took the time to read the Drain proposal, thanks a lot for
> bringing this, it looks like a nice fit with the current APIs and it
> would be great if this could be implemented as much as possible in a
> Runner independent way.
>
> I am eager now to see the snapshot and update proposal.
> Thanks again,
> Ismaël
>
> On Tue, Jun 6, 2017 at 10:03 PM, Reuven Lax <re...@google.com.invalid>
> wrote:
> >
> > I believe so, but it looks like the dispenser is an interface to the user
> > for such features. We still need to define what the semantics of features
> > like Drain are, and how they affect the pipeline.
> >
> > On Tue, Jun 6, 2017 at 12:06 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi Reuven,
> > >
> > > In the "Apache Beam: Technical vision" document (dating from the
> > > incubation)
> > > (https://docs.google.com/document/d/1UyAeugHxZmVlQ5cEWo_eOPgXNQA1oD-rGooWOSwAqh8/edit?usp=sharing),
> > > I added a section named "Beam Pipelines Dispenser".
> > >
> > > The idea is to be able to bootstrap, run and control pipelines (and the
> > > runners).
> > >
> > > I think it's somehow related. WDYT?
> > >
> > > Regards
> > > JB
> > >
> > > On 06/06/2017 07:43 PM, Reuven Lax wrote:
> > >
> > >> Hi all,
> > >>
> > >> Beam is a great programming model, but in order to really run pipelines
> > >> (especially streaming pipelines which are "always on") in a production
> > >> setting, there is a set of features necessary. Dataflow has a couple of
> > >> those features built in (Drain and Update), and inspired by those I'll be
> > >> sending out a few proposals for similar features in Beam.
> > >>
> > >> Please note that my intention here is _not_ to simply forklift the
> > >> Dataflow features to Beam. The Dataflow features are being used as
> > >> inspiration, and we have two years of experience of how real users have
> > >> used these features (and have also seen when users have found these
> > >> features limited and frustrating). In every case my Beam proposals are
> > >> different - hopefully better! - than the actual Dataflow feature that
> > >> exists today.
> > >>
> > >> I think all runners would greatly benefit from production-control features
> > >> like this, and I would love to see community input. The first proposal is
> > >> for a way of draining a streaming pipeline before stopping it, and here it
> > >> is
> > >> <https://docs.google.com/document/d/1NExwHlj-2q2WUGhSO4jTu8XGhDPmm3cllSN8IMmWci8/edit>.
> > >>
> > >> Reuven
> > >>
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com