I am +1 to add Flink runner early as preview. Just small reminder to get DataArtisans software grant for the runner contribution ;)
- Henry On Friday, February 12, 2016, Maximilian Michels <[email protected]> wrote: > Hi Beamers, > > Now that things are getting started and we discuss the technical > vision of Beam, we would like to contribute the Flink runner and start > by sharing some details about the status and the roadmap. > > The Flink Runner integrates deeply and naturally with the Dataflow SDK > (the Beam precursor), because the Flink DataStream API shares many > concepts with the Dataflow model. > Based on whether the program input is bounded or unbounded, the > program goes against Flink's DataStream or DataSet API. > > A quick preview at some of the nice features of the runner: > > - Support for stream transformations, event time, watermarks > - The full Dataflow windowing semantics, including fixed/sliding > time windows, and session windows > - Integration with Flink's streaming sources (Kafka, RabbitMQ, ...) > > - Batch (bounded sources) integrates fully with Flink's managed > memory techniques and out-of-core algorithms, supporting huge joins > and aggregations. > - Integration with Flink's batch API sources (plain text, CSV, Avro, > JDBC, HBase, ...) > > - Integration with Flink's fault tolerance - both batch and > streaming program recover from failures > - After upgrading the dependency to Flink 1.0, one could even use > the Flink Savepoints feature (save streaming state for later resuming) > with the Dataflow programs. > > Attached you can find the document we drew up with more information > about the current state of the Runner and the roadmap for its upcoming > features: > > > https://docs.google.com/document/d/1QM_X70VvxWksAQ5C114MoAKb1d9Vzl2dLxEZM4WYogo/edit?usp=sharing > > The Runner executes the quasi complete Beam streaming model (well, > Dataflow, actually, because Beam is not there, yet). > > Given the current excitement and buzz around Beam, we could add this > runner to the Beam repository and link it as a "preview" for the > people that want to get a feeling of what it will be like to write and > run streaming (unbounded) Beam programs. That would give people > something tangible until the actual Beam code is available. > > What do you think? > > Best, > Max >
