I need to update this ;) To start with, you could just take a look at branch-2.0.
On Sun, May 22, 2016, 01:23 Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:

> Thank you, Amit! I was looking for this kind of information.
>
> I have not fully read your paper yet, but I see in it a TODO with basically
> the same question(s) [1]; maybe someone from the Spark team (including
> Databricks) will be so kind as to send some feedback.
>
> Best,
> Ovidiu
>
> [1] Integrate “Structured Streaming”: //TODO - What (and how) will Spark 2.0
> support (out-of-order data, event-time windows, watermarks, triggers,
> accumulation modes)? How straightforward will it be to integrate with the
> Beam Model?
>
> On 21 May 2016, at 23:00, Sela, Amit <ans...@paypal.com> wrote:
>
> It seems I forgot to add the link to the “Technical Vision” paper, so here
> it is:
> https://docs.google.com/document/d/1y4qlQinjjrusGWlgq-mYmbxRW2z7-_X5Xax-GG0YsC0/edit?usp=sharing
>
> From: "Sela, Amit" <ans...@paypal.com>
> Date: Saturday, May 21, 2016 at 11:52 PM
> To: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>, "user @spark" <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
>
> This is a “Technical Vision” paper for the Spark runner, which provides
> general guidelines for the future development of Spark’s Beam support as
> part of the Apache Beam (incubating) project.
> This is our JIRA:
> https://issues.apache.org/jira/browse/BEAM/component/12328915/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
>
> Generally, I’m currently working on Datasets integration for Batch (to
> replace RDD) against Spark 1.6, and going towards enhancing stream
> processing capabilities with Structured Streaming (2.0).
>
> And you’re welcome to ask those questions on the Apache Beam (incubating)
> mailing list as well ;)
> http://beam.incubator.apache.org/mailing_lists/
>
> Thanks,
> Amit
>
> From: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>
> Date: Tuesday, May 17, 2016 at 12:11 AM
> To: "user @spark" <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
>
> Could you please consider a short answer regarding the Apache Beam
> Capability Matrix TODOs for the future Spark 2.0 release [4]? (Some
> related references below [5][6].)
>
> Thanks
>
> [4] http://beam.incubator.apache.org/capability-matrix/#cap-full-what
> [5] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
> [6] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
>
> On 16 May 2016, at 14:18, Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:
>
> Hi,
>
> We can see in [2] many interesting (and expected!) improvements (promises)
> like extended SQL support, a unified API (DataFrames, Datasets), an
> improved engine (Tungsten builds on ideas from modern compilers and MPP
> databases, similar to Flink [3]), structured streaming, etc. It seems we
> are witnessing a smart unification of Big Data analytics (Spark and Flink:
> the best of two worlds)!
> *How does Spark respond to the missing What/Where/When/How questions
> (capabilities) highlighted in the unified Beam model [1]?*
>
> Best,
> Ovidiu
>
> [1] https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
> [2] https://databricks.com/blog/2016/05/11/spark-2-0-technical-preview-easier-faster-and-smarter.html
> [3] http://stratosphere.eu/project/publications/
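As a rough illustration of the four questions discussed in this thread, here is a plain-Python toy (deliberately *not* the Spark or Beam API; the function `run` and its `watermark_lag` parameter are invented for this sketch): *What* is computed (a per-key count), *Where* in event time (fixed windows), *When* results are emitted (a watermark passing the end of a window), and *How* late data is handled (here it is merely flagged; real engines offer accumulation modes and allowed lateness).

```python
from collections import defaultdict

WINDOW = 60  # fixed 60-second event-time windows (the "Where")

def run(events, watermark_lag=10):
    """Process (event_time, key) pairs in arrival order, possibly out of order.

    Returns (emitted, open_windows, late):
      emitted      - per-window, per-key counts whose window the watermark closed
      open_windows - counts still waiting for the watermark
      late         - events that arrived behind the watermark
    """
    counts = defaultdict(int)  # (window_start, key) -> count  (the "What")
    emitted, late = [], []
    max_seen = 0
    for ts, key in events:
        max_seen = max(max_seen, ts)
        watermark = max_seen - watermark_lag  # heuristic completeness bound
        if ts < watermark:
            late.append((ts, key))  # the "How": this toy only flags late data
            continue
        counts[(ts - ts % WINDOW, key)] += 1
        # The "When": emit every window whose end the watermark has passed.
        for w, k in list(counts):
            if w + WINDOW <= watermark:
                emitted.append(((w, k), counts.pop((w, k))))
    return emitted, counts, late

emitted, open_windows, late = run([(5, "a"), (30, "a"), (130, "b"), (7, "a")])
print(emitted)  # the 0-60s window for "a" closed once the watermark passed 60
print(late)     # (7, "a") arrived after the watermark had moved past it
```

In Beam terms these correspond roughly to transforms (What), windowing (Where), triggers (When), and accumulation modes (How); this sketch discards late data outright, whereas the actual engines let you accumulate or retract earlier results.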