I need to update this ;) To start with, you could just take a look at branch-2.0.
On Sun, May 22, 2016, 01:23 Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:

> Thank you, Amit! I was looking for this kind of information.
>
> I have not fully read your paper yet, but I see in it a TODO with basically
> the same question(s) [1]; maybe someone from the Spark team (including
> Databricks) will be so kind as to send some feedback.
>
> Best,
> Ovidiu
>
> [1] Integrate “Structured Streaming”: //TODO - What (and how) will Spark 2.0
> support (out-of-order data, event-time windows, watermarks, triggers,
> accumulation modes)? How straightforward will it be to integrate with the
> Beam Model?
>
> On 21 May 2016, at 23:00, Sela, Amit <ans...@paypal.com> wrote:
>
> It seems I forgot to add the link to the “Technical Vision” paper, so here
> it is:
> https://docs.google.com/document/d/1y4qlQinjjrusGWlgq-mYmbxRW2z7-_X5Xax-GG0YsC0/edit?usp=sharing
>
> From: "Sela, Amit" <ans...@paypal.com>
> Date: Saturday, May 21, 2016 at 11:52 PM
> To: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>, "user @spark" <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
>
> This is a “Technical Vision” paper for the Spark runner, which provides
> general guidelines for the future development of Spark’s Beam support as
> part of the Apache Beam (incubating) project.
> This is our JIRA:
> https://issues.apache.org/jira/browse/BEAM/component/12328915/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
>
> Generally, I’m currently working on Datasets integration for Batch (to
> replace RDD) against Spark 1.6, and going towards enhancing stream
> processing capabilities with Structured Streaming (2.0).
>
> And you’re welcome to ask those questions on the Apache Beam (incubating)
> mailing list as well ;)
> http://beam.incubator.apache.org/mailing_lists/
>
> Thanks,
> Amit
>
> From: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>
> Date: Tuesday, May 17, 2016 at 12:11 AM
> To: "user @spark" <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
>
> Could you please consider a short answer regarding the Apache Beam
> Capability Matrix TODOs for the future Spark 2.0 release [4]? (Some
> related references below [5][6].)
>
> Thanks
>
> [4] http://beam.incubator.apache.org/capability-matrix/#cap-full-what
> [5] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
> [6] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
>
> On 16 May 2016, at 14:18, Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:
>
> Hi,
>
> We can see in [2] many interesting (and expected!) improvements (promises)
> like extended SQL support, a unified API (DataFrames, Datasets), an
> improved engine (Tungsten builds on ideas from modern compilers and MPP
> databases, similar to Flink [3]), structured streaming, etc. It seems we
> are witnessing a smart unification of Big Data analytics (Spark and Flink:
> the best of two worlds)!
> *How does Spark respond to the missing What/Where/When/How questions
> (capabilities) highlighted in the unified Beam model [1]?*
>
> Best,
> Ovidiu
>
> [1] https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
> [2] https://databricks.com/blog/2016/05/11/spark-2-0-technical-preview-easier-faster-and-smarter.html
> [3] http://stratosphere.eu/project/publications/
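As a rough illustration of the four questions discussed in this thread, here is a plain-Python toy (deliberately *not* the Spark or Beam API; the function `run` and its `watermark_lag` parameter are invented for this sketch): *What* is computed (a per-key count), *Where* in event time (fixed windows), *When* results are emitted (a watermark passing the end of a window), and *How* late data is handled (here it is merely flagged; real engines offer accumulation modes and allowed lateness).

```python
from collections import defaultdict

WINDOW = 60  # fixed 60-second event-time windows (the "Where")

def run(events, watermark_lag=10):
    """Process (event_time, key) pairs in arrival order, possibly out of order.

    Returns (emitted, open_windows, late):
      emitted      - per-window, per-key counts whose window the watermark closed
      open_windows - counts still waiting for the watermark
      late         - events that arrived behind the watermark
    """
    counts = defaultdict(int)  # (window_start, key) -> count  (the "What")
    emitted, late = [], []
    max_seen = 0
    for ts, key in events:
        max_seen = max(max_seen, ts)
        watermark = max_seen - watermark_lag  # heuristic completeness bound
        if ts < watermark:
            late.append((ts, key))  # the "How": this toy only flags late data
            continue
        counts[(ts - ts % WINDOW, key)] += 1
        # The "When": emit every window whose end the watermark has passed.
        for w, k in list(counts):
            if w + WINDOW <= watermark:
                emitted.append(((w, k), counts.pop((w, k))))
    return emitted, counts, late

emitted, open_windows, late = run([(5, "a"), (30, "a"), (130, "b"), (7, "a")])
print(emitted)  # the 0-60s window for "a" closed once the watermark passed 60
print(late)     # (7, "a") arrived after the watermark had moved past it
```

In Beam terms these correspond roughly to transforms (What), windowing (Where), triggers (When), and accumulation modes (How); this sketch discards late data outright, whereas the actual engines let you accumulate or retract earlier results.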