It seems I forgot to add the link to the “Technical Vision” paper, so here it 
is: 
https://docs.google.com/document/d/1y4qlQinjjrusGWlgq-mYmbxRW2z7-_X5Xax-GG0YsC0/edit?usp=sharing

From: "Sela, Amit" <ans...@paypal.com>
Date: Saturday, May 21, 2016 at 11:52 PM
To: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>, "user @spark" 
<user@spark.apache.org>
Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
Subject: Re: What / Where / When / How questions in Spark 2.0 ?

This is a “Technical Vision” paper for the Spark runner, which provides general 
guidelines for the future development of Spark’s Beam support as part of the 
Apache Beam (incubating) project.
This is our JIRA - 
https://issues.apache.org/jira/browse/BEAM/component/12328915/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel

In general, I’m currently working on Datasets integration for batch (to replace 
RDDs) against Spark 1.6, and moving towards enhancing stream processing 
capabilities with Structured Streaming (2.0).

You’re also welcome to ask those questions on the Apache Beam (incubating) 
mailing list ;)
http://beam.incubator.apache.org/mailing_lists/

Thanks,
Amit

From: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>
Date: Tuesday, May 17, 2016 at 12:11 AM
To: "user @spark" <user@spark.apache.org>
Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
Subject: Re: What / Where / When / How questions in Spark 2.0 ?

Could you please consider giving a short answer regarding the Apache Beam 
Capability Matrix to-dos for the upcoming Spark 2.0 release [4]? (Some related 
references are below [5][6].)

Thanks

[4] http://beam.incubator.apache.org/capability-matrix/#cap-full-what
[5] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
[6] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102

On 16 May 2016, at 14:18, Ovidiu-Cristian MARCU 
<ovidiu-cristian.ma...@inria.fr> wrote:

Hi,

We can see in [2] many interesting (and expected!) improvements and promises, 
such as extended SQL support, a unified API (DataFrames, Datasets), an improved 
engine (Tungsten draws on ideas from modern compilers and MPP databases, similar 
to Flink [3]), Structured Streaming, etc. It seems we are witnessing a smart 
unification of Big Data analytics (Spark and Flink, the best of both worlds)!

How does Spark address the missing What/Where/When/How questions 
(capabilities) highlighted in Beam's unified model [1]?

Best,
Ovidiu

[1] 
https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
[2] 
https://databricks.com/blog/2016/05/11/spark-2-0-technical-preview-easier-faster-and-smarter.html
[3] http://stratosphere.eu/project/publications/
