Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Romi Kuntsman
A major release usually means giving up on some API backward compatibility? Can this be used as a chance to merge efforts with Apache Flink (https://flink.apache.org/) and create the one ultimate open source big data processing system? Spark currently feels like it was made for interactive use

Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Sean Owen
Major releases can change APIs, yes. Although Flink is pretty similar in broad design and goals, the APIs are quite different in particulars. Speaking for myself, I can't imagine merging them, as it would either mean significantly changing Spark APIs, or making Flink use Spark APIs. It would mean

Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Koert Kuipers
Romi, unless I am misunderstanding your suggestion, you might be interested in projects like the new Mahout, where they try to abstract out the engine with bindings so that they can support multiple engines within a single platform. I guess Cascading is heading in a similar direction (although no

Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Romi Kuntsman
Hi, thanks for the feedback. I'll try to explain better what I meant. First we had RDDs, then we had DataFrames, so could the next step be something like stored procedures over DataFrames? So I define the whole calculation flow, even if it includes any "actions" in between, and the whole thing is

Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Romi Kuntsman
Since it seems we do have so much to say about Spark 2.0, the answer to the question "ready to talk about Spark 2.0" is yes. But that doesn't mean the development of the 1.x branch is ready to stop or that there shouldn't be a 1.7 release. Regarding what should go into the next major version

Re: Ready to talk about Spark 2.0?

2015-11-08 Thread Mark Hamstra
Yes, that's clearer -- at least to me. But before going any further, let me note that we are already sliding past Sean's opening question of "Should we start talking about Spark 2.0?" to actually start talking about Spark 2.0. I'll try to keep the rest of this post at a higher- or meta-level in

Ready to talk about Spark 2.0?

2015-11-06 Thread Sean Owen
Since branch-1.6 is cut, I was going to make version 1.7.0 in JIRA. However, I've had a few side conversations recently about Spark 2.0, and I know that I and others have a number of ideas about it already. I'll go ahead and make 1.7.0, but thought I'd ask, how much other interest is there in starting