Romi,
Unless I'm misunderstanding your suggestion, you might be interested in
projects like the new Mahout, where they try to abstract out the engine
with bindings so that a single platform can support multiple engines. I
guess Cascading is heading in a similar direction (although there's no
Spark or Flink support there yet, just MR1 and Tez).
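
To make that concrete, here is a rough sketch of the binding idea (my own
illustration, not Mahout's actual API; all the names below are made up):

  // Algorithms are written against an abstract engine; each runtime
  // (Spark, Flink, ...) supplies a concrete binding behind it.
  trait Matrix
  trait Engine {
    def readMatrix(path: String): Matrix
    def multiply(a: Matrix, b: Matrix): Matrix
  }

  object SparkEngine extends Engine {
    def readMatrix(path: String): Matrix = ???  // would wrap an RDD-backed matrix
    def multiply(a: Matrix, b: Matrix): Matrix = ???
  }

  // Engine-agnostic user code: swap the binding, keep the algorithm.
  def gram(e: Engine, path: String): Matrix = {
    val m = e.readMatrix(path)
    e.multiply(m, m)
  }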

On Sun, Nov 8, 2015 at 6:33 AM, Sean Owen <so...@cloudera.com> wrote:

> Major releases can change APIs, yes. Although Flink is pretty similar
> in broad design and goals, the APIs are quite different in
> particulars. Speaking for myself, I can't imagine merging them, as it
> would either mean significantly changing Spark APIs, or making Flink
> use Spark APIs. It would mean effectively removing one project, which
> seems infeasible.
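>
> To make "different in particulars" concrete, compare word count in the
> two Scala APIs (a from-memory sketch, untested; "input.txt" and the
> surrounding contexts sc/env are placeholders):
>
>   // Spark
>   val counts = sc.textFile("input.txt")
>     .flatMap(_.split(" "))
>     .map((_, 1))
>     .reduceByKey(_ + _)
>
>   // Flink (needs: import org.apache.flink.api.scala._)
>   val counts = env.readTextFile("input.txt")
>     .flatMap(_.split(" "))
>     .map((_, 1))
>     .groupBy(0)
>     .sum(1)
>
> The shape is similar, but the operator vocabulary and execution model
> diverge quickly beyond toy examples, so "merging" would really mean
> rewriting one API in terms of the other.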
>
> I am not sure what difference you're pointing to, but I would not
> describe Spark as primarily for interactive use.
>
> Philosophically, I don't think One Big System to Rule Them All is a
> good goal. One project will never get it all right even within one
> niche. It's actually valuable to have many takes on important
> problems. Hence any problem worth solving gets solved 10 times. Just
> look at all those SQL engines and logging frameworks...
>
> On Sun, Nov 8, 2015 at 10:53 AM, Romi Kuntsman <r...@totango.com> wrote:
> > A major release usually means giving up on some API backward
> > compatibility?
> > Can this be used as a chance to merge efforts with Apache Flink
> > (https://flink.apache.org/) and create the one ultimate open source
> > big data processing system?
> > Spark currently feels like it was made for interactive use (like
> > Python and R), and when used for other workloads (batch/streaming), it
> > feels like scripted interactive use rather than a truly standalone,
> > complete app. Maybe some base concepts could be adapted?
> >
> > (I'm not currently a committer, but as a heavy Spark user I'd love to
> > participate in the discussion of what can/should be in Spark 2.0)
> >
> > Romi Kuntsman, Big Data Engineer
> > http://www.totango.com
> >
> > On Fri, Nov 6, 2015 at 2:53 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >>
> >> Hi Sean,
> >>
> >> Happy to see this discussion.
> >>
> >> I'm working on a PoC to run Camel on Spark Streaming. The purpose is
> >> to have an ingestion and integration platform running directly on
> >> Spark Streaming.
> >>
> >> Basically, we would be able to use a Camel Spark DSL like:
> >>
> >>   from("jms:queue:foo")
> >>     .choice()
> >>       .when(predicate).to("job:bar")
> >>       .when(predicate).to("hdfs:path")
> >>     .otherwise().to("file:path")
> >>     ...
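> >>
> >> Purely as an illustration of the mapping (hypothetical and untested;
> >> JmsReceiver, conf, and the predicates p1/p2 are made-up stand-ins),
> >> the choice/when/otherwise part could translate to plain DStream
> >> operations:
> >>
> >>   import org.apache.spark.streaming.{Seconds, StreamingContext}
> >>
> >>   val ssc  = new StreamingContext(conf, Seconds(5))
> >>   // from("jms:queue:foo"): would need a custom JMS Receiver
> >>   val msgs = ssc.receiverStream(new JmsReceiver("foo"))
> >>   val bar  = msgs.filter(p1)                    // .when(p1).to("job:bar")
> >>   val hdfs = msgs.filter(m => !p1(m) && p2(m))  // .when(p2).to("hdfs:path")
> >>   val rest = msgs.filter(m => !p1(m) && !p2(m)) // .otherwise()
> >>   hdfs.saveAsTextFiles("hdfs:///path/out")      // .to("hdfs:path")
> >>   // bar and rest would be routed to their sinks similarly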
> >>
> >> Before a formal proposal (I have to do more work there), I'm just
> >> wondering if such a framework could be a new Spark module (Spark
> >> Integration, for instance, like Spark ML, Spark Streaming, etc.).
> >>
> >> Maybe it could be a good candidate for an addition in a "major" release
> >> like Spark 2.0.
> >>
> >> Just my $0.01 ;)
> >>
> >> Regards
> >> JB
> >>
> >>
> >> On 11/06/2015 01:44 PM, Sean Owen wrote:
> >>>
> >>> Since branch-1.6 is cut, I was going to make version 1.7.0 in JIRA.
> >>> However I've had a few side conversations recently about Spark 2.0, and
> >>> I know I and others have a number of ideas about it already.
> >>>
> >>> I'll go ahead and make 1.7.0, but thought I'd ask: how much other
> >>> interest is there in starting to plan Spark 2.0? Is that even on the
> >>> table as the next release after 1.6?
> >>>
> >>> Sean
> >>
> >>
> >> --
> >> Jean-Baptiste Onofré
> >> jbono...@apache.org
> >> http://blog.nanthrax.net
> >> Talend - http://www.talend.com
