Hi Philip,

Cascading is relatively agnostic about the distributed topology underneath
it, especially as of the 2.0 release over a year ago. There's been some
discussion about writing a flow planner for Spark -- i.e., one that would
replace the Hadoop flow planner. Not sure if there's active work on that
yet.

There are a few commercial workflow abstraction layers (probably what was
meant by "application layer"?), in terms of the Cascading family (incl.
Cascalog, Scalding), and also Actian's integration of Hadoop/Knime/etc.,
and also the work by Continuum, ODG, and others in the Py data stack.

Spark would not be at the same level of abstraction as Cascading (business
logic, effectively); however, something like MLbase is ostensibly intended
for that http://www.mlbase.org/

With respect to Spark, two other things to watch... One would definitely be
the Py data stack and ability to integrate with PySpark, which is turning
out to be a very powerful abstraction -- quite close to a large segment of
industry needs. The other project to watch, on the Scala side, is
Summingbird and its evolution at Twitter:
https://blog.twitter.com/2013/streaming-mapreduce-with-summingbird

Paco
http://amazon.com/dp/1449358721/


On Mon, Oct 28, 2013 at 10:11 AM, Philip Ogren <philip.og...@oracle.com> wrote:

>
> My team is investigating a number of technologies in the Big Data space.
> A team member recently got turned on to
> Cascading <http://www.cascading.org/about-cascading/> as an application layer
> for orchestrating complex workflows/scenarios.  He
> asked me if Spark had an "application layer"?  My initial reaction is "no"
> that Spark would not have a separate orchestration/application layer.
> Instead, the core Spark API (along with Streaming) would compete directly
> with Cascading for this kind of functionality and that the two would not
> likely be all that complementary.  I realize that I am exposing my
> ignorance here and could be way off.  Is there anyone who knows a bit about
> both of these technologies who could speak to this in broad strokes?
>
> Thanks!
> Philip
>
>
