On Fri, Apr 25, 2014 at 6:30 AM, Mark Baker <dist...@acm.org> wrote:

> I've only had a quick look at Pig, but it seems that a declarative
> layer on top of Spark couldn't be anything other than a big win, as it
> allows developers to declare *what* they want, permitting the compiler
> to determine how best to poke at the RDD API to implement it.
>

Having Pig too would certainly be a win, but Spark SQL
<http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html>
is also a declarative layer on top of Spark.  Since the optimization is
lazy, you can chain multiple SQL statements in a row and still optimize
them holistically (similar to a Pig job).  Alpha version coming soon to a
Spark 1.0 release near you!

Spark SQL also lets you drop back into functional Scala when that is more
natural for a particular task.
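To illustrate, a minimal sketch of what that mixing might look like, written
against the 1.0-era alpha API linked above (the input file `people.json`, the
table names, and the specific queries are made up for illustration; method
names such as `registerAsTable` were later renamed in subsequent releases):

```scala
// Sketch only: requires a Spark 1.0-era deployment; not runnable standalone.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkSqlSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("sql-sketch"))
    val sqlContext = new SQLContext(sc)

    // Load a hypothetical JSON dataset and register it as a table.
    val people = sqlContext.jsonFile("people.json")
    people.registerAsTable("people")

    // Two chained SQL statements: because evaluation is lazy, the optimizer
    // can plan across both of them holistically, much like a Pig job.
    val adults = sqlContext.sql("SELECT name, age FROM people WHERE age >= 18")
    adults.registerAsTable("adults")
    val names = sqlContext.sql("SELECT name FROM adults ORDER BY age")

    // ...then drop back into functional Scala where that is more natural.
    val shouted = names.map(row => row.getString(0).toUpperCase)
    shouted.collect().foreach(println)

    sc.stop()
  }
}
```

The point of the sketch is the seamless handoff: the result of a SQL query is
itself an RDD, so ordinary `map`/`filter`/`collect` calls apply to it directly.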
