It depends; personally I have the opposite opinion. IMO expressing pipelines in a functional language feels natural, you just have to get used to the language (Scala).
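To illustrate what I mean by "natural": Spark's RDD transformations (map, flatMap, filter, etc.) deliberately mirror Scala's own collection API, so a pipeline reads the same either way. Here is a minimal word-count sketch written against plain Scala collections (no Spark dependency, names are mine); on an RDD the same shape becomes flatMap/map/reduceByKey almost verbatim:

```scala
// Word-count pipeline over plain Scala collections.
// With Spark, `lines` would be an RDD[String] and the groupBy/map pair
// would typically be map(w => (w, 1)).reduceByKey(_ + _).
object PipelineSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))               // tokenize each line
      .filter(_.nonEmpty)                     // drop empty tokens
      .groupBy(identity)                      // group equal words
      .map { case (w, ws) => (w, ws.size) }   // count each group
}
```

The point is that the "imperative" code is really a chain of pure transformations, which is also what makes the jobs easy to unit-test: you can run the same functions on small in-memory collections.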
Testing Spark jobs is easy, whereas testing a Pig script is much harder and less natural. If you want a higher-level language that deals with RDDs for you, you can use Spark SQL: http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html Of course you can express fewer things this way, but if you have some complex logic, I think it makes sense to write a classic Spark job, which will be more robust in the long term.

2014-04-25 15:30 GMT+02:00 Mark Baker <dist...@acm.org>:

> I've only had a quick look at Pig, but it seems that a declarative
> layer on top of Spark couldn't be anything other than a big win, as it
> allows developers to declare *what* they want, permitting the compiler
> to determine how best to poke at the RDD API to implement it.
>
> In my brief time with Spark, I've often thought that it feels very
> unnatural to use imperative code to declare a pipeline.