The real question is why you want to run a Pig script using Spark.

Are you planning to use Spark as the underlying processing engine for Pig? That's not simple.

Are you planning to feed Pig output to Spark for further processing? Then you can write it to HDFS and trigger your Spark job to read it from there.

rdd.pipe() is similar to Hadoop Streaming: it runs an external script once per partition of the RDD, writes that partition's elements to the script's stdin one per line, and returns the lines the script prints to stdout as another RDD.

Regards
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Wed, Mar 5, 2014 at 10:29 AM, suman bharadwaj <suman....@gmail.com> wrote:

> Hi,
>
> How can I call a Pig script using Spark? Can I use rdd.pipe() here?
>
> And can anyone share a sample implementation of rdd.pipe()? If you can
> explain how rdd.pipe() works, it would really help.
>
> Regards,
> SB
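
Since a sample of rdd.pipe() was asked for, here is a minimal sketch of the per-partition behaviour described in the reply. It does not use Spark at all: it is a plain Python emulation, under the assumption that each partition is just a list of strings. The names pipe_partition, pipe, and UPPERCASE_CMD are hypothetical helpers made up for this illustration, and the "external script" is a one-line Python program that upper-cases its input.

```python
import subprocess
import sys


def pipe_partition(partition, command):
    """Run `command` once, write the partition's elements to its stdin
    (one per line), and return the lines it prints to stdout."""
    stdin_text = "".join(f"{item}\n" for item in partition)
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the external command exits with an error
    )
    return result.stdout.splitlines()


def pipe(partitions, command):
    """Emulate piping: one external process per partition."""
    return [pipe_partition(part, command) for part in partitions]


# Hypothetical external script: upper-case every line read from stdin.
UPPERCASE_CMD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: print(line.strip().upper())",
]

# Two "partitions" standing in for an RDD with two partitions.
partitions = [["pig", "spark"], ["hdfs"]]
piped = pipe(partitions, UPPERCASE_CMD)
print(piped)  # [['PIG', 'SPARK'], ['HDFS']]
```

In Spark itself the equivalent would be along the lines of rdd.pipe("tr a-z A-Z") on an RDD of strings, with Spark starting the command on the worker that holds each partition. The command and data here are only illustrative; whatever script you pipe through must exist on every worker node.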