Hi Aniket,

Thanks for all the work on Pig. Finally got it working. A couple of high-level issues right now:

- Getting it working on Spark 0.9.0
- Getting UDFs working
- Getting generate functionality working
- Exhaustive test suite for Pig on Spark

Are you maintaining a Jira somewhere? I am currently trying to deploy it on 0.9.0.

Regards
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Fri, Mar 14, 2014 at 1:37 PM, Aniket Mokashi <aniket...@gmail.com> wrote:

> We will post fixes from our side at - https://github.com/twitter/pig.
>
> Top on our list are-
> 1. Make it work with pig-trunk (execution engine interface) (with 0.8 or
> 0.9 spark).
> 2. Support for algebraic udfs (this mitigates the group by oom problems).
>
> Would definitely love more contribution on this.
>
> Thanks,
> Aniket
>
> On Fri, Mar 14, 2014 at 12:29 PM, Mayur Rustagi
> <mayur.rust...@gmail.com> wrote:
>
>> Damn, I am off to NY for Structure Conf. Would it be possible to meet
>> anytime after 28th March?
>> I am really interested in making it stable & production quality.
>>
>> Regards
>> Mayur Rustagi
>> Ph: +1 (760) 203 3257
>> http://www.sigmoidanalytics.com
>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>
>> On Fri, Mar 14, 2014 at 11:53 AM, Julien Le Dem <jul...@twitter.com> wrote:
>>
>>> Hi Mayur,
>>> Are you going to the Pig meetup this afternoon?
>>> http://www.meetup.com/PigUser/events/160604192/
>>> Aniket and I will be there.
>>> We would be happy to chat about Pig-on-Spark
>>>
>>> On Tue, Mar 11, 2014 at 8:56 AM, Mayur Rustagi
>>> <mayur.rust...@gmail.com> wrote:
>>>
>>>> Hi Lin,
>>>> We are working on getting Pig on Spark functional with 0.8.0. Have you
>>>> got it working on any Spark version?
>>>> Also, what functionality works on it?
>>>> Regards
>>>> Mayur
>>>>
>>>> Mayur Rustagi
>>>> Ph: +1 (760) 203 3257
>>>> http://www.sigmoidanalytics.com
>>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>>
>>>> On Mon, Mar 10, 2014 at 11:00 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>>>
>>>>> Hi Sameer,
>>>>>
>>>>> Lin (cc'ed) could also give you some updates about Pig on Spark
>>>>> development on her side.
>>>>>
>>>>> Best,
>>>>> Xiangrui
>>>>>
>>>>> On Mon, Mar 10, 2014 at 12:52 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>> > Hi Mayur,
>>>>> > We are planning to upgrade our distribution MR1 -> MR2 (YARN) and the goal is
>>>>> > to get Spork set up next month. I will keep you posted. Can you please keep
>>>>> > me informed about your progress as well.
>>>>> >
>>>>> > ________________________________
>>>>> > From: mayur.rust...@gmail.com
>>>>> > Date: Mon, 10 Mar 2014 11:47:56 -0700
>>>>> > Subject: Re: Pig on Spark
>>>>> > To: user@spark.apache.org
>>>>> >
>>>>> > Hi Sameer,
>>>>> > Did you make any progress on this? My team is also trying it out and would love
>>>>> > to know some details on your progress.
>>>>> >
>>>>> > Mayur Rustagi
>>>>> > Ph: +1 (760) 203 3257
>>>>> > http://www.sigmoidanalytics.com
>>>>> > @mayur_rustagi
>>>>> >
>>>>> > On Thu, Mar 6, 2014 at 2:20 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>> >
>>>>> > Hi Aniket,
>>>>> > Many thanks! I will check this out.
>>>>> >
>>>>> > ________________________________
>>>>> > Date: Thu, 6 Mar 2014 13:46:50 -0800
>>>>> > Subject: Re: Pig on Spark
>>>>> > From: aniket...@gmail.com
>>>>> > To: user@spark.apache.org; tgraves...@yahoo.com
>>>>> >
>>>>> > There is some work to make this work on yarn at
>>>>> > https://github.com/aniket486/pig.
>>>>> > (So, compile pig with ant -Dhadoopversion=23)
>>>>> >
>>>>> > You can look at https://github.com/aniket486/pig/blob/spork/pig-spark to
>>>>> > find out what sort of env variables you need (sorry, I haven't been able to
>>>>> > clean this up; it is in progress). There are a few known issues with this. I will
>>>>> > work on fixing them soon.
>>>>> >
>>>>> > Known issues-
>>>>> > 1. Limit does not work (spork-fix)
>>>>> > 2. Foreach requires turning off schema-tuple-backend (should be a pig-jira)
>>>>> > 3. Algebraic udfs don't work (spork-fix in-progress)
>>>>> > 4. Group by rework (to avoid OOMs)
>>>>> > 5. UDF Classloader issue (requires SPARK-1053, then you can put
>>>>> > pig-withouthadoop.jar as SPARK_JARS in SparkContext along with udf jars)
>>>>> >
>>>>> > ~Aniket
>>>>> >
>>>>> > On Thu, Mar 6, 2014 at 1:36 PM, Tom Graves <tgraves...@yahoo.com> wrote:
>>>>> >
>>>>> > I had asked a similar question on the dev mailing list a while back (Jan
>>>>> > 22nd).
>>>>> >
>>>>> > See the archives:
>>>>> > http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser ->
>>>>> > look for spork.
>>>>> >
>>>>> > Basically Matei said:
>>>>> >
>>>>> > Yup, that was it, though I believe people at Twitter picked it up again
>>>>> > recently. I'd suggest asking Dmitriy if you know him. I've seen interest in
>>>>> > this from several other groups, and if there's enough of it, maybe we can
>>>>> > start another open source repo to track it. The work in that repo you
>>>>> > pointed to was done over one week, and already had most of Pig's operators
>>>>> > working. (I helped out with this prototype over Twitter's hack week.) That
>>>>> > work also calls the Scala API directly, because it was done before we had a
>>>>> > Java API; it should be easier with the Java one.
>>>>> >
>>>>> > Tom
>>>>> >
>>>>> > On Thursday, March 6, 2014 3:11 PM, Sameer Tilak <ssti...@live.com> wrote:
>>>>> > Hi everyone,
>>>>> >
>>>>> > We are using Pig to build our data pipeline. I came across Spork -- Pig
>>>>> > on Spark at: https://github.com/dvryaboy/pig and am not sure if it is still
>>>>> > active.
>>>>> >
>>>>> > Can someone please let me know the status of Spork or any other effort that
>>>>> > will let us run Pig on Spark? We can significantly benefit by using Spark,
>>>>> > but we would like to keep using the existing Pig scripts.
>>>>> >
>>>>> > --
>>>>> > "...:::Aniket:::... Quetzalco@tl"
>
> --
> "...:::Aniket:::... Quetzalco@tl"
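
Note on the algebraic UDF item raised in the thread ("this mitigates the group by oom problems"): the following is a minimal Python sketch of the general idea, not the Spork or Pig implementation. It assumes an average-per-key aggregate and loosely mirrors the initial / intermediate / final stages that algebraic UDFs are built around. All function names here (`initial`, `intermediate`, `final`, `avg_by_key_naive`, `avg_by_key_algebraic`) are hypothetical and chosen for illustration only. The naive version holds every value of a group in memory at once; the algebraic version keeps only a small (sum, count) pair per key, which is why large groups are less likely to exhaust memory.

```python
from collections import defaultdict


def initial(value):
    """Turn one input value into a partial aggregate (sum, count)."""
    return (value, 1)


def intermediate(a, b):
    """Merge two partial aggregates into one."""
    return (a[0] + b[0], a[1] + b[1])


def final(partial):
    """Produce the final average from a partial aggregate."""
    return partial[0] / partial[1]


def avg_by_key_naive(records):
    """Group-by that materializes all values per key before aggregating.

    Memory grows with the size of the largest group.
    """
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def avg_by_key_algebraic(partitions):
    """Group-by that combines partial aggregates per partition, then merges.

    Memory grows with the number of distinct keys, not the group sizes.
    """
    merged = {}
    for partition in partitions:
        # Combine within the partition first (the map-side combine step).
        local = {}
        for key, value in partition:
            part = initial(value)
            local[key] = intermediate(local[key], part) if key in local else part
        # Merge this partition's partials into the global result.
        for key, part in local.items():
            merged[key] = intermediate(merged[key], part) if key in merged else part
    return {key: final(part) for key, part in merged.items()}
```

Both functions return the same averages for the same data; the difference is only in how much per-group state is held while computing them.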