Thanks Mayur! I'm looking for something that would allow me to easily describe and manage a workflow on Spark. A workflow in my context is a composition of Spark applications that may depend on one another based on HDFS inputs/outputs. Is Spork a good fit? The orchestration I want is at the application level.
On Sat, Feb 28, 2015 at 9:38 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:

> We do maintain it, but in the Apache repo itself. However, Pig cannot do
> orchestration for you. I am not sure what you are looking for from Pig in
> this context.
>
> Regards,
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoid.com <http://www.sigmoidanalytics.com/>
> @mayur_rustagi <http://www.twitter.com/mayur_rustagi>
>
> On Sat, Feb 28, 2015 at 6:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Here was the latest modification in the spork repo:
>> Mon Dec 1 10:08:19 2014
>>
>> Not sure if it is being actively maintained.
>>
>> On Sat, Feb 28, 2015 at 6:26 PM, Qiang Cao <caoqiang...@gmail.com> wrote:
>>
>>> Thanks for the pointer, Ashish! I was also looking at Spork
>>> (https://github.com/sigmoidanalytics/spork, Pig on Spark), but wasn't
>>> sure if that's the right direction.
>>>
>>> On Sat, Feb 28, 2015 at 6:36 PM, Ashish Nigam <ashnigamt...@gmail.com> wrote:
>>>
>>>> You have to call spark-submit from Oozie.
>>>> I used this link to get the idea for my implementation:
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/oozie-user/201404.mbox/%3CCAHCsPn-0Grq1rSXrAZu35yy_i4T=fvovdox2ugpcuhkwmjp...@mail.gmail.com%3E
>>>>
>>>> On Feb 28, 2015, at 3:25 PM, Qiang Cao <caoqiang...@gmail.com> wrote:
>>>>
>>>> Thanks, Ashish! Is Oozie integrated with Spark? I knew it could
>>>> accommodate some Hadoop jobs.
>>>>
>>>> On Sat, Feb 28, 2015 at 6:07 PM, Ashish Nigam <ashnigamt...@gmail.com> wrote:
>>>>
>>>>> Qiang,
>>>>> Did you look at Oozie?
>>>>> We use Oozie to run Spark jobs in production.
>>>>>
>>>>> On Feb 28, 2015, at 2:45 PM, Qiang Cao <caoqiang...@gmail.com> wrote:
>>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> We need to deal with workflows on Spark. In our scenario, each
>>>>> workflow consists of multiple processing steps, and there can be
>>>>> dependencies among the steps. I'm wondering if there are tools
>>>>> available that can help us schedule and manage workflows on Spark. I'm
>>>>> looking for something like Pig on Hadoop, but it should fully function on
>>>>> Spark.
>>>>>
>>>>> Any suggestions?
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> Qiang
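[Editor's note: the app-level orchestration described in this thread — Spark applications ordered by their HDFS inputs/outputs, each launched via spark-submit — can be sketched as a small driver script. This is a minimal illustration, not any tool mentioned in the thread; the step names, scripts, and HDFS paths are hypothetical, and the `submit` hook is injectable so the ordering logic can be exercised without a cluster.]

```python
import subprocess

# Each step names the spark-submit command it runs, plus the HDFS paths it
# reads and writes. All step names, scripts, and paths are hypothetical.
STEPS = {
    "ingest":    {"cmd": ["spark-submit", "ingest.py"],
                  "inputs": [], "outputs": ["/data/raw"]},
    "transform": {"cmd": ["spark-submit", "transform.py"],
                  "inputs": ["/data/raw"], "outputs": ["/data/clean"]},
    "report":    {"cmd": ["spark-submit", "report.py"],
                  "inputs": ["/data/clean"], "outputs": ["/data/report"]},
}

def topo_order(steps):
    """Order steps so each one runs after the steps producing its inputs."""
    # Map every output path to the step that writes it.
    producer = {path: name for name, s in steps.items() for path in s["outputs"]}
    # A step depends on whichever steps produce the paths it reads.
    deps = {name: {producer[p] for p in s["inputs"] if p in producer}
            for name, s in steps.items()}
    order, done = [], set()
    while len(order) < len(steps):
        ready = [n for n in steps if n not in done and deps[n] <= done]
        if not ready:
            raise ValueError("cyclic dependency among steps")
        for n in sorted(ready):  # sorted for a deterministic order
            order.append(n)
            done.add(n)
    return order

def run_workflow(steps, submit=subprocess.check_call):
    """Launch each step's spark-submit command in dependency order."""
    for name in topo_order(steps):
        submit(steps[name]["cmd"])

if __name__ == "__main__":
    print(topo_order(STEPS))  # ['ingest', 'transform', 'report']
```

In production one would, as suggested above, let Oozie (or a similar scheduler) own the retries, monitoring, and triggering, with each workflow node ultimately invoking spark-submit in the same way.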