Tools to manage workflows on Spark
Sorry, not really. Spork is a way to migrate your existing Pig scripts to
Spark, or to write new Pig jobs that can execute on Spark.
For orchestration you are better off using Oozie, especially if you are
using other execution engines/systems besides Spark.
--- Original Message ---
From: Mayur Rustagi mayur.rust...@gmail.com
Sent: February 28, 2015 7:07 PM
To: Qiang Cao caoqiang...@gmail.com
Cc: Ted Yu yuzhih...@gmail.com, Ashish Nigam ashnigamt...@gmail.com,
user user@spark.apache.org
Subject: Re: Tools to manage workflows on Spark
Thanks, Ashish! Is Oozie integrated with Spark? I know it can accommodate
some Hadoop jobs.
On Sat, Feb 28, 2015 at 6:07 PM, Ashish Nigam ashnigamt...@gmail.com
wrote:
Qiang,
Did you look at Oozie?
We use oozie to run spark jobs in production.
You have to call spark-submit from oozie.
I used this link to get the idea for my implementation -
http://mail-archives.apache.org/mod_mbox/oozie-user/201404.mbox/%3CCAHCsPn-0Grq1rSXrAZu35yy_i4T=fvovdox2ugpcuhkwmjp...@mail.gmail.com%3E
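For anyone skimming the archive: the approach in that link wraps spark-submit in an Oozie shell action. A minimal sketch of what such a workflow definition might look like (action, script, and property names are illustrative, not taken from the linked thread):

```xml
<workflow-app name="spark-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="spark-step"/>
  <action name="spark-step">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- submit.sh invokes spark-submit with the app jar and arguments -->
      <exec>submit.sh</exec>
      <file>submit.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Spark step failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Later Oozie releases also added a native Spark action type, which avoids the shell wrapper.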
On Feb 28, 2015, at 3:25 PM, Qiang Cao
Hi Everyone,
We need to deal with workflows on Spark. In our scenario, each workflow
consists of multiple processing steps. Among different steps, there could
be dependencies. I'm wondering if there are tools available that can help
us schedule and manage workflows on Spark. I'm looking for ...
Sorry, not really. Spork is a way to migrate your existing Pig scripts to
Spark, or to write new Pig jobs that can execute on Spark.
For orchestration you are better off using Oozie, especially if you are
using other execution engines/systems besides Spark.
Regards,
Mayur Rustagi
Ph: +1 (760) 203 3257
Thanks for the pointer, Ashish! I was also looking at Spork
(https://github.com/sigmoidanalytics/spork, Pig-on-Spark), but wasn't sure
if that's the right direction.
On Sat, Feb 28, 2015 at 6:36 PM, Ashish Nigam ashnigamt...@gmail.com
wrote:
You have to call spark-submit from oozie.
Here was the latest modification in the spork repo:
Mon Dec 1 10:08:19 2014
Not sure if it is being actively maintained.
On Sat, Feb 28, 2015 at 6:26 PM, Qiang Cao caoqiang...@gmail.com wrote:
Thanks for the pointer, Ashish! I was also looking at Spork
https://github.com/sigmoidanalytics/spork
Thanks Mayur! I'm looking for something that would allow me to easily
describe and manage a workflow on Spark. A workflow in my context is a
composition of Spark applications that may depend on one another based on
HDFS inputs/outputs. Is Spork a good fit? The orchestration I want is at
the app level.
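A workflow like the one Qiang describes, apps chained by their HDFS inputs and outputs, is a dependency DAG, which is exactly what Oozie models. As a rough illustration of the idea (the app names and paths below are hypothetical, not from this thread), deriving an execution order from declared inputs/outputs can be sketched like this:

```python
# Sketch: app-level ordering driven by declared input/output paths.
# This is what an orchestrator like Oozie does for you; app names and
# paths here are purely illustrative.
from graphlib import TopologicalSorter  # Python 3.9+

apps = {
    "clean":   {"inputs": ["/raw"],      "outputs": ["/clean"]},
    "feature": {"inputs": ["/clean"],    "outputs": ["/features"]},
    "train":   {"inputs": ["/features"], "outputs": ["/model"]},
}

# An app depends on every app that produces one of its inputs.
producers = {out: name for name, app in apps.items() for out in app["outputs"]}
deps = {name: {producers[i] for i in app["inputs"] if i in producers}
        for name, app in apps.items()}

# Each app appears only after all of its producers.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Each entry in `order` would then be launched with spark-submit once its upstream outputs exist.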
We do maintain it, but in the Apache repo itself. However, Pig cannot do
orchestration for you. I am not sure what you are looking for from Pig in
this context.
Regards,
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoid.com (http://www.sigmoidanalytics.com/)
@mayur_rustagi