We're also using Azkaban for scheduling, and we simply use spark-submit via shell scripts. It works fine.
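For concreteness, here is a minimal sketch of the kind of job definition we use with Azkaban's command job type. The file names, script, class, jar, and master URL below are illustrative placeholders, not our actual setup:

    # my-spark-job.job - Azkaban command job type wrapping a shell script
    type=command
    command=sh ./run-my-spark-job.sh

    # run-my-spark-job.sh - the script just calls spark-submit
    spark-submit --class com.example.MyJob --master spark://master:7077 ./my-job.jar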
The auto-retry feature with a large number of retries (100 or 1000, perhaps) should take care of long-running jobs that need restarting on failure.
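If memory serves, retry behavior is configured per job with the retries and retry.backoff properties (worth double-checking against the Azkaban docs for your version); the values here are illustrative:

    # appended to the .job file above
    retries=100           # number of automatic retry attempts on failure
    retry.backoff=60000   # wait between attempts, in milliseconds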
We haven't used it for streaming yet, though we do have long-running jobs, and Azkaban won't kill them unless an SLA is in place.

— Sent from Mailbox

On Wed, Oct 7, 2015 at 7:18 PM, Vikram Kone <vikramk...@gmail.com> wrote:

> Hien,
> I saw this pull request and from what I understand this is geared towards
> running Spark jobs over Hadoop. We are using Spark over Cassandra and not
> sure if this new job type supports that. I haven't seen any documentation
> regarding how to use this Spark job plugin, so that I can test it out on
> our cluster.
>
> We are currently submitting our Spark jobs with the command job type,
> using the following command: "dse spark-submit --class com.org.classname
> ./test.jar" etc. What would be the advantage of using the native Spark
> job type over the command job type?
>
> I didn't understand from your reply whether Azkaban already supports
> long-running jobs like Spark streaming. Does it? Streaming jobs generally
> need to run indefinitely and need to be restarted if for some reason they
> fail (lack of resources, maybe). I can probably use the auto-retry
> feature for this, but I'm not sure.
>
> I'm looking forward to the multiple-executor support, which will greatly
> help with the scalability issue.
>
> On Wed, Oct 7, 2015 at 9:56 AM, Hien Luu <h...@linkedin.com> wrote:
>
>> The Spark job type was added recently - see this pull request:
>> https://github.com/azkaban/azkaban-plugins/pull/195. You can leverage
>> the SLA feature to kill a job if it runs longer than expected.
>>
>> BTW, we just solved the scalability issue by supporting multiple
>> executors. Within a week or two, the code for that should be merged
>> into the main trunk.
>>
>> Hien
>>
>> On Tue, Oct 6, 2015 at 9:40 PM, Vikram Kone <vikramk...@gmail.com> wrote:
>>
>>> Does Azkaban support scheduling long-running jobs like Spark streaming
>>> jobs? Will Azkaban kill a job if it's running for a long time?
>>>
>>> On Friday, August 7, 2015, Vikram Kone <vikramk...@gmail.com> wrote:
>>>
>>>> Hien,
>>>> Is Azkaban being phased out at LinkedIn as rumored? If so, what's
>>>> LinkedIn going to use for workflow scheduling? Is there something
>>>> else that's going to replace Azkaban?
>>>>
>>>> On Fri, Aug 7, 2015 at 11:25 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> In my opinion, choosing some particular project among its peers
>>>>> should leave enough room for future growth (which may come faster
>>>>> than you initially think).
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Fri, Aug 7, 2015 at 11:23 AM, Hien Luu <h...@linkedin.com> wrote:
>>>>>
>>>>>> Scalability is a known issue due to the current architecture.
>>>>>> However, this only becomes relevant if you run more than 20K jobs
>>>>>> per day.
>>>>>>
>>>>>> On Fri, Aug 7, 2015 at 10:30 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>>> From what I heard (from an ex-coworker who is an Oozie committer),
>>>>>>> Azkaban is being phased out at LinkedIn because of scalability
>>>>>>> issues (though UI-wise, Azkaban seems better).
>>>>>>>
>>>>>>> Vikram:
>>>>>>> I suggest you do more research in related projects (maybe using
>>>>>>> their mailing lists).
>>>>>>>
>>>>>>> Disclaimer: I don't work for LinkedIn.
>>>>>>>
>>>>>>> On Fri, Aug 7, 2015 at 10:12 AM, Nick Pentreath <
>>>>>>> nick.pentre...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Vikram,
>>>>>>>>
>>>>>>>> We use Azkaban (2.5.0) in our production workflow scheduling. We
>>>>>>>> just use local mode deployment and it is fairly easy to set up.
>>>>>>>> It is pretty easy to use and has a nice scheduling and logging
>>>>>>>> interface, as well as SLAs (like kill the job and notify if it
>>>>>>>> doesn't complete in 3 hours or whatever).
>>>>>>>>
>>>>>>>> However, Spark support is not present directly - we run everything
>>>>>>>> with shell scripts and spark-submit. There is a plugin interface
>>>>>>>> where one could create a Spark plugin, but I found it very
>>>>>>>> cumbersome when I investigated, and didn't have the time to work
>>>>>>>> through it to develop that.
>>>>>>>>
>>>>>>>> It has some quirks, and while there is actually a REST API for
>>>>>>>> adding jobs and dynamically scheduling jobs, it is not documented
>>>>>>>> anywhere, so you kinda have to figure it out for yourself. But in
>>>>>>>> terms of ease of use I found it way better than Oozie. I haven't
>>>>>>>> tried Chronos - it seemed quite involved to set up. Haven't tried
>>>>>>>> Luigi either.
>>>>>>>>
>>>>>>>> Spark job server is good, but as you say it lacks some stuff like
>>>>>>>> scheduling and DAG-type workflows (independent of Spark-defined
>>>>>>>> job flows).
>>>>>>>>
>>>>>>>> On Fri, Aug 7, 2015 at 7:00 PM, Jörn Franke <jornfra...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also check Falcon in combination with Oozie.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 7, 2015 at 5:51 PM, Hien Luu
>>>>>>>>> <h...@linkedin.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Looks like Oozie can satisfy most of your requirements.
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 7, 2015 at 8:43 AM, Vikram Kone
>>>>>>>>>> <vikramk...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> I'm looking for open source workflow tools/engines that allow
>>>>>>>>>>> us to schedule Spark jobs on a DataStax Cassandra cluster.
>>>>>>>>>>> Since there are tons of alternatives out there like Oozie,
>>>>>>>>>>> Azkaban, Luigi, Chronos, etc., I wanted to check with people
>>>>>>>>>>> here to see what they are using today.
>>>>>>>>>>>
>>>>>>>>>>> Some of the requirements of the workflow engine that I'm
>>>>>>>>>>> looking for are:
>>>>>>>>>>>
>>>>>>>>>>> 1. First-class support for submitting Spark jobs on Cassandra -
>>>>>>>>>>> not some wrapper Java code to submit tasks.
>>>>>>>>>>> 2. Active open source community support, and well tested at
>>>>>>>>>>> production scale.
>>>>>>>>>>> 3. Should be dead easy to write job dependencies using XML or a
>>>>>>>>>>> web interface, e.g. job A depends on job B and job C, so run
>>>>>>>>>>> job A after B and C are finished. Shouldn't need to write
>>>>>>>>>>> full-blown Java applications to specify job parameters and
>>>>>>>>>>> dependencies. Should be very simple to use.
>>>>>>>>>>> 4. Time-based recurrent scheduling: run the Spark jobs at a
>>>>>>>>>>> given time every hour, day, week, or month.
>>>>>>>>>>> 5. Job monitoring, alerting on failures, and email
>>>>>>>>>>> notifications on a daily basis.
>>>>>>>>>>>
>>>>>>>>>>> I have looked at Ooyala's spark job server, which seems to be
>>>>>>>>>>> geared towards making Spark jobs run faster by sharing contexts
>>>>>>>>>>> between the jobs, but isn't a full-blown workflow engine per
>>>>>>>>>>> se. A combination of spark job server and a workflow engine
>>>>>>>>>>> would be ideal.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the inputs