Hi Feng,
Does airflow allow remote submissions of spark jobs via spark-submit?
On Wed, Nov 18, 2015 at 6:01 PM, Fengdong Yu
wrote:
> Hi,
>
> we use 'Airflow' as our job workflow scheduler.
>
> On Nov 19, 2015, at 9:47 AM, Vikram Kone
Hi,
we use 'Airflow' as our job workflow scheduler.
> On Nov 19, 2015, at 9:47 AM, Vikram Kone wrote:
>
> Hi Nick,
> Quick question about spark-submit command executed from azkaban with command
> job type.
> I see that when I press kill in azkaban portal on a
Yes, you can submit jobs remotely.
> On Nov 19, 2015, at 10:10 AM, Vikram Kone wrote:
>
> Hi Feng,
> Does airflow allow remote submissions of spark jobs via spark-submit?
>
> On Wed, Nov 18, 2015 at 6:01 PM, Fengdong Yu
Hi Nick,
Quick question about spark-submit command executed from azkaban with
command job type.
I see that when I press kill in azkaban portal on a spark-submit job, it
doesn't actually kill the application on spark master and it continues to
run even though azkaban thinks that it's killed.
How do
We're also using Azkaban for scheduling, and we simply use spark-submit via
shell scripts. It works fine.
The auto retry feature with a large number of retries (like 100 or 1000
perhaps) should take care of long-running jobs with restarts on failure. We
haven't used it for streaming yet
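A minimal sketch of the retry-on-failure idea described above, written as a standalone Python wrapper rather than Azkaban's own retry feature. The function name `run_with_retries` and its parameters are hypothetical, chosen only for illustration; it simply reruns a command until it exits with code 0 or the retry budget is used up.

```python
import subprocess
import time


def run_with_retries(cmd, max_retries=3, backoff_seconds=0.0):
    """Run cmd, rerunning it whenever it exits with a non-zero code.

    Hypothetical helper, not part of Azkaban or Spark.
    Returns (exit_code, attempts), where attempts counts the first run
    plus any retries that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        # subprocess.call blocks until the command finishes and returns its exit code.
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or attempts > max_retries:
            return exit_code, attempts
        # Wait before the next attempt so a crashing job does not spin.
        time.sleep(backoff_seconds)
```

For a long-running job this might be called as `run_with_retries(["spark-submit", "--master", "spark://master-host:7077", "/path/to/job.jar"], max_retries=100, backoff_seconds=30)`, where the master URL and jar path are placeholders. Note that this only restarts the local `spark-submit` process; whether the application itself is resubmitted cleanly depends on the deploy mode.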
Hien,
I saw this pull request and from what I understand this is geared towards
running spark jobs over hadoop. We are using spark over cassandra and not
sure if this new jobtype supports that. I haven't seen any documentation in
regards to how to use this spark job plugin, so that I can test it
The spark job type was added recently - see this pull request
https://github.com/azkaban/azkaban-plugins/pull/195. You can leverage the
SLA feature to kill a job if it ran longer than expected.
BTW, we just solved the scalability issue by supporting multiple
executors. Within a week or two, the
Does Azkaban support scheduling long-running jobs like Spark Streaming jobs?
Will Azkaban kill a job if it's running for a long time?
On Friday, August 7, 2015, Vikram Kone wrote:
> Hien,
> Is Azkaban being phased out at linkedin as rumored? If so, what's linkedin
> going
We use Talend, but not for Spark workflows.
Although it does have Spark components.
https://www.talend.com/download/talend-open-studio
It is free (commercial support available), easy to design and deploy
workflows.
Talend for BigData 6.0 was released a month ago.
Is anybody using Talend for
We are in the middle of figuring that out. At the high level, we want to
combine the best parts of existing workflow solutions.
On Fri, Aug 7, 2015 at 3:55 PM, Vikram Kone vikramk...@gmail.com wrote:
Hien,
Is Azkaban being phased out at linkedin as rumored? If so, what's linkedin
going to
I also tend to agree that Azkaban is somewhat easier to get set up. Though I
haven't used the new UI for Oozie that is part of CDH, so perhaps that is
another good option.
It's a pity Azkaban is a little rough in terms of documenting its API, and the
scalability is an issue. However it
I used to maintain Luigi at Spotify, and got some insight into workflow
manager characteristics and production behaviour in the process.
I am evaluating options for my current employer, and the short list is
basically: Luigi, Azkaban, Pinball, Airflow, and rolling our own. The
latter is not
Hien,
Is Azkaban being phased out at linkedin as rumored? If so, what's linkedin
going to use for workflow scheduling? Is there something else that's going
to replace Azkaban?
On Fri, Aug 7, 2015 at 11:25 AM, Ted Yu yuzhih...@gmail.com wrote:
In my opinion, choosing some particular project
Hi,
I'm looking for open source workflow tools/engines that allow us to
schedule spark jobs on a datastax cassandra cluster. Since there are tonnes
of alternatives out there like Oozie, Azkaban, Luigi, Chronos etc, I
wanted to check with people here to see what they are using today.
Some of the
Looks like Oozie can satisfy most of your requirements.
On Fri, Aug 7, 2015 at 8:43 AM, Vikram Kone vikramk...@gmail.com wrote:
Hi,
I'm looking for open source workflow tools/engines that allow us to
schedule spark jobs on a datastax cassandra cluster. Since there are tonnes
of
Thanks for the suggestion Hien. I'm curious why not azkaban from linkedin.
From what I read online, Oozie was very cumbersome to set up and use compared
to azkaban. Since you are from linkedin wanted to get some perspective on
what it lacks compared to Oozie. Ease of use is very important more than
Check also falcon in combination with oozie
On Fri, Aug 7, 2015 at 17:51, Hien Luu h...@linkedin.com.invalid wrote:
Looks like Oozie can satisfy most of your requirements.
On Fri, Aug 7, 2015 at 8:43 AM, Vikram Kone vikramk...@gmail.com wrote:
Hi,
I'm looking for open source workflow
From what I heard (an ex-coworker who is an Oozie committer), Azkaban is being
phased out at LinkedIn because of scalability issues (though UI-wise,
Azkaban seems better).
Vikram:
I suggest you do more research in related projects (maybe using their
mailing lists).
Disclaimer: I don't work for
Oh ok. That's a good enough reason against azkaban then. So looks like
Oozie is the best choice here.
On Friday, August 7, 2015, Ted Yu yuzhih...@gmail.com wrote:
From what I heard (an ex-coworker who is an Oozie committer), Azkaban is
being phased out at LinkedIn because of scalability issues
19 matches