Hi,

Thanks, Sandy.


Another way to look at this is: would we want our long-running
application to die?

So let's say we create a window of around 10 batches and we are using
incremental operations inside our application, so a restart here is
relatively costly. Should the criterion for failing the application be a
maximum number of executor failures, or should we have a parameter around a
minimum number of executors being available for some time x?

So, if the application is not able to hold a minimum of n executors
within a period of time x, then we should fail the application.

Adding a time factor here will give Spark some window to get more
executors allocated if some of them fail.
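
To make this concrete, below is a rough sketch of what such a check could
look like. It is purely illustrative: the class and the two property names
are hypothetical, not existing Spark configuration.

// Rough sketch only -- not existing Spark code; property names are made up.
class MinExecutorPolicy(conf: Map[String, String]) {

  // Hypothetical: minimum executors the app needs to make progress.
  private val minExecutors =
    conf.getOrElse("spark.yarn.executor.minRequired", "1").toInt

  // Hypothetical: how long (ms) we tolerate being below the minimum.
  private val graceMillis =
    conf.getOrElse("spark.yarn.executor.minRequired.timeout", "600000").toLong

  // When we first dropped below the minimum, if we are currently below it.
  private var belowSince: Option[Long] = None

  // Called periodically (e.g. on each AM heartbeat) with the live executor
  // count; returns true when the application should be failed.
  def shouldFail(liveExecutors: Int, now: Long = System.currentTimeMillis()): Boolean = {
    if (liveExecutors >= minExecutors) {
      belowSince = None // healthy again, reset the clock
      false
    } else {
      val since = belowSince.getOrElse { belowSince = Some(now); now }
      // Fail only if we have stayed below the minimum for the whole grace
      // window, so YARN gets a chance to allocate replacement executors.
      (now - since) >= graceMillis
    }
  }
}

The point being that the clock resets whenever the executor count recovers,
so only a sustained shortage of executors fails the application.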

Thoughts please.

Thanks,
Twinkle


On Wed, Apr 1, 2015 at 10:19 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> That's a good question, Twinkle.
>
> One solution could be to allow a maximum number of failures within any
> given time span.  E.g. a max failures per hour property.
>
> -Sandy
>
> On Tue, Mar 31, 2015 at 11:52 PM, twinkle sachdeva <
> twinkle.sachd...@gmail.com> wrote:
>
>> Hi,
>>
>> In Spark over YARN, there is a property "spark.yarn.max.executor.failures"
>> which controls the maximum number of executor failures an application will
>> survive.
>>
>> If the number of executor failures (due to any reason, like OOM or machine
>> failure, etc.) exceeds this value, then the application quits.
>>
>> For a short-duration Spark job this looks fine, but for long-running jobs,
>> since it does not take the duration into account, it can lead to the same
>> treatment for two different scenarios (mentioned below):
>> 1. Executors failing within 5 minutes.
>> 2. Executors failing sparsely, but at some point even a single executor
>> failure (which the application could have survived) making the application
>> quit.
>>
>> Sending this to the community to hear what kind of behaviour / strategy
>> they think would be suitable for long-running Spark jobs or Spark Streaming
>> jobs.
>>
>> Thanks and Regards,
>> Twinkle
>>
>
>
