Any news regarding this setting? Is this expected behaviour? Is there some
other way I can have Spark fail-fast?
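
For reference, this is roughly how the option is being set (a minimal
sketch, assuming the Spark 0.8-style system-property configuration; the
master URL and app name are placeholders):

    import org.apache.spark.SparkContext

    // spark.task.maxFailures must be set before the SparkContext is
    // created; otherwise the scheduler keeps the default value of 4.
    System.setProperty("spark.task.maxFailures", "1")
    val sc = new SparkContext("spark://master:7077", "fail-fast-test")
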
Thanks!

On Mon, Dec 9, 2013 at 4:35 PM, Grega Kešpret wrote:

> Hi!
>
> I tried this (by setting spark.task.maxFailures to 1) and it still does
> not fail-fast. I started a job and, after some time, I killed all the JVMs
> running on one of the two workers. I was expecting the Spark job to fail;
> however, it re-fetched the tasks to one of the two workers that was still
> alive, and the job carried on. How does spark.task.maxFailures affect this
> (if at all)?
>
> Log on driver:
> https://gist.github.com/gregakespret/7874908#file-gistfile1-txt-L1045-L1062
> (lines where I killed the JVM worker are selected)
>
> Grega
>
> --
> *Grega Kešpret*
> Analytics engineer
> Celtra — Rich Media Mobile Advertising
> celtra.com

>> The docs say that spark.task.maxFailures should be greater than or
>> equal to 1, and that the number of allowed retries = this value - 1.
>>
>> However, looking at the code
>>
>> https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala#L532
>>
>> if I set spark.task.maxFailures to 1, this means that the job will fail
>> only after a task fails for the second time. Shouldn't this line be
>> corrected to if (numFailures(index) >= MAX_TASK_FAILURES)?
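
To spell out the off-by-one described above, here is a simplified
reconstruction of the check (a sketch only; the real ClusterTaskSetManager
tracks far more state, and handleFailedTask here is a stand-in):

    import scala.collection.mutable

    // spark.task.maxFailures, defaulting to 4.
    val MAX_TASK_FAILURES =
      System.getProperty("spark.task.maxFailures", "4").toInt
    val numFailures = mutable.Map[Int, Int]().withDefaultValue(0)

    def handleFailedTask(index: Int): Unit = {
      numFailures(index) += 1
      // As written (strict >): with maxFailures = 1, a task must fail twice
      // (numFailures = 2 > 1) before the job aborts, so one retry is still
      // allowed. With >= instead, a single failure (1 >= 1) aborts at once,
      // matching "number of allowed retries = this value - 1" in the docs.
      if (numFailures(index) > MAX_TASK_FAILURES) {
        println("Task " + index + " failed more than " +
          MAX_TASK_FAILURES + " times; aborting job")
      }
    }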