Looking at http://spark.incubator.apache.org/docs/latest/configuration.html,
the docs say:
Number of individual task failures before giving up on the job. Should be
greater than or equal to 1. Number of allowed retries = this value - 1.

However, looking at the code
https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala#L532

if I set spark.task.maxFailures to 1, the job will only fail after a task
fails for the second time. Shouldn't that line be corrected to

    if (numFailures(index) >= MAX_TASK_FAILURES) {

?
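
To make the off-by-one concrete, here is a minimal sketch (not the actual
ClusterTaskSetManager code; the object and method names are just for
illustration) comparing the current ">" check against the proposed ">=" one:

    // Illustrative sketch of the maxFailures comparison, assuming the linked
    // line uses ">" as it appears in the repository today.
    object MaxFailuresCheck {
      val MAX_TASK_FAILURES = 1 // i.e. spark.task.maxFailures set to 1

      // Current behavior: abort only once the failure count exceeds the limit.
      def shouldAbortCurrent(numFailures: Int): Boolean =
        numFailures > MAX_TASK_FAILURES

      // Proposed fix: abort as soon as the failure count reaches the limit.
      def shouldAbortProposed(numFailures: Int): Boolean =
        numFailures >= MAX_TASK_FAILURES

      def main(args: Array[String]): Unit = {
        // After the first failure of a task:
        println(shouldAbortCurrent(1))  // false -> task is retried despite maxFailures = 1
        println(shouldAbortProposed(1)) // true  -> job aborts, matching "allowed retries = value - 1"
      }
    }

With the current check, maxFailures = 1 effectively allows one retry, which
contradicts the documented "Number of allowed retries = this value - 1".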

I can open a pull request if this is the case.

Thanks,
Grega
--
*Grega Kešpret*
Analytics engineer

Celtra — Rich Media Mobile Advertising
celtra.com <http://www.celtra.com/> |
@celtramobile <http://www.twitter.com/celtramobile>
