Looking at http://spark.incubator.apache.org/docs/latest/configuration.html, the docs for spark.task.maxFailures say: "Number of individual task failures before giving up on the job. Should be greater than or equal to 1. Number of allowed retries = this value - 1."
However, looking at the code at https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala#L532, if I set spark.task.maxFailures to 1, the job only fails after the task has failed for the second time. Shouldn't that line be corrected to if (numFailures(index) >= MAX_TASK_FAILURES) { ? I can open a pull request if this is the case.
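To illustrate what I mean, here is a small standalone Scala sketch (my own simplification, not the actual ClusterTaskSetManager code) that counts how many attempts a permanently failing task gets under each comparison:

object MaxFailuresSketch {
  // Counts how many times a task that always fails is attempted before the
  // job is aborted, for a given comparison operator in the failure check.
  def attemptsBeforeAbort(maxFailures: Int, useGte: Boolean): Int = {
    var numFailures = 0
    var aborted = false
    while (!aborted) {
      numFailures += 1 // the task fails on every attempt in this sketch
      val exceeded =
        if (useGte) numFailures >= maxFailures // proposed check
        else numFailures > maxFailures         // current check
      if (exceeded) aborted = true
    }
    numFailures
  }

  def main(args: Array[String]): Unit = {
    // Current ">" check with maxFailures = 1: aborts only after the second failure.
    println(attemptsBeforeAbort(maxFailures = 1, useGte = false)) // prints 2
    // Proposed ">=" check: aborts after the first failure, matching the docs
    // ("allowed retries = this value - 1", i.e. 0 retries).
    println(attemptsBeforeAbort(maxFailures = 1, useGte = true))  // prints 1
  }
}

Thanks,
Grega

--
Grega Kešpret
Analytics engineer
Celtra — Rich Media Mobile Advertising
celtra.com <http://www.celtra.com/> | @celtramobile <http://www.twitter.com/celtramobile>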