Peng Cheng created SPARK-2083:
---------------------------------

             Summary: Allow local task to retry after failure.
                 Key: SPARK-2083
                 URL: https://issues.apache.org/jira/browse/SPARK-2083
             Project: Spark
          Issue Type: Improvement
          Components: Deploy
    Affects Versions: 1.0.0
            Reporter: Peng Cheng
            Priority: Trivial
If a job is submitted to run locally using masterURL = "local[X]", Spark will not retry a failed task regardless of the "spark.task.maxFailures" setting. This design facilitates debugging and QA of Spark applications where all tasks are expected to succeed and yield a result. Unfortunately, it prevents a local job from finishing if any of its tasks cannot guarantee a result (e.g. one that visits an external resource/API), and retrying inside the task is less favoured (e.g. when the task needs to be executed on a different machine in production).

The user can still set masterURL = "local[X,Y]" to override this (where Y is the local maxFailures), but this form is not documented and hard to manage. A quick fix would be to add a new configuration property "spark.local.maxFailures" with a default value of 1, so users know exactly what to change when reading the documentation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
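For reference, the undocumented workaround described above can be sketched as follows. The "local[X,Y]" master syntax (Y = local maxFailures) is the existing Spark behaviour; the app name and thread count are illustrative, and the "spark.local.maxFailures" property is only the proposal in this issue, not an existing setting:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// "local[4,3]" = 4 worker threads, and each task may fail up to 3 times
// before the job is aborted (the second number is the local maxFailures).
val conf = new SparkConf()
  .setAppName("local-retry-demo") // illustrative name
  .setMaster("local[4,3]")

// Under this proposal, the same behaviour would instead be enabled via a
// documented property (hypothetical -- "spark.local.maxFailures" does not
// exist yet):
//   .setMaster("local[4]")
//   .set("spark.local.maxFailures", "3")

val sc = new SparkContext(conf)

// A task hitting a flaky external API is now retried locally instead of
// failing the whole job on its first error.
```

The benefit of the proposed property over the "local[X,Y]" form is that retry policy would live alongside the other spark.* settings (e.g. in spark-defaults.conf) rather than being encoded in the master URL.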