Peng Cheng created SPARK-2083:
---------------------------------

             Summary: Allow local task to retry after failure.
                 Key: SPARK-2083
                 URL: https://issues.apache.org/jira/browse/SPARK-2083
             Project: Spark
          Issue Type: Improvement
          Components: Deploy
    Affects Versions: 1.0.0
            Reporter: Peng Cheng
            Priority: Trivial


If a job is submitted to run locally using masterURL = "local[X]", Spark 
will not retry a failed task, regardless of the "spark.task.maxFailures" 
setting. This design facilitates debugging and QA of Spark applications 
in which every task is expected to succeed and yield a result. 
Unfortunately, it also prevents a local job from ever finishing if any of 
its tasks cannot guarantee a result (e.g. a task that visits an external 
resource/API), and retrying inside the task itself is less favoured (e.g. 
because in production the retry needs to run on a different machine). A 
sketch of the behaviour follows.
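
For illustration, a minimal sketch of the problem (the app name, the 
shared counter, and the failing map function are illustrative, not from 
this report; in local mode all tasks run in one JVM, so the counter is 
visible to every task):

{code}
import java.util.concurrent.atomic.AtomicInteger
import org.apache.spark.{SparkConf, SparkContext}

// Shared counter; works in local mode because tasks share one JVM.
object Flaky extends Serializable {
  val calls = new AtomicInteger(0)
}

val conf = new SparkConf()
  .setAppName("local-no-retry-demo")
  .setMaster("local[4]")              // plain local mode
  .set("spark.task.maxFailures", "4") // silently ignored in local[X] mode

val sc = new SparkContext(conf)
try {
  // The first task attempt throws; a retry would succeed. With plain
  // "local[4]" the job aborts on that first failure instead of retrying.
  sc.parallelize(1 to 8, 4).map { i =>
    if (Flaky.calls.getAndIncrement() == 0)
      throw new RuntimeException("transient external failure")
    i
  }.count()
} finally {
  sc.stop()
}
{code}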

A user can still set masterURL = "local[X,Y]" to override this (where Y 
is the local maxFailures), but this format is undocumented and hard to 
manage. A quick fix would be to add a new configuration property, 
"spark.local.maxFailures", with a default value of 1, so users know 
exactly what to change when reading the documentation.
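
Side by side, the existing workaround and what the proposed property 
might look like (the "spark.local.maxFailures" key comes from this 
proposal and does not exist yet):

{code}
import org.apache.spark.SparkConf

// Existing, undocumented workaround: allow up to 2 attempts per task.
val workaround = new SparkConf()
  .setMaster("local[4,2]")              // "local[X,Y]": Y = maxFailures

// Proposed alternative (hypothetical, not yet implemented): keep the
// familiar "local[X]" URL and control retries via a documented key.
val proposed = new SparkConf()
  .setMaster("local[4]")
  .set("spark.local.maxFailures", "2")  // default would remain 1
{code}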




--
This message was sent by Atlassian JIRA
(v6.2#6252)
