[ https://issues.apache.org/jira/browse/SPARK-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901901#comment-14901901 ]

Sandy Ryza commented on SPARK-10739:
------------------------------------

I recall there was a JIRA similar to this that avoided killing the application 
when we reached a certain number of executor failures.  However, IIUC, this is 
about something different: deciding whether to have YARN restart the 
application when it fails.

> Add attempt window for long running Spark application on Yarn
> -------------------------------------------------------------
>
>                 Key: SPARK-10739
>                 URL: https://issues.apache.org/jira/browse/SPARK-10739
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Saisai Shao
>            Priority: Minor
>
> Currently Spark on YARN relies on the max-attempts setting to limit failures: 
> once an application's failure count reaches the maximum, the application will 
> no longer be recovered by the RM. This is not very effective for long-running 
> applications, which can easily exceed the maximum over a long enough period, 
> while setting a very large max attempts just hides the real problem.
> So here we introduce an attempt window to bound the application attempt count: 
> attempts that fall outside the window are ignored. This mechanism was 
> introduced in Hadoop 2.6+ to support long-running applications, and it is 
> quite useful for applications like Spark Streaming and the Spark shell.
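The window semantics described above can be sketched as simple counting logic. This is an illustrative model only, not YARN's actual implementation; the class and method names are hypothetical:

```java
import java.util.List;

public class AttemptWindow {
    /**
     * Count only the failures that occurred within the validity window
     * ending at `nowMs`; older failures are ignored.
     */
    static long recentFailures(List<Long> failureTimesMs, long nowMs, long windowMs) {
        return failureTimesMs.stream()
                .filter(t -> nowMs - t <= windowMs)
                .count();
    }

    /**
     * The RM would give up on the application only when the number of
     * in-window failures reaches the configured maximum attempts.
     */
    static boolean shouldGiveUp(List<Long> failureTimesMs, long nowMs,
                                long windowMs, int maxAttempts) {
        return recentFailures(failureTimesMs, nowMs, windowMs) >= maxAttempts;
    }

    public static void main(String[] args) {
        // Three failures: two old, one recent. Window = 1 hour, max = 2 attempts.
        List<Long> failures = List.of(0L, 1_000L, 7_200_000L);
        long now = 7_230_000L;      // 2h 0m 30s after start
        long window = 3_600_000L;   // 1-hour validity window

        // Only one failure falls inside the window, so the app is not killed.
        System.out.println(shouldGiveUp(failures, now, window, 2)); // false
    }
}
```

Without the window (i.e. an infinite `windowMs`), all three failures would count and the application would be terminated; with the window, old failures age out, which is the point of the proposal for long-running jobs.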



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
