[ 
https://issues.apache.org/jira/browse/SPARK-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-10739:
--------------------------------
    Description: 
Currently Spark on YARN uses a maximum attempt count to control failures: once 
an application's failure count reaches the configured max attempts, the 
application will no longer be recovered by the RM. This is not very effective 
for long-running applications, which can easily exceed the limit over a long 
time period, while setting a very large max attempts only hides the real 
problem.

So here we introduce an attempt window to control the application attempt 
count: attempts that fall outside the window are ignored. This mechanism was 
introduced in Hadoop 2.6+ to support long-running applications, and it is 
quite useful for applications like Spark Streaming jobs and the Spark shell.
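A minimal sketch of how such a window might be configured at submit time. The property names below (`spark.yarn.maxAppAttempts` and `spark.yarn.am.attemptFailuresValidityInterval`) are assumptions for illustration, not confirmed by this issue; on the YARN side the window corresponds to `ApplicationSubmissionContext#setAttemptFailuresValidityInterval` in Hadoop 2.6+:

```shell
# Hypothetical sketch: let the RM keep recovering a long-running
# streaming job indefinitely, as long as no more than 4 AM failures
# occur within any one-hour window. Failures older than the window
# are ignored when counting attempts.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.maxAppAttempts=4 \
  --conf spark.yarn.am.attemptFailuresValidityInterval=1h \
  --class org.example.StreamingApp \
  streaming-app.jar
```

With this setup, occasional AM failures spread over days would never exhaust the attempt budget, while a crash loop (4 failures within an hour) would still cause the application to fail fast.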


  was:
Currently Spark on Yarn uses max attempts to control the failure number, if 
application's failure number reaches to the max attempts, application will not 
be recovered by RM, it is not very for long running applications, since it will 
easily exceed the max number, also setting a very large max attempts will hide 
the real problem.

So here introduce an attempt window to control the application attempt times, 
this will ignore the out of window attempts, it is introduced in Hadoop 2.6+ to 
support long running application, it is quite useful for Spark Streaming, Spark 
shell like applications.



> Add attempt window for long running Spark application on Yarn
> -------------------------------------------------------------
>
>                 Key: SPARK-10739
>                 URL: https://issues.apache.org/jira/browse/SPARK-10739
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Saisai Shao
>            Priority: Minor
>
> Currently Spark on YARN uses a maximum attempt count to control failures: 
> once an application's failure count reaches the configured max attempts, the 
> application will no longer be recovered by the RM. This is not very 
> effective for long-running applications, which can easily exceed the limit 
> over a long time period, while setting a very large max attempts only hides 
> the real problem.
> So here we introduce an attempt window to control the application attempt 
> count: attempts that fall outside the window are ignored. This mechanism was 
> introduced in Hadoop 2.6+ to support long-running applications, and it is 
> quite useful for applications like Spark Streaming jobs and the Spark shell.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org