[ 
https://issues.apache.org/jira/browse/YARN-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743628#comment-14743628
 ] 

Jason Lowe commented on YARN-4142:
----------------------------------

One issue with sending diagnostics on the heartbeat is what to do when an AM 
ends up sending them on _every_ heartbeat.  If they are all concatenated, this 
could be a significant memory burden on the RM.  If subsequent diagnostics 
replace earlier ones it's less of an issue, but there will be situations 
where losing the earlier text is undesirable.  Maybe we only store the last N 
kilobytes of diagnostic information from an AM attempt to keep it from 
overwhelming the RM.
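
As a rough illustration of the "last N" idea, here is a minimal sketch of a 
bounded buffer that concatenates updates but keeps only the most recent tail.  
The class name BoundedDiagnostics and its methods are hypothetical and not 
part of any YARN API, and for simplicity it caps Java chars rather than 
encoded bytes.

```java
// Hypothetical helper, not an existing YARN class: keeps only the most
// recent maxChars characters of everything appended to it.
class BoundedDiagnostics {
    private final int maxChars;
    private final StringBuilder buffer = new StringBuilder();

    BoundedDiagnostics(int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        this.maxChars = maxChars;
    }

    // Append one diagnostics update, dropping the oldest text on overflow.
    synchronized void append(String update) {
        if (update == null || update.isEmpty()) {
            return;
        }
        if (update.length() >= maxChars) {
            // The update alone fills the cap: keep only its tail.
            buffer.setLength(0);
            buffer.append(update, update.length() - maxChars, update.length());
            return;
        }
        buffer.append(update);
        int overflow = buffer.length() - maxChars;
        if (overflow > 0) {
            buffer.delete(0, overflow);
        }
    }

    // Current retained diagnostics text.
    synchronized String get() {
        return buffer.toString();
    }
}
```

With something like this, an AM that reports on every heartbeat costs the RM 
at most a fixed amount of memory per attempt, at the price of losing the 
oldest text first.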

> add a way for an attempt to report an attempt failure
> -----------------------------------------------------
>
>                 Key: YARN-4142
>                 URL: https://issues.apache.org/jira/browse/YARN-4142
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>
> Currently AMs can report a failure with exit code and diagnostics text, but 
> only when exiting to a failed state. If the AM terminates for any other 
> reason there's no information held in the RM, just the logs somewhere, and we 
> know they don't always last.
> When an application explicitly terminates an attempt, it would be nice if it 
> could optionally report something to the RM before it exited. The most 
> recent set of these could then be included in Application Reports, 
> allowing client apps to count attempt failures and get exit details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
