[ 
https://issues.apache.org/jira/browse/YARN-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146075#comment-16146075
 ] 

Jian He commented on YARN-6168:
-------------------------------

One option would be to change the AM heartbeat to also return the previously 
running containers. Right now they are only returned in the 
registerApplicationMaster response.
We could even deprecate the old path and have a single place (the AM heartbeat 
response) that returns the old containers.
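
A minimal sketch of what this could look like on the AM side, assuming the 
heartbeat response carried a list of previously running containers. The class 
and method names below (KnownContainers, onRegister, onHeartbeat) are 
hypothetical and do not come from the YARN API; container ids are plain 
strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AM-side bookkeeping, not part of the YARN API.
class KnownContainers {
    // Ids of every pre-existing container the AM has been told about so far.
    private final Set<String> containerIds = new LinkedHashSet<>();

    // Called once with the (possibly incomplete) list from the
    // registration response.
    void onRegister(Collection<String> fromRegisterResponse) {
        containerIds.addAll(fromRegisterResponse);
    }

    // Called for every heartbeat response. Assumes the response also
    // carries previously running containers, as proposed above.
    // Returns only the ids the AM had not seen before.
    List<String> onHeartbeat(Collection<String> fromHeartbeatResponse) {
        List<String> newlyLearned = new ArrayList<>();
        for (String id : fromHeartbeatResponse) {
            if (containerIds.add(id)) {
                newlyLearned.add(id);
            }
        }
        return newlyLearned;
    }

    // Read-only copy of everything known so far.
    Set<String> snapshot() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(containerIds));
    }
}
```

With this shape, containers that NMs report to the RM after the AM has 
registered would still reach the AM on a later heartbeat, and repeated 
reports of the same container would be ignored.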

> Restarted RM may not inform AM about all existing containers
> ------------------------------------------------------------
>
>                 Key: YARN-6168
>                 URL: https://issues.apache.org/jira/browse/YARN-6168
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Billie Rinaldi
>
> There appears to be a race condition when an RM is restarted. I had a 
> situation where the RMs and AM were down, but NMs and app containers were 
> still running. When I restarted the RM, the AM restarted, registered with the 
> RM, and received its list of existing containers before the NMs had reported 
> all of their containers to the RM. The AM was only told about some of the 
> app's existing containers.



