[ 
https://issues.apache.org/jira/browse/YARN-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-1421.
-------------------------------------------

    Resolution: Duplicate

Yes, closing as duplicate.

> Node managers will not receive application finish event where containers ran 
> before RM restart
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-1421
>                 URL: https://issues.apache.org/jira/browse/YARN-1421
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>            Priority: Critical
>
> Problem: Today, for every application, we track the node managers where 
> its containers ran. When the application finishes, the RM notifies all of 
> those node managers of the application finish event (via the node manager 
> heartbeat). However, if the RM restarts, this information is lost, so 
> those node managers never receive the application finish event and keep 
> reporting the finished applications.
> Proposed solution: Instead of remembering the node managers where 
> containers ran for a particular application, it would be better to rely 
> on the node manager heartbeat to make this decision. That is, when a node 
> manager heartbeats saying it is running applications (app1, app2), we 
> should check the status of those applications in the RM's memory 
> {code}rmContext.getRMApps(){code} and, if they are either not found (very 
> old applications) or in a final state (FINISHED, KILLED, FAILED), 
> immediately notify the node manager of the application finish event. This 
> reduces the state that the RM needs to store across a restart.
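
For illustration only, the check proposed above could be sketched roughly as
follows. This is a simplified, standalone model and not the actual
ResourceManager code: the AppState enum, the plain String application IDs,
the Map standing in for rmContext.getRMApps(), and the appsToCleanup helper
are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HeartbeatCleanupSketch {
    // Hypothetical, simplified stand-in for the RM's application states.
    enum AppState { RUNNING, FINISHED, KILLED, FAILED }

    // The final states named in the issue description.
    static final Set<AppState> FINAL_STATES =
        EnumSet.of(AppState.FINISHED, AppState.KILLED, AppState.FAILED);

    /**
     * Given the applications a node manager reports in its heartbeat and the
     * RM's in-memory view of applications (assumed here to be a simple map),
     * return the applications the node manager should be told have finished.
     */
    static List<String> appsToCleanup(List<String> reportedByNode,
                                      Map<String, AppState> rmApps) {
        List<String> cleanup = new ArrayList<>();
        for (String appId : reportedByNode) {
            AppState state = rmApps.get(appId);
            // Unknown to the RM (very old application) or already in a final state.
            if (state == null || FINAL_STATES.contains(state)) {
                cleanup.add(appId);
            }
        }
        return cleanup;
    }

    public static void main(String[] args) {
        Map<String, AppState> rmApps = new HashMap<>();
        rmApps.put("app1", AppState.RUNNING);
        rmApps.put("app2", AppState.FINISHED);
        // app3 is not known to the RM at all.
        System.out.println(
            appsToCleanup(Arrays.asList("app1", "app2", "app3"), rmApps));
        // prints [app2, app3]
    }
}
```

In this sketch the decision needs nothing beyond the heartbeat contents and
the RM's current application map, which is the point of the proposal: no
per-application list of node managers has to survive a restart.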



--
This message was sent by Atlassian JIRA
(v6.2#6252)
