[ https://issues.apache.org/jira/browse/YARN-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kumar Vavilapalli resolved YARN-1421.
-------------------------------------------

    Resolution: Duplicate

Yes, closing as duplicate.

> Node managers will not receive application finish event where containers ran
> before RM restart
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-1421
>                 URL: https://issues.apache.org/jira/browse/YARN-1421
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>            Priority: Critical
>
> Problem: Today, for every application, we track the node managers where its
> containers ran. When the application finishes, the RM notifies all of those
> node managers of the application finish event (via the node manager
> heartbeat). However, if the RM restarts, this information is lost, so those
> node managers never receive the application finish event and keep reporting
> finished applications.
>
> Proposed Solution: Instead of remembering the node managers where containers
> ran for each application, it would be better to rely on the node manager
> heartbeat to make this decision. That is, when a node manager heartbeats and
> reports that it is running applications (app1, app2), we should check those
> applications' status in the RM's memory
> {code}rmContext.getRMApps(){code} and, if they are either not found (very old
> applications) or in a final state (FINISHED, KILLED, FAILED), immediately
> notify the node manager of the application finish event. This reduces the
> state that the RM needs to store in order to handle a restart.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
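
For illustration only, the check described under "Proposed Solution" could be sketched roughly as follows. This is not the actual YARN code or the patch from the duplicate issue. The class, enum, and method names (HeartbeatCleanupSketch, AppState, appsToCleanUp) are hypothetical, and a plain Map stands in for whatever rmContext.getRMApps() returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; these types are NOT the real YARN classes.
final class HeartbeatCleanupSketch {

    // Simplified states; the issue names FINISHED, KILLED, FAILED as final.
    enum AppState { RUNNING, FINISHED, KILLED, FAILED }

    private static final Set<AppState> FINAL_STATES =
            EnumSet.of(AppState.FINISHED, AppState.KILLED, AppState.FAILED);

    /**
     * rmApps stands in for the RM's in-memory application map.
     * reportedByNode is the list of application IDs a node manager
     * claims to be running in its heartbeat.
     * Returns the IDs the node manager should be told have finished.
     */
    static List<String> appsToCleanUp(Map<String, AppState> rmApps,
                                      List<String> reportedByNode) {
        List<String> finished = new ArrayList<>();
        for (String appId : reportedByNode) {
            AppState state = rmApps.get(appId);
            // Not found in the RM (very old application) or in a final state.
            if (state == null || FINAL_STATES.contains(state)) {
                finished.add(appId);
            }
        }
        return finished;
    }

    public static void main(String[] args) {
        Map<String, AppState> rmApps = new LinkedHashMap<>();
        rmApps.put("app1", AppState.RUNNING);
        rmApps.put("app2", AppState.FINISHED);
        // "app3" is deliberately absent, modelling an app the RM has forgotten.
        List<String> reported = Arrays.asList("app1", "app2", "app3");
        System.out.println(appsToCleanUp(rmApps, reported));
    }
}
```

In this sketch the decision depends only on the heartbeat contents and the RM's current application map, so no per-application list of node managers has to survive a restart.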