[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322154#comment-14322154 ]
Jian He commented on YARN-3194:
-------------------------------

bq. I have one doubt on this method ResourceTrackerService#handleNMContainerStatus
This is legacy code for non-work-preserving restart. We could remove it; just disregard this method.

bq. NM RESTART is Enabled – Problem is here
For the node_reconnect event, the RM removes the old node and adds the newly connected node. The RM itself is not restarted in this case, so I don't think we need to handle the RMNodeReconnectEvent.

> After NM restart, completed containers are not released which are sent during NM registration
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-3194
>                 URL: https://issues.apache.org/jira/browse/YARN-3194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>   Affects Versions: 2.6.0
>         Environment: NM restart is enabled
>            Reporter: Rohith
>            Assignee: Rohith
>
> On NM restart, the NM sends all outstanding NMContainerStatus to the RM, but the RM
> processes only those in ContainerState.RUNNING. If a container completed while the NM
> was down, its resources are not released, which causes applications to hang.
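
To make the problem in the description concrete, here is a minimal, self-contained sketch of what the RM would have to do at NM registration: pick out the reported containers that are already completed so their resources can be released, instead of looking only at RUNNING ones. The names `RegistrationSketch`, `State`, `Status` and `completedContainers` are hypothetical stand-ins invented for this illustration; they are not the real YARN classes, and this is not the patch attached to the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegistrationSketch {
    // Hypothetical stand-in for a container state; loosely modeled on the
    // RUNNING/COMPLETE distinction discussed in the issue.
    enum State { RUNNING, COMPLETE }

    // Hypothetical stand-in for one container status reported by the NM
    // when it registers after a restart.
    static final class Status {
        final String containerId;
        final State state;

        Status(String containerId, State state) {
            this.containerId = containerId;
            this.state = state;
        }
    }

    // Returns the ids of reported containers that have already completed.
    // These are the ones whose resources would need to be released; a
    // handler that only inspects RUNNING containers never sees them.
    static List<String> completedContainers(List<Status> reported) {
        List<String> toRelease = new ArrayList<>();
        for (Status s : reported) {
            if (s.state == State.COMPLETE) {
                toRelease.add(s.containerId);
            }
        }
        return toRelease;
    }

    public static void main(String[] args) {
        List<Status> reported = Arrays.asList(
            new Status("container_1", State.RUNNING),
            new Status("container_2", State.COMPLETE));
        // prints [container_2]
        System.out.println(completedContainers(reported));
    }
}
```

In this sketch the RUNNING container is left alone and only `container_2` is returned for release, which is the case the issue says is currently dropped.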