[ https://issues.apache.org/jira/browse/YARN-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162740#comment-14162740 ]
zhihai xu commented on YARN-2641:
---------------------------------

Hi [~djp], thanks for reviewing the patch.
I removed the following RMNode decommission from nodeHeartbeat (ResourceTrackerService.java):
{code}
this.rmContext.getDispatcher().getEventHandler().handle(
    new RMNodeEvent(nodeId, RMNodeEventType.DECOMMISSION));
{code}
and added the RMNode decommission to refreshNodes (NodesListManager.java).
With the patch applied, do you still see the decommission happen after the heartbeat response goes back to the NM?

My first patch (YARN-2641.000.patch) did not include a unit test. In my second patch (YARN-2641.001.patch), I changed the unit test in TestResourceTrackerService to verify that RMNodeEventType.DECOMMISSION is sent in
{code}
rm.getNodesListManager().refreshNodes(conf);
{code}
instead of
{code}
nodeHeartbeat = nm1.nodeHeartbeat(true);
{code}

> improve node decommission latency in RM.
> ----------------------------------------
>
>                 Key: YARN-2641
>                 URL: https://issues.apache.org/jira/browse/YARN-2641
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-2641.000.patch, YARN-2641.001.patch
>
>
> Improve node decommission latency in RM.
> Currently, a node is decommissioned only after the RM receives a nodeHeartbeat
> from the Node Manager. The node heartbeat interval is configurable; the
> default value is 1 second.
> It would be better to do the decommission during the RM refresh (NodesListManager)
> instead of in nodeHeartbeat (ResourceTrackerService).
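
For readers following the discussion, the difference between the two approaches can be illustrated with a small self-contained sketch. This is not the patch and does not use the Hadoop API: ToyResourceManager, refreshNodesLazy, refreshNodesEager, and the node names are all hypothetical stand-ins, and the real RM dispatches an RMNodeEvent rather than flipping a state field directly. The sketch only shows that, under these assumptions, moving the decommission into the refresh step removes the wait for the next heartbeat.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DecommissionSketch {
    enum NodeState { RUNNING, DECOMMISSIONED }

    /** Hypothetical stand-in for the RM's node bookkeeping; not the Hadoop API. */
    static class ToyResourceManager {
        private final Map<String, NodeState> nodes = new LinkedHashMap<>();
        private final Set<String> excluded = new HashSet<>();

        void register(String host) {
            nodes.put(host, NodeState.RUNNING);
        }

        /** Heartbeat-driven variant: refresh only reloads the exclude list. */
        void refreshNodesLazy(Set<String> newExcludes) {
            excluded.clear();
            excluded.addAll(newExcludes);
        }

        /** Heartbeat-driven variant: an excluded node is decommissioned when it next heartbeats. */
        NodeState heartbeat(String host) {
            if (excluded.contains(host) && nodes.get(host) == NodeState.RUNNING) {
                nodes.put(host, NodeState.DECOMMISSIONED);
            }
            return nodes.get(host);
        }

        /** Refresh-driven variant: refresh itself decommissions every running, excluded node. */
        List<String> refreshNodesEager(Set<String> newExcludes) {
            refreshNodesLazy(newExcludes);
            List<String> decommissioned = new ArrayList<>();
            for (Map.Entry<String, NodeState> e : nodes.entrySet()) {
                if (excluded.contains(e.getKey()) && e.getValue() == NodeState.RUNNING) {
                    e.setValue(NodeState.DECOMMISSIONED);
                    decommissioned.add(e.getKey());
                }
            }
            return decommissioned;
        }

        NodeState stateOf(String host) {
            return nodes.get(host);
        }
    }

    public static void main(String[] args) {
        ToyResourceManager lazy = new ToyResourceManager();
        lazy.register("nm1");
        lazy.refreshNodesLazy(Collections.singleton("nm1"));
        System.out.println("lazy, after refresh:    " + lazy.stateOf("nm1"));
        System.out.println("lazy, after heartbeat:  " + lazy.heartbeat("nm1"));

        ToyResourceManager eager = new ToyResourceManager();
        eager.register("nm1");
        eager.refreshNodesEager(Collections.singleton("nm1"));
        System.out.println("eager, after refresh:   " + eager.stateOf("nm1"));
    }
}
```

In the heartbeat-driven variant the node stays RUNNING after the refresh until its next heartbeat arrives; in the refresh-driven variant it is DECOMMISSIONED as soon as the refresh returns, which mirrors what the updated test checks by calling refreshNodes instead of nodeHeartbeat.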