[ 
https://issues.apache.org/jira/browse/YARN-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162740#comment-14162740
 ] 

zhihai xu commented on YARN-2641:
---------------------------------

Hi [~djp], thanks for reviewing the patch. I removed the following RMNode 
decommission event from nodeHeartbeat (ResourceTrackerService.java):

{code}
this.rmContext.getDispatcher().getEventHandler().handle(
    new RMNodeEvent(nodeId, RMNodeEventType.DECOMMISSION));
{code}

I added the RMNode decommission to refreshNodes (NodesListManager.java) instead.

With the patch applied, do you still see the decommission happen after the 
heartbeat response is returned to the NM?

My first patch (YARN-2641.000.patch) did not include a unit test.

In my second patch (YARN-2641.001.patch), I changed the unit test in 
TestResourceTrackerService to verify that RMNodeEventType.DECOMMISSION is sent 
in {code}rm.getNodesListManager().refreshNodes(conf);{code} instead of
{code}nodeHeartbeat = nm1.nodeHeartbeat(true);{code}
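
To make the intended behavior concrete, here is a minimal, self-contained sketch 
of the idea. It is not the patch itself and does not use the real YARN classes: 
DecommissionSketch, NodeEvent, RecordingDispatcher, and NodesListSketch are 
hypothetical stand-ins, and node IDs are plain strings. It only illustrates 
that the DECOMMISSION event is sent during the refresh, while a later heartbeat 
merely reports that the node is no longer valid.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types for illustration; NOT the real YARN classes.
public class DecommissionSketch {
    enum EventType { DECOMMISSION }

    static final class NodeEvent {
        final String nodeId;
        final EventType type;

        NodeEvent(String nodeId, EventType type) {
            this.nodeId = nodeId;
            this.type = type;
        }
    }

    /** Records every dispatched event so a test can inspect them. */
    static final class RecordingDispatcher {
        final List<NodeEvent> events = new ArrayList<>();

        void handle(NodeEvent event) {
            events.add(event);
        }
    }

    static final class NodesListSketch {
        private final Set<String> activeNodes = new HashSet<>();
        private final Set<String> excludedNodes = new HashSet<>();
        private final RecordingDispatcher dispatcher;

        NodesListSketch(RecordingDispatcher dispatcher) {
            this.dispatcher = dispatcher;
        }

        void register(String nodeId) {
            activeNodes.add(nodeId);
        }

        /** Refresh: send DECOMMISSION right away for active nodes that are now excluded. */
        void refreshNodes(Set<String> newExcludes) {
            excludedNodes.clear();
            excludedNodes.addAll(newExcludes);
            // Iterate over a copy because activeNodes is modified inside the loop.
            for (String nodeId : new ArrayList<>(activeNodes)) {
                if (excludedNodes.contains(nodeId)) {
                    dispatcher.handle(new NodeEvent(nodeId, EventType.DECOMMISSION));
                    activeNodes.remove(nodeId);
                }
            }
        }

        /** Heartbeat: reports whether the node is still valid; sends no event. */
        boolean nodeHeartbeat(String nodeId) {
            return activeNodes.contains(nodeId) && !excludedNodes.contains(nodeId);
        }
    }

    public static void main(String[] args) {
        RecordingDispatcher dispatcher = new RecordingDispatcher();
        NodesListSketch nodes = new NodesListSketch(dispatcher);
        nodes.register("nm1");
        nodes.register("nm2");

        Set<String> excludes = new HashSet<>();
        excludes.add("nm2");
        nodes.refreshNodes(excludes);

        System.out.println("events after refresh: " + dispatcher.events.size());
        System.out.println("nm2 heartbeat accepted: " + nodes.nodeHeartbeat("nm2"));
    }
}
```

A test in this style checks the recorded events right after refreshNodes and 
then confirms that a following heartbeat adds no further event, which is the 
same shape as the check described above for TestResourceTrackerService.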


> improve node decommission latency in RM.
> ----------------------------------------
>
>                 Key: YARN-2641
>                 URL: https://issues.apache.org/jira/browse/YARN-2641
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-2641.000.patch, YARN-2641.001.patch
>
>
> Improve node decommission latency in the RM. 
> Currently, a node is decommissioned only after the RM receives a nodeHeartbeat 
> from the NodeManager. The node heartbeat interval is configurable; the 
> default value is 1 second.
> It would be better to do the decommission during the RM refresh 
> (NodesListManager) instead of in nodeHeartbeat (ResourceTrackerService).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
