[ 
https://issues.apache.org/jira/browse/YARN-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167692#comment-14167692
 ] 

Wei Yan commented on YARN-2641:
-------------------------------

bq. I think the actual decommission happens when the NM receives a shutdown 
command in the RM's heartbeat response, doesn't it? So the latency between the 
decommission CLI command and the node being decommissioned won't be affected. 
Also, in most cases, resource scheduling is triggered by the NM's heartbeat to 
the RM, so the latency between the decommission CLI command and scheduling 
containers on nodes won't be affected either (except for attempt scheduling). 
So IMO, this patch only improves the latency for the attempt scheduling case. 
Do we have other scenarios to address?

From my understanding, if an NM fails or is killed, the RM currently cannot 
learn of it until yarn.nm.liveness-monitor.expiry-interval-ms expires. That 
means both the RM and the AMs assume that all containers on the failed NM are 
still running until the timeout. However, [~zxu]'s point is that the RM 
doesn't need to wait that long to learn that the NM was killed; it can get 
this information directly when the "refreshNodes" command is triggered. For 
example, if the user removes an NM and then runs refreshNodes, the RM can 
quickly learn that the NM is gone and notify all applications, without 
waiting for the heartbeat timeout, so the AMs can act on it quickly.
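
To make the two detection paths concrete, here is a minimal toy sketch in Java. 
It is not the YARN code and not the attached patch: DecommissionSketch, 
NodeTracker, and all of their methods are hypothetical names invented for 
illustration. The only thing taken from the discussion is the idea that a 
silent node is otherwise noticed only after the liveness expiry interval, 
whereas a refreshNodes-style call can act on the exclude list immediately.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical, none of this is a YARN class.
class DecommissionSketch {

    /** Tracks the last heartbeat time per node and records "node lost" notifications. */
    static class NodeTracker {
        // Stands in for yarn.nm.liveness-monitor.expiry-interval-ms.
        private final long expiryIntervalMs;
        private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
        private final List<String> notifications = new ArrayList<>();

        NodeTracker(long expiryIntervalMs) {
            this.expiryIntervalMs = expiryIntervalMs;
        }

        /** Records that a node reported in at the given time. */
        void heartbeat(String node, long nowMs) {
            lastHeartbeatMs.put(node, nowMs);
        }

        /** Timeout path: a silent node is noticed only once the expiry interval has elapsed. */
        void checkLiveness(long nowMs) {
            List<String> expired = new ArrayList<>();
            for (Map.Entry<String, Long> e : lastHeartbeatMs.entrySet()) {
                if (nowMs - e.getValue() >= expiryIntervalMs) {
                    expired.add(e.getKey());
                }
            }
            for (String node : expired) {
                remove(node, "EXPIRED", nowMs);
            }
        }

        /** Refresh path: act on the exclude list at once, without waiting for a heartbeat. */
        void refreshNodes(Set<String> excluded, long nowMs) {
            for (String node : excluded) {
                if (lastHeartbeatMs.containsKey(node)) {
                    remove(node, "DECOMMISSIONED", nowMs);
                }
            }
        }

        /** Forgets the node and records when, and why, applications would be notified. */
        private void remove(String node, String reason, long nowMs) {
            lastHeartbeatMs.remove(node);
            notifications.add(node + ":" + reason + "@" + nowMs);
        }

        List<String> notifications() {
            return notifications;
        }
    }

    public static void main(String[] args) {
        long tenMinutesMs = 600_000L;

        // Timeout path: nm1 is killed right after its heartbeat at t=0.
        NodeTracker byTimeout = new NodeTracker(tenMinutesMs);
        byTimeout.heartbeat("nm1", 0L);
        byTimeout.checkLiveness(2_000L);       // too early, nothing is noticed
        byTimeout.checkLiveness(tenMinutesMs); // the node finally expires
        System.out.println(byTimeout.notifications());

        // Refresh path: the same node is put on the exclude list at t=2s.
        NodeTracker byRefresh = new NodeTracker(tenMinutesMs);
        byRefresh.heartbeat("nm1", 0L);
        byRefresh.refreshNodes(Collections.singleton("nm1"), 2_000L);
        System.out.println(byRefresh.notifications());
    }
}
```

In this toy model the refresh path records the notification at the time of 
the refresh call, while the timeout path records nothing until the full expiry 
interval has passed. How the real RM would propagate such an event to the AMs 
is outside the scope of this sketch.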

> improve node decommission latency in RM.
> ----------------------------------------
>
>                 Key: YARN-2641
>                 URL: https://issues.apache.org/jira/browse/YARN-2641
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-2641.000.patch, YARN-2641.001.patch
>
>
> Improve node decommission latency in RM. 
> Currently, node decommission only happens after the RM receives a nodeHeartbeat 
> from the Node Manager. The node heartbeat interval is configurable; the 
> default value is 1 second.
> It would be better to do the decommission during RM refresh (NodesListManager) 
> instead of during nodeHeartbeat (ResourceTrackerService).
> The current behavior can cause a much more serious issue:
> After the RM is refreshed (refreshNodes), if the NM to be decommissioned is 
> killed before it sends a heartbeat to the RM, the RMNode will never be 
> decommissioned in the RM. The RMNode will only expire in the RM after 
> "yarn.nm.liveness-monitor.expiry-interval-ms" (default value 10 minutes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
