[ 
https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742469#comment-13742469
 ] 

Jason Lowe commented on YARN-1071:
----------------------------------

The NM counts cover only NMs that have connected to the RM since it 
started.  Restarting the RM resets them all to zero.  The 3 NMs that were 
previously active retried and reconnected to the RM after it restarted, which 
explains why the ActiveNM count is 3.  The other three, however, will not 
contact the RM because they are not running, which explains why their counts 
are zero after the restart.
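
To make the behaviour described above concrete, here is a minimal sketch in 
Python.  It is a toy model, not Hadoop code: the ToyRM class, its method 
names, and the host names nm1..nm6 are all hypothetical.  The only assumption 
it encodes is the one stated above, namely that the counts live in memory and 
only cover nodes that have contacted the current RM instance.

```python
from collections import Counter

# States tracked by this toy model (a subset of what the RM reports).
STATES = ("ACTIVE", "DECOMMISSIONED", "LOST")


class ToyRM:
    """Hypothetical stand-in for the RM; not an actual Hadoop class."""

    def __init__(self):
        # Node bookkeeping is held in memory only, so a new instance
        # (modelling an RM restart) starts with no known nodes.
        self.nodes = {}

    def heartbeat(self, host):
        # A node becomes known to this RM instance when it connects.
        self.nodes[host] = "ACTIVE"

    def set_state(self, host, state):
        # Only a node this instance already knows about can change state.
        if host not in self.nodes:
            raise KeyError(f"{host} has never contacted this RM instance")
        self.nodes[host] = state

    def counts(self):
        tally = Counter(self.nodes.values())
        return {state: tally[state] for state in STATES}


def simulate():
    rm = ToyRM()
    for i in range(1, 7):
        rm.heartbeat(f"nm{i}")            # all 6 nodes connect once
    rm.set_state("nm4", "DECOMMISSIONED")  # host placed in the exclude file
    rm.set_state("nm5", "LOST")            # stopped NM
    rm.set_state("nm6", "LOST")            # stopped NM
    before = rm.counts()

    rm = ToyRM()                           # RM restart: state starts empty
    for host in ("nm1", "nm2", "nm3"):
        rm.heartbeat(host)                 # only running NMs reconnect
    after = rm.counts()
    return before, after


before, after = simulate()
```

Under these assumptions, `before` matches the counts reported before the 
restart (3 active, 1 decommissioned, 2 lost) and `after` shows 3 active with 
the other two counts at zero, because the restarted instance never hears from 
the three nodes that are not running.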
                
> ResourceManager's decommissioned and lost node count is 0 after restart
> -----------------------------------------------------------------------
>
>                 Key: YARN-1071
>                 URL: https://issues.apache.org/jira/browse/YARN-1071
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Srimanth Gunturi
>            Priority: Critical
>
> I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's 
> {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin 
> -refreshNodes}}, RM's JMX correctly showed decommissioned node count:
> {noformat}
> "NumActiveNMs" : 3,
> "NumDecommissionedNMs" : 1,
> "NumLostNMs" : 2,
> "NumUnhealthyNMs" : 0,
> "NumRebootedNMs" : 0
> {noformat}
> After restarting RM, the counts were shown as below in JMX.
> {noformat}
> "NumActiveNMs" : 3,
> "NumDecommissionedNMs" : 0,
> "NumLostNMs" : 0,
> "NumUnhealthyNMs" : 0,
> "NumRebootedNMs" : 0
> {noformat}
> Notice that the lost and decommissioned NM counts are both 0.
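
The two JMX snapshots quoted above can be compared mechanically.  The sketch 
below is illustrative only: it assumes each fragment has been wrapped in 
braces to form a valid JSON object, and the helper name changed_counts is 
hypothetical.  It does not query a live RM.

```python
import json

# The two snapshots from the report, wrapped in braces (an assumption)
# so that they parse as JSON objects.
BEFORE = """{
  "NumActiveNMs" : 3,
  "NumDecommissionedNMs" : 1,
  "NumLostNMs" : 2,
  "NumUnhealthyNMs" : 0,
  "NumRebootedNMs" : 0
}"""

AFTER = """{
  "NumActiveNMs" : 3,
  "NumDecommissionedNMs" : 0,
  "NumLostNMs" : 0,
  "NumUnhealthyNMs" : 0,
  "NumRebootedNMs" : 0
}"""


def changed_counts(before_json, after_json):
    """Return {metric: (before, after)} for every metric whose value changed."""
    before = json.loads(before_json)
    after = json.loads(after_json)
    return {
        name: (value, after.get(name))
        for name, value in before.items()
        if value != after.get(name)
    }


diff = changed_counts(BEFORE, AFTER)
```

With the values from the report, `diff` contains only NumDecommissionedNMs 
and NumLostNMs, the two counts that dropped to zero across the restart.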

