[ 
https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051121#comment-15051121
 ] 

Kuhu Shukla commented on YARN-3102:
-----------------------------------

Following the discussion from YARN-4402:

Given that we treat the exclude list as the canonical source of truth for 
decommissioned nodes, which means the {{setDecomissionedNMsMetrics}} call in 
serviceInit is kept as is, the only way to make these nodes part of the 
inactiveNodes map is to read the node hostnames/IPs from the exclude list and 
add them to the map, even though we lose the port information. Today this map 
is reinitialized to a new, empty concurrent map in {{RMActiveServiceContext}} 
during startup. The port information is lost because the node would ideally 
not have the NM process running, and we don't keep that state across RM 
restarts. In practice, we add a (NodeId, RMNode) entry in which the hostname 
is valid but the ports are set to a defined invalid value such as -1. This 
lets us track the nodes that were decommissioned in the previous life cycle 
of the RM. We can also tweak the GUI to display N/A when the port is -1. 
Since the {{isValidNode}} check is based only on hostname/IP, this does not 
affect the rejoining behavior of the node. Requesting [~eepayne] for comments 
and ideas. 
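
To make the proposal concrete, here is a rough, self-contained sketch of the 
idea. It is not a patch: {{InactiveNodesSketch}}, its nested {{NodeId}}, 
{{seedInactiveNodes}}, {{displayPort}} and the string state are hypothetical 
stand-ins rather than the real YARN classes, and the {{isValidNode}} shown 
here is a simplified host-only check that ignores the include list.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration; NOT the real YARN NodeId/RMNode.
class InactiveNodesSketch {
    // Sentinel meaning "port unknown", as proposed in the comment above.
    static final int UNKNOWN_PORT = -1;

    // Minimal (host, port) key, assumed to mirror what a node id carries.
    static final class NodeId {
        final String host;
        final int port;

        NodeId(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NodeId)) {
                return false;
            }
            NodeId other = (NodeId) o;
            return port == other.port && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port);
        }
    }

    // After a restart, rebuild the inactive-node map from the exclude list.
    // Only the host is known, so every entry gets the sentinel port.
    static Map<NodeId, String> seedInactiveNodes(List<String> excludedHosts) {
        Map<NodeId, String> inactive = new ConcurrentHashMap<>();
        for (String host : excludedHosts) {
            inactive.put(new NodeId(host, UNKNOWN_PORT), "DECOMMISSIONED");
        }
        return inactive;
    }

    // What the web UI could render in the port column.
    static String displayPort(NodeId id) {
        return id.port == UNKNOWN_PORT ? "N/A" : Integer.toString(id.port);
    }

    // Simplified host-only validity check: the port plays no role, so the
    // sentinel port cannot change whether a node is allowed to rejoin.
    static boolean isValidNode(String host, List<String> excludedHosts) {
        return !excludedHosts.contains(host);
    }
}
```

Because the map key includes the port, a node that later registers with its 
real port would get a different key than the placeholder entry, so an actual 
change would also have to remove or replace the placeholder at that point.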

> Decommisioned Nodes not listed in Web UI
> ----------------------------------------
>
>                 Key: YARN-3102
>                 URL: https://issues.apache.org/jira/browse/YARN-3102
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>         Environment: 2 Node Manager and 1 Resource Manager 
>            Reporter: Bibin A Chundatt
>            Assignee: Kuhu Shukla
>            Priority: Minor
>
> Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to point 
> to the yarn.exclude file on the RM1 machine.
> Add the NM1 host name to yarn.exclude.
> Start the nodes as listed: NM1, NM2, Resource Manager.
> Now check the decommissioned nodes in /cluster/nodes.
> The number of decommissioned nodes is listed as 1, but the table in 
> /cluster/nodes/decommissioned is empty (details of the decommissioned node 
> are not shown).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
