Marouane RAJI created YARN-9506:
-----------------------------------

             Summary: Node Managers fail to update cached IP entries of 
Resource Managers 
                 Key: YARN-9506
                 URL: https://issues.apache.org/jira/browse/YARN-9506
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.7.1
            Reporter: Marouane RAJI
         Attachments: NM_logs.txt

Hi,

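We are running a YARN cluster (for Samza jobs) on AWS in HA mode, with 
yarn.resourcemanager.ha.automatic-failover.enabled=true. When a Resource 
Manager is replaced by a new instance that has a new IP but the same DNS 
name, the Node Managers keep trying to reach it at the old IP.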

To reproduce the issue:
 # Have a running cluster with 2 Node Managers and 2 Resource Managers in HA 
mode, with failover enabled.
 ** The Resource Managers need to have DNS entries defined, and those host 
names must be set in the config:
 *** e.g. yarnrm1.me.local and yarnrm2.me.local
 # Stop the active Resource Manager (yarnrm1.me.local) and retire its 
instance. (The Node Managers will fall back to the standby, yarnrm2.me.local.)
 # Provision a new Resource Manager with a new IP. Make sure the DNS entry 
yarnrm1.me.local is assigned to it.
 # Stop the now-active Resource Manager (yarnrm2.me.local).
 # Check the Node Manager logs: the Node Managers fail to reach the newly 
provisioned Resource Manager because they still try to connect to it through 
the old IP.
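
To illustrate the kind of staleness the steps above describe, here is a minimal Java sketch. It is not taken from the Node Manager code, and this report does not establish that this is the actual cause; it only shows one well-known way a client can end up holding an old IP: a resolved java.net.InetSocketAddress keeps the IP it was built with until the caller builds a new one from the host name. The class StaleAddressDemo, the FAKE_DNS table, the helpers resolve and refresh, the 10.0.0.x addresses, and the port 8031 are all hypothetical and exist only for this illustration. The fake table stands in for real DNS so the example runs without a network.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

public class StaleAddressDemo {
    // Hypothetical stand-in for DNS so the demo needs no network access.
    static final Map<String, byte[]> FAKE_DNS = new HashMap<>();

    // Look the host up in the fake table and build a resolved socket address.
    static InetSocketAddress resolve(String host, int port) throws UnknownHostException {
        byte[] ip = FAKE_DNS.get(host);
        if (ip == null) {
            throw new UnknownHostException(host);
        }
        // getByAddress(host, ip) does no lookup; it only pairs the name with the IP.
        return new InetSocketAddress(InetAddress.getByAddress(host, ip), port);
    }

    // Re-resolve by host name instead of reusing the IP stored in the cached object.
    static InetSocketAddress refresh(InetSocketAddress cached) throws UnknownHostException {
        return resolve(cached.getHostString(), cached.getPort());
    }

    public static void main(String[] args) throws UnknownHostException {
        // Initial state: yarnrm1.me.local points at the original instance.
        FAKE_DNS.put("yarnrm1.me.local", new byte[] {10, 0, 0, 1});
        InetSocketAddress cached = resolve("yarnrm1.me.local", 8031);

        // The instance is replaced and the DNS entry now points at a new IP.
        FAKE_DNS.put("yarnrm1.me.local", new byte[] {10, 0, 0, 2});

        // The cached object still carries the old IP; only a fresh lookup sees the new one.
        System.out.println("cached:    " + cached.getAddress().getHostAddress());
        System.out.println("refreshed: " + refresh(cached).getAddress().getHostAddress());
    }
}
```

Running main prints 10.0.0.1 for the cached address and 10.0.0.2 for the refreshed one. If the Node Manager behaves in a comparable way, re-resolving the host name on connection failure would be one possible direction for a fix, but that is an assumption to be checked against the attached logs and the actual client code.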

I can provide the config files (yarn-site.xml and core-site.xml) if needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
