Yesha Vora created YARN-8409:
--------------------------------

             Summary: ActiveStandbyElectorBasedElectorService is failing with NPE
                 Key: YARN-8409
                 URL: https://issues.apache.org/jira/browse/YARN-8409
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.1.1
            Reporter: Yesha Vora


In an RM HA environment, kill the ZooKeeper leader and then perform an RM failover.

Sometimes the active RM hits an NPE and fails to come up successfully:
{code:java}
2018-06-08 10:31:03,007 INFO  client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(289)) - Client will use GSSAPI as SASL mechanism.
2018-06-08 10:31:03,008 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server xxx/xxx:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2018-06-08 10:31:03,009 WARN  zookeeper.ClientCnxn (ClientCnxn.java:run(1146)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2018-06-08 10:31:03,344 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService failed in state INITED
java.lang.NullPointerException
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1033)
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1030)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1087)
    at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:1030)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:347)
    at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.serviceInit(ActiveStandbyElectorBasedElectorService.java:110)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:336)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1479)
2018-06-08 10:31:03,345 INFO  ha.ActiveStandbyElector (ActiveStandbyElector.java:quitElection(409)) - Yielding from election
{code}
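
In the trace, the NPE escapes from the anonymous action that createWithRetries passes to zkDoWithRetries, shortly after the ZooKeeper connection was refused. One plausible reading, which this report does not confirm, is that the ZooKeeper client reference was still null when the action ran. The Java sketch below only illustrates that general pattern. Every name in it (ElectorNpeSketch, ZkLikeClient, doWithRetries, the "/example-parent" path, and the choice of IllegalStateException as the retriable error) is hypothetical and is not taken from the Hadoop source.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical illustration only; this is not the Hadoop ActiveStandbyElector code. */
public class ElectorNpeSketch {

    /** Hypothetical stand-in for a ZooKeeper client handle. */
    public interface ZkLikeClient {
        String create(String path);
    }

    // Assumption: the handle is only assigned once a connection succeeds,
    // so it stays null when the server refuses the connection.
    private ZkLikeClient client;

    /** Simulates a successful connection by installing a trivial client. */
    public void connect() {
        client = path -> path;
    }

    /**
     * Simplified retry loop. Only IllegalStateException is treated as
     * retriable here (an assumption), so a NullPointerException is never
     * retried and propagates straight to the caller.
     */
    public static <T> T doWithRetries(Callable<T> action, int maxRetries) throws Exception {
        int attempt = 0;
        while (true) {
            try {
                return action.call();
            } catch (IllegalStateException e) {
                if (++attempt > maxRetries) {
                    throw e;
                }
            }
        }
    }

    /** Dereferences the handle without checking it: throws NPE when not connected. */
    public String createUnguarded(String path) throws Exception {
        return doWithRetries(() -> client.create(path), 3);
    }

    /** Checks the handle first and reports a descriptive error instead of an NPE. */
    public String createGuarded(String path) throws Exception {
        return doWithRetries(() -> {
            if (client == null) {
                throw new IOException("ZooKeeper client is not connected");
            }
            return client.create(path);
        }, 3);
    }

    public static void main(String[] args) throws Exception {
        ElectorNpeSketch sketch = new ElectorNpeSketch();
        try {
            sketch.createUnguarded("/example-parent");
        } catch (NullPointerException e) {
            System.out.println("unguarded call failed with NullPointerException");
        }
        try {
            sketch.createGuarded("/example-parent");
        } catch (IOException e) {
            System.out.println("guarded call failed: " + e.getMessage());
        }
        sketch.connect();
        System.out.println("after connect: " + sketch.createGuarded("/example-parent"));
    }
}
```

Whether a null check, a reconnect, or a different ordering of initialization is the appropriate fix for ActiveStandbyElector is left open here; the sketch only shows why a bare NPE carries less diagnostic information than an explicit "not connected" error.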



