Harsh J created HADOOP-13515: -------------------------------- Summary: Redundant transitionToActive call can cause a NameNode to crash Key: HADOOP-13515 URL: https://issues.apache.org/jira/browse/HADOOP-13515 Project: Hadoop Common Issue Type: Bug Components: ha Affects Versions: 2.5.0 Reporter: Harsh J Priority: Minor
The situation in parts is similar to HADOOP-8217, but the cause is different and so is the result. Consider this situation: - At the beginning NN1 is Active, NN2 is Standby - ZKFC1 faces a ZK disconnect (not a session timeout, just a socket disconnect) and thereby reconnects {code} 2016-08-11 07:00:46,068 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 4000ms for sessionid 0x4566f0c97500bd9, closing socket connection and attempting reconnect 2016-08-11 07:00:46,169 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode... … 2016-08-11 07:00:46,610 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected. {code} - The reconnection on the ZKFC1 triggers the elector code, and the elector re-run finds that NN1 should be the new active (a redundant decision cause NN1 is already active) {code} 2016-08-11 07:00:46,615 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced... 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: … 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old node has our own data, so don't need to fence it. {code} - The ZKFC1 sets the new ZK data, and fires a NN1 RPC call of transitionToActive {code} 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/nameservice1/ActiveBreadCrumb to indicate that the local node is the most recent active... 2016-08-11 07:00:46,649 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 175: Call -> nn01/10.10.10.10:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }} {code} - At the same time as the transitionToActive call is in progress at NN1, but not complete yet, the ZK session of ZKFC1 is timed out by ZK Quorum, and a watch notification is sent to ZKFC2 {code} 2016-08-11 07:01:00,003 DEBUG org.apache.zookeeper.ClientCnxn: Got notification sessionid:0x4566f0c97500bde 2016-08-11 07:01:00,004 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent state:SyncConnected type:NodeDeleted path:/hadoop-ha/nameservice1/ActiveStandbyElectorLock for sessionid 0x4566f0c97500bde {code} - ZKFC2 responds by marking NN2 as standby, which succeeds (NN hasn't handled transitionToActive call yet due to busy status, but has handled transitionToStandby before it) {code} 2016-08-11 07:01:00,013 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced... 2016-08-11 07:01:00,018 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at nn01/10.10.10.10:8022 2016-08-11 07:01:00,020 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn01/10.10.10.10:8022: transitionToStandby {reqInfo { reqSource: REQUEST_BY_ZKFC }} 2016-08-11 07:01:03,880 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: transitionToStandby took 3860ms {code} - ZKFC2 then marks NN2 as active, and NN2 begins its transition (is in midst of it, not done yet at this point) {code} 2016-08-11 07:01:03,894 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at nn02/11.11.11.11:8022 active... 2016-08-11 07:01:03,895 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn02/11.11.11.11:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }} … {code} {code} 2016-08-11 07:01:09,558 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state … 2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 {code} - At the same time in parallel NN1 processes the transitionToActive requests finally, and becomes active {code} 2016-08-11 07:01:13,281 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state … 2016-08-11 07:01:19,599 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 … 2016-08-11 07:01:19,602 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 5635 {code} - NN2's active transition fails as a result of this parallel active transition on NN1 which has completed right before it tries to take over {code} 2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 2016-08-11 07:01:22,799 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately. java.lang.IllegalStateException: Cannot start writing at txid 5635 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@53feee4f at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:312) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1174) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1654) at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61) at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64) at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49) at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1527) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1302) at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107) at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) {code} (Reflects as a failure on the ZKFC2 too): {code} 2016-08-11 07:01:31,598 DEBUG org.apache.hadoop.ipc.Client: closing ipc connection to nn02/11.11.11.11:8022: null java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950) 2016-08-11 07:01:31,626 FATAL org.apache.hadoop.ha.ZKFailoverController: Couldn't make NameNode at nn02/11.11.11.11:8022 active java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "nn02/11.11.11.11"; destination host is: "nn02":8022; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1415) at org.apache.hadoop.ipc.Client.call(Client.java:1364) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy15.transitionToActive(Unknown Source) at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToActive(HAServiceProtocolClientSideTranslatorPB.java:100) at org.apache.hadoop.ha.HAServiceProtocolHelper.transitionToActive(HAServiceProtocolHelper.java:48) at org.apache.hadoop.ha.ZKFailoverController.becomeActive(ZKFailoverController.java:377) at org.apache.hadoop.ha.ZKFailoverController.access$900(ZKFailoverController.java:60) at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.becomeActive(ZKFailoverController.java:868) at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804) at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950) 2016-08-11 07:01:31,665 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election org.apache.hadoop.ha.ServiceFailedException: Couldn't transition to active {code} We should perhaps avoid sending an active RPC if the NN is already active, by checking the service status before firing the transition RPC inside ZKFailoverController. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org