Vincent Bernat created ZOOKEEPER-1738:
-----------------------------------------
Summary: Xid out of order from a 3.4.5 client to a 3.3.5 cluster
Key: ZOOKEEPER-1738
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1738
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.3.5
Environment: Server: zookeeper 3.3.5+dfsg1-1ubuntu1
Client: zookeeper 3.4.5 from Cloudera 4.3.0
Reporter: Vincent Bernat
Fix For: 3.4.5
This happens when HBase master nodes receive connections from HBase region
servers. As soon as an HBase region server joins the cluster, the master logs
the following error and aborts:
{code}
2013-08-07 13:35:18,676 WARN org.apache.zookeeper.ClientCnxn: Session
0xd4058c4d7940003 for server zk-01.dev.dailymotion.com/10.194.60.13:2181,
unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Xid out of order. Got Xid 56 with err -101 expected Xid 55
for a packet with details: clientPath:null serverPath:null finished:false
header:: 55,14 replyHeader:: 0,0,-4 request::
org.apache.zookeeper.MultiTransactionRecord@360193e5 response::
org.apache.zookeeper.MultiResponse@0
at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:795)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:94)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-07 13:35:18,676 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss
2013-08-07 13:35:18,676 ERROR
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper multi failed
after 3 retries
2013-08-07 13:35:18,677 ERROR org.apache.hadoop.hbase.master.AssignmentManager:
Unable to ensure that the table -ROOT- will be enabled because of a ZooKeeper
issue
2013-08-07 13:35:18,677 FATAL org.apache.hadoop.hbase.master.HMaster: Master
server abort: loaded coprocessors are: []
2013-08-07 13:35:18,677 FATAL org.apache.hadoop.hbase.master.HMaster: Unable to
ensure that the table -ROOT- will be enabled because of a ZooKeeper issue
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:531)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1440)
at org.apache.hadoop.hbase.zookeeper.ZKTable.setTableState(ZKTable.java:245)
at org.apache.hadoop.hbase.zookeeper.ZKTable.setEnabledTable(ZKTable.java:325)
at org.apache.hadoop.hbase.master.AssignmentManager.setEnabledTable(AssignmentManager.java:3576)
at org.apache.hadoop.hbase.master.AssignmentManager.setEnabledTable(AssignmentManager.java:2340)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1674)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
at org.apache.hadoop.hbase.master.AssignmentManager.addToRITandCallClose(AssignmentManager.java:675)
at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:586)
at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:525)
at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransitionAndBlockUntilAssigned(AssignmentManager.java:489)
at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:679)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:583)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:395)
at java.lang.Thread.run(Thread.java:722)
2013-08-07 13:35:18,678 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2013-08-07 13:35:18,678 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Server stopped; skipping assign of -ROOT-,,0.70236052 state=OFFLINE,
ts=1375881792131, server=null
2013-08-07 13:35:18,678 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Waiting on 70236052/-ROOT-
2013-08-07 13:35:18,678 INFO
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
masternode-01.dev.dailymotion.com,60000,1375880747185.timeoutMonitor exiting
2013-08-07 13:35:18,679 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Handling transition=M_ZK_REGION_OFFLINE,
server=masternode-01.dev.dailymotion.com,60000,1375880747185,
region=70236052/-ROOT-, which is more than 15 seconds late
2013-08-07 13:35:18,776 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/root-region-server
2013-08-07 13:35:18,776 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase
2013-08-07 13:35:18,776 ERROR
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper getData
failed after 3 retries
2013-08-07 13:35:18,777 INFO org.apache.hadoop.hbase.util.RetryCounter:
Sleeping 2000ms before retry #1...
{code}
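As an aid to reading the first exception in the log, here is a minimal sketch that decodes its numeric fields. It assumes the codes follow ZooKeeper's public constants (request type 14 is ZooDefs.OpCode.multi; error -101 is KeeperException.Code.NONODE, -4 is CONNECTIONLOSS, -6 is UNIMPLEMENTED). The class XidLogDecoder and its methods are hypothetical helpers written for this illustration and are not part of ZooKeeper. Note that multi was introduced in ZooKeeper 3.4.0, which may be relevant given that the servers here run 3.3.5.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decodes the numbers in an "Xid out of order" log line.
public class XidLogDecoder {
    // Matches the reply xid, the reply error code and the xid the client was waiting for.
    private static final Pattern XID_LINE = Pattern.compile(
            "Got Xid (-?\\d+) with err (-?\\d+) expected Xid (-?\\d+)");
    // Matches the pending request header "xid,type" (case-sensitive, so "replyHeader::" is skipped).
    private static final Pattern HEADER = Pattern.compile("header:: (-?\\d+),(-?\\d+)");

    // A subset of ZooDefs.OpCode values; anything else is reported as unknown.
    static String opName(int type) {
        switch (type) {
            case 1: return "create";
            case 2: return "delete";
            case 3: return "exists";
            case 4: return "getData";
            case 5: return "setData";
            case 13: return "check";
            case 14: return "multi";
            default: return "unknown";
        }
    }

    // A subset of KeeperException.Code values; anything else is reported as unknown.
    static String errName(int err) {
        switch (err) {
            case 0: return "OK";
            case -4: return "CONNECTIONLOSS";
            case -6: return "UNIMPLEMENTED";
            case -101: return "NONODE";
            default: return "unknown";
        }
    }

    // Returns a one-line summary, or a fixed message if the line does not match.
    static String describe(String logLine) {
        Matcher xid = XID_LINE.matcher(logLine);
        Matcher hdr = HEADER.matcher(logLine);
        if (!xid.find() || !hdr.find()) {
            return "no Xid mismatch found";
        }
        int gotXid = Integer.parseInt(xid.group(1));
        int err = Integer.parseInt(xid.group(2));
        int expectedXid = Integer.parseInt(xid.group(3));
        int type = Integer.parseInt(hdr.group(2));
        return "got xid=" + gotXid + " err=" + err + " (" + errName(err) + ")"
                + ", expected xid=" + expectedXid + " op=" + type + " (" + opName(type) + ")";
    }
}
```

Applied to the exception above (with the wrapped log line joined back into one string), describe reports that the client was still waiting for the reply to a multi request with xid 55 when a NONODE reply for xid 56 arrived.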