Dear HBase mailing list,

We got following error all the time once after the HDFS is reformatted.
=====================================================================
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2293) at org.apache.hadoop.hbase.master.MasterRpcServices.getTableNames(MasterRpcServices.java:900) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55650)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
=====================================================================

The configuration is the following:

hadoop: 2.6.5
hbase: 1.2.4
hbase use its own zookeeper: 3.4.6

All regionservers work fine while the master could not complete initlization. After checking some articles from online, it seems that there are some problems in zookeeper.
I attach the log from master and zookeeper in the following:

hbase-hadoop-master.log
=============================================================================
2017-05-07 23:36:10,551 INFO [atbara:16000.activeMasterManager] zookeeper.MetaTableLocator: Deleting hbase:meta region location in ZooKeeper 2017-05-07 23:36:10,573 INFO [atbara:16000.activeMasterManager] master.AssignmentManager: Setting node as OFFLINED in ZooKeeper for region {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''} 2017-05-07 23:36:10,586 INFO [atbara:16000.activeMasterManager] zookeeper.ZKTableStateManager: Moving table hbase:meta state from null to ENABLED 2017-05-07 23:36:10,599 INFO [atbara:16000.activeMasterManager] master.AssignmentManager: Assigning hbase:meta,,1.1588230740 to hnode4.informatik.uni-kl.de,16020,1494192962751 2017-05-07 23:36:10,599 INFO [atbara:16000.activeMasterManager] master.RegionStates: Transition {1588230740 state=OFFLINE, ts=1494192970573, server=null} to {1588230740 state=PENDING_OPEN, ts=1494192970599, server=hnode4.informatik.uni-kl.de,16020,1494192962751} 2017-05-07 23:36:10,808 INFO [atbara:16000.activeMasterManager] master.ServerManager: AssignmentManager hasn't finished failover cleanup; waiting 2017-05-07 23:36:10,842 INFO [AM.ZK.Worker-pool2-t1] master.RegionStates: Transition {1588230740 state=PENDING_OPEN, ts=1494192970599, server=hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740 state=OPENING, ts=1494192970842, server=hnode4.informatik.uni-kl.de,16020,1494192962751} 2017-05-07 23:36:11,201 INFO [AM.ZK.Worker-pool2-t2] master.RegionStates: Transition {1588230740 state=OPENING, ts=1494192970842, server=hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740 state=OPEN, ts=1494192971201, server=hnode4.informatik.uni-kl.de,16020,1494192962751} 2017-05-07 23:36:11,204 INFO [AM.ZK.Worker-pool2-t2] coordination.ZkOpenRegionCoordination: Handling OPENED of 1588230740 from atbara.informatik.uni-kl.de,16000,1494192960469; deleting unassigned node 2017-05-07 23:36:11,208 INFO [atbara:16000.activeMasterManager] master.HMaster: hbase:meta with replicaId 0 assigned=1, rit=false, location=hnode4.informatik.uni-kl.de,16020,1494192962751 2017-05-07 23:36:49,727 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=10, retries=350, started=38384 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:36:59,791 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=11, retries=350, started=48448 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:37:19,808 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=12, retries=350, started=68465 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:37:39,885 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88542 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:38:00,059 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=14, retries=350, started=108716 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:38:20,113 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=15, retries=350, started=128770 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:38:40,166 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=16, retries=350, started=148823 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:39:00,279 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=17, retries=350, started=168936 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:39:20,448 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=18, retries=350, started=189105 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:39:40,566 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=19, retries=350, started=209223 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:40:00,770 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=20, retries=350, started=229427 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:40:20,900 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=21, retries=350, started=249557 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:40:40,962 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=22, retries=350, started=269619 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:41:01,020 INFO [hconnection-0x47c57609-shared--pool3-t2] client.RpcRetryingCaller: Call exception, tries=23, retries=350, started=289677 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0 2017-05-07 23:41:04,765 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=3.24 MB, freeSize=3.09 GB, max=3.09 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=29, evicted=0, evictedPerRun=0.0
==============================================================

hbase-hadoop-zookeeper.log
==============================================================
2017-05-07 23:36:04,977 INFO [SyncThread:0] server.ZooKeeperServer: Established session 0x15be4d72bfa000c with negotiated timeout 90000 for client /192.168.18.8:38830 2017-05-07 23:36:05,070 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0000 type:create cxid:0x2c zxid:0x1c txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc Error:KeeperErrorCode = NoNode for /hbase/flush-table-proc 2017-05-07 23:36:05,091 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0000 type:create cxid:0x36 zxid:0x21 txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot Error:KeeperErrorCode = NoNode for /hbase/online-snapshot 2017-05-07 23:36:05,153 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0003 type:create cxid:0x7 zxid:0x26 txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired 2017-05-07 23:36:05,154 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0006 type:create cxid:0x7 zxid:0x27 txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired 2017-05-07 23:36:05,155 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0005 type:create cxid:0x7 zxid:0x28 txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired 2017-05-07 23:36:05,158 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0002 type:create cxid:0x7 zxid:0x29 txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired 2017-05-07 23:36:05,173 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0003 type:create cxid:0xa zxid:0x2a txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired 2017-05-07 23:36:05,173 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0006 type:create cxid:0xa zxid:0x2b txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired 2017-05-07 23:36:05,177 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0005 type:create cxid:0xa zxid:0x2c txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired 2017-05-07 23:36:05,180 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0002 type:create cxid:0xa zxid:0x2d txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired 2017-05-07 23:36:05,184 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.18.8:38832 2017-05-07 23:36:05,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.ZooKeeperServer: Client attempting to establish new session at /192.168.18.8:38832 2017-05-07 23:36:05,190 INFO [SyncThread:0] server.ZooKeeperServer: Established session 0x15be4d72bfa000d with negotiated timeout 90000 for client /192.168.18.8:38832 2017-05-07 23:36:05,192 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa000d type:create cxid:0x1 zxid:0x2f txntype:-1 reqpath:n/a Error Path:/hbase/replication Error:KeeperErrorCode = NoNode for /hbase/replication 2017-05-07 23:36:05,202 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0004 type:create cxid:0x7 zxid:0x32 txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired 2017-05-07 23:36:05,212 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0004 type:create cxid:0xa zxid:0x33 txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired 2017-05-07 23:36:05,743 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0002 type:create cxid:0x12 zxid:0x3e txntype:-1 reqpath:n/a Error Path:/hbase/replication/peers Error:KeeperErrorCode = NodeExists for /hbase/replication/peers 2017-05-07 23:36:10,552 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0000 type:delete cxid:0x6e zxid:0x40 txntype:-1 reqpath:n/a Error Path:/hbase/meta-region-server Error:KeeperErrorCode = NoNode for /hbase/meta-region-server 2017-05-07 23:36:11,194 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x15be4d72bfa0006 type:setData cxid:0x29 zxid:0x45 txntype:-1 reqpath:n/a Error Path:/hbase/meta-region-server Error:KeeperErrorCode = NoNode for /hbase/meta-region-server 2017-05-07 23:39:33,056 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.18.8:38858 2017-05-07 23:39:33,061 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.ZooKeeperServer: Client attempting to establish new session at /192.168.18.8:38858 2017-05-07 23:39:33,063 INFO [SyncThread:0] server.ZooKeeperServer: Established session 0x15be4d72bfa000e with negotiated timeout 90000 for client /192.168.18.8:38858 2017-05-07 23:44:18,506 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.NIOServerCnxn: caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x15be4d72bfa000e, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2017-05-07 23:44:18,510 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222] server.NIOServerCnxn: Closed socket connection for client /192.168.18.8:38858 which had sessionid 0x15be4d72bfa000e 2017-05-07 23:45:36,000 INFO [SessionTracker] server.ZooKeeperServer: Expiring session 0x15be4d72bfa000e, timeout of 90000ms exceeded 2017-05-07 23:45:36,001 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Processed session termination for sessionid: 0x15be4d72bfa000e
======================================================================

I tried the following:
1. wipe hadoop tmp dir, data dir across all nodes and then hadoop namenode -format
2. wipe zookeeper.property.dataDir set in hbase-site.xml

still not working..
Thank you for your help in advance,
Best,
Weiping

Reply via email to