Dear HBase mailing list,
We got following error all the time once after the HDFS is reformatted.
=====================================================================
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
at
org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2293)
at
org.apache.hadoop.hbase.master.MasterRpcServices.getTableNames(MasterRpcServices.java:900)
at
org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55650)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
=====================================================================
The configuration is the following:
hadoop: 2.6.5
hbase: 1.2.4
hbase use its own zookeeper: 3.4.6
All regionservers work fine while the master could not complete
initlization. After checking some articles from online, it seems that
there are some problems in zookeeper.
I attach the log from master and zookeeper in the following:
hbase-hadoop-master.log
=============================================================================
2017-05-07 23:36:10,551 INFO [atbara:16000.activeMasterManager]
zookeeper.MetaTableLocator: Deleting hbase:meta region location in ZooKeeper
2017-05-07 23:36:10,573 INFO [atbara:16000.activeMasterManager]
master.AssignmentManager: Setting node as OFFLINED in ZooKeeper for
region {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '',
ENDKEY => ''}
2017-05-07 23:36:10,586 INFO [atbara:16000.activeMasterManager]
zookeeper.ZKTableStateManager: Moving table hbase:meta state from null
to ENABLED
2017-05-07 23:36:10,599 INFO [atbara:16000.activeMasterManager]
master.AssignmentManager: Assigning hbase:meta,,1.1588230740 to
hnode4.informatik.uni-kl.de,16020,1494192962751
2017-05-07 23:36:10,599 INFO [atbara:16000.activeMasterManager]
master.RegionStates: Transition {1588230740 state=OFFLINE,
ts=1494192970573, server=null} to {1588230740 state=PENDING_OPEN,
ts=1494192970599, server=hnode4.informatik.uni-kl.de,16020,1494192962751}
2017-05-07 23:36:10,808 INFO [atbara:16000.activeMasterManager]
master.ServerManager: AssignmentManager hasn't finished failover
cleanup; waiting
2017-05-07 23:36:10,842 INFO [AM.ZK.Worker-pool2-t1]
master.RegionStates: Transition {1588230740 state=PENDING_OPEN,
ts=1494192970599,
server=hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740
state=OPENING, ts=1494192970842,
server=hnode4.informatik.uni-kl.de,16020,1494192962751}
2017-05-07 23:36:11,201 INFO [AM.ZK.Worker-pool2-t2]
master.RegionStates: Transition {1588230740 state=OPENING,
ts=1494192970842,
server=hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740
state=OPEN, ts=1494192971201,
server=hnode4.informatik.uni-kl.de,16020,1494192962751}
2017-05-07 23:36:11,204 INFO [AM.ZK.Worker-pool2-t2]
coordination.ZkOpenRegionCoordination: Handling OPENED of 1588230740
from atbara.informatik.uni-kl.de,16000,1494192960469; deleting
unassigned node
2017-05-07 23:36:11,208 INFO [atbara:16000.activeMasterManager]
master.HMaster: hbase:meta with replicaId 0 assigned=1, rit=false,
location=hnode4.informatik.uni-kl.de,16020,1494192962751
2017-05-07 23:36:49,727 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=10, retries=350,
started=38384 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:36:59,791 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=11, retries=350,
started=48448 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:37:19,808 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=12, retries=350,
started=68465 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:37:39,885 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=13, retries=350,
started=88542 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:38:00,059 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=14, retries=350,
started=108716 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:38:20,113 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=15, retries=350,
started=128770 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:38:40,166 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=16, retries=350,
started=148823 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:39:00,279 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=17, retries=350,
started=168936 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:39:20,448 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=18, retries=350,
started=189105 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:39:40,566 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=19, retries=350,
started=209223 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:40:00,770 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=20, retries=350,
started=229427 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:40:20,900 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=21, retries=350,
started=249557 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:40:40,962 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=22, retries=350,
started=269619 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:41:01,020 INFO [hconnection-0x47c57609-shared--pool3-t2]
client.RpcRetryingCaller: Call exception, tries=23, retries=350,
started=289677 ms ago, cancelled=false, msg=row '' on table 'hbase:meta'
at region=hbase:meta,,1.1588230740,
hostname=hnode4.informatik.uni-kl.de,16020,1494192962751, seqNum=0
2017-05-07 23:41:04,765 INFO [LruBlockCacheStatsExecutor]
hfile.LruBlockCache: totalSize=3.24 MB, freeSize=3.09 GB, max=3.09 GB,
blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0,
cachingHits=0, cachingHitsRatio=0,evictions=29, evicted=0, evictedPerRun=0.0
==============================================================
hbase-hadoop-zookeeper.log
==============================================================
2017-05-07 23:36:04,977 INFO [SyncThread:0] server.ZooKeeperServer:
Established session 0x15be4d72bfa000c with negotiated timeout 90000 for
client /192.168.18.8:38830
2017-05-07 23:36:05,070 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0000 type:create cxid:0x2c zxid:0x1c
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc
Error:KeeperErrorCode = NoNode for /hbase/flush-table-proc
2017-05-07 23:36:05,091 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0000 type:create cxid:0x36 zxid:0x21
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot
Error:KeeperErrorCode = NoNode for /hbase/online-snapshot
2017-05-07 23:36:05,153 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0003 type:create cxid:0x7 zxid:0x26
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
2017-05-07 23:36:05,154 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0006 type:create cxid:0x7 zxid:0x27
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
2017-05-07 23:36:05,155 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0005 type:create cxid:0x7 zxid:0x28
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
2017-05-07 23:36:05,158 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0002 type:create cxid:0x7 zxid:0x29
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
2017-05-07 23:36:05,173 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0003 type:create cxid:0xa zxid:0x2a
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired
Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2017-05-07 23:36:05,173 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0006 type:create cxid:0xa zxid:0x2b
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired
Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2017-05-07 23:36:05,177 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0005 type:create cxid:0xa zxid:0x2c
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired
Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2017-05-07 23:36:05,180 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0002 type:create cxid:0xa zxid:0x2d
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired
Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2017-05-07 23:36:05,184 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.NIOServerCnxnFactory: Accepted socket connection from
/192.168.18.8:38832
2017-05-07 23:36:05,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.ZooKeeperServer: Client attempting to establish new session at
/192.168.18.8:38832
2017-05-07 23:36:05,190 INFO [SyncThread:0] server.ZooKeeperServer:
Established session 0x15be4d72bfa000d with negotiated timeout 90000 for
client /192.168.18.8:38832
2017-05-07 23:36:05,192 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa000d type:create cxid:0x1 zxid:0x2f
txntype:-1 reqpath:n/a Error Path:/hbase/replication
Error:KeeperErrorCode = NoNode for /hbase/replication
2017-05-07 23:36:05,202 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0004 type:create cxid:0x7 zxid:0x32
txntype:-1 reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
2017-05-07 23:36:05,212 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0004 type:create cxid:0xa zxid:0x33
txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired
Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2017-05-07 23:36:05,743 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0002 type:create cxid:0x12 zxid:0x3e
txntype:-1 reqpath:n/a Error Path:/hbase/replication/peers
Error:KeeperErrorCode = NodeExists for /hbase/replication/peers
2017-05-07 23:36:10,552 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0000 type:delete cxid:0x6e zxid:0x40
txntype:-1 reqpath:n/a Error Path:/hbase/meta-region-server
Error:KeeperErrorCode = NoNode for /hbase/meta-region-server
2017-05-07 23:36:11,194 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Got user-level KeeperException when
processing sessionid:0x15be4d72bfa0006 type:setData cxid:0x29 zxid:0x45
txntype:-1 reqpath:n/a Error Path:/hbase/meta-region-server
Error:KeeperErrorCode = NoNode for /hbase/meta-region-server
2017-05-07 23:39:33,056 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.NIOServerCnxnFactory: Accepted socket connection from
/192.168.18.8:38858
2017-05-07 23:39:33,061 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.ZooKeeperServer: Client attempting to establish new session at
/192.168.18.8:38858
2017-05-07 23:39:33,063 INFO [SyncThread:0] server.ZooKeeperServer:
Established session 0x15be4d72bfa000e with negotiated timeout 90000 for
client /192.168.18.8:38858
2017-05-07 23:44:18,506 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client
sessionid 0x15be4d72bfa000e, likely client has closed socket
at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-05-07 23:44:18,510 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
server.NIOServerCnxn: Closed socket connection for client
/192.168.18.8:38858 which had sessionid 0x15be4d72bfa000e
2017-05-07 23:45:36,000 INFO [SessionTracker] server.ZooKeeperServer:
Expiring session 0x15be4d72bfa000e, timeout of 90000ms exceeded
2017-05-07 23:45:36,001 INFO [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor: Processed session termination for
sessionid: 0x15be4d72bfa000e
======================================================================
I tried the following:
1. wipe hadoop tmp dir, data dir across all nodes and then hadoop
namenode -format
2. wipe zookeeper.property.dataDir set in hbase-site.xml
still not working..
Thank you for your help in advance,
Best,
Weiping