Hi,

The HMaster died, and so did the region servers. The HMaster's log is
below. Could you please help me find out what the problem is?


2012-10-12 00:14:19,444 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to bj-ecsxhm4f3I-r3-5-r810-2-hbase-stor-3/
10.20.16.34:2181, initiating session
2012-10-12 00:14:19,520 INFO org.apache.zookeeper.ClientCnxn: Session
establishment complete on server bj-ecsxhm4f3I-r3-5-r810-2-hbase-stor-3/
10.20.16.34:2181, sessionid = 0x139c539bc090002, negotiated timeout = 40000
2012-10-12 00:14:23,738 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 15046ms for sessionid
0x239c539ba630001, closing socket connection and attempting reconnect
2012-10-12 00:14:24,246 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181
2012-10-12 00:14:25,173 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 15245ms for sessionid
0x139c539bc090003, closing socket connection and attempting reconnect
2012-10-12 00:14:25,328 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181
2012-10-12 00:14:25,328 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181, initiating session
2012-10-12 00:14:25,507 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2012-10-12 00:14:25,507 INFO org.apache.zookeeper.ClientCnxn: Unable to
reconnect to ZooKeeper service, session 0x139c539bc090003 has expired,
closing socket connection
2012-10-12 00:14:27,247 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181, initiating session
2012-10-12 00:14:27,248 WARN org.apache.zookeeper.ClientCnxn: Session
0x239c539ba630001 for server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181, unexpected error, closing socket connection and
attempting reconnect
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218)
    at sun.nio.ch.IOUtil.read(IOUtil.java:186)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:859)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1157)
2012-10-12 00:14:28,026 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-2-hbase-stor-3/
10.20.16.34:2181
2012-10-12 00:14:41,359 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 14007ms for sessionid
0x239c539ba630001, closing socket connection and attempting reconnect
2012-10-12 00:14:41,592 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-4-hbase-stor-1/
10.20.16.32:2181
2012-10-12 00:14:46,186 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 26666ms for sessionid
0x139c539bc090002, closing socket connection and attempting reconnect
2012-10-12 00:14:46,572 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181
2012-10-12 00:14:46,572 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181, initiating session
2012-10-12 00:14:46,726 INFO org.apache.zookeeper.ClientCnxn: Session
establishment complete on server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181, sessionid = 0x139c539bc090002, negotiated timeout = 40000
2012-10-12 00:14:54,925 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 13464ms for sessionid
0x239c539ba630001, closing socket connection and attempting reconnect
2012-10-12 00:14:56,524 ERROR org.apache.hadoop.hbase.master.HMaster:
Region server
serverName=bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2,60020,1347901025673,
load=(requests=75, regions=1, usedHeap=162, maxHeap=9725) reported a fatal
error:
ABORTING region server
serverName=bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2,60020,1347901025673,
load=(requests=75, regions=1, usedHeap=162, maxHeap=9725):
regionserver:60020-0x339c539ba640003 regionserver:60020-0x339c539ba640003
received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:353)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:271)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)

2012-10-12 00:14:56,813 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-3-hbase-stor-2/
10.20.16.33:2181
2012-10-12 00:15:10,147 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 15119ms for sessionid
0x239c539ba630001, closing socket connection and attempting reconnect
2012-10-12 00:15:10,625 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server bj-ecsxhm4f3I-r3-5-r810-2-hbase-stor-3/
10.20.16.34:2181
2012-10-12 00:15:10,625 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to bj-ecsxhm4f3I-r3-5-r810-2-hbase-stor-3/
10.20.16.34:2181, initiating session
2012-10-12 00:15:10,750 INFO org.apache.zookeeper.ClientCnxn: Unable to
reconnect to ZooKeeper service, session 0x239c539ba630001 has expired,
closing socket connection
2012-10-12 00:15:10,750 FATAL org.apache.hadoop.hbase.master.HMaster:
master:60000-0x239c539ba630001 master:60000-0x239c539ba630001 received
expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:353)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:271)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)
2012-10-12 00:15:10,751 INFO org.apache.hadoop.hbase.master.HMaster:
Aborting
2012-10-12 00:15:10,751 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2012-10-12 00:15:11,392 DEBUG org.apache.hadoop.hbase.master.HMaster:
Stopping service threads
2012-10-12 00:15:11,392 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60000
2012-10-12 00:15:11,392 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
bj-ecsxhm4f3I-r3-7-r810-3-hbase-stor-6:60000-CatalogJanitor exiting
2012-10-12 00:15:11,392 INFO org.apache.hadoop.hbase.master.HMaster$2:
bj-ecsxhm4f3I-r3-7-r810-3-hbase-stor-6:60000-BalancerChore exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 0 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 11 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
IPC Server listener on 60000
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.hbase.master.HMaster:
Stopping infoServer
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 20 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 23 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 19 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 25 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 29 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 18 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 15 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
IPC Server Responder
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 16 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 37 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 40 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 41 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 46 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 47 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 50 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 51 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 12 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 14 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 13 on 60000: exiting
2012-10-12 00:15:11,393 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 10 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 59 on 60000: exiting
2012-10-12 00:15:11,395 INFO
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
bj-ecsxhm4f3I-r3-7-r810-3-hbase-stor-6:60000.timeoutMonitor exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 53 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 54 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 58 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 57 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 56 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 55 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 52 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 49 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 48 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 44 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 43 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 45 on 60000: exiting
2012-10-12 00:15:11,395 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 42 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 39 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 38 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 35 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 36 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 34 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 17 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 32 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 33 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 30 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 31 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 28 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 27 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 26 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 24 on 60000: exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.hbase.master.LogCleaner:
master-bj-ecsxhm4f3I-r3-7-r810-3-hbase-stor-6:60000.oldLogCleaner exiting
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 22 on 60000: exiting
2012-10-12 00:15:11,398 INFO
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping
replicationLogCleaner-0x139c539bc090003
2012-10-12 00:15:11,394 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 21 on 60000: exiting
2012-10-12 00:15:11,502 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x239c539ba630001 Unable to get data of znode /hbase/master
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
2012-10-12 00:15:11,502 ERROR
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
master:60000-0x239c539ba630001 Received unexpected KeeperException,
re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
2012-10-12 00:15:11,503 ERROR
org.apache.hadoop.hbase.master.ActiveMasterManager:
master:60000-0x239c539ba630001 Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
2012-10-12 00:15:11,503 DEBUG
org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker
org.apache.hadoop.hbase.catalog.CatalogTracker@36664140
2012-10-12 00:15:11,503 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
The connection to
hconnection-0x139c539bc090002-0x139c539bc090002-0x139c539bc090002 has been
closed.
2012-10-12 00:15:11,503 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
The connection to
hconnection-0x139c539bc090002-0x139c539bc090002-0x139c539bc090002 has been
closed.
2012-10-12 00:15:11,503 INFO org.apache.hadoop.hbase.master.HMaster:
HMaster main thread exiting


Best regards,

beatls
