regionserver goes down on system suspend and does not start back.
------------------------------------------------------------------
Key: HBASE-1674
URL: https://issues.apache.org/jira/browse/HBASE-1674
Project: Hadoop HBase
Issue Type: Bug
Affects Versions: 0.20.0
Environment: ubuntu 9.04
Reporter: Irfan Mohammed
When I suspend my system and resume it, the regionserver does not come back up.
It looks like it actually shuts down completely, while the master and ZooKeeper
resume properly.
stop-hbase.sh does not work properly either: it runs for a long time without
doing anything. I have to kill the master and ZooKeeper processes manually and
then run "start-hbase.sh" to get back to the normal state.
ir...@damascus:~/qw/sandbox_7/qws$ stop-hbase.sh
stopping master....................................................................................................
ir...@damascus:~$ jps
956
11871 JobTracker
1816 HMaster
5908 Launcher
1742 HQuorumPeer
11790 SecondaryNameNode
3352
32390 RunJar
11974 TaskTracker
4656 Child
11673 DataNode
6121 Jps
4669 Child
11568 NameNode
12770 PluginMain
ir...@damascus:~/apps/hbase-latest/logs$ tail -1000f hbase-irfan-regionserver-damascus.log
...
...
...
2009-07-19 11:48:59,538 INFO org.apache.hadoop.hbase.regionserver.HRegion:
region site,,1247899770208/471872655 available; sequence id is 0
2009-07-19 11:48:59,539 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Starting compaction on region site,,1247899770208
2009-07-19 11:48:59,542 INFO org.apache.hadoop.hbase.regionserver.HRegion:
compaction completed on region site,,1247899770208 in 0sec
2009-07-19 11:58:09,369 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: compactions no longer
limited
2009-07-19 12:47:59,493 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll
/hbase/.logs/damascus,60020,1247984279075/hlog.dat.1247984279311,
entries=109890, calcsize=18397518, filesize=12396338. New hlog
/hbase/.logs/damascus,60020,1247984279075/hlog.dat.1247987879487
2009-07-19 15:37:42,291 WARN org.apache.zookeeper.ClientCnxn: Exception closing
session 0x12291a8d2be0001 to sun.nio.ch.selectionkeyi...@1542a75
java.io.IOException: TIMED OUT
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-07-19 15:37:42,292 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
7207717ms, ten times longer than scheduled: 3000
2009-07-19 15:37:42,292 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master
for 7207717 milliseconds - retrying
2009-07-19 15:37:42,294 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
7212721ms, ten times longer than scheduled: 10000
2009-07-19 15:37:42,295 WARN org.apache.zookeeper.ClientCnxn: Exception closing
session 0x12291a8d2be0005 to sun.nio.ch.selectionkeyi...@628704
java.io.IOException: TIMED OUT
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-07-19 15:37:42,296 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream
ResponseProcessor exception for block
blk_5565548861312875890_4766java.net.SocketTimeoutException: 63000 millis
timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/127.0.0.1:15928
remote=/127.0.0.1:50010]
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2369)
2009-07-19 15:37:42,297 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery
for block blk_5565548861312875890_4766 bad datanode[0] 127.0.0.1:50010
2009-07-19 15:37:42,298 FATAL org.apache.hadoop.hbase.regionserver.LogRoller:
Log rolling failed with ioe:
java.io.IOException: All datanodes 127.0.0.1:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2495)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2048)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2211)
2009-07-19 15:37:42,300 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
request=0.0, regions=3, stores=6, storefiles=4, storefileIndexSize=0,
memstoreSize=0, usedHeap=29, maxHeap=996, blockCacheSize=1961680,
blockCacheFree=416131792, blockCacheCount=2, blockCacheHitRatio=99
2009-07-19 15:37:42,300 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
LogRoller exiting.
2009-07-19 15:37:42,300 INFO org.apache.hadoop.hbase.regionserver.LogFlusher:
regionserver/127.0.1.1:60020.logFlusher exiting
2009-07-19 15:37:42,392 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state:
Disconnected, type: None, path: null
2009-07-19 15:37:44,192 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server localhost/127.0.0.1:2181
2009-07-19 15:37:44,193 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:55018
remote=localhost/127.0.0.1:2181]
2009-07-19 15:37:44,193 INFO org.apache.zookeeper.ClientCnxn: Server connection
successful
2009-07-19 15:37:44,197 WARN org.apache.zookeeper.ClientCnxn: Exception closing
session 0x12291a8d2be0005 to sun.nio.ch.selectionkeyi...@118cb72
java.io.IOException: Session Expired
at
org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-07-19 15:37:44,198 INFO org.apache.zookeeper.ZooKeeper: Closing session:
0x12291a8d2be0005
2009-07-19 15:37:44,199 INFO org.apache.zookeeper.ClientCnxn: Closing
ClientCnxn for session: 0x12291a8d2be0005
2009-07-19 15:37:44,199 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
ClientCnxn for session: 0x12291a8d2be0005
2009-07-19 15:37:44,199 INFO org.apache.zookeeper.ZooKeeper: Session:
0x12291a8d2be0005 closed
2009-07-19 15:37:44,200 INFO org.apache.zookeeper.ClientCnxn: EventThread shut
down
2009-07-19 15:37:44,297 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server localhost/127.0.0.1:2181
2009-07-19 15:37:44,297 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:55020
remote=localhost/127.0.0.1:2181]
2009-07-19 15:37:44,297 INFO org.apache.zookeeper.ClientCnxn: Server connection
successful
2009-07-19 15:37:44,298 WARN org.apache.zookeeper.ClientCnxn: Exception closing
session 0x12291a8d2be0001 to sun.nio.ch.selectionkeyi...@d4b411
java.io.IOException: Session Expired
at
org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-07-19 15:37:44,299 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state:
Expired, type: None, path: null
2009-07-19 15:37:45,302 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server
on 60020
2009-07-19 15:37:45,303 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60020: exiting
2009-07-19 15:37:45,303 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2009-07-19 15:37:45,314 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server Responder
2009-07-19 15:37:45,352 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
regionserver/127.0.1.1:60020.cacheFlusher exiting
2009-07-19 15:37:45,353 INFO
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
regionserver/127.0.1.1:60020.compactor exiting
2009-07-19 15:37:45,353 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker:
regionserver/127.0.1.1:60020.majorCompactionChecker exiting
2009-07-19 15:37:45,353 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
2009-07-19 15:37:45,354 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Closed .META.,,1
2009-07-19 15:37:45,354 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Closed site,,1247899770208
2009-07-19 15:37:45,354 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Closed -ROOT-,,0
2009-07-19 15:37:45,354 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
127.0.1.1:60020
2009-07-19 15:37:45,362 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server listener on 60020
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 0 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60020: exiting
2009-07-19 15:37:45,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60020: exiting
2009-07-19 15:37:45,373 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60020: exiting
2009-07-19 15:37:45,373 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1 on 60020: exiting
2009-07-19 15:37:52,295 INFO org.apache.hadoop.hbase.Leases:
regionserver/127.0.1.1:60020.leaseChecker closing leases
2009-07-19 15:37:52,295 INFO org.apache.hadoop.hbase.Leases:
regionserver/127.0.1.1:60020.leaseChecker closed leases
2009-07-19 15:37:52,295 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2009-07-19 15:37:52,295 INFO org.apache.zookeeper.ZooKeeper: Closing session:
0x12291a8d2be0001
2009-07-19 15:37:52,295 INFO org.apache.zookeeper.ClientCnxn: Closing
ClientCnxn for session: 0x12291a8d2be0001
2009-07-19 15:37:52,296 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
ClientCnxn for session: 0x12291a8d2be0001
2009-07-19 15:37:52,296 INFO org.apache.zookeeper.ZooKeeper: Session:
0x12291a8d2be0001 closed
2009-07-19 15:37:52,296 INFO org.apache.zookeeper.ClientCnxn: EventThread shut
down
2009-07-19 15:37:52,398 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
regionserver/127.0.1.1:60020 exiting
2009-07-19 15:37:52,399 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
2009-07-19 15:37:52,400 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
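The "We slept 7207717ms, ten times longer than scheduled: 3000" warnings in the
log above are what a wall-clock check around a sleep call would report after a
suspend of about two hours. Below is a minimal sketch of that kind of check.
The class name SleepOvershoot and its methods are hypothetical and this is not
the HBase Sleeper source; the factor of ten is taken only from the wording of
the log message.

```java
// Hypothetical illustration only; not the HBase Sleeper implementation.
public class SleepOvershoot {

    // Assumption: "ten times longer than scheduled" means actual > 10 * scheduled.
    static boolean overslept(long scheduledMs, long actualMs) {
        return actualMs > 10 * scheduledMs;
    }

    // Builds a warning in the same shape as the one seen in the log.
    static String describe(long scheduledMs, long actualMs) {
        return "We slept " + actualMs
            + "ms, ten times longer than scheduled: " + scheduledMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long scheduledMs = 50;

        // Measure the sleep with the wall clock, so time spent suspended
        // shows up as part of the elapsed interval.
        long start = System.currentTimeMillis();
        Thread.sleep(scheduledMs);
        long actualMs = System.currentTimeMillis() - start;

        if (overslept(scheduledMs, actualMs)) {
            System.out.println(describe(scheduledMs, actualMs));
        }

        // Replaying the two values from the log triggers the warning.
        System.out.println(overslept(3000L, 7207717L));
        System.out.println(overslept(10000L, 7212721L));
    }
}
```

Both values from the log exceed ten times their scheduled interval, so a check
like this would flag them. The log also shows both ZooKeeper sessions coming
back as "Session Expired" after the resume, followed by the regionserver abort.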