Hi lztaomin,

> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired

indicates that you have run into the "Juliet Pause" issue: a JVM garbage-collection pause that lasted longer than the configured ZooKeeper session timeout, so ZooKeeper expired the servers' sessions and they aborted themselves. If you search for it on Google (http://www.google.com/search?q=juliet+pause+hbase) you will find quite a few pages explaining the problem and what you can do to avoid it; it mostly comes down to raising the ZooKeeper session timeout and tuning the garbage collector so pauses stay below it.
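As a concrete starting point, here is a minimal sketch of the two settings usually involved; the 120000 ms value is purely illustrative, size it to the GC pauses you actually observe. zookeeper.session.timeout is the timeout HBase requests from ZooKeeper, and since you are running the ZooKeeper that HBase manages, the server-side cap has to allow it as well (the server limits sessions to maxSessionTimeout, by default tickTime * 20; hbase.zookeeper.property.* entries are forwarded into zoo.cfg):

  <!-- hbase-site.xml: sketch only, values are illustrative -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>120000</value>
  </property>
  <property>
    <!-- forwarded to zoo.cfg when HBase manages ZooKeeper -->
    <name>hbase.zookeeper.property.maxSessionTimeout</name>
    <value>120000</value>
  </property>

Keep in mind a longer timeout only buys headroom; the complementary fix is shortening the pauses themselves, e.g. CMS flags such as -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 in hbase-env.sh, plus GC logging so you can confirm how long the pauses really are.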
Lars

On Jul 2, 2012, at 10:30 AM, lztaomin wrote:

> HI ALL
>
> My HBase cluster has 3 machines in total, with Hadoop and HBase running on the same machines, using the ZooKeeper managed by HBase itself. After 3 months of operation it reported the errors below, which caused the HMaster and HRegionServer processes to die. Please help me.
> Thanks
>
> The following is the log:
>
> ABORTING region server serverName=datanode1,60020,1325326435553, load=(requests=332, regions=188, usedHeap=2741, maxHeap=8165): regionserver:60020-0x3488dec38a02b1 regionserver:60020-0x3488dec38a02b1 received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:343)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> 2012-07-01 13:45:38,707 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for datanode1,60020,1325326435553
> 2012-07-01 13:45:38,756 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting 32 hlog(s) in hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553
> 2012-07-01 13:45:38,764 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 1 of 32: hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553/datanode1%3A60020.1341006689352, length=5671397
> 2012-07-01 13:45:38,764 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553/datanode1%3A60020.1341006689352
> 2012-07-01 13:45:39,766 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553/datanode1%3A60020.1341006689352
> 2012-07-01 13:45:39,880 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> 2012-07-01 13:45:39,925 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
>
> ABORTING region server serverName=datanode2,60020,1325146199444, load=(requests=614, regions=189, usedHeap=3662, maxHeap=8165): regionserver:60020-0x3488dec38a0002 regionserver:60020-0x3488dec38a0002 received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:343)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> 2012-07-01 13:24:10,308 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553/datanode1%3A60020.1341075090535
> 2012-07-01 13:24:10,918 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 21 of 32: hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553/datanode1%3A60020.1341078690560, length=11778108
> 2012-07-01 13:24:29,809 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path hdfs://namenode:9000/hbase/t_speakfor_relation_chapter/ffd2057b46da227e078c82ff43f0f9f2/recovered.edits/0000000000660951991 (wrote 8178 edits in 403ms)
> 2012-07-01 13:24:29,809 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: hlog file splitting completed in -1268935 ms for hdfs://namenode:9000/hbase/.logs/datanode1,60020,1325326435553
> 2012-07-01 13:24:29,824 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Received exception accessing META during server shutdown of datanode1,60020,1325326435553, retrying META read
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2408)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1649)
>     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
>
> lztaomin