Hi,

Did you check your ZooKeeper? Check the ZooKeeper logs. There may be some network fluctuation between your HBase cluster and ZooKeeper.
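
One way to act on the advice about checking the logs is to pull out only the timestamped entries that mention a session expiry, so their times can be compared against the ZooKeeper server log and any GC log. What follows is a minimal Python sketch and not part of HBase. The function name find_session_events, the marker list, and the timestamp layout (taken from the regionserver log excerpt quoted in this thread) are illustrative assumptions.

```python
import re
from datetime import datetime

# Timestamp layout assumed from the log excerpt quoted in this thread:
# "2011-12-08 17:21:24,400 FATAL <logger>: <message>"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) (\w+) ")

# Substrings worth flagging; an illustrative choice, not an exhaustive list.
MARKERS = (
    "SessionExpiredException",
    "received expired from ZooKeeper",
    "YouAreDeadException",
)


def find_session_events(lines):
    """Return (timestamp, level, marker) for each timestamped entry
    whose first line mentions one of MARKERS."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # continuation lines (stack frames) carry no timestamp
        marker = next((m for m in MARKERS if m in line), None)
        if marker is None:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(microsecond=int(match.group(2)) * 1000)
        events.append((ts, match.group(3), marker))
    return events
```

Passing it an open log file (any iterable of lines works) gives a short list of times to look for in the ZooKeeper and GC logs. A GC pause or network gap shortly before the first reported time would be consistent with the session-expiry explanation, though the script itself cannot confirm the cause.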

Also read the troubleshooting section of the HBase documentation with respect to ZooKeeper. Maybe the load on ZK is too heavy and a long full GC is happening.

Regards
Ram

-----Original Message-----
From: tvi...@socialyantra.com [mailto:tvi...@socialyantra.com] On Behalf Of T Vinod Gupta
Sent: Thursday, December 08, 2011 11:58 PM
To: user@hbase.apache.org
Subject: hbase regionserver dead - org.apache.hadoop.hbase.YouAreDeadException

My HBase cluster completely stopped working this morning. When I looked at the log files, I saw the below. I am wondering why this happened and what can be done to avoid it in the future. I restarted the master and regionserver and things look OK now, but I don't know how much data I may have lost in the process. Can someone help?

2011-12-08 17:21:24,400 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=ip-10-68-145-124.ec2.internal,60020,1318107555030, load=(requests=1709, regions=377, usedHeap=1402, maxHeap=2991): regionserver:60020-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3 regionserver:60020-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:343)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-12-08 17:21:28,310 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=ip-10-68-145-124.ec2.internal,60020,1318107555030, load=(requests=4120, regions=377, usedHeap=1450, maxHeap=2991): Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:733)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:594)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
    at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:201)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:259)
    at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:641)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy5.regionServerReport(Unknown Source)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:727)
    ... 2 more

Thanks,
Vinod