Re: help why do my regionservers shut themselves down?

2013-04-23 Thread Leonid Fedotov
This could be a reason as well: 2013-04-22 16:47:21,900 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Too many consecutive RollWriter requests, it's a sign of the total number of live datanodes is lower than the tolerable replicas. Make sure your cluster is in good health conditions... Th

Re: help why do my regionservers shut themselves down?

2013-04-23 Thread Kevin O'dell
Hi Kaveh, How large is the heap you are using? Also, what GC settings do you have in place? Your main issue looks to be here: 2013-04-22 16:47:21,843 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=d1r1n17.prod.plutoz.com

Re: help why do my regionservers shut themselves down?

2013-04-22 Thread kaveh minooie
Thanks everyone for responding. No, I don't have the GC logs. I don't even know how I can get them. But it seems that the regionserver did recover from that and then got into trouble here: 2013-04-22 16:47:56,830 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by
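For readers who, like Kaveh, are unsure how to obtain GC logs: a minimal sketch, assuming a HotSpot JVM of that era (Java 6/7) and the stock conf/hbase-env.sh, is to append GC logging flags to HBASE_OPTS and restart the regionserver. The log path below is a placeholder, not something taken from this thread.

```shell
# conf/hbase-env.sh (sketch only)
# Assumption: HotSpot JVM, Java 6/7; these flags were replaced by -Xlog:gc in Java 9+.
# Assumption: /var/log/hbase/gc-regionserver.log is a placeholder path; use any
# directory the HBase user can write to.
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/hbase/gc-regionserver.log"
```

Long pauses in the resulting log that line up with the ZooKeeper session loss at 16:47:21 would support the GC-pause explanation; no such pauses would point more toward the network.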

Re: help why do my regionservers shut themselves down?

2013-04-22 Thread Ted Yu
Kaveh: What version of HBase are you using? Around 2013-04-22 16:47:56, did you observe anything else happening in your cluster? See below: 2013-04-22 16:47:56,830 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by user: java.io.InterruptedIOException: Aborting comp

Re: help why do my regionservers shut themselves down?

2013-04-22 Thread Jean-Marc Spaggiari
Hi Kaveh, the answer may already be in the logs you sent ;) "This disconnect could have been caused by a network partition or a long-running GC pause, either way it's recommended that you verify your environment." Do you have GC logs? Have you tried anything to solve that? JM 2013

help why do my regionservers shut themselves down?

2013-04-22 Thread kaveh minooie
Hi, after a few MapReduce jobs my regionservers shut themselves down. This is the latest time that this has happened: 2013-04-22 16:47:21,843 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, trying to reconne