Shouldn't the RS just shut down then? As it is, it stays half alive and none of the puts succeed. Also, the OOMEs happen right after a flush/compaction/split, so clearly the RS was busy, and it could just be a matter of hitting the heap ceiling.
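To make the question concrete, here is a minimal sketch of the behavior I would expect. It is not the actual HBase code. It assumes a hypothetical AbortHook callback standing in for the regionserver's abort hook and a hypothetical guard() wrapper around each reader task, and the OOME is simulated by throwing it directly. On OOME the wrapper notifies the hook before rethrowing, so the server could stop instead of limping along:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch, not HBase code: route an OOME in a pool thread to an abort hook. */
class OomeAwareReader {

    /** Hypothetical callback standing in for the regionserver's abort hook. */
    interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /** Wraps a task so an OutOfMemoryError reaches the hook before the pool thread dies. */
    static Runnable guard(Runnable task, AbortHook hook) {
        return () -> {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                hook.abort("OOME in " + Thread.currentThread().getName(), e);
                throw e; // still let the thread die; we only make sure someone was told
            }
        };
    }

    /** Runs one task that simulates the failure; returns the hook's message, or null if it never fired. */
    static String demo() {
        AtomicReference<String> reason = new AtomicReference<>();
        // Default thread factory, so threads get the default executor naming.
        ExecutorService readers = Executors.newFixedThreadPool(1);
        readers.execute(guard(
            () -> { throw new OutOfMemoryError("simulated: Java heap space"); },
            (why, cause) -> reason.set(why)));
        readers.shutdown();
        try {
            readers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return reason.get();
    }
}
```

Calling OomeAwareReader.demo() returns the message the hook received. The thread name in it comes from the default thread factory used by Executors, which names threads pool-N-thread-M. In a real server the hook would trigger an abort or stop rather than just record a string.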
-Jack

On Thu, Apr 21, 2011 at 12:13 AM, Stack <st...@duboce.net> wrote:
> This looks like a bug. Elsewhere in the RPC you can register a
> handler for OOME explicitly and we have a callback up into the
> regionserver where we will set that the server abort or stop dependent
> on type of OOME we see. In this case it looks like on OOME we just
> throw and then all the executors fill so no more executors are
> available to process requests (This is my current assessment -- it
> could be a different one by morning).
>
> The root cause would look to be a big put. Could that be the case?
>
> On the naming, that looks to be the default naming of executor threads
> done by the hosting executorservice.
>
> St.Ack
>
>
> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <magn...@gmail.com> wrote:
>> Hello, with 0.89 HBASE, we see the following, all REST servers get
>> locked on trying to connect to one of our RS servers, the error in the
>> .out file on that Region Server looks like this:
>>
>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java heap space
>>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> Question is, how come the region server did not die after this but
>> just hogged the REST connections? And what does pool-1-thread-3
>> actually do?
>>
>> -Jack
>>