Is there a reason you are only running HBase with 3GB of heap? The CMS garbage collector is known to not work especially well in less than 4GB of heap. With 24GB of RAM you should be able to give HBase something like 8GB, and this should help tremendously.
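Concretely, something along these lines in hbase-env.sh would do it (the heap size and GC flags below are only a starting point, not a tuned recommendation for your workload):

export HBASE_HEAPSIZE=8000
# CMS collector with parallel remark; see the wiki page below for more GC options
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled"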
Check out this wiki page: http://wiki.apache.org/hadoop/PerformanceTuning
There's some good info on GC params.

JG

> -----Original Message-----
> From: Vidhyashankar Venkataraman [mailto:vidhy...@yahoo-inc.com]
> Sent: Wednesday, June 09, 2010 11:16 AM
> To: user@hbase.apache.org
> Subject: Re: OOME during frequent updates...
>
> What do you mean by pastebinning it? I will try hosting it on a
> webserver.
>
> I know that OOME is Java running out of heap space: can you let me know
> what the usual causes are for OOME happening in HBase? Was I pounding
> the servers a bit too hard with updates?
>
> Thank you
> Vidhya
>
>
> On 6/9/10 11:05 AM, "Jean-Daniel Cryans" <jdcry...@apache.org> wrote:
>
> OOME is a Java exception, nothing HBase specific. It means that the
> JVM ran out of memory.
>
> BTW your log wasn't attached to your email (they are usually blocked),
> so please post it on a web server or pastebin it so we can help you.
>
> J-D
>
> On Wed, Jun 9, 2010 at 11:02 AM, Vidhyashankar Venkataraman
> <vidhy...@yahoo-inc.com> wrote:
> >
> > I am trying to understand the reasons behind HBase throwing OOME when
> > trying to run updates (the updates include insertions of new rows,
> > modifications of existing rows, and deletions). I am getting OOME
> > almost every time after running it for a few hours, which either
> > (hopefully) means I have to rate-limit my updates or my config
> > settings are wrong for my use case. Can any of you help me with this
> > issue?
> >
> > Can you guys let me know what the usual causes are for OOME in HBase?
> >
> > The machines have 16 CPUs and 24 GB of RAM.
> >
> > The db: 120M rows, 15 KB each, 2 column families. One column family
> > is 1 KB in size while the other is 14 KB. No compression for now.
> >
> > 5 region servers; I ran around 4 or 5 clients per node on the 5 nodes
> > that run the region servers.
> > 2 MB block size, 2 GB region size, WAL disabled, 2 MB write buffer,
> > 3 GB heap size, flush size set to a large value (600 MB, which means
> > it will never hit it). Major compactions disabled.
> >
> > My experiment: run 5 processes on each node that hosts the RSs (25 in
> > total): choose an operation at random (delete, insert or modify),
> > biased with a prespecified ratio, then choose a row key at random,
> > and perform the operation. I don't modify a row after it has been
> > deleted. Auto-flush is disabled.
> >
> > Other config values:
> > blockingStoreFiles: 16
> > hfile.min.blocksize.size: 1 MB
> > hfile.block.cache.size: 0.3 (not relevant here)
> > <name>hbase.hregion.memstore.block.multiplier</name> <value>8</value>
> > <name>hbase.regionserver.handler.count</name> <value>100</value>
> > Global memstore limit is anywhere between 0.35 and 0.4 of the max
> > heap size, i.e. 1 to 1.2 GB.
> >
> > I have attached a log file just in case.
> >
> > Thank you in advance :)
> > Vidhya
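For reference, the client-side setup described in the experiment above (auto-flush off, 2 MB write buffer) would look roughly like this against the classic HTable API; the table name, column family names and row key below are just placeholders, not anything from the original setup:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedUpdateExample {
  public static void main(String[] args) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");         // placeholder table name
    table.setAutoFlush(false);                          // buffer edits on the client
    table.setWriteBufferSize(2 * 1024 * 1024);          // 2 MB write buffer, as in the setup above

    Put put = new Put(Bytes.toBytes("row-00000001"));   // made-up row key
    put.add(Bytes.toBytes("small"), Bytes.toBytes("q"), new byte[1024]);       // ~1 KB family
    put.add(Bytes.toBytes("large"), Bytes.toBytes("q"), new byte[14 * 1024]);  // ~14 KB family
    table.put(put);                                     // queued until the write buffer fills

    table.flushCommits();                               // push anything still buffered
  }
}

With auto-flush off, each put() only reaches the region servers when the 2 MB buffer fills (or on flushCommits()), so buffered edits arrive at the servers in bursts rather than one at a time.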