Yep... Very likely HBASE-9428:

8 threads:
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.lang.StringCoding.decode(StringCoding.java:178)
        at java.lang.String.<init>(String.java:483)
        at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
        ...

4 threads:
   java.lang.Thread.State: RUNNABLE
        at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
        at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
        at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
        at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140)
        at java.lang.StringCoding.decode(StringCoding.java:179)
        at java.lang.String.<init>(String.java:483)
        at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)

It's also consistent with what you see: lots of garbage (hence tweaking your GC options had a significant effect).

The fix is in 0.94.12, which is in RC right now and will probably be released early next week.

-- Lars

________________________________
From: OpenSource Dev <dev.opensou...@gmail.com>
To: user@hbase.apache.org
Sent: Thursday, September 12, 2013 8:15 AM
Subject: Re: High cpu usage on a region server

A server started getting busy last night, but this time it took ~5 hrs
to get from 15% busy to 75% busy. It is not running flat out at 80% yet,
but this is still very high compared to the other servers, which are
running under ~25% CPU usage.

The only change I made yesterday was adding "-XX:+UseParNewGC" to the
HBase startup command.

http://pastebin.com/VRmujgyH

On Wed, Sep 11, 2013 at 2:28 PM, Stack <st...@duboce.net> wrote:
> Can you thread dump the busy server and pastebin it?
> Thanks,
> St.Ack
>
>
> On Wed, Sep 11, 2013 at 1:49 PM, OpenSource Dev
> <dev.opensou...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm using HBase 0.94.6 (CDH 4.3) for OpenTSDB. So far I have had no
>> issues with writes/puts. The system handles up to 800k puts per second
>> without issue. On average we do 250k puts per second.
>>
>> I am having a problem with reads. I've isolated where the problem is
>> but have not been able to find the root cause.
>>
>> I have 16 machines running hbase-region server, each with ~35 regions.
>> Once in a while the CPU goes flat out at 80% on one region server.
>> These are the things I've noticed in Ganglia:
>>
>> hbase.regionserver.request - evenly distributed. Not seeing any spikes
>> on the busy server
>> hbase.regionserver.blockCacheSize - between 500MB and 1000MB
>> hbase.regionserver.compactionQueueSize - avg 2 or less
>> hbase.regionserver.blockCacheHitRatio - 30% on busy node, >60% on other
>> nodes
>>
>>
>> JVM heap size is set to 16GB and I'm using -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>>
>> I've noticed the system load moves to a different region, sometimes
>> within a minute, if the busy region is restarted.
>>
>> Any suggestions as to what could be causing the load and/or what other
>> metrics I should check?
>>
>>
>> Thank you!
>>
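
For anyone who wants to produce per-stack tallies like the "8 threads" / "4 threads" summary at the top of this thread from a raw thread dump, here is a minimal sketch. It assumes HotSpot jstack-style text in which each thread starts with a line beginning with a quoted thread name and frames are lines beginning with "at". The function name tally_stacks, the SAMPLE text, and the handler-N thread names are hypothetical and only for illustration; they are not part of HBase or of the dump in the pastebin.

```python
from collections import Counter

# Hypothetical, shortened dump used only to exercise the function below.
SAMPLE = '''
"handler-1" daemon prio=10 runnable
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.lang.StringCoding.decode(StringCoding.java:178)

"handler-2" daemon prio=10 runnable
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.lang.StringCoding.decode(StringCoding.java:178)

"handler-3" daemon prio=10 runnable
   java.lang.Thread.State: RUNNABLE
        at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
        at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
'''


def tally_stacks(dump_text, depth=4):
    """Count threads that share the same top `depth` stack frames."""
    counts = Counter()
    frames = None  # frames of the thread currently being read

    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('"'):
            # A quoted thread name starts a new thread; record the previous one.
            if frames:
                counts[tuple(frames[:depth])] += 1
            frames = []
        elif stripped.startswith("at ") and frames is not None:
            frames.append(stripped[3:])

    # Record the last thread in the dump (threads without frames are skipped).
    if frames:
        counts[tuple(frames[:depth])] += 1
    return counts


if __name__ == "__main__":
    for stack, n in tally_stacks(SAMPLE, depth=2).most_common():
        print(f"{n} threads:")
        for frame in stack:
            print(f"    at {frame}")
```

Grouping on only the top few frames is a deliberate choice: threads stuck in the same hot path usually differ further down the stack, so a small depth tends to make the dominant call site stand out.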