Thinking about it again: if you had run into HBASE-7336 you'd see high CPU load, but *not* IOWAIT. 0.94 is at 0.94.23 now; you should upgrade. A lot of fixes, improvements, and performance enhancements have gone in since 0.94.4, and you can do a rolling upgrade straight to 0.94.23.
With that out of the way, can you post a jstack of the processes that experience high wait times?

-- Lars

________________________________
From: kiran <kiran.sarvabho...@gmail.com>
To: user@hbase.apache.org; lars hofhansl <la...@apache.org>
Sent: Saturday, September 6, 2014 11:30 AM
Subject: Re: HBase - Performance issue

Lars,

We are facing a similar situation on a similar cluster configuration. We are seeing high I/O wait percentages on some machines in our cluster. We have short-circuit reads enabled, but we still see the same problem: the CPU wait goes up to 50% in some cases while issuing scan commands from multiple threads. Is there a workaround other than applying the patch for 0.94.4?

Thanks
Kiran

On Thu, Apr 25, 2013 at 12:12 AM, lars hofhansl <la...@apache.org> wrote:

> You may have run into https://issues.apache.org/jira/browse/HBASE-7336
> (the fix is in 0.94.4).
> (Although I had not observed this effect as much when short-circuit reads
> are enabled.)
>
> ----- Original Message -----
> From: kzurek <kzu...@proximetry.pl>
> To: user@hbase.apache.org
> Cc:
> Sent: Wednesday, April 24, 2013 3:12 AM
> Subject: HBase - Performance issue
>
> The problem is that when I'm putting my data into the cluster (multithreaded
> client, ~30 MB/s outgoing traffic), the load is spread equally over all
> RegionServers, with an average CPU wait time of 3.5% (average CPU user: 51%).
> When I add a similar multithreaded client that scans for, say, the last 100
> samples of a randomly generated key from a chosen time range, I get high CPU
> wait time (20% and up) on two random RegionServers (or more, if the number
> of threads is higher than the default 10). The machines hosting those RS
> therefore get very hot - one consequence is that their store file counts
> increase steadily, up to the maximum limit. The rest of the RS sit at 10-12%
> CPU wait time and seem fine (their store file counts vary, so they are being
> compacted and are not growing over time). Any ideas? Maybe I could
> prioritize writes over reads somehow? Is that possible? If so, what would be
> the best way to do it, and where should it be placed - on the client side or
> the cluster side?
>
> Cluster specification:
> HBase Version: 0.94.2-cdh4.2.0
> Hadoop Version: 2.0.0-cdh4.2.0
> 6x DataNodes (5x HDD each for storing data), 1x MasterNode
> Other settings:
> - Bloom filters (ROWCOL) set
> - Short-circuit reads turned on
> - HDFS block size: 128 MB
> - Java heap size of NameNode/Secondary NameNode: 8 GiB
> - Java heap size of HBase RegionServer: 12 GiB
> - Java heap size of HBase Master: 4 GiB
> - Java heap size of DataNode: 1 GiB (default)
> Number of regions per RegionServer: 19 (114 regions total on 6 RS)
> Key design: <UUID><TIMESTAMP> -> UUID: 1-10M, TIMESTAMP: 1-N
> Table design: 1 column family with 20 columns of 8 bytes each
>
> Get client:
> Multiple threads.
> Each thread has its own table instance with its own Scanner.
> Each thread has its own range of UUIDs and randomly draws the beginning of
> the time range to build the row key properly (see above).
> Each Scan requests the same number of rows, but with a random row key.
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-Performance-issue-tp4042836.html
> Sent from the HBase User mailing list archive at Nabble.com.

--
Thank you
Kiran Sarvabhotla

-----Even a correct decision is wrong when it is taken late
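For reference, here is a minimal sketch of the scan-client pattern described above, written against the 0.94-era HTable API. The table name ("samples"), the 4-byte UUID / 8-byte timestamp key widths, and the rowKey helper are illustrative assumptions, not details taken from the thread:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class SampleScanClient {

    // Illustrative key layout: 4-byte UUID followed by 8-byte timestamp,
    // matching the <UUID><TIMESTAMP> design from the thread.
    static byte[] rowKey(int uuid, long timestamp) {
        return Bytes.add(Bytes.toBytes(uuid), Bytes.toBytes(timestamp));
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // Each thread owns its HTable; HTable is not thread-safe in 0.94,
        // so the per-thread table instance described above is the right pattern.
        HTable table = new HTable(conf, "samples"); // table name is assumed
        try {
            int uuid = 42;       // drawn from this thread's UUID range
            long start = 1000L;  // randomly drawn start of the time range
            // Scan forward within one UUID, from the drawn timestamp onward.
            Scan scan = new Scan(rowKey(uuid, start), rowKey(uuid + 1, 0L));
            scan.setCaching(100); // fetch the 100 requested rows in one RPC
            ResultScanner scanner = table.getScanner(scan);
            try {
                int n = 0;
                for (Result r : scanner) {
                    if (++n >= 100) break; // "100 samples" per request
                }
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }
}

Note that 0.94 has no reverse scans, so "the last 100 samples" has to be approximated by scanning forward from a drawn start key as above; setCaching keeps each such request to a single round trip instead of the default one-row-per-RPC behavior.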