Re: Hbase read performance with increasing number of client threads

2010-09-14 Thread Ryan Rawson
Yes, it's all about the block cache. IN_MEMORY is a useful tool as well, but be careful: you can choke out other regions/tables. -ryan On Tue, Sep 14, 2010 at 12:07 AM, Abhijit Pol wrote: > @Ryan > when you mentioned caching and lots of RAM, were you referring to giving it to the block > cache or the memstore? > >

Re: Hbase read performance with increasing number of client threads

2010-09-14 Thread Abhijit Pol
@Ryan when you mentioned caching and lots of RAM, were you referring to giving it to the block cache or the memstore? We have a table with two column families, A and B. For column family A we have set "IN_MEMORY" ==> true, and we have multiple 64 GB RAM machines where we would like to hold this column family in RAM fo

Re: Hbase read performance with increasing number of client threads

2010-09-13 Thread Abhijit Pol
Noticed the discussion on this thread. We filed HBASE-2939 with a patch. On Thu, Sep 9, 2010 at 11:10 PM, tsuna wrote: > On Thu, Sep 9, 2010 at 1:48 PM, MauMau wrote: > > From: "tsuna" > >> In my recent loadtests on my HBase-heavy application (be it with > >> HBase's traditional client or with async

Re: Hbase read performance with increasing number of client threads

2010-09-09 Thread tsuna
On Thu, Sep 9, 2010 at 1:48 PM, MauMau wrote: > From: "tsuna" >> In my recent loadtests on my HBase-heavy application (be it with >> HBase's traditional client or with asynchbase) I've always been CPU >> bound (except sometimes HBase's traditional client incurs too much >> lock contention to real

Re: Hbase read performance with increasing number of client threads

2010-09-09 Thread MauMau
From: "tsuna" Sent: Friday, September 10, 2010 12:41 AM Having more TCP connections makes the code more complicated (since you need to manage them all, implement a scheme to try to use them in a round-robin fashion, etc). It can also put more strain on some network gear or OS components. For

Re: Hbase read performance with increasing number of client threads

2010-09-09 Thread tsuna
On Thu, Sep 9, 2010 at 5:10 AM, MauMau wrote: > - Original Message - From: "tsuna" > Sent: Thursday, September 09, 2010 7:52 AM >> If the server side latency consistently remains around 5-15ms but the >> client side latency shoots up through the roof, you may be >> experiencing lock conte

Re: Hbase read performance with increasing number of client threads

2010-09-09 Thread MauMau
I'm curious about this problem. I apologize if I say something wrong. Isn't it possible that the latency is due to the client-side serialized send/receive? Only one TCP connection is established to each region server in a client process regardless of how many HTable objects are created. T

Re: Hbase read performance with increasing number of client threads

2010-09-08 Thread tsuna
On Thu, Aug 19, 2010 at 10:15 AM, Abhijit Pol wrote: > We are using the HBase 0.20.5 drop with the latest Cloudera Hadoop distribution. > > - We are hitting a 3-node HBase cluster from a client which has 10 > threads, each with a thread-local copy of the HTable client object and > an established connection to the server

Re: Hbase read performance with increasing number of client threads

2010-08-20 Thread Ryan Rawson
Thanks. One of the major problems we are facing is the lack of IO pushdown. We need to push the IO requests from HBase -> regionserver -> OS -> disk IO. The latter two do IO path optimization, and this is where we will see speedups. There is also a chance to do IO path optimization in the HDFS la

Re: Hbase read performance with increasing number of client threads

2010-08-20 Thread Jeff Hammerbacher
Great, thanks. Most critical HDFS features that have JIRAs against them get reviewed, in my experience, but they just take time. Symlinks (https://issues.apache.org/jira/browse/HDFS-245) and appends are two examples that come to mind. Gathering up public opinion about the importance of various fe

Re: Hbase read performance with increasing number of client threads

2010-08-20 Thread Ryan Rawson
I dug out these two issues: https://issues.apache.org/jira/browse/HDFS-918 https://issues.apache.org/jira/browse/HDFS-1323 There was also something about speeding up random reads in HDFS, but as is typical these kinds of issues go to JIRA to die. -ryan On Thu, Aug 19, 2010 at 11:51 PM, Jeff H

Re: Hbase read performance with increasing number of client threads

2010-08-19 Thread Jeff Hammerbacher
Hey Ryan, Could you point to the particular JIRA issues for the DFS client that are causing these performance issues for HBase? Knowing is half the battle. Thanks, Jeff On Thu, Aug 19, 2010 at 9:20 PM, Ryan Rawson wrote: > Due to the DFS client, things are not quite as good as they should be...

Re: Hbase read performance with increasing number of client threads

2010-08-19 Thread Ryan Rawson
Due to the DFS client, things are not quite as good as they should be... They are being worked on, so it will get resolved in time. In the meantime, the key to fast access is caching... RAM, RAM, RAM. -ryan On Thu, Aug 19, 2010 at 10:15 AM, Abhijit Pol wrote: > We are using the HBase 0.20.5 drop with

Hbase read performance with increasing number of client threads

2010-08-19 Thread Abhijit Pol
We are using the HBase 0.20.5 drop with the latest Cloudera Hadoop distribution. - We are hitting a 3-node HBase cluster from a client which has 10 threads, each with a thread-local copy of the HTable client object and an established connection to the server. - Each of the 10 threads issues 10,000 read requests of keys ran