Vlad: You might want to look at HBASE-9087 (Handlers being blocked during reads).
On Thu, Aug 1, 2013 at 9:24 AM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> All tests I have run were hitting a single region on a region server. I
> suspect this is not the right scenario. There are some points in the Store
> class which are heavily synchronized. For example this one:
>
>     // All access must be synchronized.
>     private final CopyOnWriteArraySet<ChangedReadersObserver> changedReaderObservers =
>         new CopyOnWriteArraySet<ChangedReadersObserver>();
>
> I will re-run the tests against all available regions on a RS and will post
> results later today.

On Wed, Jul 31, 2013 at 11:15 PM, lars hofhansl <la...@apache.org> wrote:

> Yeah, that would seem to indicate that seeking into the block is not a
> bottleneck (and you said earlier that everything fits into the block cache).
> Need to profile to know more. If you have time, it would be cool if you could
> start jvisualvm, attach it to the RS, start the profiling and let the
> workload run for a bit.
>
> -- Lars

----- Original Message -----
From: Vladimir Rodionov <vladrodio...@gmail.com>
To: dev@hbase.apache.org; lars hofhansl <la...@apache.org>
Sent: Wednesday, July 31, 2013 9:57 PM
Subject: Re: HBase read performance and HBase client

> A smaller block size (32K) does not give any performance gain, and this is
> strange, to say the least.

On Wed, Jul 31, 2013 at 9:33 PM, lars hofhansl <la...@apache.org> wrote:

> It would be interesting to profile MultiGet. With an RTT of 0.1 ms, the
> internal RS friction is probably the main contributor.
> In fact, MultiGet just loops over the set at the RS and calls single gets
> on the various regions.
>
> Each Get needs to reseek into the block (even when it is cached, since KVs
> have variable size).
>
> There are HBASE-6136 and HBASE-8362.
>
> -- Lars

________________________________
From: Vladimir Rodionov <vladrodio...@gmail.com>
To: dev@hbase.apache.org; lars hofhansl <la...@apache.org>
Sent: Wednesday, July 31, 2013 7:27 PM
Subject: Re: HBase read performance and HBase client

> Some final numbers:
>
> Test config:
>
> HBase 0.94.6
> blockcache=true, block size = 64K, KV size = 62 bytes (raw).
>
> 5 clients: 96GB RAM, 16 (32) CPUs (2.2 GHz), CentOS 5.7
> 1 RS server: the same config.
>
> Local network with ping between hosts: 0.1 ms
>
> 1. The HBase client hits the wall at ~50K ops per sec regardless of # of CPUs,
>    threads, IO pool size and other settings.
> 2. The HBase server was able to sustain 170K ops per sec (with 64K block size),
>    all from block cache, KV size = 62 bytes (very small). This is for single
>    Get ops, 60 threads per client, 5 clients (on different hosts).
> 3. Multi-get hits the wall at the same 170K-200K per sec. Batch sizes tested:
>    30 and 100. Absolutely the same performance as with batch size = 1.
>    Multi-get has some internal issues on the RegionServer side, maybe excessive
>    locking or something else.
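For reference, the multi-get pattern being benchmarked above looks roughly like this
against the 0.94 client API. This is a minimal sketch; the table name, row keys and
batch size are made up for illustration.

    // A minimal sketch, assuming the HBase 0.94.x client API.
    // "test_table" and the row keys are made-up names.
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table");
        try {
          // Build one batch of Gets (the thread above reports batch sizes of 30 and 100).
          List<Get> batch = new ArrayList<Get>();
          for (int i = 0; i < 100; i++) {
            batch.add(new Get(Bytes.toBytes("row-" + i)));
          }
          // One client-side call for the whole batch; per Lars's note above, the RS
          // still services the individual Gets one by one.
          Result[] results = table.get(batch);
          System.out.println("fetched " + results.length + " rows");
        } finally {
          table.close();
        }
      }
    }

As Lars notes above, batching mainly saves RPC round trips; it does not reduce the
per-Get work on the RegionServer, which matches the observation that batch sizes of
30 and 100 hit the same ceiling as batch size 1.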
On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> 1. SCR are enabled.
> 2. A single Configuration for all tables did not work well, but I will try it
>    again.
> 3. With Nagle I had 0.8 ms avg, without - 0.4 ms - I see the difference.

On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <la...@apache.org> wrote:

> With Nagle's you'd see something around 40ms. You are not saying 0.8ms
> RTT is bad, right? Are you seeing ~40ms latencies?
>
> This thread has gotten confusing.
>
> I would try these:
> * one Configuration for all tables. Or even use a single
>   HConnection/Threadpool and use the HTable(byte[], HConnection,
>   ExecutorService) constructor
> * disable Nagle's: set both ipc.server.tcpnodelay and
>   hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
>   server)
> * increase hbase.client.ipc.pool.size in the client's hbase-site.xml
> * enable short circuit reads (details depend on the exact version of Hadoop).
>   Google will help :)
>
> -- Lars

----- Original Message -----
From: Vladimir Rodionov <vladrodio...@gmail.com>
To: dev@hbase.apache.org
Sent: Tuesday, July 30, 2013 1:30 PM
Subject: Re: HBase read performance and HBase client

> So this hbase.ipc.client.tcpnodelay (default - false) explains the poor
> single-thread performance and high latency (0.8 ms in a local network)?

On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> One more observation: one Configuration instance per HTable gives a 50%
> boost compared to a single Configuration object for all HTables - from
> 20K to 30K.

On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> This thread dump was taken when the client was sending 60 requests in
> parallel (at least, in theory). There are 50 server handler threads.

On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> Sure, here it is:
>
> http://pastebin.com/8TjyrKRT
>
> Is epoll used not only to read/write HDFS but also to connect/listen to
> clients?

On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:

> Can you show us what the thread dump looks like when the threads are
> BLOCKED? There aren't that many locks on the read path when reading
> out of the block cache, and epoll would only happen if you need to hit
> HDFS, which you're saying is not happening.
>
> J-D

On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> I am hitting data in the block cache, of course. The data set is small enough
> to fit comfortably into the block cache, and all requests are directed to the
> same Region to guarantee single-RS testing.
>
> To Ted:
>
> Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6 with respect
> to read performance?
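Lars's first two suggestions above - one shared HConnection/thread pool plus the
client-side RPC settings - would look something like the sketch below. It assumes the
0.94.x client API (HConnectionManager.createConnection and the
HTable(byte[], HConnection, ExecutorService) constructor he mentions); the table name,
pool size and thread count are made up, and the server-side ipc.server.tcpnodelay
still has to be set in the region servers' hbase-site.xml.

    // A minimal sketch, assuming the HBase 0.94.x client API; names and sizes are illustrative.
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Client-side equivalents of the hbase-site.xml settings Lars lists;
        // ipc.server.tcpnodelay must be set on the servers themselves.
        conf.setBoolean("hbase.ipc.client.tcpnodelay", true); // disable Nagle's on the client
        conf.setInt("hbase.client.ipc.pool.size", 10);        // illustrative value

        // One HConnection and one thread pool shared by every HTable in this client.
        HConnection connection = HConnectionManager.createConnection(conf);
        ExecutorService pool = Executors.newFixedThreadPool(60);
        try {
          // Each worker thread can cheaply create its own HTable on top of the
          // shared connection and pool.
          HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
          // ... issue Gets / multi-gets here ...
          table.close();
        } finally {
          pool.shutdown();
          connection.close();
        }
      }
    }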
On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:

> That's a tough one.
>
> One thing that comes to mind is socket reuse. It used to come up
> more often, but this is an issue that people hit when doing loads of
> random reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
> anything :)
>
> Also, if you _just_ want to saturate something, be it CPU or network,
> wouldn't it be better to hit data only in the block cache? That way it
> has the lowest overhead.
>
> The last thing I wanted to mention is that yes, the client doesn't scale
> very well. I would suggest you give the asynchbase client a run.
>
> J-D

On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov <vrodio...@carrieriq.com> wrote:

> I have been doing quite extensive testing of different read scenarios:
>
> 1. blockcache disabled/enabled
> 2. data is local/remote (no good HDFS locality)
>
> and it turned out that I cannot saturate 1 RS using one (comparable in CPU
> power and RAM) client host.
>
> I am running a client app with 60 read threads active (with multi-get)
> that all go to one particular RS, and this RS's load is 100-150% (out of
> 3200% available) - which means the load is ~5%.
>
> All threads in the RS are either in BLOCKED (wait) or IN_NATIVE states
> (epoll).
>
> I attribute this to the HBase client implementation, which seems to be
> not scalable (I am going to dig into the client later on today).
>
> Some numbers: the maximum I could get from single Get (60 threads) is
> 30K per sec. Multi-get gives ~75K (60 threads).
>
> What are my options? I want to measure the limits, and I do not want to
> run a cluster of clients against just ONE Region Server.
>
> RS config: 96GB RAM, 16 (32) CPUs
> Client: 48GB RAM, 8 (16) CPUs
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
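J-D's suggestion to try the asynchbase client would look roughly like this. It is a
rough sketch assuming asynchbase's HBaseClient/GetRequest API; the ZooKeeper quorum,
table name and row key are made up.

    // A rough sketch of a non-blocking read with asynchbase; names are illustrative.
    import java.util.ArrayList;

    import org.hbase.async.GetRequest;
    import org.hbase.async.HBaseClient;
    import org.hbase.async.KeyValue;

    import com.stumbleupon.async.Callback;

    public class AsyncGetSketch {
      public static void main(String[] args) throws Exception {
        final HBaseClient client = new HBaseClient("zkhost1,zkhost2,zkhost3");
        try {
          // Non-blocking get: the callback fires when the RS responds, so one
          // client thread can keep many requests in flight.
          client.get(new GetRequest("test_table", "row-0"))
                .addCallback(new Callback<Object, ArrayList<KeyValue>>() {
                  public Object call(ArrayList<KeyValue> row) {
                    System.out.println("got " + row.size() + " cells");
                    return null;
                  }
                })
                .join();  // block only so the example has something to wait on
        } finally {
          client.shutdown().join();
        }
      }
    }

The design point is that gets return immediately and complete via callbacks, so the
client does not need one thread per outstanding RPC - which is the scaling limit
being discussed in this thread.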