Hi Yousuf > The client caches the locations of the regionservers, so after a couple > of > minutes of the experiment running, it wouldn't need to re-visit > ZooKeeper, > I believe. Correct me if I am wrong please. Yes you are right.
Regards Ram > -----Original Message----- > From: Yousuf Ahmad [mailto:myahm...@gmail.com] > Sent: Friday, October 19, 2012 1:30 AM > To: user@hbase.apache.org; lars hofhansl > Cc: Ivan Brondino; Ricardo Vilaça > Subject: Re: High IPC Latency > > Hi, > > Thank you for your questions guys. > > We are using HBase 0.92 with HDFS 1.0.1. > > The experiment lasts 15 minutes. The measurements stabilize in the > first > two minutes of the run. > > The data is distributed almost evenly across the regionservers so each > client hits most of them over the course of the experiment. However, > for > the data we have, any given multi-get or scan should touch only one or > at > most two regions. > > The client caches the locations of the regionservers, so after a couple > of > minutes of the experiment running, it wouldn't need to re-visit > ZooKeeper, > I believe. Correct me if I am wrong please. > > Regards, > Yousuf > > > On Thu, Oct 18, 2012 at 2:42 PM, lars hofhansl <lhofha...@yahoo.com> > wrote: > > > Also, what version of HBase/HDFS is this using? > > > > > > > > > > ----- Original Message ----- > > From: "Pamecha, Abhishek" <apame...@x.com> > > To: "user@hbase.apache.org" <user@hbase.apache.org> > > Cc: Ivan Brondino <ibrond...@fi.upm.es>; Ricardo Vilaça < > > rmvil...@di.uminho.pt> > > Sent: Thursday, October 18, 2012 11:38 AM > > Subject: RE: High IPC Latency > > > > Is it sustained for the same client hitting the same region server OR > does > > it get better for the same client-RS combination when run for longer > > duration? Trying to eliminate Zookeeper from this. > > > > Thanks, > > Abhishek > > > > From: Yousuf Ahmad [mailto:myahm...@gmail.com] > > Sent: Thursday, October 18, 2012 11:26 AM > > To: user@hbase.apache.org > > Cc: Ivan Brondino; Ricardo Vilaça > > Subject: High IPC Latency > > > > Hello, > > > > We are seeing slow times for read operations in our experiments. We > are > > hoping that you guys can help us figure out what's going wrong. > > > > Here are some details: > > > > * We are running a read-only benchmark on our HBase cluster. > > * > > * There are 10 regionservers, each co-located with a datanode. > HDFS > > replication is 3x. > > * All the data read by the experiment is already in the block > cache > > and the hit ratio is 99%. > > * > > * We have 10 clients, each with around 400 threads making a mix > of > > read-only requests involving multi-gets and scans. > > * > > * We settled on the default client pool type/size (roundrobin/1) > and a > > regionserver handler count of 100 after testing various combinations > to see > > what setting worked best. > > * > > * Our scans are short, fetching around 10 rows on average. > Scanner > > caching is set to 50. > > * An average row in a scan has either around 10 columns (small > row) or > > around 200 columns (big row). > > * > > * Our multi-gets fetch around 200 rows on average. > > * An average row in a multi-get has around 10 columns. > > * Each column holds an integer (encoded into bytes). > > * > > * None of the machines involved reach CPU, memory, or IO > saturation. > > In fact resource utilization stays quite low. > > * > > * Our statistics show that the average time for a scan, measured > > starting from the first scanner.next() call to the last one which > returns a > > null, is around 2-3 seconds. > > * Since we use scanner caching, the major portion of this time > (around > > 2 seconds) is spent on the first call to next(), while the remaining > calls > > take a negligible amount of time. > > * Similarly, we see that a multi-get on average takes around 2 > seconds. > > * A single get on average takes around 1 second. > > We are not sure what the bottleneck is or where it lies. We thought > we > > should look deeper into what is going on at the regionservers. We > monitored > > the IPC calls during one of the experiments. Here is a sample of one > > regionserver log: > > > > 2012-10-18 17:00:09,969 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115483; Served: HRegionInterface#get queueTime=0 > processingTime=1 > > contents=1 Get, 75 bytes > > 2012-10-18 17:00:09,969 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115487; Served: HRegionInterface#get queueTime=0 > processingTime=0 > > contents=1 Get, 75 bytes > > 2012-10-18 17:00:09,969 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115489; Served: HRegionInterface#get queueTime=0 > processingTime=0 > > contents=1 Get, 75 bytes > > 2012-10-18 17:00:09,982 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #111421; Served: HRegionInterface#get queueTime=0 > processingTime=0 > > contents=1 Get, 75 bytes > > 2012-10-18 17:00:09,982 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115497; Served: HRegionInterface#multi queueTime=0 > processingTime=9 > > contents=200 Gets > > 2012-10-18 17:00:09,984 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115499; Served: HRegionInterface#openScanner queueTime=0 > > processingTime=0 contents=1 Scan, 63 bytes > > 2012-10-18 17:00:09,990 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #115503; Served: HRegionInterface#get queueTime=0 > processingTime=0 > > contents=1 Get, 75 bytes > > 2012-10-18 17:00:09,992 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #103230; Served: HRegionInterface#next queueTime=0 > processingTime=0 > > contents=1 Long, 1 Integer > > 2012-10-18 17:00:09,994 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #103234; Served: HRegionInterface#close queueTime=0 > processingTime=0 > > contents=1 Long > > 2012-10-18 17:00:09,994 DEBUG > org.apache.hadoop.ipc.HBaseServer.trace: > > Call #103232; Served: HRegionInterface#next queueTime=0 > processingTime=0 > > contents=1 Long, 1 Integer > > > > I have attached a larger chunk of the logs we collected for this > > experiment in case that helps. > > > > From the logs, we saw that the next() operation at the regionserver > takes > > 1 millisecond or less; and a multi-get takes 10 ms on average. > > Yet the corresponding times we see at the client are orders of > magnitude > > higher. > > Ping times between the machines are at most 1ms and we are not > saturating > > the network. > > > > We would really appreciate some insights from you guys on this. > > Where do you suggest we focus our efforts in order to hunt down this > > bottleneck/contention? > > > > Thanks! > > Yousuf > >