This actually sounds like a concurrency problem on either the client or the
server side.  TCP is plenty fast for this, and having a dedicated TCP
connection over which multiple requests can be multiplexed is probably much
better than UDP, because with UDP you would have to build your own windowing
and loss recovery anyway.  Having a long-lived TCP channel lets you benefit
from decades of research into how to make that work right.

Hadoop RPC allows multiple outstanding requests at once, so that isn't
inherently the problem either.  I feel like I have a memory of null requests
taking < 1 ms with Hadoop RPC, but I can't place where that memory might
have come from.
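
For what it's worth, here is a rough sketch of what "multiple outstanding
requests on one connection" means (illustrative only -- this is not the
actual Hadoop/HBase RPC code, and the class and method names are made up):
each call gets an id, the caller gets a future keyed by that id, and a
single reader thread completes futures as responses arrive, so one slow call
does not serialize the others behind it on the same socket.

import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy multiplexer: many in-flight calls share one TCP channel, matched by id.
class MultiplexedRpcChannel {
  private final AtomicInteger nextCallId = new AtomicInteger();
  private final Map<Integer, CompletableFuture<byte[]>> pending =
      new ConcurrentHashMap<Integer, CompletableFuture<byte[]>>();

  // Caller thread: register the call, write it, return a future immediately.
  CompletableFuture<byte[]> call(byte[] request) {
    int id = nextCallId.getAndIncrement();
    CompletableFuture<byte[]> future = new CompletableFuture<byte[]>();
    pending.put(id, future);
    sendOverSocket(id, request);       // returns without waiting for the reply
    return future;
  }

  // Single reader thread: responses may come back in any order.
  void readerLoop() {
    while (true) {
      Response r = readOneResponse();  // blocking read of (id, payload)
      CompletableFuture<byte[]> future = pending.remove(r.id);
      if (future != null) future.complete(r.payload);
    }
  }

  // Stubs standing in for the real socket I/O.
  static class Response { int id; byte[] payload; }
  private void sendOverSocket(int id, byte[] request) { /* write id + request */ }
  private Response readOneResponse() { return new Response(); }
}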

Also, I can push > 20,000 transactions per second through 20 threads in
YCSB, and the average latencies on those threads are often < 5 ms and
sometimes near 1 ms.

My first suspicion would be a concurrency limit somewhere that is
artificially throttling things down.  Why it would be so extreme, I cannot
imagine.
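
If it is a server-side throttle, one knob I would check (an assumption on my
part, not something reported in this thread) is
hbase.regionserver.handler.count, which caps the number of RPC handler
threads each region server runs; it has a fairly small default.  Something
along these lines shows what your config resolves it to:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Prints the handler-thread count the hbase-site.xml on the classpath would
// give you.  Note the setting only takes effect in the *region server's*
// configuration; reading it here is just a convenient sanity check.
public class HandlerCountCheck {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    int handlers = conf.getInt("hbase.regionserver.handler.count", 10);
    System.out.println("hbase.regionserver.handler.count = " + handlers);
  }
}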

On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> So of course this test is stupid because in reality nobody would scan a
> table with 40 rows.  All the traffic goes to a single region server, so
> with relatively low stress we can get an idea of how the rest of the
> cluster would behave under proportionally higher load.
>
> Anyway.  For a million requests shot at a region server at various rates
> between 300 and 500 QPS, the picture is not pretty.  The RPC metrics are
> actually good -- no more than 1 ms average per next() and 0 per get().  So
> the region server itself is lightning fast.
>
> What doesn't seem so fast is the RPC.  As I reported before, I was getting
> 25 ms TTLB under these circumstances.  In this case all the traffic to the
> node goes through the same client (though in reality, of course, each
> client's share of the node's traffic would be much smaller).  All that
> traffic goes through a single RPC queue on that region server, since
> HConnection will not open more than one socket to the same region server.
> And TCP doesn't seem to perform very well in this scenario for some reason.
>
> So it seems to help to actually open multiple HBase connections and
> round-robin scans across them.  That way, even though we waste more
> ZooKeeper connections, we also have more than one RPC channel open for the
> high-traffic region.  A little coding brings us down from 25 ms to 18 ms
> average at 500 QPS per region, with 3 pooled HBase connections.  Perhaps
> normally this is not as much of a problem, since traffic from a single
> client is usually distributed more uniformly across regions.
>
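
[A minimal sketch of that kind of round-robin pooling, for reference --
illustrative only.  It assumes a 0.90-era client where giving each HTable its
own freshly created Configuration yields its own underlying HConnection (on
some versions you may need to perturb a property to force separate
connections); the class name and table name here are made up.]

import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

// Pool of HTables, each built on its own Configuration so each one gets its
// own RPC socket to the region server; scans are spread round-robin.
public class RoundRobinScanPool {
  private final HTable[] pool;
  private final AtomicInteger next = new AtomicInteger();

  public RoundRobinScanPool(String tableName, int connections) throws IOException {
    pool = new HTable[connections];
    for (int i = 0; i < connections; i++) {
      Configuration conf = HBaseConfiguration.create();  // one connection per conf
      pool[i] = new HTable(conf, tableName);
    }
  }

  public int scan(byte[] startRow, byte[] stopRow) throws IOException {
    HTable table = pool[(next.getAndIncrement() & Integer.MAX_VALUE) % pool.length];
    int rows = 0;
    ResultScanner scanner = table.getScanner(new Scan(startRow, stopRow));
    try {
      for (Result r : scanner) {
        rows++;  // real code would process the row here
      }
    } finally {
      scanner.close();
    }
    return rows;
  }
}
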
> The next thing I did was to enable tcp_nodelay on both the client and the
> server.  That brought us down even further, to a 13 ms average.
>
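
[For reference, the configuration properties I believe are involved -- the
names are from memory and may vary by version, so treat them as assumptions
to verify: hbase.ipc.client.tcpnodelay on the client side, and
ipc.server.tcpnodelay in the region server's hbase-site.xml (that one needs
a restart).  A client-side sketch:]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Build a client Configuration with Nagle's algorithm disabled on the RPC
// sockets, then hand it to HTable / HConnection as usual.
public class TcpNoDelayClientConfig {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
    return conf;
  }
}
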
> However, it is still about two times slower than running all the processes
> on the same machine (where I get around 6-7 ms average TTLBs for the same
> type of scan).
>
> Ping time for about the same packet size between the hosts involved hovers
> around 1 ms.  Where the other ~5 ms of average time goes (13 ms remote vs.
> 6-7 ms local, with only about 1 ms of that explained by the network round
> trip) is still a mystery.  But oh well, I guess it is as good as it gets.
> In real-life HBase applications traffic would be distributed much more
> uniformly among regions, so perhaps this would be much less of an issue.
>
> I also suspect that using UDP for short scans and gets might reduce
> latency a bit as well.
>
> On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
> > So I can't seem to immediately find an explanation for these metrics:
> >
> > - rpcQueueTime -- do I assume correctly that this is the time a request
> > sits waiting in the incoming RPC queue before being picked up by a
> > handler?
> >
> > - rpcProcessingTime -- do I assume correctly that this is the time the
> > request spends being processed by the region server's handler?
> >
> > So the inner time-to-last-byte should be approximately the sum of those
> > two, right?
> >
> > Thanks.
> > -Dmitriy
> >
> > On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
> >> Yes, that's what I said: there's a metric for fs latency, but we are
> >> not hitting the fs, so it's not useful.
> >>
> >> The question is which metric would be useful for measuring the inner
> >> TTLB, and I don't see one there.
> >>
> >> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <tdunn...@maprtech.com>
> wrote:
> >>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
> >>>
> >>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <dlie...@gmail.com
> >wrote:
> >>>
> >>>> Yes -- I already looked through the 'regionserver' metrics some time
> >>>> ago in the HBase Book, and I am not sure there's an 'inner TTLB'
> >>>> metric.
> >>>>
> >>>> There are fs latency metrics there but nothing for the response times.
> >>>> fs latency is essentially HDFS latency AFAICT, and that would not be
> >>>> relevant to what I am asking about (as long as we are hitting the LRU
> >>>> block cache anyway).  We are not hitting the fs.
> >>>>
> >>>> Unless there are more metrics than those listed in the HBase Book?
> >>>>
> >>>>
> >>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <st...@duboce.net> wrote:
> >>>> > Enable RPC logging.  It will show up in your Ganglia.  See the
> >>>> > metrics article on the HBase home page.
> >>>> >
> >>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <
> dlie...@gmail.com>
> >>>> wrote:
> >>>> >> Is there any way to log the 'inner' TTLB times the region server
> >>>> >> incurs for reads?
> >>>> >>
> >>>> >>
> >>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <
> dlie...@gmail.com>
> >>>> wrote:
> >>>> >>> I just enabled debug logging for the o.a.h.hbase logger in that
> >>>> >>> particular region server... so far not much except for the
> >>>> >>> LruBlockCache spitting out metrics:
> >>>> >>>
> >>>> >>> 2011-04-20 12:28:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
> >>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
> >>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>> 2011-04-20 12:33:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
> >>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
> >>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>> 2011-04-20 12:38:48,375 DEBUG
> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats:
> total=8.26
> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
> >>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
> >>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
> >>>> >>> evicted=0, evictedPerRun=NaN
> >>>> >>>
> >>>> >>>
> >>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <st...@duboce.net> wrote:
> >>>> >>>> If it is only one region, then it's located on a single region
> >>>> >>>> server.  Tail that region server's logs.  It might tell us
> >>>> >>>> something.
> >>>> >>>> St.Ack
> >>>> >>>>
> >>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
> wrote:
> >>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <st...@duboce.net>
> wrote:
> >>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <
> >>>> dlyubi...@apache.org> wrote:
> >>>> >>>>>>> Right now I am shooting scans returning between 3 and 40 rows,
> >>>> >>>>>>> regardless of data size, at approximately 400-500 QPS.  The
> >>>> >>>>>>> data tables are almost empty and in-memory, so they should
> >>>> >>>>>>> surely fit in the 40% of heap dedicated to them.
> >>>> >>>>>>>
> >>>> >>>>>>
> >>>> >>>>>> How many clients are going against the cluster?  If you use
> >>>> >>>>>> fewer, do your numbers improve?
> >>>> >>>>>>
> >>>> >>>>>
> >>>> >>>>> And all these clients are going against a single 40-row table?
> >>>> >>>>> St.Ack
> >>>> >>>>>
> >>>> >>>>
> >>>> >>>
> >>>> >>
> >>>> >
> >>>>
> >>>
> >>
> >
>
