Thank you, sir.
On Fri, Apr 22, 2011 at 12:31 PM, tsuna wrote:
> On Fri, Apr 22, 2011 at 12:15 PM, Dmitriy Lyubimov wrote:
>> is it possible to configure this client to open more than one socket
>> connection from same client to same region server?
>> In other words, is HBASE-2939 already a non-issue there?
On Fri, Apr 22, 2011 at 12:15 PM, Dmitriy Lyubimov wrote:
> is it possible to configure this client to open more than one socket
> connection from same client to same region server?
> In other words, is HBASE-2939 already a non-issue there?
No, asynchbase doesn't have HBASE-2939, but as I said, I ha
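For anyone who hasn't used it, asynchbase exposes a fully asynchronous API built
around Deferreds rather than blocking HTable calls. A minimal get looks roughly
like the sketch below; the quorum host, table and row are placeholders, and the
class/method names are from the asynchbase API as I recall it, so treat this as
an approximation rather than gospel:

import java.util.ArrayList;
import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;
import com.stumbleupon.async.Callback;
import com.stumbleupon.async.Deferred;

public class AsyncGetSketch {
  public static void main(String[] args) throws Exception {
    // One HBaseClient per application; it multiplexes all RPCs internally.
    final HBaseClient client = new HBaseClient("zk-quorum-host");
    GetRequest get = new GetRequest("mytable", "myrow");
    Deferred<ArrayList<KeyValue>> d = client.get(get)
        .addCallback(new Callback<ArrayList<KeyValue>, ArrayList<KeyValue>>() {
          public ArrayList<KeyValue> call(ArrayList<KeyValue> row) {
            System.out.println("got " + row.size() + " cells");  // runs on an I/O thread
            return row;
          }
        });
    d.joinUninterruptibly();                  // block only for this demo
    client.shutdown().joinUninterruptibly();  // flush and release resources
  }
}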
Benoit,
Thank you.
is it possible to configure this client to open more than one socket
connection from same client to same region server?
In other words, is HBASE-2939 already a non-issue there?
> asynchbase implements the HBase RPC protocol in a different way, it's
> written from scratch. It us
Got it. So that's why:
HBaseRPC:
protected final static ClientCache CLIENTS = new ClientCache();
The ClientCache is static regardless of HConnection instances, and the
connection id is pretty much the server address.
So i guess no external hack is possible to overcome that, then.
On Fri, Apr 22, 2011 at
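To illustrate what that static cache implies, here is a simplified sketch of the
effect (not the actual HBase source, just the shape of it): every caller in the
JVM asking for a connection to the same region server address gets handed the
same multiplexed socket, so pooling wrappers above this layer cannot add
connections.

import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of why a 0.90 client ends up with at most one TCP connection per
// region server: a process-wide static map keyed essentially by server address.
class ClientCacheSketch {
  private static final ConcurrentMap<InetSocketAddress, Connection> CONNECTIONS =
      new ConcurrentHashMap<InetSocketAddress, Connection>();

  static Connection getConnection(InetSocketAddress regionServer) {
    Connection conn = CONNECTIONS.get(regionServer);
    if (conn == null) {
      Connection created = new Connection(regionServer);
      Connection raced = CONNECTIONS.putIfAbsent(regionServer, created);
      conn = (raced == null) ? created : raced;   // every caller shares the winner
    }
    return conn;   // all HTables/HConnections in the JVM go through this
  }

  static class Connection {
    Connection(InetSocketAddress server) { /* open the single socket, multiplex calls on it */ }
  }
}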
It's all multiplexed.
J-D
On Fri, Apr 22, 2011 at 11:52 AM, Dmitriy Lyubimov wrote:
>>
>> I doubt that TCP doesn't perform well. If you really believe so, can
>> you provide a packet capture collected with:
>> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
>>
>
> I doubt that TCP doesn't perform well. If you really believe so, can
> you provide a packet capture collected with:
> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
>
Hm. What i discovered there is that I assumed my hack at RS connection
pooling was working but it doesn't seem to be.
Even
On Thu, Apr 21, 2011 at 11:25 PM, Dmitriy Lyubimov wrote:
> I certainly would. Even more, i already read the code there just a
> bit, although not enough to understand where the efficiency comes from.
> Do you actually implement another version of RPC on non-blocking
> sockets there?
asynchbase i
W
- Original Message -
From: Dmitriy Lyubimov [mailto:dlie...@gmail.com]
Sent: Friday, April 22, 2011 02:50 AM
To: user@hbase.apache.org
Subject: Re: 0.90 latency performance, cdh3b4
>
> You saw "HBASE-2939 Allow Client-Side Connection Pooling"? Would that help?
Ok ju
>
> You saw "HBASE-2939 Allow Client-Side Connection Pooling"? Would that help?
Ok, just read thru the issue. That's exactly what i thought upon
reading the code in the HBaseClient class. Although in my cluster it did
not seem to have more than about a 20% effect, and it was more or less
evaporated afte
Exactly. That's why i said 'for short scans and gets', and perhaps a
combo. As soon as it exceeds a frame, we'd rather not mess with
reassembly. But I agree it is most likely not worth it. The most likely
reason for my latencies is not this.
On Thu, Apr 21, 2011 at 11:22 PM, Ted Dunning wrote:
> Ye
> I doubt that TCP doesn't perform well. If you really believe so, can
> you provide a packet capture collected with:
> sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020
Thanks, i will certainly try. However, same class of machine, same data,
same test, locally vs. remote on the same subnet, is de facto 100% d
Yeah... but with UDP you have to do packet reassembly yourself.
And do source quench and all kinds of things.
Been there. Done that. Don't recommend it unless it is your day job.
We built the Veoh peer-to-peer system on UDP. It had compelling advantages
for us as we moved a terabit of data pe
yes, that was closer to my expectations, too. i am scratching my head
as well, but i don't have time to figure this out any longer. in
reality i won't have a 500 QPS stream between a single client and a single
region, so i don't care much.
On Thu, Apr 21, 2011 at 11:08 PM, Ted Dunning wrote:
> This actuall
in this case i pool them as well, which doesn't seem to make any
difference compared to when i just reuse them. (I am not writing in
this test, but outside of the test i do, so i do pool them using techniques
similar to those in HTablePool, CAS-based queues, etc.)
On Thu, Apr 21, 2011 at 11:09 PM, Ted Dun
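By way of illustration (not the actual code, just the technique): a minimal
HTable pool in the spirit of HTablePool, built on a CAS-based queue, might look
like this; the table name is a placeholder.

import java.io.IOException;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

// HTable is not thread-safe, so hand each caller its own instance and recycle it.
// Note: all HTables built from the same Configuration still share one HConnection,
// and thus the same single socket per region server -- which is the crux of this thread.
class SimpleHTablePool {
  private final ConcurrentLinkedQueue<HTable> pool = new ConcurrentLinkedQueue<HTable>();
  private final Configuration conf = HBaseConfiguration.create();
  private final String tableName;

  SimpleHTablePool(String tableName) { this.tableName = tableName; }

  HTable borrow() throws IOException {
    HTable t = pool.poll();                       // lock-free (CAS) dequeue
    return (t != null) ? t : new HTable(conf, tableName);
  }

  void giveBack(HTable t) {
    pool.offer(t);                                // lock-free (CAS) enqueue
  }
}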
yes, this is for 500 QPS of scans returning approx. 15k worth of data total.
>
> You saw "HBASE-2939 Allow Client-Side Connection Pooling"? Would that help?
Interesting. Let me take a look. i kind of was thinking maybe there's
some sense in allowing to pool more than one tcp connection from the sam
On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov wrote:
> What doesn't seem so fast is RPC. As i reported before, i was getting
> 25ms TTLB under the circumstances. In this case all the traffic to the
> node goes thru same client (but in reality of course the node's
> portion per client should b
Dmitriy,
Did I hear you say that you are instantiating a new HTable for each request?
Or was that somebody else?
On Thu, Apr 21, 2011 at 11:04 PM, Stack wrote:
> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov
> wrote:
> > Anyway. For a million requests shot at a region server at various
>
This actually sounds like there is a problem with concurrency either on the
client or the server side. TCP is plenty fast for this and having a
dedicated TCP connection over which multiple requests can be multiplexed is
probably much better than UDP because you would have to adapt your own
window
On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov wrote:
> Anyway. For a million requests shot at a region server at various
> speeds between 300 and 500 qps the picture is not pretty. RPC metrics
> are actually good -- no more than 1ms average per next() and 0 per
> get(). So region server is l
So of course this test is stupid because in reality nobody would scan
a table with 40 rows. All the traffic goes to a single region
server, so with relatively low stress we can get an idea of how the
rest of the cluster would behave with proportionally higher load.
Anyway. For a million reques
So i can't seem to immediately find the explanation for those metrics:
- rpcQueueTime -- do I assume correctly that it's the time a request
sits waiting in the incoming rpc queue before being picked up by a
handler?
- rpcProcessingTime -- do i assume correctly that it's the time of a request
being
Yes. In the sense that it measures the time until the operation is complete
according to the client.
And assuming that TTLB = time to last bit.
YCSB is, however, a frail vessel. I have been unable to stress even
moderate-sized clusters with it. It is fine
as a starting point.
On Wed, Apr 20, 2011 at
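Just to pin down what is being measured here: client-side TTLB is nothing more
than wall-clock time around the whole call, e.g. this sketch (the row key is a
placeholder, and the HTable is assumed to be set up elsewhere):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

class TtlbProbe {
  // Wall-clock time from issuing the get until the last byte of the result
  // has been deserialized on the client.
  static long timedGetMicros(HTable table, String row) throws IOException {
    long start = System.nanoTime();
    Result r = table.get(new Get(Bytes.toBytes(row)));
    long micros = (System.nanoTime() - start) / 1000L;
    System.out.println("ttlb=" + micros + "us cells=" + r.size());
    return micros;
  }
}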
btw, Ted, your version of YCSB in github should show TTLBs, right?
On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning wrote:
> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>
> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov wrote:
>
>> Yes -- I already looked thru 'regions
Yes, that's what i said. There's a metric for fs latency, but we are not
hitting it, so it's not useful.
The question is which one might be useful to measure inner ttlb, and i
don't see it there.
On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning wrote:
> FS latency shouldn't matter with your 99.9% cache hit r
FS latency shouldn't matter with your 99.9% cache hit rate as reported.
On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov wrote:
> Yes -- I already looked thru 'regionserver' metrics some time ago in
> hbase book. And i am not sure there's a 'inner ttlb' metric.
>
> There are fs latency metrics
Yes -- I already looked thru the 'regionserver' metrics some time ago in
the hbase book. And i am not sure there's an 'inner ttlb' metric.
There are fs latency metrics there but nothing for the response times.
fs latency is essentially hdfs latency AFAICT and that would not be
relevant to what i am asking for.
Enable rpc logging. Will show in your ganglia. See metrics article
on hbase home page.
On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov wrote:
> Is there any way to log 'inner' TTLB times the region server incurs for reads?
>
>
> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov wrote:
>> i
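For reference, the metrics setup Stack refers to amounts to editing
conf/hadoop-metrics.properties on the region servers and restarting; a minimal
ganglia hookup for that era looked roughly like the lines below (host and port
are placeholders, and newer ganglia installs want GangliaContext31 instead):

# conf/hadoop-metrics.properties on each region server
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.period=10
hbase.servers=ganglia-collector-host:8649
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.period=10
rpc.servers=ganglia-collector-host:8649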
Is there any way to log 'inner' TTLB times the region server incurs for reads?
On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov wrote:
> i just enabled debug logging for o.a.h.hbase logger in that particular
> region server... so far not much except for LRUBlock cache spitting
> metrics ..
>
>
i just enabled debug logging for the o.a.h.hbase logger in that particular
region server... so far not much except for the LruBlockCache spitting out
metrics...
2011-04-20 12:28:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, a
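For anyone following along, enabling that logger is a one-line change in the
region server's log4j config, followed by a restart (path per the stock conf
layout):

# conf/log4j.properties on the region server
log4j.logger.org.apache.hadoop.hbase=DEBUG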
If it's one region only, then it's located on a single regionserver. Tail
that regionserver's logs. It might tell us something.
St.Ack
On Wed, Apr 20, 2011 at 12:25 PM, Stack wrote:
> On Wed, Apr 20, 2011 at 12:25 PM, Stack wrote:
>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov
>> wrote:
>>> R
Yes, this stress test is the only thing running on the cluster, nothing else.
And it all goes to the block cache, as evidenced by the metrics.
2011-04-20 12:28:48,375 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
h
On Wed, Apr 20, 2011 at 12:25 PM, Stack wrote:
> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov
> wrote:
>> Right now i am shooting scans returning between 3 and 40 rows and
>> regardless of data size, approximately 500-400 QPS. The data tables
>> are almost empty and in-memory, so they surel
On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov wrote:
> Right now i am shooting scans returning between 3 and 40 rows and
> regardless of data size, approximately 500-400 QPS. The data tables
> are almost empty and in-memory, so they surely should fit in those 40%
> heap dedicated to them.
>
H
8% percentile, sorry. 8% of requests do land at 3ms or less.
On Wed, Apr 20, 2011 at 12:06 PM, Ted Dunning wrote:
> What is meant by 8% quartile? 75th %-ile? 98%-ile? Should quartile have
> been quantile?
>
> On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov wrote:
>
>> Ok actually we do have
What is meant by 8% quartile? 75th %-ile? 98%-ile? Should quartile have
been quantile?
On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov wrote:
> Ok actually we do have 1 region for these exact tables... so back to
> square one.
>
> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithm
Ok actually we do have 1 region for these exact tables... so back to
square one.
FWIW i do get the 8% quartile under 3ms TTLB. So it is algorithmically
sound, it seems. The question is why the outlier spread is so much larger
than in tests on one machine. Must be the network. What else.
On Wed, Apr 20, 2011 at
On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov wrote:
> Got it. This must be the reason. Cause it is a laugh check, and i do
> see 6 regions for 40 rows so it can span them, although i can't
> confirm it for sure.
Run a scan from the shell:
hbase> scan 'YOUR_TABLE'
... and you should see
Got it. This must be the reason. 'Cause it is a laugh check, and i do
see 6 regions for 40 rows, so it can span them, although i can't
confirm it for sure. It may be due to how the table was set up, or due to
some time running them and rotating some data there. The uniformly
distributed hashes are used fo
On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov wrote:
> Right now i am shooting scans returning between 3 and 40 rows and
> regardless of data size, approximately 500-400 QPS. The data tables
> are almost empty and in-memory, so they surely should fit in those 40%
> heap dedicated to them.
>
L
On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov wrote:
> Ok. Let me ask a question.
>
> When scan is performed and it obviously covers several regions, are
> scan performance calls done in synchronous succession, or are they done
> in parallel?
>
The former.
> Assuming scan is returning 40 res
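Since the client walks the regions one after another, one way to get parallelism
is to do it yourself above the API: split the key range and run each chunk's
scan from its own thread. A rough sketch under that assumption (split points,
table name and pool sizing are placeholders):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

// Scan several key ranges in parallel from the client, since the scanner itself
// goes region by region. Each task gets its own HTable (HTable is not thread-safe).
class ParallelScanSketch {
  static List<Result> scanRanges(final Configuration conf, final String table,
                                 byte[][] splitPoints) throws Exception {
    ExecutorService exec = Executors.newFixedThreadPool(splitPoints.length - 1);
    List<Future<List<Result>>> futures = new ArrayList<Future<List<Result>>>();
    for (int i = 0; i < splitPoints.length - 1; i++) {
      final byte[] start = splitPoints[i];
      final byte[] stop = splitPoints[i + 1];
      futures.add(exec.submit(new Callable<List<Result>>() {
        public List<Result> call() throws IOException {
          HTable t = new HTable(conf, table);
          try {
            Scan scan = new Scan(start, stop);
            scan.setCaching(100);                 // fetch rows in batches
            ResultScanner scanner = t.getScanner(scan);
            List<Result> rows = new ArrayList<Result>();
            for (Result r : scanner) {
              rows.add(r);
            }
            scanner.close();
            return rows;
          } finally {
            t.close();
          }
        }
      }));
    }
    List<Result> all = new ArrayList<Result>();
    for (Future<List<Result>> f : futures) {
      all.addAll(f.get());
    }
    exec.shutdown();
    return all;
  }
}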
Ok. Let me ask a question.
When scan is performed and it obviously covers several regions, are
scan performance calls done in synchronous succession, or are they done
in parallel?
Assuming scan is returning 40 results but for some weird reason it
goes to 6 regions and caching is set to 100 (so it
I am having similar results but haven't done enough testing yet.
Sent from my BlackBerry® wireless device
-Original Message-
From: Dmitriy Lyubimov
Date: Wed, 20 Apr 2011 08:09:29
To:
Reply-To: user@hbase.apache.org
Subject: Re: 0.90 latency performance, cdh3b4
Yep. In all bench
Yep. In all benchmarks, response times for tiny data start at about 1-2ms, but
not in our new setup. Which is why I am at a loss where to start looking.
Seems like network congestion, but it can't be. It's a bare-bones setup and
the admins tell me they have tested it for performance.
apologies for brevity.
For a tiny test like this, everything should be in memory and latency
should be very low.
On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov wrote:
> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov wrote:
>> for this
Also, we had another cluster running previous CDH versions with
pre-0.89 hbase, and the latencies weren't nearly as bad.
On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov wrote:
> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>
> On Tue, Apr 19, 2011 at 5:39 PM, Dmit
PS so what should latency be for reads in 0.90, assuming moderate thruput?
On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov wrote:
> for this test, there's just no more than 40 rows in every given table.
> This is just a laugh check.
>
> so i think it's safe to assume it all goes to same region
for this test, there are no more than 40 rows in any given table.
This is just a laugh check.
So i think it's safe to assume it all goes to the same region server.
But latency would not depend on which server the call is going to, would
it? Only throughput would, assuming we are not overloading.
An
How many regions? How are they distributed?
Typically it is good to fill the table somewhat and then drive some
splits and balance operations via the shell. One more split to make
the regions local and you should be good to go. Make sure you have
enough keys in the table to support these sp
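For example, from the shell (the table name is a placeholder, and the exact set
of balancer-related commands varies a bit by version):

hbase> split 'YOUR_TABLE'
hbase> status 'detailed'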
Hi,
I would like to see how i can attack hbase performance.
Right now i am shooting scans returning between 3 and 40 rows and,
regardless of data size, approximately 500-400 QPS. The data tables
are almost empty and in-memory, so they surely should fit in the 40% of
heap dedicated to them.
My loca