I'm curious about this problem. I apologize in advance if I say something wrong.
Isn't it possible that the latency is due to client-side serialization of the
sends and receives? Only one TCP connection is established to each region
server per client process, regardless of how many HTable objects are created.
That is, many threads share one connection to each region server even if they
use different HTable instances, so the sends from different threads are
processed one by one, and so are the receives. For example, the sends are
serialized in HBaseClient.java as follows:
protected void sendParam(Call call) {
  ...
  synchronized (this.out) { // FindBugs IS2_INCONSISTENT_SYNC
    // send data via TCP connection
    ...
So, I guess the latency is due to the HBase client library implementation,
not to the application. If my understanding is correct, I hope this can be
improved.
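As a toy demonstration of the effect (plain Java, not the HBase code itself;
the 1 ms sleep is my stand-in for the time to write one request to the
socket), here is how a single shared lock inflates per-call latency as the
thread count grows:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedLockDemo {
  // Stands in for the single connection's output stream (this.out).
  private static final Object out = new Object();

  // Stands in for sendParam(): every caller must take the same lock.
  static void send() throws InterruptedException {
    synchronized (out) {
      Thread.sleep(1); // stand-in for writing one request to the socket
    }
  }

  public static void main(String[] args) throws Exception {
    final int callsPerThread = 50;
    for (int threads : new int[] {10, 100, 250}) {
      ExecutorService pool = Executors.newFixedThreadPool(threads);
      CountDownLatch done = new CountDownLatch(threads);
      long start = System.nanoTime();
      for (int t = 0; t < threads; t++) {
        pool.execute(() -> {
          try {
            for (int i = 0; i < callsPerThread; i++) {
              send();
            }
          } catch (InterruptedException ignored) {
            // not expected in this demo
          } finally {
            done.countDown();
          }
        });
      }
      done.await();
      // Each thread issues its calls sequentially, so its average
      // per-call latency is roughly the total elapsed time divided
      // by the number of calls it made.
      double avgMs = (System.nanoTime() - start) / 1e6 / callsPerThread;
      System.out.printf("%d threads: %.1f ms avg per call%n", threads, avgMs);
      pool.shutdown();
    }
  }
}

Under these assumptions the average per-call latency grows roughly in
proportion to the thread count, which matches the shape of the numbers
reported in the quoted message below.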
I think the solution is to open as many TCP connections to each region server
as there are processor cores. For example, if the client machine has 8 cores
and there are three region servers, each application process would have at
most 24 TCP connections (8 per region server). The application threads would
then use the 8 connections to a given region server in a round-robin fashion,
as in the sketch below.
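A minimal sketch of what I have in mind (hypothetical names, not the actual
HBase client code): keep an array of sockets per region server and pick one
per call with an atomic counter:

import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin pool; one instance per region server.
final class RoundRobinPool {
  private final Socket[] connections;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinPool(String host, int port, int size) throws IOException {
    connections = new Socket[size];
    for (int i = 0; i < size; i++) {
      connections[i] = new Socket(host, port);
    }
  }

  Socket pick() {
    // floorMod keeps the index non-negative even after the counter
    // wraps past Integer.MAX_VALUE.
    return connections[Math.floorMod(next.getAndIncrement(), connections.length)];
  }
}

With size set to Runtime.getRuntime().availableProcessors() (8 here), each
send would synchronize on one of 8 output streams instead of the single
shared this.out, and the receives could be demultiplexed per connection in
the same way.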
Regards,
Maumau
----- Original Message -----
From: "tsuna" <[email protected]>
Sent: Thursday, September 09, 2010 7:52 AM
On Thu, Aug 19, 2010 at 10:15 AM, Abhijit Pol <[email protected]>
wrote:
We are using the HBase 0.20.5 drop with the latest Cloudera Hadoop distribution.
- We are hitting a 3-node HBase cluster from a client that has 10
threads, each with a thread-local copy of the HTable client object and
an established connection to the server.
- Each of the 10 threads issues 10,000 read requests for keys randomly
selected from a pool of 1,000 keys (a sketch of this loop appears below).
All keys are present in HBase and the table is pinned in memory (to make
sure we don't have any disk seeks).
- If we run this test with 10 threads, we get an average latency as seen
by the client of 8 ms (excluding the setup time of the initial 10
connections). But if we increase the number of threads to 100, 250, and
500, we get increasing latencies of 26 ms, 51 ms, and 90 ms respectively.
- We have enabled HBase metrics on the region servers, and in all tests we
consistently see "get_avg_time" between 5 and 15 ms on every one of them.
Is this expected? Any tips to get consistent performance below 20ms?
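A minimal sketch of the read loop described above (hypothetical table name;
HBase 0.20-era client API; each thread uses its own HTable, as in the setup):

import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadBenchmark {
  static final int THREADS = 10;             // vary: 10, 100, 250, 500
  static final int READS_PER_THREAD = 10000;
  static final int KEY_POOL = 1000;

  public static void main(String[] args) throws Exception {
    final AtomicLong totalNanos = new AtomicLong();
    Thread[] workers = new Thread[THREADS];
    for (int t = 0; t < THREADS; t++) {
      workers[t] = new Thread(() -> {
        try {
          // Thread-local HTable, as in the test setup above.
          HTable table = new HTable(new HBaseConfiguration(), "test_table");
          Random rnd = new Random();
          for (int i = 0; i < READS_PER_THREAD; i++) {
            byte[] key = Bytes.toBytes("key-" + rnd.nextInt(KEY_POOL));
            long start = System.nanoTime();
            table.get(new Get(key)); // one random read
            totalNanos.addAndGet(System.nanoTime() - start);
          }
        } catch (Exception e) {
          e.printStackTrace();
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) w.join();
    System.out.printf("avg client-side latency: %.2f ms%n",
        totalNanos.get() / 1e6 / (THREADS * READS_PER_THREAD));
  }
}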
If the server-side latency consistently remains around 5-15 ms but the
client-side latency shoots through the roof, you may be experiencing
lock contention or some other problem, potentially even unrelated to
HBase, in your application. Maybe your application is generating too
much garbage and the GC has to run too frequently. Maybe you have so
many threads that you're thrashing the caches of the CPUs you're
running on.
Adding more threads only makes things faster up to a certain point.
Past that point, things actually become slower.