I would like to remind everyone that the original BigTable design includes a scan cache
to take care of random reads, and this
important feature is still missing in HBase.
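
For illustration, such a cache sits above the block cache and stores individual results keyed by row, rather than whole 64k blocks, so a random read that hits it costs no block I/O at all. A minimal LRU sketch, not HBase code; all names are made up:

    // Minimal sketch of a KV-level "scan cache": an LRU map keyed by row,
    // holding individual results instead of whole 64k HFile blocks.
    // Hypothetical illustration only; not part of the HBase code base.
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class ScanCache<K, V> extends LinkedHashMap<K, V> {
      private final int maxEntries;

      public ScanCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration gives LRU eviction
        this.maxEntries = maxEntries;
      }

      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict least-recently-used entry
      }
    }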

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [email protected]

________________________________________
From: lars hofhansl [[email protected]]
Sent: Saturday, June 29, 2013 3:24 PM
To: [email protected]
Subject: Re: Poor HBase random read performance

I should also say that random reads done this way are something of a worst-case
scenario.

If the working set is much larger than the block cache and the reads are 
random, then each read will likely have to bring in an entirely new block from 
the OS cache,
even when the KVs are much smaller than a block.

So in order to read a (say) 1k KV, HBase needs to bring in 64k (the default block size)
from the OS cache, a roughly 64x read amplification.
As long as the dataset fits into the block cache this difference in size has no
performance impact, but as soon as the dataset does not fit, we have to bring
much more data in from the OS cache than we're actually interested in.

Indeed, in my test I found that HBase brings in about 60x the data size from the
OS cache (I used PE with ~1k KVs). This can be improved with smaller block sizes,
and with a more efficient way to instantiate HFile blocks in Java (which we
need to work on).
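
For illustration, here is roughly how the block size could be lowered per column family with the 0.94-era client API; the table and family names are made up:

    // Sketch: lower the HFile block size for a column family so each random
    // read drags in less data. Names are illustrative, not from the test.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class BlockSizeExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HColumnDescriptor family = new HColumnDescriptor("info");
        family.setBlocksize(8 * 1024); // 8k instead of the 64k default
        admin.disableTable("TestTable");
        admin.modifyColumn("TestTable", family);
        admin.enableTable("TestTable");
        admin.close();
      }
    }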


-- Lars

________________________________
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Saturday, June 29, 2013 3:09 PM
Subject: Re: Poor HBase random read performance


I've seen the same bad performance behavior when I tested this on a real 
cluster. (I think it was in 0.94.6)


Instead of en-/disabling the block cache, I tested sequential and random reads on
a data set that does not fit into the (aggregate) block cache.
Sequential reads were drastically faster than random reads (7 vs. 34 minutes),
which can really only be explained by the fact that the next get will, with
high probability, hit an already cached block, whereas in the random-read case
it likely will not.

In the random-read case I estimate that each RegionServer brings in between 100
and 200 MB/s from the OS cache. Even at 200 MB/s this would be quite slow. I
understand that performance is bad when index/bloom blocks are not cached, but
bringing in data blocks from the OS cache should be faster than it is.


So this is something to debug.

-- Lars



________________________________
From: Varun Sharma <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Saturday, June 29, 2013 12:13 PM
Subject: Poor HBase random read performance


Hi,

I was doing some tests on how good HBase random reads are. The setup
consists of a 1-node cluster with dfs replication set to 1. Short-circuit
local reads and HBase checksums are enabled. The data set is small enough
to be largely cached in the filesystem cache - 10G on a 60G machine.
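
For reference, the relevant keys look roughly like this (as of ~HBase 0.94 / CDH 4.2; in a real deployment they live in hbase-site.xml on the region server, and they are set programmatically here only to name them):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class TestSettings {
      public static void main(String[] args) {
        // These keys normally go in hbase-site.xml on the region server;
        // shown via the Configuration API just to spell them out.
        Configuration conf = HBaseConfiguration.create();
        conf.setBoolean("dfs.client.read.shortcircuit", true);       // short-circuit local reads
        conf.setBoolean("hbase.regionserver.checksum.verify", true); // HBase-level checksums
      }
    }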

The client sends out multi-get operations in batches of 10, and I try to measure
throughput.
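
A rough sketch of such a client loop against the 0.94 client API (table name, family, and row-key scheme are made up, not the actual test harness):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetBench {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "TestTable"); // hypothetical table name
        Random rnd = new Random();
        long stop = System.currentTimeMillis() + 120 * 1000L; // 120-second window
        long ops = 0;
        while (System.currentTimeMillis() < stop) {
          List<Get> batch = new ArrayList<Get>(10);
          for (int i = 0; i < 10; i++) {
            // ~10M rows of ~1k each would match the 10G data set described above.
            batch.add(new Get(Bytes.toBytes("row-" + rnd.nextInt(10000000))));
          }
          Result[] results = table.get(batch); // one multi-get batch of 10
          ops += results.length;
        }
        System.out.println("Throughput = " + (ops / 120) + " reads/sec");
        table.close();
      }
    }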

Test #1

All data was cached in the block cache.

Test Time = 120 seconds
Num Read Ops = 12M

Throughput = 100K per second

Test #2

I disable the block cache, but now all the data is in the file system cache. I
verify this by making sure that IOPS on the disk drive are 0 during the
test. I run the same test with batched ops.

Test Time = 120 seconds
Num Read Ops = 0.6M
Throughput = 5K per second
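
For reference, a sketch of two ways data blocks can be kept out of the block cache with the 0.94-era API (row key and class name are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NoBlockCache {
      public static void main(String[] args) {
        // Per-request: don't cache the blocks read for this Get.
        Get get = new Get(Bytes.toBytes("some-row")); // hypothetical row key
        get.setCacheBlocks(false);

        // Global: give the block cache 0% of the heap, i.e. the
        // hbase-site.xml setting hfile.block.cache.size = 0.
        Configuration conf = HBaseConfiguration.create();
        conf.setFloat("hfile.block.cache.size", 0f);
      }
    }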

Test #3

I saw that all the threads were now stuck in idLock.lockEntry(). So I now run
with the lock disabled and the block cache disabled.

Test Time = 120 seconds
Num Read Ops = 1.2M
Throughput = 10K per second

Test #4

I re-enable the block cache and this time hack HBase to cache only index and
bloom blocks, while data blocks come from the file system cache.

Test Time = 120 seconds
Num Read Ops = 1.6M
Throughput = 13K per second

So I wonder why there is such a massive drop in throughput. I know that the HDFS
code adds tremendous overhead, but this seems pretty high to me. I use
0.94.7 and CDH 4.2.0.

Thanks
Varun
