Can anyone shed some light on the state of HBase's memcaching? Cheers, Joost.
I'm working on a web application with primarily read-oriented
performance requirements. I've been running some benchmarking tests
that include our application layer, to get a sense of what is possible
with HBase. In a variation on the Bigtable test that is reproduced by
org.apache.hadoop.hbase.PerformanceEvaluation, I'm randomly reading 1
column from a table with 1 million rows. In our case, the contents of
that column need to be deserialized by our application (which adds some
overhead that I'm also trying to measure). The deserialized contents
represent a little over 1K of data.
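For readers who want to reproduce this kind of measurement, here is a minimal sketch of a multi-threaded random-read harness in Python. It is not the benchmark used in this post: `run_random_reads`, `fake_read`, and the in-memory `table` are hypothetical stand-ins, and a real test would swap `fake_read` for an actual client call plus the application's deserialization step.

```python
import random
import threading
import time

def run_random_reads(read_row, num_rows, reads_per_thread, num_threads):
    """Issue random reads from several threads and return overall reads/sec."""
    def worker(seed):
        rng = random.Random(seed)  # per-thread generator, no shared state
        for _ in range(reads_per_thread):
            read_row(rng.randrange(num_rows))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return (reads_per_thread * num_threads) / elapsed

# Hypothetical stand-in for the real store: an in-memory dict of 1K values.
table = {i: b"x" * 1024 for i in range(10_000)}

def fake_read(row_key):
    # A real test would fetch the column from the store and deserialize it.
    return table[row_key]

throughput = run_random_reads(fake_read, num_rows=len(table),
                              reads_per_thread=1_000, num_threads=4)
print(f"{throughput:.0f} reads/sec")
```

With the in-memory stand-in the number mostly reflects interpreter overhead. The harness only becomes meaningful once `read_row` blocks on a real network or disk read, which is also when extra threads start to overlap.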
Although a single thread can only achieve 125 reads per second, with 12
client threads (from 3 different machines) I'm able to read as many as
500 objects per second. Now, I've replicated my test on a basic MySQL
table and am able to get a throughput of 2,300 reads/sec; roughly 5
times what I'm seeing with HBase. Besides the obvious difference in code
maturity, is the discrepancy related to random reads not actually being
served from memcache, but rather from disk, by HBase? The HBase
performance page
(http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation) shows random
reads(mem) as "Not implemented."
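The figures quoted above can be turned into a quick back-of-the-envelope calculation. The Python sketch below uses only the numbers from this post. The per-read latency under load is an estimate that assumes each of the 12 client threads issues its reads back-to-back with no think time.

```python
# Figures quoted in the post.
single_thread_rate = 125   # reads/sec with one client thread
client_threads = 12
hbase_rate = 500           # reads/sec with 12 client threads
mysql_rate = 2300          # reads/sec on the MySQL table

# Average time per read seen by a single thread.
latency_single_ms = 1000 / single_thread_rate

# Throughput if 12 threads scaled perfectly, and how close the test got.
ideal_rate = single_thread_rate * client_threads
scaling_efficiency = hbase_rate / ideal_rate

# Estimated per-read latency under load (assumes back-to-back reads per thread).
latency_loaded_ms = 1000 * client_threads / hbase_rate

# How much faster the MySQL test ran.
mysql_ratio = mysql_rate / hbase_rate

print(latency_single_ms, ideal_rate, round(scaling_efficiency, 2),
      latency_loaded_ms, mysql_ratio)
```

Under these assumptions a single read takes about 8 ms, while 12 threads reach only a third of the ideal 1,500 reads/sec, so each read takes roughly 24 ms under load. That is consistent with the reads contending for something on the server side, although the numbers alone do not say whether that is disk.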
- HBase Random Read Performance Joost Ouwerkerk
