Re: HBase Random Read Performance

stack Thu, 07 Feb 2008 21:08:29 -0800

The test described can only favor MySQL (single column, just a million rows). Do you need HBase? You might also tell us more about your HBase setup. Is it using localfs or hdfs? Is it a distributed hdfs or all on a single server?


Thanks,
St.Ack






Joost Ouwerkerk wrote:

I'm working on a web application with primarily read-oriented performance requirements. I've been running some benchmarking tests that include our application layer, to get a sense of what is possible with HBase. In a variation on the Bigtable test that is reproduced by org.apache.hadoop.hbase.PerformanceEvaluation, I'm randomly reading 1 column from a table with 1 million rows. In our case, the contents of that column need to be deserialized by our application (which adds some overhead that I'm also trying to measure); the deserialized contents represent a little over 1K of data.
Although a single thread can only achieve 125 reads per second, with 12 client threads (from 3 different machines) I'm able to read as many as 500 objects per second. Now, I've replicated my test on a basic MySQL table and am able to get a throughput of 2,300 reads/sec, roughly 5 times what I'm seeing with HBase. Besides the obvious code maturity thing, is the discrepancy related to random reads not actually being served from memcache, but rather from the disk, by HBase? The HBase performance page (http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation) shows random reads (mem) as "Not implemented."
Can anyone shed some light on the state of HBase's memcaching?
Cheers,
Joost.
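
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a multi-threaded random-read harness. It is not the code used in the test above and it does not call HBase or MySQL: `read_row` is a hypothetical callable standing in for whatever client call fetches and deserializes one row, and the in-memory dict at the bottom is only a placeholder store so the sketch runs on its own.

```python
import random
import threading
import time


def run_random_reads(read_row, num_rows, reads_per_thread, num_threads, seed=0):
    """Issue random reads from several threads.

    read_row is a hypothetical stand-in for the real client call (for
    example an HBase get or a MySQL SELECT by key). It takes a row index.
    Returns (total_reads, reads_per_second).
    """
    counts = [0] * num_threads  # one slot per thread, so no lock is needed

    def worker(thread_id):
        # Per-thread generator keeps the access pattern reproducible.
        rng = random.Random(seed + thread_id)
        for _ in range(reads_per_thread):
            read_row(rng.randrange(num_rows))
            counts[thread_id] += 1

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    total = sum(counts)
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate


# Placeholder store: an in-memory dict of ~1K values, NOT HBase or MySQL.
table = {i: b"x" * 1024 for i in range(10_000)}
total, rate = run_random_reads(table.__getitem__, len(table), 1_000, 4)
print(f"{total} reads, {rate:.0f} reads/sec")
```

Swapping the placeholder for a real client call, and running the same harness against both stores with the same row count and thread count, keeps the comparison like-for-like. Timing the call with and without deserialization would separate the application overhead from the store's own latency.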
