HBase is about the same speed as Cassandra, or slightly faster.
Cassandra does a write by sending "W" requests out. HBase is 1 call,
but it sits on top of HDFS, so there are calls out to HDFS to persist
in a log. So the speeds should be about the same. I can get 100-300k
writes/sec to a cluster (19 n
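
To make the "W" requests point concrete, here is a rough sketch of a quorum-style write as a toy in-memory model. It is not Cassandra's client API or its actual coordinator logic; `Replica`, `quorum_write`, and the `up` flag are hypothetical names made up for illustration. It only shows a write counting as successful once at least `w` replicas acknowledge it.

```python
class Replica:
    """Toy in-memory replica; `up` simulates whether the node is reachable."""

    def __init__(self, up=True):
        self.up = up
        self.data = {}

    def write(self, key, value):
        if not self.up:
            return False  # a node that is down sends no acknowledgement
        self.data[key] = value
        return True


def quorum_write(replicas, key, value, w):
    """Send the write to every replica; report success if at least `w` acknowledge."""
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= w


# Three replicas, one of them unreachable.
replicas = [Replica(), Replica(up=False), Replica()]
ok_w2 = quorum_write(replicas, "row1", "v1", w=2)  # two acks are available
ok_w3 = quorum_write(replicas, "row1", "v1", w=3)  # three acks are not
```

With one replica down, the write succeeds at `w=2` and fails at `w=3`, which is the trade-off between write availability and the number of copies confirmed.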
Thanks guys - super helpful. My background is in p2p, but I adhere to
Martin Fowler's "First Law of Distributed Object Design" wherever
possible - Don’t distribute your objects! The timestamp trick for
avoiding hotspots makes a lot of sense, and it's tough to argue with
"hbase is faster," as I gene
Ah the classic. Well since you're on the HBase list, my suggestion is
going to have to be "use HBase". There are other advantages to HBase
over Cassandra:
- atomic row changes
- row locking
- increment value operation
- strong local consistency
- multiple versioning
- no possibility of corrupted
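
For anyone unsure what the row locking and increment items in the list above buy you, here is a minimal sketch using a toy in-memory table. This is not the HBase client API; `ToyTable` and its methods are hypothetical names for illustration. The assumption is simply that every read-modify-write on a row happens under that row's lock, so concurrent increments are never lost.

```python
import threading
from collections import defaultdict


class ToyTable:
    """Toy in-memory table with one lock per row (illustrative, not HBase)."""

    def __init__(self):
        self._rows = defaultdict(dict)
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()  # protects creation of row locks

    def _row_lock(self, row):
        with self._locks_guard:
            return self._locks[row]

    def increment(self, row, column, amount=1):
        # Read-modify-write under the row lock, so the update is atomic.
        with self._row_lock(row):
            new_value = self._rows[row].get(column, 0) + amount
            self._rows[row][column] = new_value
            return new_value

    def get(self, row, column):
        with self._row_lock(row):
            return self._rows[row].get(column)


def worker(table, n):
    for _ in range(n):
        table.increment("page1", "hits")


table = ToyTable()
threads = [threading.Thread(target=worker, args=(table, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = table.get("page1", "hits")
```

Four threads each add 1000 to the same counter, and because each increment holds the row lock, the final count is exactly the sum of all increments.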
Hi Adam,
I am not the person to answer, not having used Cassandra, but I have
spotted this being discussed on the list recently in a long thread:
Search for "Cassandra vs HBase" on this page:
http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200909.mbox/thread
There is also an article:
htt
Hi everyone, I'm implementing a new data layer and am struggling to
decide between HBase and Cassandra. The primary advantages of HBase as
far as I can tell are:
1) Tighter integration with Hadoop, making it easier to run M/R for
reporting and analytics
2) Better caching layer
Cassandra's Thrift