Re: hbase versus cassandra

2009-11-23 Thread Ryan Rawson
HBase is about the same speed as Cassandra, or slightly faster. Cassandra does a write by sending "W" requests out. HBase is 1 call, and that overlays HDFS, so there are calls out to HDFS to persist in a log. So the speeds should be about the same. I can get 100-300k writes/sec to a cluster (19 n

Re: hbase versus cassandra

2009-11-23 Thread Adam Fisk
Thanks guys - super helpful. My background is in p2p, but I adhere to Martin Fowler's "First Law of Distributed Object Design" wherever possible - Don’t distribute your objects! The timestamp trick for avoiding hotspots makes a lot of sense, and it's tough to argue with "hbase is faster," as I gene

Re: hbase versus cassandra

2009-11-23 Thread Ryan Rawson
Ah, the classic. Well, since you're on the HBase list, my suggestion is going to have to be "use HBase". There are other advantages to HBase over Cassandra: - atomic row changes - row locking - increment value operation - strong local consistency - multiple versioning - no possibility of corrupted

Re: hbase versus cassandra

2009-11-23 Thread Tim Robertson
Hi Adam, I am not the right person to answer, not having used Cassandra, but I spotted this being discussed on the list recently in a long thread. Search for "Cassandra vs HBase" on this page: http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200909.mbox/thread There is also an article: htt

hbase versus cassandra

2009-11-23 Thread Adam Fisk
Hi Everyone - I'm implementing a new data layer and am struggling to decide between HBase and Cassandra. The primary advantages of HBase as far as I can tell are: 1) Tighter integration with Hadoop, making it easier to run M/R for reporting and analytics 2) Better caching layer. Cassandra's Thrift