Re: Cassandra vs HBase

stack Tue, 01 Sep 2009 15:56:33 -0700

Hey Jonathan:

On Tue, Sep 1, 2009 at 3:12 PM, Jonathan Ellis <[email protected]> wrote:


> The big win for Cassandra is that its p2p distribution model -- which
> drives the consistency model -- means there is no single point of
> failure.  SPOF can be mitigated by failover but it's really, really
> hard to get all the corner cases right with that approach.  Even
> Google with their 3 year head start and huge engineering resources
> still has trouble with that occasionally.  (See e.g.
> http://groups.google.com/group/google-appengine/msg/ba95ded980c8c179.)
>
>
It's hard to answer the above -- the claim that no SPOF beats failover
because failover will miss some corner cases, as though P2P were without
corners of its own -- so I'll pass on it.



> > + Cassandra does not have a natural sharding notion as there is in
> > HBase -- i.e. HBase Regions -- so hooking Cassandra to MapReduce is
> > awkward.
>
> Actually that's not a big deal -- the token ring is known, so you can
> break up at a coarse granularity there, and each node has a sampling
> of the keys stored on it thanks to the way the sstable indexing works,
> so generating hadoop input regions is pretty easy.  Jeff Hodges wrote
> a proof of concept over at
> https://issues.apache.org/jira/browse/CASSANDRA-342.
>

Thanks.  Yeah, I'd read that issue before making the comment.  It was my
reading of the issue that provoked my 'awkward' comment.
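
For anyone following along, the coarse split Jonathan describes (break up
the token ring per node, then refine using each node's key sample) could be
sketched roughly as below.  This is an illustration only, not the
CASSANDRA-342 patch: the node names, tokens, and both function names are
made up, and it assumes that sorting the sampled keys matches token order
(as with an order-preserving partitioner).

```python
from typing import Dict, List, Tuple


def ring_splits(node_tokens: Dict[str, int]) -> List[Tuple[int, int, str]]:
    """Return one (start_exclusive, end_inclusive, owner) range per node.

    Assumption: each node owns the range from the previous node's token
    (exclusive) up to its own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(node_tokens.items(), key=lambda kv: kv[1])
    splits = []
    for i, (node, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # i == 0 wraps to the last node
        splits.append((prev_token, token, node))
    return splits


def subdivide(sample: List[str], pieces: int) -> List[Tuple[str, str]]:
    """Cut a node's key sample into at most `pieces` (first_key, last_key) chunks."""
    keys = sorted(sample)
    if not keys or pieces < 1:
        return []
    size = -(-len(keys) // pieces)  # ceiling division
    return [
        (keys[i], keys[min(i + size, len(keys)) - 1])
        for i in range(0, len(keys), size)
    ]


if __name__ == "__main__":
    # Hypothetical three-node ring with small integer tokens.
    for split in ring_splits({"n1": 0, "n2": 100, "n3": 200}):
        print(split)
    # Hypothetical key sample held by one node.
    print(subdivide(["a", "b", "c", "d", "e"], 2))
```

Each tuple from ring_splits would map to one coarse input split; subdivide
shows how a node's key sample could cut that split into smaller ones.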



> > + The Cassandra fellas talk of their app being one ball of code only
> > whereas with HBase there is HDFS, ZooKeeper and then HBase itself
> > (Apparently it has less lines of code too).
>
> Opinions may differ, but I still think this is a huge win for
> troubleshooting.
>

The parenthetical was to poke fun at what, IMO, is a silly gauge for
comparing very different projects.

Go easy,
St.Ack
