Re: Riak compared to CouchDB

Mark Phillips Fri, 28 Jan 2011 16:43:12 -0800

So I cobbled together the beginnings of a wiki page for a Riak and
CouchDB comparison based on this thread. It's in a branch called
"riak-couchdb-comparison" in the wiki repo.


https://github.com/basho/riak_wiki/tree/riak-couchdb-comparison

As you'll see, it's _very_ sparse, and needs a lot of love and
expanding from people who know CouchDB better than I do.

I've also added a corresponding issue to the wiki repo:

https://github.com/basho/riak_wiki/issues/issue/22

A little help fleshing this out would be much appreciated.

Thanks,

Mark



On Fri, Jan 28, 2011 at 8:44 AM, Alexander Sicular <[email protected]> wrote:
> Jamie hits a lot of points; I'll cover some, so mind the overlap.
> Couch has a built-in indexing mechanism based on B-trees. Every view gets
> its own B-tree. The larger your data set, the longer it takes to build the
> B-tree.
> Couch also knows when the B-tree was built in relation to data on disk via
> internal sequence identifiers for each record. If your index is older than
> the data on disk, it will update the B-tree with the new data. Riak has no
> native indexing mechanism. If you want to reuse m/r results you need to stash
> them somewhere.
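
To make the sequence-identifier idea concrete, here is a minimal sketch in Python. It is only an illustration, not CouchDB's implementation: the `ViewIndex` class, its method names, and the use of a sorted list in place of a real B-tree are all assumptions made for this example.

```python
from bisect import insort


class ViewIndex:
    """Toy per-view index; a sorted list stands in for the B-tree (assumption)."""

    def __init__(self, map_fn):
        self.map_fn = map_fn   # doc -> list of (key, value) pairs
        self.rows = []         # sorted (key, doc_id, value) tuples
        self.last_seq = 0      # highest change sequence already indexed

    def update(self, changes):
        """Index only changes newer than last_seq.

        `changes` is an iterable of (seq, doc_id, doc) in increasing seq order.
        Returns the number of changes that had to be processed.
        """
        processed = 0
        for seq, doc_id, doc in changes:
            if seq <= self.last_seq:
                continue  # already reflected in the index
            # Drop rows emitted by the older version of this doc, then re-emit.
            self.rows = [row for row in self.rows if row[1] != doc_id]
            for key, value in self.map_fn(doc):
                insort(self.rows, (key, doc_id, value))
            self.last_seq = seq
            processed += 1
        return processed
```

The point of the sketch is that a second call to `update` with the same change list does no work, and a later call only touches documents changed since `last_seq`.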
>
> The subscription mechanism for changes is very nice, but it works because Couch
> is not a distributed system. It is a replicated system. That feature should
> be worked into all RDBMSs. Very handy.
> The big win for Couch is that you can arrange it in all sorts of interesting
> topologies for replication. It can be used in offline systems that need to
> sync when they get back online.
> The Couch guys are also working to scale Couch down so you can use it on
> phones and other portable devices.
> Like Riak, Couch is Erlang-based, so you get all the Erlang love. But unlike
> Riak, Couch is only accessible over HTTP. Riak also has Protocol Buffers and
> native access.
>
> Couch uses an append-only log file, like Bitcask (the default backend for
> Riak). Both need to be compacted to reclaim space. But unlike Couch,
> Riak can also use other backends in the same cluster, which gives you
> flexibility.
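
A rough sketch of why an append-only store grows until it is compacted, assuming a toy in-memory log. The `AppendOnlyStore` class is hypothetical and does not reflect the on-disk format of either CouchDB or Bitcask.

```python
class AppendOnlyStore:
    """Toy append-only key/value store (illustrative assumption only)."""

    def __init__(self):
        self.log = []    # every write ever made, as (key, value); None = tombstone
        self.index = {}  # key -> position of the latest live record in the log

    def put(self, key, value):
        self.index[key] = len(self.log)
        self.log.append((key, value))   # old versions stay in the log

    def delete(self, key):
        self.log.append((key, None))    # even a delete appends a record
        self.index.pop(key, None)

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def compact(self):
        """Rewrite the log keeping only the latest live record per key."""
        live = sorted(self.index.items(), key=lambda item: item[1])
        self.log = [(key, self.log[pos][1]) for key, pos in live]
        self.index = {key: i for i, (key, _) in enumerate(self.log)}
```

Updating one key twice and deleting another leaves four records in the log but only one live value; `compact()` is what brings the log back down to that one record.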
> As already mentioned, imho the biggest difference between Couch and Riak
> is that in order to scale Couch you need to implement a sharding layer to
> split your data between multiple Couches (see BigCouch, Lounge). Riak is a
> distributed system, so all you need to do to scale Riak is add more nodes. I
> once tweeted something like "Couch: divide and conquer. Riak: one ring to
> rule them all."
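
The "just add more nodes" property can be illustrated with a generic consistent-hashing ring. This is a sketch under stated assumptions: the `Ring` class, the node names, and the choice of MD5 with 64 points per node are all made up for the example, and Riak's actual ring and partition-claim logic differ in detail.

```python
import bisect
import hashlib


def _hash(text):
    # Any stable hash works for the illustration; MD5 is an arbitrary choice.
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


class Ring:
    """Generic consistent-hash ring (not Riak's real claim algorithm)."""

    def __init__(self, nodes, points_per_node=64):
        self.points_per_node = points_per_node
        self.owner_of = {}   # point on the ring -> node name
        self.hashes = []     # sorted ring points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.points_per_node):
            self.owner_of[_hash(f"{node}#{i}")] = node
        self.hashes = sorted(self.owner_of)

    def owner(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        i = bisect.bisect(self.hashes, _hash(key)) % len(self.hashes)
        return self.owner_of[self.hashes[i]]


keys = [f"user{i}" for i in range(1000)]
ring = Ring(["node1", "node2", "node3", "node4"])
before = {k: ring.owner(k) for k in keys}
ring.add_node("node5")
moved = [k for k in keys if ring.owner(k) != before[k]]
```

Only a fraction of the keys change owner when `node5` joins, and every key that moves lands on the new node. With naive `hash(key) % node_count` sharding, most keys would move instead.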
> Best, Alexander
> @siculars on twitter
> http://siculars.posterous.com
> Sent from my iPhone
> On Jan 28, 2011, at 8:29, Jamie Talbot <[email protected]> wrote:
>
> Hey Joshua,
> I'm relatively new to Riak, but have done quite a bit of investigation into
> CouchDB, so this is as much to confirm my own understanding as anything.
>  With that disclaimer out of the way, here's what I understand about the
> two.
> Couch has excellent database consistency - killing the server process dead
> won't lose you any data, and recovering after a crash is very quick.  Fault
> tolerance I would say is Riak's biggest selling point, with the ability to
> configure how many nodes can fail before results can no longer be returned
> or written.  You can kind of achieve fault tolerance with Couch by
> load-balancing behind a proxy, but it's a kludge compared to the
> fault-tolerance that is at the very heart of Riak.
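
As a back-of-the-envelope illustration of "how many nodes can fail", here is the arithmetic for a strict quorum model with N replicas, R required read responses, and W required write acknowledgements. The function name is hypothetical, and the model is a simplification: it ignores mechanisms such as hinted handoff that can change the picture in a real Riak cluster.

```python
def tolerable_failures(n, r, w):
    """Replica failures survivable for reads and writes under strict quorums.

    n: number of replicas of each object
    r: replicas that must answer a read
    w: replicas that must acknowledge a write
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {"reads": n - r, "writes": n - w}


# Three replicas, quorum of two for both reads and writes.
print(tolerable_failures(3, 2, 2))  # {'reads': 1, 'writes': 1}
```

Lowering R or W buys more tolerance for failed replicas at the cost of weaker guarantees about what a read will see.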
> Both CouchDB and Riak have map/reduce functionality available through REST,
> using Erlang or JavaScript.  With Couch, querying the data can be
> problematic though, especially on large data sets, as you have to
> pre-define views of how you want to extract data and then wait for them to
> be built.  It's certainly not true that you can just choose any old design
> and then figure things out later.  Building views can take a long time - on
> a few hundred million rows in my sample, it took a number of weeks to build
> one relatively minor view (though hardware was quite limited).  This makes
> rapid application development (RAD) with CouchDB difficult, and was a
> significant business risk.  The upside
> here is that once built, I could query 7 years of ISP data at year, month,
> day, hour, minute granularity, across any cross-section of services in a
> handful of milliseconds.  This was incredible, and pretty addictive - it's
> lightning fast, for very specific use cases.
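
The year/month/day/hour/minute querying described above maps naturally onto a view whose keys are arrays of date components, reduced at a chosen `group_level`. The sketch below simulates that idea in plain Python. The document fields `time` and `bytes` are hypothetical, and unlike CouchDB this toy recomputes the totals on every query instead of reading pre-built reductions from the view index.

```python
from collections import defaultdict
from datetime import datetime


def map_usage(doc):
    """Emit one row keyed by [year, month, day, hour, minute]."""
    t = datetime.fromisoformat(doc["time"])
    yield [t.year, t.month, t.day, t.hour, t.minute], doc["bytes"]


def query(docs, group_level):
    """Sum emitted values, grouping on the first `group_level` key parts."""
    totals = defaultdict(int)
    for doc in docs:
        for key, value in map_usage(doc):
            totals[tuple(key[:group_level])] += value
    return dict(sorted(totals.items()))


docs = [
    {"time": "2010-03-01T10:15:00", "bytes": 100},
    {"time": "2010-03-01T10:45:00", "bytes": 50},
    {"time": "2010-04-02T09:00:00", "bytes": 25},
]
print(query(docs, 1))  # {(2010,): 175}
print(query(docs, 2))  # {(2010, 3): 150, (2010, 4): 25}
```

Changing `group_level` is all it takes to move between yearly and per-minute totals, which is why one well-chosen view can serve many granularities.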
> The space requirements for Couch are enormous though, as updates and even
> deletes increase the size of the DB, until compacted.  Riak too will use
> additional space to store duplicate copies of data on different nodes, to
> provide fault tolerance, though from my experiments the overhead is nothing
> like that of recent versions of CouchDB for my specific use cases.  Your mileage
> will vary greatly, based on your configuration of Riak and the
> characteristics of your Couch views.
> Riak, from what I understand, is not currently particularly well-suited to
> retrieving large amounts of data sequentially by key, but CouchDB works very
> quickly here, as long as you have defined a suitable view.
> Couch does bi-directional replication, though I did find that a little
> flaky, sometimes dying for no reason.  No data loss of course, and it did
> eventually sync, but frustrating nonetheless.  This was as of the previous
> version.  Riak does replication of data as part of its architecture, but if
> you want to scale to multiple datacentres, you need the enterprise, non-free
> version.
> Scalability is hard with Couch, from what I can tell - certainly not the
> ability just to add a new node for better performance like you can with
> Riak.  For me, this is a killer feature of Riak.
> Couch has a nice subscription mechanism for changes to the database, which
> allows you to set triggers and the like.  Don't be fooled by the talk of
> document versioning though - it is built in, but it exists purely to support
> MVCC (multi-version concurrency control, which underpins replication and
> concurrency), and old versions of documents are specifically removed whenever
> the database is compacted.
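
A small sketch of why revisions are a concurrency device and not a version history. Everything here is an assumption for illustration: the `Db` class is hypothetical, and integer revisions stand in for CouchDB's real revision identifiers.

```python
class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class Db:
    """Toy MVCC document store (illustrative only)."""

    def __init__(self):
        self.revs = {}  # doc_id -> list of (rev, body), oldest first

    def put(self, doc_id, body, rev=None):
        history = self.revs.get(doc_id, [])
        current = history[-1][0] if history else None
        if rev != current:
            raise Conflict(f"expected rev {current}, got {rev}")
        new_rev = (current or 0) + 1
        self.revs[doc_id] = history + [(new_rev, body)]
        return new_rev

    def get(self, doc_id, rev=None):
        history = self.revs[doc_id]
        if rev is None:
            return history[-1]
        for entry in history:
            if entry[0] == rev:
                return entry
        raise KeyError(f"rev {rev} of {doc_id} is gone")

    def compact(self):
        # Only the latest revision of each document survives compaction.
        for doc_id in self.revs:
            self.revs[doc_id] = self.revs[doc_id][-1:]
```

An old revision can be fetched right after an update, but once `compact()` has run it is gone, so relying on revisions as an audit trail would not work.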
> This page has a high-level comparison of a number of NoSQL options,
> including Riak and CouchDB, which was generally considered to be pretty
> reasonable: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
> Hopefully that's a reasonable representation of the two systems.  I will let
> more seasoned pros correct and expand on the above as necessary!
> Cheers,
> Jamie.
> PS: Hello, list!
> On Fri, Jan 28, 2011 at 21:44, Joshua Partogi <[email protected]> wrote:
>>
>> Hi there.
>>
>> Has anyone here done any comparison between Riak and CouchDB? I am
>> interested to see how similar and different Riak is compared to Couch. If this
>> can be added to the Riak wiki, I think it would be great for all of us here.
>>
>> Thanks heaps.
>>
>> Kind regards,
>> Joshua.
>> --
>> http://twitter.com/jpartogi
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>

