Re: Riak compared to CouchDB

Alexander Sicular Fri, 28 Jan 2011 08:45:05 -0800

Jamie hits a lot of points; I'll cover some, mind the overlap.

Couch has a built-in indexing mechanism based on a b-tree. Every view gets its own b-tree. The larger your data set, the longer it takes to build the b-tree.

Couch also knows when the b-tree was built in relation to data on disk via internal sequence identifiers for each record. If your index is older than the data on disk, it will update the b-tree with the new data. Riak has no native indexing mechanism. If you want to reuse m/r results you need to stash them somewhere.
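A minimal sketch of the sequence-based refresh described above, assuming a toy in-memory index. `ToyView`, `refresh` and the `(seq, doc_id, doc)` change tuples are hypothetical names for illustration, and a sorted list stands in for the real b-tree:

```python
from bisect import insort


class ToyView:
    """Toy incremental index; a sorted list stands in for Couch's b-tree."""

    def __init__(self, map_fn):
        self.map_fn = map_fn   # doc -> iterable of keys to emit
        self.rows = []         # sorted (key, doc_id) pairs
        self.last_seq = 0      # highest update sequence already indexed

    def refresh(self, changes):
        """Apply only changes newer than last_seq.

        changes: iterable of (seq, doc_id, doc), ordered by seq.
        """
        for seq, doc_id, doc in changes:
            if seq <= self.last_seq:
                continue  # this update is already reflected in the index
            # Drop rows emitted by the older version of this document.
            self.rows = [row for row in self.rows if row[1] != doc_id]
            for key in self.map_fn(doc):
                insort(self.rows, (key, doc_id))
            self.last_seq = seq
        return self.rows


view = ToyView(lambda doc: [doc["type"]])
view.refresh([(1, "a", {"type": "x"}), (2, "b", {"type": "y"})])
# Later: only seq 3 is new, so only document "a" is re-indexed.
view.refresh([(1, "a", {"type": "x"}), (3, "a", {"type": "z"})])
```

The point of the sketch is only that the index remembers how far it has read, so a refresh costs time proportional to what changed since then, not to the whole data set.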

The subscription mechanism for changes is very nice but works because couch is not a distributed system. It is a replicated system. That feature should be worked into all RDBMSs. Very handy.

The big win for couch is that you can arrange it in all sorts of interesting topologies for replication. It can be used in offline systems that need to sync when they get back online.

The couch guys are also working to scale couch down so you can use couch on phones and other portable devices.

Like riak, couch is erlang based so you get all the erlang love. But unlike riak, couch is only accessible over http. Riak has protocol buffers and native access.

Couch uses a wol (write only log, i.e. append-only) like bitcask (the default backend for riak). They both need to be compacted to reclaim space. But unlike couch, riak can also use other backends in the same cluster, which gives you flexibility.
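To illustrate why an append-only log needs compaction, here is a small sketch. It is an in-memory toy, not the on-disk format of either Couch or bitcask, and `AppendOnlyStore` with its methods is a hypothetical name:

```python
class AppendOnlyStore:
    """Toy append-only store: updates never overwrite, they only append."""

    def __init__(self):
        self.log = []     # (key, value) records in write order
        self.index = {}   # key -> position of the newest record for that key

    def put(self, key, value):
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        return self.log[self.index[key]][1]

    def compact(self):
        """Rewrite the log keeping only the newest record per key."""
        newest = sorted(self.index.items(), key=lambda item: item[1])
        live = [(key, self.log[pos][1]) for key, pos in newest]
        self.log = live
        self.index = {key: i for i, (key, _) in enumerate(live)}


store = AppendOnlyStore()
store.put("a", 1)
store.put("a", 2)   # the old record for "a" is now dead weight
store.put("b", 3)
store.compact()     # log shrinks from 3 records to 2
```

Until `compact()` runs, every superseded record still occupies space, which is the growth both systems reclaim through compaction.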

And as hit on already, but imho the biggest difference between couch and riak is that in order to scale couch you need to implement a sharding layer to split your data between multiple couches (see bigcouch, lounge). Riak is a distributed system so all you need to do to scale riak is add more nodes. I once tweeted something like "couch: divide and conquer. Riak: one ring to rule them all."
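A rough sketch of the ring idea, under loud assumptions: keys are hashed with SHA-1 onto a fixed number of partitions, and partitions are handed to nodes round-robin. `NUM_PARTITIONS`, `partition_for`, `assign` and `node_for` are hypothetical names, and Riak's real partition-claim logic is more careful about how many partitions move when a node joins:

```python
import hashlib

RING_SIZE = 2 ** 160    # size of the SHA-1 output space
NUM_PARTITIONS = 64     # assumed partition count for this sketch


def partition_for(key: str) -> int:
    """Map a key to a partition; this never depends on the node list."""
    digest = int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)
    return digest // (RING_SIZE // NUM_PARTITIONS)


def assign(nodes):
    """Naive round-robin ownership: partition -> node (illustration only)."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def node_for(key: str, nodes) -> str:
    return assign(nodes)[partition_for(key)]


three = ["node1", "node2", "node3"]
four = three + ["node4"]
owner_before = node_for("user:42", three)
owner_after = node_for("user:42", four)   # same partition, possibly a new owner
```

The client-side rule (key to partition) stays fixed while ownership is reshuffled inside the cluster, which is why adding a node does not require an external sharding layer.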


Best, Alexander

@siculars on twitter
http://siculars.posterous.com

Sent from my iPhone

On Jan 28, 2011, at 8:29, Jamie Talbot <[email protected]> wrote:

Hey Joshua,
I'm relatively new to Riak, but have done quite a bit of investigation into CouchDB, so this is as much to confirm my own understanding as anything. With that disclaimer out of the way, here's what I understand about the two.
Couch has excellent database consistency - killing the server process dead won't lose you any data, and recovering after a crash is very quick. Fault tolerance I would say is Riak's biggest selling point, with the ability to configure how many nodes can fail before results can no longer be returned or written. You can kind of achieve fault tolerance with Couch by load-balancing behind a proxy, but it's a kludge compared to the fault-tolerance that is at the very heart of Riak.
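The "how many nodes can fail" knob can be sketched as simple arithmetic over Riak's replication settings (`n_val` replicas, `r` read responses, `w` write responses). This is a simplification that ignores hinted handoff and fallback nodes, and `tolerated_failures` is a hypothetical helper, not part of any Riak client:

```python
def tolerated_failures(n_val: int, r: int, w: int) -> dict:
    """How many of a key's n_val replicas can be down before requests fail.

    Simplified model: a read needs r replies and a write needs w replies
    out of n_val replicas; fallback nodes are ignored.
    """
    if not (1 <= r <= n_val and 1 <= w <= n_val):
        raise ValueError("r and w must be between 1 and n_val")
    return {
        "reads": n_val - r,
        "writes": n_val - w,
        # When r + w > n_val, every read quorum overlaps every write quorum.
        "quorums_overlap": r + w > n_val,
    }


print(tolerated_failures(n_val=3, r=2, w=2))
```

With three replicas and quorums of two, one replica can be lost for both reads and writes; lowering `r` or `w` trades overlap between read and write quorums for more tolerance.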
Both CouchDB and Riak have map/reduce functionality available through REST, using Erlang or JavaScript. With Couch, querying the data can be problematic though, especially on large sets of data, as you have to pre-define views of how you want to extract data and then wait for them to be built. It's certainly not true that you can just choose any old design and then figure things out later. Building views can take a long time - on a few hundred million rows in my sample, it took a number of weeks to build one relatively minor view (though hardware was quite limited). This makes RAD with CouchDB difficult, and was a significant business risk. The upside here is that once built, I could query 7 years of ISP data at year, month, day, hour, minute granularity, across any cross-section of services in a handful of milliseconds. This was incredible, and pretty addictive - it's lightning fast, for very specific use cases.
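The year/month/day/hour/minute drill-down can be sketched as a view whose key is a time tuple, grouped by a key prefix in the way CouchDB's `group_level` query parameter does. The document fields (`timestamp`, `bytes`) and the functions below are assumptions for illustration, and unlike Couch this toy recomputes on every query instead of reading a pre-built index:

```python
from collections import defaultdict


def map_doc(doc):
    """Emit one row per document, keyed by (year, month, day, hour, minute)."""
    yield tuple(doc["timestamp"]), doc["bytes"]


def query(docs, group_level):
    """Sum emitted values, grouping on the first group_level key parts."""
    totals = defaultdict(int)
    for doc in docs:
        for key, value in map_doc(doc):
            totals[key[:group_level]] += value
    return dict(totals)


docs = [
    {"timestamp": (2010, 12, 31, 23, 59), "bytes": 100},
    {"timestamp": (2011, 1, 28, 8, 29), "bytes": 250},
    {"timestamp": (2011, 1, 28, 8, 45), "bytes": 50},
]
by_year = query(docs, group_level=1)
by_hour = query(docs, group_level=4)
```

One key layout serves every granularity, which is why the queries are so fast once the view exists, and also why the key has to be chosen before the long build starts.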
The space requirements for Couch are enormous though, as updates and even deletes increase the size of the DB, until compacted. Riak too will use additional space to store duplicate copies of data on different nodes, to provide fault tolerance, though from my experiments the overhead is nothing like recent versions of CouchDB for my specific use cases. Your mileage will vary greatly, based on your configuration of Riak and the characteristics of your Couch views.
Riak, from what I understand, is not currently particularly well-suited to retrieving large amounts of data sequentially by key, but CouchDB works very quickly here, as long as you have defined a suitable view.
Couch does bi-directional replication, though I did find that a little flaky, sometimes dying for no reason. No data loss of course, and it did eventually sync, but frustrating nonetheless. This was as of the previous version. Riak does replication of data as part of its architecture, but if you want to scale to multiple datacentres, you need the enterprise, non-free version.
Scalability is hard with Couch, from what I can tell - certainly not the ability just to add a new node for better performance like you can with Riak. For me, this is a killer feature of Riak.
Couch has a nice subscription mechanism for changes to the database, which allows you to set triggers and the like. Don't be fooled by the talk of document versioning though - it is built in, but it is purely there so that MVCC (replication and concurrency) can work, and old versions of documents are specifically removed whenever the database is compacted.
This page has a high-level comparison of a number of NoSQL options, including Riak and CouchDB, which was generally considered to be pretty reasonable: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Hopefully that's a reasonable representation of the two systems. I will let more seasoned pros correct and expand on the above as necessary!
Cheers,

Jamie.

PS: Hello, list!
On Fri, Jan 28, 2011 at 21:44, Joshua Partogi <[email protected]> wrote:
Hi there.
Has anyone here done any comparison between Riak and CouchDB? I am interested to see how similar and different Riak is compared to Couch. If this can be added to the Riak wiki, I think it would be great for all of us here.
Thanks heaps.

Kind regards,
Joshua.
--
http://twitter.com/jpartogi

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

