Hi Ben,

First, allow me to welcome you to the list! Stick around, I think you'll like it here. :)

How many nodes of Riak are you running vs. how many nodes of Mongo? How much more disk space did Riak take?

Riak is designed to run as a cluster of several nodes, using replication to provide resiliency and high availability during partial failure. By default, Riak stores three replicas of every object you persist. If you are only running a single node of Riak for testing purposes, I suspect this may explain much of the divergence you're seeing compared to a single Mongo node, since every replica in Riak is being written to the same disk.

Also, Snappy is optimized for speed rather than compression ratio, so its effect on total disk usage is small compared to that of other compression libraries such as zlib. That said, for sufficiently large JSON files I know that BSON's prefixes can add significant overhead to object sizes, such that BSON is actually heavier than the JSON it represents.

What is the average size of the documents you're seeking to store? Could you tell us a bit more about what you're trying to achieve with Riak and Mongo, respectively?

Tom

On Wed, Apr 10, 2013 at 12:39 AM, Ben McCann <b...@benmccann.com> wrote:

> Hi,
>
> I'm currently storing data in MongoDB and would like to evaluate Riak as
> an alternative. Riak is appealing to me because LevelDB uses Snappy, so I
> would expect it to take less disk space to store my data set than MongoDB
> which does not use compression. However, when I benchmarked it by inserting
> a few hundred thousand JSON records into each datastore, Riak in fact took
> far more disk space. I'm wondering if there's something I might be missing
> here as a newcomer to Riak. E.g. I checked the disk space used by running
> "du -ch /var/lib/riak/leveldb". Is this perhaps not a good way to check
> disk space usage because perhaps Riak/LevelDB preallocates files? (I know
> MongoDB does this and has a built-in db.collection.stats command to provide
> true disk usage information). Are there any other reasons why Riak might be
> taking more space or anything I could have screwed up?
>
> Thanks,
> Ben
>
> --
> about.me/benmccann
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
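
As a rough illustration of the replication point in the reply above, here is a back-of-the-envelope sketch in Python. It is not a Riak API; the function name, the 2 KB average document size, and the assumption that replicas spread evenly across nodes are all hypothetical and only meant to show why a single test node can hold about three times the raw data.

```python
def estimated_bytes_per_node(num_objects, avg_object_bytes,
                             n_val=3, physical_nodes=1,
                             compression_ratio=1.0):
    """Rough estimate of on-disk bytes per physical node.

    n_val is the number of replicas per object (Riak's default is 3).
    compression_ratio is compressed size / raw size (1.0 = no savings);
    the real value depends on the data and is an assumption here.
    Assumes replicas are spread evenly across the physical nodes.
    """
    total = num_objects * avg_object_bytes * n_val * compression_ratio
    return total / physical_nodes


# Hypothetical workload: 300,000 documents of about 2 KB each.
single_node = estimated_bytes_per_node(300_000, 2_000)
three_nodes = estimated_bytes_per_node(300_000, 2_000, physical_nodes=3)

print(f"single node: {single_node / 1e9:.1f} GB")
print(f"three nodes: {three_nodes / 1e9:.1f} GB per node")
```

On one node all three replicas share a disk, so the footprint is roughly triple the raw data before any compression savings. Spread across three nodes, each node holds about one copy's worth.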