Greetings all, I'm a riak newbie, trying out version 0.12.1. One of the use-cases we're interested in is using riak as a backend for brackup[1], an open source backup tool.
brackup supports pluggable targets/backends, including filesystems, FTP, SFTP, Amazon S3, etc. I've written a first-pass riak target that I'm testing, and it works nicely for small backups. I'm now looking to scale that up, and have a couple of questions.

1. brackup almost exclusively needs per-key lookups and writes. The one exception is garbage collection, where I need to walk the entire set of keys to figure out which chunks are orphaned and can therefore be deleted. So I'm wondering: is there an upper limit on the number of keys where "listing keys is expensive" turns into "listing keys is insane"? I'm looking at millions of keys/chunks for large backups. I guess splitting chunks over multiple buckets and performing multiple queries might help. Is there a recommended upper limit on keys per bucket with bitcask for sane list-keys performance?

2. There's a standard Link header coming back on my bucket key queries that is huge - twice the size of the response body with my 45-byte keys. So for 50k keys the response body is about 1MB, and the Link header is about 2MB! I'm wondering if there's any way of turning this off, given I'm not doing any Link walking?

Thanks,
Gavin

[1] http://code.google.com/p/brackup/

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
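P.S. In case it helps the discussion, here's a rough sketch of the bucket-splitting idea from question 1. Everything here is hypothetical (the bucket names, shard count, and helper functions are mine, not part of brackup or riak): shard chunk keys across N buckets by a hash of the key, so each list-keys call only walks a fraction of the keyspace, and garbage collection becomes a per-bucket set difference.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical shard count; tune to taste


def shard_bucket(key, base="brackup_chunks", n=NUM_BUCKETS):
    """Pick a bucket for a chunk key by hashing the key.

    Hashing (rather than, say, a key prefix) spreads chunks roughly
    evenly, so no single bucket's key list grows out of proportion.
    The mapping is deterministic, so reads find the same bucket
    that writes used.
    """
    h = int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)
    return "%s_%02d" % (base, h % n)


def orphaned_chunks(stored_keys_by_bucket, referenced_keys):
    """Garbage collection as a set difference, one bucket at a time.

    stored_keys_by_bucket: dict mapping bucket name -> iterable of
        keys (e.g. the result of one list-keys query per bucket)
    referenced_keys: set of chunk keys still referenced by any backup

    Returns (bucket, key) pairs that are safe to delete.
    """
    orphans = []
    for bucket, keys in stored_keys_by_bucket.items():
        orphans.extend((bucket, k) for k in set(keys) - referenced_keys)
    return orphans
```

With 16 buckets, a backup of 1.6M chunks means each list-keys call returns ~100k keys instead of one multi-million-key walk, and the GC pass can process buckets one at a time.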
