I'd like to preface my comments by saying we love your product!  We've really 
got high hopes for Riak at Vimeo!

Also, my interest in backups has mainly to do with data corruption/loss from
developers doing something bad.  Hence, a node loss is one scenario, but having
to restore all data from backup is another.



On Apr 25, 2012, at 1:31 PM, Mark Phillips wrote:

2) is it a consistent backup or is consistency old fashioned thinking?

The backup will be complete up to the point at which it was taken. You'll get a 
dump of all the keys in the order in which they were listed. Updates that 
happened during the backup may or may not be captured. (I would have to verify 
exactly how you would know which made it and which didn't.)


It would be nice to know for the record what the case is, so a follow-up is
always appreciated.

With riak-admin backup, would it be possible to add a --stdout option so that
the output could be piped to gzip/bzip2?  The resulting backup file from
multiple nodes could easily be larger than any one node's storage…  We have
1.8GB on five nodes that ended up being 15GB in the dump file.  It compressed
down very well.  We are just using a small dataset project now, but there is
talk of moving 2TB of compressed MySQL data over.

[root@ip-10-0-0-231 mnt]# time riak-admin backup [email protected] riak /mnt/backup/test.dump all
Attempting to restart script through sudo -u riak
Backing up (all nodes) to '/mnt/backup/test.dump'.
...from ['[email protected]','[email protected]','[email protected]',
         '[email protected]','[email protected]']
Backup of '[email protected]' complete
Backup of '[email protected]' complete
Backup of '[email protected]' complete
Backup of '[email protected]' complete
Backup of '[email protected]' complete
syncing and closing log

real    19m40.045s
user    6m19.976s
sys     4m12.192s

-rw-r--r-- 1 riak riak  15G Apr 24 15:56 test.dump
-rw-r--r-- 1 root root 1.9G Apr 24 16:13 test.dump.tgz
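
As a rough back-of-the-envelope check on the transcript above, the small Python sketch below works out the average dump rate and the compression ratio, then projects how long a much larger dump might take. The sizes are the rounded `ls -h` figures, treated as GiB. The 2 TiB target is only an assumption standing in for the "2T" mentioned earlier (the real dump could be larger or smaller than the source data), and linear scaling is assumed as well.

```python
# Figures copied from the transcript; `ls -h` sizes are rounded.
DUMP_GIB = 15.0                 # test.dump
COMPRESSED_GIB = 1.9            # test.dump.tgz
ELAPSED_S = 19 * 60 + 40.045    # "real" time reported by `time`


def throughput_mib_s(size_gib: float, seconds: float) -> float:
    """Average dump rate in MiB/s."""
    return size_gib * 1024 / seconds


def projected_hours(target_gib: float, rate_mib_s: float) -> float:
    """Hours needed to dump target_gib at a constant rate (linear scaling assumed)."""
    return target_gib * 1024 / rate_mib_s / 3600


rate = throughput_mib_s(DUMP_GIB, ELAPSED_S)
ratio = DUMP_GIB / COMPRESSED_GIB
# Assumption: a hypothetical future dump of about 2 TiB.
hours_2tib = projected_hours(2 * 1024, rate)

print(f"dump rate: {rate:.1f} MiB/s")
print(f"compression ratio: {ratio:.1f}x")
print(f"projected time for a 2 TiB dump: {hours_2tib:.1f} h")
```

At roughly 13 MiB/s, a dump of that size would run for well over a day on this hardware, which is worth keeping in mind when choosing a backup method.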




3) if you use the tar'ing up of leveldb + ring files per node, you lose one 
node, then you restore it from this tar file that is hours or days old, how 
does riak deal with bringing its data up to date?



After you restore the node, it will gradually sync its replicas with those on
the other nodes via read repair. That said, doing a complete restore of the
node would probably not be needed. When the node disappears, Riak will
compensate for it by sending its writes/reads to fallback nodes. When it comes
back online, hinted handoff and read repair will make sure it gets all the
replicas it was supposed to have and that those replicas are up to date. (You
will have to force Read Repair on the replicas on that node, which can be done
via a list keys or using an existing snippet of code [1] for doing this, but be
warned that it'll put some load on that node. We're working on making the Read
Repair process less reactive in future releases, but this is the best way to do
it right now.) To be clear, I'm in no way advocating not backing up your data.
You just might not need to use the backups in this situation.
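
The "list keys, then read every key" approach described above could be sketched in Python roughly as follows. This is a minimal sketch, not the bucket_inspector code from [1], and it has not been run against a real cluster. The function names, base URL, and bucket name are placeholders. It assumes Riak's HTTP interface with the `/buckets/<bucket>/keys` endpoints (older releases use `/riak/<bucket>` paths instead).

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def list_keys(base_url: str, bucket: str,
              get: Callable[[str], bytes] = http_get) -> List[str]:
    """List every key in a bucket. This walks the whole keyspace and is expensive."""
    quoted = urllib.parse.quote(bucket, safe="")
    return json.loads(get(f"{base_url}/buckets/{quoted}/keys?keys=true"))["keys"]


def touch_keys(base_url: str, bucket: str, keys: Iterable[str],
               get: Callable[[str], bytes] = http_get) -> int:
    """GET each key so Riak compares replicas and can repair stale ones.

    Returns the number of read requests issued.
    """
    quoted_bucket = urllib.parse.quote(bucket, safe="")
    issued = 0
    for key in keys:
        quoted_key = urllib.parse.quote(key, safe="")
        try:
            get(f"{base_url}/buckets/{quoted_bucket}/keys/{quoted_key}")
        except urllib.error.HTTPError:
            # e.g. 404 for a deleted key or 300 for siblings; the read
            # still reached Riak, so it is counted like any other.
            pass
        issued += 1
    return issued
```

A call might look like `touch_keys(url, "videos", list_keys(url, "videos"))` with `url = "http://127.0.0.1:8098"`; both the bucket name and the address are made up for illustration. Throttling the loop would reduce the load on the restored node.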


Understood.  It is also worth noting that the `riak-admin backup [all]`
absorbed the meager CPU resources of my EC2 m1.large really well.  I loved the
way it scaled across all 2 cores. ( http://cl.ly/2Q1i2l0n2X2o2W1J0X27  ;-)  I
didn't expect that, coming from a MySQL background.  I'm so used to
single-threaded processes.



Another thing worth noting - the 'riak-admin backup' command is not known to be 
the speediest. If you have any non-trivial amount of data that needs backing 
up, you're probably best to do a FS snapshot of Level on each node. 
Unfortunately doing a live snapshot of Level is less than bulletproof at the 
moment, so you're advised to stop the node, snapshot level, and restart. You'll 
have to take the node offline for this but with five Riak nodes, your cluster 
should Just Keep Cranking™.
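
The stop, copy, restart sequence above could be scripted per node along these lines. This is only a sketch under assumptions: the two directories are common package-install defaults and should be checked against `data_root` and `ring_state_dir` in app.config, the archive path is a placeholder, and plain `tar` stands in for whatever snapshot mechanism is actually used.

```python
import subprocess
from typing import Callable, List, Sequence

# Assumed locations; verify against app.config before relying on them.
LEVELDB_DIR = "/var/lib/riak/leveldb"
RING_DIR = "/var/lib/riak/ring"


def backup_commands(archive: str,
                    paths: Sequence[str] = (LEVELDB_DIR, RING_DIR)) -> List[List[str]]:
    """Return the three commands, in order: stop the node, archive, start it."""
    return [
        ["riak", "stop"],
        ["tar", "-czf", archive, *paths],
        ["riak", "start"],
    ]


def run_backup(archive: str,
               run: Callable[[List[str]], object] = subprocess.check_call) -> None:
    """Stop the node, archive its data, and always try to restart it."""
    stop, copy, start = backup_commands(archive)
    run(stop)
    try:
        run(copy)
    finally:
        # Bring the node back even if the copy step failed.
        run(start)
```

Running this on one node at a time keeps the rest of the cluster serving requests while each node is briefly offline.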



Is snapshotting known to be bulletproof with other storage engines besides
Level?  I was thinking LVM snapshots would be a decent solution when the
dataset grows larger.

That does help, thanks!

Austin

Hope that helps.

Mark

[1] Fair warning: I'm not sure the last time this was tested - 
http://contrib.basho.com/bucket_inspector.html

Thanks Riakers!

Austin

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

