On 22-10-2013 09:53, Shane McEwan wrote:
On 21/10/13 17:57, Joe Caswell wrote:
The only use case left for reip is when you have simultaneously changed the
node name for every node in the cluster, such as when loading an entire
cluster's worth of backups to new machines.
When I need to do this I just create a new, empty cluster with the new
names. Then shut down the cluster and restore only the data directories
(leveldb, for example) from the backup, leaving the ring directory
alone. Then I start up the cluster and it finds the restored data.
Thanks for the tip. Beware that this will not restore bucket props, because they are stored in the ring dir and not the data dir.
You need to be careful about restoring the old node's data to the
corresponding new node, otherwise you'll get hinted handoffs flying
between all your nodes, but after a bit of trial and error you can figure
out which node is which.
When you create the new, empty cluster, Riak distributes the partitions between the nodes using the claim function. I believe claim_v2 (riak_core_claim.erl) is still the default claim function, and it will produce different partition distributions on different runs when joining nodes to form a cluster, including the dreaded 12,12,12,12,16 split on a default config with 5 nodes. I guess sometimes you'll be lucky and the shiny new cluster will have the same partition distribution as the backup, but in my experience this is not always the case, which means the new cluster will need to hand off data between nodes to match the data to the ring files' partition distribution.
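For what it's worth, a quick way to check whether you got lucky is to compare the per-node partition counts of the backed-up ring file with those of the new cluster's ring file. A rough sketch, assuming an Erlang shell with riak_core on the code path; the module name ring_summary is made up, and riak_core_ring_manager:read_ringfile/1 and riak_core_ring:all_owners/1 are the calls I believe the ring manager itself uses, so double-check them against your version:

    %% ring_summary.erl - rough sketch: summarise partition ownership of a ring file,
    %% so a backed-up ring can be compared with a freshly built cluster's ring.
    %% Assumes riak_core is on the code path; verify the calls against your Riak version.
    -module(ring_summary).
    -export([owners_per_node/1]).

    owners_per_node(RingFile) ->
        Ring = riak_core_ring_manager:read_ringfile(RingFile),
        Owners = riak_core_ring:all_owners(Ring),   %% [{PartitionIndex, OwnerNode}]
        Counts = lists:foldl(fun({_Idx, Node}, Acc) ->
                                     dict:update_counter(Node, 1, Acc)
                             end, dict:new(), Owners),
        %% e.g. [{'riak@node1',12},{'riak@node2',12},...,{'riak@node5',16}]
        lists:keysort(1, dict:to_list(Counts)).

Run it on the latest riak_core_ring.default.* file from the backup and on the one the new cluster wrote; if the counts (or the actual owner lists) differ, handoff will have to move data around once you restore.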

The good(tm) way to restore a complete backup to a new environment is to restore data and partition distribution together, i.e. both the data dirs and the ring files. For this purpose, reip was very useful in 1.3.x, where the node it was called on did not have to be running. Unfortunately, in 1.4.2 the reip'ed node must be running (which sort of defeats the purpose of reip):

"riak-1.3.2/rel/riak/bin> ./riak-admin reip bla bla2
Backed up existing ring file to "./data/ring/riak_core_ring.default.20131004091923.BAK"
New ring file written to "./data/ring/riak_core_ring.default.20131022105325"

riak-1.4.2/rel/riak/bin> ./riak-admin reip bla bla2
Node is not running!"

A common case where you need to reip non-running nodes is when you copy production data to a staging environment and need to ensure that your new staging cluster doesn't reference production nodes before firing it up. Does anyone know a good solution to this in 1.4? The only two ways I see are: 1) edit the ring files by hand, or 2) restore to a new cluster with potentially mismatched partition distribution between data and ring files, and wait for handoffs to complete.
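Regarding 1): you don't have to edit the file blindly. The ring file is just a term_to_binary'ed ring, so on a stopped node you can do by hand roughly what reip does internally (read the ring file, rename the node with riak_core_ring:rename_node/3, write it back). A rough sketch, assuming an Erlang shell with the Riak libs on the code path; the module name offline_reip is made up, and the riak_core calls are the ones I believe reip itself uses, so verify against your version before trusting it:

    %% offline_reip.erl - rough sketch of "editing the ring file by hand" on a stopped
    %% node, mimicking what riak-admin reip does internally: read the ring file,
    %% rename the node, write the result back. OldNode/NewNode are node-name atoms,
    %% e.g. 'riak@prod1' and 'riak@staging1'. Check the calls against your Riak version.
    -module(offline_reip).
    -export([rename/3]).

    rename(RingFile, OldNode, NewNode) ->
        Ring = riak_core_ring_manager:read_ringfile(RingFile),
        NewRing = riak_core_ring:rename_node(Ring, OldNode, NewNode),
        %% keep a copy of the original ring file before overwriting anything
        ok = file:rename(RingFile, RingFile ++ ".BAK"),
        ok = file:write_file(RingFile, term_to_binary(NewRing)),
        ok.

You would run it for each old/new node-name pair against each node's ring file while everything is down, then start the cluster. It is essentially the 1.3.x reip behaviour done manually, so treat it with the same care.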

- Rune, Trifork
