On 22-10-2013 09:53, Shane McEwan wrote:
On 21/10/13 17:57, Joe Caswell wrote:
The only use case left for reip is when you have simultaneously changed the
node name for every node in the cluster, such as when loading an entire
cluster's worth of backups to new machines.
When I need to do this I just create a new, empty cluster with the new
names. Then shut down the cluster and restore only the data directories
(leveldb, for example) from the backup, leaving the ring directory
alone. Then I start up the cluster and it finds the restored data.
Thanks for the tip. Beware that this will not restore bucket props,
because they are stored in the ring dir and not the data dir.
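To make the procedure concrete, here is a rough sketch of the "restore everything except the ring directory" step in Python. It assumes the node is stopped, that the backup and the node's data root each contain one sub-directory per backend (for example leveldb) plus a ring directory, and that every path and the function name are placeholders of mine rather than anything Riak ships:

```python
import shutil
from pathlib import Path


def restore_data_dirs(backup_root, node_root, skip=("ring",)):
    """Copy each top-level directory from backup_root into node_root,
    leaving anything named in `skip` (by default the ring directory) alone.

    Returns the names of the directories that were restored.
    """
    restored = []
    for entry in sorted(Path(backup_root).iterdir()):
        # Plain files and skipped directories (the ring) are not touched.
        if not entry.is_dir() or entry.name in skip:
            continue
        target = Path(node_root) / entry.name
        # Replace the empty data dir created by the new cluster.
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(entry, target)
        restored.append(entry.name)
    return restored
```

Called as restore_data_dirs("/backup/node1/data", "/var/lib/riak") (both paths hypothetical), this would leave the new cluster's ring files in place, which is also why bucket props from the backup do not come along.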
You need to be careful about restoring the old node's data to the
corresponding new node, otherwise you'll get hinted handoffs flying
between all your nodes, but after a bit of trial and error you can figure
out which node is which.
When you create the new, empty cluster, Riak distributes the partitions
between the nodes using the claim function. I believe claim_v2
(riak_core_claim.erl) is still the default claim function and it will
produce different partition distributions in different runs when joining
nodes to form a cluster, including the dreaded 12,12,12,12,16 on a
default config with 5 nodes.
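For anyone wondering why 12,12,12,12,16 is "dreaded", the arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbers, assuming the default ring size of 64 partitions; it does not reproduce what claim_v2 actually does, and the helper names are mine:

```python
def even_split(ring_size, node_count):
    """Most balanced way to divide ring_size partitions over node_count nodes."""
    base, extra = divmod(ring_size, node_count)
    return [base + 1] * extra + [base] * (node_count - extra)


def spread(counts):
    """Difference between the most and least loaded node."""
    return max(counts) - min(counts)


def max_share(counts):
    """Fraction of the ring owned by the most loaded node."""
    return max(counts) / sum(counts)


ring_size = 64                      # assumed default ring size
dreaded = [12, 12, 12, 12, 16]      # distribution mentioned above
ideal = even_split(ring_size, 5)    # [13, 13, 13, 13, 12]

print(ideal, spread(ideal), max_share(ideal))
print(dreaded, spread(dreaded), max_share(dreaded))
```

With the dreaded layout one node owns 16 of 64 partitions (25% of the ring), against 13 of 64 in the most even split.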
I guess sometimes you'll be lucky, and the new shiny cluster will have
the same partition distribution as the backup, but in my experience,
this is not always the case, which means the new cluster will need to
hand off data between nodes to match the data with the ring files'
partition distribution.
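A rough way to reason about how much handoff to expect is to compare the owner of each partition in the backup with its owner in the freshly built cluster. The sketch below assumes you have already extracted both ownership lists somehow (one owner per partition index, old node names already mapped to their new counterparts); the function and node names are hypothetical:

```python
def partitions_to_hand_off(backup_owners, new_owners):
    """Return the partition indexes whose owner differs between the two rings.

    Both arguments are lists with one owner name per partition, in ring order.
    """
    if len(backup_owners) != len(new_owners):
        raise ValueError("rings must have the same number of partitions")
    return [
        index
        for index, (old, new) in enumerate(zip(backup_owners, new_owners))
        if old != new
    ]


# Toy ring with 8 partitions and 3 nodes (names are made up).
backup = ["n1", "n2", "n3", "n1", "n2", "n3", "n1", "n2"]
fresh = ["n1", "n2", "n3", "n2", "n3", "n1", "n1", "n2"]

mismatched = partitions_to_hand_off(backup, fresh)
print(len(mismatched), "of", len(backup), "partitions would need handoff")
```

An empty result is the lucky case where the restored data already sits where the new ring expects it.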
The good(tm) way to restore a complete backup to a new environment is to
restore data and partition distribution together - i.e. both the data
dirs and the ring files.
For this purpose, reip was very useful in v1.3.x, where the node it was
called on did not have to be running. Unfortunately, in 1.4.2 the
reip'ed node must be running (which rather defeats the purpose of reip):
"riak-1.3.2/rel/riak/bin> ./riak-admin reip bla bla2
Backed up existing ring file to
"./data/ring/riak_core_ring.default.20131004091923.BAK"
New ring file written to "./data/ring/riak_core_ring.default.20131022105325
riak-1.4.2/rel/riak/bin> ./riak-admin reip bla bla2
Node is not running!"
A common case where you need to reip non-running nodes is when you
copy production data to a staging environment and need to ensure that
your new staging cluster doesn't reference production nodes before
firing it up. Does anyone know a good solution to this in 1.4? The only
two ways I see are:
1) Edit the ring files by hand.
2) Restore to a new cluster with a potentially mismatched partition
   distribution between data and ring files, and wait for handoffs to
   complete.
- Rune, Trifork
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com