Re: Read inconsistency after backup and restore to different cluster

Robert Coli Thu, 14 Nov 2013 13:16:21 -0800

On Thu, Nov 14, 2013 at 12:37 PM, David Laube <d...@stormpath.com> wrote:


> It is almost as if the data only exists on some of the nodes, or perhaps
> the token ranges are dramatically different --again, we are using vnodes so
> I am not exactly sure how this plays into the equation.


The token ranges are dramatically different: with num_tokens set and
initial_token not set, each node picks its vnode tokens at random.

You can verify this by listing the tokens per physical node in nodetool
gossipinfo or (iirc) nodetool status.
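
A rough sketch of collecting the tokens per node, in case it helps. It
assumes you feed it the text of `nodetool ring` (my substitution, since that
command prints one row per token), and it assumes each data row starts with
an IPv4 address and ends with the token. The function name and that column
layout are my assumptions, so check them against your version's output.

```python
import re
from collections import defaultdict

IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")
TOKEN = re.compile(r"-?\d+")

def tokens_by_node(ring_output):
    """Group tokens by node address from `nodetool ring`-style text.

    Assumption: a data row begins with an IPv4 address and its last
    whitespace-separated field is the token. All other rows are skipped.
    """
    tokens = defaultdict(list)
    for line in ring_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines, separators, the lone wrap-around token row
        if not IPV4.fullmatch(fields[0]):
            continue  # headers such as "Address ..." or "Datacenter: ..."
        if not TOKEN.fullmatch(fields[-1]):
            continue  # row does not end in an integer token
        tokens[fields[0]].append(int(fields[-1]))
    return dict(tokens)
```

Running this against both clusters and comparing the per-node lists should
show whether the two rings share any tokens at all.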


> 5. Copy 1 of the 5 snapshot archives from cluster-A to each of the five
> nodes in the new cluster-B ring.
>

I don't understand this at all. Do you mean that you are using one source
node's data to load each of the target nodes? Or are you just saying
there's a 1:1 relationship between source snapshots and target nodes to
load into? Unless you have RF=N, using one source for 5 target nodes won't
work.
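
To illustrate the RF=N point, here is a toy model of replica placement. It
is a simplification I am assuming for illustration (walk the ring clockwise
and take the next RF distinct nodes, ignoring racks and datacenters), not
Cassandra's actual code, and the function name is made up.

```python
from bisect import bisect_left

def replicas(token, ring, rf):
    """Return the rf distinct nodes holding `token` in a toy ring.

    ring: list of (token, node) pairs sorted by token.
    Placement: start at the first ring token >= `token` (wrapping around)
    and walk clockwise, collecting distinct nodes until rf are found.
    """
    ring_tokens = [t for t, _ in ring]
    start = bisect_left(ring_tokens, token) % len(ring)
    owners = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in owners:
            owners.append(node)
        if len(owners) == rf:
            break
    return owners
```

With three nodes and rf=3, every node shows up for every token, so any one
snapshot holds everything. With rf below the node count, some tokens have no
replica on a given node, so that node's snapshot alone cannot populate every
target.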

To do what I think you're attempting to do, you have basically two options.

1) don't use vnodes and do a 1:1 copy of snapshots
2) use vnodes and
   a) get a list of tokens per node from the source cluster
   b) put a comma delimited list of these in initial_token in
cassandra.yaml on target nodes
   c) probably have to un-set num_tokens (this part is unclear to me, you
will have to test..)
   d) set auto_bootstrap: false in cassandra.yaml
   e) start target nodes; they will skip bootstrap and come up owning the
same ranges as the source cluster
   f) load schema / copy data into datadir (being careful of
https://issues.apache.org/jira/browse/CASSANDRA-6245)
   g) restart node or use nodetool refresh (I'd probably restart the node
to avoid the bulk rename that refresh does) to pick up sstables
   h) remove auto_bootstrap: false from cassandra.yaml
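
Steps b) and d) could be sketched roughly as below: turn one source node's
token list into the two cassandra.yaml lines for the matching target node.
The helper name is hypothetical, and it deliberately leaves num_tokens alone
since step c) is still an open question.

```python
def restore_yaml_lines(tokens):
    """Return cassandra.yaml lines pinning a node to the given tokens.

    tokens: iterable of integer tokens taken from one source node.
    Does not emit or remove num_tokens; that step needs testing.
    """
    tokens = sorted(tokens)
    if not tokens:
        raise ValueError("need at least one token")
    joined = ",".join(str(t) for t in tokens)
    return [
        "initial_token: " + joined,
        "auto_bootstrap: false",
    ]
```

For example, restore_yaml_lines([4200, -9000]) gives an initial_token line
of "-9000,4200" followed by the auto_bootstrap line. Remember to take the
auto_bootstrap line back out after the data is loaded, per step h).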

I *believe* this *should* work, but have never tried it as I do not
currently run with vnodes. It should work because it basically makes
implicit vnode tokens explicit in the conf file. If it *does* work, I'd
greatly appreciate you sharing details of your experience with the list.

General reference on tasks of this nature (does not consider vnodes, but
treat vnodes as "just a lot of physical nodes" and it is mostly relevant) :
http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

=Rob
