On Thu, Nov 14, 2013 at 12:37 PM, David Laube <d...@stormpath.com> wrote:
> It is almost as if the data only exists on some of the nodes, or perhaps
> the token ranges are dramatically different --again, we are using vnodes so
> I am not exactly sure how this plays into the equation.

The token ranges are dramatically different: with num_tokens set and initial_token not set, each node selects its vnode tokens at random. You can verify this by listing the tokens per physical node in nodetool gossipinfo or (iirc) nodetool status.

> 5. Copy 1 of the 5 snapshot archives from cluster-A to each of the five
> nodes in the new cluster-B ring.
>

I don't understand this at all. Do you mean that you are using one source node's data to load each of the target nodes? Or are you just saying there's a 1:1 relationship between source snapshots and target nodes to load into? Unless you have RF=N, using one source for 5 target nodes won't work.

To do what I think you're attempting to do, you have basically two options.

1) don't use vnodes and do a 1:1 copy of snapshots
2) use vnodes and
   a) get a list of tokens per node from the source cluster
   b) put a comma delimited list of these in initial_token in cassandra.yaml on target nodes
   c) probably have to un-set num_tokens (this part is unclear to me, you will have to test..)
   d) set auto_bootstrap:false in cassandra.yaml
   e) start target nodes; they will join without bootstrapping, into the same ranges as the source cluster
   f) load schema / copy data into datadir (being careful of https://issues.apache.org/jira/browse/CASSANDRA-6245)
   g) restart node or use nodetool refresh (I'd probably restart the node to avoid the bulk rename that refresh does) to pick up sstables
   h) remove auto_bootstrap:false from cassandra.yaml

I *believe* this *should* work, but have never tried it as I do not currently run with vnodes. It should work because it basically makes implicit vnode tokens explicit in the conf file. If it *does* work, I'd greatly appreciate you sharing details of your experience with the list.
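For steps a) and b), a rough sketch of turning a per-node token list into the initial_token line might look like the following. It is untested against a real cluster. The function names, the node addresses, and the idea of feeding in tokens as a dict are all hypothetical; how you collect the tokens from the source cluster is left to you. It also assumes numeric tokens (as with Murmur3Partitioner or RandomPartitioner), and it sorts them only to make the output reproducible.

```python
def initial_token_line(tokens):
    """Build a cassandra.yaml 'initial_token' line from one node's tokens.

    Assumes numeric tokens; 'tokens' is any iterable of ints or numeric strings.
    """
    values = [int(t) for t in tokens]
    if not values:
        raise ValueError("no tokens supplied for this node")
    if len(set(values)) != len(values):
        raise ValueError("duplicate token within one node")
    # Comma delimited list, as described in step b).
    return "initial_token: " + ",".join(str(v) for v in sorted(values))


def plan_target_config(source_tokens, source_to_target):
    """Map each target node to the initial_token line of its source node.

    source_tokens:    {source_node: [token, ...]}   (hypothetical input shape)
    source_to_target: {source_node: target_node}    (the 1:1 node pairing)
    """
    owner = {}
    plan = {}
    for source, tokens in source_tokens.items():
        for t in tokens:
            v = int(t)
            # A token may belong to only one node in the ring.
            if v in owner:
                raise ValueError(
                    "token %d listed for both %s and %s" % (v, owner[v], source)
                )
            owner[v] = source
        plan[source_to_target[source]] = initial_token_line(tokens)
    return plan


if __name__ == "__main__":
    # Made-up addresses and tokens, purely for illustration.
    example_tokens = {"10.0.0.1": ["-9", "100", "5"], "10.0.0.2": ["7", "200"]}
    example_pairs = {"10.0.0.1": "10.1.0.1", "10.0.0.2": "10.1.0.2"}
    for target, line in sorted(plan_target_config(example_tokens, example_pairs).items()):
        print(target, "->", line)
```

Each resulting line would go into cassandra.yaml on the matching target node, alongside the num_tokens and auto_bootstrap changes from steps c) and d).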
General reference on tasks of this nature (does not consider vnodes, but treat vnodes as "just a lot of physical nodes" and it is mostly relevant):

http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

=Rob