Hey, 

Thanks for the reply! One clarification: the replacement node WOULD be DC-local 
as far as Cassandra is concerned; it would just be in a different physical 
DC. Using the Orlando -> Tampa example, suppose my DC was named 'floridaDC' in 
Cassandra. Then I would just kill a node in Orlando, and start a new one in 
Tampa with the same DC name, 'floridaDC'. So from Cassandra's perspective, the 
replacement node is in the same datacenter as the old one was. It will be 
responsible for the same tokens as the old Orlando node, and bootstrap 
accordingly.  
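To make the idea concrete, here is a toy model of the replacement step. It is only a sketch and not Cassandra's implementation: the `Node` class, the `replace_node` helper, the IP addresses, and the token values are all hypothetical. It just shows that when the replacement carries the same logical DC name, the physical site is invisible to the ring and the token assignment is unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    address: str  # hypothetical IP address
    site: str     # physical location; Cassandra does not see this
    dc: str       # logical datacenter name, as reported to Cassandra


def replace_node(ring, dead, replacement):
    """Return a new ring where `replacement` takes over every token of `dead`."""
    if replacement.dc != dead.dc:
        raise ValueError("replacement must use the same logical DC name")
    return {token: (replacement if owner == dead else owner)
            for token, owner in ring.items()}


# Hypothetical nodes: two in Orlando, one new node in Tampa, all in 'floridaDC'.
orlando_1 = Node("10.0.0.1", "Orlando", "floridaDC")
orlando_2 = Node("10.0.0.2", "Orlando", "floridaDC")
tampa_1 = Node("10.1.0.1", "Tampa", "floridaDC")

# Toy ring: token -> owning node (token values are made up).
ring = {-100: orlando_1, 0: orlando_2, 100: orlando_1}
new_ring = replace_node(ring, dead=orlando_1, replacement=tampa_1)

tokens_before = sorted(t for t, n in ring.items() if n == orlando_1)
tokens_after = sorted(t for t, n in new_ring.items() if n == tampa_1)
print(tokens_before == tokens_after)
```

Whether streaming across the two physical sites performs acceptably is a separate question that this model does not address.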

Would this work? 

-Saleil 

From: oleksandr.shul...@zalando.de At: 04/03/19 03:28:37 To: Saleil Bhat 
(BLOOMBERG/ 731 LEX), user@cassandra.apache.org
Subject: Re: Procedures for moving part of a C* cluster to a different 
datacenter

On Wed, Apr 3, 2019 at 12:28 AM Saleil Bhat (BLOOMBERG/ 731 LEX) 
<sbha...@bloomberg.net> wrote:


The standard procedure for doing this seems to be to add a third datacenter to 
the cluster, stream data to the new datacenter via nodetool rebuild, then 
decommission the old datacenter. A more detailed review of this procedure can 
be found here: 
http://thelastpickle.com/blog/2019/02/26/data-center-switch.html



However, I see two problems with the above protocol.  First, it requires 
changes on the application layer because of the datacenter name change; e.g. 
all applications referring to the datacenter ‘Orlando’ will now have to be 
changed to refer to ‘Tampa’.

Alternatively, you may omit DC specification in the client and provide internal 
network addresses as the contact points.


As such, I was wondering what people's thoughts were on the following 
alternative procedure: 

1) Kill one node in the old datacenter

2) Add a new node in the new datacenter but indicate that it is to REPLACE the 
one just shut down; this node will bootstrap, and all the data which it is 
supposed to be responsible for will be streamed to it


I don't think this is going to work.  First, I believe streaming for bootstrap 
or for replacing a node is DC-local, so the first node won't have any peers to 
stream from.  Even if it did stream from the remote DC, this single node would 
own 100% of the ring and would most likely die of the load well before it 
finished streaming.
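
The "100% of the ring" concern can be illustrated with a small calculation. This is a rough sketch under simplifying assumptions, not Cassandra's code: one token per node (no vnodes), replication factor ignored, and ownership computed separately per logical DC, as replica placement is per DC under NetworkTopologyStrategy. The node names, token values, and the `ownership_by_dc` helper are hypothetical.

```python
from collections import defaultdict

RING_SIZE = 2 ** 64  # size of the Murmur3 token space (-2**63 .. 2**63 - 1)


def ownership_by_dc(nodes):
    """nodes: list of (name, dc, token). Returns {dc: {name: fraction owned}}."""
    by_dc = defaultdict(list)
    for name, dc, token in nodes:
        by_dc[dc].append((token, name))

    result = {}
    for dc, members in by_dc.items():
        members.sort()
        shares = defaultdict(float)
        for i, (token, name) in enumerate(members):
            prev_token = members[i - 1][0]  # index -1 wraps around the ring
            # Tokens are assumed unique within a DC, so a zero span can only
            # mean a lone node, which then owns the whole ring.
            span = (token - prev_token) % RING_SIZE or RING_SIZE
            shares[name] += span / RING_SIZE
        result[dc] = dict(shares)
    return result


# Hypothetical layout: four evenly spaced nodes in the old DC, and the first
# node of a separate logical DC named 'Tampa'.
nodes = [
    ("orlando-1", "Orlando", -2 ** 63),
    ("orlando-2", "Orlando", -2 ** 62),
    ("orlando-3", "Orlando", 0),
    ("orlando-4", "Orlando", 2 ** 62),
    ("tampa-1", "Tampa", 0),
]
result = ownership_by_dc(nodes)
print(result["Orlando"])
print(result["Tampa"])
```

In this model each Orlando node owns a quarter of its DC's ring, while the lone node in the separate DC owns all of it. The model only applies when the new node joins a different logical DC; it says nothing about a replacement that keeps the old DC name.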

Regards,
-- 
Alex

