Recently, I successfully used the following procedure when decommissioning a 
datacenter:

1. Reduced the replication factor for this DC to zero for all keyspaces except 
the system_auth keyspace. For that keyspace, I reduced the RF to one.
2. Decommissioned all nodes except one in the DC using the regular procedure 
(no --force needed).
3. Decommissioned the last node using --force.
4. Set the RF for the system_auth keyspace to 0.
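
The keyspace changes in steps 1 and 4 can be sketched as generated CQL. This
is only an illustration under assumptions: the datacenter names (dc1, dc2),
the keyspace my_app and the replication factors are hypothetical, every
keyspace is assumed to use NetworkTopologyStrategy, and the helpers
alter_keyspace_cql and plan_statements are made up for this example and are
not part of any Cassandra tool.

```python
# Hypothetical cluster layout: dc2 is the datacenter being decommissioned.
RETIRING_DC = "dc2"
CURRENT_REPLICATION = {
    "my_app": {"dc1": 3, "dc2": 3},
    "system_auth": {"dc1": 3, "dc2": 3},
}


def alter_keyspace_cql(keyspace, replication):
    """Build an ALTER KEYSPACE statement; `replication` maps DC name -> RF."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += ["'%s': %d" % (dc, rf) for dc, rf in replication.items()]
    body = ", ".join(options)
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, body)


def plan_statements(current, retiring_dc):
    """Return (statements for step 1, statements for step 4)."""
    before, after = [], []
    for keyspace, replication in current.items():
        zeroed = dict(replication)
        zeroed[retiring_dc] = 0
        if keyspace == "system_auth":
            # Step 1: keep one replica so authentication keeps working.
            reduced = dict(replication)
            reduced[retiring_dc] = 1
            before.append(alter_keyspace_cql(keyspace, reduced))
            # Step 4: drop the last replica after the final node is gone.
            after.append(alter_keyspace_cql(keyspace, zeroed))
        else:
            # Step 1: all other keyspaces go straight to zero.
            before.append(alter_keyspace_cql(keyspace, zeroed))
    return before, after


before, after = plan_statements(CURRENT_REPLICATION, RETIRING_DC)
print("-- step 1, before decommissioning any node")
for statement in before:
    print(statement)
print("-- steps 2 and 3: decommission the nodes (--force only for the last)")
print("-- step 4, after the last node is gone")
for statement in after:
    print(statement)
```

Generating the statements instead of typing them by hand makes it harder to
forget a keyspace or to zero out system_auth too early.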

This procedure has two benefits:

1. Authentication on the nodes in the DC being decommissioned keeps working 
until the last node has been decommissioned. This is important when 
authentication is enabled for JMX. Without the reduced RF, you cannot proceed 
once too few nodes are left to reach LOCAL_QUORUM on system_auth.
2. One does not have to use --force except when removing the last node.

It would be nice if the RF for the system_auth keyspace could be reduced to 
zero before decommissioning the nodes. However, I think that implementing this 
correctly may be hard. If there are no local replicas, queries with a 
consistency level of LOCAL_QUORUM will probably fail, and this is the 
consistency level used for all authentication- and authorization-related 
queries. So, setting the RF to zero might break authentication and 
authorization, which in turn might make it impossible to decommission the nodes 
(without disabling authentication for that DC).
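
The arithmetic behind this concern can be illustrated with a small sketch. It
assumes the usual quorum formula, floor(RF / 2) + 1, is applied to the local
RF even when that RF is zero. The function names are illustrative only and
are not Cassandra's API.

```python
def quorum(rf):
    """Number of replicas that must respond for a quorum at the given RF."""
    return rf // 2 + 1


def local_quorum_reachable(local_rf, live_local_replicas):
    """True if enough local replicas are alive to satisfy LOCAL_QUORUM."""
    return live_local_replicas >= quorum(local_rf)


# Local RF of 1 with its single replica alive: one response is enough.
print(local_quorum_reachable(1, 1))
# Local RF of 0: one response is still required, but no local replica exists.
print(local_quorum_reachable(0, 0))
```

Under that assumption, a local RF of one keeps LOCAL_QUORUM satisfiable with a
single live replica, while a local RF of zero can never satisfy it.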

So, I guess that the code dealing with authentication and authorization would 
have to be changed to use a CL of QUORUM instead of LOCAL_QUORUM when 
system_auth is not replicated in the local DC.
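
A minimal sketch of that idea follows. It is not how Cassandra is
implemented; the function auth_consistency_level and the shape of its
arguments are hypothetical and only show the proposed decision.

```python
def auth_consistency_level(local_dc, system_auth_replication):
    """Pick the CL for auth queries issued by a node in `local_dc`.

    `system_auth_replication` maps DC name -> RF of the system_auth keyspace.
    """
    if system_auth_replication.get(local_dc, 0) > 0:
        # Local replicas exist, so the current behaviour can stay as it is.
        return "LOCAL_QUORUM"
    # No local replicas: fall back to a quorum across all datacenters.
    return "QUORUM"


# Hypothetical example: dc2 no longer holds any system_auth replicas.
replication = {"dc1": 3, "dc2": 0}
print(auth_consistency_level("dc1", replication))
print(auth_consistency_level("dc2", replication))
```

The fallback costs cross-DC latency on every authentication query issued from
the DC without replicas, which seems acceptable for a DC that is about to be
removed.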
