Hello Alain,

Thanks a lot for the confirmation. Yes, this procedure seems like a workaround, but for my use case, where system_auth contains a small amount of data and the consistency level for authentication/authorization is switched to LOCAL_ONE, I think it is good enough. I completely understand that this could be improved, since other users may have requirements that the proposed procedure cannot cover.
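(For reference, the LOCAL_ONE setting mentioned above can be expressed in cassandra.yaml on 4.1; the option names below are my assumption based on CASSANDRA-12988, not something stated in this thread:)

```yaml
# cassandra.yaml: consistency levels used for auth reads/writes
# (option names assumed from CASSANDRA-12988, available since 4.1)
auth_read_consistency_level: LOCAL_ONE
auth_write_consistency_level: LOCAL_ONE
```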
BR
MK

From: Alain Rodriguez <al...@casterix.fr>
Sent: April 22, 2024 18:27
To: user@cassandra.apache.org
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

Hi Michalis,

It's been a while since I last removed a DC, but I see there is now a protection to avoid accidentally leaving a DC without auth capability. This was introduced in C* 4.1 through CASSANDRA-17478 (https://issues.apache.org/jira/browse/CASSANDRA-17478). The process of dropping a data center might have been overlooked while doing this work. The ticket states:

"It's never correct for an operator to remove a DC from system_auth replication settings while there are currently nodes up in that DC."

I believe this assertion is not correct. As Jon and Jeff mentioned, when removing an entire DC we usually remove the replication before decommissioning any node, for the reasons Jeff explained. The existing documentation is also clear about this: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsDecomissionDC.html and https://thelastpickle.com/blog/2019/02/26/data-center-switch.html.

Michalis, the solution you suggest seems to be the (good/only?) way to go, even though it looks like a workaround, not really "clean", and something we need to improve. It was also mentioned here: https://dba.stackexchange.com/questions/331732/not-a-able-to-decommission-the-old-datacenter#answer-334890. It should work quickly, but only because this keyspace holds a fairly small amount of data; it will still not be as fast as it should be (it should be a near no-op, as Jeff explained above).
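For context, the "remove replication first" step from the documented procedure is a plain keyspace alteration. A sketch, where "dc1" is kept, "dc2" is the DC being removed, and the keyspace name and RF are placeholders:

```cql
-- Drop the departing DC ("dc2") from a keyspace's replication settings
-- before decommissioning its nodes:
ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

-- Since 4.1 (CASSANDRA-17478), the same statement against system_auth is
-- rejected while dc2 still has live nodes:
-- ConfigurationException: Following datacenters have active nodes and must
-- be present in replication options for keyspace system_auth: [dc2]
```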
It also obliges you to use the "--force" option, which could lead you to remove one of the nodes in another DC by mistake (and in a loaded cluster, or a 3-node cluster with RF = 3, this could hurt...). Having to operate with "nodetool decommission --force" cannot be the standard, but for now I can't think of anything better for you. Maybe wait for someone else's confirmation; it's been a while since I operated Cassandra :).

I think it would make sense to fix this somehow in Cassandra. Maybe we should ensure that no other keyspace has an RF > 0 for this data center instead of looking at active nodes, or that no client is connected to the nodes, add a manual flag somewhere, or something else? Even though I understand the motivation to protect users against a wrongly distributed system_auth keyspace, I think this protection should not be kept with this implementation. If that makes sense, I can create a ticket for this problem.

C*heers,
Alain Rodriguez
casterix.fr

Le lun. 8 avr. 2024 à 16:26, Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> a écrit :

Hello Jon and Jeff,

Thanks a lot for your replies. I completely get your points. Some more clarification about my issue: when trying to update the replication before the decommission, I get the following error message when I remove the replication for the system_auth keyspace:

ConfigurationException: Following datacenters have active nodes and must be present in replication options for keyspace system_auth: [datacenter1]

This error message does not appear for the rest of the application keyspaces. So, may I change the procedure to:

1. Make sure no clients are still writing to any nodes in the datacenter.
2. Run a full repair with nodetool repair.
3. Change all keyspaces, apart from the system_auth keyspace, so they no longer reference the datacenter being removed.
4. Run nodetool decommission using the --force option on every node in the datacenter being removed.
5. Change the system_auth keyspace so it no longer references the datacenter being removed.

BR
MK

From: Jeff Jirsa <jji...@gmail.com>
Sent: April 08, 2024 17:19
To: cassandra <user@cassandra.apache.org>
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

To Jon's point, if you remove from replication after step 1 or step 2 (probably step 2 if your goal is to be strictly correct), the nodetool decommission phase becomes almost a no-op. If you use the order below, the last nodes to decommission will cause the surviving machines to run out of space (assuming you have more than a few nodes to start with).

On Apr 8, 2024, at 6:58 AM, Jon Haddad <j...@jonhaddad.com> wrote:

You shouldn't decom an entire DC before removing it from replication.

—
Jon Haddad
Rustyrazorblade Consulting
rustyrazorblade.com

On Mon, Apr 8, 2024 at 6:26 AM Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> wrote:

Hello community,

In our deployments, we usually rebuild the Cassandra datacenters for maintenance or recovery operations. The procedure used since the days of Cassandra 3.x was the one documented in the DataStax documentation:
Decommissioning a datacenter | Apache Cassandra 3.x: https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsDecomissionDC.html

After upgrading to Cassandra 4.1.4, we have realized that there are some stricter rules that do not allow removing the replication while active Cassandra nodes still exist in a datacenter. This check makes the above-mentioned procedure obsolete. I am thinking of using the following as an alternative:

1. Make sure no clients are still writing to any nodes in the datacenter.
2. Run a full repair with nodetool repair.
3. Run nodetool decommission using the --force option on every node in the datacenter being removed.
4. Change all keyspaces so they no longer reference the datacenter being removed.

What is the procedure followed by other users? Do you see any risk in following the proposed procedure?

BR
MK
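The steps above can be sketched as commands; this is only an illustration against a live cluster, with the keyspace name ("my_keyspace") and DC names ("dc1" kept, "dc2" removed) as placeholders:

```shell
# Step 2: full repair, run on each node in the datacenter being removed
nodetool repair --full

# Step 3: decommission each node in turn; --force is needed because the
# DC is still referenced in keyspace replication settings
nodetool decommission --force

# Step 4: drop the removed DC ("dc2") from each keyspace's replication
cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = \
  {'class': 'NetworkTopologyStrategy', 'dc1': 3};"
```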