[
https://issues.apache.org/jira/browse/CASSANDRA-21026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18061804#comment-18061804
]
Arup Chauhan edited comment on CASSANDRA-21026 at 2/28/26 9:49 AM:
-------------------------------------------------------------------
Hey, [~samt], I would like to help with this one.
I can focus on host-id-based ring-consistency validation so
foreign/inconsistent state is rejected consistently across the relevant
gossip/state admission paths.
I can also add or strengthen targeted regression tests to assert that
mismatched host-id ownership (or equivalent ring mismatch signals) is not
admitted.
Please let me know what scope would be most helpful first.
> Reusing the address of a removed node is not possible with Accord enabled
> -------------------------------------------------------------------------
>
> Key: CASSANDRA-21026
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21026
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Accord, Cluster/Membership
> Reporter: Sam Tunnicliffe
> Priority: Normal
>
> If the address of a decommissioned node is re-used by another new node at
> some later time, any node in the cluster with Accord enabled will be unable
> to start up, including the new node.
> As the new node comes up and registers with the {{ClusterMetadataService}} it
> is added to the {{Directory}}.
> The decommissioned node's details are also preserved in the directory to
> ensure that transactions which were in-flight can be completed after the
> node has left.
> (https://issues.apache.org/jira/browse/CASSANDRA-20142)
> During {{AccordService}} initialization, building the endpoint mapping will
> fail because of this check in {{EndpointMapping.Builder}}:
> {code}
> Invariants.requireArgument(!mapping.containsValue(endpoint), "Mapping already exists for %s", endpoint);
> {code}
> Additionally, it seems possible that the wrong method is being called in
> {{AccordTopology::directoryToEndpointMapping}}
> {code}
> // There are cases where nodes are removed from the cluster (host replacement, decom, etc.),
> // but inflight events may still be happening; keep the ids around so pending events do
> // not fail with a mapping error
> for (Directory.RemovedNode removedNode : directory.removedNodes())
>     builder.add(removedNode.endpoint, tcmIdToAccord(removedNode.id));
> {code}
> which should probably call {{builder::removed}} rather than {{builder::add}},
> but that also contains the same invariant check.
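
The failure described above can be illustrated with a minimal, self-contained sketch. It is not the Cassandra code: {{EndpointMappingSketch}}, {{tryBuild}}, the integer node ids, and the example addresses are all hypothetical stand-ins, and {{IllegalArgumentException}} is used here only as an assumed substitute for whatever {{Invariants.requireArgument}} throws. It shows only how a "one endpoint, one id" check rejects a removed node whose address has since been taken by a live node.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the builder described in the issue.
// It is NOT the real EndpointMapping.Builder from Cassandra.
final class EndpointMappingSketch {
    // node id -> endpoint; plain Integer/String are assumptions for illustration
    private final Map<Integer, String> mapping = new HashMap<>();

    // Mirrors the shape of the quoted invariant: an endpoint may be mapped only once.
    void add(String endpoint, int nodeId) {
        if (mapping.containsValue(endpoint))
            throw new IllegalArgumentException(
                String.format("Mapping already exists for %s", endpoint));
        mapping.put(nodeId, endpoint);
    }

    // Returns true if both nodes can be added, false if the invariant rejects the input.
    static boolean tryBuild(String liveEndpoint, String removedEndpoint) {
        EndpointMappingSketch builder = new EndpointMappingSketch();
        try {
            builder.add(liveEndpoint, 2);    // new node registered in the directory
            builder.add(removedEndpoint, 1); // removed node retained for in-flight transactions
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Distinct addresses: the mapping builds.
        System.out.println(tryBuild("10.0.0.1:7000", "10.0.0.2:7000")); // prints true
        // The new node reuses the removed node's address: the check fails.
        System.out.println(tryBuild("10.0.0.1:7000", "10.0.0.1:7000")); // prints false
    }
}
```

Under these assumptions, the second call fails regardless of the order in which the live and removed nodes are added, which matches the observation that switching to {{builder::removed}} alone would not help while it carries the same check.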