Hi all,

A failed node can rejoin a cluster even though all data in /var/lib/cassandra on that node was deleted. Is this normal?
I can reproduce it as follows.

cluster:
- C* 2.2.7
- a cluster has node1, 2, 3
- node1 is a seed
- replication_factor: 3

how to:
1) stop the C* process and delete all data in /var/lib/cassandra on node2 ($ sudo rm -rf /var/lib/cassandra/*)
2) stop the C* process on node1 and node3
3) restart C* on node1
4) restart C* on node2

nodetool status after 4):

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
DN  [node3 IP]  ?          256     100.0%            325553c6-3e05-41f6-a1f7-47436743816f  rack1
UN  [node2 IP]  7.76 MB    256     100.0%            05bdb1d4-c39b-48f1-8248-911d61935925  rack1
UN  [node1 IP]  416.13 MB  256     100.0%            a8ec0a31-cb92-44b0-b156-5bcd4f6f2c7b  rack1

If I instead restart C* on node2 while C* is still running on node1 and node3 (that is, skipping steps 2) and 3)), a runtime exception is thrown:

RuntimeException: "A node with address [node2 IP] already exists, cancelling join..."

I'm not sure whether this causes data loss. All data can be read properly just after the rejoin, but some rows are lost when I kill and restart C* for destructive tests after the rejoin.

Thanks.