[ https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597110#comment-16597110 ]
Dinesh Joshi commented on CASSANDRA-14679:
------------------------------------------

If the operator misconfigures the node by removing a directory from {{data_file_directories}}, I don't think Cassandra can tell whether the change was intentional or unintentional. I think the best approach would be to store a small file alongside {{cassandra.yaml}} on the node to remember state information. That way, when unexpected configuration changes do occur, the node knows about them and stops the bootstrap process. If the disk or volume that holds {{cassandra.yaml}} is inaccessible, that should be a fatal error anyway and the node won't start up.

It almost feels like Cassandra should dump key in-memory settings to a separate file and compare them with the on-disk settings in {{cassandra.yaml}} on the subsequent startup. Then we can have a well-defined set of rules to stop a potentially problematic bootstrap. The only caveat is that many people don't shut Cassandra down gracefully.

> Prevent generating new tokens on a node when data exists
> --------------------------------------------------------
>
>                 Key: CASSANDRA-14679
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: mck
>            Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data
> available.
> If a node restarts and its {{system.local}} data is unavailable it will
> generate new tokens. This will cause range movements in the cluster causing
> potential data loss, as these range movements are not part of a
> bootstrap/decommission and leaves orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up, finds data but not its {{system.local}} it should not
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
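
The state-file idea from the comment above can be illustrated with a minimal sketch. It is not Cassandra code and not a proposed patch: the class name {{DataDirectoryGuard}}, its methods, and the one-directory-per-line file format are all hypothetical, chosen only to show the compare-on-startup step. The sketch assumes the caller has already parsed {{data_file_directories}} from {{cassandra.yaml}} and decides where the state file lives.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper (not part of Cassandra): remembers the last-seen data directories. */
final class DataDirectoryGuard {

    private DataDirectoryGuard() {}

    /** Directories recorded on a previous start that are absent from the current configuration. */
    static List<String> missingDirectories(Collection<String> recorded, Collection<String> current) {
        TreeSet<String> missing = new TreeSet<>(recorded);
        missing.removeAll(current);
        return new ArrayList<>(missing);
    }

    /** Reads the recorded directories, one per line; an absent file is treated as a first start. */
    static List<String> load(Path stateFile) {
        if (!Files.exists(stateFile)) {
            return Collections.emptyList();
        }
        try {
            return Files.readAllLines(stateFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the state file with the current directories, sorted, one per line. */
    static void record(Path stateFile, Collection<String> current) {
        try {
            Files.write(stateFile, new TreeSet<>(current), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Startup check: refuse to continue if a previously recorded directory has
     * disappeared from the configuration; otherwise record the current set.
     */
    static void checkAndRecord(Path stateFile, Collection<String> current) {
        List<String> missing = missingDirectories(load(stateFile), current);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "data_file_directories entries removed since last start: " + missing);
        }
        record(stateFile, current);
    }
}
```

In this sketch the file is written on every successful start rather than at shutdown, which is one possible way around the ungraceful-shutdown caveat; whether that fits Cassandra's startup sequence is an open question. An operator who removes a directory on purpose would also need some explicit way to acknowledge the change, which the sketch leaves out.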