[ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597110#comment-16597110
 ] 

Dinesh Joshi commented on CASSANDRA-14679:
------------------------------------------

If the operator misconfigures the node by removing a directory from 
{{data_file_directories}}, I don't think Cassandra can tell whether the change 
was intentional. I think the best approach would be to store a small 
file alongside {{cassandra.yaml}} on the node to remember state information. 
That way, when unexpected configuration changes do occur, the node knows about 
them and stops the bootstrap process. If the disk / volume that holds your 
{{cassandra.yaml}} is inaccessible, that should be a fatal error anyway and the 
node won't start up.
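
A minimal sketch of that state-file idea, in Python for brevity rather than Cassandra's Java. The file name {{node_state.json}} and all function names are hypothetical and not part of Cassandra; the sketch only assumes that the configured {{data_file_directories}} list is already available to the caller.

```python
import json
from pathlib import Path

# Hypothetical marker file stored next to cassandra.yaml; not a real Cassandra file.
STATE_FILE_NAME = "node_state.json"


def removed_directories(config_dir, configured_dirs):
    """Return directories recorded on a previous start that are no longer configured."""
    state_path = Path(config_dir) / STATE_FILE_NAME
    if not state_path.exists():
        return []  # first start: nothing to compare against
    recorded = json.loads(state_path.read_text())["data_file_directories"]
    return sorted(set(recorded) - set(configured_dirs))


def record_directories(config_dir, configured_dirs):
    """Remember the currently configured directories for the next start."""
    state_path = Path(config_dir) / STATE_FILE_NAME
    state = {"data_file_directories": sorted(configured_dirs)}
    state_path.write_text(json.dumps(state))


def check_startup(config_dir, configured_dirs):
    """Refuse to continue if a previously known directory was dropped from the config."""
    missing = removed_directories(config_dir, configured_dirs)
    if missing:
        raise RuntimeError(f"data directories removed from configuration: {missing}")
    record_directories(config_dir, configured_dirs)
```

Adding a directory passes the check and updates the record; only a removal is treated as suspicious. An operator who removes a directory on purpose would need some explicit override, which this sketch leaves out.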

It almost feels like Cassandra should dump key in-memory settings to a separate 
file and compare them with the on-disk settings in {{cassandra.yaml}} on the 
subsequent startup. Then we can have a well-defined set of rules to stop a 
potentially problematic bootstrap. The only caveat is that many people don't 
shut Cassandra down gracefully, so a dump written at shutdown may be missing 
or stale.
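
One way such a rule set could look, as a rough Python sketch. The rule table, the snapshot path, and the function names are all assumptions made for illustration; which settings deserve a blocking rule is exactly the open design question. The snapshot here is meant to be written after a successful startup rather than at shutdown, which would sidestep the graceful-shutdown caveat.

```python
import json
from pathlib import Path


def _entries_removed(old, new):
    """Blocking if any previously listed entry has disappeared."""
    return bool(set(old) - set(new))


def _any_change(old, new):
    """Blocking if the value differs at all."""
    return old != new


# Illustrative rules only: setting name -> predicate deciding whether a change blocks startup.
BLOCKING_RULES = {
    "data_file_directories": _entries_removed,
    "cluster_name": _any_change,
    "num_tokens": _any_change,
}


def save_snapshot(path, settings):
    """Persist the tracked settings, writing to a temp file first and renaming it into place."""
    tracked = {k: settings[k] for k in BLOCKING_RULES if k in settings}
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(tracked, sort_keys=True))
    tmp.replace(path)


def blocking_changes(path, settings):
    """Return the names of tracked settings whose change violates a rule."""
    path = Path(path)
    if not path.exists():
        return []  # no snapshot yet, so nothing to compare against
    previous = json.loads(path.read_text())
    violations = []
    for name, is_blocking in BLOCKING_RULES.items():
        # Only settings present in both the snapshot and the current config are compared.
        if name in previous and name in settings:
            if is_blocking(previous[name], settings[name]):
                violations.append(name)
    return violations
```

On startup the node would call {{blocking_changes}} with the freshly parsed settings, abort if the list is non-empty, and otherwise call {{save_snapshot}} once startup has succeeded.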

> Prevent generating new tokens on a node when data exists
> --------------------------------------------------------
>
>                 Key: CASSANDRA-14679
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: mck
>            Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable, it will 
> generate new tokens. This causes range movements in the cluster and can lead 
> to data loss, because these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if 
> {{cassandra.yaml}} changes and leaves a JBOD entry out.
> If a node starts up and finds data but not its {{system.local}}, it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html
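
The check the description asks for could be sketched roughly as follows, in Python. This is not how Cassandra implements it. It assumes an on-disk layout of {{<data_dir>/<keyspace>/<table>-<id>/}}, with the {{system.local}} table under {{system/local-*}}, and every function name is hypothetical.

```python
from pathlib import Path


def has_system_local(data_dirs):
    """True if any data directory holds a non-empty system/local-* table directory."""
    for d in data_dirs:
        system_dir = Path(d) / "system"
        if not system_dir.is_dir():
            continue
        for table_dir in system_dir.glob("local-*"):
            if table_dir.is_dir() and any(table_dir.iterdir()):
                return True
    return False


def has_other_data(data_dirs):
    """True if any data directory contains a non-empty keyspace directory other than system."""
    for d in data_dirs:
        root = Path(d)
        if not root.is_dir():
            continue
        for keyspace_dir in root.iterdir():
            if not keyspace_dir.is_dir() or keyspace_dir.name == "system":
                continue
            if any(keyspace_dir.iterdir()):
                return True
    return False


def must_refuse_new_tokens(data_dirs):
    """Data exists but system.local does not: generating tokens here would be unsafe."""
    return has_other_data(data_dirs) and not has_system_local(data_dirs)
```

A genuinely fresh node has neither kind of data, so it passes; a node that lost only the JBOD entry holding {{system.local}} is caught.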



