[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526452#comment-17526452 ]
Stefan Miklosovic commented on CASSANDRA-17180:
-----------------------------------------------

[~paulo] thanks for finally looking into it. I'll deal with it over the weekend to finally move this over the line.

I had implemented something similar to your postActions idea, but Brandon's opinion was that we were just inventing something else here. But I see you moved that "execute post actions loop" to run after all checks are verified in CassandraDaemon instead of having it in StartupChecks.verify directly. I am fine with your take on that; is Brandon too?

Good to know this is going to check system_distributed and system_auth too.

As for the default place of the heartbeat file, that's a good point. Maybe we should go a little bit wild here and save it to /tmp/? I think that has the best guarantee of being writable. I do not like the idea of a file suddenly appearing in the area reserved for sstables / tables. Other existing software might have a problem with this. For example, when you are taking backups, would you need to exclude or include that file? It depends on how people look at these backups. For that reason I would place it somewhere else.

But if we place it in /tmp and you have more than one node running on the same machine, there will be a clash, as the two nodes would write to the same file {_}by default{_}. In that case we would have to make the file name unique, e.g. by including the node's id. What is your take on this?

Yes, we can rename that class.

I do not mind starting to write JSON into that file, but how do you want to parse it? I still need to read it, check it and so on. What would you like to replace all that logic with?
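To make the two open questions concrete (a unique default file name and how a JSON heartbeat would be read back), here is a minimal sketch. It is not the patch under review: the class name HeartbeatFile, the file name pattern and the JSON key last_heartbeat_ms are all hypothetical, and a real implementation would most likely use a proper JSON library instead of a regex.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public final class HeartbeatFile {
    // Assumed key name; nothing in the discussion fixes a JSON schema.
    private static final Pattern TIMESTAMP =
        Pattern.compile("\"last_heartbeat_ms\"\\s*:\\s*(\\d+)");

    private HeartbeatFile() {}

    /** Builds a per-node path so two nodes sharing a directory do not clash. */
    public static Path pathFor(Path directory, String nodeId) {
        return directory.resolve("cassandra-heartbeat-" + nodeId + ".json");
    }

    /** Overwrites the file with a one-field JSON document. */
    public static void write(Path file, long timestampMillis) throws IOException {
        String json = "{\"last_heartbeat_ms\": " + timestampMillis + "}\n";
        Files.write(file, json.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the timestamp back; fails if the field is missing. */
    public static long read(Path file) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Matcher m = TIMESTAMP.matcher(content);
        if (!m.find()) {
            throw new IOException("No heartbeat timestamp found in " + file);
        }
        return Long.parseLong(m.group(1));
    }
}
```

With this naming, two nodes pointed at the same directory (for example /tmp) would each write to their own file, at the cost of needing the node's id before the file can be located on startup.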
> Implement startup check to prevent Cassandra start to spread zombie data
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/Observability
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>          Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service which would
> periodically write a timestamp to a file, signalling that it is up / running.
> Then, on startup, we would read this file and determine whether there
> is some table whose gc grace is behind this time, and we would fail the start
> so that zombie data is not likely to be spread around a cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
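
The check described in the issue can be sketched roughly as follows. This is only an illustration of the idea, not the code in the patch: the class GcGraceCheck, its method names and the map of table names to gc_grace_seconds are assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the startup check, for illustration only.
public final class GcGraceCheck {
    private GcGraceCheck() {}

    /** Returns the tables whose gc grace period elapsed since the last heartbeat. */
    public static List<String> expiredTables(Instant lastHeartbeat,
                                             Instant now,
                                             Map<String, Integer> gcGraceSecondsByTable) {
        long downSeconds = Duration.between(lastHeartbeat, now).getSeconds();
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Integer> e : gcGraceSecondsByTable.entrySet()) {
            // The node was silent for longer than this table's gc grace.
            if (downSeconds > e.getValue()) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    /** Fails the start if any table's gc grace was exceeded. */
    public static void verify(Instant lastHeartbeat,
                              Instant now,
                              Map<String, Integer> gcGraceSecondsByTable) {
        List<String> expired = expiredTables(lastHeartbeat, now, gcGraceSecondsByTable);
        if (!expired.isEmpty()) {
            throw new IllegalStateException(
                "Node was down longer than gc grace for tables: " + expired);
        }
    }
}
```

For a table with the default gc_grace_seconds of 864000 (10 days), a node whose last heartbeat is 11 days old would fail this check, while a node that was down for 9 days would start normally.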