[ https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867995#comment-17867995 ]
Stefan Miklosovic edited comment on CASSANDRA-18111 at 7/23/24 9:54 AM:
------------------------------------------------------------------------
_Wouldn't we just need to watch 3 parent snapshot directories with the current implementation versus 10k files if we were to monitor snapshot manifests instead?_
{code:java}
data1/ks1/tb1/snapshots/snapshot1/_files_
data2/ks1/tb1/snapshots/snapshot1/_files_
data3/ks1/tb1/snapshots/snapshot1/_files_
{code}
Now we watch 3 dirs:
{code:java}
data1/ks1/tb1/snapshots
data2/ks1/tb1/snapshots
data3/ks1/tb1/snapshots
{code}
and in each of them we react to the deletion of a snapshot dir. So yes, you are actually right: it would be 3 root snapshot dirs, detecting the removal of a dir in each, versus 30k snapshot manifests when we have 10k snapshots and want to introduce a manifest in each data dir. Let me think about this more ... I think what you suggest is possible to do.

was (Author: smiklosovic):
_If one snapshot directory was accidentally removed, we don't want to propagate this error to the remaining snapshot directories._

But on the contrary ... how are you going to fix it? You end up with two dirs out of three containing SSTables. So now what? How are you going to restore from that snapshot, when restoring it would not bring back all the data? What good is a snapshot which does not restore the data as it was? The results would be misleading, and you would probably need to repair after the restore, etc.

I am not completely persuaded by your point that we should not delete the snapshots when manual removal is detected. Because if we don't, a user has no visibility into which broken snapshots they have. These snapshots, when removed from memory, would be "orphaned": we cache it all in memory, so there is a disconnect between what is in memory and what is on disk, and that is the very reason I was introducing this.
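The in-memory cache and its on-disk disconnect described above could be kept in step roughly like this. This is a minimal sketch of the ticket's idea (load once, update on take/clear/manual removal); the class and method names are illustrative, not the actual Cassandra SnapshotManager API:

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: snapshots are loaded from disk once and cached, so
// listing never re-scans the data directories; every take/clear/removal
// updates the map to keep memory and disk from drifting apart.
class SnapshotCacheSketch
{
    // keyed by snapshot tag, e.g. "snapshot1" -> its data dirs
    private final Map<String, Set<String>> snapshots = new ConcurrentHashMap<>();

    // initial load from disk during startup (directories passed in here)
    void loadAll(Map<String, Set<String>> onDisk)
    {
        snapshots.putAll(onDisk);
    }

    // keep the cache in step with take/clear operations
    void onTaken(String tag, Set<String> dataDirs)
    {
        snapshots.put(tag, dataDirs);
    }

    void onCleared(String tag)
    {
        snapshots.remove(tag);
    }

    // manual removal detected by a directory watcher: drop the snapshot from
    // the cache so the in-memory view matches what is left on disk
    void onManuallyRemoved(String tag)
    {
        snapshots.remove(tag);
    }

    // listing is now a cheap map read, not a directory scan
    Set<String> list()
    {
        return snapshots.keySet();
    }
}
```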
How would you inform a user about corrupted snapshots which are missing one of their data dirs? (After the node is restarted, we can indeed just log it when we first detect it.) Also, about missing files: the next step we want to take here, which I was discussing with [~frankgh] / [~yifanc], is that we might extend the manifest to contain all the files etc., and then we can check (via an extension of nodetool listsnapshots or similar) that what is in the manifest is indeed located on disk in a non-corrupted state (matching checksums, etc.). So you can check the consistency of that. But how would you detect that you are completely missing one of the directories? What if we removed the dir with the manifest? Then the other two would not contain any. That is probably an additional argument for putting the manifest in every data dir.

_Wouldn't we just need to watch 3 parent snapshot directories with the current implementation versus 10k files if we were to monitor snapshot manifests instead?_
{code:java}
data1/ks1/tb1/snapshots/snapshot1/_files_
data2/ks1/tb1/snapshots/snapshot1/_files_
data3/ks1/tb1/snapshots/snapshot1/_files_
{code}
Now we watch 3 dirs:
{code:java}
data1/ks1/tb1/snapshots
data2/ks1/tb1/snapshots
data3/ks1/tb1/snapshots
{code}
and in each of them we react to the deletion of a snapshot dir. So yes, you are actually right: it would be 3 root snapshot dirs, detecting the removal of a dir in each, versus 30k snapshot manifests when we have 10k snapshots and want to introduce a manifest in each data dir.

EDIT: if we do not want to delete the data when some snapshot dir is removed, what we can do is just detect the removal (via the watcher) and, on that detection, not go and _remove_ the snapshot (which would delete the rest), but instead introduce a flag on TableSnapshot, like "corrupted", and simply set that flag.
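The flag-instead-of-delete idea from the EDIT above could look roughly like this. TableSnapshot here is a simplified stand-in for the real Cassandra class; the "corrupted" flag, the watcher callback, and the extra listing column are the hypothetical additions under discussion, not existing API:

```java
import java.util.Set;

// Simplified stand-in for Cassandra's TableSnapshot. When the watcher sees
// one of the snapshot's data dirs disappear, we do NOT delete the remaining
// dirs; we only flag the snapshot so it can be surfaced to the user.
class TableSnapshotSketch
{
    private final String tag;
    private final Set<String> dataDirs;
    private volatile boolean corrupted;

    TableSnapshotSketch(String tag, Set<String> dataDirs)
    {
        this.tag = tag;
        this.dataDirs = dataDirs;
        this.corrupted = false;
    }

    // Invoked by the directory watcher on an ENTRY_DELETE event for one of
    // this snapshot's data dirs: mark instead of cascading the deletion.
    void onDataDirRemoved(String dataDir)
    {
        if (dataDirs.contains(dataDir))
            corrupted = true;
    }

    boolean isCorrupted()
    {
        return corrupted;
    }

    // What a row of an extended "nodetool listsnapshots" output could show:
    // the snapshot tag plus the hypothetical new status column.
    String listingRow()
    {
        return tag + "\t" + (corrupted ? "corrupted" : "ok");
    }
}
```

As noted below, this flag lives only in memory: after a restart the missing dir leaves nothing to re-detect, so persisting the status (e.g. in the manifest) would be needed to keep it across restarts.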
So when you run "nodetool listsnapshots", there would be an additional column saying whether the snapshot is corrupted, and you could act on that. The bad news is that this would only be visible until we turn the node off, because on the next start, when it loads all the snapshots again, there is no data in that specific snapshot data dir (we removed it before), so there would be nothing to "detect" anymore.

> Centralize all snapshot operations to SnapshotManager and cache snapshots
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are
> scanned to find snapshots, which is inefficient.
> For example, fetching the
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)