[ https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867995#comment-17867995 ]
Stefan Miklosovic edited comment on CASSANDRA-18111 at 7/23/24 9:54 AM:
------------------------------------------------------------------------
_Wouldn't we just need to watch 3 parent snapshot directories with the current implementation versus 10k files if we were to monitor snapshot manifests instead?_
{code:java}
data1/ks1/tb1/snapshots/snapshot1/_files_
data2/ks1/tb1/snapshots/snapshot1/_files_
data3/ks1/tb1/snapshots/snapshot1/_files_
{code}
Now we watch 3 dirs:
{code:java}
data1/ks1/tb1/snapshots
data2/ks1/tb1/snapshots
data3/ks1/tb1/snapshots
{code}
and in each of them we react to the deletion of a snapshot dir. So yes, you are actually right: it would be 3 root snapshot dirs, detecting the removal of a dir in each, versus 30k snapshot manifests when we have 10k snapshots and want to introduce a manifest in each data dir. Let me think about this more ... I think what you suggest is possible to do.

was (Author: smiklosovic):
_If one snapshot directory was accidentally removed, we don't want to propagate this error to the remaining snapshot directories._

But on the contrary ... how are you going to fix it? You end up with two dirs out of three containing SSTables. So now what? How are you going to restore from that snapshot, when restoring it would not bring back all the data? What good is a snapshot which does not restore the data as it was? The results would be misleading, and you would probably need to repair after the restore, etc.

I am not completely persuaded by your point that we should not delete the snapshots when manual removal is detected. Because if we don't, a user has no visibility into which broken snapshots they have. These snapshots, when removed from memory, would be "orphaned": we cache it all in memory, so there is a disconnect between what is in memory and what is on disk, and that is the very reason I was introducing this.
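The in-memory cache and its on-disk disconnect described above could be kept in step roughly like this. This is a minimal sketch of the ticket's idea (load once, update on take/clear/manual removal); the class and method names are illustrative, not the actual Cassandra SnapshotManager API:

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: snapshots are loaded from disk once and cached, so
// listing never re-scans the data directories; every take/clear/removal
// updates the map to keep memory and disk from drifting apart.
class SnapshotCacheSketch
{
    // keyed by snapshot tag, e.g. "snapshot1" -> its data dirs
    private final Map<String, Set<String>> snapshots = new ConcurrentHashMap<>();

    // initial load from disk during startup (directories passed in here)
    void loadAll(Map<String, Set<String>> onDisk)
    {
        snapshots.putAll(onDisk);
    }

    // keep the cache in step with take/clear operations
    void onTaken(String tag, Set<String> dataDirs)
    {
        snapshots.put(tag, dataDirs);
    }

    void onCleared(String tag)
    {
        snapshots.remove(tag);
    }

    // manual removal detected by a directory watcher: drop the snapshot from
    // the cache so the in-memory view matches what is left on disk
    void onManuallyRemoved(String tag)
    {
        snapshots.remove(tag);
    }

    // listing is now a cheap map read, not a directory scan
    Set<String> list()
    {
        return snapshots.keySet();
    }
}
```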
How would you inform a user about corrupted snapshots which are missing one of their data dirs? (After the node is restarted, we can indeed just log it when we first detect it.) Also, about missing files: the next step we want to take here, which I was discussing with [~frankgh] / [~yifanc], is that we might extend the manifest to contain all the files etc., and then we can check (via an extension of nodetool listsnapshots or similar) that what is in the manifest is indeed located on disk in a non-corrupted state (matching checksums, etc.). So you can check the consistency of that. But how would you detect that you are completely missing one of the directories? What if we removed the dir with the manifest? Then the other two would not contain any. That is probably an additional argument for putting the manifest in every data dir.

_Wouldn't we just need to watch 3 parent snapshot directories with the current implementation versus 10k files if we were to monitor snapshot manifests instead?_
{code:java}
data1/ks1/tb1/snapshots/snapshot1/_files_
data2/ks1/tb1/snapshots/snapshot1/_files_
data3/ks1/tb1/snapshots/snapshot1/_files_
{code}
Now we watch 3 dirs:
{code:java}
data1/ks1/tb1/snapshots
data2/ks1/tb1/snapshots
data3/ks1/tb1/snapshots
{code}
and in each of them we react to the deletion of a snapshot dir. So yes, you are actually right: it would be 3 root snapshot dirs, detecting the removal of a dir in each, versus 30k snapshot manifests when we have 10k snapshots and want to introduce a manifest in each data dir.

EDIT: if we do not want to delete the data when some snapshot dir is removed, what we can do is just detect the removal (via the watcher) and, on that detection, not go and _remove_ the snapshot (which would delete the rest), but instead introduce a flag on TableSnapshot, like "corrupted", and simply set that flag.
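The flag-instead-of-delete idea from the EDIT above could look roughly like this. TableSnapshot here is a simplified stand-in for the real Cassandra class; the "corrupted" flag, the watcher callback, and the extra listing column are the hypothetical additions under discussion, not existing API:

```java
import java.util.Set;

// Simplified stand-in for Cassandra's TableSnapshot. When the watcher sees
// one of the snapshot's data dirs disappear, we do NOT delete the remaining
// dirs; we only flag the snapshot so it can be surfaced to the user.
class TableSnapshotSketch
{
    private final String tag;
    private final Set<String> dataDirs;
    private volatile boolean corrupted;

    TableSnapshotSketch(String tag, Set<String> dataDirs)
    {
        this.tag = tag;
        this.dataDirs = dataDirs;
        this.corrupted = false;
    }

    // Invoked by the directory watcher on an ENTRY_DELETE event for one of
    // this snapshot's data dirs: mark instead of cascading the deletion.
    void onDataDirRemoved(String dataDir)
    {
        if (dataDirs.contains(dataDir))
            corrupted = true;
    }

    boolean isCorrupted()
    {
        return corrupted;
    }

    // What a row of an extended "nodetool listsnapshots" output could show:
    // the snapshot tag plus the hypothetical new status column.
    String listingRow()
    {
        return tag + "\t" + (corrupted ? "corrupted" : "ok");
    }
}
```

As noted below, this flag lives only in memory: after a restart the missing dir leaves nothing to re-detect, so persisting the status (e.g. in the manifest) would be needed to keep it across restarts.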
So when you run "nodetool listsnapshots", there would be an additional column saying whether the snapshot is corrupted, and you could act on that. The bad news is that this would only be visible until we turn the node off, because on the next start, when it loads all the snapshots again, there is no data in that specific snapshot data dir (we removed it before), so there would be nothing to "detect" anymore.

> Centralize all snapshot operations to SnapshotManager and cache snapshots
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are
> scanned to find snapshots, which is inefficient.
> For example, fetching the
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)