[jira] [Comment Edited] (CASSANDRA-18111) Centralize all snapshot operations to SnapshotManager and cache snapshots

Stefan Miklosovic (Jira) Mon, 22 Jul 2024 00:28:06 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867673#comment-17867673
 ]


Stefan Miklosovic edited comment on CASSANDRA-18111 at 7/22/24 7:27 AM:
------------------------------------------------------------------------

That behaviour makes sense to me. Why would you want to keep a technically 
corrupted snapshot around? That snapshot is not valid anymore.

Also, tracking every manifest file in order to be notified about its deletion 
multiplies the number of watched files by the number of data dirs. For 
example, with 10k snapshots spread across three disks, we would need to 
monitor 30k manifest files instead of just 10k snapshot directories.
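
The arithmetic behind that comparison, as a minimal sketch. The class and 
method names are hypothetical and not part of Cassandra, and the sketch 
assumes, as in the example above, one manifest per snapshot per data dir and 
one watched directory per snapshot.

```java
// Hypothetical helper, not part of Cassandra: compares how many filesystem
// entries would have to be watched under the two approaches.
final class WatchCountSketch {

    // Assumption: every snapshot has one manifest file in each data dir.
    static long manifestWatches(long snapshots, int dataDirs) {
        return snapshots * dataDirs;
    }

    // Assumption: one watched directory per snapshot, as counted in the comment.
    static long directoryWatches(long snapshots) {
        return snapshots;
    }

    public static void main(String[] args) {
        long snapshots = 10_000;
        int dataDirs = 3;
        // prints "manifest files to watch: 30000"
        System.out.println("manifest files to watch: " + manifestWatches(snapshots, dataDirs));
        // prints "snapshot directories to watch: 10000"
        System.out.println("snapshot directories to watch: " + directoryWatches(snapshots));
    }
}
```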


was (Author: smiklosovic):
That behaviour makes sense to me. Why would you want to keep a technically 
corrupted snapshot around? That snapshot is not valid anymore.

> Centralize all snapshot operations to SnapshotManager and cache snapshots
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are 
> scanned to find snapshots, which is inefficient.
> For example, fetching the 
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric 
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via 
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection whenever a snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.
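
Steps a) to d) above could be sketched roughly as follows. This is only an 
illustration under simplifying assumptions, not Cassandra's SnapshotManager: 
the class name and all method names are hypothetical, and the sketch assumes 
a single root directory in which every subdirectory is one snapshot, whereas 
the real on-disk layout is more involved.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

// Hypothetical illustration only; this is not Cassandra's SnapshotManager.
final class SnapshotCacheSketch {

    // (b) in-memory collection: snapshot tag -> snapshot directory
    private final Map<String, Path> snapshots = new ConcurrentHashMap<>();

    // (a) scan the root once at start-up; each subdirectory counts as one snapshot
    void loadFromDisk(Path snapshotsRoot) throws IOException {
        if (!Files.isDirectory(snapshotsRoot)) {
            return;
        }
        try (Stream<Path> children = Files.list(snapshotsRoot)) {
            children.filter(Files::isDirectory)
                    .forEach(dir -> snapshots.put(dir.getFileName().toString(), dir));
        }
    }

    // (c) keep the collection in sync when a snapshot is taken or cleared
    void onSnapshotTaken(String tag, Path dir) {
        snapshots.put(tag, dir);
    }

    void onSnapshotCleared(String tag) {
        snapshots.remove(tag);
    }

    // (d) drop cached entries whose directory no longer exists on disk
    void evictManuallyRemoved() {
        snapshots.values().removeIf(dir -> !Files.exists(dir));
    }

    // Listing is answered from memory, without scanning data directories.
    Set<String> listSnapshots() {
        return Set.copyOf(snapshots.keySet());
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapshots-sketch");
        Path first = Files.createDirectory(root.resolve("first"));
        Files.createDirectory(root.resolve("second"));

        SnapshotCacheSketch cache = new SnapshotCacheSketch();
        cache.loadFromDisk(root);
        System.out.println("after load: " + cache.listSnapshots().size() + " snapshots");

        Files.delete(first);            // simulate a manual removal from disk
        cache.evictManuallyRemoved();
        System.out.println("after manual removal: " + cache.listSnapshots());
    }
}
```

The sketch detects manual removal by re-checking each cached directory on 
demand. A watch-based notification, as discussed in the comment, is another 
option and is deliberately left out here.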


