[ 
https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855878#comment-17855878
 ] 

Stefan Miklosovic commented on CASSANDRA-18111:
-----------------------------------------------

In the description we say:

_For example, fetching the 
org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize metric can 
take half a second (CASSANDRA-13338)._ 

The reason this happens is not that we list the snapshots every time. 
Even though we cache a snapshot representation in memory (TableSnapshot, which 
just describes the basic metadata), the output of e.g. nodetool listsnapshots 
has two columns, "True size" and "Size on disk", and both of them are computed 
by going to the disk to read the file sizes. Hence this has nothing to do with 
what we "cache".

Snapshots are composed of hard-linked sstables. The true snapshot size should 
only include snapshot files which do not have a corresponding "live" sstable 
file. "Size on disk" is just the sum of the sizes of the directories where that 
snapshot is, regardless of what is hard-linked or not.
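
To make the difference between the two columns concrete, here is a minimal 
sketch of how both numbers could be computed from the disk. This is not the 
actual Cassandra code: the class and method names are hypothetical, the 
directories are scanned non-recursively for brevity, and matching a snapshot 
file to its "live" counterpart by file name is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only (not a Cassandra class).
final class SnapshotSizeSketch {

    // Regular files directly inside a directory (non-recursive, for brevity).
    static List<Path> regularFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    // "Size on disk": every file in the snapshot directory counts,
    // whether or not it is hard-linked to a live sstable file.
    static long sizeOnDisk(Path snapshotDir) throws IOException {
        long total = 0;
        for (Path file : regularFiles(snapshotDir)) {
            total += Files.size(file);
        }
        return total;
    }

    // "True size": only files without a same-named file in the live table
    // directory count (name matching is an assumption of this sketch).
    static long trueSize(Path snapshotDir, Path liveTableDir) throws IOException {
        long total = 0;
        for (Path file : regularFiles(snapshotDir)) {
            if (!Files.exists(liveTableDir.resolve(file.getFileName()))) {
                total += Files.size(file);
            }
        }
        return total;
    }
}
```

Both methods touch the file system for every file on every call, which is the 
cost described above.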

To not go to the disk at all, we would need to start tracking all files of the 
snapshot in the manifest, together with their sizes, upon its creation, and 
then just provide the sum of these sizes from the manifest. So when we cache 
the manifest and we want to know the sizes, we would just sum them up in 
memory, without looking at the disk.
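
A rough sketch of that idea, assuming the manifest were extended to record a 
size per file. The class name, the map layout and the way live file names are 
passed in are all hypothetical; nothing like this exists in the manifest 
today.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory view of an extended manifest (not a Cassandra class).
final class ManifestSizeSketch {

    // File name -> size in bytes, recorded once when the snapshot is created.
    private final Map<String, Long> fileSizes;

    ManifestSizeSketch(Map<String, Long> fileSizes) {
        this.fileSizes = new LinkedHashMap<>(fileSizes);
    }

    // "Size on disk" answered purely from memory: sum of all recorded sizes.
    long sizeOnDisk() {
        long total = 0;
        for (long size : fileSizes.values()) {
            total += size;
        }
        return total;
    }

    // "True size" answered from memory: skip files that still have a live
    // counterpart. The caller supplies the live file names (an assumption;
    // they would have to come from some in-memory source as well).
    long trueSize(Set<String> liveFileNames) {
        long total = 0;
        for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
            if (!liveFileNames.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

With sizes recorded at creation time, neither method needs any file system 
call.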

This would require us to extend the manifest.json which has this content atm:

{code}
{
  "files" : [ "oa-1-big-Data.db" ],
  "created_at" : "2024-06-18T10:49:22.789Z",
  "expires_at" : null,
  "ephemeral" : false
}
{code}

I am not sure why only the Data.db file is included, probably because it is 
the primary contributor to the snapshot size and all other files are negligible 
in size compared to it. Nevertheless, we should add all files there. On the 
other hand, all files of a logical SSTable are listed in the TOC.txt file of 
that SSTable. I am more in favor of explicitly enumerating all files in 
manifest.json together with their sizes; otherwise we would need to go to the 
disk again to find out from the TOC what files there are and calculate their 
sizes again ... not good.

A more important question is whether this was modelled like that on purpose - 
to go to the disk every single time - because if you think about it ... if 
somebody goes to the data dir and removes some files from the snapshot, then 
once we start to compute and hold the sizes in the manifest, these numbers will 
not match the actual state anymore.
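
To illustrate the mismatch, here is a sketch of a check that compares sizes 
recorded in a manifest with what is actually on disk. It is only an 
illustration of the problem, not a proposal for how Cassandra should handle 
it; the class, the method and the per-file size map are hypothetical. Note 
that running such a check goes to the disk again.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not a Cassandra class).
final class ManifestDriftSketch {

    // Returns the manifest entries whose file is missing on disk or whose
    // current size differs from the size recorded in the manifest.
    static List<String> staleEntries(Path snapshotDir, Map<String, Long> recordedSizes) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : recordedSizes.entrySet()) {
            Path file = snapshotDir.resolve(entry.getKey());
            try {
                if (!Files.isRegularFile(file) || Files.size(file) != entry.getValue()) {
                    stale.add(entry.getKey());
                }
            } catch (IOException e) {
                // A file that cannot be read is treated as stale in this sketch.
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```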

> Cache snapshots in memory
> -------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Everytime {{nodetool listsnapshots}} is called, all data directories are 
> scanned to find snapshots, what is inefficient.
> For example, fetching the 
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric 
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via 
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

