[jira] [Commented] (CASSANDRA-18111) Cache snapshots in memory

Stefan Miklosovic (Jira) Wed, 19 Jun 2024 01:22:09 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856211#comment-17856211
 ]


Stefan Miklosovic commented on CASSANDRA-18111:
-----------------------------------------------

{code}
    @Test
    public void testTableMetadataSize()
    {
        TableSnapshot tableSnapshot = new TableSnapshot("my_keyspace-name",
                                                        "my_table_name_a_litt_bit_longer",
                                                        UUID.randomUUID(),
                                                        UUID.randomUUID().toString(), // tag
                                                        Instant.now(),
                                                        null,
                                                        Set.of(new File("/a/b/d/c/d/d/sdsd/sdsdsdsd/sdsd/sdsdsdsd/sds/ds/ds/dsd"),
                                                               new File("/a/b/d/c/d/d/sdsd/sdsdsdsd/sdsd/lklklks /sds/ds/ds/dsd")),
                                                        false);

        long size = meter.measure(tableSnapshot);
        long deepSize = meter.measureDeep(tableSnapshot);

        System.out.println(deepSize); // 768

        TableSnapshot tableSnapshot2 = new TableSnapshot("my_keyspace-name",
                                                         "my_table_name_a_litt_bit_longer",
                                                         UUID.randomUUID(),
                                                         UUID.randomUUID().toString(), // tag
                                                         Instant.now(),
                                                         null,
                                                         Set.of(new File("/a/b/d/c/d/d/sdsd/sdsdsdsd/sdsd/sdsdsdsd/sds/ds/ds/dsd")),
                                                         false);

        long deepSize2 = meter.measureDeep(tableSnapshot2);

        System.out.println(deepSize2); // 648

        // 328
        System.out.println(meter.measureDeep(Set.of(new File("/a/b/d/c/d/d/sdsd/sdsdsdsd/sdsd/sdsdsdsd/sds/ds/ds/dsd"))));
        // 120
        System.out.println(meter.measureDeep(Set.of("/a/b/d/c/d/d/sdsd/sdsdsdsd/sdsd/sdsdsdsd/sds/ds/ds/dsd")));
    }
{code}

So with two data dirs a snapshot entry takes around 768 bytes, and with one 
data dir 648 bytes.

If we manage to store a Set of Strings instead of a Set of Files, that set 
shrinks from 328 to 120 bytes, i.e. to less than half, so we may say that we 
save about 200 bytes per data dir. Hence it would be around 500 bytes per 
snapshot entry if we average all things out.

So if I count it correctly, around 200 000 snapshots would fit into 100 MiB; 
20k snapshots would take about 10 MiB and 10k snapshots about 5 MiB. I do not 
think we need to deal with this ... 
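
The arithmetic behind those estimates can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, the 328 and 120 byte figures are the measurements from the test above, and 500 bytes per entry is the rounded average from this comment rather than a measured value.

```java
// Back-of-the-envelope sizing for the snapshot cache.
// All names here are illustrative; nothing in this class exists in Cassandra.
final class SnapshotCacheSizing
{
    static final long MIB = 1024L * 1024L;

    // Deep size of a one-element set of data dirs, as measured in the test above.
    static final long FILE_SET_BYTES = 328;
    static final long STRING_SET_BYTES = 120;

    // Rounded per-entry average (an assumption, not a measurement).
    static final long ESTIMATED_ENTRY_BYTES = 500;

    // Bytes saved per data dir by storing the path as a String instead of a File.
    static long savedPerDataDir()
    {
        return FILE_SET_BYTES - STRING_SET_BYTES;
    }

    // How many entries of the given size fit into the given budget.
    static long entriesThatFit(long budgetBytes, long bytesPerEntry)
    {
        return budgetBytes / bytesPerEntry;
    }

    // Total bytes needed for the given number of entries.
    static long bytesFor(long entries, long bytesPerEntry)
    {
        return entries * bytesPerEntry;
    }

    public static void main(String[] args)
    {
        System.out.println(savedPerDataDir());                                // 208
        System.out.println(entriesThatFit(100 * MIB, ESTIMATED_ENTRY_BYTES)); // 209715
        System.out.println(bytesFor(20_000, ESTIMATED_ENTRY_BYTES));          // 10000000
        System.out.println(bytesFor(10_000, ESTIMATED_ENTRY_BYTES));          // 5000000
    }
}
```

Note that 10 000 000 bytes is about 9.5 MiB, so the "10 MiB" and "5 MiB" figures are slight overestimates, which does not change the conclusion.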



> Cache snapshots in memory
> -------------------------
>
>                 Key: CASSANDRA-18111
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18111
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Every time {{nodetool listsnapshots}} is called, all data directories are 
> scanned to find snapshots, which is inefficient.
> For example, fetching the 
> {{org.apache.cassandra.metrics:type=ColumnFamily,name=SnapshotsSize}} metric 
> can take half a second (CASSANDRA-13338).
> This improvement will also allow snapshots to be efficiently queried via 
> virtual tables (CASSANDRA-18102).
> In order to do this, we should:
> a) load all snapshots from disk during initialization
> b) keep a collection of snapshots on {{SnapshotManager}}
> c) update the snapshots collection anytime a new snapshot is taken or cleared
> d) detect when a snapshot is manually removed from disk.
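
Steps a) to d) from the description could be sketched roughly as below. This is a minimal illustration, not the actual SnapshotManager: the SnapshotRegistrySketch and Entry types, their fields, and the key format are all hypothetical, and detecting manual removal is reduced to a simple existence check on the snapshot directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry of snapshots; not Cassandra's SnapshotManager.
final class SnapshotRegistrySketch
{
    // Hypothetical stand-in for a snapshot entry; not Cassandra's TableSnapshot.
    static final class Entry
    {
        final String keyspace;
        final String table;
        final String tag;
        final Path directory;

        Entry(String keyspace, String table, String tag, Path directory)
        {
            this.keyspace = keyspace;
            this.table = table;
            this.tag = tag;
            this.directory = directory;
        }

        String key()
        {
            return keyspace + '.' + table + ':' + tag;
        }
    }

    // b) the collection of known snapshots, kept in memory
    private final Map<String, Entry> snapshots = new ConcurrentHashMap<>();

    // a) populate the registry once, from whatever a startup disk scan found
    void loadAll(Collection<Entry> foundOnDisk)
    {
        foundOnDisk.forEach(this::add);
    }

    // c) called when a snapshot is taken
    void add(Entry entry)
    {
        snapshots.put(entry.key(), entry);
    }

    // c) called when a snapshot is cleared
    void clear(Entry entry)
    {
        snapshots.remove(entry.key());
    }

    // d) drop entries whose directory no longer exists; returns how many were dropped
    int pruneMissing()
    {
        int removed = 0;
        for (Entry entry : snapshots.values())
        {
            if (!Files.exists(entry.directory) && snapshots.remove(entry.key(), entry))
                removed++;
        }
        return removed;
    }

    // Listing reads only the in-memory view, so no data directories are scanned.
    Collection<Entry> list()
    {
        return List.copyOf(snapshots.values());
    }
}
```

With something along these lines, listing snapshots becomes a read of the in-memory map, and the per-entry footprint of that map is what the measurement in the comment above estimates.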



