[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783517#comment-17783517 ]
Stefan Miklosovic edited comment on CASSANDRA-18945 at 11/7/23 12:31 PM:
-------------------------------------------------------------------------

So ... this is interesting. It fails the multiplexer of j17_jvm_dtests_vnode_repeat as well as the individual test in j17_jvm_dtests_vnode.

What is interesting is that it does not fail j17_jvm_dtests_repeat (without vnode). java17_separate_tests, which runs j17_jvm_dtests_vnode, does not fail it either. I am running j17_jvm_dtests_vnode_repeat for java17_separate_tests to see whether it indeed fails there too.

This is the PR against trunk. This is the branch (2)

(1) https://app.circleci.com/pipelines/github/instaclustr/cassandra/3443/workflows/97bfb70b-146a-4da8-afaf-7f5909b0492d
(2) https://github.com/instaclustr/cassandra/commits/CASSANDRA-18945-trunk

EDIT: Yes, j17_jvm_dtests_vnode_repeat on j17 separate tests fails the multiplexer too. That test is flaky and has to be reworked.

> Unified Compaction Strategy is creating too many sstables
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-18945
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Branimir Lambov
>            Assignee: Ethan Brown
>            Priority: Normal
>             Fix For: 5.0-beta
>
>         Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html,
> file_ucs_shenandoah_off_heap_memtable.html,
> file_ucs_shenandoah_on_heap_memtable_2.html,
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close
> to the same size, defaulting to 1 GiB. Unfortunately tests show that
> Cassandra starts to have performance problems when the number of sstables
> grows to the order of a thousand, and in particular that even 1 TiB of data
> with the default configuration is creating too many sstables for efficient
> processing. This matters even more for SAI, where the number of sstables in
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to
> take some part of the data growth by adding a multiplier to [the shard count
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
> formula, replacing
> {{2 ^ round(log2(d / (t * b))) * b}}
> with
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}},
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in
> parallel at the square root of the data size growth. 0 would result in no
> growth of the sstable size (the formula reduces to the current one), and 1
> in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard
> count to avoid splitting lowest-level sstables into fragments that are too
> small.
> Once both of these are in place, we can set defaults that better suit all
> node densities, including 10 TiB and beyond, for example:
> - target size of 1 GiB
> - 𝜆 of 1/3
> - base shard count of 4
> - minimum size 100 MiB
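
To make the effect of 𝜆 in the quoted description concrete, here is a small Python sketch of the proposed formula. It is only an illustration, not the Cassandra implementation: it reads d as the data size, t as the target sstable size and b as the base shard count, the function and parameter names are made up for this example, and clamping the exponent at 0 is an assumption of the sketch (the description only mentions a minimum-size threshold without defining it).

```python
import math

GIB = 1024 ** 3
TIB = 1024 ** 4


def shard_count(data_size, target_size=GIB, base_shards=4, growth=1 / 3):
    """Evaluate 2 ^ round((1 - growth) * log2(d / (t * b))) * b.

    `growth` plays the role of the lambda parameter from the description.
    All names here are hypothetical; they are not Cassandra options.
    """
    ratio = data_size / (target_size * base_shards)
    exponent = round((1 - growth) * math.log2(ratio))
    # Assumption of this sketch: never go below the base shard count.
    exponent = max(0, exponent)
    return 2 ** exponent * base_shards


if __name__ == "__main__":
    for growth in (0, 1 / 3, 0.5, 1):
        shards = shard_count(TIB, growth=growth)
        size_gib = TIB / shards / GIB
        print(f"lambda={growth:.2f}: {shards} shards, ~{size_gib:.0f} GiB each")
```

For 1 TiB of data with a 1 GiB target and a base shard count of 4, this gives 1024 shards with 𝜆 = 0, 128 shards (about 8 GiB each) with 𝜆 = 1/3, 64 shards with 𝜆 = 0.5, and 4 shards with 𝜆 = 1. Note that Python's round() rounds exact halves to the nearest even integer, which may differ from the rounding the real code uses.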