Hello Alexander,

Thank you for the pointer. It looks like this part of the documentation has become outdated. Major compaction does indeed start separate operations for non-overlapping sections of the compaction space, but because of changes in the default configuration, the space is no longer guaranteed to be split into b shards.
More precisely, because we have had a default min_sstable_size of 100MiB since CASSANDRA-18945, flushes will often result in one sstable that covers the whole token space. That sstable overlaps with all the other sstables and creates a single overlap region covering the whole token space. Because of this, in many cases only one major compaction operation is created. The output will still be split into as many shards as the density of the data set calls for. After CASSANDRA-18802 (which is not part of Cassandra 5 but is committed in trunk), that single operation will still be executed in parallel for every output shard.

If you need to have a set minimum parallelism regardless of the size of flushed sstables, try adjusting min_sstable_size to 0 or some value smaller than 100MiB that makes more sense for your use case.

Regards,
Branimir

________________________________
From: Alexander Batyrshin <[email protected]>
Sent: Friday 28 November 2025 03:59
To: [email protected] <[email protected]>
Subject: [EXTERNAL] Cassandra-5 UCS and Major Compaction

Hello everyone,

I have been testing UCS in Cassandra 5 and noticed that the behavior of major compaction diverges from the documentation. Since I am using the default value of base_shard_count = 4, I expected 4 compaction tasks to be initiated. However, in my case only a single task was launched, and it included all SSTables in the table.
My compaction settings:

{
    'base_shard_count': '4',
    'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
    'scaling_parameters': 'T4'
}

Documentation excerpt below:

Major compaction

Under the working principles of UCS, a major compaction is an operation that compacts together all SSTables with (transitive) overlap, and whose output is split on shard boundaries appropriate for the expected resulting density. In other words, a major compaction will result in b concurrent compactions, each containing all SSTables covered in each of the base shards. The output will be split on shard boundaries whose number depends on the total size of data contained in the shard.
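
To illustrate the "(transitive) overlap" grouping the excerpt describes, and why one sstable spanning the whole token space collapses everything into a single operation, here is a minimal toy sketch. It is not Cassandra's implementation: the 0..99 token space, the (first_token, last_token) tuples, and the overlap_groups helper are all hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical model: an sstable is just an inclusive (first_token, last_token) pair.
Range = Tuple[int, int]


def overlap_groups(sstables: List[Range]) -> List[List[Range]]:
    """Group ranges into sets connected by direct or transitive overlap."""
    groups: List[List[Range]] = []
    group_end: Optional[int] = None  # highest token covered by the current group
    for first, last in sorted(sstables):
        if group_end is None or first > group_end:
            # No overlap with anything seen so far: start a new group.
            groups.append([(first, last)])
            group_end = last
        else:
            # Overlaps the current group, possibly only transitively.
            groups[-1].append((first, last))
            group_end = max(group_end, last)
    return groups


# Four sstables, each confined to one quarter of the toy token space.
sharded = [(0, 24), (25, 49), (50, 74), (75, 99)]
# One flushed sstable that was not split and covers the whole space.
flushed = (0, 99)

print(len(overlap_groups(sharded)))              # 4 independent groups
print(len(overlap_groups(sharded + [flushed])))  # 1 group containing everything
```

In this model each group would correspond to one major compaction operation, so the single wide sstable is enough to reduce four operations to one.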
