Just an FYI - I understand you're using this to do an apples-to-apples comparison, so it's good to have everything the same in both.
But.... in case you're not aware - a 64KB chunk length is a *terrible* setting for performance and cost. Disk is always cheaper than CPU, so for production deployments, I would **NEVER** use 64KB. I've given a few talks on why it's so awful; you can read the JIRA where we changed the default: https://issues.apache.org/jira/browse/CASSANDRA-13241

Jon

On Tue, Mar 10, 2026 at 3:23 PM dbms-tech <[email protected]> wrote:

> Thanks for taking the time to reply.
>
> In my lab environment, I just altered the table to simulate LCS. I'll
> update with my findings tomorrow.
>
> INITIAL: compaction = {'class':
> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
> 'max_sstables_to_compact': '64', 'min_sstable_size': '100MiB',
> 'scaling_parameters': 'T4', 'sstable_growth': '0.3333333333333333',
> 'target_sstable_size': '1GiB'}
> ----
> NEW: compaction = {'base_shard_count': '8', 'class':
> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
> 'scaling_parameters': 'L10', 'target_sstable_size': '256MiB'}
>
>
> On Tue, Mar 10, 2026 at 2:42 PM Patrick McFadin <[email protected]>
> wrote:
>
>> It looks like you are comparing Cassandra 2 LCS to Cassandra 5 UCS with
>> T4, which is a tiered/STCS-like layout. If so, I would not treat the
>> storage increase as an expected Cassandra 5 baseline. Since the
>> compression ratio is nearly identical, I'd retest with UCS L10 for a more
>> apples-to-apples comparison with LCS before drawing conclusions about
>> disk usage.
>>
>> Patrick
>>
>>
>> On Mon, Mar 9, 2026 at 7:39 AM dbms-tech <[email protected]> wrote:
>>
>>> Testing a C2 to C5 upgrade in our lab environment as a PoC. I migrated
>>> a 1.3B-row table from C2 to C5.
>>> On C2, the table was 7.4 TB. On C5, it is 9.5 TB.
>>>
>>> I verified the following ...
>>> ==============================
>>> - There are no stale snapshots in C5.
>>> - The C5 size above is exclusively the table's sstables. No other files
>>> reside in the /data directory.
>>> - tablestats shows 'SSTable Compression Ratio: 0.25316' for C5 and 0.25
>>> for C2.
>>> - I adjusted the C5 table to use compression = {'chunk_length_in_kb':
>>> '64', 'class': 'org.apache.cassandra.io.compress.ZstdCompressor',
>>> 'compression_level': '10'} to obtain the above compression ratio.
>>> - C5 uses the default UCS while C2 uses LCS.
>>> - C2 used DeflateCompressor.
>>> - Compactions are optimal with near-zero pending compactions.
>>> - The cluster is fully & regularly repaired using Reaper.
>>>
>>> Inquiries ...
>>> - Is C5 (UCS) expected to utilize this much more storage compared to C2
>>> (LCS)?
>>> - Other than increasing the compression level and chunk size, what else
>>> can be done to match C2 storage?
>>>
>>> Thanks in advance.
>>>
>>> ==============
>>> C5
>>> ==============
>>> CREATE TABLE xxx.yyy (
>>>     ddd bigint PRIMARY KEY,
>>>     xxx boolean,
>>>     yyy text,
>>>     zzz text,
>>>     ddd bigint
>>> ) WITH additional_write_policy = '99p'
>>>     AND allow_auto_snapshot = true
>>>     AND bloom_filter_fp_chance = 0.01
>>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>     AND cdc = false
>>>     AND comment = ''
>>>     AND compaction = {'class':
>>> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
>>> 'max_sstables_to_compact': '64', 'min_sstable_size': '100MiB',
>>> 'scaling_parameters': 'T4', 'sstable_growth': '0.3333333333333333',
>>> 'target_sstable_size': '1GiB'}
>>>     AND compression = {'chunk_length_in_kb': '64', 'class':
>>> 'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level':
>>> '10'}
>>>     AND memtable = 'default'
>>>     AND crc_check_chance = 1.0
>>>     AND default_time_to_live = 0
>>>     AND extensions = {}
>>>     AND gc_grace_seconds = 864000
>>>     AND incremental_backups = true
>>>     AND max_index_interval = 2048
>>>     AND memtable_flush_period_in_ms = 0
>>>     AND min_index_interval = 128
>>>     AND read_repair = 'BLOCKING'
>>>     AND speculative_retry = '99p';
>>>
>>> =================
>>> C5
>>> =================
>>> nodetool tablestats xxx.yyy
>>> Total number of tables: 1
>>> ----------------
>>> Keyspace: xxx
>>>     Read Count: 764664474
>>>     Read Latency: 1.092679410718903 ms
>>>     Write Count: 127141592
>>>     Write Latency: 0.02513343241761516 ms
>>>     Pending Flushes: 0
>>>     Table: document
>>>         SSTable count: 783
>>>         Old SSTable count: 0
>>>         Max SSTable size: 4.664GiB
>>>         Space used (live): 1208104887289
>>>         Space used (total): 1208104887289
>>>         Space used by snapshots (total): 0
>>>         Off heap memory used (total): 1324105435
>>>         SSTable Compression Ratio: 0.25316
>>>         Number of partitions (estimate): 126389507
>>>         Memtable cell count: 23023
>>>         Memtable data size: 424380268
>>>         Memtable off heap memory used: 450102571
>>>         Memtable switch count: 1242
>>>         Speculative retries: 4016043
>>>         Local read count: 725885015
>>>         Local read latency: 1.153 ms
>>>         Local write count: 24986436
>>>         Local write latency: 0.052 ms
>>>         Local read/write ratio: 29.05116
>>>         Pending flushes: 0
>>>         Percent repaired: 0.0
>>>         Bytes repaired: 0B
>>>         Bytes unrepaired: 4.328TiB
>>>         Bytes pending repair: 0B
>>>         Bloom filter false positives: 4722014
>>>         Bloom filter false ratio: 0.01052
>>>         Bloom filter space used: 289491392
>>>         Bloom filter off heap memory used: 289485128
>>>         Index summary off heap memory used: 0
>>>         Compression metadata off heap memory used: 584517736
>>>         Compacted partition minimum bytes: 21
>>>         Compacted partition maximum bytes: 10090808
>>>         Compacted partition mean bytes: 21423
>>>         Average live cells per slice (last five minutes): 1.0
>>>         Maximum live cells per slice (last five minutes): 1
>>>         Average tombstones per slice (last five minutes): 1.0
>>>         Maximum tombstones per slice (last five minutes): 1
>>>         Droppable tombstone ratio: 0.01064
>>>         Top partitions by size (last update: 2026-03-06T16:13:07Z):
>>>
>>>
>>> --
>>> ----------------------------------------
>>> Thank you
>>>
>>>
>
> --
> ----------------------------------------
> Thank you
>
>
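For reference, the two changes suggested in this thread (an LCS-like UCS layout per Patrick's reply, and a smaller compression chunk length per Jon's CASSANDRA-13241 pointer) could be applied with CQL along these lines. This is a sketch using the anonymized table name from the thread; the specific values (16 KiB chunks, the L10/256MiB settings echoed from the "NEW" config upthread) are starting points, not benchmarked recommendations:

```sql
-- Illustrative values only. 16 KiB is the default chunk length adopted in
-- CASSANDRA-13241; smaller chunks cut read amplification and CPU at some
-- cost in compression ratio, so expect slightly larger on-disk size.
ALTER TABLE xxx.yyy
  WITH compression = {'class': 'org.apache.cassandra.io.compress.ZstdCompressor',
                      'compression_level': '10',
                      'chunk_length_in_kb': '16'};

-- L10 makes UCS behave like LCS with a fan-out of 10, mirroring the C2
-- layout (these are the "NEW" parameters posted upthread).
ALTER TABLE xxx.yyy
  WITH compaction = {'class': 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
                     'base_shard_count': '8',
                     'scaling_parameters': 'L10',
                     'target_sstable_size': '256MiB'};
```

Note that changing compression parameters only affects sstables written afterwards; to rewrite existing data with the new settings, run `nodetool upgradesstables -a` (or wait for normal compaction churn) before comparing disk usage.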
