[ https://issues.apache.org/jira/browse/CASSANDRA-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yifan Cai updated CASSANDRA-16339:
----------------------------------
    Test and Documentation Plan: Perf. Test Plan
# Run data prepopulation.
# Run steady state load for X seconds as Phase 1.
# Make one test-case-dependent change, e.g. triggering garbagecollect, altering the table, etc., and continue the steady state load for Y seconds as Phase 2.
# Compare the performance metrics between Phase 1 and Phase 2.

The workload used in the test was generated with tlp-stress.

Steady state load: Read : Write : Delete == 5 : 4 : 1, with the QPS kept at 3K/s.
                         Status: Patch Available  (was: Open)

Result report link: https://github.com/yifan-c/CASSANDRA-15581-COMPACTION-TEST/blob/main/CASSANDRA-16339/7019-Test:%20Perf%20Comparison%20%5BLCS%20-%20provide_overlapping_tombstones%5D.pdf

From the result charts, we have the following observations after setting 'provide_overlapping_tombstones' to 'row' for the table.
* The read latency drops initially, but it becomes more unstable. There are larger spikes in the tail latencies (p95 and p99), and the average latency also gets higher.
* The write latency is about the same.
* The number of L0 sstables builds up quickly, which further slows down compaction, since almost all L0 sstables can be used as shadow sources for GarbageSkipper. The flame graph (attached, flamegraph_garbageskipper.png) confirms that GarbageSkipper occupies the majority of the CPU time.

Garbage skipping is a feature that uses the *spare* IO capacity to produce more compacted SSTables. We may want to avoid garbage skipping when the system does not have IO to spare. In the case of LCS, that is when the number of L0 sstables is building up.

> LCS steady state load of table with vs. w/o GC performance test
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-16339
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16339
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Test/benchmark
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>         Attachments: flamegraph_grabageskipper.png
>
>
> The testing cluster should be pre-populated with ~200GB of data on each node.
> The baseline cluster has the table created with
> {{provide_overlapping_tombstones}} disabled. The other cluster has the table
> with {{provide_overlapping_tombstones == row}}. Compare the read, write and
> compaction performance between the two clusters.
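
The Phase 1 vs. Phase 2 comparison in the test plan could be scripted roughly as below. This is only a minimal sketch, not the tooling used for the linked report: the function names ({{percentile}}, {{summarize}}, {{compare_phases}}, {{ops_per_second}}) are hypothetical, and it assumes the per-request latencies of each phase have already been exported as plain lists of milliseconds.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile (pct in 1..100) of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Average, p95 and p99 of one phase (assumed input: latencies in ms)."""
    return {
        "avg": mean(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


def compare_phases(phase1_ms, phase2_ms):
    """Relative change of each metric from Phase 1 to Phase 2 (0.10 == +10%)."""
    before, after = summarize(phase1_ms), summarize(phase2_ms)
    return {name: (after[name] - before[name]) / before[name] for name in before}


def ops_per_second(total_qps=3000, ratio=(5, 4, 1)):
    """Split the total QPS by the Read : Write : Delete ratio of the plan."""
    parts = sum(ratio)
    reads, writes, deletes = (total_qps * r // parts for r in ratio)
    return {"read": reads, "write": writes, "delete": deletes}
```

With the 5 : 4 : 1 ratio at 3K QPS, {{ops_per_second()}} splits the load into 1500 reads, 1200 writes and 300 deletes per second. A positive value from {{compare_phases}} means the metric got worse (higher latency) in Phase 2.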