[ https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821388#comment-17821388 ]
Dipietro Salvatore commented on CASSANDRA-19429:
------------------------------------------------

2. Test without compaction after the writes, with auto-compaction disabled (nodetool disableautocompaction)

Cmd:
{code:java}
bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
bin/nodetool clearsnapshot --all && \
tools/bin/cassandra-stress write n=10000000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log -graph file=cload.html && \
bin/nodetool disableautocompaction && sleep 30s && \
tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost -log file=result.log -graph file=graph.html

## Compact and re-run it
bin/nodetool compact keyspace1 && sleep 30s && \
tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost -log file=result.log -graph file=graph.html |& tee stress.txt
{code}

Results using Ubuntu 22.04 on r8g.24xlarge, with the stress test colocated on the same instance:

* 4.1.3 released:
{code:java}
Results:
Op rate                   :  135,805 op/s  [READ: 122,231 op/s, WRITE: 13,574 op/s]
Partition rate            :  135,805 pk/s  [READ: 122,231 pk/s, WRITE: 13,574 pk/s]
Row rate                  :  135,805 row/s [READ: 122,231 row/s, WRITE: 13,574 row/s]
Latency mean              :    0.7 ms [READ: 0.8 ms, WRITE: 0.2 ms]
Latency median            :    0.6 ms [READ: 0.7 ms, WRITE: 0.1 ms]
Latency 95th percentile   :    1.9 ms [READ: 2.0 ms, WRITE: 0.2 ms]
Latency 99th percentile   :    2.6 ms [READ: 2.6 ms, WRITE: 0.3 ms]
Latency 99.9th percentile :    7.0 ms [READ: 7.2 ms, WRITE: 1.3 ms]
Latency max               :   51.3 ms [READ: 51.3 ms, WRITE: 48.8 ms]
Total partitions          : 81,488,855 [READ: 73,343,700, WRITE: 8,145,155]
Total errors              :          0 [READ: 0, WRITE: 0]
Total GC count            : 1,583
Total GC memory           : 2522.153 GiB
Total GC time             :    7.4 seconds
Avg GC time               :    4.7 ms
StdDev GC time            :    2.2 ms
Total operation time      : 00:10:00

## Compact and re-run it
bin/nodetool compact keyspace1 && sleep 30s && \
tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost -log file=result.log -graph file=graph.html |& tee stress.txt
...
Results:
Op rate                   :  136,878 op/s  [READ: 123,177 op/s, WRITE: 13,701 op/s]
Partition rate            :  136,878 pk/s  [READ: 123,177 pk/s, WRITE: 13,701 pk/s]
Row rate                  :  136,878 row/s [READ: 123,177 row/s, WRITE: 13,701 row/s]
Latency mean              :    0.7 ms [READ: 0.8 ms, WRITE: 0.2 ms]
Latency median            :    0.6 ms [READ: 0.7 ms, WRITE: 0.1 ms]
Latency 95th percentile   :    1.9 ms [READ: 2.0 ms, WRITE: 0.2 ms]
Latency 99th percentile   :    2.6 ms [READ: 2.6 ms, WRITE: 0.3 ms]
Latency 99.9th percentile :    6.5 ms [READ: 6.7 ms, WRITE: 1.2 ms]
Latency max               :   52.6 ms [READ: 52.6 ms, WRITE: 50.2 ms]
Total partitions          : 82,197,489 [READ: 73,969,820, WRITE: 8,227,669]
Total errors              :          0 [READ: 0, WRITE: 0]
Total GC count            : 1,395
Total GC memory           : 2225.329 GiB
Total GC time             :    6.6 seconds
Avg GC time               :    4.7 ms
StdDev GC time            :    2.2 ms
Total operation time      : 00:10:00
{code}

* 4.1.3 with patch:
{code:java}
Results:
Op rate                   :  241,176 op/s  [READ: 217,059 op/s, WRITE: 24,117 op/s]
Partition rate            :  241,176 pk/s  [READ: 217,059 pk/s, WRITE: 24,117 pk/s]
Row rate                  :  241,176 row/s [READ: 217,059 row/s, WRITE: 24,117 row/s]
Latency mean              :    0.4 ms [READ: 0.4 ms, WRITE: 0.2 ms]
Latency median            :    0.3 ms [READ: 0.3 ms, WRITE: 0.1 ms]
Latency 95th percentile   :    0.7 ms [READ: 0.7 ms, WRITE: 0.2 ms]
Latency 99th percentile   :    0.8 ms [READ: 0.8 ms, WRITE: 0.3 ms]
Latency 99.9th percentile :    7.2 ms [READ: 7.3 ms, WRITE: 5.1 ms]
Latency max               : 5003.8 ms [READ: 5,003.8 ms, WRITE: 48.5 ms]
Total partitions          : 144,931,367 [READ: 130,438,344, WRITE: 14,493,023]
Total errors              :          0 [READ: 0, WRITE: 0]
Total GC count            : 4,186
Total GC memory           : 6673.759 GiB
Total GC time             :   23.3 seconds
Avg GC time               :    5.6 ms
StdDev GC time            :    3.7 ms
Total operation time      : 00:10:00

## Compact and re-run it
bin/nodetool compact keyspace1 && sleep 30s && \
tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=100 -node localhost -log file=result.log -graph file=graph.html |& tee stress.txt
...
Results:
Op rate                   :  232,130 op/s  [READ: 208,904 op/s, WRITE: 23,226 op/s]
Partition rate            :  232,130 pk/s  [READ: 208,904 pk/s, WRITE: 23,226 pk/s]
Row rate                  :  232,130 row/s [READ: 208,904 row/s, WRITE: 23,226 row/s]
Latency mean              :    0.4 ms [READ: 0.5 ms, WRITE: 0.2 ms]
Latency median            :    0.3 ms [READ: 0.3 ms, WRITE: 0.1 ms]
Latency 95th percentile   :    0.7 ms [READ: 0.7 ms, WRITE: 0.2 ms]
Latency 99th percentile   :    0.8 ms [READ: 0.8 ms, WRITE: 0.3 ms]
Latency 99.9th percentile :    6.7 ms [READ: 6.8 ms, WRITE: 4.8 ms]
Latency max               : 5003.8 ms [READ: 5,003.8 ms, WRITE: 43.2 ms]
Total partitions          : 139,378,987 [READ: 125,433,395, WRITE: 13,945,592]
Total errors              :          0 [READ: 0, WRITE: 0]
Total GC count            : 3,910
Total GC memory           : 6243.924 GiB
Total GC time             :   19.8 seconds
Avg GC time               :    5.1 ms
StdDev GC time            :    3.5 ms
Total operation time      : 00:10:00
{code}

Results are consistent between the two runs (the first run and the run after compaction). The patch gives roughly 1.7x higher throughput, which is lower than expected, probably because of lock contention on the call path from `org/apache/cassandra/io/sstable/format/big/BigTableReader.getPosition` to `com/github/benmanes/caffeine/cache/BoundedLocalCache.performCleanUp` (github: [https://github.com/ben-manes/caffeine/blob/94d7c8aff9cb2c970a7ffbb2c489f267cd42d7ff/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L1657-L1665])

!Screenshot 2024-02-27 at 11.29.41.png!

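The speedup quoted above can be rechecked directly from the reported `Op rate` lines. The following is a minimal sketch, not part of cassandra-stress: `op_rate` and `speedup` are hypothetical helper names, and the parsing assumes the summary line format shown in the results above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing two cassandra-stress summaries.

# Read a summary on stdin and print the overall op rate with commas stripped.
# Assumes a line of the form "Op rate : 135,805 op/s [READ: ...]".
op_rate() {
    awk '$1 == "Op" && $2 == "rate" { gsub(",", "", $4); print $4; exit }'
}

# Print the patched/baseline throughput ratio with two decimals.
speedup() {
    awk -v base="$1" -v patched="$2" 'BEGIN { printf "%.2f\n", patched / base }'
}

# Op rates copied from the first run (before compaction) reported above.
base=$(echo "Op rate : 135,805 op/s [READ: 122,231 op/s, WRITE: 13,574 op/s]" | op_rate)
patched=$(echo "Op rate : 241,176 op/s [READ: 217,059 op/s, WRITE: 24,117 op/s]" | op_rate)

speedup "$base" "$patched"   # prints 1.78
```

With the post-compaction numbers (136,878 and 232,130 op/s) the same calculation gives 1.70, so both runs sit close to the 1.7x figure. In practice the summaries could be piped in from the `stress.txt` file written by `tee` instead of being pasted inline.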
> Remove lock contention generated by getCapacity function in SSTableReader
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19429
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Dipietro Salvatore
>            Assignee: Dipietro Salvatore
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x
>
>         Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 2024-02-27 at 11.29.41.png, asprof_cass4.1.3__lock_20240216052912lock.html
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock
> acquires is measured in the `getCapacity` function from
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04),
> this limits the CPU utilization of the system to under 50% when testing at
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing
> the call to `getCapacity` with `size` achieves up to 2.95x increase in
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && CASSANDRA_USE_JDK11=true ant stress-build && rm -rf data && bin/cassandra -f -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write n=10000000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log -graph file=cload.html && \
> bin/nodetool compact keyspace1 && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=406 -node localhost -log file=result.log -graph file=graph.html
> {code}