[ https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17943938#comment-17943938 ]
Jordan West commented on CASSANDRA-15452:
-----------------------------------------

[~benedict] Looking at it more closely, I do see how what you initially described seeing in traces could happen. I don't think it's especially consequential, since we are allocating and de-allocating when we close the file, but I agree with your statement that "we should probably first confirm we have a buffer to deallocate before we do any cleanup". It's a wasteful allocation of a decently sized buffer. The issue is the call to {{getBlock()}} instead of {{block()}}, followed by a check that we have an allocated buffer. I believe I have a working patch (doing some more testing), but we could probably open a new JIRA for this and address it there, as it's not urgent. Thoughts?

> Improve disk access patterns during compaction and range reads
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-15452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15452
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Legacy/Local Write-Read Paths, Local/Compaction
>            Reporter: Jon Haddad
>            Assignee: Jordan West
>            Priority: Normal
>             Fix For: 5.0.4, 5.1
>
>         Attachments: ci_summary_jrwest_jwest-15452-5.0_153.html, everyfs.txt,
> image-2024-11-22-16-17-23-194.png, image-2025-01-07-16-04-23-909.png,
> image-2025-01-07-16-56-12-853.png, image-2025-01-07-16-57-29-134.png,
> iostat-5.0-head.output, iostat-5.0-patched.output, iostat-ebs-15452.png,
> iostat-ebs-head.png, iostat-instance-15452.png, iostat-instance-head.png,
> results.txt, results_details_jrwest_jwest-15452-5.0_153.tar.xz,
> screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png,
> screenshot-5.png, screenshot-6.png, sequential.fio, throughput-1.png,
> throughput.png
>
>          Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> On read-heavy workloads Cassandra performs much better when using a low read
> ahead setting. In my tests I've seen a 5x improvement in throughput and
> more than a 50% reduction in latency.
> However, I've also observed that it can have a negative impact on compaction
> and streaming throughput. It especially hurts cloud environments, where small
> reads incur high costs in IOPS due to tiny requests.
> # We should investigate using POSIX_FADV_DONTNEED on files we're compacting
> to see if we can improve performance and reduce page faults.
> # This should be combined with an internal read-ahead style buffer that
> Cassandra manages, similar to a BufferedInputStream but with our own
> machinery. This buffer should read fairly large blocks of data off disk at
> a time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable
> amount of time is spent in blocking I/O during compaction and streaming.
> Reducing the frequency with which we read from disk should speed up all
> sequential I/O operations.
> # We can reduce system calls by buffering writes as well, but I think it
> will have less of an impact than the reads.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
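[Editor's note] The internal read-ahead buffer proposed in point #2 of the quoted ticket can be sketched roughly as below. This is an illustrative design sketch only, not Cassandra's actual implementation: the class name {{ReadAheadReader}}, the method names, and the 256KB block size are all hypothetical. The idea it demonstrates is serving many small reads out of one large block fetched with a single positional read, which is the system-call reduction the ticket describes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of point #2: fetch large fixed-size blocks (e.g. 256KB,
// one EBS IOP) and serve smaller reads from the buffered block, reducing the
// number of read() system calls during sequential scans.
class ReadAheadReader implements AutoCloseable {
    private final FileChannel channel;
    private final ByteBuffer block;
    private long blockStart = -1;   // file offset of the buffered block; -1 = nothing buffered

    ReadAheadReader(Path file, int blockSize) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
        this.block = ByteBuffer.allocateDirect(blockSize);
        this.block.limit(0);        // nothing buffered yet
    }

    /** Copy up to dst.remaining() bytes starting at the given file position; -1 at EOF. */
    int read(ByteBuffer dst, long position) throws IOException {
        int copied = 0;
        while (dst.hasRemaining()) {
            long pos = position + copied;
            // Refill the block from disk when the requested position falls outside it.
            if (blockStart < 0 || pos < blockStart || pos >= blockStart + block.limit()) {
                if (!fill(pos))
                    break;          // EOF
            }
            int offsetInBlock = (int) (pos - blockStart);
            int n = Math.min(dst.remaining(), block.limit() - offsetInBlock);
            // Copy via a duplicate so the cached block's position/limit are untouched.
            ByteBuffer slice = block.duplicate();
            slice.position(offsetInBlock).limit(offsetInBlock + n);
            dst.put(slice);
            copied += n;
        }
        return copied == 0 ? -1 : copied;
    }

    // One large positional read replaces many small ones.
    private boolean fill(long position) throws IOException {
        block.clear();
        int n = channel.read(block, position);
        if (n <= 0) {
            block.limit(0);
            blockStart = -1;
            return false;
        }
        block.flip();
        blockStart = position;
        return true;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

A production version would also need buffer pooling and careful deallocation; as the comment at the top of this message notes, cleanup paths should first confirm a buffer was actually allocated before attempting to release it.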