[ https://issues.apache.org/jira/browse/CASSANDRA-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052700#comment-18052700 ]

Sam Lightfoot edited comment on CASSANDRA-19987 at 1/18/26 1:55 PM:
--------------------------------------------------------------------

*Test Scenario:*
 * 6GB cgroup memory limit, 3GB JVM heap
 * 10k/s read workload against 1GB hot dataset
 * Compaction: 2x 64GB SSTables (STCS)
 * Duration: 10 minutes of concurrent compaction + reads 

*Memory Pressure Stall Comparison (PSI metrics):*

 
||Metric||Direct I/O||Buffered I/O||Difference||Notes||
|Mean Stall|25.0 ms/s|32.4 ms/s|+29%| |
|p99 Stall|78.6 ms/s|90.5 ms/s|+15%| |
|Max Stall|117.6 ms/s|176.2 ms/s|+50%| |
|Total Stall Time|15.0s|19.4s|+29%| |
|Spikes >50 ms/s|15|26|+73%| |
|Active_File Std Dev|10.15 MB|24.98 MB|+146%|Active_File is page cache hot list size|
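
For clarity on the derived figures: the Difference column is (Buffered - Direct) / Direct, and 24.98 MB / 10.15 MB ≈ 2.46, which is where the "2.5x more stable" figure below comes from.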

*Conclusion:* Direct I/O for compaction reads cuts total memory pressure stall time by 29% and keeps the page cache active list roughly 2.5x more stable. Buffered I/O pollutes the page cache with compaction data, causing LRU churn that evicts the hot dataset and triggers direct reclaim stalls, which in turn hurts client read latency.
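
For readers less familiar with the mechanism, here is a minimal sketch of what "direct I/O for compaction reads" means at the JDK level: opening the SSTable with {{ExtendedOpenOption.DIRECT}} so the reads bypass the page cache entirely. The 4 KiB block size and the class/method names are illustrative assumptions, not the attached patch:

{code:java}
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: block size and names are assumptions, not Cassandra code.
public final class DirectReadSketch
{
    private static final int BLOCK_SIZE = 4096; // assumed filesystem block size

    // Reads 'blocks' aligned blocks starting at block 'blockOffset' with O_DIRECT,
    // so the data never enters (or evicts anything from) the page cache.
    public static ByteBuffer readBlocks(Path sstable, long blockOffset, int blocks) throws IOException
    {
        try (FileChannel ch = FileChannel.open(sstable, StandardOpenOption.READ, ExtendedOpenOption.DIRECT))
        {
            // O_DIRECT requires the buffer address, file offset and transfer size
            // to all be block-aligned; over-allocate and take an aligned slice.
            ByteBuffer raw = ByteBuffer.allocateDirect(blocks * BLOCK_SIZE + BLOCK_SIZE);
            ByteBuffer buf = raw.alignedSlice(BLOCK_SIZE);
            buf.limit(blocks * BLOCK_SIZE);
            ch.read(buf, blockOffset * BLOCK_SIZE); // may return short near EOF
            buf.flip();
            return buf;
        }
    }
}
{code}

The same reads done through a plain buffered {{FileChannel}} would instead populate the page cache and, as the table above shows, push the hot dataset out of the active list.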

*Raw data*: see _direct_page_cache_compaction_ and _buffered_page_cache_compaction_ (attached)

*Command*: cat /sys/fs/cgroup/system.slice/cassandra-test.scope/memory.pressure
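
For anyone reproducing the numbers: memory.pressure exposes cumulative stall time in microseconds in its {{total=}} field, so the per-second figures in the table can be obtained by sampling that counter once per second and taking deltas. A rough sketch, assuming the "some" line and the cgroup path above; the class name and the 1 s interval are illustrative:

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: samples the cumulative PSI stall counter once per second
// and prints the per-second stall time in ms/s (the units used in the table).
public final class PsiSampler
{
    private static final Path PRESSURE =
        Path.of("/sys/fs/cgroup/system.slice/cassandra-test.scope/memory.pressure");

    private static long totalStallMicros() throws Exception
    {
        // File looks like (values hypothetical):
        //   some avg10=0.12 avg60=0.08 avg300=0.02 total=15004321
        //   full avg10=0.00 avg60=0.00 avg300=0.00 total=1234567
        for (String line : Files.readAllLines(PRESSURE))
            if (line.startsWith("some "))
                return Long.parseLong(line.substring(line.indexOf("total=") + "total=".length()));
        throw new IllegalStateException("no 'some' line in " + PRESSURE);
    }

    public static void main(String[] args) throws Exception
    {
        long previous = totalStallMicros();
        while (true)
        {
            Thread.sleep(1000);
            long current = totalStallMicros();
            System.out.printf("%.1f ms/s%n", (current - previous) / 1000.0);
            previous = current;
        }
    }
}
{code}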



> Direct IO support for compaction reads
> --------------------------------------
>
>                 Key: CASSANDRA-19987
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19987
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Jon Haddad
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: buffered_during_compaction-reads_1m.txt, 
> buffered_during_compaction-reads_5m.txt, buffered_page_cache_compaction, 
> direct_during_compaction-reads_1m.txt, direct_during_compaction-reads_5m.txt, 
> direct_page_cache_compaction, image-2025-12-24-10-20-44-947.png, 
> image-2025-12-24-10-21-11-928.png, image-2025-12-24-10-34-10-834.png
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> If we use direct io to read SSTables during compaction, we can avoid 
> polluting the page cache with data we're about to delete.  As another side 
> effect, we also evict pages to make room for whatever we're putting in.  This 
> unnecessary churn leads to higher CPU overhead and can cause dips in client 
> read latency, as we're going to be evicting pages that could be used to serve 
> those reads.
> This is most notable with STCS as the SSTables get larger, potentially 
> evicting the entire hot dataset from the cache, but it occurs with every 
> compaction strategy.
> This is a follow up to be done after CASSANDRA-15452 since we will have an 
> internal buffer.


