[ https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Yeschenko updated CASSANDRA-10249:
------------------------------------------
Fix Version/s: 2.2.x
> Make buffered read size configurable
> ------------------------------------
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Albert P Tobey
> Assignee: Albert P Tobey
> Fix For: 2.1.x, 2.2.x
>
> Attachments: Screenshot 2015-09-11 09.32.04.png, Screenshot
> 2015-09-11 09.34.10.png, patched-2.1.9-dstat-lvn10.png,
> stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits
> over the network. This causes problems throughout the system by wasting disk
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance.
> The only requirement to reproduce the issue is enough data to blow through
> the page cache. The default schema and data size with cassandra-stress is
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed a disk:network ratio anywhere from
> 300:1 to 500:1. That is to say, for 1MB/s of network IO, Cassandra was
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode
> (https://gist.github.com/tobert/10c307cf3709a585a7cf), the ratio fell to
> around 100:1 on my local test rig. Latency improved considerably and GC
> became a lot less frequent.
> I tested with 512-byte reads as well, but got the same performance, which
> makes sense since all HDDs and SSDs made in the last few years have a 4K block
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.
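
To make the read-amplification argument in the description concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the patch: the function names, the 200-byte payload, and the 64 KiB / 4 KiB / 512-byte buffer sizes are illustrative assumptions, and the model ignores the page cache, read-ahead, and payloads that straddle a block boundary. It only shows why a smaller read buffer lowers the disk:network ratio, and why going below the device block size gains nothing further.

```python
import math

# Assumed device block size, following the 4K figure mentioned in the description.
BLOCK_SIZE = 4096


def disk_bytes_for_read(payload_bytes, buffer_bytes, block_bytes=BLOCK_SIZE):
    """Estimate bytes pulled from disk to serve one uncached random read.

    Model assumptions (hypothetical, for illustration only): the reader
    always fills whole buffers, and the device always transfers whole blocks.
    """
    buffers = math.ceil(payload_bytes / buffer_bytes)
    requested = buffers * buffer_bytes
    blocks = math.ceil(requested / block_bytes)
    return blocks * block_bytes


def amplification(payload_bytes, buffer_bytes, block_bytes=BLOCK_SIZE):
    """Ratio of bytes read from disk to bytes actually needed."""
    return disk_bytes_for_read(payload_bytes, buffer_bytes, block_bytes) / payload_bytes


if __name__ == "__main__":
    payload = 200  # hypothetical size of one requested row, in bytes
    for buf in (65536, 4096, 512):
        ratio = amplification(payload, buf)
        print(f"buffer={buf:>6} B  disk:payload ratio = {ratio:.2f}:1")
```

Under this model a 512-byte buffer costs the same disk traffic as a 4096-byte one, because the device still transfers a full block, which is consistent with the observation above that 512-byte reads performed no better.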
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)