Hi,

I've been bitten by OOMs with disk_access_mode: auto/mmap that were fixed
by switching to disk_access_mode: mmap_index_only. In one benchmark on
3.11.x I got 5x more read throughput with disk_access_mode:
mmap_index_only than with disk_access_mode: auto/mmap.
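
To be concrete, the change is a single line in cassandra.yaml. A minimal
sketch of the setting; the comments are my own summary of the modes, not
text from the shipped file:

```yaml
# disk_access_mode controls how SSTable files are read.
#   auto            - resolves to mmap on 64-bit JVMs (the current default)
#   mmap            - memory-map both data and index files
#   mmap_index_only - memory-map index files only; data files use buffered I/O
#   standard        - no memory-mapping at all
disk_access_mode: mmap_index_only
```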

Changing disk_access_mode to mmap_index_only seems to be a common
recommendation on forums [1][2][3][4] and on Slack (search for
disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).

It's not clear to me when the default disk_access_mode: auto/mmap is
beneficial; perhaps only when the read set fits in memory? On
CASSANDRA-15531 [5], Mick seems to think that mmap_index_only has a higher
heap cost and should only be used when warranted. However, it's not
uncommon to see people hit OOMs or lower read performance because of the
default disk_access_mode, which makes me think it's not the safest default.

Should we consider changing the default "auto" behavior of
"disk_access_mode" to resolve to "mmap_index_only" instead of "mmap" in
5.0, since it's likely safer and perhaps more performant?

Thanks,

Paulo

[1]
https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
[2] https://phabricator.wikimedia.org/T137419
[3] https://stackoverflow.com/a/55975471
[4]
https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
[5] https://issues.apache.org/jira/browse/CASSANDRA-15531
