[ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305095#comment-14305095 ]
Ariel Weisberg commented on CASSANDRA-8729:
-------------------------------------------

There are other reasons to use private memory, though they may be less important. For in-memory write workloads you get latency outliers if you have threads write to memory mapped files. They tended to show up in the very long tail (P99.99, P99.999). With a dedicated thread draining to the filesystem you can control how much data is buffered when the filesystem is out to lunch.

If you write a quick benchmark that just spits out zeroes to a file via write vs a memory mapped file, do you see a difference in throughput or CPU utilization? I am skeptical that mmap is actually much faster (it may even be slower!).

> Commitlog causes read before write when overwriting
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8729
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>
> The memory mapped commit log implementation writes directly to the page
> cache. If a page is not in the cache the kernel will read it in even though
> we are going to overwrite.
> The way to avoid this is to write to private memory, and then pad the write
> with 0s at the end so it is page (4k) aligned before writing to a file.
> The commit log would benefit from being refactored into something that looks
> more like a pipeline with incoming requests receiving private memory to write
> in, completed buffers being submitted to a parallelized compression/checksum
> step, followed by submission to another thread for writing to a file that
> preserves the order.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
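
A quick benchmark of the kind suggested in the comment could look roughly like the Python sketch below. It is illustrative only and not taken from the Cassandra code base: the helper names (`pad_to_page`, `write_via_write`, `write_via_mmap`, `timed`), the 4096-byte page size, and the chunk and file sizes are all assumptions. It pads a private buffer with zeroes to a page boundary, as the issue description proposes, then writes the same data once with plain write calls and once through a memory mapped file, reporting wall-clock and process CPU time for each.

```python
import mmap
import os
import tempfile
import time

PAGE_SIZE = 4096  # assumption: 4k pages, as mentioned in the issue description


def pad_to_page(data: bytes, page_size: int = PAGE_SIZE) -> bytes:
    """Return data padded with zero bytes up to the next page boundary."""
    remainder = len(data) % page_size
    if remainder == 0:
        return data
    return data + b"\x00" * (page_size - remainder)


def write_via_write(path: str, total_bytes: int, chunk: bytes) -> None:
    """Write total_bytes from a private buffer using plain write calls.

    total_bytes is assumed to be a multiple of len(chunk).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < total_bytes:
            view = memoryview(chunk)
            while view:  # handle short writes
                n = os.write(fd, view)
                view = view[n:]
            written += len(chunk)
        os.fsync(fd)
    finally:
        os.close(fd)


def write_via_mmap(path: str, total_bytes: int, chunk: bytes) -> None:
    """Write total_bytes by copying chunks into a memory mapped file.

    total_bytes is assumed to be a multiple of len(chunk).
    """
    with open(path, "w+b") as f:
        f.truncate(total_bytes)  # the file needs its final size before mapping
        with mmap.mmap(f.fileno(), total_bytes) as m:
            for offset in range(0, total_bytes, len(chunk)):
                m[offset:offset + len(chunk)] = chunk
            m.flush()


def timed(fn, *args):
    """Run fn(*args) and return (wall_seconds, cpu_seconds)."""
    wall = time.perf_counter()
    cpu = time.process_time()
    fn(*args)
    return time.perf_counter() - wall, time.process_time() - cpu


if __name__ == "__main__":
    chunk = pad_to_page(bytes(60_000))  # zeroes, padded to a page multiple
    total = len(chunk) * 512            # roughly 30 MiB per file
    with tempfile.TemporaryDirectory() as d:
        for name, fn in (("write", write_via_write), ("mmap", write_via_mmap)):
            wall, cpu = timed(fn, os.path.join(d, name), total, chunk)
            print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

Note that this sketch writes to freshly created files, so it does not reproduce the read-before-write case from the issue description. To see that effect one would presumably need to overwrite an existing file whose pages are not in the page cache, and a single run on one machine says little about the P99.99 tail.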