[ 
https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305095#comment-14305095
 ] 

Ariel Weisberg commented on CASSANDRA-8729:
-------------------------------------------

There are other reasons to use private memory, though they may be less important. 
For in-memory write workloads you get latency outliers if threads write directly 
to memory mapped files. These tended to show up in the very long tail (P99.99, 
P99.999). With a dedicated thread draining to the filesystem you can control how 
much data is buffered when the filesystem is out to lunch.
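
To make the dedicated drain thread concrete, here is a minimal sketch in Python 
(not Cassandra code). The class name BoundedLogWriter and the max_pending 
parameter are hypothetical; the point is only that a bounded queue caps how much 
data can pile up in private memory while the filesystem stalls.

```python
import queue
import threading


class BoundedLogWriter:
    """Hypothetical sketch: producers hand off buffers, one thread writes them."""

    def __init__(self, path, max_pending=64):
        # max_pending bounds how many buffers may wait while the disk is slow.
        self._queue = queue.Queue(maxsize=max_pending)
        self._file = open(path, "ab")
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def submit(self, buf, timeout=None):
        """Queue a buffer. Blocks when the queue is full; with a timeout,
        raises queue.Full instead of waiting forever."""
        self._queue.put(buf, timeout=timeout)

    def _drain(self):
        # Only this thread touches the file, so producers never block in write().
        while True:
            buf = self._queue.get()
            if buf is None:  # sentinel pushed by close()
                break
            self._file.write(buf)

    def close(self):
        self._queue.put(None)
        self._thread.join()
        self._file.flush()
        self._file.close()
```

Producers call submit() and only ever wait on the queue, so the policy for a 
stalled filesystem (block, time out, or shed load) is decided in one place.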

If you write a quick benchmark that just spits out zeroes to a file via write 
versus via a memory mapped file, do you see a difference in throughput or CPU 
utilization? I am skeptical that mmap is actually much faster; it may even be slower.
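
A rough sketch of such a benchmark, in Python for brevity. The function names, 
chunk size, and file sizes are my own assumptions, and interpreter overhead will 
blur the numbers, so treat it as a starting point rather than a measurement 
harness. Each function returns (wall seconds, CPU seconds); total_bytes should be 
a positive multiple of chunk_bytes.

```python
import mmap
import os
import time


def write_zeros_via_write(path, total_bytes, chunk_bytes=64 * 1024):
    """Write zeroes with plain write() calls, then fsync."""
    chunk = bytes(chunk_bytes)
    wall, cpu = time.perf_counter(), time.process_time()
    with open(path, "wb", buffering=0) as f:
        for _ in range(total_bytes // chunk_bytes):
            f.write(chunk)
        os.fsync(f.fileno())
    return time.perf_counter() - wall, time.process_time() - cpu


def write_zeros_via_mmap(path, total_bytes, chunk_bytes=64 * 1024):
    """Write zeroes through a shared file mapping, then flush it."""
    chunk = bytes(chunk_bytes)
    # Size the file first; a mapping cannot extend the file on its own.
    with open(path, "wb") as f:
        f.truncate(total_bytes)
    wall, cpu = time.perf_counter(), time.process_time()
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), total_bytes) as m:
            for off in range(0, total_bytes, chunk_bytes):
                m[off:off + chunk_bytes] = chunk
            m.flush()  # push dirty pages to the file
    return time.perf_counter() - wall, time.process_time() - cpu


if __name__ == "__main__":
    size = 256 * 1024 * 1024  # assumed test size
    print("write:", write_zeros_via_write("bench_write.bin", size))
    print("mmap: ", write_zeros_via_mmap("bench_mmap.bin", size))
```

To exercise the read-before-write case from the issue, the target file would 
need to already hold data that is not in the page cache, which this sketch does 
not arrange.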

> Commitlog causes read before write when overwriting
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8729
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>
> The memory mapped commit log implementation writes directly to the page 
> cache. If a page is not in the cache the kernel will read it in even though 
> we are going to overwrite.
> The way to avoid this is to write to private memory, and then pad the write 
> with 0s at the end so it is page (4k) aligned before writing to a file.
> The commit log would benefit from being refactored into something that looks 
> more like a pipeline with incoming requests receiving private memory to write 
> in, completed buffers being submitted to a parallelized compression/checksum 
> step, followed by submission to another thread for writing to a file that 
> preserves the order.
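
The padding and ordering ideas in the description can be sketched as follows. 
This is an illustration in Python, not the proposed Cassandra design: 
checksum_and_pad, write_segment, the trailing CRC32 layout, and the worker count 
are all hypothetical, and 4096 is just the 4k page size the description mentions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed page size, per the "4k" in the issue text


def pad_to_page(data, page_size=PAGE_SIZE):
    """Append zero bytes so the length is a multiple of page_size."""
    remainder = len(data) % page_size
    if remainder == 0:
        return data
    return data + b"\x00" * (page_size - remainder)


def checksum_and_pad(buf):
    """Hypothetical per-buffer stage: append a CRC32, then pad to a page."""
    crc = zlib.crc32(buf).to_bytes(4, "big")
    return pad_to_page(buf + crc)


def write_segment(path, buffers, workers=4):
    """Checksum buffers in parallel, then write them in submission order."""
    written = 0
    with ThreadPoolExecutor(max_workers=workers) as pool, open(path, "wb") as f:
        # Executor.map yields results in input order, regardless of which
        # worker finishes first, so the file preserves the request order.
        for block in pool.map(checksum_and_pad, buffers):
            written += f.write(block)
    return written
```

Because every block written is a whole number of pages, an overwrite never 
covers only part of a page, which is the case where the kernel would otherwise 
read the page in first.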



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
