[ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305017#comment-14305017 ]
Branimir Lambov commented on CASSANDRA-8729:
--------------------------------------------

This particular problem is caused by segment recycling. Mixing segment reuse with memory-mapped writing appears to have been a very bad idea; removing reuse solves the problem immediately. If we insist on reuse, we need to get rid of memory-mapping and necessarily use 4k padding (padding will have a benefit without reuse as well, but not that pronounced).

I'm not that sold on the benefits of recycling, though. If you delete a segment file and immediately create a new one with the same size, isn't the OS supposed to reuse the space anyway? I remember that's what they did ~15yrs ago, but things have probably changed.

On the other hand I _am_ seeing quite different performance writing to memmapped vs. writing to channel (using a very thin non-compressing version of the compression path of CASSANDRA-6809 with direct buffers). Memmapped appears to allow a ~20% higher throughput on Windows.

I think we should get rid of the recycling, and later do the rest of the improvements you list.

> Commitlog causes read before write when overwriting
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8729
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>
> The memory mapped commit log implementation writes directly to the page
> cache. If a page is not in the cache the kernel will read it in even though
> we are going to overwrite.
> The way to avoid this is to write to private memory, and then pad the write
> with 0s at the end so it is page (4k) aligned before writing to a file.
> The commit log would benefit from being refactored into something that looks
> more like a pipeline with incoming requests receiving private memory to write
> in, completed buffers being submitted to a parallelized compression/checksum
> step, followed by submission to another thread for writing to a file that
> preserves the order.
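
For illustration, the padding approach from the issue description (write to private memory, zero-pad to a 4k boundary, then write through a channel instead of a memory map) might look roughly like the sketch below. This is not Cassandra code: the class name `PaddedSegmentWriter`, its methods, and the hard-coded 4096-byte page size are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Cassandra.
final class PaddedSegmentWriter {
    // Assumed page size; the issue text speaks of 4k pages.
    static final int PAGE_SIZE = 4096;

    // Round a length up to the next multiple of PAGE_SIZE.
    static int paddedLength(int length) {
        return (length + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    }

    // Copy the payload into a private direct buffer whose capacity is
    // page aligned. allocateDirect() zero-fills the buffer, so the bytes
    // after the payload are already the 0 padding.
    static ByteBuffer pad(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocateDirect(paddedLength(payload.length));
        buf.put(payload);
        buf.clear(); // position = 0, limit = capacity; contents are kept
        return buf;
    }

    // Write the padded buffer through the channel and return the number
    // of bytes written (always a multiple of PAGE_SIZE).
    static int append(FileChannel channel, byte[] payload) throws IOException {
        ByteBuffer buf = pad(payload);
        int written = 0;
        while (buf.hasRemaining()) {
            written += channel.write(buf);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("segment", ".log");
        try (FileChannel channel = FileChannel.open(segment, StandardOpenOption.WRITE)) {
            append(channel, "a small mutation".getBytes());
        }
        System.out.println("segment size: " + Files.size(segment));
        Files.delete(segment);
    }
}
```

Because every write covers whole pages, the kernel never needs the old contents of a partially overwritten page, which is the read-before-write the issue is about. Whether this beats memory-mapped writing in practice is a separate question; the throughput numbers quoted in the comment suggest it may not on every platform.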