[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340607#comment-14340607
 ] 

Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------

A new version has been uploaded to [the same GitHub 
branch|https://github.com/apache/cassandra/compare/trunk...blambov:6809-compressed-logs].
 It removes segment recycling, as agreed in CASSANDRA-8771, because the 
compressed path was using unnecessarily large amounts of memory for the 
prepared segments it writes into.

bq. I am more comfortable with the single threaded task generation in 
Benedict's diff. I would rather see that or no multi-threading for now.

Rolled back the multithreaded sync capability.

bq. Cap the size of the buffer pool to a fixed number of buffers. If you end up 
allocating more than that number, free the memory rather than pooling.

Done now. Previously the size of allocated memory was defined by 
commitlog_segment_size_in_mb, which unnecessarily connected two unrelated 
limits. Removing recycling fixes this problem.
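
To illustrate the capping idea in general terms (this is not the code in the patch; the class name, method names and sizes below are hypothetical), a pool with a fixed number of slots could look roughly like this sketch. Buffers released while the pool is full are simply dropped, so their memory is reclaimed by the JVM instead of being retained:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration of a capped buffer pool; not a class from the patch.
final class BoundedBufferPool {
    private final BlockingQueue<ByteBuffer> pool;
    private final int bufferSize;

    BoundedBufferPool(int maxPooledBuffers, int bufferSize) {
        // The queue capacity is the cap on how many idle buffers are retained.
        this.pool = new ArrayBlockingQueue<>(maxPooledBuffers);
        this.bufferSize = bufferSize;
    }

    // Reuse a pooled buffer if one is available, otherwise allocate a new one.
    ByteBuffer acquire() {
        ByteBuffer buffer = pool.poll();
        return buffer != null ? buffer : ByteBuffer.allocateDirect(bufferSize);
    }

    // Return a buffer to the pool. offer() returns false when the pool is
    // already full; in that case the buffer is not retained and becomes
    // eligible for garbage collection.
    void release(ByteBuffer buffer) {
        buffer.clear();
        pool.offer(buffer);
    }

    // Number of idle buffers currently held by the pool.
    int pooledCount() {
        return pool.size();
    }
}
```

With a cap of two, releasing three buffers leaves two in the pool, so retained memory is bounded by the cap times the buffer size, independently of the segment size setting.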

bq. I’d still like to see the JSON field in commit log descriptor be reusable 
for other config parameters.

Done. Caused a change in CommitLogTest as the descriptor is now larger.

bq. In CommitLogSegment constructor it is no longer truncating the file, why is 
it no longer necessary?

Because the size is not preset for {{CompressedSegment}}, it makes no sense to 
truncate or resize the file at open. The resize was therefore moved to 
{{MemoryMappedSegment}}, where it no longer truncates but allocates the space 
for the file. I am not sure this is really necessary (Windows, at least, 
resizes the file on memory mapping), but the Java docs do not guarantee that 
mapping will succeed if the file is not big enough, so I left it in place.
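
For illustration only, the allocate-then-map order described above can be sketched with standard JDK calls as follows. This is not the code of {{MemoryMappedSegment}}; the class and method names are hypothetical, and the sketch assumes the segment size fits in an int, as {{FileChannel.map}} requires:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

// Hypothetical illustration; not the MemoryMappedSegment implementation.
final class PreallocatedMapping {
    // Sets the file length explicitly before mapping, so the mapping does not
    // rely on the platform growing the file implicitly.
    static MappedByteBuffer mapWithPreallocation(Path file, long size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(size); // allocate (or resize to) the full segment size
            // The mapping stays valid after the channel is closed.
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }
}
```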

I'd appreciate a test on Linux to confirm CASSANDRA-8729 is no longer a problem 
with uncompressed logs after this patch.

{quote}
* In cassandra.yaml, for commitlog_compression_threads can you document that 
the default is 1?
* awaitTermination changed from block forever to block for 3600 seconds. I have 
never been a fan of ExecutorService's async termination. If it was supposed to 
finish its last task it should terminate and if it doesn't then something is 
wrong.
{quote}
No longer relevant.

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: docs-impacting, performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
