[ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284186#comment-14284186 ]
Branimir Lambov edited comment on CASSANDRA-6809 at 1/20/15 6:47 PM:
---------------------------------------------------------------------

Multiple sync threads weren't really supported by the code, but it wasn't very hard to make them work. I updated [the branch|https://github.com/blambov/cassandra/tree/compressed-cl] so that it no longer relies on synchronization for all writing, and added an option to use more than one thread for compression. With this, LZ4-compressed logs can surpass uncompressed ones even on SSDs (tested with a 30ms periodic sync, which to me makes much more sense than the huge default).

The diff is [here|https://github.com/blambov/cassandra/commit/873cc2bc147e4e1e8209e79c60ea5cd295d2da42]; a large portion of it is code indented differently (is there any way to make github recognize this?).

Admittedly this solution doesn't use threads optimally (each thread still waits for its writes to materialize), but IMHO it is straightforward and simple.


was (Author: blambov):
    Multiple sync threads weren't really supported by the code, but it wasn't very hard to make it work. I updated [the branch|https://github.com/blambov/cassandra/tree/compressed-cl] to not rely on synchronization for all writing and added an option to use more than one thread for compression. With this LZ4 compressed logs can surpass uncompressed even on SSDs.

The diff is [here|https://github.com/blambov/cassandra/commit/873cc2bc147e4e1e8209e79c60ea5cd295d2da42]; a large portion of it is code indented differently (Is there any way to make github recognize this?).

Admittedly this solution doesn't use threads optimally (each thread still waits for its writes to materialize), but IMHO is straightforward and simple.
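
As a rough illustration of the "more than one thread for compression" idea in the comment above, here is a minimal sketch. It is not the code on the branch. The JDK's Deflater stands in for LZ4 so that the example has no external dependencies, and the class name, method names, and thread-pool layout are all hypothetical. The 64K chunk size is the figure suggested in the issue description. As in the comment, each caller still waits for its chunks to finish, so threads are not used optimally.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: compress a log buffer in fixed-size chunks on several threads.
public class ParallelChunkCompressionSketch {
    // ~64K chunks, as suggested in the issue description.
    static final int CHUNK_SIZE = 64 * 1024;

    // Compress one chunk. Deflater is an assumption here, standing in for LZ4.
    static byte[] compressChunk(byte[] input, int offset, int length) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(input, offset, length);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(length / 2 + 64);
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Inverse of compressChunk; the caller must know the uncompressed length.
    static byte[] decompressChunk(byte[] compressed, int originalLength)
            throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int pos = 0;
            while (pos < originalLength && !inflater.finished()) {
                int n = inflater.inflate(out, pos, originalLength - pos);
                if (n == 0) {
                    break; // no progress: truncated or corrupt input
                }
                pos += n;
            }
            if (pos != originalLength) {
                throw new DataFormatException("chunk shorter than expected");
            }
            return out;
        } finally {
            inflater.end();
        }
    }

    // Split data into chunks, compress them on a pool of the given size,
    // and return the compressed chunks in their original order.
    static List<byte[]> compressAll(byte[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (int off = 0; off < data.length; off += CHUNK_SIZE) {
                final int offset = off;
                final int length = Math.min(CHUNK_SIZE, data.length - off);
                pending.add(pool.submit(() -> compressChunk(data, offset, length)));
            }
            List<byte[]> chunks = new ArrayList<>();
            for (Future<byte[]> f : pending) {
                chunks.add(f.get()); // blocks until that chunk is compressed
            }
            return chunks;
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the compressed chunks in log order, which a real commit log would need in order to replay records correctly.
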

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log.
> Doing so should improve throughput, but some care will need to be taken to
> ensure we use as much of a segment as possible. I propose decoupling the
> writing of the records from the segments. Basically write into a (queue of)
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X
> MB written to the CL (where X is ordinarily CLS size), and then pack as many
> of the compressed chunks into a CLS as possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)