[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285436#comment-14285436
 ] 

Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------

The current approach boils down to multiple sync threads which
* form sections at regular time intervals,
* compress the section,
* wait for any previous syncs to have retired,
* write and flush the compressed data,
* retire the sync.
(In the uncompressed case there is a single thread, which skips the second and 
third steps.)
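
The steps above can be sketched roughly as follows. This is an illustration 
only, not the code from the patch: the class OrderedSyncSketch, its methods, 
the in-memory "segment" and the use of java.util.zip.Deflater are all 
assumptions made for the example, and the regular time interval that forms 
sections is replaced by a simple loop.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;

// Hypothetical sketch: several sync threads compress in parallel,
// but write and retire strictly in sync order.
class OrderedSyncSketch {
    private final Object lock = new Object();
    private long lastRetired = 0; // id of the most recently retired sync
    // Stand-in for the commit log segment file (assumption for the example).
    private final ByteArrayOutputStream segment = new ByteArrayOutputStream();
    private final List<Long> retiredOrder = new ArrayList<>();

    static byte[] compress(byte[] section) {
        Deflater deflater = new Deflater();
        deflater.setInput(section);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    void sync(long syncId, byte[] section) throws InterruptedException {
        // Step 2: compress the section; may overlap with other syncs.
        byte[] compressed = compress(section);
        // Step 3: wait for all previous syncs to have retired.
        synchronized (lock) {
            while (lastRetired != syncId - 1) {
                lock.wait();
            }
        }
        // Step 4: write (and, in a real log, flush). Only the sync whose
        // turn it is can reach this point, so no extra locking is needed.
        segment.write(compressed, 0, compressed.length);
        retiredOrder.add(syncId);
        // Step 5: retire this sync and wake up any waiting successors.
        synchronized (lock) {
            lastRetired = syncId;
            lock.notifyAll();
        }
    }

    static List<Long> run(int syncs, int threads) throws Exception {
        OrderedSyncSketch log = new OrderedSyncSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Void>> pending = new ArrayList<>();
        // Step 1: form sections (here one per loop iteration, not per interval).
        for (long id = 1; id <= syncs; id++) {
            final long syncId = id;
            final byte[] section =
                ("section " + id).getBytes(StandardCharsets.UTF_8);
            Callable<Void> task = () -> { log.sync(syncId, section); return null; };
            pending.add(pool.submit(task));
        }
        for (Future<Void> f : pending) {
            f.get(); // also makes the tasks' writes visible to this thread
        }
        pool.shutdown();
        return log.retiredOrder;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(8, 3));
    }
}
```

Because tasks are submitted in sync order to a FIFO pool, the oldest unretired 
sync is always running on some thread, so the wait in step 3 cannot deadlock 
in this sketch.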

Let me try to rephrase what you are saying to make sure I understand it 
correctly:
* a single sync thread forms sections at regular time intervals and sends them 
to a compression executor/phase (SPMC queue),
* each compression task sends its completed section to a flush executor/phase 
(MPSC queue; requires ordering and waiting for the first in-flight one),
* the flush task retires syncs in order.
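
For comparison, a minimal sketch of this pipelined variant, as I understand 
it. Everything here is hypothetical: the class PipelinedSyncSketch, the 
Section holder, the reorder map in the flush task and the use of 
java.util.zip.Deflater are assumptions for illustration, and the caller plays 
the role of the sync thread without any timing.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.zip.Deflater;

// Hypothetical sketch: sync thread -> compression pool -> single flush task.
class PipelinedSyncSketch {
    // A compressed section tagged with the sync it belongs to.
    static final class Section {
        final long syncId;
        final byte[] data;
        Section(long syncId, byte[] data) { this.syncId = syncId; this.data = data; }
    }

    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    static List<Long> run(int syncs, int compressors) throws Exception {
        // The pool's internal work queue acts as the SPMC queue.
        ExecutorService compressionPool = Executors.newFixedThreadPool(compressors);
        ExecutorService flusher = Executors.newSingleThreadExecutor();
        // MPSC queue: many compression tasks produce, one flush task consumes.
        BlockingQueue<Section> toFlush = new LinkedBlockingQueue<>();

        Callable<List<Long>> flushTask = () -> {
            ByteArrayOutputStream segment = new ByteArrayOutputStream(); // stand-in for the file
            List<Long> retired = new ArrayList<>();
            Map<Long, Section> outOfOrder = new HashMap<>();
            long next = 1; // the first in-flight sync
            while (next <= syncs) {
                Section s = toFlush.take();
                outOfOrder.put(s.syncId, s);
                // Flush only once the first in-flight section has arrived,
                // then drain any successors that were already waiting.
                Section ready;
                while ((ready = outOfOrder.remove(next)) != null) {
                    segment.write(ready.data, 0, ready.data.length);
                    retired.add(next++); // retire syncs in order
                }
            }
            return retired;
        };
        Future<List<Long>> result = flusher.submit(flushTask);

        // Sync thread role: form sections and hand them to the compression phase.
        for (long id = 1; id <= syncs; id++) {
            final long syncId = id;
            final byte[] raw = ("section " + id).getBytes(StandardCharsets.UTF_8);
            compressionPool.execute(
                () -> toFlush.add(new Section(syncId, compress(raw))));
        }

        List<Long> retired = result.get();
        compressionPool.shutdown();
        flusher.shutdown();
        return retired;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(8, 3));
    }
}
```

Note that the flush task still has to buffer sections that finish compression 
early, so the ordering and waiting from the current approach do not go away; 
they move into the flush phase.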

Is this what you mean? Why is this simpler, or of comparable complexity? 
Wouldn't the two extra queues waste resources and increase latency?

Smaller-than-segment batches (sections) are already part of the design in both 
cases, assuming that the sync period is sane (e.g. ~100ms). In both approaches 
there's room to further separate write and flush at the expense of added 
complexity.

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.


