[ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287853#comment-14287853 ]

Branimir Lambov commented on CASSANDRA-6809:
--------------------------------------------

{quote}
* single sync thread forms sections at regular time intervals and sends them to 
compression executor/phase (SPMC queue),
* sync thread waits on futures and syncs each in order
{quote}
I gave your suggestion a day of development, but it still introduces more 
problems than it solves. I took it [this 
far|https://github.com/blambov/cassandra/compare/blambov:compressed-cl...compressed-cl-compressionexecutor].
 It is already significantly more complicated than the option I proposed, yet 
it gave worse performance and still leaves some uncertainties around recycling 
and shutdown.
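
In outline, what I understand the suggestion to be has roughly this shape (a 
sketch only, not the code in the branch above; compress() and syncToDisk() are 
illustrative stand-ins):

{code:java}
// Sketch only: section handling is elided; compress() and syncToDisk() are
// stand-ins, not methods from the linked branch.
import java.nio.ByteBuffer;
import java.util.concurrent.*;

class CompressedSyncPipeline
{
    private final ExecutorService compressors = Executors.newFixedThreadPool(4);
    // Decouples sync starts from sync completions: results are consumed
    // strictly in submission order, so syncs stay serial and ordered.
    private final BlockingQueue<Future<ByteBuffer>> inFlight = new LinkedBlockingQueue<>();

    // Called at each sync interval by the single section-forming thread.
    void startSync(ByteBuffer section)
    {
        inFlight.add(compressors.submit(() -> compress(section)));
    }

    // Run by the sync thread: wait on each future and sync each in order.
    void syncLoop() throws InterruptedException, ExecutionException
    {
        while (true)
            syncToDisk(inFlight.take().get());
    }

    private ByteBuffer compress(ByteBuffer section) { return section; } // placeholder
    private void syncToDisk(ByteBuffer compressed) {}                   // placeholder
}
{code}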

Perhaps I did not put this clearly, but I don't see a point in introducing a 
trigger for compression other than a sync. Reducing write latency for a 10s 
sync period is of no value whatsoever; with a short period, especially in 
batch mode where it really matters, you wouldn't want to start a compression 
cycle before the batch is completed anyway (and if you did, a better solution 
would be to compress each mutation individually). We have ample flexibility in 
the sync period (time) and segment size (space) to use compression 
efficiently. Granted, this may require documenting different defaults for 
compression, but that is something I would much rather live with than the 
extra code complexity needed to work around badly chosen parameters.

Assuming sync-only triggering and short periods, your suggestion requires 
decoupling sync starts from sync completions, with a queue of sync requests in 
flight. That is what I implemented in the code above. Am I doing something 
wrong?


Going back to the previous approach (updated to fix a problem with the sync 
possibly completing earlier than it should),

bq. We're now no longer honouring the sync interval; we are syncing more 
frequently, which may reduce disk throughput. The exact time of syncing in 
relation to each other may also vary, likely into lock-step under saturation, 
so that there may be short periods of many competing syncs potentially yielding 
pathological disk behaviour, and introducing competition for the synchronized 
blocks inside the segments, in effect introducing a MPMC queue, eliminating 
those few micros of benefit.

The sync frequency is as specified; the intervals between syncs will vary, but 
writes to disk are still serial, so the disks should behave normally. There 
will be competition on waitForSync if compression is constantly late but, as 
you say, in that case the magnitude of the overheads is too small to matter. A 
bigger problem is a pathological situation I can imagine in which only one 
thread is doing work: the others have nothing to do, become late waiting for 
it, and then all start the next cycle at the same time.
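
Roughly, the contention point is of this shape (a paraphrase for illustration, 
not the actual segment code; all names are illustrative):

{code:java}
// Rough paraphrase of the contention point: when compression runs constantly
// late, writer threads pile up in this monitor and are released together,
// starting their next cycle in lock-step.
class SyncMonitorSketch
{
    private long lastSyncedOffset;

    synchronized void waitForSync(long position) throws InterruptedException
    {
        while (lastSyncedOffset < position)
            wait(); // late threads queue up here...
    }

    synchronized void markSynced(long offset)
    {
        lastSyncedOffset = Math.max(lastSyncedOffset, offset);
        notifyAll(); // ...and all wake at the same instant
    }
}
{code}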

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. 
> Doing so should improve throughput, but some care will need to be taken to 
> ensure we use as much of a segment as possible. I propose decoupling the 
> writing of the records from the segments. Basically write into a (queue of) 
> DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
> MB written to the CL (where X is ordinarily CLS size), and then pack as many 
> of the compressed chunks into a CLS as possible.
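
For illustration, the chunking scheme described in the issue could be sketched 
like this (java.util.zip stands in for the real compressor; every name and 
size here is illustrative, not from a patch):

{code:java}
// Sketch only: the queue is unbounded and records are assumed to fit in a
// single chunk; the sync thread is the only consumer.
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.zip.Deflater;

class ChunkedWriterSketch
{
    private static final int CHUNK_SIZE = 64 * 1024; // ~64K chunks, as proposed
    // Records accumulate here, decoupled from any particular segment.
    private final Queue<ByteBuffer> pending = new ConcurrentLinkedQueue<>();

    void append(ByteBuffer record)
    {
        pending.add(record.duplicate()); // writers never touch the segment
    }

    // Sync thread: gather records into ~64K chunks, compress each chunk, and
    // pack as many compressed chunks into the segment as will fit.
    void syncCycle(ByteBuffer segment)
    {
        ByteBuffer chunk = ByteBuffer.allocate(CHUNK_SIZE);
        while (!pending.isEmpty())
        {
            // fill the chunk; assumes each record fits in a single chunk
            while (!pending.isEmpty() && pending.peek().remaining() <= chunk.remaining())
                chunk.put(pending.poll());
            chunk.flip();
            byte[] compressed = deflate(chunk);
            if (compressed.length > segment.remaining())
                return; // segment full; remaining chunks go to the next segment
            segment.put(compressed);
            chunk.clear();
        }
    }

    private static byte[] deflate(ByteBuffer chunk)
    {
        byte[] in = new byte[chunk.remaining()];
        chunk.get(in);
        Deflater deflater = new Deflater();
        deflater.setInput(in);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }
}
{code}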



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
