[ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015469#comment-13015469
 ] 

Sylvain Lebresne commented on CASSANDRA-2156:
---------------------------------------------

I think this will be quite a useful patch.

Dividing the total compaction rate by the number of active compactions to 
determine the rate of each active compaction may be a bit coarse-grained in 
some situations, but it's also probably good enough, and I'm fine leaving that 
as a further improvement if it turns out to be needed.

Also, we may ultimately want to throttle cleanup compaction too, and maybe have 
a specific rate for validation compaction. But I'm fine having that as another 
ticket.

A few comments:
 * A MB is 1024 * 1024 bytes, and a second is 1000 ms. I think the definition 
of CompactionIterator.THROTTLE_BYTES_PER_MS takes liberties with standard units 
:).
 * We should really allow 0 for the compaction rate to deactivate throttling 
(and that should really skip throttle() completely), if only because bugs exist.
 * To have compaction rate changeable live would be pretty cool and it's super 
easy (an AtomicInteger for THROTTLE_BYTES_PER_MS with some jmx call in 
CompactionManager to change it should be enough), so let's do it now.
 * In theory, there is a risk of division by 0 because targetBytesPerMs can be 
0. Granted this is more than unlikely given that the minimum value for THROTTLE 
is 1024, but nevertheless, let's be on the safe side.
 * In the same vein, excessBytes can be negative. Pretty sure sleep just 
treats any negative number as 0, but it would be better to actually check 
for all those limit cases.
 * I'd also be in favor of logging the changes of targetByteInMS at debug 
level. There'll be one message each time you start a compaction and n messages 
each time the number of active compactions changes, and we'll print them even 
when we don't throttle anything, so it will be noise for most people. Anyway, 
really no big deal.
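
To make the points above concrete, here is a rough sketch of how the limit 
cases could be handled. It is not the patch's code: the class and method names 
are made up for illustration, and it assumes the rate is configured in MB/s and 
split evenly across active compactions.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these names come from the actual patch. */
final class CompactionThrottleSketch {
    static final long BYTES_PER_MB = 1024L * 1024L;
    static final long MS_PER_SECOND = 1000L;

    // Total compaction rate in MB/s; 0 disables throttling.
    // An AtomicInteger lets a JMX setter change it while compactions run.
    private final AtomicInteger totalMbPerSec;

    CompactionThrottleSketch(int initialMbPerSec) {
        this.totalMbPerSec = new AtomicInteger(checkRate(initialMbPerSec));
    }

    /** What a JMX operation could delegate to (assumption). */
    void setTotalMbPerSec(int mbPerSec) {
        totalMbPerSec.set(checkRate(mbPerSec));
    }

    private static int checkRate(int mbPerSec) {
        if (mbPerSec < 0)
            throw new IllegalArgumentException("rate must be >= 0");
        return mbPerSec;
    }

    /** Per-compaction target in bytes/ms; 0 means "do not throttle". */
    long targetBytesPerMs(int activeCompactions) {
        long total = totalMbPerSec.get() * BYTES_PER_MB / MS_PER_SECOND;
        if (total == 0)
            return 0;
        // Guard both the divisor and the result so neither can reach 0.
        return Math.max(1, total / Math.max(1, activeCompactions));
    }

    /** Milliseconds to sleep after writing bytesWritten in elapsedMs. */
    long sleepMillis(long bytesWritten, long elapsedMs, int activeCompactions) {
        long target = targetBytesPerMs(activeCompactions);
        if (target == 0)
            return 0; // throttling disabled: skip entirely
        long excessBytes = bytesWritten - target * elapsedMs;
        if (excessBytes <= 0)
            return 0; // already under the target rate: never sleep a negative time
        return excessBytes / target;
    }
}
```

The caller would pass the result of sleepMillis() to Thread.sleep() only when 
it is positive.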


> Compaction Throttling
> ---------------------
>
>                 Key: CASSANDRA-2156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 
> 0005-Throttle-total-compaction-to-a-configurable-throughput.txt, 
> for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
> for-0.6-0002-Make-compaction-throttling-configurable.txt
>
>
> Compaction is currently relatively bursty: we compact as fast as we can, and 
> then we wait for the next compaction to be possible ("hurry up and wait").
> Instead, to properly amortize compaction, you'd like to compact exactly as 
> fast as you need to to keep the sstable count under control.
> For every new level of compaction, you need to increase the rate that you 
> compact at: a rule of thumb that we're testing on our clusters is to 
> determine the maximum number of buckets a node can support (aka, if the 15th 
> bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
> multiply the flush throughput by the number of buckets to get a minimum 
> compaction throughput to maintain your sstable count.
> Full explanation: for a min compaction threshold of {{T}}, the bucket at 
> level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
> data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
> causing the bucket at level N to fill. If the bucket at level N fills, it 
> causes {{SsubN}} units to be compacted. So, for each active level in your 
> system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
> time a new unit is added.
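
The rule of thumb in the description above boils down to simple arithmetic. 
The sketch below only illustrates that arithmetic; the class and method names 
are hypothetical, and the figures in the usage note are made-up inputs, not 
measurements.

```java
/** Hypothetical illustration of the rule of thumb from the issue description. */
final class MinCompactionRateSketch {
    /** Units a bucket at the given level can hold: T^N for min threshold T. */
    static double bucketCapacityUnits(int minThreshold, int level) {
        return Math.pow(minThreshold, level);
    }

    /**
     * Each active level contributes one amortized unit of compaction per
     * flushed unit, so the minimum compaction throughput is the flush
     * throughput multiplied by the number of buckets.
     */
    static double minCompactionMbPerSec(double flushMbPerSec, int maxBuckets) {
        if (flushMbPerSec < 0 || maxBuckets < 0)
            throw new IllegalArgumentException("inputs must be non-negative");
        return flushMbPerSec * maxBuckets;
    }
}
```

For example, with an assumed flush throughput of 2 MB/s and 15 buckets, 
minCompactionMbPerSec(2.0, 15) gives 30 MB/s.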

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
