[ https://issues.apache.org/jira/browse/CASSANDRA-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537772#comment-13537772 ]

Sylvain Lebresne commented on CASSANDRA-4897:
---------------------------------------------

bq. There needs to be some kind of heuristic for when to compact the bucket of 
max-sized tables

In a way I agree, but that's pretty much what leveled compaction is. I'm of the 
opinion that it might be better to spend time optimizing leveled compaction 
rather than trying to hack size-tiered compaction into doing things it wasn't 
designed for and that somewhat go against its nature.
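
To make that comparison concrete: leveled compaction already bounds sstable 
size by writing every sstable at a fixed target size (the sstable_size_in_mb 
option) and giving each level a size budget roughly 10x the previous one. The 
snippet below is only an illustrative sketch with made-up class names, not the 
actual LeveledCompactionStrategy/LeveledManifest code:

{code:java}
/**
 * Illustrative sketch only: roughly the sizing rule leveled compaction uses.
 * Every sstable is written at a fixed target size, and level N is allowed to
 * hold about 10^N of them before it gets compacted into level N+1.
 */
final class LeveledSizingSketch
{
    private final long maxSSTableSizeBytes;

    LeveledSizingSketch(long maxSSTableSizeBytes)
    {
        this.maxSSTableSizeBytes = maxSSTableSizeBytes;
    }

    /** Total bytes a level may hold before it spills into the next one. */
    long maxBytesForLevel(int level)
    {
        if (level == 0)
            return 4 * maxSSTableSizeBytes; // L0 is handled more loosely in practice
        return (long) Math.pow(10, level) * maxSSTableSizeBytes;
    }

    /** A level is a compaction candidate as soon as it exceeds its budget. */
    boolean levelNeedsCompaction(int level, long totalBytesInLevel)
    {
        return totalBytesInLevel > maxBytesForLevel(level);
    }
}
{code}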

bq. If the oldest table is older than a user-configured number of hours, then 
run compaction

It'll kind of work, but it's a hack imho that has a number of downsides in 
practice. The main goal of compaction is to keep the number of sstables you 
need to look at for a read low at all times. But by configuring a time after 
which sstables get compacted, you don't control very well how many sstables may 
accumulate during that window. That means this setting will be hard for users 
to tune right, and downright impossible to tune correctly if the write load 
varies too much over time. On the other hand, if you set it too low, you will 
compact sstables regularly even when they don't need to be.
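
For concreteness, here is a minimal sketch of the proposed age-based trigger, 
using hypothetical SSTable and trigger classes rather than Cassandra's real 
compaction interfaces. Note that nothing in it bounds how many sstables can 
pile up inside the configured window, which is exactly the tuning problem 
above:

{code:java}
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch of the age-based trigger being discussed: compact the
 * bucket of max-sized sstables once its oldest member exceeds a configured age.
 * SSTable here is a stand-in, not Cassandra's SSTableReader.
 */
final class AgeBasedTriggerSketch
{
    static final class SSTable
    {
        final long createdAtMillis;

        SSTable(long createdAtMillis)
        {
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final long maxAgeMillis;

    AgeBasedTriggerSketch(long maxAgeHours)
    {
        this.maxAgeMillis = TimeUnit.HOURS.toMillis(maxAgeHours);
    }

    /**
     * True once the oldest sstable in the bucket is older than the threshold.
     * The caveat: this says nothing about how many sstables accumulated in the
     * meantime, so read performance still depends on the write rate, not on
     * this setting.
     */
    boolean shouldCompact(List<SSTable> maxSizedBucket, long nowMillis)
    {
        long oldest = Long.MAX_VALUE;
        for (SSTable sstable : maxSizedBucket)
            oldest = Math.min(oldest, sstable.createdAtMillis);
        return !maxSizedBucket.isEmpty() && nowMillis - oldest > maxAgeMillis;
    }
}
{code}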

Don't get me wrong, I buy that for certain workloads and for certain values of 
maxSSTableSize and 'configured number of hours between compactions', you could 
get something reasonably useful. But we also have to consider the risk of 
foot-shooting for users, and I'm not yet sold on it being acceptable in that 
case.
                
> Allow tiered compaction define max sstable size
> -----------------------------------------------
>
>                 Key: CASSANDRA-4897
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4897
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Radim Kolar
>            Assignee: Radim Kolar
>             Fix For: 1.2.1
>
>         Attachments: cass-maxsize1.txt, cass-maxsize2.txt
>
>
> Lucene does the same thing. A correctly configured max segment size will 
> recycle old data faster with less disk space.
