[ https://issues.apache.org/jira/browse/CASSANDRA-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784245#comment-13784245 ]
Jonathan Ellis commented on CASSANDRA-6109:
-------------------------------------------

I guess whether hotness or overlap is a more important criterion depends on your goal:
# prioritizing by hotness helps speed reads up more, especially when you have a lot of cold data sitting around
# prioritizing by overlap ratio reduces disk space usage and helps throw away obsolete cells faster

I was hoping to tackle #1 here, but maybe that needs a separate strategy a la CASSANDRA-5560. For #2, CASSANDRA-5906 adds a HyperLogLog component that does a fantastic job of letting us estimate overlap ratios.

> Consider coldness in STCS compaction
> ------------------------------------
>
>                 Key: CASSANDRA-6109
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6109
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.2
>
>
> I see two options:
> # Don't compact cold sstables at all
> # Compact cold sstables only if there is nothing more important to compact
> The latter is better if you have cold data that may become hot again... but
> it's confusing if you have a workload such that you can't keep up with *all*
> compaction, but you can keep up with hot sstables.  (Compaction backlog stat
> becomes useless since we fall increasingly behind.)
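
The two options in the issue description can be contrasted with a minimal Python sketch. This is not Cassandra code: the `SSTable` class, its `reads_per_sec` and `keys` fields, the reads-per-key definition of hotness, and the `cold_threshold` parameter are all assumptions made up for illustration, and STCS size bucketing is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical stand-in for an sstable; not Cassandra's class."""
    name: str
    reads_per_sec: float  # assumed recent read rate
    keys: int             # assumed estimated key count

    @property
    def hotness(self) -> float:
        # Assumed definition: reads per second per key.
        return self.reads_per_sec / self.keys if self.keys else 0.0


def skip_cold(sstables, cold_threshold):
    """Option 1: never offer cold sstables as compaction candidates."""
    return [s for s in sstables if s.hotness >= cold_threshold]


def cold_as_fallback(sstables, cold_threshold):
    """Option 2: offer cold sstables only when no hot candidates exist."""
    hot = skip_cold(sstables, cold_threshold)
    return hot if hot else list(sstables)


tables = [
    SSTable("a", reads_per_sec=50.0, keys=1000),
    SSTable("b", reads_per_sec=0.1, keys=1000),
    SSTable("c", reads_per_sec=200.0, keys=1000),
]

print([s.name for s in skip_cold(tables, 0.01)])
print([s.name for s in cold_as_fallback(tables, 0.01)])
print([s.name for s in cold_as_fallback([tables[1]], 0.01)])
```

Under option 2, the cold sstable `b` is compacted only when it is the sole candidate. That is also why a backlog counter that includes `b` keeps growing on a node that can only keep up with hot sstables.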