[ https://issues.apache.org/jira/browse/CASSANDRA-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805387#comment-13805387 ]
Tyler Hobbs commented on CASSANDRA-6109:
----------------------------------------

bq. When they make up more than X% do we stop discriminating or merge them only with other cold sstables?

I was thinking we would stop discriminating. The logic would basically be this:

{noformat}
total_reads = sum(sstable.reads_per_sec for sstable in sstables)
total_cold_reads = 0
cold_sstables = set()
for sstable in sorted(sstables, key=lambda sstable: sstable.reads_per_key_per_sec):
    if (sstable.reads_per_sec + total_cold_reads) / total_reads < configurable_threshold:
        cold_sstables.add(sstable)
        total_cold_reads += sstable.reads_per_sec
    else:
        break

getBuckets(sstable for sstable in sstables if sstable not in cold_sstables)
{noformat}

> Consider coldness in STCS compaction
> ------------------------------------
>
>                 Key: CASSANDRA-6109
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6109
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.2
>
>         Attachments: 6109-v1.patch, 6109-v2.patch
>
>
> I see two options:
> # Don't compact cold sstables at all
> # Compact cold sstables only if there is nothing more important to compact
> The latter is better if you have cold data that may become hot again... but
> it's confusing if you have a workload such that you can't keep up with *all*
> compaction, but you can keep up with hot sstable. (Compaction backlog stat
> becomes useless since we fall increasingly behind.)
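
For anyone who wants to try the cold-sstable filtering logic from the comment outside of Cassandra, here is a minimal runnable Python sketch. The names `SSTableStats` and `filter_cold_sstables` are hypothetical and are not part of Cassandra or of the attached patches. The guard for zero total reads is an added assumption, since the pseudocode would otherwise divide by zero. The final `getBuckets` call is left out, and the function simply returns the sstables that would be passed to it.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SSTableStats:
    """Hypothetical stand-in for per-sstable read statistics."""
    name: str
    reads_per_sec: float
    reads_per_key_per_sec: float


def filter_cold_sstables(sstables: Iterable[SSTableStats],
                         threshold: float) -> List[SSTableStats]:
    """Return the sstables left over after omitting the coldest ones.

    Walks the sstables coldest-first (by reads per key per second) and marks
    them cold while their combined reads stay below `threshold`, expressed as
    a fraction of the total read rate.
    """
    sstables = list(sstables)
    total_reads = sum(s.reads_per_sec for s in sstables)
    if total_reads == 0:
        # Assumption: with no reads recorded, nothing is treated as cold.
        return sstables

    cold = set()
    total_cold_reads = 0.0
    for s in sorted(sstables, key=lambda s: s.reads_per_key_per_sec):
        if (s.reads_per_sec + total_cold_reads) / total_reads < threshold:
            cold.add(s)
            total_cold_reads += s.reads_per_sec
        else:
            # Once the cold set would reach the threshold, stop discriminating.
            break

    # Keep the original ordering of the remaining candidates.
    return [s for s in sstables if s not in cold]


if __name__ == "__main__":
    candidates = [
        SSTableStats("a", reads_per_sec=1.0, reads_per_key_per_sec=0.001),
        SSTableStats("b", reads_per_sec=4.0, reads_per_key_per_sec=0.01),
        SSTableStats("c", reads_per_sec=95.0, reads_per_key_per_sec=0.5),
    ]
    for s in filter_cold_sstables(candidates, threshold=0.1):
        print(s.name)
```

Note that, as in the pseudocode, the sort key is reads per key per second while the threshold is applied to the plain read rate, so a large sstable with sparse reads is considered for omission before a small, busy one.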