[ https://issues.apache.org/jira/browse/CASSANDRA-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114977#comment-15114977 ]
Marcus Eriksson commented on CASSANDRA-11060:
---------------------------------------------

also see CASSANDRA-11056

> Allow DTCS old SSTable filtering to use min timestamp instead of max
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-11060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11060
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Bisbee
>              Labels: dtcs
>
> We have observed a DTCS behavior when using TTLs where SSTables are never or
> very rarely fully expired due to compaction, allowing expired data to be
> "stuck" in large partially expired SSTables.
> This is because compaction filtering is performed on the max timestamp, which
> continues to grow as SSTables are compacted together. This means they will
> never move past max_sstable_age_days. With a sufficiently large TTL, like 30
> days, this allows old but not expired SSTables to continue combining and
> never become fully expired, even with a max_sstable_age_days of 1.
> As a result we have seen expired data hang around in large SSTables for over
> six months longer than it should have. This is obviously wasteful and a disk
> capacity issue.
> As a result we have been running an extended version of DTCS called MTCS in
> some deployments. The only change is that it uses min timestamp instead of
> max for compaction filtering (filterOldSSTables()). This allows SSTables to
> move beyond max_sstable_age_days and stop compacting, which means the entire
> SSTable can become fully expired and be dropped off disk as intended.
> You can see and test MTCS here: https://github.com/threatstack/mtcs
> I am not advocating that MTCS be its own stand alone compaction strategy.
> However, I would like to see a configuration option for DTCS that allows you
> to specify whether old SSTables should be filtered on min or max timestamp.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
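
To make the difference between the two filters in the issue description concrete, here is a minimal Python sketch. It is a simplified model, not Cassandra's actual filterOldSSTables() code: the SSTable class, the use_min_timestamp flag, and the day-granularity timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable; timestamps are in whole days."""
    name: str
    min_timestamp: int  # oldest write contained in the table
    max_timestamp: int  # newest write contained in the table


def merge(a, b):
    """Model a compaction: the result spans both inputs' timestamp ranges."""
    return SSTable(
        name=a.name + "+" + b.name,
        min_timestamp=min(a.min_timestamp, b.min_timestamp),
        max_timestamp=max(a.max_timestamp, b.max_timestamp),
    )


def filter_old_sstables(sstables, max_sstable_age, now, use_min_timestamp=False):
    """Return the SSTables still young enough to be compaction candidates.

    use_min_timestamp is a hypothetical option mirroring the one requested
    in the issue: False models the max-timestamp filter described for DTCS,
    True models the min-timestamp filter described for MTCS.
    """
    cutoff = now - max_sstable_age

    def age_key(table):
        return table.min_timestamp if use_min_timestamp else table.max_timestamp

    return [t for t in sstables if age_key(t) >= cutoff]


# An old table (day 0) is compacted with newer data (day 9), then a fresh
# table is flushed on day 10.
old = SSTable("old", min_timestamp=0, max_timestamp=0)
newer = SSTable("newer", min_timestamp=9, max_timestamp=9)
merged = merge(old, newer)
fresh = SSTable("fresh", min_timestamp=10, max_timestamp=10)

by_max = filter_old_sstables([merged, fresh], max_sstable_age=1, now=10)
by_min = filter_old_sstables([merged, fresh], max_sstable_age=1, now=10,
                             use_min_timestamp=True)

print([t.name for t in by_max])
print([t.name for t in by_min])
```

In this model the merged table keeps qualifying under the max-timestamp filter because each compaction pulls its max timestamp forward, while the min-timestamp filter retires it as soon as its oldest data passes the cutoff, leaving it untouched until it can expire as a whole.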