[ https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Jirsa updated CASSANDRA-10496:
-----------------------------------
    Comment: was deleted

(was: [~krummas] - why not make 3 resulting sstables - one earlier, one later, and one exactly on target? The marginal cost of the third sstable seems pretty minor. The benefit seems relatively significant? )

> Make DTCS split partitions based on time during compaction
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>              Labels: dtcs
>             Fix For: 3.x
>
>
> To avoid getting old data in new time windows with DTCS (or related, like
> [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable
> during compaction.
> My initial idea is to just create two sstables, when we create the compaction
> task we state the start and end times for the window, and any data older than
> the window will be put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the
> windows correct - say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}} - we would create two sstables:
> {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now
> 'correct'. The next compaction would compact in window {{[80, 60]}} and
> create sstables {{[75]}}, {{[50, 10]}} etc.
> We will probably also want to base the windows on the newest data in the
> sstables so that we actually have older data than the window.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
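
The two-sstable split described in the issue can be sketched in a few lines of Python. This is only an illustration of the idea, not Cassandra code: the function names `split_by_window` and `compact_incrementally` are hypothetical, timestamps are modeled as plain integers, fixed-size windows are assumed, and the older edge of a window is assumed to be inclusive (the issue does not say how boundaries are handled).

```python
from typing import List, Tuple


def split_by_window(timestamps: List[int], window_end: int) -> Tuple[List[int], List[int]]:
    """Split one compaction's output into (in-window, older-than-window).

    `window_end` is the older edge of the window, e.g. 80 for window [100, 80].
    Assumption: a timestamp equal to `window_end` stays in the window.
    """
    in_window = [ts for ts in timestamps if ts >= window_end]
    older = [ts for ts in timestamps if ts < window_end]
    return in_window, older


def compact_incrementally(timestamps: List[int], newest: int, window_size: int) -> List[List[int]]:
    """Apply the split once per window, moving from newer to older windows.

    Each pass models one compaction: the current window becomes 'correct'
    and everything older is carried forward as a single leftover sstable.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    sstables = []
    remaining = sorted(timestamps, reverse=True)
    upper = newest
    while remaining:
        lower = upper - window_size
        in_window, remaining = split_by_window(remaining, lower)
        if in_window:  # skip windows that hold no data
            sstables.append(in_window)
        upper = lower
    return sstables


if __name__ == "__main__":
    data = [100, 99, 98, 97, 75, 50, 10]
    print(split_by_window(data, 80))
    print(compact_incrementally(data, newest=100, window_size=20))
```

With the timestamps from the issue, the first call reproduces the `[100, 99, 98, 97]` / `[75, 50, 10]` split for window `[100, 80]`, and the second shows how repeated compactions gradually place each value in its own window. Data newer than the window is not treated specially here, since the description only covers splitting out older data.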