[ 
https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153049#comment-16153049
 ] 

mck edited comment on CASSANDRA-10496 at 9/5/17 2:28 AM:
---------------------------------------------------------

[~iksaif],
a few comments:
 - i suspect [~krummas] is keen to see a patch that splits partitions. 
 - changing locations isn't supported. see how i paired it with the writer in 
my experiment above.
 - Marcus' original idea was to create only two sstables per TWCS window. is 
that still possible?
 - shouldn't the bucket be based on the maxTimestamp? see `getBuckets(..)` and 
`newestBucket(..)`
 - is it correct that the idea is that, as "old" sstables are split out, they 
would later get re-compacted with their original bucket, and that the domino 
effect this could cause (re-compacting older buckets) could be avoided by 
increasing minThreshold to 3?



was (Author: michaelsembwever):
[~iksaif],
a few comments:
 - i suspect [~krummas] is keen to see a patch that splits partitions. 
 - changing locations isn't supported. see how i paired it with the writer in 
my experiment above.
 - i don't think you want to create the SSTableWriters multiple times.
 - Marcus' original idea was to create only two sstables per TWCS window. is 
that still possible?
 - shouldn't the bucket be based on the maxTimestamp? see `getBuckets(..)` and 
`newestBucket(..)`
 - is it correct that the idea is that, as "old" sstables are split out, they 
would later get re-compacted with their original bucket, and that the domino 
effect this could cause (re-compacting older buckets) could be avoided by 
increasing minThreshold to 3?


> Make DTCS/TWCS split partitions based on time during compaction
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-10496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>              Labels: dtcs
>             Fix For: 4.x
>
>
> To avoid getting old data in new time windows with DTCS (or related, like 
> [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable 
> during compaction.
> My initial idea is to just create two sstables, when we create the compaction 
> task we state the start and end times for the window, and any data older than 
> the window will be put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the 
> windows correct - say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}} - we would create two sstables:
> {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now 
> 'correct'. The next compaction would compact in window {{[80, 60]}} and 
> create sstables {{[75]}}, {{[50, 10]}} etc.
> We will probably also want to base the windows on the newest data in the 
> sstables so that we actually have older data than the window.
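
A minimal sketch of the two-sstable split described in the issue, written in
Python for illustration only and not taken from any Cassandra patch. The
function name `split_out_old_data` and the `window_start` parameter are
hypothetical, and treating the window's lower bound as inclusive is an
assumption (the example in the description does not include a value exactly
on the boundary).

```python
def split_out_old_data(timestamps, window_start):
    """Split one sstable's timestamps into (in_window, older).

    Hypothetical helper: anything at or above window_start stays in the
    current window's sstable, everything older goes to a single second
    sstable, mirroring the "just create two sstables" idea.
    """
    in_window = [ts for ts in timestamps if ts >= window_start]
    older = [ts for ts in timestamps if ts < window_start]
    return in_window, older


# Timestamps from the issue description, compacted in window [100, 80].
sstable = [100, 99, 98, 97, 75, 50, 10]
first_window, remainder = split_out_old_data(sstable, window_start=80)

# The next compaction works on the remainder, in window [80, 60].
second_window, still_older = split_out_old_data(remainder, window_start=60)
```

Each pass leaves one window "correct" and pushes everything older into a
single sstable, so the windows are fixed incrementally rather than in one
large rewrite.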



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
