[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110742#comment-15110742 ]
Marcus Eriksson commented on CASSANDRA-9666: -------------------------------------------- [~rstrickland] we ran a few tests in CASSANDRA-10195 - new DTCS and TWCS performance was pretty much the same > Provide an alternative to DTCS > ------------------------------ > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement > Reporter: Jeff Jirsa > Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressive compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. The window is set in common, easily > understandable terms such as “12 hours”, “1 Day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and users keeping data for years. > - There is no explicitly configurable max sstable age, though sstables will > naturally stop compacting once new data is written in that window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. > - It remains true that if old data and new data is written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables, however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for : > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs > Rebased, force-pushed July 18, with bug fixes for estimated pending > compactions and potential starvation if more than min_threshold tables > existed in current window but STCS did not consider them viable candidates > Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882 -- This message was sent by Atlassian JIRA (v6.3.4#6332)