[ https://issues.apache.org/jira/browse/CASSANDRA-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Yeschenko reassigned CASSANDRA-5561:
--------------------------------------------

    Assignee: Aleksey Yeschenko

> Compaction strategy that minimizes re-compaction of old/frozen data
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-5561
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5561
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.3
>            Reporter: Tupshin Harper
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.1
>
>
> Neither LCS nor STCS is a good fit for data that becomes immutable over
> time. The most obvious case is time-series data where the application can
> guarantee that out-of-order delivery (to Cassandra) of an event can't take
> place more than N minutes/seconds/hours/days after the event's real
> (wall clock) time.
> There are various approaches that could involve paying attention to the row
> keys (if they include a time component) and/or the column names (if they are
> TimeUUID or Integer based and therefore inherently time-ordered), but it
> might be sufficient to just look at the timestamps of the columns themselves.
> A possible approach:
> 1) Define an optional max-out-of-order window on a per-CF basis.
> 2) Use the normal (LCS or STCS) compaction strategy for any SSTables that
> include any columns younger than max-out-of-order-delivery.
> 3) Use an alternate compaction strategy (call it TWCS, time window
> compaction strategy, for now) for any SSTables that only contain columns
> older than max-out-of-order-delivery.
> 4) TWCS will only compact sstables containing data older than
> max-out-of-order-delivery.
> 5) TWCS will only perform compaction to reduce row fragmentation (if there
> is any by the time it gets to TWCS) or to reduce the number of small
> sstables.
> 6) To minimize re-compaction in TWCS, it should aggressively try to compact
> as many small sstables as possible into one large sstable that would never
> have to be recompacted.
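The dispatch in steps 2-3 could be sketched roughly as follows. This is a minimal illustration, not Cassandra code: `SSTable`, `max_timestamp`, and `split_by_window` are hypothetical names standing in for whatever the real implementation would use.

```python
import time

class SSTable:
    """Illustrative stand-in: an sstable plus the timestamp of its newest column."""
    def __init__(self, name, max_timestamp):
        self.name = name
        self.max_timestamp = max_timestamp  # newest column timestamp, in seconds

def split_by_window(sstables, max_out_of_order, now=None):
    """Partition sstables per steps 2-3: tables whose newest column is
    younger than the max-out-of-order window stay with the normal (LCS/STCS)
    strategy; fully 'frozen' tables are handed to the TWCS path."""
    now = time.time() if now is None else now
    cutoff = now - max_out_of_order
    mutable = [s for s in sstables if s.max_timestamp >= cutoff]  # normal strategy
    frozen = [s for s in sstables if s.max_timestamp < cutoff]    # TWCS candidates
    return mutable, frozen
```

The key property steps 4-6 rely on is that the `frozen` set can only shrink via compaction, never grow new columns, so one aggressive merge of the small frozen sstables need never be redone.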
> In the case of large datasets (e.g. 5TB per node) with LCS, there would be
> on the order of seven levels, and hence seven separate writes of the same
> data over time. With this approach, it should be possible to get down to
> about 3 compactions per column (2 during normal compaction and one more
> once hitting TWCS) in most cases, cutting the write workload by a factor of
> two or more for high-volume time-series applications.
> Note that the only workaround I can currently suggest to minimize
> compaction for these workloads is to programmatically shard your data
> across time-window ranges (e.g. a new CF per week), but that pushes
> unnecessary writing and querying logic out to the user and is neither as
> convenient nor as flexible.
> Also note that I am not convinced that the approach I've suggested above is
> the best/most general way to solve the problem, but it does appear to be a
> relatively easy one to implement.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
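The write-amplification arithmetic in the paragraph above (roughly seven rewrites of each column under LCS vs. about three under the proposal) works out as follows; the numbers are the ticket's estimates, not measurements.

```python
def compaction_savings(lcs_writes=7, proposed_writes=3):
    """Factor by which total compaction writes shrink: ~7 rewrites per column
    under LCS at 5TB/node vs ~3 under the proposal (2 in normal compaction
    plus 1 once the data reaches TWCS)."""
    return lcs_writes / proposed_writes

# 7 / 3 is roughly 2.3, i.e. the "factor of two or more" claimed above.
ratio = compaction_savings()
```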