[ 
https://issues.apache.org/jira/browse/CASSANDRA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Deng updated CASSANDRA-10306:
---------------------------------
    Labels: dtcs  (was: )

> Splitting SSTables in time, deleting and archiving SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-10306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10306
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Antti Nissinen
>              Labels: dtcs
>
> This document is a continuation of 
> [CASSANDRA-10195|https://issues.apache.org/jira/browse/CASSANDRA-10195] and 
> describes the need to be able to split SSTables along time boundaries, as 
> also discussed in 
> [CASSANDRA-8361|https://issues.apache.org/jira/browse/CASSANDRA-8361]. The 
> data model is explained briefly, followed by the practical issues of running 
> Cassandra with time series data and the need for splitting capabilities.
> Data model: (snippet from 
> [CASSANDRA-9644|https://issues.apache.org/jira/browse/CASSANDRA-9644])
> The data is time series data, stored so that one row contains a certain time 
> span of data for a given metric (20 days in this case). The row key encodes 
> the start time of the time span and the metric name. The column name gives 
> the offset from the beginning of the time span. The column timestamp is set 
> to the actual timestamp of the data point, i.e. the timestamp from the row 
> key plus the offset. The data model is analogous to the KairosDB 
> implementation.
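> A minimal sketch of that key arithmetic (the class and method names are 
> invented for illustration; only the 20-day span and the key layout come from 
> the description above):
> {code:java}
> import java.util.concurrent.TimeUnit;
> 
> public class TimeSeriesLayout {
>     // One row covers a fixed time span per metric; 20 days here.
>     static final long ROW_SPAN_MS = TimeUnit.DAYS.toMillis(20);
> 
>     /** Row key part: start time of the 20-day span containing the sample. */
>     static long rowStartTime(long sampleTimeMs) {
>         return (sampleTimeMs / ROW_SPAN_MS) * ROW_SPAN_MS;
>     }
> 
>     /** Column name: offset of the sample from the start of its span. */
>     static long columnOffset(long sampleTimeMs) {
>         return sampleTimeMs - rowStartTime(sampleTimeMs);
>     }
> 
>     /** Column timestamp: span start + offset, i.e. the sample time itself. */
>     static long columnTimestamp(long sampleTimeMs) {
>         return rowStartTime(sampleTimeMs) + columnOffset(sampleTimeMs);
>     }
> }
> {code}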
> In the practical application the data is added to the column family in real 
> time. When converting from a legacy system, old data is pre-loaded in 
> chronological order by faking the column timestamps before the real-time 
> data collection starts. However, there is intermittently a need to insert 
> older data as well, either because it has not been available in real time or 
> because additional time series are fed in afterwards due to unforeseeable 
> needs.
> Adding old data simultaneously with real-time data leads to SSTables that 
> contain data from a time period exceeding the length of the compaction 
> window (TWCS and DTCS). Such SSTables therefore do not behave predictably in 
> the compaction process.
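> A sketch of why this happens, assuming the TWCS-style rule that an SSTable 
> is bucketed by its maximum cell timestamp (the one-day window below is an 
> arbitrary example, not a value from this ticket):
> {code:java}
> import java.util.concurrent.TimeUnit;
> 
> public class CompactionWindow {
>     // Example window size; in TWCS this is a per-table setting.
>     static final long WINDOW_MS = TimeUnit.DAYS.toMillis(1);
> 
>     /** TWCS-style bucketing: the SSTable lands in the window of its maximum
>      *  timestamp, no matter how far back its minimum timestamp reaches. */
>     static long windowFor(long maxCellTimestampMs) {
>         return (maxCellTimestampMs / WINDOW_MS) * WINDOW_MS;
>     }
> 
>     public static void main(String[] args) {
>         long now = System.currentTimeMillis();
>         // An SSTable mixing week-old backfill with live data is still
>         // bucketed into the current window, so the old rows never get
>         // compacted together with their peers in the week-old window.
>         System.out.println(windowFor(now));
>     }
> }
> {code}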
> Tombstones mask the data from queries, but releasing the disk space requires 
> that the SSTables containing the tombstones be compacted together with the 
> SSTables holding the original data. When using TWCS or DTCS and writing 
> tombstones with timestamps corresponding to the current time, the SSTables 
> containing the original data will never end up being compacted with the 
> SSTables containing the tombstones. Even when writing tombstones with faked 
> timestamps, the resulting SSTable should be kept apart from the ongoing 
> real-time data; otherwise the SSTables have to be split (see later).
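> For illustration, a tombstone with a faked write timestamp can be issued 
> through CQL's USING TIMESTAMP clause. The keyspace, table, column names, and 
> timestamp values below are made up for the example:
> {code:java}
> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.Session;
> 
> public class FakedTombstone {
>     public static void main(String[] args) {
>         Cluster cluster =
>                 Cluster.builder().addContactPoint("127.0.0.1").build();
>         Session session = cluster.connect("metrics");
>         // Write the tombstone with a timestamp (in microseconds) close to
>         // the original data's write time, so it can end up in the same old
>         // time window instead of the current one.
>         session.execute(
>             "DELETE FROM samples USING TIMESTAMP 1441000000000000 "
>           + "WHERE metric = 'sensor42' AND span_start = 1440979200000");
>         cluster.close();
>     }
> }
> {code}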
> TTL is a working method for deleting data from a column family and releasing 
> disk space in a predictable manner. However, setting the correct TTL is not 
> a trivial task: the required TTL might change, e.g. due to legislation, or 
> because the customer would like a longer lifetime for the data.
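> For example, the table-level default TTL can be raised later, but that only 
> affects cells written after the change; data already on disk keeps the TTL 
> it was inserted with. A sketch with hypothetical keyspace and table names:
> {code:java}
> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.Session;
> 
> public class AdjustTtl {
>     public static void main(String[] args) {
>         Cluster cluster =
>                 Cluster.builder().addContactPoint("127.0.0.1").build();
>         Session session = cluster.connect();
>         // Raise the default TTL from one year to two (63072000 seconds).
>         // Only cells written after this statement get the new TTL; the
>         // existing cells still expire on their original schedule.
>         session.execute(
>             "ALTER TABLE metrics.samples WITH default_time_to_live = 63072000");
>         cluster.close();
>     }
> }
> {code}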
> The other factor affecting disk space consumption is the variability of the 
> rate at which data is fed into the column family. In certain troubleshooting 
> cases the sample rate can be increased tenfold for a large portion of the 
> collected time series. This leads to rapid consumption of disk space, and 
> old data then has to be deleted or archived in such a manner that disk space 
> is released quickly and predictably.
> Losing one or more nodes from the cluster without spare hardware will also 
> lead to a situation where the data from the lost node has to be replicated 
> again to the remaining nodes. This increases disk space consumption per node 
> and probably requires cleaning some older data out of the active column 
> family.
> All of the above issues could of course be handled simply by adding more 
> disk space or nodes to the cluster. In a cloud environment that would be a 
> feasible option. For an application sitting on real hardware in an isolated 
> environment it is not, for practical reasons or because of cost: getting new 
> hardware to a site can take a long time, e.g. due to customs regulations.
> In the application domain (time series data collection) the data is not 
> modified after insertion into the column family. There are only read 
> operations and the deletion / archiving of old data based on the TTL or 
> operator actions.
> The above reasoning leads to the following conclusions and proposals.
> * TWCS and DTCS (with certain modifications) lead to well structured 
> SSTables in which the tables are organized chronologically, giving 
> opportunities to manage the available disk capacity on nodes. Recovering 
> from repairs also works (the flood of small SSTables is compacted with 
> larger ones).
> * Being able to effectively split the SSTables along a given timeline would 
> lead to SSTable sets on all nodes that allow deleting or archiving whole 
> SSTables. What would be the mechanism for inactivating SSTables during 
> deletion / archiving so that nodes don’t start streaming the “missing” data 
> between nodes (repairs)?
> * Being able to split existing SSTables along the multiple timelines 
> determined by TWCS would allow insertion of older data into the column 
> family; that data would eventually be compacted in the desired manner into 
> the correct time window. The original SSTable would be split into several 
> SSTables according to the time windows, and empty SSTables would be 
> discarded in the end (see the sketch after this list).
> * The splitting action would be a tool executed through the nodetool command 
> when needed.
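> A rough sketch of the proposed split, assuming the cells of one SSTable are 
> redistributed by the same window rule TWCS uses for bucketing (all names 
> here are invented; this is not Cassandra internals):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
> import java.util.TreeMap;
> import java.util.concurrent.TimeUnit;
> 
> public class SplitByWindow {
>     // Example window size; in practice the table's TWCS window.
>     static final long WINDOW_MS = TimeUnit.DAYS.toMillis(1);
> 
>     /** Group the cell timestamps of one SSTable by compaction window.
>      *  Each map entry would become one output SSTable; windows with no
>      *  cells produce no entry, matching "empty SSTables would be
>      *  discarded". */
>     static Map<Long, List<Long>> splitByWindow(List<Long> cellTimestamps) {
>         Map<Long, List<Long>> byWindow = new TreeMap<>();
>         for (long ts : cellTimestamps) {
>             long window = (ts / WINDOW_MS) * WINDOW_MS;
>             byWindow.computeIfAbsent(window, w -> new ArrayList<>()).add(ts);
>         }
>         return byWindow;
>     }
> }
> {code}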



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
