[ 
https://issues.apache.org/jira/browse/CASSANDRA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950417#comment-14950417
 ] 

Antti Nissinen edited comment on CASSANDRA-10306 at 10/9/15 1:57 PM:
---------------------------------------------------------------------

The idea is the following:
- Let's keep the TTL as it works currently, so that compaction (DTCS and TWCS) 
can drop SSTables that are fully expired. Splitting the SSTables time-wise 
during compaction (as you proposed) will make sure that there are no SSTables 
covering a large time span that never get dropped (i.e. holding a large number 
of old data points and a few points from the recent past). 
- Sometimes there is a need to clean up data from a column family quickly and 
effectively: we give a timeline, and all the data behind that line is deleted 
or archived to different media (i.e. remove all data older than a given 
timestamp, where age is determined from the timestamp of the column). That 
would require splitting all SSTables that have data on both sides of the 
timeline (see the sketch after this list). If compaction has been working as 
expected, there are probably only a couple of SSTables to split. 
- If all SSTables for a given column family on each node were split according 
to the given timeline, and SSTables could be "inactivated" (dropped from the 
active SSTable set), then the SSTables could be removed or moved somewhere 
else, and repair operations would not start replicating the "missing" data 
during the move operation while the nodes are out of sync.
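
Below is a minimal sketch of the split step itself, in plain Java rather than 
Cassandra internals: Cell and the two in-memory lists are illustrative 
stand-ins for the cells of one SSTable and the two files the split would 
produce.

{code:java}
// Minimal sketch of the timeline split: partition the cells of one SSTable at
// a cutoff timestamp. Cell and the lists are stand-ins for the real SSTable
// machinery; this only illustrates the partitioning step.
import java.util.ArrayList;
import java.util.List;

public class TimelineSplit {
    record Cell(String rowKey, long timestamp) {}

    /** Index 0 = cells older than the cutoff, index 1 = the rest. */
    static List<List<Cell>> splitAt(List<Cell> cells, long cutoffMillis) {
        List<Cell> old = new ArrayList<>();
        List<Cell> recent = new ArrayList<>();
        for (Cell c : cells) {
            (c.timestamp() < cutoffMillis ? old : recent).add(c);
        }
        return List.of(old, recent);
    }

    public static void main(String[] args) {
        List<Cell> cells = List.of(new Cell("m1", 1_000L), new Cell("m1", 5_000L));
        List<List<Cell>> parts = splitAt(cells, 2_000L);
        // The "old" part could be archived or deleted as a whole file while
        // the "recent" part stays in the active SSTable set.
        System.out.println(parts.get(0).size() + " old, " + parts.get(1).size() + " recent");
    }
}
{code}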

That would actually correspond to a temporary change of the global TTL, as you 
said. The TTL is probably saved as an absolute timestamp in the columns, so 
the algorithm should use some kind of offset values to arrive at the desired 
timeline for deletion / archiving.
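
A toy illustration of the offset idea, under the assumption that expiration is 
stored as an absolute timestamp per column (the field names below are made up 
for the example, not Cassandra's actual storage layout):

{code:java}
// Toy model: if the expiration is stored as an absolute timestamp, shifting
// it by an offset makes everything behind the chosen timeline expire "now".
// storedExpirationMillis is an assumed field, not Cassandra's actual layout.
public class OffsetExpiry {
    static boolean expiredAfterShift(long storedExpirationMillis,
                                     long offsetMillis,
                                     long nowMillis) {
        return storedExpirationMillis - offsetMillis <= nowMillis;
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        long now = System.currentTimeMillis();
        long stored = now + 10 * day; // would normally live another 10 days
        // An offset larger than the remaining lifetime expires the column now.
        System.out.println(expiredAfterShift(stored, 11 * day, now)); // true
    }
}
{code}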



> Splitting SSTables in time, deleting and archiving SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-10306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10306
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Antti Nissinen
>
> This document is a continuation of 
> [CASSANDRA-10195|https://issues.apache.org/jira/browse/CASSANDRA-10195] and 
> describes the need to be able to split files time-wise, as also discussed in 
> [CASSANDRA-8361|https://issues.apache.org/jira/browse/CASSANDRA-8361]. The 
> data model is explained briefly, followed by the practical issues of running 
> Cassandra with time series data and the needs for splitting capabilities.
> Data model: (snippet from 
> [CASSANDRA-9644|https://issues.apache.org/jira/browse/CASSANDRA-9644])
> The data is time series data. It is saved so that one row contains a certain 
> time span of data for a given metric (20 days in this case). The row key 
> contains the start time of the time span and the metric name. The column 
> name gives the offset from the beginning of the time span. The column 
> timestamp is set by adding together the timestamp from the row key and the 
> offset (the actual timestamp of the data point). The data model is analogous 
> to the KairosDB implementation.
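> A minimal sketch of that mapping in plain Java (the key format, the 20-day 
> span constant, and the metric name are assumptions made for the example, not 
> the exact on-disk layout):
> {code:java}
> // Sketch of the row/column mapping described above: row key = span start +
> // metric name, column name = offset into the span, column timestamp = span
> // start + offset. Key format and span length are assumed for the example.
> import java.util.concurrent.TimeUnit;
> 
> public class TimeSeriesKeyMapper {
>     static final long SPAN_MILLIS = TimeUnit.DAYS.toMillis(20);
> 
>     public static void main(String[] args) {
>         String metric = "pump.pressure";    // hypothetical metric name
>         long sampleTs = 1_444_000_000_000L; // actual timestamp of the point
> 
>         long spanStart = (sampleTs / SPAN_MILLIS) * SPAN_MILLIS; // row's span
>         long offset    = sampleTs - spanStart;                   // column name
>         long columnTs  = spanStart + offset;                     // column ts
> 
>         System.out.printf("rowKey=%d:%s offset=%d columnTimestamp=%d%n",
>                           spanStart, metric, offset, columnTs);
>     }
> }
> {code}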
> In the practical application, data is added in real time into the column 
> family. When converting from a legacy system, old data is pre-loaded in time 
> order by faking the column timestamps before starting the real-time data 
> collection. However, there is intermittently a need to insert older data 
> into the database as well, either because it has not been available in real 
> time or because additional time series are fed in afterwards due to 
> unforeseeable needs. 
> Adding old data simultaneously with real-time data will lead to SSTables 
> that contain data from a time period exceeding the length of the compaction 
> window (TWCS and DTCS). Therefore the SSTables do not behave in a 
> predictable manner in the compaction process.
> Tombstones mask the data from queries, but releasing the disk space requires 
> that SSTables containing tombstones are compacted together with the SSTables 
> holding the original data. When using TWCS or DTCS and writing tombstones 
> with a timestamp corresponding to the current time, the SSTables containing 
> the original data will never end up being compacted with the SSTables 
> holding the tombstones. Even when writing tombstones with faked timestamps, 
> the SSTable should be written apart from the ongoing real-time data. 
> Otherwise the SSTables have to be split (see later). 
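> A toy illustration of why the tombstones never meet the old data under 
> time-windowed compaction (the window size and the min/max timestamp fields 
> are assumptions for the example, not actual Cassandra code):
> {code:java}
> // Under time-windowed compaction, two SSTables are only compacted together
> // when they fall into the same time window. A tombstone written with the
> // current timestamp lands in a window far away from the years-old data it
> // deletes, so the disk space is never reclaimed.
> public class WindowOverlap {
>     record SSTable(long minTimestamp, long maxTimestamp) {}
> 
>     static long windowOf(long ts, long windowMillis) {
>         return ts - Math.floorMod(ts, windowMillis);
>     }
> 
>     static boolean sameWindow(SSTable a, SSTable b, long windowMillis) {
>         return windowOf(a.maxTimestamp(), windowMillis)
>             == windowOf(b.maxTimestamp(), windowMillis);
>     }
> 
>     public static void main(String[] args) {
>         long day = 24L * 60 * 60 * 1000;
>         SSTable oldData   = new SSTable(0, 5 * day);           // old points
>         SSTable tombstone = new SSTable(400 * day, 400 * day); // delete "now"
>         System.out.println(sameWindow(oldData, tombstone, 20 * day)); // false
>     }
> }
> {code}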
> TTL is a working method for deleting data from a column family and releasing 
> disk space in a predictable manner. However, setting the correct TTL is not 
> a trivial task. The required TTL might change, e.g. due to legislation, or a 
> customer might want a longer lifetime for the data. 
> The other factor affecting disk space consumption is the variability of the 
> rate at which data is fed into the column family. In certain troubleshooting 
> cases the sample rate can be increased tenfold for a large portion of the 
> collected time series. This leads to rapid consumption of disk space, and 
> old data has to be deleted / archived in such a manner that disk space is 
> released quickly and predictably.
> Losing one or more nodes from the cluster without spare hardware will also 
> lead to a situation where data from the lost node has to be replicated again 
> onto the remaining nodes. This increases disk space consumption per node and 
> probably requires cleaning some older data out of the active column family.
> All of the above issues could of course be handled just by adding more disk 
> space or nodes to the cluster. In a cloud environment that would be a 
> feasible option. For an application sitting on real hardware in an isolated 
> environment it is not, for practical reasons or due to costs. Getting new 
> hardware on site might take a long time, e.g. due to customs regulations.
> In the application domain (time series data collection) the data is not 
> modified after it is inserted into the column family. There are only read 
> operations and deletion / archiving of old data based on the TTL or operator 
> actions.
> The above reasoning leads to the following conclusions and proposals.
> * TWCS and DTCS (with certain modifications) lead to well-structured SSTable 
> sets where the tables are organized time-wise, giving opportunities to 
> manage the available disk capacity on the nodes. Recovering from repairs 
> also works (compacting the flood of small SSTables with larger ones).
> * Being able to effectively split the SSTables along a given timeline would 
> lead to SSTable sets on all nodes that allow deleting or archiving whole 
> SSTables. What would be the mechanism to inactivate SSTables during deletion 
> / archiving so that nodes don't start streaming the "missing" data between 
> nodes (repairs)?
> * Being able to split existing SSTables along multiple timelines determined 
> by TWCS would allow insertion of older data into the column family that 
> would eventually be compacted in the desired manner in the correct time 
> window. The original SSTable would be rewritten into several SSTables 
> according to the time windows (see the sketch after this list). In the end, 
> empty SSTables would be discarded.
> * The splitting action would be a tool executed through the nodetool command 
> when needed.
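> A sketch of that multi-window split in plain Java (Cell, the window size, 
> and the bucketing are assumptions made for the example, not the actual TWCS 
> code):
> {code:java}
> // Bucket the cells of one out-of-order SSTable into fixed time windows, so
> // that each bucket can be rewritten as its own SSTable and later compacted
> // inside the correct window. Empty windows simply never get a bucket.
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
> import java.util.TreeMap;
> 
> public class WindowBucketing {
>     record Cell(String rowKey, long timestamp) {}
> 
>     static Map<Long, List<Cell>> bucketByWindow(List<Cell> cells, long windowMillis) {
>         Map<Long, List<Cell>> buckets = new TreeMap<>();
>         for (Cell c : cells) {
>             long windowStart = c.timestamp() - Math.floorMod(c.timestamp(), windowMillis);
>             buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(c);
>         }
>         return buckets;
>     }
> 
>     public static void main(String[] args) {
>         long day = 24L * 60 * 60 * 1000;
>         List<Cell> cells = List.of(new Cell("m1", 1 * day), new Cell("m1", 45 * day));
>         bucketByWindow(cells, 20 * day)
>             .forEach((w, b) -> System.out.println("window " + w + ": " + b.size() + " cells"));
>     }
> }
> {code}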



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
