The deleting compaction strategy from protectwise 
(https://github.com/protectwise/cassandra-util/blob/master/deleting-compaction-strategy/README.md)
was written (I believe) to solve a similar problem - business-based deletion
rules to enable flexible TTLs. May want to glance at that.

Other answers inline below 


-- 
Jeff Jirsa


> On Aug 9, 2017, at 1:41 AM, Steinmaurer, Thomas 
> <thomas.steinmau...@dynatrace.com> wrote:
> 
> Hello,
>  
> our top contributor from a data volume perspective is time series data. We 
> have been running with STCS since our initial production deployment in 2014, 
> with several clusters with a varying number of nodes, currently with a 
> maximum of 9 nodes per cluster per AWS region, on m4.xlarge / EBS gp2 
> storage. We have gone through a range of Cassandra versions starting with 
> 1.2, currently DSC 2.1.15, soon to be replaced by Apache Cassandra 2.1.18 
> across all deployments. Lately we switched from Thrift (Astyanax) to 
> Native/CQL (DataStax driver). Overall we are pretty happy with stability and 
> the scale-out capabilities.
>  
> We store time series data at different resolutions, from 1min up to 1day 
> aggregates per “time slot”. Each resolution has its own column family / 
> table, and a periodic worker executes our business logic for time series 
> aging, e.g. rolling up 1min => 5min => … resolutions, plus deletion in the 
> source resolutions according to our per-resolution retention policy. So 
> deletions happen much later (e.g. at least > 14d after the write). We don’t 
> use TTLs on written time series data (in production; see TWCS testing 
> below), so purging is handled exclusively by explicit DELETEs in our aging 
> business logic, which create tombstones.
>  
> Naturally, with STCS and late explicit deletions / tombstones, it takes a 
> long time to finally reclaim disk space; even worse, we are now running a 
> major compaction every X weeks. We are currently also testing with STCS 
> min_threshold = 2 etc., but all in all this does not feel like a long-term 
> solution. I guess there is nothing else we are missing on the 
> configuration/settings side with STCS? Single-SSTable compaction might not 
> be kicking in either, because checking with sstablemeta, the estimated 
> droppable tombstones for our time series SSTables are pretty much 0.0 all 
> the time. I guess because we don’t write with TTL?


Or you aren't issuing deletes; explicit deletes older than GCGS 
(gc_grace_seconds) will cause that number to increase.
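
If you want to double check what a given SSTable reports, the 2.1 tool is 
sstablemetadata (tools/bin in the tarball); paths and names below are just 
placeholders:

    tools/bin/sstablemetadata /path/to/<ks>/<table>/<sstable>-Data.db | grep -i droppable

and the table's gc_grace_seconds is visible via CQL on 2.1:

    SELECT gc_grace_seconds FROM system.schema_columnfamilies
     WHERE keyspace_name = '<ks>' AND columnfamily_name = '<table>';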

>  
> TWCS caught my eye in 2015 I think, and even more at the Cassandra Summit 
> 2016 + other tombstone-related talks. Cassandra 3.0 is around 6 months away 
> for us, so initial testing was done with 2.1.18 patched with TWCS from 
> GitHub.
> 
> TWCS looks like exactly what we need: in our tests, once we start writing 
> with TTL we end up with a single SSTable per elapsed window, and data 
> (SSTables) older than TTL + grace gets automatically removed from disk. 
> Even with out-of-order DELETEs from our business logic enabled, purging of 
> SSTables does not seem to get stuck. Not sure if this is expected. Writing 
> with TTL is also a bit problematic in case our retention policy changes, 
> either in general or for particular customers.

Search for my Cassandra Summit talk from 2016 - there are a few other 
compaction options you probably want to set to more aggressively trigger 
single-SSTable compaction and help unstick it.
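
For reference, the knobs I mean are the generic tombstone compaction 
sub-properties. A rough sketch (values are only illustrative, not tuned for 
your cluster, and if your TWCS build lives outside 
org.apache.cassandra.db.compaction you'll need the fully qualified class 
name):

    ALTER TABLE <ks>.<table> WITH compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1',
      'unchecked_tombstone_compaction': 'true',
      'tombstone_threshold': '0.2',
      'tombstone_compaction_interval': '86400'
    };

unchecked_tombstone_compaction is usually the important one, since it lets a 
single-SSTable tombstone compaction run even when the SSTable overlaps others.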

>  
> A few questions, as we need some short-term (with C* 2.1) and long-term (with 
> C* 3.0) mitigation:
> - With STCS, estimated droppable tombstones are always 0.0 (so no automatic 
> single-SSTable compaction may happen either): Is this a matter of not 
> writing with TTL? If yes, would enabling TTL with STCS improve the disk 
> reclaim situation, because then single-SSTable compactions would kick in?
> - What are the semantics of “default_time_to_live” at the table level? 
> From http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html : 
> “After the default_time_to_live TTL value has been exceeded, Cassandra 
> tombstones the entire table”. What does “entire table” mean?

It probably means sstable, but even that isn't really accurate - that's a doc 
bug 
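
What it does do: it's just the TTL applied to any write that doesn't specify 
its own, on a per-written-cell basis, and a per-query USING TTL still wins. A 
made-up sketch (table and columns are placeholders):

    ALTER TABLE <ks>.ts_1min WITH default_time_to_live = 1209600;  -- 14 days

    -- this write ignores the table default and lives 30 days instead
    INSERT INTO <ks>.ts_1min (series_id, ts, value)
    VALUES (1, '2017-08-09 00:00:00', 42.0)
    USING TTL 2592000;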

> Hopefully / I guess I won’t end up with an empty table every X elapsed TTLs?
> - Anything else I’m missing regarding STCS and reclaiming disk space 
> earlier in our TS use case?

LCS rewrites much more aggressively on partition updates - if you can spare 
the IO, it's likely going to be more efficient at purging deleted data than 
STCS.
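
If you test it, the switch itself is just a compaction change; something like 
this (the sstable size is only a commonly used value, not a recommendation 
for your EBS setup):

    ALTER TABLE <ks>.<table> WITH compaction = {
      'class': 'LeveledCompactionStrategy',
      'sstable_size_in_mb': '160'
    };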

> - I know changing compaction is a matter of executing ALTER TABLE (or 
> temporarily via JMX for a single node), but as we have legacy data written 
> without TTL, I wonder if we may end up with stuck SSTables again
> - In case of stuck SSTables with any compaction strategy, what is the best 
> way to debug/analyze why they got stuck (overlaps etc.)?

sstableexpiredblockers
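
It lives in tools/bin of the tarball and just takes keyspace + table, run on 
a node that holds the data locally; it prints which SSTables are blocking 
fully expired ones from being dropped:

    tools/bin/sstableexpiredblockers <keyspace> <table>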

>  
> Thanks a lot and sorry for the lengthy email.
>  
> Thomas
