[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638698#comment-17638698
 ] 

Brad Schoening commented on CASSANDRA-18042:
--------------------------------------------

>>  It is easy to delete old sstables off disk. Many operators know how to do 
>>this

It depends on the environment, in a managed service environment, admin users 
likely do not have access. Our users lack even read access to the sstable files 
on disk.  Operators who manage hundreds of clusters can't delete users data 
without having a detailed conversations first.

>> Not true. Tombstone compactions will still occur. There are TWCS options for 
>> tuning this.

Ok, good to know.  The documentation doesn't mention this, could you elaborate?
{quote}In an expiring/TTL workload, the contents of an entire SSTable likely 
expire at approximately the same time, allowing them to be dropped completely,

https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html
{quote}
My recollection was that we had seen a small number of zero-based TTL rows 
keeping large sstables alive with TWCs.  I'll have to double check what we saw. 
Which options can we tune? Aside from setting the window size, the only option 
I only see is unsafe_aggressive_sstable_expiration.

But, I think we're mostly in agreement here.  Having this guardrail default to 
warning is fine.    

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18042
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/Guardrails, Legacy/Core
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.x
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to