[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-02 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642452#comment-17642452
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

Great, I'm glad you find the feedback useful :)

The PR looks perfect to me, +1.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642174#comment-17642174
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

[~adelapena] I will make you co-author if you do not mind. I do not think that 
reviewing so patiently and repeatedly reflects mere "reviewer" enough. I feel 
like we were in this together.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642155#comment-17642155
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Great. So we do not need to change anything. 

I have just changed the order of properties in cassandra.yaml as discussed and 
I improved the documentation.

Since this is just documentation change, I do not consider another build to be 
executed. Provided builds also include 500x on added tests.

[https://github.com/apache/cassandra/pull/2027/files]

j11 
[https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/21487fae-9fb0-4014-b499-f24893e568fc]
j8 
[https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/58e0f2e7-b796-4b32-8062-64e4798e5624]

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642145#comment-17642145
 ] 

Brandon Williams commented on CASSANDRA-18042:
--

I too am on the "warned" bandwagon now.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642142#comment-17642142
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

Ok. Enough bikeshedding; you hit the nail on the head Stefan. It's the _tense_ 
of "_warned" that's tripping me up.

We already have "_enabled" and I'm a firm believer in respecting precedent and 
consistency of context in terms of UX tradeoffs. So:
{code:java}
feature_warned: true
feature_enabled: true{code}
Works for me.

Thanks both of you for being patient with me. :D

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642110#comment-17642110
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Simple "warn" is enough for me. "to warn" is a proper verb in English, exactly 
in that form. "warned" seems strange. I know it has same form as "enabled", 
ideally I would do "_enable" and "_warn". Since we have "enabled" already I 
think we should eventually stick to "warned" as well anyway.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642089#comment-17642089
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

Good, good, we'll getting to agreement :) .

I think that the "_on_use" suffix is similar to "_if_enabled", it can be 
applied to any "feature_*" property but "feature_enabled". So we could have 
"feature_attribute_a_on_use", etc. 

As for using "_warned", "_warning", "_discouraged", etc. I don't have a 
preference. I'm not a native English speaker and I don't have a very good sense 
of how weird any of those sound.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642075#comment-17642075
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

Had time to sleep on it. I'm almost with you now Andres, and we should follow 
up ticket to break out the warn vs. enable on existing guardrails just so we 
don't forget it.

So we're looking at something like:
{code:java}
# feature can be enabled or disabled via guardrail, and operators can optionally
# have it warn the user of edge cases and provide a contact email to reach out 
to
# in the event they have questions about usage
feature_enabled: false
feature_warned: true
feature_attribute_a: 100MiB
feature_attribute_b: false
feature_attribute_b_warned: true
feature_attribute_c: LZ4
{code}
My only outstanding minor uncertainty is whether we go with _warned or 
something slightly different:
{code:java}
feature_warned: true
feature_warning: true
feature_warn_on_use: true
{code}
"feature_warned" seems ambiguous as to whether we warn on feature use or are 
warning the feature of something. I tend to try to achieve clarity at the 
expense of verbosity (... clearly), so I'm partial to the "warn_on_use" form.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641897#comment-17641897
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

The ordering with {{featue_enabled}} at the top makes sense to me. Probably not 
only before the warning but before anything else. So, extending a bit the 
example, it would look like:
{code:java}
sai_indexes_enabled: false
sai_indexes_warned: true
sai_indexes_write_buffer_size: 100MiB
sai_indexes_take_snapshots_enabled: false
sai_indexes_take_snapshots_warned: true
sai_indexes_take_snapshots_compressor: LZ4
{code}
As for the dependency relationship between {{featue_enabled}} and the warning 
expressed by the {{_if_enabled_}} \{_}suffix. I think that dependency doesn't 
happen only for the warning. In fact, all {{feature}}{_}{{{}*{}}} properties 
also depend on {{{}featue_enabled{}}}, since they are all noop if the feature 
is disabled. I think that dependency would be easy to understand if we put 
{{featue_enabled}} and the top, and maybe reinforce the idea with comments on 
the properties. Otherwise, if we were going to be terribly strict with the 
application of the {{_if_enabled}} suffix to all dependant properties, the 
config could be as confusing as:
{code:java}
sai_indexes_enabled: false
sai_indexes_warned_if_enabled: true
sai_indexes_write_buffer_size_if_enabled: 100MiB
sai_indexes_take_snapshots_if_enabled_enabled: false
sai_indexes_take_snapshots_if_enabled_warned_if_enabled: true
sai_indexes_take_snapshots_if_enabled_compressor_if_enabled: LZ4
{code}
I think we don't want that format.

Note also that there is also a similar dependency between the soft and hard 
limits of the other guardrails. For example, triggering the hard limit of a 
threshold will omit the warning. Or using a disallowed value will avoid the 
warning on warned values. Here we are doing the same but with a boolean 
threshold.
{quote}If you do not want to do that, I am not sure how to achieve that, maybe 
having a guardrail which would accept more values?
{code:java}
sai_indexes: {enabled, warned, disabled}
{code}
{quote}
Not a bad idea for boolean guardrails such as {{{}EnableFlag{}}}, although 
harder to achieve with not-boolean guardrails. However, we already have 
multiple {{feature_enabled}} flags. Some of those are implemented as 
guardrails, some aren't. We can't easily change them to a different format, so 
I think it's desirable to keep {{feature_enabled}} for compatibility.
{quote}I think it's worth a little extra time here since we're establishing a 
precedent (warn / fail threshold on feature flagged things)
{quote}
Indeed, a bad decision at naming conventions can stay with us for a long time.
{quote}I think we're kind of bikeshedding between a bunch of dissatisfying 
options here as we try to bolt something on to a feature (guardrails as feature 
flag) that was never intended to have the notion of "optionally warn if someone 
takes the extra step to turn this feature on".
{quote}
Some of the current not-guardrail experimental feature flags also throw a 
warning when the feature is enabled.  Their behaviour is hardcoded and it's not 
very consistent between features. If I'm right, we have warnings for SASI 
indexes and MVs, but not for transient replication or dropping compact storage. 
Adding the notion of the warning to the guardrail gives us a standarized way to 
implement this kind of flags. This can be done without necessarily having a 
yaml property for the warning, it can be hardcoded in the guardrail constructor 
for each case. However, having the yaml property to enable/disable the warning 
would allow users to get rid of the default warning if they think they know 
what they are doing.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted 

[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641791#comment-17641791
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

{code}
sai_indexes_take_snapshots_warned: true
sai_indexes_take_snapshots: false
sai_indexes_warned: true
sai_indexes_enabled: false
{code}

_In the above example, would we allow sai indexes and warn when used? Would we 
disallow them and not warn?_

We would disallow sai indexes and emit a failure so we would not warn. So the 
second option of yours.

For better clarity, I would order the properties in yaml differently:

{code}
# enables taking snapshots on sai indexes, if false, warn flag is irrelevant
sai_indexes_take_snapshots: false
sai_indexes_take_snapshots_warned: true
# enables sai indexes, if false, warn flag is irrelevant
sai_indexes_enabled: false
sai_indexes_warned: true
{code}

The motivation behind that is that the documentation / yaml is read from the 
top to the bottom, not vice versa. If there was "warn" first a reader 
understands it "ok we are going to warn here" just to be informed on the next 
line that she is actually not going to be warned because it is not enabled. 
With this new way of ordering, she can comfortably just skip everything else 
after that.

To wrap it up, yes, this is new way of doing things, dependent properties. If 
you do not want to do that, I am not sure how to achieve that, maybe having a 
guardrail which would accept more values? Like this:

{code}
sai_indexes: {enabled, warned, disabled}
{code}

You know what I mean ... This leads to "custom guardrails" which could deal 
with this kind of stuff. Custom guardrails are part of my Password Generation 
and Validation CEP proposal. If me managed to implement that CEP, this 
behaviour would be available virtually for free.



> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641585#comment-17641585
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

I see it being differently vague with the _warned case:
{code}
sai_indexes_take_snapshots_warned: true
sai_indexes_take_snapshots: false
sai_indexes_warned: true
sai_indexes_enabled: false
{code}
In the above example, would we allow sai indexes and warn when used? Would we 
disallow them and not warn?

Whatever form we take, if we say the warning is contingent on the feature being 
enabled that's different than our other guardrails where we have an assumption 
that all features are enabled but they warn and fail at certain thresholds. 
This new paradigm (warning on feature flagged things) is overloading the notion 
of "warned" to something who's state is _dependent_ on another configuration 
option if I'm not mistaken.

So in the above example, if we're saying "sai_indexes will not be configured 
and the _warned option is dead unless the _enabled is flipped", there's an 
undocumented coupling between them that's also divergent from how the other 
guardrails use _warn and _fail.

{code}
sai_indexes_take_snapshots_if_enabled
{code}
I wasn't advocating for this format. Only specifically for "warn_if_enabled" 
for feature flags implemented through guardrails.

So, all this said: not a hill I'm willing to die on. I think we're kind of 
bikeshedding between a bunch of dissatisfying options here as we try to bolt 
something on to a feature (guardrails as feature flag) that was never intended 
to have the notion of "optionally warn if someone takes the extra step to turn 
this feature on".

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641580#comment-17641580
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

Yep, it's not only redundant but it can also produce ambiguity in some cases. I 
guess it might have worked better if we had the nested guardrail config that 
was initially proposed on the CEP, but with the agreed flat config now it can 
clash with other properties. 

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641570#comment-17641570
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

In this case I would go like 
{code:java}
sai_indexes_take_snapshots: true
sai_indexes_warn_if_enabled: true
sai_indexes_enabled: true {code}
Because in order to be able to take snapshots of sai indexes, sai indexes have 
to be enabled in the first place so "if_enabled" is redundant.

and you could eventually do
{code:java}
sai_indexes_take_snapshots_warn_if_enabled: true
sai_indexes_take_snapshots: true
sai_indexes_warn_if_enabled: true
sai_indexes_enabled: true  {code}
I personally do not see anything wrong with "sai_indexes_warned" or 
"zero_ttl_on_twcs_warned" format. that "if_enabled" is just confusing IMO.

zero_ttl_on_twcs_enabled and zero_ttl_on_twcs_warned is just fine, same logic 
is done as for "if_enabled" cases, it is just shorter.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641566#comment-17641566
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

I understand that feature flags could be implemented with an {{EnableFlag}} as 
it's defined in the current PR, so they have separate properties for warning 
and enabling/disabling.

A curious detail of "feature_warn_if_enabled" is that any other property 
associated to the feature would equally only happen if "feature_enabled". So, 
let's say that we have a guardrail for SAI indexes:
{code}
sai_indexes_warn_if_enabled: true
sai_indexes_enabled: true
{code}
If we add a new property to, for example, taking snapshots of the index, should 
we use:
{code}
sai_indexes_take_snapshots_if_enabled: true
sai_indexes_warn_if_enabled: true
sai_indexes_enabled: true
{code}
or
{code}
sai_indexes_take_snapshots: true
sai_indexes_warn_if_enabled: true
sai_indexes_enabled: true
{code}
?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641563#comment-17641563
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Yes it is same relationship, because if "warn_if_enabled" is true and 
"feature_enabled" is false it would fail the query so warning is redundant. I 
think this is designed just fine.

What I got from your last sentence is that we might maybe return to having one 
property and warning every time? Did I interpret it right?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641560#comment-17641560
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

{quote}Just to understand the mechanics better - "warn_if_enabled", that "if" 
means that "if zero_ttl_on_twcs_enabled is true" ?
{quote}
To your point Stefan, it is a touch unwieldy. But basically "feature_warned" 
would only actually happen if "feature_enabled" so it's the same dependent 
relationship.

There's the meta question here: should we have warnings for feature flag 
enabled features? Basically, all the things we feature flag guardrailed are 
"you could be using this wrong and/or be setting yourself up for future pain" 
style features too right?

So maybe the answer here is we should have the ability to warn on all of them 
that gets passed back to clients w/context on their usage and potential 
pitfalls like this one.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641503#comment-17641503
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Just to understand the mechanics better - "warn_if_enabled", that "if" means 
that "if zero_ttl_on_twcs_enabled is true" ? 

I mean, I am not against this, but having "if" inside the property is quite 
awkward. I think this is the first case we have in the whole yaml with 
something different from a noun or verb or adjective ... you feel me.


> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641490#comment-17641490
 ] 

Brandon Williams commented on CASSANDRA-18042:
--

In feature_warn_if_enabled vs feature_warned, I think I'm with Josh on the 
former now.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641454#comment-17641454
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

Whatever we choose, I think that it would be desirable to keep the 
{{feature_enabled}} format for the hard limit because of the point mentioned on 
[this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-18042?focusedCommentId=17640654=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17640654].
 Long story short, we already have properties with that format on 4.1 and not 
changing the convention would allow us to add soft/warn limits to them. 

Both
{code}
feature_warned: true
feature_enabled: true
{code}
and 
{code}
feature_warn_if_enabled: true
feature_enabled: true
{code}
work for me. 

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641446#comment-17641446
 ] 

Brandon Williams commented on CASSANDRA-18042:
--

I think that works, I wasn't hugely fond of 'fail_enabled' either.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641441#comment-17641441
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

1. I'm terrible at naming things. Grain of salt me.
2. I think it's worth a little extra time here since we're establishing a 
precedent (warn / fail threshold on feature flagged things)

Brandon's point I think needs a touch further tweaking; the "_fail_enabled" is 
confusing to me.

The format of:
{quote}
feature_warn_if_enabled: true
feature_enabled: true
{quote}
while not internally "consistent" with one another makes the most clear logical 
sense to me.

So we'd concretely have:
{quote}
zero_ttl_on_twcs_warn_if_enabled: true
zero_ttl_on_twcs_enabled: true
{quote}

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641427#comment-17641427
 ] 

Brandon Williams commented on CASSANDRA-18042:
--

I think it would be a little more consistent with other properties as:

zero_ttl_on_twcs_warn_enabled: true
zero_ttl_on_twcs_fail_enabled: true

My reasoning being that 'warn_threshold' and 'fail_threshold' are common, but 
since these aren't thresholds we go with the established 'enabled' instead.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641407#comment-17641407
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

{code}
zero_ttl_on_twcs_warned: true
zero_ttl_on_twcs_enabled: true
{code}
Works for me.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-30 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641153#comment-17641153
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

[~jmckenzie] [~bschoeni] everybody comfortable with the latest changes?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640797#comment-17640797
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Ok, will be done on commit.

Lets wait for others to have some feedback if they are OK with that as well.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640728#comment-17640728
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

Thanks, looks good to me, +1 if CI finishes ok. 

We should also wait a bit to see if others are also happy with the naming 
convention. That convention should in principle be used for other similar 
future guardrails. That includes the case of adding configurable warnings to 
existing enable flags, which should be very easy thanks to this patch :)

Just a nit, in the spirit of CASSANDRA-17967 we should probably add to 
{{cassandra.yaml}} a brief explanation of why the guardrail exists. That 
explanation would be very similar, if not identical, to the {{reason}} 
attribute used in the guardrail. So, for example
{code}
# Guardrail to warn and/or enable a CREATE or ALTER TABLE statement when 
default_time_to_live is set to 0
# and the table is using TimeWindowCompactionStrategy compaction or a subclass 
of it.
# It is suspicious to use default_time_to_live set to 0 with 
TimeWindowCompactionStrategy because the data that hasn't
# been inserted or updated with a TTL will not start to automatically expire 
after they are older than a respective
# compaction window unit of a certain size.
#zero_ttl_on_twcs_warned: true
#zero_ttl_on_twcs_enabled: true
{code}


> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640705#comment-17640705
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

ok done here: https://github.com/apache/cassandra/pull/2027/files

j11 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/21487fae-9fb0-4014-b499-f24893e568fc
j8 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/58e0f2e7-b796-4b32-8062-64e4798e5624

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640700#comment-17640700
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

done here: https://github.com/apache/cassandra/pull/2027/files

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640671#comment-17640671
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Naming looks good to me. So it means I need to rename it all yet again and we 
need to "invert" the logic on that enabled flag. I am on it ...

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640654#comment-17640654
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

I agree that the names of the guardrail properties should start by the 
guardrail name to keep consistency.

I like the `warn/fail` naming because it's easy to understand and it aligns 
well with the naming used for thresholds:
{code:java}
zero_default_ttl_on_twcs_warn: true
zero_default_ttl_on_twcs_fail: false
{code}
However, the downside of this approach is that it breaks the {{*_disabled}} 
convention that we are using on other enable flags. So if we wanted to add a 
warning to one of the existing enable flags, we would need to change the naming 
of the old property,

For example, lets take the currently existing guardrail to disable secondary 
indexes:
{code:java}
secondary_indexes_enabled: true
{code}
If at some point we wanted to also add a warning discouraging the use of 
secondary indexes, the naming with new convention would be:
{code:java}
secondary_indexes_warn: false
secondary_indexes_fail: false
{code}
But we can't replace the old property because of backward compatibility. That 
won't happen if the conventions for warn/fail boolean guardrails were:
{code:java}
secondary_indexes_warned: false
secondary_indexes_enabled: true
{code}
That's why I would consider using:
{code:java}
zero_ttl_on_twcs_warned: true
zero_ttl_on_twcs_enabled: true
{code}
So we have a naming convention that allows to easily transform fail-only 
boolean guardrails into warn-or-fail boolean guardrails.

wdyt?

Besides the naming discussion, I have left a few minor suggestion on the PR, 
which mostly looks good to me.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640577#comment-17640577
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Incorporated renaming of properties as suggested above here: 
https://github.com/apache/cassandra/pull/2027/files

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-28 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640233#comment-17640233
 ] 

Ekaterina Dimitrova commented on CASSANDRA-18042:
-

I like the suggestions made by [~smiklosovic] and [~adelapena]  as they are in 
line with the guardrails framework. Diverging can be confusing for users IMHO

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-28 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640231#comment-17640231
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

What about:

{code}
zero_ttl_on_twcs_fail: true
zero_ttl_on_twcs_warn: true
{code}

I do not think it is not a good idea to start property in cassandra.yaml on 
"warn" or "fail". This is as well 1 character shorter :)

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-28 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640210#comment-17640210
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

{quote}#zero_default_ttl_on_twcs_disallowed: false
{quote}
Could we go with the somewhat simpler:
{code:java}
warn_on_0_ttl_for_twcs: true
fail_on_0_ttl_for_twcs: false
{code}
Or something along those lines? This parameter is different than the other 
_enabled guardrails as they're binary "on/off" switches, effectively feature 
flags, whereas this is a "you could be using this wrong and/or be setting 
yourself up for future pain" style guardrail.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-28 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639854#comment-17639854
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

OK, we will go with default warning and optional failure. Two flags. That 
translates to this PR (1)

[~adelapena] could  you please go through it one more time to be sure? I will 
provide all the builds then.

(1) https://github.com/apache/cassandra/pull/2027/commits

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-25 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638698#comment-17638698
 ] 

Brad Schoening commented on CASSANDRA-18042:


>>  It is easy to delete old sstables off disk. Many operators know how to do 
>>this

It depends on the environment, in a managed service environment, admin users 
likely do not have access. Our users lack even read access to the sstable files 
on disk.  Operators who manage hundreds of clusters can't delete users data 
without having a detailed conversations first.

>> Not true. Tombstone compactions will still occur. There are TWCS options for 
>> tuning this.

Ok, good to know.  The documentation doesn't mention this, could you elaborate?
{quote}In an expiring/TTL workload, the contents of an entire SSTable likely 
expire at approximately the same time, allowing them to be dropped completely,

https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html
{quote}
My recollection was that we had seen a small number of zero-based TTL rows 
keeping large sstables alive with TWCs.  I'll have to double check what we saw. 
Which options can we tune? Aside from setting the window size, the only option 
I only see is unsafe_aggressive_sstable_expiration.

But, I think we're mostly in agreement here.  Having this guardrail default to 
warning is fine.    

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-25 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638619#comment-17638619
 ] 

Michael Semb Wever commented on CASSANDRA-18042:


bq. If you use a zero TTL with Time Window Compaction, the number of SStables 
on disk will grow infinitely. 

It is easy to delete old sstables off disk. Many operators know how to do this, 
and want to take this approach because the TTL is unknown (e.g. based on 
storage costs and not time).

The majority of use-cases TTL I've seen set TTL on writes and not the table 
schema. A fail by default guardrail would be an unpleasant surprise for them. 
Setting TTL only on writes is perfectly acceptable as well (e.g. variable TTL).

bq. it may surprise users who convert a table from Size Tiered to TWC when they 
usually, but not always, insert a row level TTL.

The existing data when it was STCS could well be without TTL. The guardrail 
cannot solve this.


bq. My understanding is that TWC will never clean up a SSTable if even a single 
row has a zero TTL.  

Not true. Tombstone compactions will still occur. There are TWCS options for 
tuning this.

bq. This is what we have seen with users in production who have configured TWC 
and used a zero TTL. 

This is the purpose of implementing the fail guardrail. But there are good 
reasons why it should not be the default.



> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638582#comment-17638582
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

[~mck] could you go through last few comments, please? I would really like to 
know your take on this, again. 

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-25 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638580#comment-17638580
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

_They had been depending on row level TTLs which sometimes didn't get set._

I understand that, but that is their mistake. If they added it, it would not 
happen. For that reason a warning as default seems to be better. We can tweak 
the message saying that they need to TTL their inserts if they expect data to 
be expired as table settings wont do it. It is something else you set it to 
fail on your side. You are definitely allowed to do that. It just should not be 
the default.

Honestly I do not understand from where people got this idea that TWCS is 
supposed to remove data automagically. 

_it may surprise users who convert a table from Size Tiered to TWC_ 

When somebody is changing compaction strategies, they are doing it with some 
goal in their mind, it needs to serve some purpose. What purpose it has in 
their case? What they think will happen doing that?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-24 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638358#comment-17638358
 ] 

Brad Schoening commented on CASSANDRA-18042:


[~smiklosovic] [~adelapena]

I would prefer to have *fail* as the default, but I'm ok with it being 
configurable as a warning.  We will set it to fail on the clusters we manage.  
Definitely would like to have to ability to configure the guard rail to fail.

If you use a zero TTL with Time Window Compaction, the number of SStables on 
disk will grow infinitely.  This will cause performance problems (too many 
sstables) and eventually exhaust the available disk space.  This is what we 
have seen with users in production who have configured TWC and used a zero TTL. 
 They had been depending on row level TTLs which sometimes didn't get set.

If we apply the _Principle of Least Surprise_ to guard rails, it may surprise 
users who convert a table from Size Tiered to TWC when they usually, but not 
always, insert a row level TTL.  In a few months they will have a serious 
problem with performance and disk space. 

My understanding is that TWC will never clean up a SSTable if even a single row 
has a zero TTL.  

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-24 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638277#comment-17638277
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

reviewed branch for the case we are always warning is here: 
[https://github.com/apache/cassandra/pull/2025/files]

the second approach, with warning property, is here: 
[https://github.com/apache/cassandra/pull/2027/files]

Thanks [~adelapena].

I would like to have more feedback if we indeed want to always warn or not.

Brad does not mind to have it always warning. Andres thinks that it depends on 
whether the zero TTL is always risky or not and if we think that warning would 
bother some users then there should be a possibility to turn it off but we 
would turn it on by default.  I do not have any strong preference but I think 
that having warning property available but turned on by default and it would be 
possible to turn it off is a good compromise.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637382#comment-17637382
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

I ve just finished this version in a different PR (we can just throw it away if 
decided as not worthy to merge)

That one includes "ensureDisabled" methods and the logic is done as I suggested 
above. Properties are 

{code}
zero_default_ttl_on_twcs_disallowed: false
zero_default_ttl_on_twcs_warned: true
{code}

I think that adding "ensureDisabled" methods to EnableFlag are quite handy and 
it is easy to code and it improves readability when we do not have to revert 
that logic in the code. 

https://github.com/apache/cassandra/pull/2027/files

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637373#comment-17637373
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

I was thinking on {{_warned}} instead of {{{}_warn{}}}, so it's more similar to 
the existing {{{}_enabled{}}}:
{code:java}
zero_default_ttl_on_twcs_warned: true
zero_default_ttl_on_twcs_enabled: true
{code}
Or it could also be:
{code:java}
zero_default_ttl_on_twcs_discouraged: true
zero_default_ttl_on_twcs_enabled: true
{code}
It's a bit odd however that the two boolean properties have different 
behaviours, where {{_enabled}} triggers the guardrails when it's {{{}false{}}}, 
whereas and {{_warned}} triggers it when it's {{{}true{}}}.

That would be solved by:
{code:java}
zero_default_ttl_on_twcs_fail: false
zero_default_ttl_on_twcs_warn: true
{code}
However, we already have multiple guardrails using {{_enabled}} in 4.1, and we 
have to preserve them.

So using {{_fail}} for this guardrail would mean that {{EnableFlag}} sometimes 
would receive {{_enabled}} properties, and sometimes would receive {{_fail}} 
properties. That's not ideal and it might be problematic if at some point we 
want to automatically generate property names from guardrail names.

Maybe [~jmckenzie], who has added a bunch of enable flags, has an opinion on 
this.

As for how we manage the true/false properties, we could easily reverse the 
predicates that we pass to the constructor.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637349#comment-17637349
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Thanks [~adelapena] for the review. I ve already incorporated your suggestions.

I do not mind to call it zero_default_ttl_on_twcs_enabled. With comments 
included in cassandra.yaml it is pretty descriptive. However, I left MBean 
method as it was, people invoking that does not have any documentation to look 
into when called by tooling and it is better to be as much specific as 
possible. 

PR reflects these changes.

So the only question is whether we should have warning flag too. For the 
maximum flexibility we may include it. It is acceptable to have it if it is by 
default enabled. 

If we are going to have these properties:

{code}
zero_default_ttl_on_twcs_enabled: true
zero_default_ttl_on_twcs_warn: true
{code}

I think it would be better to call it like this:

{code}
zero_default_ttl_on_twcs_fail: false
zero_default_ttl_on_twcs_warn: true
{code}

We would then need to "revert" the logic so it would look like (notice negation 
in the body of that if)

{code}
if (params.defaultTimeToLive == 0
&& !SchemaConstants.isSystemKeyspace(keyspaceName)
&& 
TimeWindowCompactionStrategy.class.isAssignableFrom(params.compaction.klass()))
!Guardrails.zeroDefaultTTLOnTWCSEnabled.ensureEnabled(state);
{code}

This is a little bit hard to follow, what would be nice to have is to have 
methods called "ensureDisabled" which is more natural to me so it would be like

{code}
if (params.defaultTimeToLive == 0
&& !SchemaConstants.isSystemKeyspace(keyspaceName)
&& 
TimeWindowCompactionStrategy.class.isAssignableFrom(params.compaction.klass()))
Guardrails.zeroDefaultTTLOnTWCS.ensureDisabled(state);
{code}

what do you think?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637316#comment-17637316
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

By the way, the name of the property is very long, what would you think about 
{{zero_default_ttl_on_twcs_enabled}}? Too cryptic even with the nice comments 
that we have around it?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637312#comment-17637312
 ] 

Andres de la Peña commented on CASSANDRA-18042:
---

The approach looks good to me. As previously discussed on Slack conversation, I 
think it makes sense to have a soft limit on {{EnableFlag}} for this kind of 
things, and it might come handy in the future for other stuff like 
guardrail-based experimental flags.

As of whether we need to have a 
{{zero_default_ttl_on_time_window_compaction_strategy_warned}} to configure the 
warning, I guess that depends on whether the zero TTL is always risky or not. 
If we think the warning might bother some users then we should have the 
{{zero_default_ttl_on_time_window_compaction_strategy_warned}} property, 
probably enabled by default.

I have left a bunch of comments on the PR, which overall is looking good. Most 
are about nits and testing. The only slightly more serious thing is the 
duplicated emission of warnings when {{ClientState}} is {{null}}. That happens 
on background processes that are guarded but shouldn't be interrupted by 
exceptions. We some of these for disk size, collection size, etc.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637154#comment-17637154
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Thanks [~bschoeni]

_It is not recommended to disable ..._ 

I do not think this is the right way to put it. It is strategy as any other. It 
just has some consequences to keep in mind. It is as if we do not want people 
to use TWCS with ttl = 0. That is not completely true, they can definitely use 
such configuration combination. There is just some not-so-obvious caveat 
attached we try to point out here.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-21 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636948#comment-17636948
 ] 

Brad Schoening commented on CASSANDRA-18042:


[~smiklosovic] 

Looks great, I agree that there is probably no need to turn off warnings.

I tweaked the warning a little bit if you like this:
{quote}It is not recommended to disable Time To Live (TTL of zero) with Time 
Window Compaction Strategy. If data is not configured to expire with a TTL, the 
number of sstables for time windows will continually increase.
{quote}
 

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-21 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636889#comment-17636889
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

[~adelapena] it would be awesome if you took a closer look to this patch.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634055#comment-17634055
 ] 

Brad Schoening commented on CASSANDRA-18042:


+1 on the warn and fail guardrails.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634037#comment-17634037
 ] 

Brandon Williams commented on CASSANDRA-18042:
--

+1 to having warn and fail thresholds like other guardrails and having the 
default be a warning.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634032#comment-17634032
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

Emitting just a warning by default makes sense to me. It should be possible to 
configure it in such a way that it will fail but that would need to be set 
explicitly. 

I will look around what guardrail is the most appropriate to complete this.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634016#comment-17634016
 ] 

Michael Semb Wever commented on CASSANDRA-18042:


Heuristically, I see more folk use write TTL than default_time_to_live on TWCS 
tables.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634008#comment-17634008
 ] 

Stefan Miklosovic commented on CASSANDRA-18042:
---

I do not mind to have it like Josh says. What is more probable?

a) that people will use TWCS and they forget to set default_time_to_live to 
something else from 0 and 
OR
b) we leave it as it is and the operators will go over all guardrails on a 
rainy Sunday and say "ah this one is nice, maybe we should use it just to be 
sure".

I would say that a) is more probable. People solve problems most often 
re-actively. It is better if there is a safety net and they do not even notice. 
If power users want to have it with 0, they would need to turn this off 
explicitly and that is I guess is minority.

Does this relates to what you see in the field?

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS

2022-11-14 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17633975#comment-17633975
 ] 

Josh McKenzie commented on CASSANDRA-18042:
---

+1 to the guardrail.

What do we think about the guardrail defaulting to on / disabling the ability 
to use this combination? That'd allow existing infra that's doing this to 
continue to work but provide some protection / self-learning for future users. 
The downside to that would be operators who upgraded to the next major and 
didn't catch that in NEWS.txt would have to flip that switch but as long as 
it's hot-proppable should be minimally annoying.

> Implement a guardrail for not having zero default ttl on tables with TWCS
> -
>
> Key: CASSANDRA-18042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18042
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Guardrails, Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.x
>
>
> A user was surprised that his data have not started to expire after 90 days 
> on his TWCS, he noticed that default_time_to_live on the table was set to 0 
> (by accident from his side) and inserts were using TTL = 0 too.
> It is questionable why it it possible to create a table with TWCS and enable 
> a user to specify default_time_to_live to be zero.
> On the other hand, I would argue that having default_time_to_live set to 0 on 
> TWCS does not necessarily mean that such combination is illegal. It is about 
> people just using that with advantage very often so tables are compacted away 
> nicely. However, that does not have to mean that they could not use it with 
> 0. But I yet have to see a use-case where TWCS was used and default ttl was 
> set to 0 on purpose. Merely looking into Cassandra codebase, there are only 
> cases when this parameter is not 0.
> There are three approaches:
> 1) just reject such statements (for CreateTable and AlterTable statements) 
> where default_time_to_live = 0
> 2) Implement a guardrail for 1) so it can be enabled / disabled on demand
> 3) Leave possibility to set default_time_to_live to 0 on a table but make a 
> guardrail for UpdateStatement so it might reject queries for tables with 
> default_time_to_live is zero and for which its TTL (on that update statement) 
> is set to 0 too.
> I would be careful about making the current configuration illegal because of 
> backward compatibility. For that reason 2) makes the most sense to me.
> Maybe implementing 3) would make sense as well. There might be a table which 
> has default ttl set to 0 as it expects a user to supply TTL every time. 
> However, as it is not currently enforced anywhere, a client might still 
> insert TTLs to be set to 0 even by accident.
> POC for 2) is here 
> https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org