[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642452#comment-17642452 ] Andres de la Peña commented on CASSANDRA-18042: --- Great, I'm glad you find the feedback useful :) The PR looks perfect to me, +1. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642174#comment-17642174 ] Stefan Miklosovic commented on CASSANDRA-18042: --- [~adelapena] I will make you co-author if you do not mind. I do not think that reviewing so patiently and repeatedly reflects mere "reviewer" enough. I feel like we were in this together. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642155#comment-17642155 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Great. So we do not need to change anything. I have just changed the order of properties in cassandra.yaml as discussed and I improved the documentation. Since this is just documentation change, I do not consider another build to be executed. Provided builds also include 500x on added tests. [https://github.com/apache/cassandra/pull/2027/files] j11 [https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/21487fae-9fb0-4014-b499-f24893e568fc] j8 [https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/58e0f2e7-b796-4b32-8062-64e4798e5624] > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642145#comment-17642145 ] Brandon Williams commented on CASSANDRA-18042: -- I too am on the "warned" bandwagon now. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642142#comment-17642142 ] Josh McKenzie commented on CASSANDRA-18042: --- Ok. Enough bikeshedding; you hit the nail on the head Stefan. It's the _tense_ of "_warned" that's tripping me up. We already have "_enabled" and I'm a firm believer in respecting precedent and consistency of context in terms of UX tradeoffs. So: {code:java} feature_warned: true feature_enabled: true{code} Works for me. Thanks both of you for being patient with me. :D > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642110#comment-17642110 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Simple "warn" is enough for me. "to warn" is a proper verb in English, exactly in that form. "warned" seems strange. I know it has same form as "enabled", ideally I would do "_enable" and "_warn". Since we have "enabled" already I think we should eventually stick to "warned" as well anyway. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642089#comment-17642089 ] Andres de la Peña commented on CASSANDRA-18042: --- Good, good, we'll getting to agreement :) . I think that the "_on_use" suffix is similar to "_if_enabled", it can be applied to any "feature_*" property but "feature_enabled". So we could have "feature_attribute_a_on_use", etc. As for using "_warned", "_warning", "_discouraged", etc. I don't have a preference. I'm not a native English speaker and I don't have a very good sense of how weird any of those sound. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642075#comment-17642075 ] Josh McKenzie commented on CASSANDRA-18042: --- Had time to sleep on it. I'm almost with you now Andres, and we should follow up ticket to break out the warn vs. enable on existing guardrails just so we don't forget it. So we're looking at something like: {code:java} # feature can be enabled or disabled via guardrail, and operators can optionally # have it warn the user of edge cases and provide a contact email to reach out to # in the event they have questions about usage feature_enabled: false feature_warned: true feature_attribute_a: 100MiB feature_attribute_b: false feature_attribute_b_warned: true feature_attribute_c: LZ4 {code} My only outstanding minor uncertainty is whether we go with _warned or something slightly different: {code:java} feature_warned: true feature_warning: true feature_warn_on_use: true {code} "feature_warned" seems ambiguous as to whether we warn on feature use or are warning the feature of something. I tend to try to achieve clarity at the expense of verbosity (... clearly), so I'm partial to the "warn_on_use" form. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641897#comment-17641897 ] Andres de la Peña commented on CASSANDRA-18042: --- The ordering with {{featue_enabled}} at the top makes sense to me. Probably not only before the warning but before anything else. So, extending a bit the example, it would look like: {code:java} sai_indexes_enabled: false sai_indexes_warned: true sai_indexes_write_buffer_size: 100MiB sai_indexes_take_snapshots_enabled: false sai_indexes_take_snapshots_warned: true sai_indexes_take_snapshots_compressor: LZ4 {code} As for the dependency relationship between {{featue_enabled}} and the warning expressed by the {{_if_enabled_}} \{_}suffix. I think that dependency doesn't happen only for the warning. In fact, all {{feature}}{_}{{{}*{}}} properties also depend on {{{}featue_enabled{}}}, since they are all noop if the feature is disabled. I think that dependency would be easy to understand if we put {{featue_enabled}} and the top, and maybe reinforce the idea with comments on the properties. Otherwise, if we were going to be terribly strict with the application of the {{_if_enabled}} suffix to all dependant properties, the config could be as confusing as: {code:java} sai_indexes_enabled: false sai_indexes_warned_if_enabled: true sai_indexes_write_buffer_size_if_enabled: 100MiB sai_indexes_take_snapshots_if_enabled_enabled: false sai_indexes_take_snapshots_if_enabled_warned_if_enabled: true sai_indexes_take_snapshots_if_enabled_compressor_if_enabled: LZ4 {code} I think we don't want that format. Note also that there is also a similar dependency between the soft and hard limits of the other guardrails. For example, triggering the hard limit of a threshold will omit the warning. Or using a disallowed value will avoid the warning on warned values. Here we are doing the same but with a boolean threshold. {quote}If you do not want to do that, I am not sure how to achieve that, maybe having a guardrail which would accept more values? {code:java} sai_indexes: {enabled, warned, disabled} {code} {quote} Not a bad idea for boolean guardrails such as {{{}EnableFlag{}}}, although harder to achieve with not-boolean guardrails. However, we already have multiple {{feature_enabled}} flags. Some of those are implemented as guardrails, some aren't. We can't easily change them to a different format, so I think it's desirable to keep {{feature_enabled}} for compatibility. {quote}I think it's worth a little extra time here since we're establishing a precedent (warn / fail threshold on feature flagged things) {quote} Indeed, a bad decision at naming conventions can stay with us for a long time. {quote}I think we're kind of bikeshedding between a bunch of dissatisfying options here as we try to bolt something on to a feature (guardrails as feature flag) that was never intended to have the notion of "optionally warn if someone takes the extra step to turn this feature on". {quote} Some of the current not-guardrail experimental feature flags also throw a warning when the feature is enabled. Their behaviour is hardcoded and it's not very consistent between features. If I'm right, we have warnings for SASI indexes and MVs, but not for transient replication or dropping compact storage. Adding the notion of the warning to the guardrail gives us a standarized way to implement this kind of flags. This can be done without necessarily having a yaml property for the warning, it can be hardcoded in the guardrail constructor for each case. However, having the yaml property to enable/disable the warning would allow users to get rid of the default warning if they think they know what they are doing. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables a
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641791#comment-17641791 ] Stefan Miklosovic commented on CASSANDRA-18042: --- {code} sai_indexes_take_snapshots_warned: true sai_indexes_take_snapshots: false sai_indexes_warned: true sai_indexes_enabled: false {code} _In the above example, would we allow sai indexes and warn when used? Would we disallow them and not warn?_ We would disallow sai indexes and emit a failure so we would not warn. So the second option of yours. For better clarity, I would order the properties in yaml differently: {code} # enables taking snapshots on sai indexes, if false, warn flag is irrelevant sai_indexes_take_snapshots: false sai_indexes_take_snapshots_warned: true # enables sai indexes, if false, warn flag is irrelevant sai_indexes_enabled: false sai_indexes_warned: true {code} The motivation behind that is that the documentation / yaml is read from the top to the bottom, not vice versa. If there was "warn" first a reader understands it "ok we are going to warn here" just to be informed on the next line that she is actually not going to be warned because it is not enabled. With this new way of ordering, she can comfortably just skip everything else after that. To wrap it up, yes, this is new way of doing things, dependent properties. If you do not want to do that, I am not sure how to achieve that, maybe having a guardrail which would accept more values? Like this: {code} sai_indexes: {enabled, warned, disabled} {code} You know what I mean ... This leads to "custom guardrails" which could deal with this kind of stuff. Custom guardrails are part of my Password Generation and Validation CEP proposal. If me managed to implement that CEP, this behaviour would be available virtually for free. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641585#comment-17641585 ] Josh McKenzie commented on CASSANDRA-18042: --- I see it being differently vague with the _warned case: {code} sai_indexes_take_snapshots_warned: true sai_indexes_take_snapshots: false sai_indexes_warned: true sai_indexes_enabled: false {code} In the above example, would we allow sai indexes and warn when used? Would we disallow them and not warn? Whatever form we take, if we say the warning is contingent on the feature being enabled that's different than our other guardrails where we have an assumption that all features are enabled but they warn and fail at certain thresholds. This new paradigm (warning on feature flagged things) is overloading the notion of "warned" to something who's state is _dependent_ on another configuration option if I'm not mistaken. So in the above example, if we're saying "sai_indexes will not be configured and the _warned option is dead unless the _enabled is flipped", there's an undocumented coupling between them that's also divergent from how the other guardrails use _warn and _fail. {code} sai_indexes_take_snapshots_if_enabled {code} I wasn't advocating for this format. Only specifically for "warn_if_enabled" for feature flags implemented through guardrails. So, all this said: not a hill I'm willing to die on. I think we're kind of bikeshedding between a bunch of dissatisfying options here as we try to bolt something on to a feature (guardrails as feature flag) that was never intended to have the notion of "optionally warn if someone takes the extra step to turn this feature on". > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641580#comment-17641580 ] Andres de la Peña commented on CASSANDRA-18042: --- Yep, it's not only redundant but it can also produce ambiguity in some cases. I guess it might have worked better if we had the nested guardrail config that was initially proposed on the CEP, but with the agreed flat config now it can clash with other properties. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641570#comment-17641570 ] Stefan Miklosovic commented on CASSANDRA-18042: --- In this case I would go like {code:java} sai_indexes_take_snapshots: true sai_indexes_warn_if_enabled: true sai_indexes_enabled: true {code} Because in order to be able to take snapshots of sai indexes, sai indexes have to be enabled in the first place so "if_enabled" is redundant. and you could eventually do {code:java} sai_indexes_take_snapshots_warn_if_enabled: true sai_indexes_take_snapshots: true sai_indexes_warn_if_enabled: true sai_indexes_enabled: true {code} I personally do not see anything wrong with "sai_indexes_warned" or "zero_ttl_on_twcs_warned" format. that "if_enabled" is just confusing IMO. zero_ttl_on_twcs_enabled and zero_ttl_on_twcs_warned is just fine, same logic is done as for "if_enabled" cases, it is just shorter. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641566#comment-17641566 ] Andres de la Peña commented on CASSANDRA-18042: --- I understand that feature flags could be implemented with an {{EnableFlag}} as it's defined in the current PR, so they have separate properties for warning and enabling/disabling. A curious detail of "feature_warn_if_enabled" is that any other property associated to the feature would equally only happen if "feature_enabled". So, let's say that we have a guardrail for SAI indexes: {code} sai_indexes_warn_if_enabled: true sai_indexes_enabled: true {code} If we add a new property to, for example, taking snapshots of the index, should we use: {code} sai_indexes_take_snapshots_if_enabled: true sai_indexes_warn_if_enabled: true sai_indexes_enabled: true {code} or {code} sai_indexes_take_snapshots: true sai_indexes_warn_if_enabled: true sai_indexes_enabled: true {code} ? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641563#comment-17641563 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Yes it is same relationship, because if "warn_if_enabled" is true and "feature_enabled" is false it would fail the query so warning is redundant. I think this is designed just fine. What I got from your last sentence is that we might maybe return to having one property and warning every time? Did I interpret it right? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641560#comment-17641560 ] Josh McKenzie commented on CASSANDRA-18042: --- {quote}Just to understand the mechanics better - "warn_if_enabled", that "if" means that "if zero_ttl_on_twcs_enabled is true" ? {quote} To your point Stefan, it is a touch unwieldy. But basically "feature_warned" would only actually happen if "feature_enabled" so it's the same dependent relationship. There's the meta question here: should we have warnings for feature flag enabled features? Basically, all the things we feature flag guardrailed are "you could be using this wrong and/or be setting yourself up for future pain" style features too right? So maybe the answer here is we should have the ability to warn on all of them that gets passed back to clients w/context on their usage and potential pitfalls like this one. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641503#comment-17641503 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Just to understand the mechanics better - "warn_if_enabled", that "if" means that "if zero_ttl_on_twcs_enabled is true" ? I mean, I am not against this, but having "if" inside the property is quite awkward. I think this is the first case we have in the whole yaml with something different from a noun or verb or adjective ... you feel me. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641490#comment-17641490 ] Brandon Williams commented on CASSANDRA-18042: -- In feature_warn_if_enabled vs feature_warned, I think I'm with Josh on the former now. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641454#comment-17641454 ] Andres de la Peña commented on CASSANDRA-18042: --- Whatever we choose, I think that it would be desirable to keep the {{feature_enabled}} format for the hard limit because of the point mentioned on [this comment|https://issues.apache.org/jira/browse/CASSANDRA-18042?focusedCommentId=17640654&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17640654]. Long story short, we already have properties with that format on 4.1 and not changing the convention would allow us to add soft/warn limits to them. Both {code} feature_warned: true feature_enabled: true {code} and {code} feature_warn_if_enabled: true feature_enabled: true {code} work for me. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641446#comment-17641446 ] Brandon Williams commented on CASSANDRA-18042: -- I think that works, I wasn't hugely fond of 'fail_enabled' either. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641441#comment-17641441 ] Josh McKenzie commented on CASSANDRA-18042: --- 1. I'm terrible at naming things. Grain of salt me. 2. I think it's worth a little extra time here since we're establishing a precedent (warn / fail threshold on feature flagged things) Brandon's point I think needs a touch further tweaking; the "_fail_enabled" is confusing to me. The format of: {quote} feature_warn_if_enabled: true feature_enabled: true {quote} while not internally "consistent" with one another makes the most clear logical sense to me. So we'd concretely have: {quote} zero_ttl_on_twcs_warn_if_enabled: true zero_ttl_on_twcs_enabled: true {quote} > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641427#comment-17641427 ] Brandon Williams commented on CASSANDRA-18042: -- I think it would be a little more consistent with other properties as: zero_ttl_on_twcs_warn_enabled: true zero_ttl_on_twcs_fail_enabled: true My reasoning being that 'warn_threshold' and 'fail_threshold' are common, but since these aren't thresholds we go with the established 'enabled' instead. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641407#comment-17641407 ] Josh McKenzie commented on CASSANDRA-18042: --- {code} zero_ttl_on_twcs_warned: true zero_ttl_on_twcs_enabled: true {code} Works for me. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641153#comment-17641153 ] Stefan Miklosovic commented on CASSANDRA-18042: --- [~jmckenzie] [~bschoeni] everybody comfortable with the latest changes? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640797#comment-17640797 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Ok, will be done on commit. Lets wait for others to have some feedback if they are OK with that as well. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640728#comment-17640728 ] Andres de la Peña commented on CASSANDRA-18042: --- Thanks, looks good to me, +1 if CI finishes ok. We should also wait a bit to see if others are also happy with the naming convention. That convention should in principle be used for other similar future guardrails. That includes the case of adding configurable warnings to existing enable flags, which should be very easy thanks to this patch :) Just a nit, in the spirit of CASSANDRA-17967 we should probably add to {{cassandra.yaml}} a brief explanation of why the guardrail exists. That explanation would be very similar, if not identical, to the {{reason}} attribute used in the guardrail. So, for example {code} # Guardrail to warn and/or enable a CREATE or ALTER TABLE statement when default_time_to_live is set to 0 # and the table is using TimeWindowCompactionStrategy compaction or a subclass of it. # It is suspicious to use default_time_to_live set to 0 with TimeWindowCompactionStrategy because the data that hasn't # been inserted or updated with a TTL will not start to automatically expire after they are older than a respective # compaction window unit of a certain size. #zero_ttl_on_twcs_warned: true #zero_ttl_on_twcs_enabled: true {code} > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640705#comment-17640705 ] Stefan Miklosovic commented on CASSANDRA-18042: --- ok done here: https://github.com/apache/cassandra/pull/2027/files j11 https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/21487fae-9fb0-4014-b499-f24893e568fc j8 https://app.circleci.com/pipelines/github/instaclustr/cassandra/1625/workflows/58e0f2e7-b796-4b32-8062-64e4798e5624 > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640700#comment-17640700 ] Stefan Miklosovic commented on CASSANDRA-18042: --- done here: https://github.com/apache/cassandra/pull/2027/files > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640671#comment-17640671 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Naming looks good to me. So it means I need to rename it all yet again and we need to "invert" the logic on that enabled flag. I am on it ... > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 3h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640654#comment-17640654 ] Andres de la Peña commented on CASSANDRA-18042: --- I agree that the names of the guardrail properties should start by the guardrail name to keep consistency. I like the `warn/fail` naming because it's easy to understand and it aligns well with the naming used for thresholds: {code:java} zero_default_ttl_on_twcs_warn: true zero_default_ttl_on_twcs_fail: false {code} However, the downside of this approach is that it breaks the {{*_disabled}} convention that we are using on other enable flags. So if we wanted to add a warning to one of the existing enable flags, we would need to change the naming of the old property, For example, lets take the currently existing guardrail to disable secondary indexes: {code:java} secondary_indexes_enabled: true {code} If at some point we wanted to also add a warning discouraging the use of secondary indexes, the naming with new convention would be: {code:java} secondary_indexes_warn: false secondary_indexes_fail: false {code} But we can't replace the old property because of backward compatibility. That won't happen if the conventions for warn/fail boolean guardrails were: {code:java} secondary_indexes_warned: false secondary_indexes_enabled: true {code} That's why I would consider using: {code:java} zero_ttl_on_twcs_warned: true zero_ttl_on_twcs_enabled: true {code} So we have a naming convention that allows to easily transform fail-only boolean guardrails into warn-or-fail boolean guardrails. wdyt? Besides the naming discussion, I have left a few minor suggestion on the PR, which mostly looks good to me. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640577#comment-17640577 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Incorporated renaming of properties as suggested above here: https://github.com/apache/cassandra/pull/2027/files > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640233#comment-17640233 ] Ekaterina Dimitrova commented on CASSANDRA-18042: - I like the suggestions made by [~smiklosovic] and [~adelapena] as they are in line with the guardrails framework. Diverging can be confusing for users IMHO > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640231#comment-17640231 ] Stefan Miklosovic commented on CASSANDRA-18042: --- What about: {code} zero_ttl_on_twcs_fail: true zero_ttl_on_twcs_warn: true {code} I do not think it is not a good idea to start property in cassandra.yaml on "warn" or "fail". This is as well 1 character shorter :) > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640210#comment-17640210 ] Josh McKenzie commented on CASSANDRA-18042: --- {quote}#zero_default_ttl_on_twcs_disallowed: false {quote} Could we go with the somewhat simpler: {code:java} warn_on_0_ttl_for_twcs: true fail_on_0_ttl_for_twcs: false {code} Or something along those lines? This parameter is different than the other _enabled guardrails as they're binary "on/off" switches, effectively feature flags, whereas this is a "you could be using this wrong and/or be setting yourself up for future pain" style guardrail. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639854#comment-17639854 ] Stefan Miklosovic commented on CASSANDRA-18042: --- OK, we will go with default warning and optional failure. Two flags. That translates to this PR (1) [~adelapena] could you please go through it one more time to be sure? I will provide all the builds then. (1) https://github.com/apache/cassandra/pull/2027/commits > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638698#comment-17638698 ] Brad Schoening commented on CASSANDRA-18042: >> It is easy to delete old sstables off disk. Many operators know how to do >>this It depends on the environment, in a managed service environment, admin users likely do not have access. Our users lack even read access to the sstable files on disk. Operators who manage hundreds of clusters can't delete users data without having a detailed conversations first. >> Not true. Tombstone compactions will still occur. There are TWCS options for >> tuning this. Ok, good to know. The documentation doesn't mention this, could you elaborate? {quote}In an expiring/TTL workload, the contents of an entire SSTable likely expire at approximately the same time, allowing them to be dropped completely, https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html {quote} My recollection was that we had seen a small number of zero-based TTL rows keeping large sstables alive with TWCs. I'll have to double check what we saw. Which options can we tune? Aside from setting the window size, the only option I only see is unsafe_aggressive_sstable_expiration. But, I think we're mostly in agreement here. Having this guardrail default to warning is fine. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638619#comment-17638619 ] Michael Semb Wever commented on CASSANDRA-18042: bq. If you use a zero TTL with Time Window Compaction, the number of SStables on disk will grow infinitely. It is easy to delete old sstables off disk. Many operators know how to do this, and want to take this approach because the TTL is unknown (e.g. based on storage costs and not time). The majority of use-cases TTL I've seen set TTL on writes and not the table schema. A fail by default guardrail would be an unpleasant surprise for them. Setting TTL only on writes is perfectly acceptable as well (e.g. variable TTL). bq. it may surprise users who convert a table from Size Tiered to TWC when they usually, but not always, insert a row level TTL. The existing data when it was STCS could well be without TTL. The guardrail cannot solve this. bq. My understanding is that TWC will never clean up a SSTable if even a single row has a zero TTL. Not true. Tombstone compactions will still occur. There are TWCS options for tuning this. bq. This is what we have seen with users in production who have configured TWC and used a zero TTL. This is the purpose of implementing the fail guardrail. But there are good reasons why it should not be the default. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638582#comment-17638582 ] Stefan Miklosovic commented on CASSANDRA-18042: --- [~mck] could you go through last few comments, please? I would really like to know your take on this, again. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638580#comment-17638580 ] Stefan Miklosovic commented on CASSANDRA-18042: --- _They had been depending on row level TTLs which sometimes didn't get set._ I understand that, but that is their mistake. If they added it, it would not happen. For that reason a warning as default seems to be better. We can tweak the message saying that they need to TTL their inserts if they expect data to be expired as table settings wont do it. It is something else you set it to fail on your side. You are definitely allowed to do that. It just should not be the default. Honestly I do not understand from where people got this idea that TWCS is supposed to remove data automagically. _it may surprise users who convert a table from Size Tiered to TWC_ When somebody is changing compaction strategies, they are doing it with some goal in their mind, it needs to serve some purpose. What purpose it has in their case? What they think will happen doing that? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638358#comment-17638358 ] Brad Schoening commented on CASSANDRA-18042: [~smiklosovic] [~adelapena] I would prefer to have *fail* as the default, but I'm ok with it being configurable as a warning. We will set it to fail on the clusters we manage. Definitely would like to have to ability to configure the guard rail to fail. If you use a zero TTL with Time Window Compaction, the number of SStables on disk will grow infinitely. This will cause performance problems (too many sstables) and eventually exhaust the available disk space. This is what we have seen with users in production who have configured TWC and used a zero TTL. They had been depending on row level TTLs which sometimes didn't get set. If we apply the _Principle of Least Surprise_ to guard rails, it may surprise users who convert a table from Size Tiered to TWC when they usually, but not always, insert a row level TTL. In a few months they will have a serious problem with performance and disk space. My understanding is that TWC will never clean up a SSTable if even a single row has a zero TTL. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638277#comment-17638277 ] Stefan Miklosovic commented on CASSANDRA-18042: --- reviewed branch for the case we are always warning is here: [https://github.com/apache/cassandra/pull/2025/files] the second approach, with warning property, is here: [https://github.com/apache/cassandra/pull/2027/files] Thanks [~adelapena]. I would like to have more feedback if we indeed want to always warn or not. Brad does not mind to have it always warning. Andres thinks that it depends on whether the zero TTL is always risky or not and if we think that warning would bother some users then there should be a possibility to turn it off but we would turn it on by default. I do not have any strong preference but I think that having warning property available but turned on by default and it would be possible to turn it off is a good compromise. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 2h 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637382#comment-17637382 ] Stefan Miklosovic commented on CASSANDRA-18042: --- I ve just finished this version in a different PR (we can just throw it away if decided as not worthy to merge) That one includes "ensureDisabled" methods and the logic is done as I suggested above. Properties are {code} zero_default_ttl_on_twcs_disallowed: false zero_default_ttl_on_twcs_warned: true {code} I think that adding "ensureDisabled" methods to EnableFlag are quite handy and it is easy to code and it improves readability when we do not have to revert that logic in the code. https://github.com/apache/cassandra/pull/2027/files > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 40m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637373#comment-17637373 ] Andres de la Peña commented on CASSANDRA-18042: --- I was thinking on {{_warned}} instead of {{{}_warn{}}}, so it's more similar to the existing {{{}_enabled{}}}: {code:java} zero_default_ttl_on_twcs_warned: true zero_default_ttl_on_twcs_enabled: true {code} Or it could also be: {code:java} zero_default_ttl_on_twcs_discouraged: true zero_default_ttl_on_twcs_enabled: true {code} It's a bit odd however that the two boolean properties have different behaviours, where {{_enabled}} triggers the guardrails when it's {{{}false{}}}, whereas and {{_warned}} triggers it when it's {{{}true{}}}. That would be solved by: {code:java} zero_default_ttl_on_twcs_fail: false zero_default_ttl_on_twcs_warn: true {code} However, we already have multiple guardrails using {{_enabled}} in 4.1, and we have to preserve them. So using {{_fail}} for this guardrail would mean that {{EnableFlag}} sometimes would receive {{_enabled}} properties, and sometimes would receive {{_fail}} properties. That's not ideal and it might be problematic if at some point we want to automatically generate property names from guardrail names. Maybe [~jmckenzie], who has added a bunch of enable flags, has an opinion on this. As for how we manage the true/false properties, we could easily reverse the predicates that we pass to the constructor. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 40m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637349#comment-17637349 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Thanks [~adelapena] for the review. I ve already incorporated your suggestions. I do not mind to call it zero_default_ttl_on_twcs_enabled. With comments included in cassandra.yaml it is pretty descriptive. However, I left MBean method as it was, people invoking that does not have any documentation to look into when called by tooling and it is better to be as much specific as possible. PR reflects these changes. So the only question is whether we should have warning flag too. For the maximum flexibility we may include it. It is acceptable to have it if it is by default enabled. If we are going to have these properties: {code} zero_default_ttl_on_twcs_enabled: true zero_default_ttl_on_twcs_warn: true {code} I think it would be better to call it like this: {code} zero_default_ttl_on_twcs_fail: false zero_default_ttl_on_twcs_warn: true {code} We would then need to "revert" the logic so it would look like (notice negation in the body of that if) {code} if (params.defaultTimeToLive == 0 && !SchemaConstants.isSystemKeyspace(keyspaceName) && TimeWindowCompactionStrategy.class.isAssignableFrom(params.compaction.klass())) !Guardrails.zeroDefaultTTLOnTWCSEnabled.ensureEnabled(state); {code} This is a little bit hard to follow, what would be nice to have is to have methods called "ensureDisabled" which is more natural to me so it would be like {code} if (params.defaultTimeToLive == 0 && !SchemaConstants.isSystemKeyspace(keyspaceName) && TimeWindowCompactionStrategy.class.isAssignableFrom(params.compaction.klass())) Guardrails.zeroDefaultTTLOnTWCS.ensureDisabled(state); {code} what do you think? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 0.5h > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637316#comment-17637316 ] Andres de la Peña commented on CASSANDRA-18042: --- By the way, the name of the property is very long, what would you think about {{zero_default_ttl_on_twcs_enabled}}? Too cryptic even with the nice comments that we have around it? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637312#comment-17637312 ] Andres de la Peña commented on CASSANDRA-18042: --- The approach looks good to me. As previously discussed on Slack conversation, I think it makes sense to have a soft limit on {{EnableFlag}} for this kind of things, and it might come handy in the future for other stuff like guardrail-based experimental flags. As of whether we need to have a {{zero_default_ttl_on_time_window_compaction_strategy_warned}} to configure the warning, I guess that depends on whether the zero TTL is always risky or not. If we think the warning might bother some users then we should have the {{zero_default_ttl_on_time_window_compaction_strategy_warned}} property, probably enabled by default. I have left a bunch of comments on the PR, which overall is looking good. Most are about nits and testing. The only slightly more serious thing is the duplicated emission of warnings when {{ClientState}} is {{null}}. That happens on background processes that are guarded but shouldn't be interrupted by exceptions. We some of these for disk size, collection size, etc. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637154#comment-17637154 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Thanks [~bschoeni] _It is not recommended to disable ..._ I do not think this is the right way to put it. It is strategy as any other. It just has some consequences to keep in mind. It is as if we do not want people to use TWCS with ttl = 0. That is not completely true, they can definitely use such configuration combination. There is just some not-so-obvious caveat attached we try to point out here. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 10m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636948#comment-17636948 ] Brad Schoening commented on CASSANDRA-18042: [~smiklosovic] Looks great, I agree that there is probably no need to turn off warnings. I tweaked the warning a little bit if you like this: {quote}It is not recommended to disable Time To Live (TTL of zero) with Time Window Compaction Strategy. If data is not configured to expire with a TTL, the number of sstables for time windows will continually increase. {quote} > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 10m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636889#comment-17636889 ] Stefan Miklosovic commented on CASSANDRA-18042: --- [~adelapena] it would be awesome if you took a closer look to this patch. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > Time Spent: 10m > Remaining Estimate: 0h > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634055#comment-17634055 ] Brad Schoening commented on CASSANDRA-18042: +1 on the warn and fail guardrails. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634037#comment-17634037 ] Brandon Williams commented on CASSANDRA-18042: -- +1 to having warn and fail thresholds like other guardrails and having the default be a warning. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634032#comment-17634032 ] Stefan Miklosovic commented on CASSANDRA-18042: --- Emitting just a warning by default makes sense to me. It should be possible to configure it in such a way that it will fail but that would need to be set explicitly. I will look around what guardrail is the most appropriate to complete this. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634016#comment-17634016 ] Michael Semb Wever commented on CASSANDRA-18042: Heuristically, I see more folk use write TTL than default_time_to_live on TWCS tables. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634008#comment-17634008 ] Stefan Miklosovic commented on CASSANDRA-18042: --- I do not mind to have it like Josh says. What is more probable? a) that people will use TWCS and they forget to set default_time_to_live to something else from 0 and OR b) we leave it as it is and the operators will go over all guardrails on a rainy Sunday and say "ah this one is nice, maybe we should use it just to be sure". I would say that a) is more probable. People solve problems most often re-actively. It is better if there is a safety net and they do not even notice. If power users want to have it with 0, they would need to turn this off explicitly and that is I guess is minority. Does this relates to what you see in the field? > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18042) Implement a guardrail for not having zero default ttl on tables with TWCS
[ https://issues.apache.org/jira/browse/CASSANDRA-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633975#comment-17633975 ] Josh McKenzie commented on CASSANDRA-18042: --- +1 to the guardrail. What do we think about the guardrail defaulting to on / disabling the ability to use this combination? That'd allow existing infra that's doing this to continue to work but provide some protection / self-learning for future users. The downside to that would be operators who upgraded to the next major and didn't catch that in NEWS.txt would have to flip that switch but as long as it's hot-proppable should be minimally annoying. > Implement a guardrail for not having zero default ttl on tables with TWCS > - > > Key: CASSANDRA-18042 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18042 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails, Legacy/Core >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.x > > > A user was surprised that his data have not started to expire after 90 days > on his TWCS, he noticed that default_time_to_live on the table was set to 0 > (by accident from his side) and inserts were using TTL = 0 too. > It is questionable why it it possible to create a table with TWCS and enable > a user to specify default_time_to_live to be zero. > On the other hand, I would argue that having default_time_to_live set to 0 on > TWCS does not necessarily mean that such combination is illegal. It is about > people just using that with advantage very often so tables are compacted away > nicely. However, that does not have to mean that they could not use it with > 0. But I yet have to see a use-case where TWCS was used and default ttl was > set to 0 on purpose. Merely looking into Cassandra codebase, there are only > cases when this parameter is not 0. > There are three approaches: > 1) just reject such statements (for CreateTable and AlterTable statements) > where default_time_to_live = 0 > 2) Implement a guardrail for 1) so it can be enabled / disabled on demand > 3) Leave possibility to set default_time_to_live to 0 on a table but make a > guardrail for UpdateStatement so it might reject queries for tables with > default_time_to_live is zero and for which its TTL (on that update statement) > is set to 0 too. > I would be careful about making the current configuration illegal because of > backward compatibility. For that reason 2) makes the most sense to me. > Maybe implementing 3) would make sense as well. There might be a table which > has default ttl set to 0 as it expects a user to supply TTL every time. > However, as it is not currently enforced anywhere, a client might still > insert TTLs to be set to 0 even by accident. > POC for 2) is here > https://github.com/instaclustr/cassandra/commit/0b4dcc3d3deeffa393c02a3b80e27482007f9579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org