It seems we agree on removing the default value for the old thresholds, and
don't count system keyspaces/tables on the new ones.

The old thresholds were on active duty for around ten months, and they have
been deprecated for around a year. They will have been deprecated for
longer by the time we release 5.0. If we want to keep them in perpetuity, I
guess the plan would be:

- Remove the default value of the old thresholds in Config.java to make
them disabled by default.
- Remove the old thresholds from the default cassandra.yaml, although old
yamls can still have them.
- Use converters (@Replaces tag in Config.java) to read the old threshold
values (if present) and apply them to the new guardrails.
- During the conversion from the old thresholds to the new guardrails,
subtract the current number of system keyspace/tables from the old value.
For example, 150 tables in the old threshold translate to 103 tables in the
new guardrail, considering that there are 47 system tables.

Does this sound good?

On Wed, 14 Jun 2023 at 17:26, David Capwell <dcapw...@apple.com> wrote:

> That's problematic because the new thresholds we added in CASSANDRA-17147
> don't include system tables. Do you think we should change that?
>
>
> I wouldn’t change the semantics of the config as it’s already live.  I
> guess where I am coming from is that logically we have to think about the
> system tables, so to your point, if we think 150 is too much and the system
> already exposes 50… then we should recommend no more than 100….
>
> I find it's better for usability to not count the system tables and just
> say "It's recommended not to have more than 100 tables. This doesn't
> include system tables.”
>
>
> I am fine with this framing… internally we think about 150 but
> publicly speak 100 (due to our 50 tables)...
>
>
> On Jun 14, 2023, at 8:29 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>
> In my opinion including system tables defeats that purpose because it
> forces users to know details about the system tables.
>
> Perhaps having a unit test that caps our system tables at some value and
> keeping the guardrail user-scope specific would be a better approach. I see
> your point about leaking internal details to users, specifically on things
> they can't control at this point.
>
> On Wed, Jun 14, 2023, at 8:19 AM, Andrés de la Peña wrote:
>
> > Default value I agree with you; features should be off by default!  If
> we remove the default then we disable the feature by default (which im cool
> with) and for anyone who changed the config, they would keep their behavior
>
>
> I'm glad we agree on at least removing the default value if we keep the
> deprecated properties.
>
> > With that, I kinda don’t agree that including system tables is a
> mistake, as we add more we allow less for user tables before we start to
> have issues….
>
>
> That's problematic because the new thresholds we added in CASSANDRA-17147
> don't include system tables. Do you think we should change that?
>
> I still think it's better not to include the system tables in the count.
> The thresholds on the number of keyspaces/tables/rows/columns/tombstones
> are just guidance since they cannot be exactly related to exact resource
> consumption. The main purpose of those thresholds is to prevent obvious
> antipatterns such as creating thousands of tables. A benefit of expressing
> the guardrails in terms of the number of schema entities, rather than
> counting the memory usage of those entities, is that they are easy to
> understand and reason about. In my opinion including system tables defeats
> that purpose because it forces users to know details about the system
> tables. The fact that those details change between versions doesn't help.
> Including system tables is not going to make the thresholds precise in
> terms of measuring memory consumption because that depends on other
> factors, such as the columns they store.
>
> Including system tables also imposes a minimum threshold value, like in
> 5.0 you cannot set a threshold value under 45 tables without triggering it
> with an empty db. For other thresholds, this can be more tricky. That would
> be the case of the guardrail on the number of columns in a partition, where
> you would need to know the size of the widest row in the system tables,
> which can change over time.
>
> I guess that if system tables were to be counted, a recommendation for the
> threshold would say something like "It's recommended not to have more than
> 150 tables. The system already includes 45 tables for internal usage, so
> you shouldn't create more than 105 user tables". I find it's better for
> usability to not count the system tables and just say "It's recommended not
> to have more than 100 tables. This doesn't include system tables."
>
> On Tue, 13 Jun 2023 at 23:51, Josh McKenzie <jmcken...@apache.org> wrote:
>
>
> Warning that too many tables (including system) may have negative behavior
> I think is fine
>
> This reminds me of the current situation with our tests where we just keep
> adding more and more without really considering the value of the current
> set and the costs of that body of work as it keeps growing.
>
> Having some kind of signal that we need to do some housekeeping with our
> system tables, or *something* in the feedback loop that helps us keep on
> top of this hygiene over time, seems like a clear benefit to me.
>
> On Tue, Jun 13, 2023, at 1:42 PM, David Capwell wrote:
>
> I think that the combined decision of using a default value and counting
> system tables was a mistake
>
>
> Default value I agree with you; features should be off by default!  If we
> remove the default then we disable the feature by default (which im cool
> with) and for anyone who changed the config, they would keep their behavior
>
> As for system tables… each table adds a cost to our bookkeeping, so when
> we add new tables the cost grows and the memory per table decreases, does
> it not?  Warning that too many tables (including system) may have negative
> behavior I think is fine, its only if we start to fail is when things
> become a problem (upgrading to 5.0 can’t happen due to too many tables
> added in the release?); think the feature was warn only, so that should be
> fine.  With that, I kinda don’t agree that including system tables is a
> mistake, as we add more we allow less for user tables before we start to
> have issues…. At the same time, if we have improvements in newer versions
> that allows higher number of tables, the user then has to update their
> configs (well, as long as we don’t make things worse a smaller limit than
> needed is fine…)
>
> we would need to know how many system keyspaces/tables were on the version
> we are upgrading from
>
>
> Do we?  The logic was pulling from local schema, so to keep the same
> behavior we would need to do the same; being version dependent would
> actually break the semantics as far as I can tell.
>
> On Jun 13, 2023, at 9:50 AM, Andrés de la Peña <adelap...@apache.org>
> wrote:
>
> Indeed "keyspace_count_warn_threshold" and "table_count_warn_threshold"
> include system keyspaces and tables. Also, differently to the newer
> guardrails, they are enabled by default.
>
> I find that problematic because users need to know how many system
> keyspaces/tables there are to know if they need to set the threshold value.
> Moreover, if a new release adds some system tables, the threshold can start
> to be triggered without changing the number of user tables. That would
> force some users to update the threshold values during an upgrade. Even if
> they are using the defaults. That situation would happen again in any
> release adding new keyspaces/tables. I think adding new system tables is
> not that uncommon, and indeed 5.0 does it.
>
> I think that the combined decision of using a default value and counting
> system tables was a mistake. If that's the case, I don't know for how long
> we want to remain tied to that mistake. Especially when the old thresholds
> tend to create upgrade issues on their own.
>
> If we were going to use converters, we would need to know how many system
> keyspaces/tables were on the version we are upgrading from. I don't know if
> that information is available. Or perhaps we could assume that counting
> system keyspaces/tables was a bug, and just translate changing the meaning
> to not include them.
>
>
>
>
>
> On Tue, 13 Jun 2023 at 16:51, David Capwell <dcapw...@apple.com> wrote:
>
> > Have we been dropping support entirely for old params or using the
> @Replaces annotation into perpetuity?
>
>
> My understanding is that the goal is to keep things around in perpetuity
> unless it actively causes us harm… and with @Replaces, there tends to be no
> harm to keep around…
>
> Looking at
> https://github.com/apache/cassandra/commit/bae92ee139b411c94228f8fd5bb8befb4183ca9f
>  we just marked them deprecated and created a brand new config that
> matched the old… which I feel was a bad idea…. Renaming configs are fine
> with @Replaces, but asking users to migrate with the idea of breaking them
> in the future is bad…
>
> The table_count_warn_threshold config is used at
> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#clientWarnings
> The tables_warn_threshold config is used at
> org.apache.cassandra.cql3.statements.schema.CreateTableStatement#validate
>
> The only difference I see is that table_count_warn_threshold includes
> system tables where as tables_warn_threshold is only user tables…
>
> > I would like to propose removing the non-guardrail thresholds
> 'keyspace_count_warn_threshold' and 'table_count_warn_threshold'
> configuration settings on the trunk branch for the next major release.
>
> Deprecate in 4.1 is way too new for me to accept that, and its low effort
> to keep; breaking users is always a bad idea and doing it when not needed
> is bad…
>
> Honestly, I don’t see why we couldn’t use @Replaces here to solve the
> semantic gap… table_count_warn_threshold includes the system tables, so we
> just need a Converter that takes w/e the value the user put in and
> subtracts the system tables… which then gives us the user tables (matching
> tables_warn_threshold)
>
> > On Jun 13, 2023, at 7:57 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> >> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195
> when they were replaced/migrated to guardrails as part of CEP-3
> (Guardrails).
> > Have we been dropping support entirely for old params or using the
> @Replaces annotation into perpetuity?
> >
> > I dislike the idea of operators having to remember to update things
> between versions and being surprised when things change roughly equally to
> us carrying along undocumented deprecated param name mapping roughly
> equally. :)
> >
> > On Mon, Jun 12, 2023, at 5:56 PM, Dan Jatnieks wrote:
> >> Hello everyone,
> >>
> >> I would like to propose removing the non-guardrail thresholds
> 'keyspace_count_warn_threshold' and 'table_count_warn_threshold'
> configuration settings on the trunk branch for the next major release.
> >>
> >> These thresholds were first added with CASSANDRA-16309 in 4.0-beta4 and
> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 when
> they were replaced/migrated to guardrails as part of CEP-3 (Guardrails).
> >>
> >> I'd appreciate any thoughts about this. I will open a ticket to get
> started if there is support for doing this.
> >>
> >> Reference:
> >> https://issues.apache.org/jira/browse/CASSANDRA-16309
> >> https://issues.apache.org/jira/browse/CASSANDRA-17195
> >> CEP-3: Guardrails
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-3%3A+Guardrails
> >>
> >>
> >> Thanks,
> >> Dan Jatnieks
>
>
>

Reply via email to