ATTENTION PLEASE: The email below is long, but I believe you will agree it
deserves attention for good reasons. Thank you in advance for your time and
consideration!

Hi everyone,

I am working on the last batch of config parameters to be transferred to
the new types after CASSANDRA-15234 landed.

This led to a few questions and concerns, so I am here to raise awareness
and to confirm things with the community one more time, to ensure there is
no regression when 4.1 is out.

As the main decisions were taken and implemented two years ago, before even
4.0 beta, I would like to ensure that everyone here is aware of the
following. In CASSANDRA-15234 we have backward compatibility that converts
the old format (only values, no opportunity to change the unit) to the new
format (value + unit), but in the end everything is loaded into the new
types. This is not a feature with a flag that can be disabled. We migrated
the parameters to the new types in the Config class and in cassandra.yaml.
Also, we now explicitly say that we don’t support negative values (in both
old and new yaml config), which was not the case before. We never
advertised the usage of negative values, but we also did not prohibit it;
we do now, as one more improvement in CASSANDRA-15234. I will add an
explicit note in NEWS.txt so that no one expects to keep using negative
values with the old yaml, if they were doing so for some unknown reason.
Old or new format, on trunk we prohibit negative values except in a few
special cases.
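
To make the old-to-new loading described above more concrete, here is a
minimal Python sketch. It is not the Cassandra implementation: the function
names, the simplified pattern and the unit list are assumptions made only
for illustration.

```python
import re

# Simplified pattern: a non-negative whole number followed by a unit.
# The unit list is an assumption for illustration, not the real one.
_DURATION = re.compile(r"(\d+)(ms|s|m|h|d)")
_TO_MILLIS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_duration_millis(text):
    """Parse a new-format value such as '10s' and return milliseconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        # A leading minus sign never matches, so negatives are rejected here.
        raise ValueError(f"invalid duration: {text!r}")
    quantity, unit = match.groups()
    return int(quantity) * _TO_MILLIS[unit]


def load_old_millis(value):
    """Load an old-format bare number whose unit (ms) came from the name."""
    if value < 0:
        raise ValueError(f"negative values are not supported: {value}")
    # The old value ends up in the new type, exactly like a new-format value.
    return parse_duration_millis(f"{value}ms")
```

In this sketch both paths end in the same representation, which mirrors the
point that the old format is only a way in and not a separate mode.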

As part of the documentation I mentioned the Converter class, which handles
the conversion from old values to new values, i.e. it loads the old value
into the new Config types. There are special cases which I want to
highlight in case someone knows of any behavior, not caught by our CI, that
might be changed or broken by those changes (better safe than sorry):

   -

   The new types DataStorageSpec, DurationSpec and DataRateSpec accept only
   non-negative values (this affects both old and new config). Thus, as
   part of the converters, we have the following special cases:

- MILLIS_DOUBLE_DURATION for commitlog_sync_group_window_in_ms (now
commitlog_sync_group_window). It covers the fact that this parameter was a
double; even if we think that was a bug, as it is cast to int later
here[1], someone might have been using a double value and NaN. Values less
than 0 are disallowed, as already said for all types. With the new config
types this parameter is stored as long.

- MILLIS_CUSTOM_DURATION for permissions_update_interval,
roles_update_interval and credentials_update_interval. After CASSANDRA-17431
the old value of “-1” will be translated to null for these parameters. The
setters will do that too. Anything below -1 is prohibited and considered a
bug (both with old and new config).

- We are adding NEGATIVE_SECONDS_DURATION as part of C17431 to handle
validation_preview_purge_head_start. Anyone using the old yaml with a value
<0 will have it translated into 0 seconds. The new yaml format prohibits
values less than 0, and the smallest unit for this parameter is seconds.

- BYTES_CUSTOM_DATA_STORAGE for
native_transport_max_concurrent_requests_in_bytes_per_ip and
native_transport_max_concurrent_requests_in_bytes. After C17431 it
translates “-1” to “null” and prohibits anything less than -1 with the old
yaml and anything less than 0 with the new one. I also sent a separate mail
about these two parameters, as we have some concerns about them for all C*
versions.
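
The old-yaml special cases above can be summarized in a small Python
sketch. It only restates the rules from this email; the function names and
the "ms", "s" and "B" unit suffixes in the returned strings are assumptions
for illustration, and the real converters are Java code in the Converters
class. MILLIS_DOUBLE_DURATION is left out on purpose.

```python
def millis_custom_duration(old_value):
    """Old '-1' becomes None (null); anything below -1 is rejected."""
    if old_value == -1:
        return None
    if old_value < -1:
        raise ValueError(f"values below -1 are not supported: {old_value}")
    return f"{old_value}ms"


def negative_seconds_duration(old_value):
    """Old negative values are translated to 0 seconds."""
    if old_value < 0:
        return "0s"
    return f"{old_value}s"


def bytes_custom_data_storage(old_value):
    """Old '-1' becomes None (null); anything below -1 is rejected."""
    if old_value == -1:
        return None
    if old_value < -1:
        raise ValueError(f"values below -1 are not supported: {old_value}")
    return f"{old_value}B"
```

For example, an old permissions_update_interval of -1 would load as None in
this sketch, while an old validation_preview_purge_head_start of -5 would
load as "0s".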

   -

   In C17431 I also decided to leave the parameters phi_convict_threshold,
   memtable_cleanup_threshold and block_for_peers_timeout_in_secs alone and
   not to migrate them to the new Config types, for various reasons
   explained in the ticket. Please let me know if you have any particular
   suggestions/concerns/thoughts around those.
   -

   DataRateSpec - it was requested to support mebibytes/s, kibibytes/s and
   bytes/s. This made things complicated internally for two of the
   parameters introduced in 4.0, which were in megabits/s: I had to keep the
   old behavior so that they still support megabits/s with the old format.
   As the conversion between megabits and mebibytes does not produce a whole
   number, I had to store the DataRateSpec value as double, and not long as
   in the other types, in order to preserve precision. We considered this
   fine because the RateLimiter uses double and we still allow only whole
   numbers to be provided by users. Please let me know if you see any
   problem. A quick unit test where you can validate how those work is
   SetGetInterDCStreamThroughputTest. It is important to mention that once
   people move to the new format, they can’t assign anything less than
   1MiB/s to these parameters.
   -

   Regarding DataRateSpec - the two new parameters (
   entire_sstable_stream_throughput_outbound_megabits_per_sec and
   entire_sstable_inter_dc_stream_throughput_outbound_megabits_per_sec),
   which have not been in a release yet, were changed to be in MiB/s. I
   checked with Francisco in Slack - that should be fine.
   -

   I would also like to ask Stefan and Paulo to check whether they disagree
   with anything in the latest version of DurationSpec, as they introduced
   the initial version of Duration in the codebase, which evolved into
   DurationSpec in C15234. Please let me know if there is anything, aside
   from the tests you introduced, that I might be missing and that changes
   your intentions. I have already sent a message in Slack in case they
   miss the email.



   -

   Also, it was discussed on the ticket but I wanted to raise it again for
   consideration - virtual tables currently show both the old names and
   format and the new names and format. There are only three parameters
   which change only type but not name. Those will be listed only with the
   new format value. This was not considered enough of a breaking change to
   justify a bump to 5.0, but I wanted everyone to be aware of that
   agreement one more time.



   -

   In addition, I will add one more note to NEWS.txt to make people aware
   that they should check their config carefully on upgrade to confirm
   there are no breaking changes. We already mention the new option and the
   backward compatibility, but I think it won’t hurt to put more stress on
   the fact that we prohibit negative values, considering them a bug, and
   have only a few special cases around -1 that we convert.



   -

   Please check the matching patterns we use to accept config values in
   DurationSpec, DataRateSpec and DataStorageSpec and let me know if you
   think some special case was missed.



   -

   One more topic I want to mention is changes to public classes. There
   were some name changes (not to JMX), and outside of this work I’ve seen
   such changes even in patch releases. I know our promise not to change
   things covers only interfaces, but I still want to mention it here to
   ensure we are aligned. As I said, Config changes the types of our
   parameters. We use annotations and the mentioned converters for backward
   compatibility of our yaml file.



   -

   New JMX methods are not added and the old ones still work by converting
   to the base unit of the specific parameter. By base I mean the internally
   used one. The agreement was that in the future virtual tables should handle
   the new format when we add the update option for SettingsTable.



   -

   In order not to run into precision issues, we introduced the smallest
   possible unit for certain parameters of type Duration and DataStorage.
   That can be changed if we decide, in time, to migrate any of those to
   the smallest possible unit internally. I am also adding their Int
   equivalents now to improve the handling of the former int parameters.
   -

   We didn’t find any parameters with the suffix _mb or _kb that internally
   meant kilo- or mega- rather than kibi- or mebi-. So all default values
   are the same in the new format.
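
To illustrate the unit arithmetic behind the DataRateSpec point above
(megabits/s versus MiB/s) and the kilo-/mega- versus kibi-/mebi-
distinction, here is a short Python sketch. It assumes the usual decimal
definition of a megabit (1,000,000 bits); the function name is hypothetical
and this is not the Cassandra code.

```python
BYTES_PER_MEGABIT = 1_000_000 / 8   # decimal mega: 125,000 bytes
BYTES_PER_MEBIBYTE = 1024 * 1024    # binary mebi: 1,048,576 bytes


def megabits_to_mebibytes_per_sec(megabits_per_sec):
    """Convert a rate in megabits/s to MiB/s; the result is a float."""
    return megabits_per_sec * BYTES_PER_MEGABIT / BYTES_PER_MEBIBYTE


rate = megabits_to_mebibytes_per_sec(200)
# A whole number of megabits/s is roughly 23.84 MiB/s here, not a whole
# number, which is why a long cannot hold the converted value exactly.
print(rate, rate.is_integer())
```

A whole-number input in megabits/s therefore generally needs a
floating-point value once it is expressed in MiB/s.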


For more details, please refer to the write-up doc I shared here before -
https://cassandra.apache.org/doc/trunk/cassandra/new/configuration.html
It contains an important note and a ticket about parameter overloading,
which turned out to be undocumented behavior in the project that people use.


   -

   Last but not least - I want to remind you that any new parameters are
   added with the new types. Users will have to set them with their new
   format. So whoever wants to use old yaml and wants to update default values
   of new parameters will have to do this in the new format.


This email came out rather long, but I truly believe it is important. If
there are parameters you think might have been affected, please double
check them and raise a flag if you are not sure or are worried about
something done or not done.

[1]
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/GroupCommitLogService.java#L31


      Ekaterina Dimitrova
