ATTENTION PLEASE Below email will be long but I believe you will agree it deserves attention for good reasons. Thank you in advance for your time and consideration!
Hi everyone, I am working on the last batch of config parameters to be transferred to the new types after CASSANDRA-15234 landed. This led to a few questions and concerns and here I am to raise awareness and one more time to confirm things for the community to ensure there is no regression when 4.1 is out. As the main decisions were taken and implemented two years ago before even 4.0 beta, I would like to ensure that everyone here is aware that in CASSANDRA-15234 we have backward compatibility to transfer the old format (only values, no opportunity to change unit) to the new format (value + unit) but it falls back to the new types at the end. This is not a feature with a flag to be disabled. We migrated the parameters to the new types in Config class and in cassandra.yaml. Also, we explicitly say we don’t support negative values (old and new yaml config) which was not the case before. We never advertised the usage of negative values but we also did not prohibit that which we do now as one more improvement in CASSANDRA-15234. I will add an explicit note in NEWS.txt to be sure no one was expecting to keep on using negative values with the old yaml if they were doing it for some unknown reason, old/new - we prohibit on trunk negative values except special cases. As part of the documentation I mentioned the Converter class which serves to do these conversions between old value and new value - load the old value to the new Config types. There are special cases which I want to highlight in case someone knows of any behavior not caught by our CI that might be changed or broken because of those changes(better safe than sorry): - The new types DataRateSpec, DurationSpec and DataRateSpec accept only non-negative values (this will affect both old and new config). Thus as part of the converters we have the following special cases: - MILLIS_DOUBLE_DURATION for commitlog_sync_group_window_in_ms or now already commitlog_sync_group_window which covers that this was double and even if we think this was a bug as it is casted later to int here[1], someone might have been using double and NaN. Disallow less than 0 as we already said for all types. With the new config types this parameter is stored as long. - MILLIS_CUSTOM_DURATION for permissions_update_interval, roles_update_interval and credentials_update_interval. After CASSANDRA-17431 those will be updated to - old value of “-1” being translated to null. The setters will be doing that too. Anything below -1 is prohibited and considered a bug (both with old and new config). - we are adding NEGATIVE_SECONDS_DURATION as part of C17431 to handle validation_preview_purge_head_start. Anyone using the old yaml and value <0 will have it translated into 0 seconds. New yaml and format prohibits values less than 0 and the smallest unit for this parameter is seconds. - BYTES_CUSTOM_DATA_STORAGE translates “-1” to “null” and prohibits anything less than -1 with the old yaml and less than 0 with the new one for native_transport_max_concurrent_requests_in_bytes_per_ip and native_transport_max_concurrent_requests_in_bytes after C17341. I had also separate mail for these two as we have some concerns about them for all C* versions. - In C17431 I also decided to leave parameters phi_convict_threshold, memtable_cleanup_threshold, block_for_peers_timeout_in_secs alone and not to migrate them to the new Config types because of various reasons explained in the ticket. Please let me know if you have any particular suggestions/concerns/thoughts around those. - DataRateSpec - it was requested to support mebibytes/s, kibibytes/s, bytes/s. This made things complicated internally with two of the parameters introduced in 4.0 which were in megabits/s and I had to keep the old behavior and them being able to still support megabits/s with the old format. As the conversion between megabits and mebibytes is not really a whole number, I had to store the DataRateSpec in double and not long as in the other types in order to ensure accurate precision, etc. We considered this being fine because the RateLimiter uses double and we still allow only whole numbers to be provided by the users. Please let me know if you see any problem. A quick unit test where you can validate how those work is SetGetInterDCStreamThroughputTest. Important to mention that once people move to the new format, they can’t assign them anything less than 1MiB/s. - Regarding DataRateSpec - the two new parameters ( entire_sstable_stream_throughput_outbound_megabits_per_sec and entire_sstable_inter_dc_stream_throughput_outbound_megabits_per_sec) still not being in release were changed to be in MiB/s. Checked with Francisoco in Slack - that should be fine. - I would also like to ask Stefan and Paulo to check if they do not agree with anything in the latest version of DurationSpec as they have introduced the initial version of Duration in the codebase which evolved into DurationSpec in C15234. Please let me know if there is anything else aside from the tests you introduced that I might be missing and it changes your intentions. Already sent a msg in Slack in case they miss the email. - Also, it was discussed on the ticket but I wanted to add it again for consideration - virtual tables currently show both the old names and format and new names and format. There are only three parameters which change only type but not names. Those will be listed only with the new format value. This was not considered as enough reason to bump to 5.0 because of breaking change but I wanted one more time everyone to be aware of that agreement. - In addition, I will add one more note to NEWS.txt for people to raise awareness to check carefully their config on upgrade that there are no breaking changes. We already mention the new option and the backward compatibility but I think it won’t hurt to stress more on the part we prohibit negative values considering it being a bug and have only a few special cases around -1 that we convert. - Please check the matching patterns we use to accept config values in DurationSpec, DataRateSpec and DataStorageSpec and let me know if you think there was some special case missed or anything. - One more topic I want to mention is changes to public classes. There were some name changes (not to JMX) and out of this work I’ve seen such changes even in patch releases. I know we give a promise not to change only interfaces but I still want to mention it here and to ensure we are aligned. As I said, Config changes the types of our parameters. We use annotations and the mentioned converters for backward compatibility of our yaml file. - New JMX methods are not added and the old ones still work by converting to the base unit of the specific parameter. By base I mean the internally used one. The agreement was that in the future virtual tables should handle the new format when we add the update option for SettingsTable. - In order not to run into precision issues, we introduced the Smallest possible unit for certain parameters of type Duration and DataStorage, that can be changed if we decide to migrate any of those in time to the smallest possible unit internally. I am also adding their Int equivalents now to improve the former int parameters handling. - We didn’t find any parameters with suffix _mb or _kb that internally didn’t mean kibi- or mebi- but kilo- or mega-. So all default values are the same in the new format. For more details - please refer to the write up doc I shared before here - https://cassandra.apache.org/doc/trunk/cassandra/new/configuration.html There is important note and ticket around parameters overloading which turned to be undocumented behavior in the project that people use. - Last but not least - I want to remind you that any new parameters are added with the new types. Users will have to set them with their new format. So whoever wants to use old yaml and wants to update default values of new parameters will have to do this in the new format. This email came rather long but I truly believe it is important and I want to ask you if there were parameters you think they might have been affected, to double check them and raise the flag if you are not sure or worried about something done or not done. [1] https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/GroupCommitLogService.java#L31 Ekaterina Dimitrova