[ https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713524#comment-17713524 ]
Claude Warren edited comment on CASSANDRA-12937 at 4/18/23 10:09 AM:
---------------------------------------------------------------------

hints_compression and commitlog_compression use the standard ParameterizedClass. CompressionParams has 3 parameters that it extracts or creates from the parameters in the ParameterizedClass. The parameters in CompressionParams are

{code:java}
private final int chunkLength;
private final int maxCompressedLength;  // In content we store max length to avoid rounding errors causing compress/decompress mismatch.
private final double minCompressRatio;  // In configuration we store min ratio, the input parameter.
{code}

The ParameterizedClass constructor that accepts the Map<String,String> of options expects a key of "chunk_length_in_kb" or "chunk_length_kb" as well as a "min_compress_ratio".

The change I made does not alter the hints_compression or commitlog_compression options.

The yaml file has an additional set of requirements:
* The chunkLength (yaml: chunk_length) should be specified with the DataStorageSpec suffix (e.g. KiB).
* The maxCompressedLength should be accepted as a parameter.
* The maxCompressedLength (yaml: max_compressed_length) should be specified with the DataStorageSpec suffix (e.g. KiB).
* maxCompressedLength and minCompressRatio are related to each other via chunk_length, so only one of them can be specified.

I could work chunkLength and maxCompressedLength into the class_name parameters; however, I believe this will result in adding 2 more reserved words, both of which will need to be removed from the parameter list.

This change will affect all CompressionParams constructions that use the Map<String,String> format. I will make the change with the following rules for handling colliding values:
* If both max_compressed_length and min_compress_ratio are specified, a ConfigurationException will be thrown.
* If chunk_length and either chunk_length_in_kb or chunk_length_kb are both specified and they are not equal, a ConfigurationException will be thrown.
* If chunk_length or max_compressed_length is specified and does not use the DataStorageSpec suffix, a ConfigurationException will be thrown.

I will also ensure that the short names lz4, none, noop, snappy, deflate, and zstd work as class names and use the defaults specified by the CompressionParams methods of the same names.

> Default setting (yaml) for SSTable compression
> ----------------------------------------------
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Michael Semb Wever
> Assignee: Claude Warren
> Priority: Low
> Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable
> compression that new tables will inherit (instead of the defaults found in
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly
> compression (btrfs, zfs) or specific disk configurations or even specific C*
> versions (see CASSANDRA-10995).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying
> the fields required for defining the default compression parameters. In
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for
> the default compression.
> This field should be initialized in
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where
> {{CompressionParams.DEFAULT}} was used, the code should call
> {{DatabaseDescriptor#getDefaultCompressionParams}}, which should return a
> copy of the configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should be used to verify
> that the table schema uses the new default when a new table is created (see
> CreateTest for an example).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
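
As an illustration of the three collision rules listed in the comment above, here is a minimal, self-contained sketch in Java. It is not the actual patch: the class {{CompressionOptionCheck}}, its methods {{toBytes}} and {{validate}}, and the local {{ConfigurationException}} are hypothetical stand-ins so the example compiles without Cassandra on the classpath, and the size parser is a simplified assumption standing in for DataStorageSpec (it only accepts B, KiB, MiB and GiB). The option keys are the ones named in the comment.

```java
import java.util.Map;

/** Hypothetical stand-in for Cassandra's ConfigurationException, so the sketch compiles alone. */
class ConfigurationException extends RuntimeException {
    ConfigurationException(String message) { super(message); }
}

/** Sketch of the option collision checks described in the comment; not the actual patch. */
final class CompressionOptionCheck {
    private CompressionOptionCheck() {}

    /**
     * Simplified size parser (an assumption standing in for DataStorageSpec):
     * accepts a non-negative integer followed by B, KiB, MiB or GiB and returns bytes.
     * A value without a suffix is rejected, which covers the third rule.
     */
    static long toBytes(String key, String value) {
        String v = value.trim();
        int i = 0;
        while (i < v.length() && Character.isDigit(v.charAt(i))) i++;
        String digits = v.substring(0, i);
        String unit = v.substring(i).trim();
        if (digits.isEmpty() || unit.isEmpty())
            throw new ConfigurationException(key + " needs a number and a size suffix such as KiB: " + value);
        long n;
        try {
            n = Long.parseLong(digits);
        } catch (NumberFormatException e) {
            throw new ConfigurationException(key + " is out of range: " + value);
        }
        switch (unit) {
            case "B":   return n;
            case "KiB": return n * 1024L;
            case "MiB": return n * 1024L * 1024L;
            case "GiB": return n * 1024L * 1024L * 1024L;
            default:
                throw new ConfigurationException(key + " has an unknown size suffix: " + value);
        }
    }

    /** Applies the three collision rules to a Map<String,String> of options. */
    static void validate(Map<String, String> options) {
        // Rule 1: max_compressed_length and min_compress_ratio are mutually exclusive.
        if (options.containsKey("max_compressed_length") && options.containsKey("min_compress_ratio"))
            throw new ConfigurationException("Specify only one of max_compressed_length and min_compress_ratio");

        // Rule 3: the new keys must carry a size suffix (enforced by toBytes).
        Long chunkBytes = null;
        if (options.containsKey("chunk_length"))
            chunkBytes = toBytes("chunk_length", options.get("chunk_length"));
        if (options.containsKey("max_compressed_length"))
            toBytes("max_compressed_length", options.get("max_compressed_length"));

        // Rule 2: a legacy key given alongside chunk_length must describe the same size.
        for (String legacyKey : new String[] { "chunk_length_in_kb", "chunk_length_kb" }) {
            String legacy = options.get(legacyKey);
            if (chunkBytes == null || legacy == null) continue;
            long legacyBytes;
            try {
                legacyBytes = Long.parseLong(legacy.trim()) * 1024L; // legacy values are plain KiB counts
            } catch (NumberFormatException e) {
                throw new ConfigurationException(legacyKey + " must be an integer: " + legacy);
            }
            if (legacyBytes != chunkBytes)
                throw new ConfigurationException("chunk_length and " + legacyKey + " disagree");
        }
    }
}
```

With this sketch, an options map containing chunk_length=16KiB and chunk_length_in_kb=16 passes validation, while chunk_length=16KiB with chunk_length_in_kb=32, or a bare chunk_length=16 with no suffix, is rejected with a ConfigurationException.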