[ https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966637#comment-16966637 ]
Benedict Elliott Smith commented on CASSANDRA-15379:
----------------------------------------------------

What is your rationale for an {{EnumSet}} being more maintainable than a member function? As far as I understand we explicitly intend to retire this functionality, so planning for future uses seems counterproductive to me.

If we're adding per-table config for this, why are we blanket changing the behaviour for all relevant compressors? This may well be surprising to users, and also seems to make the per-table config superfluous (or at least, only useful to restore the probably-assumed behaviour of using the same compressor for both flush and compaction)

> Make it possible to flush with a different compression strategy than we compact with
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15379
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15379
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction, Local/Config, Local/Memtable
>            Reporter: Joey Lynch
>            Assignee: Joey Lynch
>            Priority: Normal
>
> [~josnyder] and I have been testing out CASSANDRA-14482 (Zstd compression) on
> some of our most dense clusters and have been observing close to 50%
> reduction in footprint with Zstd on some of our workloads! Unfortunately
> though we have been running into an issue where the flush might take so long
> (Zstd is slower to compress than LZ4) that we can actually block the next
> flush and cause instability.
> Internally we are working around this with a very simple patch which flushes
> SSTables as the default compression strategy (LZ4) regardless of the table
> params. This is a simple solution but I think the ideal solution though might
> be for the flush compression strategy to be configurable separately from the
> table compression strategy (while defaulting to the same thing).
> Instead of adding yet another compression option to the yaml (like hints and
> commitlog) I was thinking of just adding it to the table parameters and then
> adding a {{default_table_parameters}} yaml option like:
> {noformat}
> # Default table properties to apply on freshly created tables. The currently supported defaults are:
> # * compression       : How are SSTables compressed in general (flush, compaction, etc ...)
> # * flush_compression : How are SSTables compressed as they flush
> # supported
> default_table_parameters:
>   compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 16
>   flush_compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 4
> {noformat}
> This would have the nice effect as well of giving our configuration a path
> forward to providing user specified defaults for table creation (so e.g. if a
> particular user wanted to use a different default chunk_length_in_kb they can
> do that).
> So the proposed (~mandatory) scope is:
> * Flush with a faster compression strategy
> I'd like to implement the following at the same time:
> * Per table flush compression configuration
> * Ability to default the table flush and compaction compression in the yaml.
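
For readers following the thread, here is a minimal sketch of the fallback the ticket describes, where flush compression is configurable separately but defaults to the table compression. Every type and method name below (FlushCompressionSketch, WriteKind, CompressionSpec, TableParams, select) is hypothetical and made up for illustration; none of it is taken from the Cassandra code base or from the patch under review.

```java
import java.util.Optional;

// Hypothetical sketch only: none of these types are Cassandra classes.
final class FlushCompressionSketch {

    /** Which kind of SSTable write is asking for a compression setting. */
    enum WriteKind { FLUSH, COMPACTION }

    /** Stand-in for one compression block of the proposed yaml. */
    static final class CompressionSpec {
        final String className;
        final int chunkLengthInKb;

        CompressionSpec(String className, int chunkLengthInKb) {
            this.className = className;
            this.chunkLengthInKb = chunkLengthInKb;
        }
    }

    /** Stand-in for table parameters; flush compression may be absent. */
    static final class TableParams {
        final CompressionSpec compression;
        final Optional<CompressionSpec> flushCompression;

        TableParams(CompressionSpec compression, CompressionSpec flushCompressionOrNull) {
            this.compression = compression;
            this.flushCompression = Optional.ofNullable(flushCompressionOrNull);
        }
    }

    /** Flush uses flush_compression when set, otherwise the table compression. */
    static CompressionSpec select(TableParams params, WriteKind kind) {
        if (kind == WriteKind.FLUSH) {
            return params.flushCompression.orElse(params.compression);
        }
        return params.compression;
    }

    public static void main(String[] args) {
        CompressionSpec zstd = new CompressionSpec("ZstdCompressor", 16);
        CompressionSpec lz4 = new CompressionSpec("LZ4Compressor", 4);

        TableParams onlyTable = new TableParams(zstd, null);
        TableParams withFlush = new TableParams(zstd, lz4);

        // No flush override: flush falls back to the table compression.
        System.out.println(select(onlyTable, WriteKind.FLUSH).className);
        // Flush override present: flush and compaction diverge.
        System.out.println(select(withFlush, WriteKind.FLUSH).className);
        System.out.println(select(withFlush, WriteKind.COMPACTION).className);
    }
}
```

Under this sketch a table that sets nothing extra keeps using one compressor for both flush and compaction, which is the "probably-assumed behaviour" raised in the comment above; only an explicit flush setting changes what a flush writes.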