[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-22 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15379:
-
Status: Ready to Commit  (was: Changes Suggested)

Thank you for quantifying performance and clearly demonstrating the benefits. I 
am +1 on these changes.

> Make it possible to flush with a different compression strategy than we 
> compact with
> 
>
> Key: CASSANDRA-15379
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15379
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction, Local/Config, Local/Memtable
> Reporter: Joey Lynch
> Assignee: Joey Lynch
> Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: 15379_backfill_drops_zstd_level10.png, 
> 15379_backfill_duration_zstd_level10.png, 
> 15379_backfill_queueing_zstd_level10.png, 15379_backfill_zstd_level10.png, 
> 15379_baseline_flush_trace.png, 15379_candidate_flush_trace.png, 
> 15379_concurrent_flushes_zstd_level10.png, 15379_coordinator_defaults.png, 
> 15379_coordinator_zstd_defaults.png, 15379_coordinator_zstd_level10.png, 
> 15379_flush_flamegraph_zstd_level10.png, 
> 15379_message_drops_zstd_level10.png, 15379_replica_defaults.png, 
> 15379_replica_zstd_defaults.png, 15379_request_queueing_zstd_level10.png, 
> 15379_system_defaults.png, 15379_system_zstd_defaults.png
>
>
> [~josnyder] and I have been testing CASSANDRA-14482 (Zstd compression) on
> some of our densest clusters and have observed close to a 50% reduction in
> footprint with Zstd on some of our workloads! Unfortunately, we have been
> running into an issue where a flush can take so long (Zstd is slower to
> compress than LZ4) that it blocks the next flush and causes instability.
> Internally we are working around this with a very simple patch that flushes
> SSTables with the default compression strategy (LZ4) regardless of the table
> params. That is a simple solution, but I think the ideal solution might be
> for the flush compression strategy to be configurable separately from the
> table compression strategy (while defaulting to the same thing). Instead of
> adding yet another compression option to the yaml (like hints and commitlog),
> I was thinking of just adding it to the table parameters and then adding a
> {{default_table_parameters}} yaml option like:
> {noformat}
> # Default table properties to apply on freshly created tables. The currently
> # supported defaults are:
> # * compression       : How are SSTables compressed in general (flush,
> #                       compaction, etc ...)
> # * flush_compression : How are SSTables compressed as they flush
> default_table_parameters:
>   compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 16
>   flush_compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 4
> {noformat}
> This would also have the nice effect of giving our configuration a path
> forward to providing user-specified defaults for table creation (e.g. a
> user who wanted a different default chunk_length_in_kb could set one).
> So the proposed (~mandatory) scope is:
> * Flush with a faster compression strategy
> I'd like to implement the following at the same time:
> * Per-table flush compression configuration
> * Ability to default the table flush and compaction compression in the yaml.
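
The selection rule described in the issue (use a separate flush compression when one is configured, otherwise fall back to the table compression) can be sketched roughly as below. This is only an illustration under assumptions: CompressionSpec, WriteKind and select are hypothetical names invented for this sketch and are not taken from the Cassandra code base or from the patch.

```java
import java.util.Optional;

/** Hypothetical holder for one compression setting; not Cassandra's real class. */
final class CompressionSpec {
    final String className;     // e.g. "LZ4Compressor"
    final int chunkLengthInKb;  // e.g. 16

    CompressionSpec(String className, int chunkLengthInKb) {
        this.className = className;
        this.chunkLengthInKb = chunkLengthInKb;
    }

    @Override
    public String toString() {
        return className + " (chunk_length_in_kb=" + chunkLengthInKb + ")";
    }
}

final class FlushCompressionSketch {
    /** Why an SSTable is being written (hypothetical enum for this sketch). */
    enum WriteKind { FLUSH, COMPACTION }

    /**
     * Flushes use the flush-specific setting when present and default to the
     * table setting otherwise; every other write uses the table setting.
     */
    static CompressionSpec select(WriteKind kind,
                                  CompressionSpec tableCompression,
                                  Optional<CompressionSpec> flushCompression) {
        if (kind == WriteKind.FLUSH) {
            return flushCompression.orElse(tableCompression);
        }
        return tableCompression;
    }

    public static void main(String[] args) {
        CompressionSpec zstd = new CompressionSpec("ZstdCompressor", 16);
        CompressionSpec lz4 = new CompressionSpec("LZ4Compressor", 4);
        System.out.println("flush:       " + select(WriteKind.FLUSH, zstd, Optional.of(lz4)));
        System.out.println("compaction:  " + select(WriteKind.COMPACTION, zstd, Optional.of(lz4)));
        System.out.println("no override: " + select(WriteKind.FLUSH, zstd, Optional.empty()));
    }
}
```

With a Zstd table and an LZ4 flush setting, flushes would be written with LZ4 and later rewritten with Zstd by compaction, which is the behaviour the issue asks for.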



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-21 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_backfill_zstd_level10.png
15379_backfill_queueing_zstd_level10.png
15379_backfill_drops_zstd_level10.png
15379_backfill_duration_zstd_level10.png
15379_concurrent_flushes_zstd_level10.png
15379_flush_flamegraph_zstd_level10.png
15379_coordinator_zstd_level10.png
15379_message_drops_zstd_level10.png
15379_request_queueing_zstd_level10.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-20 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_replica_zstd_defaults.png
15379_coordinator_zstd_defaults.png
15379_system_zstd_defaults.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-20 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_baseline_flush_trace.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-20 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_candidate_flush_trace.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-19 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_system_defaults.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-19 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_replica_defaults.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-04-19 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Attachment: 15379_coordinator_defaults.png




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-01-23 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15379:
-
Reviewers: Dinesh Joshi, Dinesh Joshi  (was: Dinesh Joshi)
   Status: Review In Progress  (was: Patch Available)




[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2020-01-23 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15379:
-
Status: Changes Suggested  (was: Review In Progress)

Hi [~jolynch], thanks for the patch. I went over it and it looks generally 
good. At a high level, my only concern is that introducing a 
{{NoOpCompressor}} may lead to a performance regression compared to our current 
state. This is mainly due to the Java JIT's inability to optimize megamorphic 
call sites. However, this is just a theory and we should try to validate it 
with an actual performance test. IMHO, the advantages that you have laid out 
would outweigh a small performance penalty.

Other than that, I had some code-related feedback. It fixes the 
{{DatabaseDescriptorRefTest}} and also makes minor structural modifications for 
safety and clarity. I have illustrated it in my branch 
[here|https://github.com/apache/cassandra/compare/trunk...dineshjoshi:CASSANDRA-15379-review?expand=1].
 Please feel free to cherry-pick the commits into your branch.
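
To make the megamorphic call site concern concrete, here is a small sketch, not taken from the patch. ByteTransformer and its three implementations are hypothetical stand-ins that do no real compression; Cassandra's actual compressor interface is richer. As far as I know, HotSpot can usually inline a call site that has seen one or two receiver types, and generally falls back to ordinary interface dispatch once a third type shows up, which is the situation an extra compressor implementation could create.

```java
final class CallSiteSketch {
    /** Hypothetical minimal interface standing in for a compressor. */
    interface ByteTransformer {
        int transformedLength(byte[] input);
    }

    /** Stand-in for a no-op compressor: output is as long as the input. */
    static final class NoOp implements ByteTransformer {
        public int transformedLength(byte[] input) { return input.length; }
    }

    /** Fake "compressor" that pretends to halve the input (rounding up). */
    static final class Halving implements ByteTransformer {
        public int transformedLength(byte[] input) { return (input.length + 1) / 2; }
    }

    /** Fake "compressor" that pretends to quarter the input (rounding up). */
    static final class Quartering implements ByteTransformer {
        public int transformedLength(byte[] input) { return (input.length + 3) / 4; }
    }

    /**
     * The call to transformedLength below is a single call site. When the array
     * holds three different implementations, that one site observes three
     * receiver types, which is what "megamorphic" refers to.
     */
    static long totalLength(ByteTransformer[] transformers, byte[] chunk, int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            total += transformers[i % transformers.length].transformedLength(chunk);
        }
        return total;
    }

    public static void main(String[] args) {
        ByteTransformer[] mixed = { new NoOp(), new Halving(), new Quartering() };
        System.out.println(totalLength(mixed, new byte[16], 6));
    }
}
```

This only shows the shape of the problem. Whether it costs anything measurable would need a proper benchmark harness (JMH, for example) run against the real compressors, as suggested above.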

> Make it possible to flush with a different compression strategy than we 
> compact with
> 
>
> Key: CASSANDRA-15379
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15379
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Local/Config, Local/Memtable
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> [~josnyder] and I have been testing out CASSANDRA-14482 (Zstd compression) on 
> some of our most dense clusters and have been observing close to 50% 
> reduction in footprint with Zstd on some of our workloads! Unfortunately 
> though we have been running into an issue where the flush might take so long 
> (Zstd is slower to compress than LZ4) that we can actually block the next 
> flush and cause instability.
> Internally we are working around this with a very simple patch which flushes
> SSTables with the default compression strategy (LZ4) regardless of the table
> params. This is a simple solution, but I think the ideal solution might be
> for the flush compression strategy to be configurable separately from the
> table compression strategy (while defaulting to the same thing). Instead of
> adding yet another compression option to the yaml (like hints and commitlog),
> I was thinking of just adding it to the table parameters and then adding a
> {{default_table_parameters}} yaml option like:
> {noformat}
> # Default table properties to apply on freshly created tables. The currently
> # supported defaults are:
> # * compression       : How are SSTables compressed in general (flush,
> #                       compaction, etc ...)
> # * flush_compression : How are SSTables compressed as they flush
> # supported
> default_table_parameters:
>   compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 16
>   flush_compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 4
> {noformat}
> This would also have the nice effect of giving our configuration a path
> forward to providing user-specified defaults for table creation (e.g. a
> user who wanted a different default chunk_length_in_kb could set it
> there).
> So the proposed (~mandatory) scope is:
> * Flush with a faster compression strategy
> I'd like to implement the following at the same time:
> * Per table flush compression configuration
> * Ability to default the table flush and compaction compression in the yaml.
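
The fallback described in the quoted proposal (flush compression configurable separately from the table compression, while defaulting to the same thing) can be sketched roughly as below. This is an illustration under stated assumptions, not the patch itself: {{FlushCompressionSketch}}, {{TableParams}}, {{CompressionParams}}, {{WriteKind}} and {{paramsFor}} are hypothetical names, and only the {{class_name}} and {{chunk_length_in_kb}} settings come from the description.

```java
import java.util.Optional;

// Hypothetical sketch of choosing compression parameters per write path.
final class FlushCompressionSketch {
    enum WriteKind { FLUSH, COMPACTION }

    // Mirrors the class_name / chunk_length_in_kb pair from the proposed yaml.
    static final class CompressionParams {
        final String className;
        final int chunkLengthInKb;

        CompressionParams(String className, int chunkLengthInKb) {
            this.className = className;
            this.chunkLengthInKb = chunkLengthInKb;
        }
    }

    // A table always has a general compression setting; the flush override is optional.
    static final class TableParams {
        final CompressionParams compression;
        final Optional<CompressionParams> flushCompression;

        TableParams(CompressionParams compression, CompressionParams flushCompressionOrNull) {
            this.compression = compression;
            this.flushCompression = Optional.ofNullable(flushCompressionOrNull);
        }
    }

    // Flushes use the override when present; everything else uses the table setting.
    static CompressionParams paramsFor(TableParams table, WriteKind kind) {
        if (kind == WriteKind.FLUSH) {
            return table.flushCompression.orElse(table.compression);
        }
        return table.compression;
    }
}
```

With a table configured for {{ZstdCompressor}} and a flush override of {{LZ4Compressor}}, {{paramsFor(table, WriteKind.FLUSH)}} returns the LZ4 settings while compaction keeps Zstd; a table without an override flushes with its regular compression.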



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2019-12-18 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Test and Documentation Plan: 
Tests: Automated unit tests and manual testing.

Documentation: Included docs for the new options both in the yaml and on the
website.

After review I will do more thorough manual testing and documentation.
 Status: Patch Available  (was: Open)







[jira] [Updated] (CASSANDRA-15379) Make it possible to flush with a different compression strategy than we compact with

2019-11-04 Thread Joey Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Lynch updated CASSANDRA-15379:
---
Change Category: Performance
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.0-alpha
  Reviewers: Dinesh Joshi
 Status: Open  (was: Triage Needed)



