[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-05-16 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14654:
---
Fix Version/s: (was: 4.x)
   4.0

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.0
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-05-15 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Status: Resolved  (was: Ready to Commit)
Resolution: Fixed

committed 
[7df67eff2d66dba4bed2b4f6aeabf05144d9b057|https://github.com/apache/cassandra/commit/7df67eff2d66dba4bed2b4f6aeabf05144d9b057]

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-26 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Status: Ready to Commit  (was: Review In Progress)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-26 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Status: Review In Progress  (was: Changes Suggested)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-15 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Platform:   (was: Java7)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-15 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Platform: Java7

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-08 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Test and Documentation Plan: TBD
 Status: Patch Available  (was: Open)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-08 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Status: Review In Progress  (was: Patch Available)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-08 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Status: Changes Suggested  (was: Review In Progress)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-08 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Status: Open  (was: Ready to Commit)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-03 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14654:
-
Reviewers: Benedict, Dinesh Joshi  (was: Dinesh Joshi, Marcus Eriksson)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-02 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14654:
-
Status: Ready to Commit  (was: Review In Progress)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-02 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14654:
-
Status: Review In Progress  (was: Patch Available)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-11-17 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14654:
-
Status: Patch Available  (was: Open)

Marking "Patch Available" on behalf of [~cnlwsu]: 
https://github.com/clohfink/cassandra/tree/compaction_allocs

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-10-24 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14654:
-
Reviewers: Dinesh Joshi, Marcus Eriksson  (was: Marcus Eriksson)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-10-11 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14654:

Reviewers: Marcus Eriksson

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Attachment: screenshot-4.png

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Attachment: screenshot-3.png

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Attachment: screenshot-2.png

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14654:
--
Attachment: screenshot-1.png

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-14654:
---
Labels: Performance pull-request-available  (was: Performance)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14654:
---
Labels: Performance  (was: )

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance
> Fix For: 4.x
>
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14654:
---
Fix Version/s: 4.x

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance
> Fix For: 4.x
>
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14654:
-
Component/s: Compaction

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>
> Small partition compactions are painfully slow with a lot of overhead per 
> partition. There also tends to be an excess of objects created (ie 
> 200-700mb/s) per compaction thread.
> The EncoderStats walks through all the partitions and with mergeWith it will 
> create a new one per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600byte partitions and a couple 100mb 
> of data this consumed ~16% of the heap pressure. Changing this to instead 
> mutably track the min values and create one in a EncodingStats.Collector 
> brought this down considerably (but not 100% since the 
> UnfilteredRowIterator.stats() still creates 1 per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This is the dominating heap 
> pressure as there are more sstables. By changing this to just keeping the 
> original it completely eliminates the current dominator of the compactions 
> and also improves read performance.
> Minor tweak included for this as well for operators when compactions are 
> behind on low read clusters is to make the preemptive opening setting a 
> hotprop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org