[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16585140#comment-16585140 ]
Benedict commented on CASSANDRA-14654: -------------------------------------- I've only had a quick glance at the patch, but a couple of problems leap out at me for your KeyCacheKey changes: # A {{ByteBuffer}} is much larger in memory than a {{byte[]}} ## This hasn't been accounted for, so we could considerably overcommit memory to the key cache ## This reduces the efficacy of the key cache by permitting it to store fewer entries # A {{DecoratedKey}}'s {{ByteBuffer}} might reference a {{byte[]}} or native memory that is managed collectively, and as such either expire in a manner that leaks memory or corrupts the contents of the key cache. We probably currently wouldn't fall afoul of this, but the change seems to permit too easy misuse. > Reduce heap pressure during compactions > --------------------------------------- > > Key: CASSANDRA-14654 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14654 > Project: Cassandra > Issue Type: Improvement > Components: Compaction > Reporter: Chris Lohfink > Assignee: Chris Lohfink > Priority: Major > Labels: Performance, pull-request-available > Fix For: 4.x > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png > > Time Spent: 10m > Remaining Estimate: 0h > > Small partition compactions are painfully slow with a lot of overhead per > partition. There also tends to be an excess of objects created (ie > 200-700mb/s) per compaction thread. > The EncoderStats walks through all the partitions and with mergeWith it will > create a new one per partition as it walks the potentially millions of > partitions. In a test scenario of about 600byte partitions and a couple 100mb > of data this consumed ~16% of the heap pressure. Changing this to instead > mutably track the min values and create one in a EncodingStats.Collector > brought this down considerably (but not 100% since the > UnfilteredRowIterator.stats() still creates 1 per partition). > The KeyCacheKey makes a full copy of the underlying byte array in > ByteBufferUtil.getArray in its constructor. This is the dominating heap > pressure as there are more sstables. By changing this to just keeping the > original it completely eliminates the current dominator of the compactions > and also improves read performance. > Minor tweak included for this as well for operators when compactions are > behind on low read clusters is to make the preemptive opening setting a > hotprop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org