[ https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021552#comment-17021552 ]
Benedict Elliott Smith commented on CASSANDRA-15392:
----------------------------------------------------

bq. In cases where there are that many sstables to merge the node is usually also under significant gc pressure, so it would not be a great time to abandon this optimization.

Perhaps the size limit should be larger, and an intrusive linked-list stack is certainly a good data structure to use here. But we probably want to impose _some_ coarse size limits? Though you make a very good point that three is much too few in this case.

But otherwise we have the problem that infrequent storms of sstable creation slowly cause threads to accumulate "huge" pools - and we can have a lot of these threads, since read-parallelism is gated by the number of threads. It's far from crazy to have a system with > 200 threads, and if they each managed to hold on to 1k iterators at a time (noting we need iterators both for the partitions, their rows and any complex columns, so this seems readily achievable), we'd be looking at 50-100MiB+ of heap utilised by these pools.

> Pool Merge Iterators
> --------------------
>
>                 Key: CASSANDRA-15392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> By pooling merge iterators, instead of creating new ones each time we need
> them, we can reduce garbage on the compaction and read paths under relevant
> workloads by ~4% in many cases.
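
For illustration only, the size-capped per-thread pool discussed in the comment above might look roughly like the sketch below. It is not taken from the CASSANDRA-15392 patch: the names BoundedPool, Node, PooledIterator, PER_THREAD and worstCaseBytes, and the cap of 64, are all hypothetical. The per-iterator sizes passed to worstCaseBytes are likewise an assumption, back-derived from the comment's figures (200 threads x 1k iterators landing at 50-100MiB implies something on the order of 256-512 bytes each).

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch: a size-capped pool backed by an intrusive
 * linked-list stack. Intended for single-threaded (per-thread) use.
 */
final class BoundedPool<T extends BoundedPool.Node<T>> {

    /** Pooled objects carry their own link, so the stack allocates no wrapper nodes. */
    abstract static class Node<T> {
        T next;
    }

    /** Stand-in for a merge iterator; the real class would hold iterator state. */
    static final class PooledIterator extends Node<PooledIterator> {
    }

    /** One pool per thread; the cap of 64 is an arbitrary illustrative value. */
    static final ThreadLocal<BoundedPool<PooledIterator>> PER_THREAD =
        ThreadLocal.withInitial(() -> new BoundedPool<PooledIterator>(64, PooledIterator::new));

    private final int maxSize;
    private final Supplier<T> factory;
    private T head;   // top of the stack, or null when the pool is empty
    private int size; // number of objects currently retained

    BoundedPool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    /** Pops a retained object, or allocates a new one if the pool is empty. */
    T acquire() {
        T item = head;
        if (item == null)
            return factory.get();
        head = item.next;
        item.next = null;
        size--;
        return item;
    }

    /**
     * Pushes an object back unless the cap has been reached, in which case it
     * is dropped for the GC to reclaim. Callers must not release the same
     * object twice. Returns true if the object was retained.
     */
    boolean release(T item) {
        if (size >= maxSize)
            return false;
        item.next = head;
        head = item;
        size++;
        return true;
    }

    int size() {
        return size;
    }

    /** Worst-case heap retained if every thread fills an uncapped pool of this size. */
    static long worstCaseBytes(int threads, int iteratorsPerThread, int bytesPerIterator) {
        return (long) threads * iteratorsPerThread * bytesPerIterator;
    }
}
```

With a cap in place, a storm of sstable creation can only leave maxSize objects behind per thread; anything beyond that is simply dropped on release. Under the assumed sizes, worstCaseBytes(200, 1000, 256) is 51,200,000 bytes (about 49MiB) and worstCaseBytes(200, 1000, 512) is 102,400,000 bytes (about 98MiB), which is consistent with the 50-100MiB range quoted above.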