[ https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021455#comment-17021455 ]
Blake Eggleston commented on CASSANDRA-15392:
---------------------------------------------

Thanks for the feedback. I just pushed up some commits addressing most points. Getting rid of linked list queues is a good idea, but I don't think TinyThreadLocalPool will be sufficient in some cases. If {{MergeIterator#get}} is called with more than {{DEFAULT_SIZE}} iterators, we'll nest pooled iterators. In cases where there _are_ that many sstables to merge, the node is usually also under significant gc pressure, so it would not be a great time to abandon this optimization. An intrusive linked stack is the best I could come up with: it's unbounded, doesn't do any allocations, and has very simple logic, which was important for preventing oneToOne performance from regressing too much.

@Jordan, short-circuiting the merge iterator in that case is a really good idea. However, FQL will need a bit of (probably straightforward) rework before {{MergeIterator#get}} can return a closeable iterator, since it relies on having separate input and output types. As this relates to SASI, this wouldn't be a regression, right? I'd guess we're already wasting TrivialOneToOne iterators?

I've also put together a very simple jmh benchmark. After a few iterations of additional optimization, the manyToOne path saw a 15% increase in throughput in most cases, and the oneToOne path saw a 2-6% reduction in most cases.
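For reference, an intrusive linked stack of the kind described above could be sketched roughly like this. This is an illustrative sketch, not Cassandra's actual implementation; the class and field names ({{PoolSketch}}, {{PooledIterator}}) are hypothetical. The key property is that the "next" link lives inside the pooled object itself, so push and pop allocate nothing and the stack has no fixed capacity:

{code:java}
// Hypothetical sketch of an unbounded, allocation-free intrusive linked
// stack used as an object pool (names are illustrative, not Cassandra's).
public class PoolSketch
{
    // A pooled object embeds its own stack link, so no wrapper node
    // needs to be allocated when it is pushed onto the free stack.
    static class PooledIterator
    {
        PooledIterator next; // intrusive link
        int uses;            // example payload
    }

    private PooledIterator head; // top of the free stack

    // Return an object to the pool: O(1), no allocation, unbounded.
    void push(PooledIterator item)
    {
        item.next = head;
        head = item;
    }

    // Take an object from the pool, or null if empty (caller then allocates).
    PooledIterator pop()
    {
        PooledIterator top = head;
        if (top == null)
            return null;
        head = top.next;
        top.next = null; // don't keep stale references reachable
        return top;
    }

    public static void main(String[] args)
    {
        PoolSketch pool = new PoolSketch();
        PooledIterator it = pool.pop();
        if (it == null)
            it = new PooledIterator(); // pool empty: allocate once
        it.uses++;
        pool.push(it);                   // recycle instead of discarding
        PooledIterator again = pool.pop();
        System.out.println(again == it); // the same instance is reused
    }
}
{code}

Unlike a bounded thread-local array pool, nothing here overflows when more than a fixed number of iterators are in flight, which is the scenario the comment above is concerned about.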
Numbers below:
{code:java}
Baseline:
Benchmark                                            Mode       Cnt         Score    Error  Units
MergeIteratorGetBench.manyToOne                    sample  59417991       418.721 ± 22.706  ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.00    sample                 101.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.50    sample                 148.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.90    sample                 171.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.95    sample                 176.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.99    sample                 188.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.999   sample                1456.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.9999  sample                9171.213           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p1.00    sample            38862848.000           ns/op
MergeIteratorGetBench.one                          sample  53514991        83.452 ±  4.005  ns/op
MergeIteratorGetBench.one:one·p0.00                sample                  47.000           ns/op
MergeIteratorGetBench.one:one·p0.50                sample                  73.000           ns/op
MergeIteratorGetBench.one:one·p0.90                sample                  90.000           ns/op
MergeIteratorGetBench.one:one·p0.95                sample                  95.000           ns/op
MergeIteratorGetBench.one:one·p0.99                sample                 103.000           ns/op
MergeIteratorGetBench.one:one·p0.999               sample                 119.000           ns/op
MergeIteratorGetBench.one:one·p0.9999              sample                3830.003           ns/op
MergeIteratorGetBench.one:one·p1.00                sample            21987328.000           ns/op

Pooled:
Benchmark                                            Mode        Cnt        Score    Error  Units
MergeIteratorGetBench.manyToOne                    sample  100352304      133.534 ±  1.662  ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.00    sample                  86.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.50    sample                 128.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.90    sample                 143.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.95    sample                 148.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.99    sample                 157.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.999   sample                 172.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p0.9999  sample                3744.000           ns/op
MergeIteratorGetBench.manyToOne:manyToOne·p1.00    sample            13008896.000           ns/op
MergeIteratorGetBench.one                          sample   97007239       87.340 ±  3.063  ns/op
MergeIteratorGetBench.one:one·p0.00                sample                  51.000           ns/op
MergeIteratorGetBench.one:one·p0.50                sample                  78.000           ns/op
MergeIteratorGetBench.one:one·p0.90                sample                  92.000           ns/op
MergeIteratorGetBench.one:one·p0.95                sample                  97.000           ns/op
MergeIteratorGetBench.one:one·p0.99                sample                 106.000           ns/op
MergeIteratorGetBench.one:one·p0.999               sample                 117.000           ns/op
MergeIteratorGetBench.one:one·p0.9999              sample                3600.000           ns/op
MergeIteratorGetBench.one:one·p1.00                sample            27164672.000           ns/op
{code}

> Pool Merge Iterators
> --------------------
>
>                 Key: CASSANDRA-15392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local/Compaction
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.0
>
>
> By pooling merge iterators, instead of creating new ones each time we need
> them, we can reduce garbage on the compaction and read paths under relevant
> workloads by ~4% in many cases.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)