[ 
https://issues.apache.org/jira/browse/CASSANDRA-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570983#comment-14570983
 ] 

Benedict commented on CASSANDRA-8915:
-------------------------------------

np - they were just that: suggestions. Hopefully anyone who would be confused 
by the aspects that I was will be helped sufficiently by the combination of 
your updates and my comment.

I've finished vetting the code. I really like this implementation: not only 
does it ace the comparison tests, it is very clear to read. Perhaps reading the 
proof made it clearer, but I found it as - or easier - to read than the 
non-optimised variant. 

I've pushed some small comments and one minor "cleanup" (merging two 
near-identical paths in the replaceAndSink method). I've also applied my 
earlier updates to MergeIteratorComparisonTest. I ran these against the other 
two versions, and found it to meet or beat on every test (or be close enough 
that it didn't matter), which matches with expectation.  I'm happy to merge 
once we squash and get clean CI runs.

> Improve MergeIterator performance
> ---------------------------------
>
>                 Key: CASSANDRA-8915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8915
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: compaction, performance
>             Fix For: 3.x
>
>
> The implementation of {{MergeIterator}} uses a priority queue and applies a 
> pair of {{poll}}+{{add}} operations for every item in the resulting sequence. 
> This is quite inefficient as {{poll}} necessarily applies at least {{log N}} 
> comparisons (up to {{2log N}}), and {{add}} often requires another {{log N}}, 
> for example in the case where the inputs largely don't overlap (where {{N}} 
> is the number of iterators being merged).
> This can easily be replaced with a simple custom structure that can perform 
> replacement of the top of the queue in a single step, which will very often 
> complete after a couple of comparisons and in the worst case scenarios will 
> match the complexity of the current implementation.
> This should significantly improve merge performance for iterators with 
> limited overlap (e.g. levelled compaction).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to