[ 
https://issues.apache.org/jira/browse/LUCENE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221500#comment-14221500
 ] 

Stefan Pohl commented on LUCENE-6066:
-------------------------------------

I also needed this missing functionality and hence implemented it in the 
specialized heap used within the MinShouldMatchSumScorer (LUCENE-4571). I also 
went for the O\(n\) solution not to have the position-update overhead, but in 
my scoring use-case the removed item is often to be found in the beginning of 
the heap array.

Regarding your patch, please note that it's not enough to sift-down the 
inserted last element, you first have to try bubbling it up in order to deal 
with all possible heap states. A more randomized test case would generate 
plenty of example cases where the heap property otherwise won't be invariant.

It would be nice if there would be only one implementation of heap 
functionality, so that they could centrally be tested thoroughly, but it still 
seems that specialized versions can outperform generic ones (e.g. 
LUCENE-6028)...

> New "remove" method in PriorityQueue
> ------------------------------------
>
>                 Key: LUCENE-6066
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6066
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: LUCENE-PQRemoveV1.patch
>
>
> It would be useful to be able to remove existing elements from a 
> PriorityQueue. 
> The proposal is that a linear scan is performed to find the element being 
> removed and then the end element in heap[size] is swapped into this position 
> to perform the delete. The method downHeap() is then called to shuffle the 
> replacement element back down the array but the existing downHeap method must 
> be modified to allow picking up an entry from any point in the array rather 
> than always assuming the first element (which is its only current mode of 
> operation).
> A working javascript model of the proposal with animation is available here: 
> http://jsfiddle.net/grcmquf2/22/ 
> In tests the modified version of "downHeap" produces the same results as the 
> existing impl but adds the ability to push down from any point.
> An example use case that requires remove is where a client doesn't want more 
> than N matches for any given key (e.g. no more than 5 products from any one 
> retailer in a marketplace). In these circumstances a document that was 
> previously thought of as competitive has to be removed from the final PQ and 
> replaced with another doc (eg a retailer who already has 5 matches in the PQ 
> receives a 6th match which is better than his previous ones). This particular 
> process is managed by a special "DiversifyingPriorityQueue" which wraps the 
> main PriorityQueue and could be contributed as part of another issue if there 
> is interest in that. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to