[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658810#action_12658810
 ] 

Michael McCandless commented on LUCENE-1483:
--------------------------------------------

{quote}
> In the case where 3 hits come from a Reader, but you ask for 1000 back, that 
> will run through that loop 1000 times, but you only need to convert 3 right?
{quote}


Well, it's tricky... and indeed all tests pass with the bug (which is
spooky -- I think we need to add cases to TestSort where 1) the index
has many segments, and 2) the number of hits is much greater than the
queue size), but I'm pretty sure it's a bug.

You're right: it'd be nice to only visit the "used" slots in the queue
on advancing to each reader.  During the "startup transient" (when
collector has not yet seen enough hits to fill its queue), the slot
indeed increases one at a time, and you could at that point use it to
efficiently visit only the used slots.

However, after the startup transient, the pqueue will then track the
weakest entry in the queue, which can occur in any of the slots, and
when a hit that beats that weakest entry arrives, it will call copy()
into that slot.

So the slot passed to copy is now a "relatively random" value.  For a
1000 sized queue whose slots are full, you might get a copy into slot
242.  In this case we were incorrectly setting "this.slot" to 242 and
then only converting the first 242 entries.

If we changed it to track the maxSlot it should work... but I'm not
sure this is worthwhile, since it only speeds up already super-fast
searches and slightly hurts slow searches.
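
To make the difference concrete, here is a minimal, self-contained sketch of
the two bookkeeping strategies. It is not the code from the patch: the class
and method names are hypothetical, the queue is reduced to the sequence of
slot numbers passed to copy(), and treating the remembered slot as an
inclusive upper bound is an assumption made for illustration.

```java
// Hypothetical sketch, not the LUCENE-1483 patch. It models only the
// sequence of slot numbers that the priority queue passes to copy().
class SlotTrackingSketch {

    // Buggy bookkeeping: remember only the most recent slot passed to
    // copy(), then convert slots 0..slot when advancing to the next reader.
    // Assumption: the remembered slot is an inclusive upper bound.
    public static int convertedUsingLastSlot(int[] copySlots) {
        int slot = -1;
        for (int s : copySlots) {
            slot = s; // overwritten on every copy, even by a smaller slot
        }
        return slot + 1; // number of slots that would be converted
    }

    // Alternative bookkeeping: remember the highest slot ever written, so
    // every used slot is converted and unused trailing slots are skipped.
    public static int convertedUsingMaxSlot(int[] copySlots) {
        int maxSlot = -1;
        for (int s : copySlots) {
            maxSlot = Math.max(maxSlot, s);
        }
        return maxSlot + 1;
    }

    public static void main(String[] args) {
        int queueSize = 1000;

        // Startup transient fills slots 0..999 in order; then one
        // competitive hit replaces the weakest entry, which sits in slot 242.
        int[] copySlots = new int[queueSize + 1];
        for (int i = 0; i < queueSize; i++) {
            copySlots[i] = i;
        }
        copySlots[queueSize] = 242;

        System.out.println("last slot: " + convertedUsingLastSlot(copySlots));
        System.out.println("max slot:  " + convertedUsingMaxSlot(copySlots));
    }
}
```

With only three hits in a 1000-entry queue (slots 0, 1, 2), both strategies
convert three slots; they diverge only once the queue is full and copy()
starts landing in arbitrary slots.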


> Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, sortBench.py, 
> sortCollate.py
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.
