[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

Michael McCandless (JIRA) Sun, 18 Jan 2009 08:37:20 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664979#action_12664979
 ]


Michael McCandless commented on LUCENE-1483:
--------------------------------------------


bq. Even still, you are seeing like a 40% diff, but small enough times to not matter.

Right, good point.

I think the massive slowness of iterating through all terms & docs
from a MultiTermEnum/MultiTermDocs may come from asking the other N-1
SegmentReaders to seek to a term that does not exist for them.

I.e., when we ask MultiTermDocs to seek to a unique title X, only the
particular segment that title X comes from actually has it. Each of
the others does a costly seek to the index term just before it, scans
forward looking for the non-existent term, and then repeats all of
that for the next title, etc.
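
To make the cost concrete, here is a toy model of that access pattern.
It is only a sketch and not Lucene code: ToySegment and count_seeks are
hypothetical names, each segment is assumed to hold its own set of
unique titles, and every lookup is counted as one seek whether or not
the term exists in that segment.

```python
from bisect import bisect_left


class ToySegment:
    """Hypothetical stand-in for one segment's sorted term dictionary."""

    def __init__(self, terms):
        self.terms = sorted(terms)
        self.seeks = 0

    def seek(self, term):
        # Every call counts as one seek, whether or not the term exists here.
        self.seeks += 1
        i = bisect_left(self.terms, term)
        return i < len(self.terms) and self.terms[i] == term


def count_seeks(num_segments, titles_per_segment):
    # Each title is unique and lives in exactly one segment.
    segments = [
        ToySegment(f"title-{s}-{t}" for t in range(titles_per_segment))
        for s in range(num_segments)
    ]
    all_titles = sorted(t for seg in segments for t in seg.terms)

    # Multi-reader style: every segment is asked about every title.
    hits = 0
    for title in all_titles:
        for seg in segments:
            hits += seg.seek(title)
    multi_seeks = sum(seg.seeks for seg in segments)

    # Per-segment style: each segment only visits the titles it holds.
    per_segment_seeks = sum(len(seg.terms) for seg in segments)
    return multi_seeks, per_segment_seeks, hits
```

In this model the multi-reader walk issues N seeks per title, of which
only one finds anything, while the per-segment walk issues one. The
model ignores the scan that follows each failed seek, so it likely
understates the real gap.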

In fact this probably causes the underlying buffer in
BufferedIndexInput to get reloaded many times whenever we cross a
boundary (i.e., we keep flipping between buffer N and N+1, then back
to N, then N+1 again, etc.) -- maybe that's the source of the massive
slowness?
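
The buffer flipping can be pictured with another toy model. This is a
sketch under stated assumptions, not the actual Lucene implementation:
ToyBufferedInput is a hypothetical reader that holds a single
fixed-size buffer and refills it whenever a read lands outside the
block currently loaded. The positions and buffer size are made up.

```python
class ToyBufferedInput:
    """Hypothetical single-buffer reader; not Lucene's implementation."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.loaded_block = None
        self.refills = 0

    def read_at(self, pos):
        # A read outside the currently loaded block forces a refill.
        block = pos // self.buffer_size
        if block != self.loaded_block:
            self.loaded_block = block
            self.refills += 1


def refills_for(positions, buffer_size=1024):
    reader = ToyBufferedInput(buffer_size)
    for pos in positions:
        reader.read_at(pos)
    return reader.refills


# Reads that alternate across a buffer boundary (block 0, block 1, ...).
straddling = [1020, 1030] * 50
# The same reads, reordered so each block is visited only once.
ordered = sorted(straddling)
```

Comparing refills_for(straddling) with refills_for(ordered) shows the
effect: the same set of reads costs one refill per read when it
alternates across the boundary, and only one refill per block when it
does not.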

BTW I think this change may also speed up RangeQuery and PrefixQuery.


> Change IndexSearcher multisegment searches to search each individual segment 
> using a single HitCollector
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483-partial.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> sortBench.py, sortCollate.py
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

