[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

Michael McCandless (JIRA) Tue, 06 Jan 2009 06:17:11 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661149#action_12661149
 ]


Michael McCandless commented on LUCENE-1483:
--------------------------------------------

On which ComparatorPolicy to use by default: I think we should start
with ORD, but count the number of compares vs. the number of copies,
and based on those counters (relative to numDocs()) decide how
aggressively to switch comparators.  That determination should also
take the queue size into account.
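
To make the counter idea concrete, here is a minimal sketch of one
possible decision rule.  It is not part of any patch on this issue: the
class name, the method signature, the CONVERSION_WEIGHT constant and the
rule itself are all assumptions made for illustration.  It extrapolates
the compares and copies seen so far to the next segment and weighs that
against an assumed per-slot cost of converting the queue.

```java
// Hypothetical helper, not part of Lucene: all names and constants are assumptions.
final class PolicyChooser {
    enum Policy { ORD, VAL }

    // Assumed relative cost of remapping one queue slot at a segment switch,
    // measured in compare/copy operations. Would need tuning against benchmarks.
    static final double CONVERSION_WEIGHT = 4.0;

    /**
     * Picks the comparator for the next segment from the counters gathered so far.
     *
     * @param compares        comparisons counted so far
     * @param copies          copies into the queue counted so far
     * @param docsSeen        numDocs() summed over the segments already searched
     * @param nextSegmentDocs numDocs() of the segment about to be searched
     * @param queueSize       number of slots in the priority queue
     */
    static Policy choose(long compares, long copies, int docsSeen,
                         int nextSegmentDocs, int queueSize) {
        if (docsSeen == 0) {
            return Policy.ORD; // no counters yet: start with ORD
        }
        // Extrapolate the work seen so far to the next segment.
        double opsPerDoc = (double) (compares + copies) / docsSeen;
        double expectedOps = opsPerDoc * nextSegmentDocs;
        // A larger queue makes every comparator switch more expensive.
        double conversionCost = queueSize * CONVERSION_WEIGHT;
        return expectedOps >= conversionCost ? Policy.ORD : Policy.VAL;
    }
}
```

With these assumed numbers, a search that has done almost no compares
or copies over many documents falls back to VAL for the next segment,
while a busy one stays on ORD.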

An optimized index has a single segment, so it would always use ORD
(without gathering counters), which is fastest there.

In the future, we could imagine allowing the query to dictate the
order in which segments are visited.  E.g., if the query can roughly
estimate how many hits it'll get on a given segment, we could order by
that instead of simply by numDocs().

The query could also choose an appropriate ComparatorPolicy, e.g., if
it estimates it'll get very few hits, VAL is best right from the start;
otherwise start with ORD.
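
The two query-driven ideas above could look roughly like the sketch
below.  Everything in it is hypothetical: Lucene has no SegmentPlanner
class, the estimatedHits field stands in for whatever estimate a query
might provide, visiting the segment with the most expected hits first
is an assumed ordering, and "very few hits" is arbitrarily taken to mean
no more than FEW_HITS_FACTOR times the queue size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Lucene API: names, fields and thresholds are assumptions.
final class SegmentPlanner {
    enum Policy { ORD, VAL }

    // Assumed meaning of "very few hits": at most this many times the queue size.
    static final int FEW_HITS_FACTOR = 2;

    static final class Segment {
        final String name;
        final int numDocs;
        final long estimatedHits; // rough per-segment estimate supplied by the query

        Segment(String name, int numDocs, long estimatedHits) {
            this.name = name;
            this.numDocs = numDocs;
            this.estimatedHits = estimatedHits;
        }
    }

    /** Returns a copy of the segments, most expected hits first (assumed direction). */
    static List<Segment> visitOrder(List<Segment> segments) {
        List<Segment> ordered = new ArrayList<>(segments);
        ordered.sort(Comparator.comparingLong((Segment s) -> s.estimatedHits).reversed());
        return ordered;
    }

    /** VAL from the start when few hits are expected, otherwise ORD. */
    static Policy initialPolicy(long totalEstimatedHits, int queueSize) {
        long fewHits = (long) queueSize * FEW_HITS_FACTOR;
        return totalEstimatedHits <= fewHits ? Policy.VAL : Policy.ORD;
    }
}
```

Note that a small segment with many expected hits is visited before a
large segment with few, which ordering by numDocs() alone would not do.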

Another future fix would be to implement ORDSUB with a single pass
through the queue, using a reused secondary pqueue to do the full sort
of the queue.  This would let us assign subords much faster, I think.
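
One way to read that idea is sketched below.  It is an illustration
under assumptions, not the ORDSUB code from the patch: the Slot and
SubordAssigner names are made up, string values are assumed, and the
new segment's terms are assumed to be available as a sorted array.  The
queue is walked once to fill a reused secondary priority queue, which is
then drained in value order.  A value present in the new segment gets
that term's ord and subord 0; a value that falls between two terms gets
the ord of the lower term and a subord that increases with each distinct
value in that gap.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch, not the patch's implementation: names and layout are assumptions.
final class SubordAssigner {
    /** One queue slot: its value plus the (ord, subord) assigned for the new segment. */
    static final class Slot {
        final String value;
        int ord;
        int subord;

        Slot(String value) {
            this.value = value;
        }
    }

    // Secondary queue, reused across segment switches to avoid reallocation.
    private final PriorityQueue<Slot> scratch =
        new PriorityQueue<>((a, b) -> a.value.compareTo(b.value));

    /** segmentTerms must be sorted ascending; slots are updated in place. */
    void assign(Slot[] queue, String[] segmentTerms) {
        scratch.clear();
        for (Slot s : queue) {
            scratch.add(s); // the single pass through the main queue
        }
        int lastOrd = Integer.MIN_VALUE;
        int lastSubord = 0;
        String lastValue = null;
        Slot s;
        while ((s = scratch.poll()) != null) { // slots arrive in ascending value order
            int idx = Arrays.binarySearch(segmentTerms, s.value);
            // Not found: idx == -(insertionPoint) - 1, so the lower neighbour is -idx - 2
            // (which is -1 when the value sorts before every term).
            int ord = idx >= 0 ? idx : -idx - 2;
            int subord;
            if (idx >= 0) {
                subord = 0; // exact match in the new segment
            } else if (ord == lastOrd) {
                // Same gap as the previous slot: equal values share a subord.
                subord = s.value.equals(lastValue) ? lastSubord : lastSubord + 1;
            } else {
                subord = 1; // first value seen in this gap
            }
            s.ord = ord;
            s.subord = subord;
            lastOrd = ord;
            lastSubord = subord;
            lastValue = s.value;
        }
    }
}
```

After assign() returns, comparing slots by ord and then by subord gives
the same order as comparing their values, which is the property the
subords exist to provide.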

But I don't think we should pursue these optimizations as part of this
issue... we need to bring closure here; we already have some solid
gains to capture.  I think we should wrap up now...



> Change IndexSearcher multisegment searches to search each individual segment 
> using a single HitCollector
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> sortBench.py, sortCollate.py
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
