Hoss,
A bit long, sorry for that; sometimes things are just as complex as they are.
On Saturday 14 April 2007 01:13, Chris Hostetter wrote:
>
...
>
> I don't get it, how would a Scorer not implement skipTo? ...oh...
>
> final class BooleanScorer extends Scorer {
> ...
> public boolean skipTo(int target) {
> throw new UnsupportedOperationException();
> }
Some history for the underlying reason for this:
Once upon a time no Scorer would implement skipTo().
Most people would use BooleanScorer for queries with multiple terms, and
things worked well with the Scorer.next() method, especially for
disjunctions. Occasionally documents would be scored out of document order,
but that did not lead to problems because Hits would reorder the documents by
score value anyway.
Then skipTo() was added to improve the speed of conjunctions. To do this each
Scorer needs to score all documents in document number order and implement
skipTo(), because skipTo() is used by ConjunctionScorer. BooleanScorer will
only use ConjunctionScorer in very specific (but also frequently occurring)
circumstances. At this point the index format was also changed to include the
skip forward information.
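To make the conjunction case concrete, here is a minimal sketch. It is not
Lucene code: ConjunctionSketch, DocIterator and intersect() are hypothetical
names, and this skipTo() is simplified (it scans linearly and does not move
when the current document already satisfies the target, whereas a real
implementation would use the skip information in the index). It only shows
why every sub-scorer must deliver documents in increasing order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; names and semantics are assumptions, not Lucene's API.
class ConjunctionSketch {

    /** Hypothetical iterator over a sorted array of document numbers. */
    static final class DocIterator {
        private final int[] docs;
        private int pos = -1;

        DocIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        /** Moves to the next document; false when exhausted. */
        boolean next() {
            return ++pos < docs.length;
        }

        /** Moves forward to the first document >= target; false when exhausted. */
        boolean skipTo(int target) {
            if (pos < 0) {
                pos = 0;
            }
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return pos < docs.length;
        }

        int doc() {
            return docs[pos];
        }
    }

    /** Returns the documents present in both iterators, in increasing order. */
    static List<Integer> intersect(DocIterator a, DocIterator b) {
        List<Integer> result = new ArrayList<>();
        if (!a.next() || !b.next()) {
            return result;
        }
        while (true) {
            if (a.doc() == b.doc()) {
                result.add(a.doc());
                if (!a.next() || !b.next()) {
                    return result;
                }
            } else if (a.doc() < b.doc()) {
                // a is behind: jump it forward to where b is.
                if (!a.skipTo(b.doc())) {
                    return result;
                }
            } else {
                // b is behind: jump it forward to where a is.
                if (!b.skipTo(a.doc())) {
                    return result;
                }
            }
        }
    }
}
```

If either iterator could return documents out of order, the skipTo() call on
the other one could jump past a match, which is why out-of-order scoring and
skipTo() do not mix.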
As I said, the implementation of disjunctions in BooleanScorer does not score
documents strictly in document order. It can be made to do that, but that
would lead to some loss of performance. BooleanScorer uses a kind of
distributive sort that is faster than the priority queue used by
DisjunctionSumScorer.
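As a rough illustration of that bucket idea, and of why it emits documents out
of order, consider the sketch below. It is a simplification under my own
assumptions, not the BooleanScorer source: the class name, the WINDOW size of
8 and the use of a match count instead of a score are all made up for the
example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; not the actual BooleanScorer implementation.
class BucketDisjunctionSketch {

    // Hypothetical bucket table size; documents are handled one window at a time.
    static final int WINDOW = 8;

    /**
     * postings[c] holds the sorted document numbers matched by clause c.
     * Returns {doc, matchCount} pairs in the order they are emitted.
     */
    static List<int[]> collect(int[][] postings, int maxDoc) {
        List<int[]> out = new ArrayList<>();
        int[] pos = new int[postings.length];
        int[] counts = new int[WINDOW];

        for (int base = 0; base < maxDoc; base += WINDOW) {
            // Buckets touched in this window, in the order they were first hit.
            List<Integer> valid = new ArrayList<>();

            // Each clause dumps its hits for this window into the buckets.
            for (int c = 0; c < postings.length; c++) {
                while (pos[c] < postings[c].length
                        && postings[c][pos[c]] < base + WINDOW) {
                    int slot = postings[c][pos[c]] - base;
                    if (counts[slot] == 0) {
                        valid.add(slot);
                    }
                    counts[slot]++;
                    pos[c]++;
                }
            }

            // Emit the touched buckets without sorting them, then reset them.
            for (int slot : valid) {
                out.add(new int[] {base + slot, counts[slot]});
                counts[slot] = 0;
            }
        }
        return out;
    }
}
```

With clauses matching {5, 9} and {2, 5, 12}, this emits documents 5, 2, 9, 12:
in order across windows, but not within a window. There is no per-document
priority queue adjustment, which is where the speed comes from.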
Then BooleanScorer2 came along. BooleanScorer2 uses ConjunctionScorer in more
circumstances than BooleanScorer, and it uses DisjunctionSumScorer for
disjunctions. LUCENE-730 is an attempt to get the top level disjunction
performance of BooleanScorer back.
Disjunctions below top level, for example in a query like this:
+(a1 a2) +(b1 b2)
need skipTo() (called from ConjunctionScorer) on the two nested disjunctions,
and for that DisjunctionSumScorer is used. Currently for the top level
disjunction case:
a1 a2 b1 b2
DisjunctionSumScorer is normally used. But when the setUseScorer14() method is
used, BooleanScorer will (always?) be used. The patch at LUCENE-584 tries to
handle this setUseScorer14() case by also keeping the old filtering method
that checks the Bits individually in IndexSearcher.
LUCENE-730 will always use BooleanScorer for the top level disjunctions, so
with a bit of luck the setUseScorer14 method can also be deprecated/removed.
LUCENE-584 has another possible performance advantage in that it allows an
implementation of filtering by using a ConjunctionScorer directly instead of
doing the filtering in IndexSearcher, but that still needs to be added.
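The difference between the two filtering styles can be sketched roughly as
follows. This is only an illustration using java.util.BitSet and a sorted
array of query matches; FilterSketch and both method names are hypothetical,
and it is not what the LUCENE-584 patch actually contains.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative only; not taken from the LUCENE-584 patch.
class FilterSketch {

    /** Old style: visit every query match and test its bit individually. */
    static List<Integer> filterByBitCheck(int[] matches, BitSet bits) {
        List<Integer> out = new ArrayList<>();
        for (int doc : matches) {
            if (bits.get(doc)) {
                out.add(doc);
            }
        }
        return out;
    }

    /** Conjunction style: the query and the filter skip each other forward. */
    static List<Integer> filterByConjunction(int[] matches, BitSet bits) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        int allowed = bits.nextSetBit(0); // -1 when the filter has no more docs
        while (i < matches.length && allowed >= 0) {
            if (matches[i] == allowed) {
                out.add(allowed);
                i++;
                allowed = bits.nextSetBit(allowed + 1);
            } else if (matches[i] < allowed) {
                // The query side plays skipTo(allowed); a real scorer could
                // use skip data here instead of stepping one by one.
                while (i < matches.length && matches[i] < allowed) {
                    i++;
                }
            } else {
                // The filter side skips to the current query match.
                allowed = bits.nextSetBit(matches[i]);
            }
        }
        return out;
    }
}
```

Both methods give the same result; the second one lets a sparse filter drive
the query forward, so the query does not have to visit every match.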
Regards,
Paul Elschot