[
https://issues.apache.org/jira/browse/LUCENE-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2838:
----------------------------------
Attachment: LUCENE-2838-no-topscorer-opt.patch
After thinking about it for a day, I found some problems with the "collector
hack" and this style of decorator pattern:
- If you wrap multiple times, the setScorer() method in the wrapped collector
may set the wrong scorer (you can see this if you wrap multiple
ConstantScoreQueries on top of each other: the boost of the inner one is
returned). The problem is that the score(Collector) method inverts the
decorator pattern.
- The inner scorer (like BooleanScorer with its buckets) may set a scorer in
the collector that is not itself and that implements doc() differently, so
always setting the ConstantScorer as the collector's scorer can lead to wrong
results (we don't see this in the tests, as no collector uses Scorer.doc(),
only Scorer.score(), which returns a constant).
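The second problem can be sketched with a toy model. This is not Lucene code:
the Scorer and Collector interfaces below are simplified, hypothetical
stand-ins, and BucketStyleScorer only imitates the general idea of a bulk
scorer that hands the collector a helper scorer instead of itself. A collector
that asks the scorer for doc() then gets a stale value once the wrapper
replaces that helper:

```java
import java.util.ArrayList;
import java.util.List;

public class ConstantScorerDocDemo {
    // Hypothetical, simplified stand-ins for Lucene's Scorer and Collector.
    interface Scorer {
        int doc();
        float score();
        void score(Collector c);
    }

    interface Collector {
        void setScorer(Scorer s);
        void collect(int doc);
    }

    // Imitates a bucket-based bulk scorer: it is never positioned itself,
    // but gives the collector a helper scorer that tracks the current doc.
    static class BucketStyleScorer implements Scorer {
        public int doc() { return -1; }
        public float score() { return 0.37f; }
        public void score(Collector c) {
            final int[] current = { -1 };
            c.setScorer(new Scorer() {
                public int doc() { return current[0]; }
                public float score() { return 0.37f; }
                public void score(Collector unused) {
                    throw new UnsupportedOperationException();
                }
            });
            // Docs are deliberately emitted out of order.
            for (int doc : new int[] { 4, 1, 7 }) {
                current[0] = doc;
                c.collect(doc);
            }
        }
    }

    // The "collector hack": delegate score(Collector) to the inner scorer and
    // replace whatever scorer it sets with this constant scorer.
    static class HackedConstantScorer implements Scorer {
        private final Scorer inner;
        private final float constant;

        HackedConstantScorer(Scorer inner, float constant) {
            this.inner = inner;
            this.constant = constant;
        }

        public int doc() { return inner.doc(); }
        public float score() { return constant; }
        public void score(final Collector c) {
            final Scorer self = this;
            inner.score(new Collector() {
                public void setScorer(Scorer ignored) { c.setScorer(self); }
                public void collect(int doc) { c.collect(doc); }
            });
        }
    }

    // Records "collectedDoc:scorer.doc():scorer.score()" for every hit.
    static List<String> run(Scorer top) {
        final List<String> seen = new ArrayList<>();
        top.score(new Collector() {
            private Scorer scorer;
            public void setScorer(Scorer s) { scorer = s; }
            public void collect(int doc) {
                seen.add(doc + ":" + scorer.doc() + ":" + scorer.score());
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(new BucketStyleScorer()));
        System.out.println(run(new HackedConstantScorer(new BucketStyleScorer(), 2.0f)));
    }
}
```

In this model the unwrapped run reports a doc() that matches each collected
document, while the wrapped run reports -1 for every hit, because the
collector now sees the wrapper, whose doc() asks an inner scorer that was
never positioned.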
I changed the code so that CSQ now always passes topScorer=false to
Weight.scorer() of the wrapped query and does not override the
score(Collector,...) methods. It still allows out-of-order collection. Now
BooleanScorer2 is always used with MTQs.
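As a rough illustration of why not overriding score(Collector) avoids the
problem, here is a toy model (again not Lucene code; Scorer, Collector,
ArrayScorer and NO_MORE_DOCS are simplified, hypothetical stand-ins). The
wrapper only delegates iteration and inherits the default bulk-scoring loop,
so the scorer registered with the collector is always the outermost one and
its doc() is always positioned:

```java
import java.util.ArrayList;
import java.util.List;

public class NoTopScorerDemo {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical, simplified stand-in for Lucene's Collector.
    interface Collector {
        void setScorer(Scorer s);
        void collect(int doc);
    }

    // Hypothetical, simplified stand-in for Lucene's Scorer.
    static abstract class Scorer {
        abstract int nextDoc();
        abstract int doc();
        abstract float score();

        // Default bulk scoring: register this scorer, then iterate.
        void score(Collector c) {
            c.setScorer(this);
            int d;
            while ((d = nextDoc()) != NO_MORE_DOCS) {
                c.collect(d);
            }
        }
    }

    // Iterator-style scorer over a fixed, sorted list of doc IDs.
    static class ArrayScorer extends Scorer {
        private final int[] docs;
        private int pos = -1;

        ArrayScorer(int[] docs) { this.docs = docs; }

        int nextDoc() {
            pos++;
            return doc();
        }
        int doc() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
        float score() { return 0.37f; }
    }

    // Delegates iteration only; score(Collector) is NOT overridden.
    static class ConstantScorer extends Scorer {
        private final Scorer inner;
        private final float constant;

        ConstantScorer(Scorer inner, float constant) {
            this.inner = inner;
            this.constant = constant;
        }

        int nextDoc() { return inner.nextDoc(); }
        int doc() { return inner.doc(); }
        float score() { return constant; }
    }

    // Records "collectedDoc:scorer.doc():scorer.score()" for every hit.
    static List<String> run(Scorer top) {
        final List<String> seen = new ArrayList<>();
        top.score(new Collector() {
            private Scorer scorer;
            public void setScorer(Scorer s) { scorer = s; }
            public void collect(int doc) {
                seen.add(doc + ":" + scorer.doc() + ":" + scorer.score());
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        Scorer nested = new ConstantScorer(
            new ConstantScorer(new ArrayScorer(new int[] { 1, 4, 7 }), 2.0f), 5.0f);
        System.out.println(run(nested));
    }
}
```

In this model, nesting two wrappers yields the outer constant (5.0) for every
hit and doc() matches the collected document. The price is that the inner
query's own bulk-scoring path is never used, which is the trade-off the
question below is about.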
The question is whether the previous (but broken) optimization would be worth
having for speed. Mike/Mark?
> ConstantScoreQuery should directly support wrapping Query and simply strip
> off scores
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-2838
> URL: https://issues.apache.org/jira/browse/LUCENE-2838
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2838-no-topscorer-opt.patch, LUCENE-2838.patch,
> LUCENE-2838.patch
>
>
> Especially in MultiTermQuery rewrite modes we often simply need to strip off
> scores from Queries and make them constant score. Currently the code to do
> this looks quite ugly: new ConstantScoreQuery(new QueryWrapperFilter(query))
> As the name says, ConstantScoreQuery should make any other Query constant
> score, so why does it not take a Query as ctor param? This question was also
> asked quite often by my customers and is simply a valid one, if you think
> about it.
> Looking closer into the code, it is clear that this would also speed up MTQs:
> - One additional wrapping and method calls can be removed
> - Maybe we can even deprecate QueryWrapperFilter in 3.1 now (it's now only
> used in tests and the use-case for this class is not really available) and
> LUCENE-2831 does not need the stupid hack to make Simon's assertions pass
> - CSQ now supports out-of-order scoring and topLevel scoring, so a CSQ on
> top-level now directly feeds the Collector. For that a small trick is used:
> The score(Collector) calls are directly delegated and the scores are stripped
> by wrapping the setScorer() method in Collector
> During that I found a visibility bug in Scorer (LUCENE-2839): The method
> "boolean score(Collector collector, int max, int firstDocID)" should be
> public, not protected, as it's not solely intended to be overridden by
> subclasses and is called from other classes, too! This leads to no compiler
> errors, as the other classes that call it are mainly BooleanScorer(2), which
> are in the same package, but the visibility is wrong. I will open an issue
> for that and fix it at least in trunk, where we have no
> backwards-compatibility requirement.