Toggle score normalization in Hits
----------------------------------
Key: LUCENE-954
URL: https://issues.apache.org/jira/browse/LUCENE-954
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Affects Versions: 2.2
Environment: any
Reporter: Christian Kohlschütter
Fix For: 2.2
The current implementation of the "Hits" class sometimes performs score
normalization.
In particular, whenever the top-ranked score is greater than 1.0, all scores
are scaled so that the maximum becomes 1.0.
In this case, Hits may return different score results than TopDocs-based
methods.
In my scenario (a federated search system), Hits delivered just plain wrong
results.
I was merging results from several sources, all having homogeneous statistics
(similar to MultiSearcher, but over the Internet using HTTP/XML-based
protocols).
Sometimes, some of the sources had a top-score greater than 1, so I ended up
with garbled results.
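
To illustrate the effect, here is a minimal sketch that does not use any Lucene
classes. The class name ScoreNormalizationSketch and its methods are
hypothetical; the rule it applies (scale by 1/topScore only when the top score
exceeds 1.0) is my reading of the behaviour described above, not a copy of the
Hits source.

```java
import java.util.Arrays;

// Hypothetical stand-in for the normalization described in this issue.
final class ScoreNormalizationSketch {

    // Scale factor: applied only when the top score exceeds 1.0 (assumed rule).
    static float normFactor(float topScore) {
        return topScore > 1.0f ? 1.0f / topScore : 1.0f;
    }

    // Returns a normalized copy of scores that are already sorted best-first.
    static float[] normalize(float[] rawScores) {
        float norm = rawScores.length > 0 ? normFactor(rawScores[0]) : 1.0f;
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * norm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] sourceA = {2.0f, 1.0f}; // top score > 1.0, so it gets rescaled
        float[] sourceB = {0.8f, 0.4f}; // top score <= 1.0, so it is left alone
        System.out.println(Arrays.toString(normalize(sourceA))); // [1.0, 0.5]
        System.out.println(Arrays.toString(normalize(sourceB))); // [0.8, 0.4]
    }
}
```

With raw scores, the second hit of source A (1.0) ranks above the top hit of
source B (0.8). After per-source normalization it drops to 0.5 and ranks below
it, so a merged list ends up in the wrong order.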
I suggest adding a switch to enable or disable this score normalization at
runtime.
My patch (attached) has an additional performance benefit, since score
normalization now occurs only when Hits#score() is called, not when creating
the Hits result list. Whenever scores are not required, you save one
multiplication per retrieved hit (i.e., at least 100 multiplications with the
current implementation of Hits).
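
The idea of the switch plus lazy normalization can be sketched as follows. This
is not the attached patch: LazyHitsSketch, setNormalizeScores() and the
constructor arguments are hypothetical names chosen for illustration, and the
class only mimics the shape of Hits#score().

```java
// Hypothetical sketch: keep raw scores and normalize only when score() is called.
final class LazyHitsSketch {
    private final float[] rawScores;        // scores as returned by the search
    private final float scoreNorm;          // computed once, applied on demand
    private boolean normalizeScores = true; // assumed default: keep old behaviour

    LazyHitsSketch(float[] rawScores, float maxScore) {
        this.rawScores = rawScores.clone();
        // Same rule as before: only scale when the top score exceeds 1.0.
        this.scoreNorm = maxScore > 1.0f ? 1.0f / maxScore : 1.0f;
    }

    // Runtime switch to enable or disable normalization.
    void setNormalizeScores(boolean normalizeScores) {
        this.normalizeScores = normalizeScores;
    }

    // The multiplication happens here, so hits whose score is never
    // requested cost nothing extra.
    float score(int n) {
        float raw = rawScores[n];
        return normalizeScores ? raw * scoreNorm : raw;
    }
}
```

A federated client would call setNormalizeScores(false) before reading scores,
so that values from different sources stay on the same scale.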