[ https://issues.apache.org/jira/browse/LUCENE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339571#comment-15339571 ]
Adrien Grand commented on LUCENE-7347:
--------------------------------------
Coords might have some value indeed, but maybe they are not worth the
maintenance cost anymore? Coords make the construction of scorers trickier
in BooleanWeight. One example is that if a disjunction produces a single
non-null Scorer, you cannot return it directly today because of coords: it
needs to be wrapped into a Scorer wrapper that will multiply the score by the
coord factor. It is quite trappy that bugs mostly manifest themselves in corner
cases such as searching for terms that are not indexed. Moreover, things become
more complicated when you mix required and optional clauses (see
BooleanTopLevelScorers), or when you want to optimize disjunctions for the case
that a large range of documents is matched by a single clause. The latter is
what caused the tricky scoring bug in LUCENE-7132: there was a bug in how
BooleanScorer dealt with coords when using this optimization.
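
To make the single-scorer case above concrete, here is a minimal Java sketch.
It does not use Lucene's actual classes: SimpleScorer, CoordBoostedScorer and
buildDisjunction are hypothetical names, and the coord factor is assumed to be
the classic TF-IDF one (matched clauses divided by total clauses). It also
models a single document on which every non-null clause matches, which is a
simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CoordSketch {

    /** Hypothetical stand-in for a scorer; this is not Lucene's Scorer API. */
    interface SimpleScorer {
        float score();
    }

    /** Assumed classic TF-IDF coord: matched clauses over total clauses. */
    static float coord(int overlap, int maxOverlap) {
        return (float) overlap / maxOverlap;
    }

    /** Wrapper that multiplies the wrapped scorer's score by a fixed coord factor. */
    static final class CoordBoostedScorer implements SimpleScorer {
        private final SimpleScorer in;
        private final float coordFactor;

        CoordBoostedScorer(SimpleScorer in, float coordFactor) {
            this.in = in;
            this.coordFactor = coordFactor;
        }

        @Override
        public float score() {
            return in.score() * coordFactor;
        }
    }

    /**
     * Builds a scorer for a pure disjunction. A null entry stands for a clause
     * that produced no scorer, e.g. a term that is not indexed.
     */
    static SimpleScorer buildDisjunction(List<SimpleScorer> clauseScorers, boolean useCoords) {
        int maxOverlap = clauseScorers.size(); // null clauses still count here
        List<SimpleScorer> nonNull = new ArrayList<>();
        for (SimpleScorer s : clauseScorers) {
            if (s != null) {
                nonNull.add(s);
            }
        }
        if (nonNull.isEmpty()) {
            return null;
        }
        if (nonNull.size() == 1) {
            SimpleScorer only = nonNull.get(0);
            // Without coords the single scorer can be returned as is.
            // With coords it must be wrapped, which is easy to forget.
            return useCoords ? new CoordBoostedScorer(only, coord(1, maxOverlap)) : only;
        }
        // Simplification: every remaining clause is assumed to match the document.
        SimpleScorer sum = () -> {
            float total = 0f;
            for (SimpleScorer s : nonNull) {
                total += s.score();
            }
            return total;
        };
        return useCoords ? new CoordBoostedScorer(sum, coord(nonNull.size(), maxOverlap)) : sum;
    }

    public static void main(String[] args) {
        SimpleScorer indexedTerm = () -> 2.0f;
        // Second clause is a term that is not indexed, so it has no scorer.
        List<SimpleScorer> clauses = Arrays.asList(indexedTerm, null);
        System.out.println(buildDisjunction(clauses, true).score());  // 1.0 (2.0 * 1/2)
        System.out.println(buildDisjunction(clauses, false).score()); // 2.0
    }
}
```

Under these assumptions, returning the lone scorer unwrapped while coords are
enabled would double the score in this example, and the mistake would only show
up for queries that contain a non-indexed term.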
Given that TF-IDF is not the default Similarity anymore and that BM25, which I
see as a better TF-IDF, does not need them, I think it is appealing to remove
them so that we can make the code for boolean queries easier to read and less
bug-prone.
> Remove queryNorm and coords
> ---------------------------
>
> Key: LUCENE-7347
> URL: https://issues.apache.org/jira/browse/LUCENE-7347
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
>
> These two features are specific to TF-IDF and introduce some complexity (see
> e.g. handling of coords in BooleanWeight) and bugs/corner-cases (see e.g. how
> taking the query norm into account causes scoring challenges on LUCENE-7337).
> Since we made BM25 the default in 6.0, I propose that we remove these
> TF-IDF-specific features in 7.0.