[ 
https://issues.apache.org/jira/browse/LUCENE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339571#comment-15339571
 ] 

Adrien Grand commented on LUCENE-7347:
--------------------------------------

Coords might have some value indeed, but maybe they are not worth the 
maintenance cost anymore? Coords make the construction of scorers trickier 
in BooleanWeight. One example is that if a disjunction produces a single 
non-null Scorer, you cannot return it directly today because of coords: it 
needs to be wrapped into a Scorer wrapper that will multiply the score by the 
coord factor. It is quite trappy that such bugs mostly manifest themselves in 
corner cases, such as searching for terms that are not indexed. Moreover, things become 
more complicated when you mix required and optional clauses (see 
BooleanTopLevelScorers), or when you want to optimize disjunctions for the case 
that a large range of documents is matched by a single clause. The latter is 
what caused the tricky scoring bug in LUCENE-7132: there was a bug in how 
BooleanScorer dealt with coords when using this optimization.
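
To make the single-scorer case above concrete, here is a minimal sketch in 
plain Java. It does not use the real Lucene API: SimpleScorer, boost and 
disjunction are hypothetical names invented for illustration, and the sketch 
assumes every non-null sub-scorer matches the current document. Only the coord 
formula (overlap / maxOverlap) follows what the classic TF-IDF similarity 
computes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CoordSketch {
    /** Hypothetical stand-in for a scorer; NOT the real Lucene Scorer API. */
    interface SimpleScorer {
        float score();
    }

    /** Classic TF-IDF style coord: fraction of the query clauses that matched. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Hypothetical wrapper that multiplies the wrapped score by a fixed factor. */
    static SimpleScorer boost(SimpleScorer in, float factor) {
        return () -> in.score() * factor;
    }

    /**
     * Builds a scorer for a disjunction. A null entry models a clause that
     * produced no scorer (for example a term that is not indexed).
     * Simplifying assumption: all non-null scorers match the current document.
     */
    static SimpleScorer disjunction(List<SimpleScorer> clauses, boolean useCoords) {
        List<SimpleScorer> nonNull = new ArrayList<>();
        for (SimpleScorer s : clauses) {
            if (s != null) {
                nonNull.add(s);
            }
        }
        if (nonNull.isEmpty()) {
            return null;
        }
        if (nonNull.size() == 1 && !useCoords) {
            // Without coords the only scorer can be returned as-is.
            return nonNull.get(0);
        }
        SimpleScorer sum = () -> {
            float total = 0f;
            for (SimpleScorer s : nonNull) {
                total += s.score();
            }
            return total;
        };
        if (!useCoords) {
            return sum;
        }
        // With coords even a single scorer needs a wrapper, because its score
        // must be scaled by (matching clauses / total clauses).
        return boost(sum, coord(nonNull.size(), clauses.size()));
    }

    public static void main(String[] args) {
        SimpleScorer only = () -> 2.0f;
        List<SimpleScorer> clauses = Arrays.asList(null, only, null);
        System.out.println(disjunction(clauses, false).score());
        System.out.println(disjunction(clauses, true).score());
    }
}
```

In this sketch, disabling coords lets the disjunction hand back the 
sub-scorer untouched, which is the simplification argued for above.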

Given that TF-IDF is not the default Similarity anymore and that BM25, which I 
see as a better TF-IDF, does not need them, I think it is appealing to remove 
them so that we can make the code for boolean queries easier to read and less 
bug-prone.

> Remove queryNorm and coords
> ---------------------------
>
>                 Key: LUCENE-7347
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7347
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>
> These two features are specific to TF-IDF and introduce some complexity (see, 
> e.g., the handling of coords in BooleanWeight) and bugs/corner cases (see, e.g., how 
> taking the query norm into account causes scoring challenges on LUCENE-7337).
> Since we made BM25 the default in 6.0, I propose that we remove these 
> TF-IDF-specific features in 7.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

