[ 
https://issues.apache.org/jira/browse/LUCENE-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277545#comment-17277545
 ] 

Julie Tibshirani commented on LUCENE-9725:
------------------------------------------

I opened https://github.com/apache/lucene-solr/pull/2293 to show the idea.

> Allow BM25FQuery to use other similarities
> ------------------------------------------
>
>                 Key: LUCENE-9725
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9725
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Julie Tibshirani
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> From a high level, BM25FQuery works as follows:
> # Given a list of fields and weights, it pretends there's a synthetic 
> combined field where all terms have been indexed. It computes new term and 
> collection statistics for this combined field.
> # It uses a disjunction iterator and BM25Similarity to score the documents.
> The steps are (1) compute statistics that represent the combined field 
> content, and (2) pass these to a similarity function. There is nothing really 
> specific to BM25Similarity in this approach. In step 2, we could use another 
> similarity, for example BooleanSimilarity or those based on language models 
> like LMDirichletSimilarity. The main restriction is that norms have to be 
> additive (the norm of the combined field must be the sum of the field norms).
> Maybe we could unhardcode BM25Similarity in BM25FQuery and instead use the 
> one configured on IndexSearcher. We could think of this as providing a 
> sensible default approach to cross-field scoring for many similarities. It's 
> an incremental step towards LUCENE-8711, which would give similarities more 
> fine-grained control over how stats/ scores are combined across fields.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to