[ 
https://issues.apache.org/jira/browse/LUCENE-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247640#comment-17247640
 ] 

Zach Chen edited comment on LUCENE-9346 at 12/11/20, 5:18 AM:
--------------------------------------------------------------

Hi [~jpountz], I spent some time looking into this and studying the algorithms 
in *MinShouldMatchSumScorer* and *WANDScorer*, and I have just finished some 
initial changes and opened a draft PR. I went in a different direction from 
what you suggested above: rather than changing the *WANDScorer* algorithm (I am 
not sure I understand it well enough to change it correctly :D ), I keep track 
of the number of scorers matched and compare it with the *minShouldMatch* 
parameter after *minCompetitiveScore* has been reached. Could you please take a 
look and let me know whether that approach works as well?
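
For illustration, the check described above can be sketched as follows. This is a minimal, standalone sketch and not the code in the PR: the class *MinShouldMatchGate* and its method names are hypothetical, and it assumes a candidate document is represented simply by the scores of the sub-scorers that matched it.

```java
/** Hypothetical illustration only; this is not Lucene's WANDScorer. */
final class MinShouldMatchGate {
    private final int minShouldMatch;
    // Raised over time as better hits are collected (assumed to start at 0).
    private float minCompetitiveScore;

    MinShouldMatchGate(int minShouldMatch) {
        if (minShouldMatch < 0) {
            throw new IllegalArgumentException("minShouldMatch must be >= 0");
        }
        this.minShouldMatch = minShouldMatch;
    }

    void setMinCompetitiveScore(float minScore) {
        this.minCompetitiveScore = minScore;
    }

    /**
     * @param matchedScores scores of the sub-scorers that matched the candidate doc
     * @return true if the candidate is competitive AND has enough matching clauses
     */
    boolean accepts(float[] matchedScores) {
        float sum = 0f;
        for (float score : matchedScores) {
            sum += score;
        }
        // Step 1: the score-based pruning decision, left unchanged.
        if (sum < minCompetitiveScore) {
            return false;
        }
        // Step 2: only after the score threshold is reached, compare the
        // number of matched scorers with minShouldMatch.
        return matchedScores.length >= minShouldMatch;
    }
}
```

With *minShouldMatch* = 2 and a minimum competitive score of 1.0, a document matched by two clauses scoring 0.6 and 0.7 is accepted, while a document matched by a single clause scoring 1.5 is rejected even though its score is competitive.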

In the PR, I also added some nocommit comments to keep track of questions I have 
(all the tests pass without the nocommit comments, btw):
 # Currently, *WANDScorer* is only used for *ScoreMode.TOP_SCORES*. Should 
it be used for other score modes as well once *MinShouldMatchSumScorer* gets 
deprecated? I think running *WANDScorer* with other score modes would 
currently fail some tests.
 # For now, *WANDScorer*'s constructor calculates *WANDScorer.cost* as the sum 
of the costs of its individual scorers. *MinShouldMatchSumScorer*, however, 
also takes the *minShouldMatch* parameter into account when calculating its 
cost, since it impacts the tail capacity. Should *minShouldMatch* be taken 
into account when calculating *WANDScorer.cost* as well, especially given that 
the current solution in the PR doesn't change the tail capacity of 
*WANDScorer*?
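
To make the second question concrete, here is a rough sketch of the two cost estimates being compared. It assumes that a disjunction with *minShouldMatch* = m over n clauses can be costed as the sum of the n - m + 1 cheapest clause costs; that formula is an assumption for illustration and should be checked against the *MinShouldMatchSumScorer* source. The class and method names are hypothetical.

```java
/** Hypothetical illustration only; names and formula are assumptions. */
final class DisjunctionCostSketch {

    /** Plain sum of all clause costs (the WANDScorer-style estimate described above). */
    static long sumCost(long[] costs) {
        long total = 0;
        for (long cost : costs) {
            total += cost;
        }
        return total;
    }

    /**
     * Estimate that takes minShouldMatch into account: sum of the
     * (n - minShouldMatch + 1) least costly clauses (assumed formula).
     */
    static long minShouldMatchCost(long[] costs, int minShouldMatch) {
        int n = costs.length;
        if (minShouldMatch < 1 || minShouldMatch > n) {
            throw new IllegalArgumentException("minShouldMatch must be in [1, " + n + "]");
        }
        long[] sorted = costs.clone(); // leave the caller's array untouched
        java.util.Arrays.sort(sorted); // ascending: cheapest clauses first
        long total = 0;
        for (int i = 0; i < n - minShouldMatch + 1; i++) {
            total += sorted[i];
        }
        return total;
    }
}
```

For clause costs of 100, 10 and 1000, the plain sum is 1110. Under the assumed formula, the estimate is also 1110 for *minShouldMatch* = 1, but drops to 110 for *minShouldMatch* = 2 and to 10 for *minShouldMatch* = 3, which shows how much the two estimates can diverge.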


> WANDScorer should support minimumNumberShouldMatch
> --------------------------------------------------
>
>                 Key: LUCENE-9346
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9346
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently we deoptimize when a minimumNumberShouldMatch is provided and fall 
> back to a scorer that doesn't dynamically prune hits based on scores.
> Given how similar WANDScorer and MinShouldMatchSumScorer are, I wonder if we 
> could remove MinShouldMatchSumScorer once WANDScorer supports 
> minimumNumberShouldMatch. Then any improvements we bring to WANDScorer, like 
> two-phase support (LUCENE-8806), would automatically cover more queries.



