jpountz commented on pull request #101:
URL: https://github.com/apache/lucene/pull/101#issuecomment-834791219


   > The last two are optimization techniques not mentioned in the paper I think?
   
   To be honest, I haven't read the paper recently, so it's possible I diverged a 
bit from it.
   
   (2) feels like a natural way to avoid useless score computations, though it 
might only work well when score upper bounds are very close to the actual scores. 
Maybe we should test on wikibig instead of wikimedium to get better confidence 
that this change makes things better.
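
To make the concern about (2) concrete, here is a minimal sketch of upper-bound-based skipping. It is not the code from this PR: the class `UpperBoundPruning`, the method `scoreOrSkip`, and the use of plain `double` arrays are all hypothetical, chosen only to illustrate the idea.

```java
// Hypothetical illustration, not Lucene code: stop scoring a document as soon
// as its partial score plus the remaining upper bounds cannot be competitive.
class UpperBoundPruning {
    /**
     * actualScores[i] is the real score of clause i on this document and
     * maxScores[i] is its score upper bound. Returns the full score, or
     * Double.NaN if scoring was abandoned early.
     */
    static double scoreOrSkip(double[] actualScores, double[] maxScores,
                              double minCompetitiveScore) {
        double remainingBound = 0;
        for (double m : maxScores) {
            remainingBound += m;
        }
        double score = 0;
        for (int i = 0; i < actualScores.length; i++) {
            // Best case for this document: every unscored clause hits its bound.
            if (score + remainingBound < minCompetitiveScore) {
                return Double.NaN; // cannot be competitive, skip remaining clauses
            }
            score += actualScores[i];
            remainingBound -= maxScores[i];
        }
        return score;
    }
}
```

When the bounds are loose (say each clause has a bound of 2.0 but actually scores around 1.0), the check rarely fires and every clause still gets scored, which is why the benefit likely depends on how tight the upper bounds are.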
   
   Regarding (3), does it actually push more scorers into `nonEssentialScorers`? 
I thought I had just reorganized the existing logic a bit. If it pushes more 
scorers into `nonEssentialScorers`, it's probably a bug. :)
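
For reference, the textbook MAXSCORE partitioning can be sketched as below. This is an assumption about the general technique, not a copy of the PR's logic: the class `MaxScorePartition` and the method `countNonEssential` are hypothetical names, and only `nonEssentialScorers` comes from the discussion above.

```java
import java.util.Arrays;

// Hypothetical illustration of the classic MAXSCORE split, not Lucene code.
class MaxScorePartition {
    /**
     * Returns how many clauses, taken in ascending order of max score, are
     * non-essential: the sum of their max scores stays strictly below
     * minCompetitiveScore, so a document matching only those clauses
     * can never be competitive.
     */
    static int countNonEssential(double[] maxScores, double minCompetitiveScore) {
        double[] sorted = maxScores.clone();
        Arrays.sort(sorted); // ascending: cheapest-to-drop clauses first
        double sum = 0;
        int count = 0;
        for (double m : sorted) {
            if (sum + m >= minCompetitiveScore) {
                break; // adding this clause could produce a competitive hit
            }
            sum += m;
            count++;
        }
        return count;
    }
}
```

Under this rule, the number of non-essential clauses is fully determined by the max scores and the current minimum competitive score, so a pure reorganization of the logic should not change which scorers end up in `nonEssentialScorers`.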
   
   > but it seems like a good net speedup given the latter two already have much faster QPS compared to OrHighHigh?
   
   Agreed, we already made similar choices in the past. It's probably still 
worth playing with a wider variety of queries, e.g. queries with many terms 
that have mixed frequencies (something like OrHighHighMedMedLowLow), or 
disjunctions within conjunctions and conjunctions within disjunctions.
   
   Maybe we can also try to port similar changes to the bulk scorer to see if 
it yields even greater benefits?

