jpountz commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-834791219
> The last two are optimization techniques not mentioned in the paper I think?

To be honest, I haven't read the paper recently, so it's possible I diverged a bit from it. (2) feels like a natural way to avoid useless score computations, though it might only work well when score upper bounds are very close to the actual scores. Maybe we should test on wikibig instead of wikimedium to get better confidence that this change makes things better. Regarding (3), does it actually push more scorers into `nonEssentialScorers`? I thought I had just reorganized the existing logic a bit. If it pushes more scorers into `nonEssentialScorers`, it's probably a bug. :)

> but it seems like a good net speedup given the latter twos already have much faster QPS compared to OrHighHigh?

Agreed, we already made similar choices in the past. It's probably still worth playing with a wider variety of queries, e.g. queries with many terms of mixed frequencies (something like OrHighHighMedMedLowLow), or disjunctions within conjunctions and conjunctions within disjunctions. Maybe we can also try porting similar changes to the bulk scorer to see if that yields even greater benefits?
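For readers following the `nonEssentialScorers` question, here is a minimal sketch of the textbook MaxScore partitioning rule, not the code in this pull request. It assumes each clause exposes a score upper bound and that a document scoring below the minimum competitive score cannot enter the top hits. The class `MaxScorePartitionSketch`, its `Partition` type, and the `partition` method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MaxScorePartitionSketch {

    /** Clause indices split into the two MaxScore groups (hypothetical helper type). */
    public static final class Partition {
        public final List<Integer> nonEssential = new ArrayList<>();
        public final List<Integer> essential = new ArrayList<>();
    }

    /**
     * Sorts clauses by ascending score upper bound and marks the longest prefix
     * whose summed upper bounds stay below minCompetitiveScore as non-essential.
     * A document matching only non-essential clauses cannot be competitive.
     */
    public static Partition partition(float[] maxScores, float minCompetitiveScore) {
        Integer[] order = new Integer[maxScores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Float.compare(maxScores[a], maxScores[b]));

        Partition p = new Partition();
        double sum = 0; // running sum of upper bounds of the non-essential prefix
        for (int idx : order) {
            if (p.essential.isEmpty() && sum + maxScores[idx] < minCompetitiveScore) {
                sum += maxScores[idx];
                p.nonEssential.add(idx);
            } else {
                p.essential.add(idx);
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // Illustrative upper bounds for four clauses; the values are made up.
        Partition p = partition(new float[] {0.5f, 3.0f, 1.0f, 2.0f}, 2.0f);
        System.out.println("nonEssential=" + p.nonEssential + " essential=" + p.essential);
    }
}
```

Under this rule, a refactoring that leaves the upper bounds and the minimum competitive score unchanged should produce the same split, so a larger non-essential set after a pure reorganization would point to a change in one of those inputs or in the comparison itself.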