jimczi edited a comment on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search
URL: https://github.com/apache/lucene-solr/pull/904#issuecomment-538892799

I discussed this offline with @jpountz, and while it proved difficult to implement the check directly in the `IndexSearcher`, it seems better to decouple the checks of the global minimum score from the updates of the local one.

I pushed a [commit](https://github.com/apache/lucene-solr/pull/904/commits/47f9bf6f31ce2b429c77807c2d08111d9ff93b07) that implements this idea by checking the global score every 1024 documents. Since the rate of the checks is known, I also switched to a `LongAccumulator` in order to speed up the updates.

Finally, I added the maximum document id associated with the current maximum minimum score, so that the `TopScoreDocCollector` can require the next float for leaves that come after the document id registered in the global maximum score.

Here's the result of the benchmark on `TopScoreDocCollector` using wikimedium:

````
Task                  QPS baseline  StdDev   QPS patch  StdDev Pct diff
HighIntervalsOrdered          6.23  (0.0%)        5.98  (0.0%)    -4.0% (  -3% -   -3%)
HighSpanNear                  5.47  (0.0%)        5.29  (0.0%)    -3.3% (  -3% -   -3%)
LowSpanNear                  19.06  (0.0%)       18.06  (0.0%)    -5.2% (  -5% -   -5%)
OrHighMed                    71.30  (0.0%)       67.68  (0.0%)    -5.1% (  -5% -   -5%)
MedSpanNear                  17.86  (0.0%)       16.96  (0.0%)    -5.0% (  -5% -   -5%)
Respell                      29.32  (0.0%)       28.16  (0.0%)    -4.0% (  -3% -   -3%)
AndHighMed                  107.02  (0.0%)      102.92  (0.0%)    -3.8% (  -3% -   -3%)
Fuzzy2                       43.22  (0.0%)       41.87  (0.0%)    -3.1% (  -3% -   -3%)
IntNRQ                       58.96  (0.0%)       57.60  (0.0%)    -2.3% (  -2% -   -2%)
Fuzzy1                       55.31  (0.0%)       54.05  (0.0%)    -2.3% (  -2% -   -2%)
LowPhrase                    39.99  (0.0%)       39.19  (0.0%)    -2.0% (  -1% -   -1%)
LowSloppyPhrase              23.71  (0.0%)       23.51  (0.0%)    -0.8% (   0% -    0%)
AndHighLow                  820.39  (0.0%)      815.03  (0.0%)    -0.7% (   0% -    0%)
HighPhrase                   65.78  (0.0%)       65.64  (0.0%)    -0.2% (   0% -    0%)
MedSloppyPhrase              18.55  (0.0%)       18.89  (0.0%)     1.8% (   1% -    1%)
HighSloppyPhrase              7.06  (0.0%)        7.22  (0.0%)     2.4% (   2% -    2%)
Wildcard                     63.42  (0.0%)       65.49  (0.0%)     3.3% (   3% -    3%)
MedPhrase                    59.06  (0.0%)       61.16  (0.0%)     3.5% (   3% -    3%)
Prefix3                      72.62  (0.0%)       75.86  (0.0%)     4.5% (   4% -    4%)
OrNotHighLow                777.16  (0.0%)      812.50  (0.0%)     4.5% (   4% -    4%)
AndHighHigh                  31.35  (0.0%)       33.66  (0.0%)     7.3% (   7% -    7%)
OrHighHigh                   18.98  (0.0%)       20.95  (0.0%)    10.4% (  10% -   10%)
LowTerm                     598.85  (0.0%)      745.84  (0.0%)    24.5% (  24% -   24%)
OrNotHighMed                388.26  (0.0%)      549.63  (0.0%)    41.6% (  41% -   41%)
MedTerm                     386.78  (0.0%)      595.30  (0.0%)    53.9% (  53% -   53%)
OrHighNotMed                308.92  (0.0%)      496.75  (0.0%)    60.8% (  60% -   60%)
HighTerm                    310.13  (0.0%)      515.95  (0.0%)    66.4% (  66% -   66%)
OrHighNotLow                304.05  (0.0%)      521.76  (0.0%)    71.6% (  71% -   71%)
OrHighNotHigh               273.30  (0.0%)      470.54  (0.0%)    72.2% (  72% -   72%)
OrNotHighHigh               296.77  (0.0%)      512.47  (0.0%)    72.7% (  72% -   72%)
OrHighLow                   108.61  (0.0%)      325.00  (0.0%)   199.2% ( 199% -  199%)
````

Note that I ran the benchmark against a version that predates LUCENE-8978. Compared with the code already committed in LUCENE-8978, the results show small regressions on some queries (high-phrase) and better results on others (highorlow), but overall they are comparable. I have a slight preference for this version because its behavior does not depend on the rate of the updates of the local minimum score.
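For readers who have not opened the commit, the idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the class `GlobalMinScoreSketch` and all of its method names are hypothetical, and the packing of score and document id into a single `long` is an assumption about how a `LongAccumulator` could carry both values. It also assumes scores are non-negative and never NaN, so that comparing the raw float bits matches comparing the scores.

```java
import java.util.concurrent.atomic.LongAccumulator;

/** Hypothetical sketch of a shared "maximum minimum score"; not the code from the commit. */
final class GlobalMinScoreSketch {
    // Consult the shared value once every 1024 collected documents.
    static final long CHECK_INTERVAL_MASK = 1024 - 1;

    // Shared by all per-leaf collectors; keeps the largest packed (score, docId) value.
    private final LongAccumulator acc = new LongAccumulator(Long::max, Long.MIN_VALUE);

    /** Packs score (high 32 bits) and doc id (low 32 bits); assumes score >= 0 and docId >= 0. */
    static long encode(float score, int docId) {
        return (((long) Float.floatToIntBits(score)) << 32) | (docId & 0xFFFFFFFFL);
    }

    static float scoreOf(long encoded) {
        return Float.intBitsToFloat((int) (encoded >>> 32));
    }

    static int docIdOf(long encoded) {
        return (int) encoded;
    }

    /** True when the collector should look at the shared value (every 1024th hit, and hit 0). */
    static boolean shouldCheck(long totalHits) {
        return (totalHits & CHECK_INTERVAL_MASK) == 0;
    }

    /** Publishes a local minimum score; on equal scores the larger doc id wins. */
    void publish(float localMinScore, int docId) {
        acc.accumulate(encode(localMinScore, docId));
    }

    /** Minimum competitive score for a leaf whose documents start at docBase. */
    float minCompetitiveScore(int docBase) {
        long encoded = acc.get();
        if (encoded == Long.MIN_VALUE) {
            return 0f; // nothing has been published yet
        }
        float globalMin = scoreOf(encoded);
        // Leaves located after the registered doc id must beat the score strictly,
        // so they are asked for the next representable float.
        return docBase > docIdOf(encoded) ? Math.nextUp(globalMin) : globalMin;
    }

    public static void main(String[] args) {
        GlobalMinScoreSketch shared = new GlobalMinScoreSketch();
        shared.publish(2.5f, 70);
        System.out.println(shared.minCompetitiveScore(50));
        System.out.println(shared.minCompetitiveScore(100));
    }
}
```

In this sketch a per-leaf collector would call `publish` when its local minimum score changes and call `minCompetitiveScore` only when `shouldCheck` returns true, so the cost of reading the shared value stays independent of how often the local score is updated.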