epotyom commented on PR #13657:
URL: https://github.com/apache/lucene/pull/13657#issuecomment-2293467656
I've made some temporary changes in luceneutil so that I can run only the
couple of tasks that show the regression and get meaningful profiler results.
The profiler results we get when running all tasks seem to contain too many
samples from other tasks, e.g. faceting.
Results after 20 runs:
```
Task       QPS baseline  StdDev   QPS my_modified_version  StdDev   Pct diff                 p-value
MedTerm          581.44  (3.8%)                    505.29  (3.5%)   -13.1% ( -19% -  -5%)    0.000
HighTerm         559.77  (3.5%)                    501.17  (3.5%)   -10.5% ( -16% -  -3%)    0.000
```
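As a sanity check on the Pct diff column, the relative change can be recomputed from the two mean QPS values. This is only a sketch: it assumes the column is (candidate - baseline) / baseline, and `pct_diff` is a hypothetical helper, not part of luceneutil.

```shell
# Hypothetical helper (not part of luceneutil): relative QPS change of the
# candidate vs. the baseline, in percent, rounded to one decimal place.
pct_diff() {
  awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.1f\n", (cand - base) / base * 100 }'
}

pct_diff 581.44 505.29   # MedTerm  -> -13.1
pct_diff 559.77 501.17   # HighTerm -> -10.5
```

Both values agree with the table, so the reported differences are consistent with the mean QPS figures.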
The biggest difference in the profiler seems to be that we spend more time in
`org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer.score(float, long)`
now?
Diff image:

JFR files:
[Archive.zip](https://github.com/user-attachments/files/16637306/Archive.zip)
The code that was tested is slightly different from this PR, sharing
branches just in case:
- candidate: Branch with regression:
https://github.com/epotyom/lucene/tree/IndexSearcher-search-regression
- baseline: Branch with NO regression:
https://github.com/epotyom/lucene/tree/IndexSearcher-search-NO-regression
- Their diff: https://github.com/epotyom/lucene/pull/1/files
luceneutil branch to reproduce:
https://github.com/mikemccand/luceneutil/compare/main...epotyom:luceneutil:tasks_with_regression
You'd need to generate the task file manually, as it seems to be too large for
GitHub:
```
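# Start from a clean slate so that reruns do not append to an old file.
rm -f tasks/wikimedium.10M.regressed.tasks
# Keep only the two regressed task categories.
grep -E '^(MedTerm|HighTerm):' tasks/wikimedium.10M.nostopwords.tasks > tasks/wikimedium.10M.regressed.tasks.1
# Repeat them 10000 times to build the final task file.
for n in {1..10000}; do cat tasks/wikimedium.10M.regressed.tasks.1 >> tasks/wikimedium.10M.regressed.tasks; done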
```
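Because the loop appends to the output file, a stale or partially written file is easy to miss. A quick line-count check can confirm that the result holds exactly 10000 copies of the filtered tasks. This is a sketch only; `check_repeats` is a hypothetical helper, not part of luceneutil.

```shell
# Hypothetical helper (not part of luceneutil): succeeds only if OUT has
# exactly REPEATS times as many lines as SEED, and SEED is non-empty.
check_repeats() {
  local seed=$1 out=$2 repeats=$3
  local seed_lines out_lines
  # $(( ... )) normalizes the padded output that some wc implementations print.
  seed_lines=$(( $(wc -l < "$seed") ))
  out_lines=$(( $(wc -l < "$out") ))
  [ "$seed_lines" -gt 0 ] && [ "$out_lines" -eq $(( seed_lines * repeats )) ]
}

# Usage, run from the luceneutil checkout:
#   check_repeats tasks/wikimedium.10M.regressed.tasks.1 \
#                 tasks/wikimedium.10M.regressed.tasks 10000 && echo "task file OK"
```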