gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2736218538
I ran some benchmarks to find the main reason for the speedup:
**Baseline**: main branch
**Candidate**: collecting docs greater than maxDocVisited into bitset (instead of `DocIdSetBuilder`)
```
Task                 QPS baseline  StdDev  QPS my_modified_version  StdDev           Pct diff  p-value
CountFilteredIntNRQ         41.78  (3.9%)                    41.05  (2.9%)  -1.7% ( -8% - 5%)    0.111
IntSet                      84.83  (2.5%)                    84.34  (2.3%)  -0.6% ( -5% - 4%)    0.441
FilteredIntNRQ              77.52  (3.5%)                    78.06  (3.2%)   0.7% ( -5% - 7%)    0.516
IntNRQ                      80.49  (3.1%)                    82.13  (3.2%)   2.0% ( -4% - 8%)    0.041
TermDTSort                  59.85  (2.1%)                    66.70  (2.4%)  11.4% ( 6% - 16%)    0.000
TermDayOfYearSort           61.19  (2.3%)                    68.41  (4.3%)  11.8% ( 4% - 18%)    0.000
```
**Baseline**: collecting docs greater than maxDocVisited into bitset
**Candidate**: collecting all docs into bitset (no `if (doc > maxDocVisited)`)
```
Task                 QPS baseline  StdDev  QPS my_modified_version  StdDev           Pct diff  p-value
IntNRQ                      82.00  (3.5%)                    80.69  (2.8%)  -1.6% ( -7% - 4%)    0.309
IntSet                      84.61  (2.0%)                    84.22  (2.7%)  -0.5% ( -5% - 4%)    0.697
FilteredIntNRQ              78.12  (1.5%)                    78.13  (1.8%)   0.0% ( -3% - 3%)    0.991
TermDTSort                  66.41  (2.9%)                    67.74  (3.2%)   2.0% ( -4% - 8%)    0.192
CountFilteredIntNRQ         40.68  (4.8%)                    41.57  (2.3%)   2.2% ( -4% - 9%)    0.244
TermDayOfYearSort           69.90  (2.8%)                    71.88  (3.3%)   2.8% ( -3% - 9%)    0.064
```
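It looks like 'more chance to become a bitset' contributes most of the speedup:
replacing `DocIdSetBuilder` with a bitset gives about 11% on TermDTSort and
TermDayOfYearSort (p-value 0.000), while dropping the `doc > maxDocVisited`
check adds only 2-3% on top, which is within noise (p-values 0.192 and 0.064).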
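
For readers who want to see the two collection strategies side by side, here is a minimal sketch. It is not the code from the PR: it uses `java.util.BitSet` rather than Lucene's own bitset classes, and the class name `BitsetCollectSketch`, both method names, and the `docs` / `maxDoc` parameters are all hypothetical. The only thing it illustrates is the presence or absence of the `doc > maxDocVisited` check.

```java
import java.util.BitSet;

public class BitsetCollectSketch {

    // First candidate: only docs beyond the already-visited frontier are recorded.
    // (Hypothetical helper, not a Lucene API.)
    static BitSet collectAboveVisited(int[] docs, int maxDoc, int maxDocVisited) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docs) {
            if (doc > maxDocVisited) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Second candidate: every doc is recorded, with no branch in the loop.
    // (Hypothetical helper, not a Lucene API.)
    static BitSet collectAll(int[] docs, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docs) {
            bits.set(doc);
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] docs = {1, 4, 7, 9, 12};
        int maxDoc = 16;
        int maxDocVisited = 5;

        BitSet filtered = collectAboveVisited(docs, maxDoc, maxDocVisited);
        BitSet all = collectAll(docs, maxDoc);

        System.out.println(filtered); // {7, 9, 12}
        System.out.println(all);      // {1, 4, 7, 9, 12}

        // A consumer that only advances past maxDocVisited sees the same next doc
        // from either bitset.
        System.out.println(filtered.nextSetBit(maxDocVisited + 1)); // 7
        System.out.println(all.nextSetBit(maxDocVisited + 1));      // 7
    }
}
```

In this sketch both variants give the same answer to a consumer that only looks beyond `maxDocVisited`, so the check only saves some writes. That would be consistent with the small, statistically insignificant difference in the second benchmark.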