zacharymorn commented on PR #12194:
URL: https://github.com/apache/lucene/pull/12194#issuecomment-1491226817
Thanks @mikemccand for the review! Yes, I ran the full 20 iterations with
the `enwiki-20130102-lines.txt` corpus. I also tried running just a single
`AndHighNotMonth` task to find out which one was giving the 800% improvement,
but the highest improvement I saw was around 150%, with this single task:
```
AndHighNotMonth: +its -monthPostings:apr # freq=1160703
```
```
Task             QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                 p-value
PKLookup               182.12  (28.8%)                    84.26   (8.4%)  -53.7% ( -70% -  -23%)   0.000
AndHighNotMonth         64.22   (8.4%)                   160.51  (59.1%)  149.9% (  75% -  237%)   0.000
```
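As a quick sanity check on the table, the `Pct diff` column is consistent with the plain relative change in mean QPS between the two runs. The sketch below only reproduces that arithmetic; the class and method names (`PctDiffCheck`, `pctDiff`) are hypothetical, and the formula is an assumption inferred from the numbers, not taken from the benchmark tool's source.

```java
import java.util.Locale;

// Hypothetical helper, not part of the benchmark tooling: it only checks
// that the reported "Pct diff" matches a simple relative QPS change.
final class PctDiffCheck {

  /** Relative change of the candidate QPS versus the baseline QPS, in percent. */
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // QPS values copied from the table above.
    System.out.printf(Locale.ROOT, "AndHighNotMonth: %.1f%%%n", pctDiff(64.22, 160.51));
    System.out.printf(Locale.ROOT, "PKLookup: %.1f%%%n", pctDiff(182.12, 84.26));
  }
}
```

Under that assumption the two rows come out at roughly 149.9% and -53.7%, which agrees with the reported values.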
I'm not sure whether that 800% actually comes from the JVM's JIT compilation.
For the above task, leveraging skip data helps a lot: it lets us skip far
more docs at once than a typical block of 128 docs would allow:
```
First doc in block: 0 | Last doc in block: 127 | Furthest skip entry doc: -1
First doc in block: 128 | Last doc in block: 255 | Furthest skip entry doc: 524287 | Number of continuous matching docs: 524032
First doc in block: 524288 | Last doc in block: 524415 | Furthest skip entry doc: 589823 | Number of continuous matching docs: 65408
First doc in block: 589824 | Last doc in block: 589951 | Furthest skip entry doc: 655359 | Number of continuous matching docs: 65408
First doc in block: 655360 | Last doc in block: 655487 | Furthest skip entry doc: 720895 | Number of continuous matching docs: 65408
First doc in block: 720896 | Last doc in block: 721023 | Furthest skip entry doc: 786431 | Number of continuous matching docs: 65408
First doc in block: 786432 | Last doc in block: 786559 | Furthest skip entry doc: 851967 | Number of continuous matching docs: 65408
First doc in block: 851968 | Last doc in block: 852095 | Furthest skip entry doc: 917503 | Number of continuous matching docs: 65408
First doc in block: 917504 | Last doc in block: 917631 | Furthest skip entry doc: 983039 | Number of continuous matching docs: 65408
First doc in block: 983040 | Last doc in block: 983167 | Furthest skip entry doc: 991231 | Number of continuous matching docs: 8064
First doc in block: 991232 | Last doc in block: 991359 | Furthest skip entry doc: 992255 | Number of continuous matching docs: 896
First doc in block: 992256 | Last doc in block: 992383 | Furthest skip entry doc: 993279 | Number of continuous matching docs: 896
First doc in block: 993280 | Last doc in block: 993407 | Furthest skip entry doc: 994303 | Number of continuous matching docs: 896
First doc in block: 994304 | Last doc in block: 994431 | Furthest skip entry doc: 995327 | Number of continuous matching docs: 896
First doc in block: 995328 | Last doc in block: 995455 | Furthest skip entry doc: 996351 | Number of continuous matching docs: 896
First doc in block: 996352 | Last doc in block: 996479 | Furthest skip entry doc: 997375 | Number of continuous matching docs: 896
First doc in block: 997376 | Last doc in block: 997503 | Furthest skip entry doc: 998399 | Number of continuous matching docs: 896
First doc in block: 998400 | Last doc in block: 998527 | Furthest skip entry doc: -1
First doc in block: 998528 | Last doc in block: 998655 | Furthest skip entry doc: -1
First doc in block: 998656 | Last doc in block: 998783 | Furthest skip entry doc: -1
First doc in block: 998784 | Last doc in block: 998911 | Furthest skip entry doc: -1
First doc in block: 998912 | Last doc in block: 999039 | Furthest skip entry doc: -1
```
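To make the log easier to read, here is a small sketch of the arithmetic it appears to follow. It is not the PR's code: `SkipRunCheck`, `continuousRun`, and `blocksCovered` are hypothetical names, and the sketch assumes that the reported count is simply the furthest skip entry doc minus the last doc in the current block, with `-1` meaning no usable skip entry.

```java
// Hypothetical helper, not part of Lucene or the PR: it only reproduces
// the arithmetic visible in the log output.
final class SkipRunCheck {

  /** Block size assumed from the comment ("a typical block of 128 docs"). */
  static final int BLOCK_SIZE = 128;

  /**
   * Number of docs past the current block that the skip entry covers,
   * or 0 when there is no skip entry (logged as -1).
   */
  static int continuousRun(int lastDocInBlock, int furthestSkipDoc) {
    if (furthestSkipDoc < 0) {
      return 0;
    }
    return furthestSkipDoc - lastDocInBlock;
  }

  /** How many whole 128-doc blocks a run of that length corresponds to. */
  static int blocksCovered(int run) {
    return run / BLOCK_SIZE;
  }

  public static void main(String[] args) {
    // {last doc in block, furthest skip entry doc}, copied from the log.
    int[][] rows = {{127, -1}, {255, 524287}, {524415, 589823}, {983167, 991231}, {991359, 992255}};
    for (int[] row : rows) {
      int run = continuousRun(row[0], row[1]);
      System.out.println("last=" + row[0] + " skip=" + row[1]
          + " run=" + run + " blocks=" + blocksCovered(run));
    }
  }
}
```

Read this way, the second log line covers 524032 docs, the equivalent of 4094 blocks of 128, in a single step, while the later entries cover 511, 63, and 7 blocks at a time.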