benwtrent commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657348683
OK, I ran it on an 8M data set with 128 segments.
Indeed, it seemingly visits way fewer vectors, and the results are consistent
across multiple threads.
```
recall  latency(ms)  nDoc     topK  fanout  visited  num segments  selectivity
0.719   36.100       8000000  100   100     28015    128           1.000
```
8-threaded QPS run:
```
recall  latency(ms)  nDoc     topK  fanout  visited  num segments  selectivity
0.719   5.700        8000000  100   100     28015    128           1.000
```
The current baseline (which isn't multi-threaded consistent, so I am just
showing the single thread perf):
```
recall  latency(ms)  nDoc     topK  fanout  visited  num segments  selectivity
0.949   59.900       8000000  100   100     87299    128           1.000
```
Given the visited count and latency, I wonder whether "visited" actually
represents what this query is visiting.
But maybe the answer is to allow more exploration per segment?
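
For what it's worth, here is a rough back-of-the-envelope check of the single-threaded numbers above. It is only a sketch: the variable and function names are made up for illustration, and it assumes latency is roughly proportional to the number of vectors actually visited, which may not hold. Under that assumption, about a third of the visits costing about 60% of the latency means each reported visit looks nearly twice as expensive as in the baseline, which is what makes the "visited" count look suspicious.

```python
# Numbers copied from the single-threaded tables above.
candidate = {"recall": 0.719, "latency_ms": 36.1, "visited": 28015}
baseline = {"recall": 0.949, "latency_ms": 59.9, "visited": 87299}


def us_per_visited(run):
    """Average latency per reported visited vector, in microseconds."""
    return run["latency_ms"] * 1000.0 / run["visited"]


# How the candidate compares to the baseline on each axis.
visited_ratio = candidate["visited"] / baseline["visited"]
latency_ratio = candidate["latency_ms"] / baseline["latency_ms"]

# ASSUMPTION: latency scales with visited vectors. If that held and the
# counter were accurate, this ratio would be close to 1.
cost_ratio = us_per_visited(candidate) / us_per_visited(baseline)

print(f"visited ratio (candidate/baseline): {visited_ratio:.2f}")
print(f"latency ratio (candidate/baseline): {latency_ratio:.2f}")
print(f"per-visit cost ratio:               {cost_ratio:.2f}")
print(f"recall drop:                        {baseline['recall'] - candidate['recall']:.3f}")
```

A per-visit cost ratio well above 1 is consistent with either under-counted visits or a fixed per-segment overhead that dominates with 128 segments; the numbers alone can't tell those apart.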