msokolov commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2692803205
Sorry, it took me a while to get back to this. My local setup got messed up
and somehow I was exhaustively searching entire segments! Anyway I finally got
this working again, including tracking reentries so we can see better what's
going on. I also have a modified version of luceneutil that reports on this
which I will post over there, and I named the free parameter `lambda` so we can
call it something. Detailed results on the Cohere 768 data are below. My summary:
with generous parameter settings (high `fanout`, high `lambda`, or both), the
reentries don't add anything; the result queues are already large enough to
capture the global top K. At lower settings of these parameters, though, reentry
recovers some results that would otherwise have been overlooked. The way I'm
thinking about this is that it is essentially equivalent to fanout, although a
bit better because it scales with partition size, and it can additionally serve
as a safety measure in case of highly skewed data. We wouldn't want to adopt
this (more efficient) pro-rated strategy without it, because the strategy is
vulnerable to that adversarial case. Maybe we should have a larger-scale dataset
that demonstrates this, e.g. one where a timestamp is a key part of the vector
data and the documents are indexed over time.
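To make the "scales with partition size" point concrete, here is a minimal sketch of one way a pro-rated per-segment queue size could be computed. The formula is an assumption for illustration only (a binomial mean plus `lambda` standard deviations); the comment above does not spell out the rule the PR actually uses, and `per_segment_k` is a hypothetical name.

```python
import math


def per_segment_k(k: int, proportion: float, lam: float) -> int:
    """Hypothetical pro-rated queue size for one segment.

    ASSUMPTION: the expected number of global top-k hits in a segment holding
    `proportion` of the documents is modelled as binomial, and the budget is
    the mean plus `lam` standard deviations. This is a sketch, not the PR's code.
    """
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    mean = k * proportion
    std = math.sqrt(k * proportion * (1.0 - proportion))
    # Never collect fewer than 1 or more than k hits from a single segment.
    return max(1, min(k, math.ceil(mean + lam * std)))


# Eight equally sized segments, as in the benchmark index reported below.
for lam in (3, 5, 12):
    print(lam, per_segment_k(50, 1 / 8, lam))
```

Under this assumed rule, a larger `lam` widens every segment's queue much as fanout does, but the margin is tied to each segment's share of the index rather than being a fixed number of extra hits.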
# comparing reentry with no reentry
## LAMBDA=3, no reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.732         5.433  500000    50       0       64        250         no     6218          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.770         7.389  500000   100       0       64        250         no     8849          0     0.00      Infinity             0             0.00          0.000         0.000
 0.837         7.338  500000    50      50       64        250         no     8849          0     0.00      Infinity             0             0.00          0.000         0.000
 0.832         9.366  500000   100      50       64        250         no    11174          0     0.00      Infinity             0             0.00          0.000         0.000
 0.876         9.211  500000    50     100       64        250         no    11174          0     0.00      Infinity             0             0.00          0.000         0.000
 0.867        11.018  500000   100     100       64        250         no    13375          0     0.00      Infinity             0             0.00          0.000         0.000
```
## LAMBDA=3, with reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.806         6.131  500000    50       0       64        250         no     6560       1128     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.838         8.953  500000   100       0       64        250         no     9460       1197     0.00      Infinity             0             0.00          0.000         0.000
 0.855         7.813  500000    50      50       64        250         no     8923        179     0.00      Infinity             0             0.00          0.000         0.000
 0.862        10.191  500000   100      50       64        250         no    11405        393     0.00      Infinity             0             0.00          0.000         0.000
 0.880         9.364  500000    50     100       64        250         no    11200         52     0.00      Infinity             0             0.00          0.000         0.000
 0.880        11.462  500000   100     100       64        250         no    13494        172     0.00      Infinity             0             0.00          0.000         0.000
```
## LAMBDA=5, with reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.837         6.932  500000    50       0       64        250         no     7814        415     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.858         9.765  500000   100       0       64        250         no    10959        497     0.00      Infinity             0             0.00          0.000         0.000
 0.876         9.107  500000    50      50       64        250         no    10739         76     0.00      Infinity             0             0.00          0.000         0.000
 0.879        11.781  500000   100      50       64        250         no    13420        175     0.00      Infinity             0             0.00          0.000         0.000
 0.893        11.336  500000    50     100       64        250         no    13316          7     0.00      Infinity             0             0.00          0.000         0.000
 0.892        13.209  500000   100     100       64        250         no    15752         69     0.00      Infinity             0             0.00          0.000         0.000
```
## LAMBDA=5, no reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.803         6.554  500000    50       0       64        250         no     7682          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.823         9.062  500000   100       0       64        250         no    10707          0     0.00      Infinity             0             0.00          0.000         0.000
 0.870         9.137  500000    50      50       64        250         no    10707          0     0.00      Infinity             0             0.00          0.000         0.000
 0.866        11.392  500000   100      50       64        250         no    13313          0     0.00      Infinity             0             0.00          0.000         0.000
 0.893        11.330  500000    50     100       64        250         no    13313          0     0.00      Infinity             0             0.00          0.000         0.000
 0.888        13.281  500000   100     100       64        250         no    15705          0     0.00      Infinity             0             0.00          0.000         0.000
```
## LAMBDA=12, no reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.886         9.955  500000    50       0       64        250         no    12100          0     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.893        13.789  500000   100       0       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.905        13.922  500000    50      50       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.904        16.443  500000   100      50       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.911        16.797  500000    50     100       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.910        19.732  500000   100     100       64        250         no    23274          0     0.00      Infinity             0             0.00          0.000         0.000
```
## LAMBDA=12, with reentry
```
recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  reentries  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
 0.887         9.826  500000    50       0       64        250         no    12108         20     0.00      Infinity             8          1501.56       1464.844      1464.844
 0.895        13.490  500000   100       0       64        250         no    16666         39     0.00      Infinity             0             0.00          0.000         0.000
 0.905        13.616  500000    50      50       64        250         no    16645          0     0.00      Infinity             0             0.00          0.000         0.000
 0.904        16.507  500000   100      50       64        250         no    20217          4     0.00      Infinity             0             0.00          0.000         0.000
 0.911        16.565  500000    50     100       64        250         no    20215          0     0.00      Infinity             0             0.00          0.000         0.000
 0.910        19.024  500000   100     100       64        250         no    23274          1     0.00      Infinity             0             0.00          0.000         0.000
```
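As a quick cross-check of the summary, the sketch below computes the recall gained by reentry at each `(topK, fanout)` setting. The recall values are copied from the tables above; the `RECALL` layout and the `recall_gain` helper are illustrative names and not part of luceneutil.

```python
# lambda -> {(topK, fanout): (recall without reentry, recall with reentry)}
RECALL = {
    3: {
        (50, 0): (0.732, 0.806), (100, 0): (0.770, 0.838),
        (50, 50): (0.837, 0.855), (100, 50): (0.832, 0.862),
        (50, 100): (0.876, 0.880), (100, 100): (0.867, 0.880),
    },
    5: {
        (50, 0): (0.803, 0.837), (100, 0): (0.823, 0.858),
        (50, 50): (0.870, 0.876), (100, 50): (0.866, 0.879),
        (50, 100): (0.893, 0.893), (100, 100): (0.888, 0.892),
    },
    12: {
        (50, 0): (0.886, 0.887), (100, 0): (0.893, 0.895),
        (50, 50): (0.905, 0.905), (100, 50): (0.904, 0.904),
        (50, 100): (0.911, 0.911), (100, 100): (0.910, 0.910),
    },
}


def recall_gain(lam):
    """Return {(topK, fanout): recall(with reentry) - recall(no reentry)}."""
    return {
        cfg: round(with_re - without_re, 3)
        for cfg, (without_re, with_re) in RECALL[lam].items()
    }


for lam in sorted(RECALL):
    gains = recall_gain(lam)
    print(f"lambda={lam}: max gain {max(gains.values()):.3f}", gains)
```

The largest gain falls from 0.074 at `lambda=3` to 0.035 at `lambda=5` and 0.002 at `lambda=12`, and within each `lambda` it is concentrated at `fanout=0`.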
