mccullocht opened a new pull request, #16030:
URL: https://github.com/apache/lucene/pull/16030
Add an option to the quantization format to enable or disable centering
(enabled by default). When centering is disabled we also stop writing the float
vectors which can lead to significant storage savings. Special handling is
included during merges -- we check that all of the input is in the same
encoding, and handle transcoding if some of the input is float vectors.
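
The merge-time check described above could be pictured roughly like this. It
is only a sketch under assumptions: `Encoding`, `Action`, `planMerge`, and
`needsTranscoding` are hypothetical names made up for illustration, not
classes or methods from this PR or from Lucene.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeEncodingSketch {
  /** Hypothetical: how an input segment stores its vectors. */
  enum Encoding { FLOAT32, QUANTIZED_NO_CENTER }

  /** Hypothetical: what the merge does with one input segment. */
  enum Action { COPY_BYTES, QUANTIZE_FLOATS }

  /** True if at least one input still holds raw float vectors. */
  static boolean needsTranscoding(List<Encoding> inputs) {
    return inputs.contains(Encoding.FLOAT32);
  }

  /** One action per input: copy already-quantized bytes, quantize float inputs. */
  static List<Action> planMerge(List<Encoding> inputs) {
    List<Action> plan = new ArrayList<>();
    for (Encoding e : inputs) {
      plan.add(e == Encoding.QUANTIZED_NO_CENTER ? Action.COPY_BYTES : Action.QUANTIZE_FLOATS);
    }
    return plan;
  }

  public static void main(String[] args) {
    List<Encoding> inputs = List.of(Encoding.QUANTIZED_NO_CENTER, Encoding.FLOAT32);
    System.out.println(needsTranscoding(inputs) + " " + planMerge(inputs));
  }
}
```

When every input is already quantized without centering, the plan is all
copies, which is the case where a merge can skip re-quantization.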
Large portions of this change were generated using Claude Code. I reviewed,
tweaked, and tested the code before putting it up for review.
This change is made as a new codec because the format changes: the center
vector is no longer written when centering is disabled. This is not strictly
necessary as we could write a zero vector instead, but I have plans to make
other format changes related to data blindness, see #16029.
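
To illustrate why a zero center vector would behave the same as no centering,
here is a minimal sketch. It assumes a plain uniform 8-bit quantizer over a
fixed interval; `quantize`, `lo`, and `hi` are hypothetical names, and this is
not the quantizer Lucene actually uses.

```java
import java.util.Arrays;

public class CenteringSketch {
  /**
   * Hypothetical uniform quantizer: optionally subtract a center, clamp to
   * [lo, hi], then map linearly onto the integer levels 0..255.
   * A null center means centering is disabled.
   */
  static int[] quantize(float[] v, float[] center, float lo, float hi) {
    int[] out = new int[v.length];
    float scale = 255f / (hi - lo);
    for (int i = 0; i < v.length; i++) {
      float x = center == null ? v[i] : v[i] - center[i];
      float clamped = Math.max(lo, Math.min(hi, x));
      out[i] = Math.round((clamped - lo) * scale);
    }
    return out;
  }

  public static void main(String[] args) {
    float[] v = {-0.4f, 0.1f, 0.9f};
    int[] uncentered = quantize(v, null, -1f, 1f);
    int[] zeroCentered = quantize(v, new float[v.length], -1f, 1f);
    // Subtracting 0 leaves every component unchanged, so both encodings match.
    System.out.println(Arrays.equals(uncentered, zeroCentered));
  }
}
```

In this sketch the two outputs are identical, so omitting the center only
changes what is stored, not the quantized values.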
luceneutil results -- 1M Cohere vectors, 8-bit quantization.
before:
```
recall  latency(ms)  netCPU  avgCpuCount  nDoc     searchType  topK  fanout
0.974   2.304        2.297   0.997        1000000  KNN         100   100

resultSimilarity  decay  resultCount  maxConn  beamWidth  quantized  visited
N/A               N/A    100.000      64       250        8 bits     8619

index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
132.85    7527.40       235.00          1             5047.27

filterStrategy  filterSelectivity  overSample  vec_disk(MB)  vec_RAM(MB)
null            N/A                1.000       4898.071      991.821

bp-reorder  indexType
false       HNSW
```
after:
```
recall  latency(ms)  netCPU  avgCpuCount  nDoc     searchType  topK  fanout
0.972   2.281        2.274   0.997        1000000  KNN         100   100

resultSimilarity  decay  resultCount  maxConn  beamWidth  quantized  visited
N/A               N/A    100.000      64       250        8 bits     8612

index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
143.06    6990.07       160.33          1             1140.98

filterStrategy  filterSelectivity  overSample  vec_disk(MB)  vec_RAM(MB)
null            N/A                1.000       4898.071      991.821

bp-reorder  indexType
false       HNSW
```
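The harness extrapolates vec_disk(MB) from the input size, which is why it is
identical in both runs, so believe index_size(MB) instead: 5047.27 MB before
vs. 1140.98 MB after, about 4.4x smaller. Force merge is faster (235.00s vs.
160.33s) since we don't have to re-quantize vectors on merge. Recall is very
similar (0.974 vs. 0.972) but YMMV.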
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]