Re: [I] Can we add configuration on dropping raw vectors from quantized formats after some period of time? [lucene]

via GitHub Tue, 28 Apr 2026 07:04:26 -0700


mikemccand commented on issue #13251:
URL: https://github.com/apache/lucene/issues/13251#issuecomment-4335942836


   Thanks @mccullocht -- it looks like that discussion starts around [this 
comment](https://github.com/apache/lucene/pull/15903#issuecomment-4200627888) 
(#15903 is getting big -- final inch is hard!).
   
   [BBQ supports 
pre-conditioning](https://www.elastic.co/search-labs/blog/elasticsearch-bbq-preconditioning-vectors)
 (as does the [proposed 
TurboQuant](https://github.com/apache/lucene/pull/15903), I think), which 
normalizes model-output dimensional irregularities in the incoming vectors.  
(luceneutil's `knnPerfTest.py` [smells query/document 
vectors](https://github.com/mikemccand/luceneutil/blob/c530a720329bba774fefdadd17e027187845d100/src/python/knnPerfTest.py#L464)
 to try to spot such misbehaving dimensions/vectors/corpora.)  Given that, 
can't we always use a data-blind quantization if pre-conditioning is enabled?  
All dimensions should then look nicely uniform, carrying the same (ish) 
information content and the same (ish) normal value distributions?
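
   To make the "nicely uniform" intuition concrete, here is a minimal sketch 
(pure Python, not Lucene code) of a random rotation spreading one dominant 
dimension across all dimensions while preserving the norm.  The dense 
Gram-Schmidt rotation and the names `random_rotation`, `rotate`, `norm` and 
`peak_share` are assumptions for illustration; the actual BBQ / TurboQuant 
transforms may well differ.

```python
import math
import random


def random_rotation(dim, seed=0):
    """Build a random orthonormal matrix by Gram-Schmidt on Gaussian rows."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the rows accepted so far.
        for r in rows:
            proj = sum(a * b for a, b in zip(v, r))
            v = [a - proj * b for a, b in zip(v, r)]
        length = math.sqrt(sum(a * a for a in v))
        if length > 1e-9:  # skip the (very unlikely) degenerate draw
            rows.append([a / length for a in v])
    return rows


def rotate(matrix, v):
    """Matrix-vector product: one dot product per row."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def peak_share(v):
    """Fraction of the vector's energy held by its largest component."""
    return max(x * x for x in v) / sum(x * x for x in v)


DIM = 16
# Hypothetical "irregular" vector: one dimension dominates all the others.
spiky = [1.0] + [0.01] * (DIM - 1)
rotated = rotate(random_rotation(DIM), spiky)

print("norm before/after:", norm(spiky), norm(rotated))
print("peak share before/after:", peak_share(spiky), peak_share(rotated))
```

   The same rotation has to be applied to query vectors, and because it is 
orthonormal it leaves distances and dot products between vectors unchanged.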
   
   And then if the quantization is data-blind, can't we expose the option to 
always drop the full precision vectors?  Users may still want to keep them 
around, e.g. if they are re-ranking, but merging would no longer lose 
information if the quantization is data-blind?
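
   A minimal sketch of why data-blind codes could survive a merge untouched, 
while codes tied to per-segment statistics could not.  The fixed range 
`[-1, 1]`, the per-segment min/max range, and all function names here are 
hypothetical stand-ins, not how any Lucene format actually derives its 
quantization parameters.

```python
def quantize(v, lo, hi, bits):
    """Uniform scalar quantization of each component onto [lo, hi]."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [min(levels, max(0, round((x - lo) / step))) for x in v]


def segment_range(segment):
    """Data-dependent parameters: min and max over every value in a segment."""
    values = [x for v in segment for x in v]
    return min(values), max(values)


BITS = 4
seg_a = [[0.10, -0.20], [0.30, 0.05]]
seg_b = [[-0.90, 0.80], [0.60, -0.70]]

# Data-blind: the range is a constant, so each code depends only on its own
# vector.  Merging is a plain copy of the existing codes.
blind_a = [quantize(v, -1.0, 1.0, BITS) for v in seg_a]
blind_b = [quantize(v, -1.0, 1.0, BITS) for v in seg_b]
merged_blind = blind_a + blind_b
requantized_blind = [quantize(v, -1.0, 1.0, BITS) for v in seg_a + seg_b]

# Data-dependent: the range comes from the segment, so it changes on merge
# and the old codes no longer mean the same thing.
lo_a, hi_a = segment_range(seg_a)
lo_m, hi_m = segment_range(seg_a + seg_b)
dependent_a = [quantize(v, lo_a, hi_a, BITS) for v in seg_a]
dependent_a_after_merge = [quantize(v, lo_m, hi_m, BITS) for v in seg_a]

print("blind codes unchanged by merge:", merged_blind == requantized_blind)
print("dependent codes before merge:", dependent_a)
print("dependent codes after merge: ", dependent_a_after_merge)
```

   In the data-dependent case the post-merge codes above are computed from 
the raw vectors; without them, the merge would have to re-quantize from 
already-quantized values.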
   
   (Hmm, maybe not -- in the adversarial cases where vectors are very tightly 
clustered, not utilizing the full vector space, random rotation won't fix that, 
and centering is then probably still helpful).
   
   > A technique that OSQ doesn't implement but appears in one of the source 
papers (Intel LVQ) is quantizing the residual, e.g. for some vector v quantize 
v - dequantize(quantize(v)) and use this for re-ranking.
   
   That sounds like a great idea!  Rather than wastefully / independently 
storing your higher precision vectors for reranking, take advantage of the 
quantized form (used for first pass vector retrieval) and build on it.  Maybe I 
quantize to 2 bits for first pass, and use 4 bits for second pass, which would 
equate (ish) to reranking with 6 bit precision vectors (if we quantized the 
residuals), vs today where it would just be reranking with 4 bits.
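
   A minimal sketch of the residual idea, assuming a plain uniform scalar 
quantizer over a fixed `[-1, 1]` range (OSQ and LVQ choose their intervals 
differently, and every name below is hypothetical).  The first pass keeps 2 
bits per dimension; the second pass stores 4 more bits for 
`v - dequantize(quantize(v))`.

```python
def quantize(v, lo, hi, bits):
    """Uniform scalar quantization of each component onto [lo, hi]."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [min(levels, max(0, round((x - lo) / step))) for x in v]


def dequantize(codes, lo, hi, bits):
    """Map integer codes back to the grid points they represent."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


def max_abs_error(v, approx):
    return max(abs(x - a) for x, a in zip(v, approx))


LO, HI = -1.0, 1.0
COARSE_BITS, RESIDUAL_BITS = 2, 4

v = [0.83, -0.41, 0.07, -0.96, 0.55, 0.22]

# First pass: 2-bit codes, used for the initial retrieval.
coarse = quantize(v, LO, HI, COARSE_BITS)
coarse_approx = dequantize(coarse, LO, HI, COARSE_BITS)

# Rounding to the nearest grid point bounds the residual by half a coarse
# step, so the residual quantizer only needs to cover that small range.
half_step = (HI - LO) / ((1 << COARSE_BITS) - 1) / 2
residual = [x - a for x, a in zip(v, coarse_approx)]
fine = quantize(residual, -half_step, half_step, RESIDUAL_BITS)

# Second pass (re-ranking): coarse approximation plus dequantized residual.
fine_approx = dequantize(fine, -half_step, half_step, RESIDUAL_BITS)
refined = [a + r for a, r in zip(coarse_approx, fine_approx)]

err_coarse = max_abs_error(v, coarse_approx)
err_refined = max_abs_error(v, refined)
print("max error, 2-bit only:        ", err_coarse)
print("max error, 2-bit + 4-bit resid:", err_refined)
```

   In this sketch the refined grid step is (2/3)/15, roughly 0.044, versus 
2/63, roughly 0.032, for a direct 6-bit quantizer over the same range, which 
is one way to read the "(ish)".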
   
   But how would we model this in Lucene?  Would it somehow be a 2nd vector 
field which magically pulls the residual vector from the 1st field?  (Sheesh, 
another example of spooky/magical field to field interaction within one 
`Document`...)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

