Re: [I] Can we add configuration on dropping raw vectors from quantized formats after some period of time? [lucene]

via GitHub Tue, 28 Apr 2026 09:30:25 -0700


mccullocht commented on issue #13251:
URL: https://github.com/apache/lucene/issues/13251#issuecomment-4337203143

> (Hmm, maybe not -- in the adversarial cases where vectors are very tightly
clustered, not utilizing the full vector space, random rotation won't fix that,
and centering is then probably still helpful).

The comment you linked showed that centering is still valuable -- even after
rotation -- so I don't want to be too opinionated in terms of what we provide.
In particular, centering appears to be much more valuable than rotation at lower
bit rates, where it maximizes the value of what little information is
available.

We can have an option to drop full precision vectors if the codec is
configured to be data-blind (OSQ can already do this by making the center the 0
vector). I poked at this for an hour or so and the tricky part is merging mixed
input where some segments are already quantized and some are float-only.
Whatever you might think of it, the current codec is _a lot_ simpler because it
assumes float input.
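
To illustrate the trade-off between centering and a data-blind (zero) center, here is a minimal sketch. It uses a toy per-vector min/max scalar quantizer, not Lucene's OSQ implementation; `CenteringSketch`, `roundTrip`, and `squaredError` are hypothetical names made up for this example. With the center fixed at the 0 vector, the encoding of a vector depends only on that vector, but for tightly clustered data the centered variant should lose much less information at low bit rates.

```java
/** Hypothetical sketch: centered vs. data-blind scalar quantization (not Lucene's OSQ). */
final class CenteringSketch {

  /**
   * Quantizes (v - center) to `bits` bits per dimension over that vector's own
   * [min, max] range, then decodes the codes back to floats.
   */
  static float[] roundTrip(float[] v, float[] center, int bits) {
    int n = v.length;
    float[] shifted = new float[n];
    float lo = Float.POSITIVE_INFINITY;
    float hi = Float.NEGATIVE_INFINITY;
    for (int i = 0; i < n; i++) {
      shifted[i] = v[i] - center[i];
      lo = Math.min(lo, shifted[i]);
      hi = Math.max(hi, shifted[i]);
    }
    int maxCode = (1 << bits) - 1;
    float step = (hi - lo) / maxCode;
    if (step == 0f) {
      step = 1f; // constant vector: every component maps to code 0
    }
    float[] decoded = new float[n];
    for (int i = 0; i < n; i++) {
      int code = Math.round((shifted[i] - lo) / step);
      code = Math.max(0, Math.min(maxCode, code));
      decoded[i] = center[i] + lo + code * step;
    }
    return decoded;
  }

  /** Sum of squared component-wise differences. */
  static double squaredError(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  public static void main(String[] args) {
    int n = 16;
    float[] centroid = new float[n];
    float[] v = new float[n];
    for (int i = 0; i < n; i++) {
      centroid[i] = -1f + 2f * i / (n - 1);             // spread-out cluster center
      v[i] = centroid[i] + 0.01f * (float) Math.sin(i); // vector very close to it
    }
    float[] zero = new float[n]; // data-blind: center fixed at the origin
    System.out.println("centered error:   " + squaredError(v, roundTrip(v, centroid, 2)));
    System.out.println("data-blind error: " + squaredError(v, roundTrip(v, zero, 2)));
  }
}
```

The centered variant needs a centroid computed from the data, which is presumably why re-quantizing on merge wants the original floats; the zero-center variant has no such dependency.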

> But how would we model this in Lucene? Would it somehow be a 2nd vector
field which magically pulls the residual vector from the 1st field (sheesh,
another example of spooky/magical field to field interaction within one
Document...).

I think you can hide it all in one field. You may want a second file for the
residual vectors so it's easy to avoid them when warming an index. After that
you can use a
[rescorer](https://github.com/apache/lucene/blob/9582ee9b846ff92de600d0648405893335a7da0a/lucene/core/src/java/org/apache/lucene/index/FloatVectorValues.java#L79)
to compare a float input to the primary+residual vector. Comparison can happen by
either (a) decoding the primary and residual vectors into a float vector and
performing direct comparison or (b) quantizing the target vector and summing
the results of the primary and residual vector comparison.
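
The two comparison strategies can be sketched roughly as follows. This is a toy illustration with a fixed-interval uniform quantizer and a dot-product score; `ResidualSketch`, `Quantizer`, `scoreDecoded`, and `scoreSummed` are hypothetical names, not Lucene APIs. For readability both paths work on decoded floats, whereas a real implementation of (b) would more likely run integer dot products on the codes and apply corrective terms.

```java
/** Hypothetical sketch: scoring against a primary + residual quantized vector. */
final class ResidualSketch {

  /** Uniform scalar quantizer over a fixed interval [lo, hi]. */
  static final class Quantizer {
    final float lo;
    final float step;
    final int maxCode;

    Quantizer(float lo, float hi, int bits) {
      this.lo = lo;
      this.maxCode = (1 << bits) - 1;
      this.step = (hi - lo) / maxCode;
    }

    int[] encode(float[] v) {
      int[] codes = new int[v.length];
      for (int i = 0; i < v.length; i++) {
        int c = Math.round((v[i] - lo) / step);
        codes[i] = Math.max(0, Math.min(maxCode, c)); // clamp to the code range
      }
      return codes;
    }

    float[] decode(int[] codes) {
      float[] v = new float[codes.length];
      for (int i = 0; i < codes.length; i++) {
        v[i] = lo + codes[i] * step;
      }
      return v;
    }
  }

  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** The residual is whatever the primary quantization failed to capture. */
  static float[] residualOf(float[] v, float[] decodedPrimary) {
    float[] r = new float[v.length];
    for (int i = 0; i < v.length; i++) {
      r[i] = v[i] - decodedPrimary[i];
    }
    return r;
  }

  /** (a) Rebuild one float vector from both levels and compare it to the float query. */
  static float scoreDecoded(float[] query, float[] primary, float[] residual) {
    float[] full = new float[primary.length];
    for (int i = 0; i < primary.length; i++) {
      full[i] = primary[i] + residual[i];
    }
    return dot(query, full);
  }

  /** (b) Quantize the query, score each level separately, and sum the two results. */
  static float scoreSummed(float[] query, Quantizer queryQ, float[] primary, float[] residual) {
    float[] q = queryQ.decode(queryQ.encode(query));
    return dot(q, primary) + dot(q, residual);
  }

  public static void main(String[] args) {
    float[] v = new float[8];
    float[] query = new float[8];
    for (int i = 0; i < 8; i++) {
      v[i] = (float) Math.sin(i + 1);
      query[i] = (float) Math.cos(i);
    }
    Quantizer primaryQ = new Quantizer(-1f, 1f, 4);
    float[] primary = primaryQ.decode(primaryQ.encode(v));
    // The residual of a rounded value lies within half a primary step.
    Quantizer residualQ = new Quantizer(-primaryQ.step / 2, primaryQ.step / 2, 4);
    float[] residual = residualQ.decode(residualQ.encode(residualOf(v, primary)));
    System.out.println("exact:       " + dot(query, v));
    System.out.println("(a) decoded: " + scoreDecoded(query, primary, residual));
    System.out.println("(b) summed:  " + scoreSummed(query, new Quantizer(-1f, 1f, 8), primary, residual));
  }
}
```

Option (b) relies on the dot product being linear, so the primary and residual contributions can be computed independently and added; a warm-index search could score with the primary term alone and only touch the residual file when rescoring.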

> I think it's [this paper from Intel](https://arxiv.org/abs/2304.04759)?
And also implemented in their GitHub project
[ScalableVectorSearch](https://github.com/intel/ScalableVectorSearch) -- ASL 2
licensed (like Lucene) so we can peruse / poach / borrow ideas/implementations!

This is indeed the paper I was thinking of. There is a [second
paper](https://arxiv.org/pdf/2402.02044) that describes a packing scheme to
make comparisons between vectors of different bit rates (e.g. 2-bit vector
against 4-bit vector) cheaper. Last I looked they had hidden the implementation
and were distributing pre-compiled objects, so it's good to see they changed
their minds.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

