benwtrent opened a new pull request, #12582:
URL: https://github.com/apache/lucene/pull/12582
As with most codec changes, this is an eye-popping number of LoC, and the
design isn't finished yet.
I am opening this as a draft to be open about the work and to discuss further
direction.
In initial benchmarking (using non-normalized Cohere embeddings with
max-inner-product, which is a particularly difficult case for naive
quantization), I get 10-20% faster search, 2x faster index building, and ~4x
smaller storage used for the search (I am keeping the raw vectors around; we
can debate whether we want to do that).
Recall@10 with 100 fanout = 0.804
Recall@100 with 200 fanout = 0.9
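
For readers who want the storage claim made concrete, here is a minimal sketch of scalar quantization from float32 components to one byte each. It assumes simple global min/max bounds and 8-bit codes; the function names and the `LEVELS` constant are hypothetical, and the quantizer in this PR may choose its bounds and bit width differently.

```python
from array import array
from typing import List, Sequence, Tuple

LEVELS = 255  # assumption: 8-bit codes, so buckets run from 0 to 255


def compute_bounds(vectors: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the smallest and largest component seen across all vectors."""
    lower = min(min(v) for v in vectors)
    upper = max(max(v) for v in vectors)
    return lower, upper


def quantize(vector: Sequence[float], lower: float, upper: float) -> bytes:
    """Clamp each component to [lower, upper] and map it to a bucket 0..LEVELS."""
    scale = LEVELS / (upper - lower) if upper > lower else 0.0
    return bytes(
        round((min(max(x, lower), upper) - lower) * scale) for x in vector
    )


def dequantize(codes: bytes, lower: float, upper: float) -> List[float]:
    """Map bucket numbers back to approximate float values."""
    step = (upper - lower) / LEVELS
    return [lower + c * step for c in codes]


vectors = [[-1.0, 0.0, 0.5, 1.0], [0.25, -0.5, 0.75, -0.25]]
lower, upper = compute_bounds(vectors)
codes = [quantize(v, lower, upper) for v in vectors]

# float32 uses 4 bytes per component, the quantized form uses 1.
raw_bytes = sum(len(array("f", v).tobytes()) for v in vectors)
quantized_bytes = sum(len(c) for c in codes)
```

With one byte per component instead of four, `raw_bytes` is four times `quantized_bytes`, which is where a ~4x figure would come from if only the quantized copy were counted. Each reconstructed component is off by at most half a bucket width, and that error is what costs recall.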
I am reaching the point where the design needs to be finalized and I wanted
to reach out for feedback.
Some design discussion points that I am unsure about are:
- Do we want to have a new "flat" vector codec that HNSW (or other
complicated vector indexing methods) can use? The detractor here is that the
HNSW codec now relies on another pluggable thing, a "flat" vector index (it
just provides mechanisms for reading, writing, and merging vectors in a flat
index).
- Should "quantization" just be a thing that is provided to vector codecs?
The main detractor here is that future scalar quantization (like int4 or even
binary) could easily be added.
- Should the "quantizer" keep the raw vectors around itself, or rely on
some external party to provide them (in this case, I am relying on the HNSW
codec)?
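
To make the first and third questions easier to discuss, here is a rough sketch of the layering they describe: a "flat" store that only reads, writes, and merges vectors, and a graph-style index that delegates storage to it. Every name here (`InMemoryFlatStore`, `GraphIndex`) is hypothetical and is not a Lucene API, and the search is a brute-force max-inner-product scan standing in for real graph traversal.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


class InMemoryFlatStore:
    """Hypothetical "flat" store: it only reads, writes, and merges vectors."""

    def __init__(self) -> None:
        self._vectors: Dict[int, List[float]] = {}

    def write(self, doc_id: int, vector: Sequence[float]) -> None:
        self._vectors[doc_id] = list(vector)

    def read(self, doc_id: int) -> List[float]:
        return self._vectors[doc_id]

    def items(self) -> Iterator[Tuple[int, List[float]]]:
        # Iterate in doc id order so results are deterministic.
        return iter(sorted(self._vectors.items()))

    def merge(self, other: "InMemoryFlatStore") -> None:
        # Copy every vector from the other store into this one.
        for doc_id, vector in other.items():
            self.write(doc_id, vector)


class GraphIndex:
    """Stand-in for an HNSW-style index that owns no vector storage itself."""

    def __init__(self, store: InMemoryFlatStore) -> None:
        self.store = store

    def add(self, doc_id: int, vector: Sequence[float]) -> None:
        self.store.write(doc_id, vector)

    def search(self, query: Sequence[float], k: int) -> List[int]:
        # Brute-force inner product over the flat store (no graph here).
        scored = [
            (sum(q * x for q, x in zip(query, vec)), doc_id)
            for doc_id, vec in self.store.items()
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:k]]


store = InMemoryFlatStore()
index = GraphIndex(store)
index.add(0, [1.0, 0.0])
index.add(1, [0.0, 2.0])
index.add(2, [0.5, 0.5])
top = index.search([0.0, 1.0], k=2)
```

In this shape the index never touches storage directly, so a quantizing store could be swapped in behind the same `write`/`read`/`merge` surface. The cost is the extra pluggable dependency noted in the first question.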
Again, this is a draft, and I have a ton of comments to fix up, etc. But I
wanted early feedback on what we want to integrate into Lucene.
As a side note, it really seems some of these classes
(OffHeap...vectorReader...) should be common between all the vector codecs
instead of copied around. It's a ton of code that gets copied with almost no
change between codecs :/