[GitHub] [lucene] benwtrent opened a new pull request, #12582: Add new int8 scalar quantization to HNSW codec

via GitHub Thu, 21 Sep 2023 11:57:03 -0700


benwtrent opened a new pull request, #12582:
URL: https://github.com/apache/lucene/pull/12582


   As with most codec changes, this is an eye-popping number of LoC, and the 
design isn't finished yet. 
   
   I am opening this as draft to be open about the work and to discuss further 
direction. 
   
   In initial benchmarking (using non-normalized Cohere embeddings with 
max-inner product, which is a particularly difficult case for naive 
quantization), I get 10-20% faster search, 2x faster index building, and ~4x 
smaller storage used for the search (I am keeping the raw vectors around; we 
can debate whether we want to do that).
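
   For readers new to the idea, here is a minimal sketch of scalar quantization to one byte per dimension. It is not the code in this PR: the function names, the min/max-based range, and the unsigned 0..255 buckets are assumptions chosen for illustration. It only shows where a ~4x storage figure can come from, since each 4-byte float32 dimension becomes a single byte.

```python
from array import array


def quantize(vector, lo, hi):
    """Map each float in [lo, hi] to a bucket in 0..255; values outside are clamped."""
    scale = 255.0 / (hi - lo)
    out = bytearray(len(vector))
    for i, x in enumerate(vector):
        clamped = min(max(x, lo), hi)
        out[i] = round((clamped - lo) * scale)
    return bytes(out)


def dequantize(codes, lo, hi):
    """Approximate the original floats from their one-byte codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


# Hypothetical sample vector stored as float32 (4 bytes per dimension).
raw = array("f", [-1.5, -0.25, 0.0, 0.75, 2.0])
lo, hi = min(raw), max(raw)

codes = quantize(raw, lo, hi)
approx = dequantize(codes, lo, hi)

raw_bytes = len(raw) * raw.itemsize  # float32 storage
quantized_bytes = len(codes)         # one byte per dimension
max_error = max(abs(a - b) for a, b in zip(raw, approx))
```

   The rounding error per dimension is at most half a bucket width, `(hi - lo) / 255 / 2`, which is why the choice of range matters for data such as non-normalized embeddings.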
   
   Recall@10 with 100 fanout = 0.804
   Recall@100 with 200 fanout = 0.9
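
   For reference, recall@k is assumed here to mean the fraction of the exact (brute-force) top-k neighbors that the approximate search also returns in its top k. A minimal sketch under that assumption follows; `recall_at_k` and the sample ids are hypothetical and not taken from the benchmark.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k ids that also appear in the approximate top-k."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Hypothetical result lists for a single query, best match first.
exact = [7, 3, 9, 1, 4, 8, 2, 6, 0, 5]
approx = [7, 9, 3, 4, 11, 8, 2, 12, 0, 5]

score = recall_at_k(approx, exact, 10)
```

   In a benchmark, this per-query score would typically be averaged over all queries.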
   
   I am reaching the point where the design needs to be finalized, and I wanted 
to reach out for feedback.
   
   Some design discussion points that I am unsure about are:
   
    - Do we want to have a new "flat" vector codec that HNSW (or other 
complicated vector indexing methods) can use? The detractor here is that the 
HNSW codec would then rely on another pluggable thing, a "flat" vector index 
(which just provides mechanisms for reading, writing, and merging vectors in a 
flat index).
    - Should "quantization" just be a thing that is provided to vector codecs? 
The main detractor here is that future scalar quantization (like int4 or even 
binary) could easily be added. 
    - Should the "quantizer" keep the raw vectors around itself? Or rely on 
some external party to provide them (in this case, I am relying on the HNSW 
codec)?
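
   To make the first and third questions concrete, here is a rough sketch of the layering being discussed. Every class and method name in it is hypothetical and none of it mirrors the actual Lucene classes: a "flat" store only reads and writes vectors by ordinal, a quantized store relies on another store for the raw vectors, and a stand-in for the graph index (brute force here, not HNSW) only sees the store interface.

```python
from typing import Protocol, Sequence


class FlatVectorStore(Protocol):
    """Hypothetical 'flat' layer: stores vectors by ordinal, knows nothing about graphs."""

    def add(self, vector: Sequence[float]) -> int: ...
    def get(self, ordinal: int) -> Sequence[float]: ...


class RawFlatStore:
    """Keeps the raw float vectors."""

    def __init__(self):
        self._vectors = []

    def add(self, vector):
        self._vectors.append(list(vector))
        return len(self._vectors) - 1

    def get(self, ordinal):
        return self._vectors[ordinal]


class QuantizedFlatStore:
    """Serves one-byte codes for search; relies on another store for the raw vectors."""

    def __init__(self, raw, lo, hi):
        self.raw, self.lo, self.hi = raw, lo, hi
        self._codes = []

    def add(self, vector):
        scale = 255.0 / (self.hi - self.lo)
        clamped = (min(max(x, self.lo), self.hi) for x in vector)
        self._codes.append(bytes(round((x - self.lo) * scale) for x in clamped))
        return self.raw.add(vector)  # ordinals stay aligned with the raw store

    def get(self, ordinal):
        step = (self.hi - self.lo) / 255.0
        return [self.lo + c * step for c in self._codes[ordinal]]


class GraphIndex:
    """Stand-in for an HNSW-like index: brute-force max inner product over a store."""

    def __init__(self, store):
        self.store = store
        self.ordinals = []

    def add(self, vector):
        self.ordinals.append(self.store.add(vector))

    def nearest(self, query):
        def dot(ordinal):
            return sum(q * v for q, v in zip(query, self.store.get(ordinal)))
        return max(self.ordinals, key=dot)


docs = [[0.9, 0.1], [-0.2, 0.8], [0.5, 0.5]]
raw_index = GraphIndex(RawFlatStore())
quantized_index = GraphIndex(QuantizedFlatStore(RawFlatStore(), lo=-1.0, hi=1.0))
for doc in docs:
    raw_index.add(doc)
    quantized_index.add(doc)

query = [1.0, 0.0]
best_raw = raw_index.nearest(query)
best_quantized = quantized_index.nearest(query)
```

   In this arrangement the index code is identical for both stores, which is the appeal of a pluggable flat layer; the cost is the extra indirection raised in the first bullet.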
   
   
   Again, this is a draft, and I have a ton of comments to fix up, etc. But I 
wanted early feedback on what we want to integrate into Lucene.
   
   
   As a side note, it really seems some of these classes 
(OffHeap...vectorReader...) should be common between all the vector codecs 
instead of copied around. It's a ton of code that gets copied with almost no 
change between codecs :/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

