kaivalnp opened a new pull request, #15979:
URL: https://github.com/apache/lucene/pull/15979

   ### Description
   
   Closes #14758
   
   Add a new de-duplicating vector format that only stores unique vectors on 
disk.
   De-duplication is done for vectors across all docs and fields indexed by the 
format.
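
As a rough illustration of the idea only (this is not the PR's code or its on-disk layout), de-duplication across docs and fields can be sketched as a shared pool of unique vectors plus a per-field list of ordinals pointing into that pool. The class name `DedupVectorsSketch`, its methods, and the choice to treat vectors as duplicates only when they are byte-for-byte identical are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch; not the classes or file format from the PR. */
final class DedupVectorsSketch {
  // Unique vectors in first-seen order; a vector's index here is its ordinal.
  private final List<float[]> uniqueVectors = new ArrayList<>();
  // Maps the raw bytes of a vector to its ordinal (assumption: exact-match dedup).
  private final Map<ByteBuffer, Integer> ordinalByContent = new HashMap<>();
  // For each field, the ordinal of every vector added, in insertion order.
  private final Map<String, List<Integer>> ordinalsByField = new HashMap<>();

  /** Adds a vector for a field and returns the ordinal of its stored copy. */
  int add(String field, float[] vector) {
    // Encode the vector as bytes so it can be used as a content-based map key.
    ByteBuffer key = ByteBuffer.allocate(vector.length * Float.BYTES);
    for (float v : vector) {
      key.putFloat(v);
    }
    key.flip();

    Integer ord = ordinalByContent.get(key);
    if (ord == null) {
      // First time this content is seen: store it once and assign a new ordinal.
      ord = uniqueVectors.size();
      uniqueVectors.add(vector.clone());
      ordinalByContent.put(key, ord);
    }
    // The pool is shared by all fields, so duplicates are detected across fields too.
    ordinalsByField.computeIfAbsent(field, f -> new ArrayList<>()).add(ord);
    return ord;
  }

  /** Number of distinct vectors actually stored. */
  int uniqueCount() {
    return uniqueVectors.size();
  }

  /** Ordinals recorded for a field, one per added vector. */
  List<Integer> ordinals(String field) {
    return ordinalsByField.getOrDefault(field, List.of());
  }
}
```

In this sketch, adding the same vector under two different fields stores it once and records the same ordinal for both, which is the space saving the description refers to. How the actual format handles flush, merge, and lookup is covered by the Markdown doc in the PR, not by this example.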
   
   Disclaimer: This was mostly written by an AI, with me refining the 
implementation through prompts -- although I think it did a pretty good job on 
its own!
   
   Details about the format itself (layout of vectors on disk, de-duplication 
strategy during flush and merge, performance tradeoffs, etc.) are included in a 
Markdown doc in the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

