Hello, I would like to ask whether anybody has tried or planned to implement indexing for dense vectors. The default scoring process is suitable only for text documents, and we would like to use, support, or develop a plugin that combines the default index with a dense vector index (or replaces it) for non-textual documents.
We have documents represented by both text and float vectors. We would like to find documents similar to a given document using its vector, rather than using queries like MORELIKETHIS.

There is a technique for encoding vectors as text, but it is not very accurate. The float values 0.0, 0.1, and 0.8 at one vector position are at different distances from each other, |0.0 - 0.1| < |0.1 - 0.8|, but the encoded strings are not: 'V1-0.00-0.05' ~ 'V1-0.05-0.10' ~ 'V1-0.80-0.85'. Therefore we would like to index and search the whole dense vector in Lucene, using some existing vector index technique, e.g. https://github.com/spotify/annoy.

My question is whether anybody has tested this functionality before, and what your opinion is about implementing it. Is it technically possible to make a plugin that supports this (with another distributed index and a separate scoring function), or is it better to store the index for dense vectors outside of Lucene?

Thank you for your insight and time,
Jimmy
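
P.S. To make the encoding problem concrete, here is a minimal sketch in Python. The `encode` and `term_match` helpers and the bucket width of 0.05 are assumptions made up for illustration, not part of Lucene or of any existing plugin. The sketch only shows that once values are turned into bucket tokens, exact term matching treats every pair of different tokens as equally dissimilar.

```python
import math


def encode(position, value, width=0.05):
    """Hypothetical text encoding: map a float to a bucket token 'V<pos>-<low>-<high>'."""
    low = math.floor(value / width) * width
    return f"V{position}-{low:.2f}-{low + width:.2f}"


def term_match(token_a, token_b):
    """Exact term matching: 1 if the tokens are identical, otherwise 0."""
    return 1 if token_a == token_b else 0


values = [0.0, 0.1, 0.8]
tokens = [encode(1, v) for v in values]

# Numeric distances between neighbouring values differ...
numeric_gaps = (abs(values[0] - values[1]), abs(values[1] - values[2]))

# ...but the encoded tokens simply fail to match in both cases.
token_matches = (term_match(tokens[0], tokens[1]), term_match(tokens[1], tokens[2]))

print(tokens)
print(numeric_gaps)
print(token_matches)
```

Both pairs get the same match score even though one pair of values is much closer than the other, which is why we would prefer a real vector index with a distance-based score.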