[ https://issues.apache.org/jira/browse/LUCENE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julie Tibshirani resolved LUCENE-10146.
---------------------------------------
Fix Version/s: main (9.0)
Resolution: Fixed
> Add VectorSimilarityFunction.COSINE
> -----------------------------------
>
> Key: LUCENE-10146
> URL: https://issues.apache.org/jira/browse/LUCENE-10146
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Julie Tibshirani
> Priority: Major
> Fix For: main (9.0)
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> To perform ANN search with cosine similarity, users are expected to normalize
> the document and query vectors to unit length, then use
> {{VectorSimilarityFunction.DOT_PRODUCT}}. I think it would be good to also
> support cosine similarity directly through
> {{VectorSimilarityFunction.COSINE}}. This would allow users to perform ANN
> search based on cosine similarity while retaining access to the original vectors
> through {{VectorValues}}. That way they can use the original vectors in a
> reranking step or return them to the application for further processing.
> It looks like nmslib and hnswlib support cosine similarity. On the other
> hand, FAISS only supports dot product and suggests that users normalize
> their vectors in order to compute cosine similarity
> (https://github.com/facebookresearch/faiss/issues/95). To me, adding this one
> additional similarity is worth it for what it lets users accomplish.
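
For readers following along, here is a minimal sketch in plain Java of the equivalence the description relies on: cosine similarity of the original vectors equals the dot product of the same vectors after normalizing them to unit length. The class and method names (CosineSketch, dot, cosine, normalize) are hypothetical helpers written for illustration only. They do not use the Lucene API and are not the implementation added by this issue.

```java
// Illustrative only: hypothetical helpers, not Lucene code.
final class CosineSketch {

    // Plain dot product of two vectors of equal length.
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Cosine similarity computed directly on the original vectors.
    // Assumes neither vector is all zeros (otherwise the result is NaN).
    static float cosine(float[] a, float[] b) {
        double norms = Math.sqrt((double) dot(a, a) * dot(b, b));
        return (float) (dot(a, b) / norms);
    }

    // Returns a unit-length copy; the input array is left unmodified.
    // Assumes v is not all zeros.
    static float[] normalize(float[] v) {
        float norm = (float) Math.sqrt(dot(v, v));
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] doc = {3f, 4f};
        float[] query = {4f, 3f};

        // Workaround described above: normalize first, then use dot product.
        float viaDot = dot(normalize(doc), normalize(query));

        // Direct cosine: the original vectors stay as they were indexed.
        float direct = cosine(doc, query);

        // Both values should agree up to floating-point rounding (about 0.96).
        System.out.println(direct + " vs " + viaDot);
    }
}
```

The practical difference is what gets stored: with the normalize-then-dot-product workaround only the unit-length copies are indexed, whereas a direct cosine function lets the index keep the original magnitudes for reranking or for returning to the application.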