Hi all,
Cohere just published approximately 100 million embeddings based on Wikipedia content:
https://txt.cohere.com/embedding-archives-wikipedia/
and the corresponding datasets:
https://huggingface.co/datasets/Cohere/wikipedia-22-12-en-embeddings
https://huggingface.co/datasets/Cohere/wikipedia-22-12-de-embeddings
HTH
Right, RAVectorValues just fronts an array of vectors, and it
doesn't have any intermediate storage or other state (like a file
pointer), so it can support many simultaneous callers. Other
implementations of the interface work differently; see
OffHeapByteVectorValues, which is representing vector
It looks like I misunderstood how the Builder works, and the RAVV provided
to the constructor does not need to contain any values up front.
Specifically, Lucene95HnswVectorsWriter.FieldWriter adds vectors
incrementally to the RAVV that it gives to the builder as addValue is
called.
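
For illustration, here is a minimal sketch of that pattern: the builder only
holds a reference to the vector source and reads an ordinal when asked to
insert it, so the source can start out empty. All class and method names
below (GrowableVectors, SketchGraphBuilder, SketchFieldWriter) are
hypothetical stand-ins, not Lucene's actual classes, and the brute-force
"nearest earlier node" step is only a placeholder for real HNSW insertion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical random-access vector source that starts empty and grows.
final class GrowableVectors {
    private final List<float[]> vectors = new ArrayList<>();

    // Appends a vector and returns the ordinal it was stored under.
    int add(float[] v) {
        vectors.add(v);
        return vectors.size() - 1;
    }

    int size() { return vectors.size(); }

    float[] vectorValue(int ord) { return vectors.get(ord); }
}

// Hypothetical builder: keeps a reference to the source and reads a vector
// only at the moment its ordinal is inserted.
final class SketchGraphBuilder {
    private final GrowableVectors source;
    private final List<Integer> nearestEarlier = new ArrayList<>();

    SketchGraphBuilder(GrowableVectors source) { this.source = source; }

    // Assumes ordinals arrive in order 0, 1, 2, ... and that the vector for
    // ord has already been added to the source.
    void addGraphNode(int ord) {
        float[] v = source.vectorValue(ord);
        int best = -1; // -1 means "no earlier node exists"
        float bestDist = Float.POSITIVE_INFINITY;
        for (int other = 0; other < ord; other++) {
            float d = squaredDistance(v, source.vectorValue(other));
            if (d < bestDist) {
                bestDist = d;
                best = other;
            }
        }
        nearestEarlier.add(best);
    }

    int nearestEarlier(int ord) { return nearestEarlier.get(ord); }

    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}

// Hypothetical writer: hands an empty source to the builder up front, then
// appends each vector to the source just before asking the builder to use it.
final class SketchFieldWriter {
    private final GrowableVectors vectors = new GrowableVectors();
    private final SketchGraphBuilder builder = new SketchGraphBuilder(vectors);

    void addValue(float[] v) {
        int ord = vectors.add(v);
        builder.addGraphNode(ord);
    }

    int nearestEarlier(int ord) { return builder.nearestEarlier(ord); }
}
```

The only ordering requirement in this sketch is that a vector is added to the
source before the builder is asked to insert its ordinal.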
On Wed, Apr 19,
I still don't understand, because the RAVectorValues copy method is just
"return this". When is it important to have a true separate copy?
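
To make the question concrete, here is a small sketch of the two cases. It
assumes a simplified, hypothetical interface (SketchVectorValues) rather than
Lucene's real one: an array-backed source hands out its stored arrays
directly, while a buffer-backed source decodes every vector into one reusable
scratch array, roughly as a file-backed reader might.

```java
// Hypothetical, simplified interface; not Lucene's actual API.
interface SketchVectorValues {
    float[] vectorValue(int ord);
    SketchVectorValues copy();
}

// Array-backed: returns the stored arrays themselves, so there is no shared
// scratch state and copy() can safely return this.
final class ArrayBacked implements SketchVectorValues {
    private final float[][] data;

    ArrayBacked(float[][] data) { this.data = data; }

    public float[] vectorValue(int ord) { return data[ord]; }

    public SketchVectorValues copy() { return this; }
}

// Buffer-backed: every call overwrites and returns the same scratch array.
final class BufferBacked implements SketchVectorValues {
    private final float[] flat; // all vectors laid out back to back
    private final int dim;
    private final float[] scratch;

    BufferBacked(float[] flat, int dim) {
        this.flat = flat;
        this.dim = dim;
        this.scratch = new float[dim];
    }

    public float[] vectorValue(int ord) {
        System.arraycopy(flat, ord * dim, scratch, 0, dim);
        return scratch;
    }

    // A true copy shares the underlying data but gets its own scratch buffer.
    public SketchVectorValues copy() { return new BufferBacked(flat, dim); }
}

final class CopyDemo {
    // Reads vector i from left and vector j from right, then compares them.
    static float compare(SketchVectorValues left, SketchVectorValues right, int i, int j) {
        float[] a = left.vectorValue(i);
        float[] b = right.vectorValue(j);
        float sum = 0f;
        for (int k = 0; k < a.length; k++) {
            float diff = a[k] - b[k];
            sum += diff * diff;
        }
        return sum;
    }
}
```

With the buffer-backed source, compare(v, v, i, j) reads both vectors into the
same scratch array, so the second read clobbers the first and the distance
comes out as zero. compare(v, v.copy(), i, j) keeps the two reads apart. With
the array-backed source both calls give the same answer, which is why "return
this" is enough there.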
On Wed, Apr 19, 2023 at 3:04 AM Michael Sokolov wrote:
> These vector values have internal buffers they use to return the vectors.
> In order to compare two vectors