Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-20 Thread Michael Wechner
Hi all, Cohere just published approx. 100 million embeddings based on Wikipedia content, see https://txt.cohere.com/embedding-archives-wikipedia/ and the datasets at https://huggingface.co/datasets/Cohere/wikipedia-22-12-en-embeddings https://huggingface.co/datasets/Cohere/wikipedia-22-12-de-embeddings HTH

Re: HNSW questions

2023-04-20 Thread Michael Sokolov
Right, RAVectorValues just fronts an array of vectors; it has no intermediate storage or other state (like a file pointer), so it can support many simultaneous callers. Other implementations of the interface work differently; see OffHeapByteVectorValues, which is representing vector

Re: HNSW questions

2023-04-20 Thread Jonathan Ellis
It looks like I misunderstood how the Builder works, and the RAVV provided to the constructor does not need to contain any values up front. Specifically, Lucene95HnswVectorsWriter.FieldWriter adds vectors incrementally to the RAVV that it gives to the builder as addValue is called. On Wed, Apr 19,

Re: HNSW questions

2023-04-20 Thread Jonathan Ellis
I still don't understand, because RAVectorValues copy method is "return this." When is it important to have a true separate copy? On Wed, Apr 19, 2023 at 3:04 AM Michael Sokolov wrote: > These vector values have internal buffers they use to return the vectors. > In order to compare two vectors
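
To illustrate the point about internal buffers, here is a minimal sketch. It does not use the actual Lucene classes: the names Vectors, ArrayVectors, BufferedVectors, and squaredDistance are hypothetical stand-ins invented for this example. It assumes a reader that decodes every vector into one reusable scratch array, as an off-heap implementation might, and contrasts it with an array-backed one whose copy() can safely return this.

```java
public class VectorCopyDemo {
    /** Hypothetical stand-in for a random-access vector source; not the Lucene interface. */
    interface Vectors {
        float[] vectorValue(int ord);
        Vectors copy();
    }

    /** Array-backed: every ordinal has its own array and there is no scratch state. */
    static final class ArrayVectors implements Vectors {
        private final float[][] data;

        ArrayVectors(float[][] data) {
            this.data = data;
        }

        public float[] vectorValue(int ord) {
            return data[ord];
        }

        public Vectors copy() {
            return this; // safe: nothing mutable is shared between callers
        }
    }

    /** Buffered: decodes each vector into a single reusable scratch array (assumed design). */
    static final class BufferedVectors implements Vectors {
        private final float[] flat;
        private final int dim;
        private final float[] scratch;

        BufferedVectors(float[] flat, int dim) {
            this.flat = flat;
            this.dim = dim;
            this.scratch = new float[dim];
        }

        public float[] vectorValue(int ord) {
            System.arraycopy(flat, ord * dim, scratch, 0, dim);
            return scratch; // the same array is returned on every call
        }

        public Vectors copy() {
            return new BufferedVectors(flat, dim); // same data, separate scratch buffer
        }
    }

    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] flat = {0f, 0f, 3f, 4f}; // two 2-dimensional vectors: (0,0) and (3,4)
        Vectors buffered = new BufferedVectors(flat, 2);

        // Both reads return the same scratch array, so the second read overwrites the first.
        float aliased = squaredDistance(buffered.vectorValue(0), buffered.vectorValue(1));

        // Reading the second vector through a copy keeps the two buffers separate.
        Vectors other = buffered.copy();
        float separate = squaredDistance(buffered.vectorValue(0), other.vectorValue(1));

        System.out.println(aliased + " " + separate); // prints "0.0 25.0"
    }
}
```

In this sketch a true separate copy only matters when vectorValue hands back shared mutable state; the array-backed variant has none, which is consistent with its copy method being "return this."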