That class is intended for use by the Lucene index writer; it is not designed as a general-purpose class for reuse outside that context. And because IndexWriter writes documents to disk in bulk, the vectors are already all available by the time the graph is built.
On Wed, Apr 19, 2023 at 3:54 PM Jonathan Ellis <jbel...@gmail.com> wrote:
>
> Thanks, Michael!
>
> Looking at the paper by Malkov and Yashunin, it looks like the algorithm
> allows for building the hnsw graph incrementally. Why does our
> implementation require specifying all the vectors up front to
> HnswGraphBuilder.create?
>
> On Wed, Apr 19, 2023 at 3:04 AM Michael Sokolov <msoko...@gmail.com> wrote:
>>
>> These vector values have internal buffers they use to return the vectors. In
>> order to compare two vectors we need to use two independent sources so that
>> one doesn't overwrite this internal state when fetching the second vector.
>>
>> Sorry I forgot the second question and can't see it on my phone. Brb
>>
>> On Tue, Apr 18, 2023, 10:55 PM Jonathan Ellis <jbel...@gmail.com> wrote:
>>>
>>> Hi all, a couple questions on how HNSW works:
>>>
>>> 1. What is driving the requirement for two copies of the input vectors? It
>>> looks like the RAVV implementations do shallow copies, so the vector from A
>>> is the same that would be returned by B. What am I missing?
>>>
>>> 2. What is the intended behavior when adding identical vectors to a HNSW?
>>> It looks like when I supply 10 identical vectors, they all get added to the
>>> graph, but when I search for the nearest neighbors, I only get one of them
>>> in the result set.
>>>
>>> --
>>> Jonathan Ellis
>>> co-founder, http://www.datastax.com
>>> @spyced
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
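
To illustrate the internal-buffer point from the quoted thread, here is a minimal Java sketch. It does not use Lucene: BufferedVectors, its copy() method, and SharedBufferDemo are hypothetical stand-ins invented for this example, and the real classes may well differ in detail. The only assumption carried over from the thread is that a vector source hands back its own reusable buffer, and that a shallow copy shares the backing data but gets a separate buffer.

```java
// Hypothetical stand-in for a buffered vector source; NOT Lucene's actual class.
final class BufferedVectors {
    private final float[][] stored; // backing data, shared between shallow copies
    private final float[] buffer;   // per-instance scratch buffer, reused on every call

    BufferedVectors(float[][] stored) {
        this.stored = stored;
        this.buffer = new float[stored[0].length];
    }

    // Copies vector `ord` into the internal buffer and returns that same buffer.
    float[] vectorValue(int ord) {
        System.arraycopy(stored[ord], 0, buffer, 0, buffer.length);
        return buffer;
    }

    // Shallow copy: same backing data, but an independent buffer.
    BufferedVectors copy() {
        return new BufferedVectors(stored);
    }
}

final class SharedBufferDemo {
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[][] data = {{1f, 0f}, {0f, 1f}}; // two orthogonal vectors
        BufferedVectors a = new BufferedVectors(data);
        BufferedVectors b = a.copy();

        // One source: the second fetch overwrites the buffer returned by the first,
        // so both references point at vector 1 and the result is wrong.
        float[] v0 = a.vectorValue(0);
        float[] v1 = a.vectorValue(1);
        System.out.println("one source:  " + dot(v0, v1)); // prints 1.0, should be 0.0

        // Two sources: each has its own buffer, so both vectors stay intact.
        float[] w0 = a.vectorValue(0);
        float[] w1 = b.vectorValue(1);
        System.out.println("two sources: " + dot(w0, w1)); // prints 0.0
    }
}
```

In this sketch the copy is shallow with respect to the stored data yet still gives each side its own scratch buffer, which is one way the "two independent sources" requirement could be satisfied without duplicating the vectors themselves.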