It looks like I misunderstood how the builder works: the RAVV
(RandomAccessVectorValues) passed to it does not need to contain any
values up front. Specifically, Lucene95HnswVectorsWriter.FieldWriter
adds vectors incrementally, as addValue is called, to the same RAVV
that it gave to the builder.
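
For anyone reading the archive later, here is a minimal sketch of that
pattern as I understand it. None of these classes are Lucene's:
GrowableVectors and IncrementalConsumer are hypothetical stand-ins, and
the "nearest earlier node" scan is only a placeholder for real graph
construction. The point is that the consumer holds the view from the
start and only ever reads ordinals that have already been added.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list-backed vector view; NOT Lucene's RandomAccessVectorValues. */
final class GrowableVectors {
  private final int dimension;
  private final List<float[]> vectors = new ArrayList<>();

  GrowableVectors(int dimension) {
    this.dimension = dimension;
  }

  /** Appends a copy of the vector and returns the ordinal it was stored under. */
  int addValue(float[] v) {
    if (v.length != dimension) {
      throw new IllegalArgumentException("expected dimension " + dimension);
    }
    vectors.add(v.clone());
    return vectors.size() - 1;
  }

  int size() {
    return vectors.size();
  }

  float[] vectorValue(int ord) {
    return vectors.get(ord);
  }
}

/** Hypothetical consumer that is handed an (initially empty) view at construction. */
final class IncrementalConsumer {
  private final GrowableVectors view;
  private final List<Integer> nodes = new ArrayList<>();

  IncrementalConsumer(GrowableVectors view) {
    this.view = view;
  }

  /** Registers ord and returns the nearest previously added ordinal, or -1 if none. */
  int addNode(int ord) {
    float[] target = view.vectorValue(ord); // must already be in the view
    int best = -1;
    float bestDist = Float.MAX_VALUE;
    for (int other : nodes) {
      float[] v = view.vectorValue(other);
      float d = 0f;
      for (int i = 0; i < v.length; i++) {
        float diff = v[i] - target[i];
        d += diff * diff; // squared Euclidean distance
      }
      if (d < bestDist) {
        bestDist = d;
        best = other;
      }
    }
    nodes.add(ord);
    return best;
  }
}

final class IncrementalDemo {
  public static void main(String[] args) {
    GrowableVectors view = new GrowableVectors(2);
    IncrementalConsumer consumer = new IncrementalConsumer(view); // view is empty here
    int a = view.addValue(new float[] {0f, 0f});
    System.out.println(consumer.addNode(a));
    int b = view.addValue(new float[] {10f, 0f});
    System.out.println(consumer.addNode(b));
  }
}
```

The ordering is the only contract in this sketch: add the vector to the
view first, then tell the consumer about its ordinal.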

On Wed, Apr 19, 2023 at 1:37 PM Michael Sokolov <msoko...@gmail.com> wrote:

> That class is intended for use by the Lucene index writer - it's not
> designed as a general purpose class for re-use outside that context.
> And IndexWriter writes documents to disk in bulk.
>
> On Wed, Apr 19, 2023 at 3:54 PM Jonathan Ellis <jbel...@gmail.com> wrote:
> >
> > Thanks, Michael!
> >
> > Looking at the paper by Malkov and Yashunin, it looks like the algorithm
> > allows for building the HNSW graph incrementally.  Why does our
> > implementation require specifying all the vectors up front to
> > HnswGraphBuilder.create?
> >
> > On Wed, Apr 19, 2023 at 3:04 AM Michael Sokolov <msoko...@gmail.com> wrote:
> >>
> >> These vector values have internal buffers they use to return the
> >> vectors. In order to compare two vectors we need to use two independent
> >> sources so that fetching the second vector doesn't overwrite the
> >> internal buffer that still holds the first.
> >>
> >> Sorry I forgot the second question and can't see it on my phone. Brb
> >>
> >> On Tue, Apr 18, 2023, 10:55 PM Jonathan Ellis <jbel...@gmail.com> wrote:
> >>>
> >>> Hi all, a couple of questions on how HNSW works:
> >>>
> >>> 1. What is driving the requirement for two copies of the input
> >>> vectors?  It looks like the RAVV implementations do shallow copies, so the
> >>> vector from A is the same that would be returned by B.  What am I missing?
> >>>
> >>> 2. What is the intended behavior when adding identical vectors to an
> >>> HNSW graph?  It looks like when I supply 10 identical vectors, they all get
> >>> added to the graph, but when I search for the nearest neighbors, I only get
> >>> one of them in the result set.
> >>>
> >>> --
> >>> Jonathan Ellis
> >>> co-founder, http://www.datastax.com
> >>> @spyced
> >
> >
> >
> > --
> > Jonathan Ellis
> > co-founder, http://www.datastax.com
> > @spyced
>

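On the earlier point in the thread about needing two independent
sources: a minimal sketch of the failure mode, as I now understand it.
BufferedVectorReader is a hypothetical class, not a Lucene one; it
assumes a reader that decodes every vector into a single reusable
scratch array and a copy() that creates a second reader with its own
scratch array.

```java
/** Hypothetical reader that returns every vector through one reusable buffer. */
final class BufferedVectorReader {
  private final float[][] stored; // stands in for on-disk or off-heap data
  private final float[] scratch;  // internal buffer reused on every call

  BufferedVectorReader(float[][] stored) {
    this.stored = stored;
    this.scratch = new float[stored[0].length];
  }

  /** Decodes the vector into the shared buffer and returns that buffer. */
  float[] vectorValue(int ord) {
    System.arraycopy(stored[ord], 0, scratch, 0, scratch.length);
    return scratch;
  }

  /** Shares the underlying data but allocates an independent buffer. */
  BufferedVectorReader copy() {
    return new BufferedVectorReader(stored);
  }
}

final class SharedBufferDemo {
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[][] data = {{1f, 0f}, {0f, 1f}}; // two orthogonal vectors
    BufferedVectorReader r1 = new BufferedVectorReader(data);
    BufferedVectorReader r2 = r1.copy();

    // One source: the second fetch overwrites the buffer the first returned,
    // so a and b are the same array and the result is vector 1 dotted with itself.
    float[] a = r1.vectorValue(0);
    float[] b = r1.vectorValue(1);
    System.out.println("same source:  " + dot(a, b));

    // Two sources: each keeps its own buffer, so the comparison is correct.
    float[] c = r1.vectorValue(0);
    float[] d = r2.vectorValue(1);
    System.out.println("two sources:  " + dot(c, d));
  }
}
```

With a single reader the dot product of two orthogonal vectors comes out
as 1.0 instead of 0.0, because both references point at the same
buffer. With the copy it comes out as 0.0.
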
-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced
