For vector search in Lucene, the functionality for memory access
(on/off-heap, using Panama FFM) and for vectorization (using the Vector API,
jdk.incubator.vector) is tightly coupled (see PanamaVectorUtilSupport
<https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>
). This made sense when it was originally added, because Panama FFM was in
preview and the Vector API was incubating.

However, since Panama FFM was finalized in JDK22 (
https://openjdk.org/jeps/454), I wonder if we should decouple it from
vectorization now?
This would mean exposing (and supporting!) top-level on/off-heap vector
similarity functions from all VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations, with Lucene providing one non-vectorized implementation
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26>
and one vectorized implementation powered by the Vector API
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>,
and scoping only the non-finalized (i.e. incubating Vector API related)
functionality to the MR-JAR (i.e. java25/
<https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).
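
To make the idea concrete, here is a rough sketch of what a non-vectorized
similarity function over MemorySegment could look like, using only the
finalized FFM API (JDK 22+) and no jdk.incubator.vector. The class and method
names (ScalarOffHeapDotProduct, dotProduct) are made up for illustration and
are not part of Lucene or of the linked PR; reading floats in native byte
order is also an assumption made to keep the example short.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration only: not Lucene's VectorUtilSupport API.
public class ScalarOffHeapDotProduct {

  // Assumption: native byte order; the unaligned layout avoids any
  // alignment requirement on the segment's base address.
  private static final ValueLayout.OfFloat FLOAT = ValueLayout.JAVA_FLOAT_UNALIGNED;

  // Plain scalar loop; works for heap-backed and native (off-heap) segments alike.
  public static float dotProduct(MemorySegment a, MemorySegment b, int dims) {
    float sum = 0f;
    for (int i = 0; i < dims; i++) {
      long offset = (long) i * Float.BYTES;
      sum += a.get(FLOAT, offset) * b.get(FLOAT, offset);
    }
    return sum;
  }

  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      // Off-heap "document" vector, filled by hand for the example.
      float[] doc = {1f, 2f, 3f};
      MemorySegment offHeap = arena.allocate((long) doc.length * Float.BYTES);
      for (int i = 0; i < doc.length; i++) {
        offHeap.set(FLOAT, (long) i * Float.BYTES, doc[i]);
      }
      // On-heap "query" vector wrapped as a segment, so both go through the same code path.
      MemorySegment onHeap = MemorySegment.ofArray(new float[] {4f, 5f, 6f});
      System.out.println(dotProduct(offHeap, onHeap, doc.length)); // prints 32.0
    }
  }
}
```

The point is only that nothing here depends on the incubating Vector API, so
code of this shape would not need to live under java25/.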

Although not a huge motivation, this would allow users who do not use
vectorization to score vectors off-heap.
The main benefit could be cleaner separation of functionality in the long
term, also making it easier to write new VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations
that primarily work with MemorySegment APIs (for example a native
implementation in #13572
<https://github.com/apache/lucene/pull/13572>).

Issue: https://github.com/apache/lucene/issues/15284
PR: https://github.com/apache/lucene/pull/15285

- Kaival
