I agree on waiting for Lucene 11; otherwise it may become very difficult to backport any related features to 10.x.
On Sat, Oct 11, 2025 at 5:58 PM Uwe Schindler <[email protected]> wrote:

> Hi,
>
> I am traveling until the end of October and won't be able to look into
> this. Basically we can move some parts of the MemorySegment supporting
> code to the main sourceSet, but I don't want to touch that code too much.
> The most important thing is: no public API is allowed to expose any
> internal Vector API signatures (we don't have a check for this), so we'd
> need to look at it closely. That is exactly why some of the code is in
> the java 25 sourceset, e.g. the MemorySegment-based scoring.
>
> In the future we may add more MemorySegment-based APIs into core (and
> remove all ByteBuffers, too), but this is an approach I wanted to look at
> after we have released Lucene 11.
>
> Uwe
>
> On 09.10.2025 at 23:57, Kaival Parikh wrote:
>
> For vector search in Lucene, functionality for memory access (on/off-heap
> -- using Panama FFM) and vectorization (using the Vector API --
> jdk.incubator.vector) is tightly coupled (see PanamaVectorUtilSupport
> <https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>
> ). This made sense when it was originally added, because Panama FFM was
> in preview and the Vector API was incubating.
>
> However, since Panama FFM was finalized in JDK22
> (https://openjdk.org/jeps/454), I wonder if we should decouple it from
> vectorization now?
>
> This would mean exposing (and supporting!) top-level on/off-heap vector
> similarity functions from all VectorUtilSupport
> <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
> implementations (with Lucene providing one non-vectorized
> <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26>
> and another Vector API powered vectorized
> <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>
> implementation), and scoping only non-finalized (i.e. incubating Vector
> API related) functionality to the MR-JAR (i.e. java25/
> <https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).
>
> Although not a huge motivation, this would allow users that do not use
> vectorization to score vectors off-heap.
>
> The main benefit could be a cleaner separation of functionality in the
> long term, also making it easier to write new VectorUtilSupport
> <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
> implementations that primarily work with MemorySegment APIs (for example
> a native implementation in #13572
> <https://github.com/apache/lucene/pull/13572>).
>
> Issue: https://github.com/apache/lucene/issues/15284
> PR: https://github.com/apache/lucene/pull/15285
>
> - Kaival
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: [email protected]
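
For readers following the thread, here is a rough sketch of what "scoring vectors off-heap without vectorization" from the quoted proposal could look like. It uses only the finalized FFM API (java.lang.foreign, JDK 22+) and no jdk.incubator.vector types, so nothing in its signature would leak the incubating Vector API. The class and method names (OffHeapDotProduct, dotProduct) are hypothetical and are not Lucene's VectorUtilSupport API; this is an illustration under those assumptions, not the implementation in the PR.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration only; not part of Lucene.
final class OffHeapDotProduct {

  // Plain scalar dot product over two segments holding `dims` floats each.
  // Works for both heap segments (MemorySegment.ofArray) and native segments.
  static float dotProduct(MemorySegment a, MemorySegment b, int dims) {
    float sum = 0f;
    for (int i = 0; i < dims; i++) {
      sum += a.getAtIndex(ValueLayout.JAVA_FLOAT, i)
          * b.getAtIndex(ValueLayout.JAVA_FLOAT, i);
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] query = {1f, 2f, 3f};
    float[] doc = {4f, 5f, 6f};

    // On-heap: wrap the query array without copying.
    MemorySegment onHeap = MemorySegment.ofArray(query);

    // Off-heap: copy the document vector into native memory owned by the arena.
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment offHeap =
          arena.allocate(
              doc.length * ValueLayout.JAVA_FLOAT.byteSize(),
              ValueLayout.JAVA_FLOAT.byteAlignment());
      for (int i = 0; i < doc.length; i++) {
        offHeap.setAtIndex(ValueLayout.JAVA_FLOAT, i, doc[i]);
      }
      System.out.println(dotProduct(onHeap, offHeap, doc.length));
    }
  }
}
```

The same method accepts either kind of segment, which is the sort of separation the proposal describes: memory access goes through MemorySegment, and whether the loop is vectorized is a separate concern.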
