Hi,

I am traveling until the end of October and won't be able to look into this. Basically, we can move some parts of the MemorySegment supporting code to the main sourceSet, but I don't want to touch that code too much. The most important thing is: no public API is allowed to have any internal Vector API signatures (we don't have a check for this), so we'd need to look at it closely. This is exactly why some of the code, e.g. the MemorySegment-based scoring, is in the java 25 sourceset.

In the future we may add more MemorySegment-based APIs to core (and remove all ByteBuffers, too), but this is an approach I want to look at after we have released Lucene 11.

Uwe

On 09.10.2025 at 23:57, Kaival Parikh wrote:
For vector search in Lucene, functionality for memory access (on/off-heap -- using Panama FFM) and vectorization (using the Vector API -- jdk.incubator.vector) is tightly coupled (see PanamaVectorUtilSupport <https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>). This made sense when it was originally added, because Panama FFM was in preview and the Vector API was incubating.

However, since Panama FFM was finalized in JDK22 (https://openjdk.org/jeps/454), I wonder if we should decouple it from vectorization now? This would mean exposing (and supporting!) top-level on/off-heap vector similarity functions from all VectorUtilSupport <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25> implementations (with Lucene providing one non-vectorized <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26> and another Vector API powered vectorized <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57> implementation), and only scoping non-finalized (i.e. incubating Vector API related) functionality in the MR-JAR (i.e. java25/ <https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).

Although not a huge motivation, this would allow users that do not use vectorization to score vectors off-heap. The main benefit could be cleaner separation of functionality in the long term, also making it easier to write new VectorUtilSupport <https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25> implementations that primarily work with MemorySegment APIs (for example a native implementation in #13572 <https://github.com/apache/lucene/pull/13572>).
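
For illustration only, here is a minimal sketch of what a non-vectorized similarity function over MemorySegment could look like when it relies solely on the finalized FFM API (JDK 22+) and has no Vector API types in its signatures. The class and method names (ScalarSegmentDotProduct, dotProduct, copyOffHeap) are hypothetical and are not Lucene's actual code; native byte order and float32 vectors are assumed.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical sketch, not part of Lucene: a scalar dot product that works
// for both heap-backed and native (off-heap) MemorySegments.
final class ScalarSegmentDotProduct {

  // Unaligned layout (native byte order, an assumption here) so that
  // segments with arbitrary base alignment can be read.
  private static final ValueLayout.OfFloat FLOAT_LAYOUT = ValueLayout.JAVA_FLOAT_UNALIGNED;

  private ScalarSegmentDotProduct() {}

  /** Computes the dot product of two float32 vectors stored in segments of equal size. */
  static float dotProduct(MemorySegment a, MemorySegment b) {
    if (a.byteSize() != b.byteSize()) {
      throw new IllegalArgumentException(
          "vector sizes differ: " + a.byteSize() + " != " + b.byteSize());
    }
    long count = a.byteSize() / Float.BYTES;
    float sum = 0f;
    for (long i = 0; i < count; i++) {
      // Plain scalar loop: no jdk.incubator.vector types involved.
      sum += a.getAtIndex(FLOAT_LAYOUT, i) * b.getAtIndex(FLOAT_LAYOUT, i);
    }
    return sum;
  }

  /** Copies a float[] into a newly allocated off-heap segment owned by the given arena. */
  static MemorySegment copyOffHeap(Arena arena, float[] values) {
    MemorySegment segment = arena.allocate((long) values.length * Float.BYTES, Float.BYTES);
    for (int i = 0; i < values.length; i++) {
      segment.setAtIndex(FLOAT_LAYOUT, i, values[i]);
    }
    return segment;
  }
}
```

The same method accepts MemorySegment.ofArray(float[]) for on-heap vectors and an Arena-allocated segment for off-heap vectors, which is the kind of top-level function that could live in the main source set under this proposal.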

Issue: https://github.com/apache/lucene/issues/15284
PR: https://github.com/apache/lucene/pull/15285

- Kaival

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:[email protected]
