Hi,
I am traveling until the end of October and won't be able
to look into this. Basically we can move some parts of the MemorySegment
supporting code to main sourceSet, but I don't want to touch that code
too much. The most important thing is: no public API is allowed to have
any internal Vector API signatures (we don't have a check
for this), so we'd need to look at it closely. This is exactly the reason
why some of the code is in the java25 sourceSet, e.g. the
MemorySegment-based scoring.
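To make the missing check concrete, here is a minimal sketch of what it could look like, using plain reflection. Everything in it is an assumption for illustration: the class `ApiSignatureCheck`, the method `leakingMethods`, and the two sample classes are hypothetical and not part of Lucene. It only inspects the erased return and parameter types of public and protected methods; fields, constructors and generic type arguments are not covered.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not an existing Lucene check.
final class ApiSignatureCheck {

  // Package of the incubating Vector API that must not appear in public signatures.
  static final String VECTOR_API_PACKAGE = "jdk.incubator.vector";

  // Returns the names of public/protected methods of 'type' whose return type
  // or parameter types come from 'forbiddenPackage' (or one of its subpackages).
  static List<String> leakingMethods(Class<?> type, String forbiddenPackage) {
    List<String> leaks = new ArrayList<>();
    for (Method m : type.getDeclaredMethods()) {
      int mods = m.getModifiers();
      if (m.isSynthetic() || !(Modifier.isPublic(mods) || Modifier.isProtected(mods))) {
        continue;
      }
      boolean leak = isForbidden(m.getReturnType(), forbiddenPackage);
      for (Class<?> param : m.getParameterTypes()) {
        leak |= isForbidden(param, forbiddenPackage);
      }
      if (leak) {
        leaks.add(m.getName());
      }
    }
    return leaks;
  }

  // For array types, getPackageName() reports the package of the element type.
  private static boolean isForbidden(Class<?> c, String forbiddenPackage) {
    String pkg = c.getPackageName();
    return pkg.equals(forbiddenPackage) || pkg.startsWith(forbiddenPackage + ".");
  }

  // Sample class whose signatures only use primitives and arrays.
  static final class CleanApi {
    public float score(float[] a, float[] b) {
      return 0f;
    }
  }

  // Sample class used to show detection; java.util stands in for the forbidden
  // package so the sketch runs without enabling the incubator module.
  static final class LeakyApi {
    public List<String> names() {
      return List.of();
    }
  }

  public static void main(String[] args) {
    System.out.println(leakingMethods(CleanApi.class, VECTOR_API_PACKAGE));
    System.out.println(leakingMethods(LeakyApi.class, "java.util"));
  }
}
```

A real check would have to run over all public classes of the main sourceSet and also look at fields, constructors and generic signatures.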
In the future we may add more MemorySegment-based APIs to core (and
remove all ByteBuffers, too), but this is an approach I want to look
at after we have released Lucene 11.
Uwe
On 09.10.2025 at 23:57, Kaival Parikh wrote:
For vector search in Lucene, functionality for memory access
(on/off-heap -- using Panama FFM) and vectorization (using the Vector
API -- jdk.incubator.vector) is tightly coupled (see
PanamaVectorUtilSupport
<https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>).
This made sense when it was originally added, because Panama FFM was
in preview + Vector API was incubating.

However, since Panama FFM was finalized in JDK 22
(https://openjdk.org/jeps/454), I wonder if we should decouple it
from vectorization now?
This would mean exposing (and supporting!) top-level on/off-heap
vector similarity functions from all VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25> implementations
(with Lucene providing one non-vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26>
and another Vector API powered vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57> implementation),
and only scoping non-finalized (i.e. incubating Vector API related)
functionality in the MR-JAR (i.e. java25/
<https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).

Although not a huge motivation, this would allow users that do not
use vectorization to score vectors off-heap.
The main benefit could be cleaner separation of functionality in the
long term, also making it easier to write new VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25> implementations
that primarily work with MemorySegment APIs (for example a native
implementation in #13572
<https://github.com/apache/lucene/pull/13572>).
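
To illustrate the non-vectorized off-heap case, here is a minimal sketch of a scalar dot product that reads directly from MemorySegments, using only the finalized FFM API (so it needs JDK 22+ and no incubator module). The class and method names are hypothetical and do not match the actual VectorUtilSupport signatures. It also assumes the floats are packed in native byte order, which may differ from how Lucene lays out vectors on disk.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration only; not part of Lucene.
final class ScalarSegmentDotProduct {

  // Plain scalar loop over two segments holding packed floats (native byte order assumed).
  // Works for both heap-backed and off-heap segments.
  static float dotProduct(MemorySegment a, MemorySegment b) {
    if (a.byteSize() != b.byteSize()) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
    long dims = a.byteSize() / Float.BYTES;
    float sum = 0f;
    for (long i = 0; i < dims; i++) {
      sum += a.getAtIndex(ValueLayout.JAVA_FLOAT, i) * b.getAtIndex(ValueLayout.JAVA_FLOAT, i);
    }
    return sum;
  }

  // Copies both arrays into off-heap memory and scores them there.
  // The confined arena frees the native memory when the try block exits.
  static float dotProductOffHeap(float[] a, float[] b) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment segA = arena.allocateFrom(ValueLayout.JAVA_FLOAT, a);
      MemorySegment segB = arena.allocateFrom(ValueLayout.JAVA_FLOAT, b);
      return dotProduct(segA, segB);
    }
  }

  public static void main(String[] args) {
    float[] a = {1f, 2f, 3f};
    float[] b = {4f, 5f, 6f};
    // Off-heap segments.
    System.out.println(dotProductOffHeap(a, b));
    // The same method also accepts heap segments wrapping the arrays.
    System.out.println(dotProduct(MemorySegment.ofArray(a), MemorySegment.ofArray(b)));
  }
}
```

The point of the sketch is that the same MemorySegment-based signature serves on-heap and off-heap data, so a default implementation like this could sit in the main sourceSet while only the Vector API variant stays in java25.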
Issue: https://github.com/apache/lucene/issues/15284
PR: https://github.com/apache/lucene/pull/15285
- Kaival
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:[email protected]