Hi,
I don't see a problem with that. We can include that in the main branch.
Backporting is harder, but we already have a difference between 10 and
main: all code using MemorySegment / MMapDirectory is already different.
Git merge/cherry-picking already copes well with this once it figures out
that the file name is identical.
Uwe
On 14.10.2025 at 00:13, Trevor McCulloch wrote:
Agree on waiting for Lucene 11, otherwise it may become very difficult
to backport any related features into 10.x.
On Sat, Oct 11, 2025 at 5:58 PM Uwe Schindler <[email protected]> wrote:
Hi,
I am traveling at the moment until the end of October and won't be
able to look into this. Basically we can move some parts of the
MemorySegment supporting code to main sourceSet, but I don't want
to touch that code too much. The most important thing is: no
public API is allowed to have any internal Vector API signatures
(we don't have a check for this), so we'd need to
look at it closely. This is exactly why some of the code is in the
Java 25 sourceSet, e.g. the MemorySegment-based scoring.
In the future we may add more MemorySegment-based APIs into core
(and remove all ByteBuffers, too), but this is an approach I
want to look at after we have released Lucene 11.
Uwe
On 09.10.2025 at 23:57, Kaival Parikh wrote:
For vector search in Lucene, functionality for memory access
(on/off-heap -- using Panama FFM) and vectorization (using the
Vector API -- jdk.incubator.vector) is tightly coupled (see
PanamaVectorUtilSupport
<https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>).
This made sense when it was originally added, because Panama FFM
was in preview + Vector API was incubating.

However, since Panama FFM was finalized in JDK22
(https://openjdk.org/jeps/454), I wonder if we should decouple
it from vectorization now?
This would mean exposing (and supporting!) top-level on/off-heap
vector similarity functions from all VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations
(with Lucene providing one non-vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26>
and another Vector API powered vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>
implementation),
and only scoping non-finalized (i.e. incubating Vector API
related) functionality in the MR-JAR (i.e. java25/
<https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).

Although not a huge motivation, this would allow users that do
not use vectorization to score vectors off-heap.
The main benefit could be cleaner separation of functionality in
the long term, also making it easier to write new
VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations
that primarily work with MemorySegment APIs (for example a native
implementation in #13572
<https://github.com/apache/lucene/pull/13572>).
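
To illustrate the kind of function being discussed, here is a minimal sketch of a non-vectorized dot product that reads its inputs through MemorySegment, so the same code path serves on-heap and off-heap vectors. It assumes JDK 22+ (finalized FFM API) and does not touch jdk.incubator.vector. The class and method names (OffHeapDotProductSketch, dotProduct) are hypothetical and are not taken from Lucene or from the PR.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class OffHeapDotProductSketch {

  // Hypothetical helper, not a Lucene API: scalar dot product over two
  // segments that each hold at least `dims` floats. The unaligned layout is
  // used on the assumption that vectors read from a mapped file may not be
  // 4-byte aligned.
  static float dotProduct(MemorySegment a, MemorySegment b, int dims) {
    float sum = 0f;
    for (int i = 0; i < dims; i++) {
      sum += a.getAtIndex(ValueLayout.JAVA_FLOAT_UNALIGNED, i)
          * b.getAtIndex(ValueLayout.JAVA_FLOAT_UNALIGNED, i);
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] query = {1f, 2f, 3f};
    try (Arena arena = Arena.ofConfined()) {
      // On-heap segment wrapping the query array (no copy).
      MemorySegment onHeap = MemorySegment.ofArray(query);
      // Off-heap segment standing in for a vector stored in a mapped file.
      MemorySegment offHeap = arena.allocateFrom(ValueLayout.JAVA_FLOAT, 4f, 5f, 6f);
      System.out.println(dotProduct(onHeap, offHeap, query.length)); // prints 32.0
    }
  }
}
```

Because nothing here depends on the incubating Vector API, code of this shape could in principle live in the main sourceSet, with only the vectorized variant kept under java25/.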
Issue: https://github.com/apache/lucene/issues/15284
PR: https://github.com/apache/lucene/pull/15285
- Kaival
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:[email protected]