Hi,
we can't backport this to 10.x because 10.x is Java 21 only, so we can't
have any memory segments in public APIs. The plan is already to release
a Lucene 11 at some point before end of year!
Uwe
On 16.10.2025 at 00:06, Kaival Parikh wrote:
Thanks Uwe and Trevor,
No public API is allowed to have any internal Vector API
signatures in public API
I agree, that's what I aimed for in the PR, keeping all incubating
stuff in internal APIs.
Agree on waiting for Lucene 11, otherwise it may become very
difficult to backport any related features into 10.x.
To unblock related features in 10.x, can we backport this change
/partially/ by cherry-picking stuff internal to Lucene (i.e.
VectorUtilSupport), and just exclude public changes (i.e. the new
signatures in VectorUtil)?
I really feel making MemorySegment-based vector scoring a first-class
API in Lucene could open up some interesting features.
But yes, it would be easier overall with Lucene 11, do we have a
timeline for it?
- Kaival
On Mon, Oct 13, 2025 at 12:14 PM Trevor McCulloch
<[email protected]> wrote:
Agree on waiting for Lucene 11, otherwise it may become very
difficult to backport any related features into 10.x.
On Sat, Oct 11, 2025 at 5:58 PM Uwe Schindler <[email protected]> wrote:
Hi,
I am traveling at the moment until the end of October and won't
be able to look into this. Basically we can move some parts of
the MemorySegment supporting code to the main sourceSet, but I
don't want to touch that code too much. The most important
thing is: No public API is allowed to have any internal Vector
API signatures in public API (we don't have a check for this),
so we'd need to look at it closely. The reason why some of the
code is in the Java 25 sourceSet is exactly because of this.
E.g. the MemorySegment-based scoring.
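Since there is no automated check for this rule yet, here is a minimal sketch of what one might look like, using plain reflection. Everything in it is an assumption for illustration: the class name PublicSignatureCheckSketch, the findLeaks helper, and the Sample class are hypothetical and not part of Lucene. A real check would presumably use the prefix "jdk.incubator.vector."; the demo uses "java.util.concurrent." as a stand-in so it runs without adding the incubator module. It only inspects erased return and parameter types of methods, so generics, fields, and constructors are not covered.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PublicSignatureCheckSketch {

  /**
   * Returns the names of public or protected methods declared by clazz whose
   * return type or parameter types live in a package starting with forbiddenPrefix.
   */
  public static List<String> findLeaks(Class<?> clazz, String forbiddenPrefix) {
    List<String> leaks = new ArrayList<>();
    for (Method m : clazz.getDeclaredMethods()) {
      int mods = m.getModifiers();
      if (!Modifier.isPublic(mods) && !Modifier.isProtected(mods)) {
        continue; // package-private and private methods are not public API
      }
      boolean leaky = isForbidden(m.getReturnType(), forbiddenPrefix);
      for (Class<?> p : m.getParameterTypes()) {
        leaky |= isForbidden(p, forbiddenPrefix);
      }
      if (leaky) {
        leaks.add(m.getName());
      }
    }
    return leaks;
  }

  /** Unwraps array types, then compares the fully qualified name against the prefix. */
  private static boolean isForbidden(Class<?> type, String prefix) {
    while (type.isArray()) {
      type = type.getComponentType();
    }
    return type.getName().startsWith(prefix);
  }

  /** Hypothetical class under test; AtomicInteger stands in for an incubating type. */
  public static class Sample {
    public AtomicInteger counter() { return new AtomicInteger(); } // leaks
    public int size() { return 0; }                                // fine
    void internal(AtomicInteger c) {}                              // not public, ignored
  }

  public static void main(String[] args) {
    System.out.println(findLeaks(Sample.class, "java.util.concurrent."));
  }
}
```

Such a check could run as a unit test over the classes in public packages, failing the build when the returned list is non-empty.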
In the future we may add more MemorySegment-based APIs into
core (and remove all ByteBuffers, too), but this is an
approach I want to look at after we have released Lucene 11.
Uwe
On 09.10.2025 at 23:57, Kaival Parikh wrote:
For vector search in Lucene, functionality for memory access
(on/off-heap -- using Panama FFM) and vectorization (using
the Vector API -- jdk.incubator.vector) is tightly coupled
(see PanamaVectorUtilSupport
<https://github.com/apache/lucene/blob/602bfbd9af0ee9027de45c1572527eee6b073841/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>).
This made sense when it was originally added, because Panama
FFM was in preview + Vector API was incubating.
However, since Panama FFM was finalized in JDK 22
(https://openjdk.org/jeps/454), I wonder if we should
decouple it from vectorization now?
This would mean exposing (and supporting!) top-level
on/off-heap vector similarity functions from all
VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations
(with Lucene providing one non-vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java#L26>
and another Vector API powered vectorized
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java25/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java#L57>
implementation),
and only scoping non-finalized (i.e. incubating Vector API
related) functionality in the MR-JAR (i.e. java25/
<https://github.com/apache/lucene/tree/main/lucene/core/src/java25>).
Although not a huge motivation, this would allow users that
do not use vectorization to score vectors off-heap.
The main benefit could be cleaner separation of
functionality in the long term, also making it easier to
write new VectorUtilSupport
<https://github.com/apache/lucene/blob/8f68736e75609d13053420450ad451e52cba107d/lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorUtilSupport.java#L25>
implementations
that primarily work with MemorySegment APIs (for example a
native implementation in #13572
<https://github.com/apache/lucene/pull/13572>).
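To make the proposal above more concrete, here is a minimal sketch of a non-vectorized dot product that reads both vectors through MemorySegment, using only the finalized FFM API (JDK 22+) and nothing from jdk.incubator.vector. The names SegmentSimilarity and ScalarSegmentSimilarity are hypothetical and are not the actual VectorUtilSupport signatures from the PR; the off-heap segment allocated from an Arena merely stands in for vector data that would normally come from a memory-mapped index file.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class SegmentDotProductSketch {

  /** Hypothetical interface for illustration; not Lucene's VectorUtilSupport. */
  public interface SegmentSimilarity {
    float dotProduct(MemorySegment a, MemorySegment b, int dimensions);
  }

  /** Scalar fallback: a plain loop over the FFM API, no Vector API involved. */
  public static final class ScalarSegmentSimilarity implements SegmentSimilarity {
    @Override
    public float dotProduct(MemorySegment a, MemorySegment b, int dimensions) {
      float sum = 0f;
      for (long i = 0; i < dimensions; i++) {
        // Unaligned layout so that slices at arbitrary byte offsets also work.
        sum += a.getAtIndex(ValueLayout.JAVA_FLOAT_UNALIGNED, i)
            * b.getAtIndex(ValueLayout.JAVA_FLOAT_UNALIGNED, i);
      }
      return sum;
    }
  }

  /** Scores one off-heap vector against one on-heap vector through the same API. */
  public static float demo() {
    float[] query = {1f, 2f, 3f};
    float[] doc = {4f, 5f, 6f};
    try (Arena arena = Arena.ofConfined()) {
      // Off-heap copy of the document vector (stand-in for mmapped index data).
      MemorySegment offHeap = arena.allocate((long) doc.length * Float.BYTES, Float.BYTES);
      MemorySegment.copy(doc, 0, offHeap, ValueLayout.JAVA_FLOAT, 0L, doc.length);
      // On-heap view of the query vector; wraps the array without copying.
      MemorySegment onHeap = MemorySegment.ofArray(query);
      return new ScalarSegmentSimilarity().dotProduct(offHeap, onHeap, query.length);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Under this split, a Vector API implementation of the same interface would be the only part that needs to stay in the MR-JAR source set, while the scalar version above could live in the main source set.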
Issue: https://github.com/apache/lucene/issues/15284
PR: https://github.com/apache/lucene/pull/15285
- Kaival
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:[email protected]