This is an automated email from the ASF dual-hosted git repository.
hossman pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/solr.git
The following commit(s) were added to refs/heads/main by this push:
new 39b379f4e3b SOLR-17975: New StrFloatLateInteractionVectorField
suitable for re-ranking documents using multi-vector values for late
interaction models
39b379f4e3b is described below
commit 39b379f4e3bbd26ffa17b8056ee14994914e51c8
Author: Chris Hostetter <[email protected]>
AuthorDate: Wed Feb 4 12:15:58 2026 -0700
SOLR-17975: New StrFloatLateInteractionVectorField suitable for re-ranking
documents using multi-vector values for late interaction models
---
...5-string-based-lateinteraction-vector-field.yml | 7 +
.../solr/schema/LateInteractionVectorField.java | 63 ++++
.../schema/StrFloatLateInteractionVectorField.java | 399 +++++++++++++++++++++
.../org/apache/solr/search/ValueSourceParser.java | 26 ++
.../conf/bad-schema-late-vec-field-indexed.xml | 27 ++
.../conf/bad-schema-late-vec-field-multivalued.xml | 27 ++
.../conf/bad-schema-late-vec-field-nodv.xml | 27 ++
.../conf/bad-schema-late-vec-ft-indexed.xml | 27 ++
.../conf/bad-schema-late-vec-ft-nodim.xml | 27 ++
.../conf/bad-schema-late-vec-ft-nodv.xml | 27 ++
.../conf/bad-schema-late-vec-ft-sim.xml | 27 ++
.../solr/collection1/conf/schema-deprecations.xml | 2 +-
.../conf/schema-inplace-required-field.xml | 2 +-
.../solr/collection1/conf/schema-late-vec.xml | 37 ++
.../test-files/solr/collection1/conf/schema15.xml | 3 +
.../conf/solrconfig-test-properties.xml | 2 +-
.../solr/configsets/cache-control/conf/schema.xml | 2 +-
.../configsets/cache-control/conf/solrconfig.xml | 2 +-
.../collectionA/conf/schema.xml | 2 +-
.../collectionA/conf/solrconfig.xml | 2 +-
.../schema/TestLateInteractionVectorFieldInit.java | 110 ++++++
.../org/apache/solr/search/QueryEqualityTest.java | 15 +
.../solr/search/TestLateInteractionVectors.java | 310 ++++++++++++++++
.../pages/field-types-included-with-solr.adoc | 2 +
.../query-guide/pages/dense-vector-search.adoc | 183 ++++++++--
.../query-guide/pages/function-queries.adoc | 5 +
26 files changed, 1330 insertions(+), 33 deletions(-)
diff --git
a/changelog/unreleased/SOLR-17975-string-based-lateinteraction-vector-field.yml
b/changelog/unreleased/SOLR-17975-string-based-lateinteraction-vector-field.yml
new file mode 100644
index 00000000000..dfb24bdfaa0
--- /dev/null
+++
b/changelog/unreleased/SOLR-17975-string-based-lateinteraction-vector-field.yml
@@ -0,0 +1,7 @@
+title: New StrFloatLateInteractionVectorField suitable for re-ranking
documents using multi-vector values for late interaction models
+type: added
+authors:
+- name: hossman
+links:
+- name: SOLR-17975
+ url: https://issues.apache.org/jira/browse/SOLR-17975
diff --git
a/solr/core/src/java/org/apache/solr/schema/LateInteractionVectorField.java
b/solr/core/src/java/org/apache/solr/schema/LateInteractionVectorField.java
new file mode 100644
index 00000000000..f4083481d4f
--- /dev/null
+++ b/solr/core/src/java/org/apache/solr/schema/LateInteractionVectorField.java
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.schema;
+
+import org.apache.lucene.search.DoubleValuesSource;
+import org.apache.solr.search.FunctionQParser;
+import org.apache.solr.search.SyntaxError;
+
+/**
+ * This is a marker interface indicating that a {@link FieldType} supports
query time usage as a
+ * "late interaction" vector field.
+ *
+ * @lucene.internal
+ * @lucene.experimental Not currently intended for implementation by custom
FieldTypes
+ */
+public interface LateInteractionVectorField {
+
+ // TODO: Refactor StrFloatLateInteractionVectorField if/when more classes
implement this
+ // interface.
+ //
+ // The surface area of this interface is intentionally small to focus on the
query time aspects
+ //
+ // If/When we want to add more FieldTypes that implement this interface, we
should strongly
+ // consider refactoring some of the StrFloatLateInteractionVectorField
internals into an
+ // abstract base class for re-use.
+ //
+ // What that abstract base class should look like (and what should be
refactored) will largely
+ // depend on *why* more implementations are being added:
+ // - new external representations ? (ie: not a single String)
+ // - new internal implementation ? (ie: int/byte vectors)
+
+ /**
+ * Method used for parsing some type-specific query input structure into a
values source.
+ *
+ * <p>At the time this method is called, the {@link FunctionQParser} will
have already been used
+ * to parse the <code>fieldName</code> which will have been resolved to a
{@link FieldType} which
+ * must implement this interface
+ *
+ * <p>This method should be responsible for whatever type specific argument
parsing is necessary,
+ * and confirm that no invalid (or unexpected "extra") arguments exist in
the <code>
+ * FunctionQParser</code>
+ *
+ * <p>(If field types implementing this method need the {@link SchemaField}
corresponding to the
+ * <code>fieldName</code>, they should maintain a reference to the {@link
IndexSchema} used to
+ * initialize the <code>FieldType</code> instance.)
+ */
+ public DoubleValuesSource parseLateInteractionValuesSource(
+ final String fieldName, final FunctionQParser fp) throws SyntaxError;
+}
diff --git
a/solr/core/src/java/org/apache/solr/schema/StrFloatLateInteractionVectorField.java
b/solr/core/src/java/org/apache/solr/schema/StrFloatLateInteractionVectorField.java
new file mode 100644
index 00000000000..f80b6cbea54
--- /dev/null
+++
b/solr/core/src/java/org/apache/solr/schema/StrFloatLateInteractionVectorField.java
@@ -0,0 +1,399 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.schema;
+
+import static java.util.Optional.ofNullable;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+import org.apache.lucene.document.LateInteractionField;
+import org.apache.lucene.document.StoredField;
+import org.apache.lucene.index.IndexableField;
+import org.apache.lucene.index.VectorSimilarityFunction;
+import org.apache.lucene.queries.function.ValueSource;
+import org.apache.lucene.search.DoubleValuesSource;
+import org.apache.lucene.search.LateInteractionFloatValuesSource;
+import org.apache.lucene.search.LateInteractionFloatValuesSource.ScoreFunction;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.SortField;
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.response.TextResponseWriter;
+import org.apache.solr.search.FunctionQParser;
+import org.apache.solr.search.QParser;
+import org.apache.solr.search.StrParser;
+import org.apache.solr.search.SyntaxError;
+import org.apache.solr.uninverting.UninvertingReader;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * An implementation of {@link LateInteractionVectorField} backed by {@link
LateInteractionField}
+ * that externally represents all <code>float[][]</code> values (both field
values and query values)
+ * encoded as a (single) String.
+ *
+ * <p>Example: <code>[[1.1,-2.2,3],[4.0,5,-6.6],[7,8,99.99]]</code>
+ */
+public class StrFloatLateInteractionVectorField extends FieldType
+ implements LateInteractionVectorField {
+ private static final Logger log =
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+ public static final String VECTOR_DIMENSION = "vectorDimension";
+ public static final String SIMILARITY_FUNCTION = "similarityFunction";
+ public static final VectorSimilarityFunction DEFAULT_SIMILARITY =
+ VectorSimilarityFunction.EUCLIDEAN;
+ public static final String SCORE_FUNCTION = "scoreFunction";
+ public static final ScoreFunction DEFAULT_SCORE_FUNCTION =
ScoreFunction.SUM_MAX_SIM;
+
+ private static final int MUST_BE_TRUE = DOC_VALUES;
+ private static final int MUST_BE_FALSE = MULTIVALUED | TOKENIZED | INDEXED |
UNINVERTIBLE;
+
+ private static String MUST_BE_TRUE_MSG =
+ " fields require these properties to be true: " +
propertiesToString(MUST_BE_TRUE);
+ private static String MUST_BE_FALSE_MSG =
+ " fields require these properties to be false: " +
propertiesToString(MUST_BE_FALSE);
+
+ private int dimension;
+ private VectorSimilarityFunction similarityFunction;
+ private ScoreFunction scoreFunction;
+
+ public StrFloatLateInteractionVectorField() {
+ super();
+ }
+
+ @Override
+ public void init(IndexSchema schema, Map<String, String> args) {
+
+ this.dimension =
+ ofNullable(args.remove(VECTOR_DIMENSION))
+ .map(Integer::parseInt)
+ .orElseThrow(
+ () ->
+ new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ VECTOR_DIMENSION + " is a mandatory parameter"));
+
+ this.similarityFunction =
+ optionalEnumArg(
+ SIMILARITY_FUNCTION,
+ args.remove(SIMILARITY_FUNCTION),
+ VectorSimilarityFunction.class,
+ DEFAULT_SIMILARITY);
+ this.scoreFunction =
+ optionalEnumArg(
+ SCORE_FUNCTION,
+ args.remove(SCORE_FUNCTION),
+ ScoreFunction.class,
+ DEFAULT_SCORE_FUNCTION);
+
+ // By the time this method is called, FieldType.setArgs has already set
"typical" defaults,
+ // and parsed the user's explicit options.
+ // We need to override those defaults, and error if the user asked for
nonsense
+
+ this.properties |= MUST_BE_TRUE;
+ this.properties &= ~MUST_BE_FALSE;
+ if (on(trueProperties, MUST_BE_FALSE)) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR, getClass().getSimpleName() +
MUST_BE_FALSE_MSG);
+ }
+ if (on(falseProperties, MUST_BE_TRUE)) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR, getClass().getSimpleName() +
MUST_BE_TRUE_MSG);
+ }
+
+ super.init(schema, args);
+ }
+
+ public int getDimension() {
+ return dimension;
+ }
+
+ public VectorSimilarityFunction getSimilarityFunction() {
+ return similarityFunction;
+ }
+
+ public ScoreFunction getScoreFunction() {
+ return scoreFunction;
+ }
+
+ @Override
+ public DoubleValuesSource parseLateInteractionValuesSource(
+ final String fieldName, final FunctionQParser fp) throws SyntaxError {
+ final String vecStr = fp.parseArg();
+ if (null == vecStr || fp.hasMoreArguments()) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ "Invalid number of arguments. Please provide both a field name, and
a (String) multi-vector.");
+ }
+ return new LateInteractionFloatValuesSource(
+ fieldName,
+ stringToMultiFloatVector(dimension, vecStr),
+ getSimilarityFunction(),
+ getScoreFunction());
+ }
+
+ @Override
+ protected void checkSupportsDocValues() {
+ // No-Op: always supported
+ }
+
+ @Override
+ protected boolean enableDocValuesByDefault() {
+ return true;
+ }
+
+ @Override
+ public void checkSchemaField(final SchemaField field) throws SolrException {
+ super.checkSchemaField(field);
+ if (field.multiValued()) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ getClass().getSimpleName() + " fields can not be multiValued: " +
field.getName());
+ }
+ if (field.indexed()) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ getClass().getSimpleName() + " fields can not be indexed: " +
field.getName());
+ }
+
+ if (!field.hasDocValues()) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ getClass().getSimpleName() + " fields must have docValues: " +
field.getName());
+ }
+ }
+
+ /** Not supported: we override createFields, so this should never be called
*/
+ @Override
+ public IndexableField createField(SchemaField field, Object value) {
+ throw new IllegalStateException("This method should never be called in
expected operation");
+ }
+
+ @Override
+ public List<IndexableField> createFields(SchemaField field, Object value) {
+ try {
+ final ArrayList<IndexableField> fields = new ArrayList<>(2);
+
+ if (!CharSequence.class.isInstance(value)) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ getClass().getSimpleName() + " fields require string input: " +
field.getName());
+ }
+ final String valueString = value.toString();
+
+ final float[][] multiVec = stringToMultiFloatVector(dimension,
valueString);
+ fields.add(new LateInteractionField(field.getName(), multiVec));
+
+ if (field.stored()) {
+ fields.add(new StoredField(field.getName(), valueString));
+ }
+
+ return fields;
+ } catch (SyntaxError | RuntimeException e) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR,
+ "Error while creating field '" + field + "' from value '" + value +
"'",
+ e);
+ }
+ }
+
+ /**
+ * Converts a String representation of 1 or more float vectors of the
specified <code>dimension
+ * </code> into a <code>float[][]</code>
+ *
+ * @param dimension must be a positive integer
+ * @param input String value to be parsed, must not be null
+ * @lucene.experimental
+ */
+ public static float[][] stringToMultiFloatVector(final int dimension, final
String input)
+ throws SyntaxError {
+
+ assert 0 < dimension;
+ final int lastIndex = dimension - 1;
+
+ final List<float[]> result = new ArrayList<>(7);
+ final StrParser sp = new StrParser(input);
+ sp.expect("["); // outer array
+
+ while (sp.pos < sp.end) {
+ sp.expect("[");
+ final float[] entry = new float[dimension];
+ for (int i = 0; i < dimension; i++) {
+ final int preFloatPos = sp.pos;
+ try {
+ entry[i] = sp.getFloat();
+ } catch (NumberFormatException e) {
+ throw new SyntaxError(
+ "Expected float at position " + preFloatPos + " in '" + input +
"'", e);
+ }
+ if (i < lastIndex) {
+ sp.expect(",");
+ }
+ }
+
+ sp.expect("]");
+ result.add(entry);
+
+ if (',' != sp.peek()) {
+ // no more entries in outer array
+ break;
+ }
+ sp.expect(",");
+ }
+ sp.expect("]"); // outer array
+
+ sp.eatws();
+ if (sp.pos < sp.end) {
+ throw new SyntaxError("Unexpected text at position " + sp.pos + " in '"
+ input + "'");
+ }
+ return result.toArray(new float[result.size()][]);
+ }
+
+ /**
+ * Formats a non empty <code>float[][]</code> representing vectors into a
string for external
+ * representation.
+ *
+ * <p>NOTE: no validation is done to confirm that the individual
<code>float[]</code> values have
+ * consistent length.
+ *
+ * @param input vectors to be formatted, must not be null or empty
+ * @lucene.experimental
+ */
+ public static String multiFloatVectorToString(final float[][] input) {
+ assert null != input && 0 < input.length;
+ final StringBuilder out =
+ new StringBuilder(input.length * 89 /* prime, smallish, ~4 verbose
floats */);
+ out.append("[");
+ for (int i = 0; i < input.length; i++) {
+ final float[] currentVec = input[i];
+ assert 0 < currentVec.length;
+ out.append("[");
+ for (int x = 0; x < currentVec.length; x++) {
+ out.append(currentVec[x]);
+ out.append(",");
+ }
+ out.replace(out.length() - 1, out.length(), "]");
+ out.append(",");
+ }
+ out.replace(out.length() - 1, out.length(), "]");
+ return out.toString();
+ }
+
+ @Override
+ public String toExternal(IndexableField f) {
+ String val = f.stringValue();
+ if (val == null) {
+ val =
multiFloatVectorToString(LateInteractionField.decode(f.binaryValue()));
+ }
+ return val;
+ }
+
+ @Override
+ public UninvertingReader.Type getUninversionType(SchemaField sf) {
+ return null;
+ }
+
+ @Override
+ public void write(TextResponseWriter writer, String name, IndexableField f)
throws IOException {
+ writer.writeStr(name, toExternal(f), false);
+ }
+
+ @Override
+ public Object toObject(SchemaField sf, BytesRef term) {
+ return multiFloatVectorToString(LateInteractionField.decode(term));
+ }
+
+ /** Not supported */
+ @Override
+ public Query getPrefixQuery(QParser parser, SchemaField sf, String termStr) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName() + " not supported for prefix queries.");
+ }
+
+ /** Not supported */
+ @Override
+ public ValueSource getValueSource(SchemaField field, QParser parser) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName()
+ + " not supported for function queries, use lateVector()
function.");
+ }
+
+ /** Not supported */
+ @Override
+ public Query getFieldQuery(QParser parser, SchemaField field, String
externalVal) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName()
+ + " not supported for field queries, use lateVector() function.");
+ }
+
+ /** Not Supported */
+ @Override
+ public Query getRangeQuery(
+ QParser parser,
+ SchemaField field,
+ String part1,
+ String part2,
+ boolean minInclusive,
+ boolean maxInclusive) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName() + " not supported for range queries.");
+ }
+
+ /** Not Supported */
+ @Override
+ public Query getSetQuery(QParser parser, SchemaField field,
Collection<String> externalVals) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName() + " not supported for set queries.");
+ }
+
+ /** Not Supported */
+ @Override
+ public SortField getSortField(SchemaField field, boolean top) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ getClass().getSimpleName() + " not supported for sorting.");
+ }
+
+ /**
+ * @param key Config option name, used in exception messages
+ * @param value Value specified in configuration, may be <code>null</code>
+ * @param clazz Enum class specifying the return type
+ * @param defaultValue default to use if value is <code>null</code>
+ */
+ private static final <E extends Enum<E>> E optionalEnumArg(
+ final String key, final String value, final Class<E> clazz, final E
defaultValue)
+ throws SolrException {
+ try {
+ return ofNullable(value)
+ .map(v -> Enum.valueOf(clazz, v.toUpperCase(Locale.ROOT)))
+ .orElse(defaultValue);
+ } catch (IllegalArgumentException e) {
+ throw new SolrException(
+ SolrException.ErrorCode.SERVER_ERROR, key + " not recognized: " +
value);
+ }
+ }
+}
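The external string format handled by stringToMultiFloatVector and multiFloatVectorToString above can be tried outside Solr with a small stand-alone sketch. The class below is hypothetical and is not the committed code: it does not use Solr's StrParser, its names (MultiVectorStringSketch, parse, format) are illustrative, and its error handling uses IllegalArgumentException rather than SyntaxError.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch of the "[[f,f,...],[f,f,...]]" format; not the Solr code. */
final class MultiVectorStringSketch {

  /** Parses text such as [[1,2],[3,4]] into float[][], requiring `dimension` floats per vector. */
  static float[][] parse(final int dimension, final String input) {
    final String s = input.trim();
    if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
      throw new IllegalArgumentException("Expected outer [...] in '" + input + "'");
    }
    final int end = s.length() - 1; // index of the outer closing bracket
    final List<float[]> result = new ArrayList<>();
    int pos = 1;
    while (true) {
      pos = skipWhitespace(s, pos);
      if (pos >= end || s.charAt(pos) != '[') {
        throw new IllegalArgumentException("Expected '[' at position " + pos + " in '" + input + "'");
      }
      final int close = s.indexOf(']', pos);
      if (close < 0 || close >= end) {
        throw new IllegalArgumentException("Unterminated vector in '" + input + "'");
      }
      final String[] parts = s.substring(pos + 1, close).split(",", -1);
      if (parts.length != dimension) {
        throw new IllegalArgumentException(
            "Expected " + dimension + " floats but found " + parts.length + " in '" + input + "'");
      }
      final float[] vec = new float[dimension];
      for (int i = 0; i < dimension; i++) {
        // NumberFormatException is a subclass of IllegalArgumentException
        vec[i] = Float.parseFloat(parts[i].trim());
      }
      result.add(vec);
      pos = skipWhitespace(s, close + 1);
      if (pos == end) {
        break; // reached the outer closing bracket
      }
      if (s.charAt(pos) != ',') {
        throw new IllegalArgumentException("Expected ',' at position " + pos + " in '" + input + "'");
      }
      pos++;
    }
    return result.toArray(new float[0][]);
  }

  /** Formats vectors back into the bracketed string form, with no whitespace. */
  static String format(final float[][] vectors) {
    final StringBuilder out = new StringBuilder("[");
    for (int i = 0; i < vectors.length; i++) {
      if (i > 0) out.append(',');
      out.append('[');
      for (int j = 0; j < vectors[i].length; j++) {
        if (j > 0) out.append(',');
        out.append(vectors[i][j]);
      }
      out.append(']');
    }
    return out.append(']').toString();
  }

  private static int skipWhitespace(final String s, int pos) {
    while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) pos++;
    return pos;
  }

  public static void main(String[] args) {
    final float[][] vectors = parse(3, "[[1.1,-2.2,3],[4.0,5,-6.6]]");
    System.out.println(vectors.length + " vectors: " + format(vectors));
  }
}
```

Note that the round trip is not textually lossless: an input of 3 comes back as 3.0. That may be one reason the committed createFields keeps the original string in a StoredField when the field is stored, rather than re-formatting the decoded floats.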
diff --git a/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java
b/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java
index 79daff98762..04e17b78d8e 100644
--- a/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java
+++ b/solr/core/src/java/org/apache/solr/search/ValueSourceParser.java
@@ -81,6 +81,7 @@ import org.apache.solr.request.SolrRequestInfo;
import org.apache.solr.schema.CurrencyFieldType;
import org.apache.solr.schema.FieldType;
import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.schema.LateInteractionVectorField;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.StrField;
import org.apache.solr.schema.TextField;
@@ -1359,6 +1360,31 @@ public abstract class ValueSourceParser implements
NamedListInitializedPlugin {
});
addParser("childfield", new ChildFieldValueSourceParser());
+
+ addParser(
+ "lateVector",
+ new ValueSourceParser() {
+
+ @Override
+ public ValueSource parse(final FunctionQParser fp) throws
SyntaxError {
+
+ final String fieldName = fp.parseArg();
+ if (null == fieldName) {
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ "Invalid arguments. First argument must be a field name");
+ }
+ final FieldType ft =
fp.getReq().getSchema().getFieldType(fieldName);
+ if (ft instanceof LateInteractionVectorField lift) {
+ return ValueSource.fromDoubleValuesSource(
+ lift.parseLateInteractionValuesSource(fieldName, fp));
+ }
+ throw new SolrException(
+ SolrException.ErrorCode.BAD_REQUEST,
+ "Field name is not defined in schema as a
StrFloatLateInteractionVectorField: "
+ + fieldName);
+ }
+ });
}
///////////////////////////////////////////////////////////////////////////////
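For readers unfamiliar with late interaction scoring, the value that lateVector() yields per document can be approximated by the sketch below. It rests on two assumptions that this commit does not spell out: that ScoreFunction.SUM_MAX_SIM sums, over the query vectors, the best similarity found among the document's vectors, and that the EUCLIDEAN similarity is computed as 1 / (1 + squared distance). The class and method names are hypothetical and the code does not call Lucene.

```java
/** Hypothetical sketch of sum-of-max-similarity scoring; does not use Lucene classes. */
final class SumMaxSimSketch {

  /** Assumed Euclidean similarity: 1 / (1 + squared distance), so identical vectors score 1. */
  static float euclideanSimilarity(final float[] a, final float[] b) {
    float squaredDistance = 0f;
    for (int i = 0; i < a.length; i++) {
      final float diff = a[i] - b[i];
      squaredDistance += diff * diff;
    }
    return 1f / (1f + squaredDistance);
  }

  /** For each query vector take the best match among the doc vectors, then sum those maxima. */
  static float sumMaxSim(final float[][] query, final float[][] doc) {
    float total = 0f;
    for (final float[] q : query) {
      float best = Float.NEGATIVE_INFINITY;
      for (final float[] d : doc) {
        best = Math.max(best, euclideanSimilarity(q, d));
      }
      total += best;
    }
    return total;
  }

  public static void main(String[] args) {
    final float[][] query = {{0f, 0f}, {1f, 1f}};
    final float[][] doc = {{0f, 0f}, {3f, 4f}};
    System.out.println("score = " + sumMaxSim(query, doc));
  }
}
```

Because every query vector is compared with every document vector, the cost grows with the product of the two vector counts, which fits the commit's description of the field as suitable for re-ranking rather than first-pass retrieval.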
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-indexed.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-indexed.xml
new file mode 100644
index 00000000000..656d8a75b56
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-indexed.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <field name="bad_field" type="late" indexed="true" />
+
+ <fieldType name="late" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" />
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-multivalued.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-multivalued.xml
new file mode 100644
index 00000000000..a5803e61391
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-multivalued.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <field name="bad_field" type="late" multiValued="true" />
+
+ <fieldType name="late" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" />
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-nodv.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-nodv.xml
new file mode 100644
index 00000000000..68a0a744628
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-field-nodv.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <field name="bad_field" type="late" docValues="false" />
+
+ <fieldType name="late" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" />
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-indexed.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-indexed.xml
new file mode 100644
index 00000000000..7cb4545ad68
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-indexed.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <fieldType name="bad_ft" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" indexed="true" multiValued="true" />
+
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodim.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodim.xml
new file mode 100644
index 00000000000..9734e3ac7d0
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodim.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <fieldType name="bad_ft" class="solr.StrFloatLateInteractionVectorField" />
+
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodv.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodv.xml
new file mode 100644
index 00000000000..91d2e257b7e
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-nodv.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <fieldType name="bad_ft" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" docValues="false" />
+
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-sim.xml
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-sim.xml
new file mode 100644
index 00000000000..41c7be6b0cc
--- /dev/null
+++
b/solr/core/src/test-files/solr/collection1/conf/bad-schema-late-vec-ft-sim.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="bad-schema" version="1.7">
+
+ <fieldType name="bad_ft" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" similarityFunction="bogus" />
+
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
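The similarityFunction="bogus" schema above is rejected by the optionalEnumArg helper in StrFloatLateInteractionVectorField. A stand-alone sketch of the same case-insensitive lookup follows. The Similarity enum here is a hypothetical stand-in for Lucene's VectorSimilarityFunction, and IllegalStateException stands in for SolrException so that the example has no Solr dependency.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical sketch of case-insensitive enum option parsing with a default value. */
final class EnumArgSketch {

  /** Stand-in enum; the real field type uses Lucene's VectorSimilarityFunction. */
  enum Similarity {
    EUCLIDEAN,
    DOT_PRODUCT,
    COSINE
  }

  /** Returns the default when value is null; fails with a descriptive message when unknown. */
  static <E extends Enum<E>> E optionalEnumArg(
      final String key, final String value, final Class<E> clazz, final E defaultValue) {
    try {
      return Optional.ofNullable(value)
          .map(v -> Enum.valueOf(clazz, v.toUpperCase(Locale.ROOT)))
          .orElse(defaultValue);
    } catch (IllegalArgumentException e) {
      // Enum.valueOf throws IllegalArgumentException for names the enum does not define
      throw new IllegalStateException(key + " not recognized: " + value, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(
        optionalEnumArg("similarityFunction", "cosine", Similarity.class, Similarity.EUCLIDEAN));
    System.out.println(
        optionalEnumArg("similarityFunction", null, Similarity.class, Similarity.EUCLIDEAN));
  }
}
```

Upper-casing with Locale.ROOT keeps the lookup independent of the JVM's default locale, which matters for locales where case conversion of ASCII letters differs, such as Turkish.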
diff --git
a/solr/core/src/test-files/solr/collection1/conf/schema-deprecations.xml
b/solr/core/src/test-files/solr/collection1/conf/schema-deprecations.xml
index e50bfa8575c..a44cb99f67f 100644
--- a/solr/core/src/test-files/solr/collection1/conf/schema-deprecations.xml
+++ b/solr/core/src/test-files/solr/collection1/conf/schema-deprecations.xml
@@ -27,4 +27,4 @@
<field name="_version_" type="long" indexed="false" stored="false"/>
</fields>
-</schema>
\ No newline at end of file
+</schema>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/schema-inplace-required-field.xml
b/solr/core/src/test-files/solr/collection1/conf/schema-inplace-required-field.xml
index 2892e5eb39f..5fd1c11e257 100644
---
a/solr/core/src/test-files/solr/collection1/conf/schema-inplace-required-field.xml
+++
b/solr/core/src/test-files/solr/collection1/conf/schema-inplace-required-field.xml
@@ -32,4 +32,4 @@
<field name="signatureField" type="string" indexed="true" stored="false"/>
<dynamicField name="*_sS" type="string" indexed="true" stored="true"/>
-</schema>
\ No newline at end of file
+</schema>
diff --git a/solr/core/src/test-files/solr/collection1/conf/schema-late-vec.xml
b/solr/core/src/test-files/solr/collection1/conf/schema-late-vec.xml
new file mode 100644
index 00000000000..e68e54decb4
--- /dev/null
+++ b/solr/core/src/test-files/solr/collection1/conf/schema-late-vec.xml
@@ -0,0 +1,37 @@
+<?xml version="1.0" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<schema name="late-vec-schema" version="1.7">
+
+ <fieldType name="late_vec_3_defaults"
class="solr.StrFloatLateInteractionVectorField" vectorDimension="3" />
+ <fieldType name="late_vec_4_defaults"
class="solr.StrFloatLateInteractionVectorField" vectorDimension="4" />
+
+ <fieldType name="late_vec_4_cosine"
class="solr.StrFloatLateInteractionVectorField" vectorDimension="4"
similarityFunction="cosine" />
+ <fieldType name="late_vec_4_nostored"
class="solr.StrFloatLateInteractionVectorField" vectorDimension="4"
stored="false" />
+
+ <field name="lv_3_def" type="late_vec_3_defaults" />
+ <field name="lv_4_def" type="late_vec_4_defaults" />
+ <field name="lv_4_cosine" type="late_vec_4_cosine" />
+
+ <field name="lv_4_nostored" type="late_vec_4_nostored" />
+ <field name="lv_3_nostored" type="late_vec_3_defaults" stored="false" />
+
+ <fieldType name="string" class="solr.StrField" multiValued="true"/>
+ <field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="false"/>
+ <uniqueKey>id</uniqueKey>
+</schema>
diff --git a/solr/core/src/test-files/solr/collection1/conf/schema15.xml
b/solr/core/src/test-files/solr/collection1/conf/schema15.xml
index aefea6f106c..87fdad981d6 100644
--- a/solr/core/src/test-files/solr/collection1/conf/schema15.xml
+++ b/solr/core/src/test-files/solr/collection1/conf/schema15.xml
@@ -631,6 +631,9 @@
</analyzer>
</fieldType>
+ <!-- Late Interaction Vectors -->
+ <fieldType name="late_vector_4"
class="solr.StrFloatLateInteractionVectorField" vectorDimension="4" />
+ <field name="late_vec_4" type="late_vector_4" />
<uniqueKey>id</uniqueKey>
diff --git
a/solr/core/src/test-files/solr/collection1/conf/solrconfig-test-properties.xml
b/solr/core/src/test-files/solr/collection1/conf/solrconfig-test-properties.xml
index 2a095e0b02a..7fea20f90b3 100644
---
a/solr/core/src/test-files/solr/collection1/conf/solrconfig-test-properties.xml
+++
b/solr/core/src/test-files/solr/collection1/conf/solrconfig-test-properties.xml
@@ -34,4 +34,4 @@
attr2="${non.existent.sys.prop:default-from-config}">prefix-${solr.test.sys.prop2}-suffix</propTest>
<requestHandler name="/select" class="solr.SearchHandler" />
-</config>
\ No newline at end of file
+</config>
diff --git
a/solr/core/src/test-files/solr/configsets/cache-control/conf/schema.xml
b/solr/core/src/test-files/solr/configsets/cache-control/conf/schema.xml
index 9d85ec4f7f0..954baf7cfee 100644
--- a/solr/core/src/test-files/solr/configsets/cache-control/conf/schema.xml
+++ b/solr/core/src/test-files/solr/configsets/cache-control/conf/schema.xml
@@ -24,4 +24,4 @@
<field name="id" type="string" indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true" />
<uniqueKey>id</uniqueKey>
-</schema>
\ No newline at end of file
+</schema>
diff --git
a/solr/core/src/test-files/solr/configsets/cache-control/conf/solrconfig.xml
b/solr/core/src/test-files/solr/configsets/cache-control/conf/solrconfig.xml
index bd27a88952a..6fb9ca42086 100644
--- a/solr/core/src/test-files/solr/configsets/cache-control/conf/solrconfig.xml
+++ b/solr/core/src/test-files/solr/configsets/cache-control/conf/solrconfig.xml
@@ -51,4 +51,4 @@
<indexConfig>
<mergeScheduler
class="${solr.mscheduler:org.apache.lucene.index.ConcurrentMergeScheduler}"/>
: </indexConfig>
-</config>
\ No newline at end of file
+</config>
diff --git
a/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/schema.xml
b/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/schema.xml
index 0d788eb038a..37692c15943 100644
---
a/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/schema.xml
+++
b/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/schema.xml
@@ -102,4 +102,4 @@
<uniqueKey>id</uniqueKey>
-</schema>
\ No newline at end of file
+</schema>
diff --git
a/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/solrconfig.xml
b/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/solrconfig.xml
index 52da2f113a5..f190e0d9a7c 100644
---
a/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/solrconfig.xml
+++
b/solr/core/src/test-files/solr/configsets/different-stopwords/collectionA/conf/solrconfig.xml
@@ -55,4 +55,4 @@
</lst>
</requestHandler>
-</config>
\ No newline at end of file
+</config>
diff --git
a/solr/core/src/test/org/apache/solr/schema/TestLateInteractionVectorFieldInit.java
b/solr/core/src/test/org/apache/solr/schema/TestLateInteractionVectorFieldInit.java
new file mode 100644
index 00000000000..e15448bb324
--- /dev/null
+++
b/solr/core/src/test/org/apache/solr/schema/TestLateInteractionVectorFieldInit.java
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.schema;
+
+import java.util.Arrays;
+import org.apache.lucene.index.VectorSimilarityFunction;
+import org.apache.solr.core.AbstractBadConfigTestBase;
+
+/**
+ * Basic tests of {@link StrFloatLateInteractionVectorField} FieldType &
SchemaField
+ * initialization
+ */
+public class TestLateInteractionVectorFieldInit extends
AbstractBadConfigTestBase {
+
+ public void test_bad_ft_opts() throws Exception {
+ assertConfigs(
+ "solrconfig-basic.xml",
+ "bad-schema-late-vec-ft-nodim.xml",
+ StrFloatLateInteractionVectorField.VECTOR_DIMENSION);
+ assertConfigs(
+ "solrconfig-basic.xml",
+ "bad-schema-late-vec-ft-sim.xml",
+ StrFloatLateInteractionVectorField.SIMILARITY_FUNCTION);
+ assertConfigs(
+ "solrconfig-basic.xml",
+ "bad-schema-late-vec-ft-nodv.xml",
+ "require these properties to be true: docValues");
+ assertConfigs(
+ "solrconfig-basic.xml",
+ "bad-schema-late-vec-ft-indexed.xml",
+ "require these properties to be false:");
+ }
+
+ public void test_bad_field_opts() throws Exception {
+ assertConfigs(
+ "solrconfig-basic.xml", "bad-schema-late-vec-field-nodv.xml",
"docValues: bad_field");
+ assertConfigs(
+ "solrconfig-basic.xml", "bad-schema-late-vec-field-indexed.xml",
"indexed: bad_field");
+ assertConfigs(
+ "solrconfig-basic.xml",
+ "bad-schema-late-vec-field-multivalued.xml",
+ "multiValued: bad_field");
+ }
+
+ public void test_SchemaFields() throws Exception {
+ try {
+ initCore("solrconfig-basic.xml", "schema-late-vec.xml");
+ final IndexSchema schema = h.getCore().getLatestSchema();
+
+ final SchemaField def3 = schema.getField("lv_3_def");
+ final SchemaField def4 = schema.getField("lv_4_def");
+ final SchemaField nostored3 = schema.getField("lv_3_nostored");
+ final SchemaField nostored4 = schema.getField("lv_4_nostored");
+ final SchemaField cosine4 = schema.getField("lv_4_cosine");
+
+ // these should be true for everyone
+ for (SchemaField sf : Arrays.asList(def3, def4, cosine4, nostored3,
nostored4)) {
+ assertNotNull(sf.getName(), sf);
+ assertNotNull(sf.getName(), sf.getType());
+ assertTrue(sf.getName(), sf.getType() instanceof
StrFloatLateInteractionVectorField);
+ assertTrue(sf.getName(), sf.hasDocValues());
+ assertFalse(sf.getName(), sf.multiValued());
+ assertFalse(sf.getName(), sf.indexed());
+ }
+
+ for (SchemaField sf : Arrays.asList(def3, nostored3)) {
+ assertEquals(
+ sf.getName(), 3, ((StrFloatLateInteractionVectorField)
sf.getType()).getDimension());
+ }
+ for (SchemaField sf : Arrays.asList(def4, cosine4, nostored4)) {
+ assertEquals(
+ sf.getName(), 4, ((StrFloatLateInteractionVectorField)
sf.getType()).getDimension());
+ }
+ for (SchemaField sf : Arrays.asList(def3, def4, cosine4)) {
+ assertTrue(sf.getName(), sf.stored());
+ }
+ for (SchemaField sf : Arrays.asList(nostored3, nostored4)) {
+ assertFalse(sf.getName(), sf.stored());
+ }
+ for (SchemaField sf : Arrays.asList(def3, def4, nostored3, nostored4)) {
+ assertEquals(
+ sf.getName(),
+ StrFloatLateInteractionVectorField.DEFAULT_SIMILARITY,
+ ((StrFloatLateInteractionVectorField)
sf.getType()).getSimilarityFunction());
+ }
+
+ assertEquals(
+ cosine4.getName(),
+ VectorSimilarityFunction.COSINE,
+ ((StrFloatLateInteractionVectorField)
cosine4.getType()).getSimilarityFunction());
+
+ } finally {
+ deleteCore();
+ }
+ }
+}
diff --git a/solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
b/solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
index 8df761740ae..a991827d80e 100644
--- a/solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
+++ b/solr/core/src/test/org/apache/solr/search/QueryEqualityTest.java
@@ -1018,6 +1018,21 @@ public class QueryEqualityTest extends SolrTestCaseJ4 {
}
}
+ public void testFuncLateVector() throws Exception {
+ try (SolrQueryRequest req =
+ req(
+ "f", "late_vec_4",
+ "v1", "[[1,2,3,4],[4,5,6,7]]")) {
+ assertFuncEquals(
+ req,
+ "lateVector(late_vec_4, $v1)",
+ "lateVector($f, $v1)",
+ "lateVector($f, '[[1,2,3,4],[4,5,6,7]]')",
+ "lateVector(late_vec_4, '[[1.0,2.0,3.0,4.0],[4.0,5.0,6.0,7.0]]')",
+ "lateVector(late_vec_4, ' [[ 1, 2, 3, 4.0] ,[4,5,6,7]] ')");
+ }
+ }
+
public void testFuncQuery() throws Exception {
SolrQueryRequest req = req("myQ", "asdf");
try {
diff --git
a/solr/core/src/test/org/apache/solr/search/TestLateInteractionVectors.java
b/solr/core/src/test/org/apache/solr/search/TestLateInteractionVectors.java
new file mode 100644
index 00000000000..c82c0cf4ca1
--- /dev/null
+++ b/solr/core/src/test/org/apache/solr/search/TestLateInteractionVectors.java
@@ -0,0 +1,310 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.search;
+
+import static
org.apache.lucene.search.LateInteractionFloatValuesSource.ScoreFunction.SUM_MAX_SIM;
+import static
org.apache.solr.schema.StrFloatLateInteractionVectorField.multiFloatVectorToString;
+import static
org.apache.solr.schema.StrFloatLateInteractionVectorField.stringToMultiFloatVector;
+import static org.hamcrest.Matchers.startsWith;
+
+import java.util.Arrays;
+import java.util.EnumSet;
+import java.util.List;
+import java.util.Map;
+import org.apache.lucene.document.LateInteractionField;
+import org.apache.lucene.document.StoredField;
+import org.apache.lucene.index.IndexableField;
+import org.apache.lucene.index.VectorSimilarityFunction;
+import org.apache.lucene.search.LateInteractionFloatValuesSource.ScoreFunction;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.schema.StrFloatLateInteractionVectorField;
+import org.junit.After;
+import org.junit.Before;
+
+/** Basic tests of {@link StrFloatLateInteractionVectorField} FieldType */
+public class TestLateInteractionVectors extends SolrTestCaseJ4 {
+
+ @Before
+ public void init() throws Exception {
+ initCore("solrconfig-basic.xml", "schema-late-vec.xml");
+ }
+
+ @After
+ public void cleanUp() {
+ clearIndex();
+ deleteCore();
+ }
+
+ public void testFutureProofAgainstNewScoreFunctions() throws Exception {
+ // if this assert fails, it means there are new value(s) in the
ScoreFunction enum,
+ // and we need to add fieldType declarations using those new
ScoreFunctions to our test
+ // configs, and confirm the correct score function is used in various tests
+ //
+ // then remove this test method
+ assertEquals(
+ "The Lucene ScoreFunction enum now has more than one value, test needs to be
updated",
+ EnumSet.of(ScoreFunction.SUM_MAX_SIM),
+ EnumSet.allOf(ScoreFunction.class));
+ }
+
+ public void testStringFloatEncodingAndDecoding() throws Exception {
+ final int DIMENSIONS = 4;
+
+ // some basic whitespace and int/float equivalences...
+ final float[][] basic = new float[][] {{1, 2, 3, 4}, {-5, 6, 7, 8}};
+ final List<String> basicWs =
+ Arrays.asList(
+ "[[1.0,2.0,3.0,4.0],[-5.0,6.0,7.0,8.0]]",
+ "[[1,2,3,4],[-5,6,7,8.0]]",
+ " [ [ 1,+2, 3,4 ] , [-05, 6,7, 8.000] ] ");
+
+ for (String in : basicWs) {
+ assertEquals(in, basic, stringToMultiFloatVector(DIMENSIONS, in));
+ }
+
+ // round trips of some "simple" fixed data with known string values
+ final Map<String, float[][]> simple =
+ Map.of(
+ "[[1.0,2.0,3.0,4.0]]",
+ new float[][] {{1, 2, 3, 4}},
+ basicWs.get(0),
+ basic,
+ "[[1.1754944E-38,1.4E-45,3.4028235E38,-0.0]]",
+ new float[][] {{Float.MIN_NORMAL, Float.MIN_VALUE,
Float.MAX_VALUE, -0.0F}});
+ for (Map.Entry<String, float[][]> e : simple.entrySet()) {
+ // one way each way
+ assertEquals(e.getValue(), stringToMultiFloatVector(DIMENSIONS,
e.getKey()));
+ assertEquals(e.getKey(), multiFloatVectorToString(e.getValue()));
+ // round trip each way
+ assertEquals(
+ e.getValue(),
+ stringToMultiFloatVector(DIMENSIONS,
multiFloatVectorToString(e.getValue())));
+ assertEquals(
+ e.getKey(),
multiFloatVectorToString(stringToMultiFloatVector(DIMENSIONS, e.getKey())));
+ }
+
+ // round trips of randomized vectors
+ final int randomIters = atLeast(50);
+ for (int iter = 0; iter < randomIters; iter++) {
+ final float[][] data = new float[atLeast(5)][];
+ for (int d = 0; d < data.length; d++) {
+ final float[] vec = data[d] = new float[DIMENSIONS];
+ for (int v = 0; v < DIMENSIONS; v++) {
+ vec[v] = random().nextFloat();
+ }
+ }
+ assertEquals(data, stringToMultiFloatVector(DIMENSIONS,
multiFloatVectorToString(data)));
+ }
+ }
+
+ public void testStringDecodingValidation() {
+ final int DIMENSIONS = 2;
+
+ // these should all be SyntaxErrors starting with "Expected..."
+ for (String bad :
+ Arrays.asList(
+ "",
+ "garbage",
+ "[]",
+ "[",
+ "]",
+ "[[1,2],",
+ "[[1,2],[]]",
+ "[[1,2]garbage]",
+ "[[1,2],[3]]",
+ "[[1,2],[,3]]",
+ "[[1,2],[3,,]]",
+ "[[1,2],[3,asdf]]")) {
+ final SyntaxError e =
+ expectThrows(
+ SyntaxError.class,
+ () -> {
+ stringToMultiFloatVector(DIMENSIONS, bad);
+ });
+ assertThat(e.getMessage(), startsWith("Expected "));
+ }
+
+ // Extra stuff at the end of input is "Unexpected..."
+ for (String bad : Arrays.asList("[[1,2]]garbage", "[[1,2]] garbage"))
{
+ final SyntaxError e =
+ expectThrows(
+ SyntaxError.class,
+ () -> {
+ stringToMultiFloatVector(DIMENSIONS, bad);
+ });
+ assertThat(e.getMessage(), startsWith("Unexpected "));
+ }
+ }
+
+ /** Low level test of createFields */
+ public void testCreateFields() throws Exception {
+ final Map<String, float[][]> data =
+ Map.of(
+ "[[1,2,3,4]]",
+ new float[][] {{1F, 2F, 3F, 4F}},
+ "[[1,2,3,4],[5,6,7,8]]",
+ new float[][] {{1F, 2F, 3F, 4F}, {5F, 6F, 7F, 8F}});
+
+ try (SolrQueryRequest r = req()) {
+ // defaults with stored + doc values
+ for (String input : data.keySet()) {
+ final SchemaField f = r.getSchema().getField("lv_4_def");
+ final float[][] expected = data.get(input);
+ final List<IndexableField> actual = f.getType().createFields(f, input);
+ assertEquals(2, actual.size());
+
+ if (actual.get(0) instanceof LateInteractionField lif) {
+ assertEquals(expected, lif.getValue());
+ } else {
+ fail("first Field isn't a LIF: " + actual.get(0).getClass());
+ }
+ if (actual.get(1) instanceof StoredField stored) {
+ assertEquals(input, stored.stringValue());
+ } else {
+ fail("second Field isn't stored: " + actual.get(1).getClass());
+ }
+ }
+
+ // stored=false, only doc values
+ for (String input : data.keySet()) {
+ final SchemaField f = r.getSchema().getField("lv_4_nostored");
+ final float[][] expected = data.get(input);
+ final List<IndexableField> actual = f.getType().createFields(f, input);
+ assertEquals(1, actual.size());
+
+ if (actual.get(0) instanceof LateInteractionField lif) {
+ assertEquals(expected, lif.getValue());
+ } else {
+ fail("first Field isn't a LIF: " + actual.get(0).getClass());
+ }
+ }
+ }
+ }
+
+ public void testSimpleIndexAndRetrieval() throws Exception {
+ // for simplicity, use a single doc, with identical values in several
fields
+
+ final float[][] d3 = new float[][] {{0.1F, 0.2F, 0.3F}, {0.5F, -0.6F,
0.7F}, {0.1F, 0F, 0F}};
+ final String d3s = multiFloatVectorToString(d3);
+ final float[][] d4 =
+ new float[][] {{0.1F, 0.2F, 0.3F, 0.4F}, {0.5F, -0.6F, 0.7F, 0.8F},
{0.1F, 0F, 0F, 0F}};
+ final String d4s = multiFloatVectorToString(d4);
+ // quick round trip sanity checks
+ assertEquals(d3, stringToMultiFloatVector(3, d3s));
+ assertEquals(d4, stringToMultiFloatVector(4, d4s));
+
+ // now index the strings
+ assertU(
+ add(
+ doc(
+ "id", "xxx",
+ "lv_3_def", d3s,
+ "lv_3_nostored", d3s,
+ "lv_4_def", d4s,
+ "lv_4_cosine", d4s,
+ "lv_4_nostored", d4s)));
+
+ assertU(commit());
+
+ final float[][] q3 = new float[][] {{0.1F, 0.3F, 0.4F}, {0F, 0F, 0.1F}};
+ final String q3s = multiFloatVectorToString(q3);
+ final float[][] q4 = new float[][] {{0.9F, 0.9F, 0.9F, 0.9F}, {0.1F, 0.1F,
0.1F, 0.1F}};
+ final String q4s = multiFloatVectorToString(q4);
+ // quick round trip sanity checks
+ assertEquals(q3, stringToMultiFloatVector(3, q3s));
+ assertEquals(q4, stringToMultiFloatVector(4, q4s));
+
+ // expected values based on Lucene's underlying raw computation
+ // (this also ensures that our configured simFunc is being used correctly)
+ final float euclid3 = SUM_MAX_SIM.compare(q3, d3,
VectorSimilarityFunction.EUCLIDEAN);
+ final float euclid4 = SUM_MAX_SIM.compare(q4, d4,
VectorSimilarityFunction.EUCLIDEAN);
+ final float cosine4 = SUM_MAX_SIM.compare(q4, d4,
VectorSimilarityFunction.COSINE);
+
+ // quick sanity check that our data is useful for differentiation...
+ assertNotEquals(euclid4, cosine4);
+
+ // retrieve our doc, and check its returned field values as well as our
sim function results
+ assertQ(
+ req(
+ "q", "id:xxx",
+ "fl", "*",
+ "fl", "euclid_3_def:lateVector(lv_3_def,'" + q3s + "')",
+ "fl", "euclid_3_nostored:lateVector(lv_3_nostored,'" + q3s + "')",
+ "fl", "euclid_4_def:lateVector(lv_4_def,'" + q4s + "')",
+ "fl", "euclid_4_nostored:lateVector(lv_4_nostored,'" + q4s + "')",
+ "fl", "cosine_4:lateVector(lv_4_cosine,'" + q4s + "')"),
+
+ // stored fields
+ "//str[@name='lv_3_def'][.='" + d3s + "']",
+ "//str[@name='lv_4_def'][.='" + d4s + "']",
+ "//str[@name='lv_4_cosine'][.='" + d4s + "']",
+
+ // dv only non-stored fields
+ "//str[@name='lv_3_nostored'][.='" + d3s + "']",
+ "//str[@name='lv_4_nostored'][.='" + d4s + "']",
+
+ // function computations
+ "//float[@name='euclid_3_def'][.=" + euclid3 + "]",
+ "//float[@name='euclid_3_nostored'][.=" + euclid3 + "]",
+ "//float[@name='euclid_4_def'][.=" + euclid4 + "]",
+ "//float[@name='euclid_4_nostored'][.=" + euclid4 + "]",
+ "//float[@name='cosine_4'][.=" + cosine4 + "]",
+
+ // sanity check
+ "//*[@numFound='1']");
+ }
+
+ public void testReRank() throws Exception {
+ final int numDocs = atLeast(10);
+ // NOTE start at '1' and stop at '<'; we add one more doc after the loop
+ for (int i = 1; i < numDocs; i++) {
+ assertU(
+ add(
+ doc(
+ "id", "xxx" + i,
+ "lv_3_def", "[[1,2,3],[4,5,6],[1,1," + i + "]]")));
+ }
+ assertU(
+ add(
+ doc(
+ "id", "yyy",
+ "lv_3_def", "[[1,5,9],[9,9,9],[6,6,6]]")));
+ assertU(commit());
+
+ // match everything, but our xxx* docs should all score higher than yyy
+ final String q = "id:xxx* OR *:*";
+
+ // sanity check our default scores & sort
+ // yyy should not appear in the top N-1 results
+ assertQ(
+ req("q", q, "rows", Integer.toString(numDocs - 1)),
+ "//*[@numFound='" + numDocs + "']",
+ "not(//str[@name='id'][.='yyy'])");
+
+ // same query with small rows and reRank to pull yyy to the top spot
+ assertQ(
+ req(
+ "q", q,
+ "rows", "5",
+ "rq", "{!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}",
+ "rqq", "{!func}lateVector(lv_3_def,'[[9,9,9],[6,7,6]]')"),
+ "//*[@numFound='" + numDocs + "']",
+ "//result/doc[1]/str[@name='id'][.='yyy']");
+ }
+}
diff --git
a/solr/solr-ref-guide/modules/indexing-guide/pages/field-types-included-with-solr.adoc
b/solr/solr-ref-guide/modules/indexing-guide/pages/field-types-included-with-solr.adoc
index 082318a6754..4eaed0e0475 100644
---
a/solr/solr-ref-guide/modules/indexing-guide/pages/field-types-included-with-solr.adoc
+++
b/solr/solr-ref-guide/modules/indexing-guide/pages/field-types-included-with-solr.adoc
@@ -71,6 +71,8 @@ The
{solr-javadocs}/core/org/apache/solr/schema/package-summary.html[`org.apache
|StrField |String (UTF-8 encoded string or Unicode). Indexed `indexed="true"`
strings are intended for small fields and are _not_ tokenized or analyzed in
any way. They have a hard limit of slightly less than 32K. Non-indexed
`indexed="false"` and non-DocValues `docValues="false"` strings are suitable
for storing large strings.
+|StrFloatLateInteractionVectorField |Supports indexing dense "Multi-Vectors"
of float values for use with Late Interaction Query Re-Ranking. See the section
xref:query-guide:dense-vector-search.adoc[] for more information.
+
|TextField |Text, usually multiple words or tokens. In normal usage, only
fields of type TextField or SortableTextField will specify an
xref:analyzers.adoc[analyzer].
|UUIDField |Universally Unique Identifier (UUID). Pass in a value of `NEW` and
Solr will create a new UUID.
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
index 94938e64c1a..cbe3d578190 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc
@@ -50,8 +50,16 @@ It provides efficient approximate nearest neighbor search
for high dimensional v
See https://doi.org/10.1016/j.is.2013.10.006[Approximate nearest neighbor
algorithm based on navigable small world graphs (2014)] and
https://arxiv.org/abs/1603.09320[Efficient and robust approximate nearest
neighbor search using Hierarchical Navigable Small World graphs (2018)] for
details.
-== Index Time
-This is the Apache Solr field type designed to support dense vector search:
+=== Late Interaction Retrieval
+
+Late Interaction Retrieval (LIR) is a method of encoding detailed semantic
information into multiple dense vectors for a finer-grained representation.
+
+Across a large corpus, these "multi-vector" representations of the original
semantic content are typically too large and unwieldy to index and search in
a navigable small-world graph in a useful manner. Instead, "Late Interaction"
approaches are typically used to compute vector similarity scores against a
subset of documents, after an initial pass of other search techniques.
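
The late-interaction scoring idea described above can be sketched in a few lines of Java. This is only an illustration, not Lucene's or Solr's implementation: the class and method names are hypothetical, and a raw dot product stands in for the configured similarity function (Lucene's own similarity functions return normalized scores). It follows the "sum of maximum similarities" pattern suggested by the `SUM_MAX_SIM` score function that this commit's tests reference: each query vector is matched with its most similar document vector, and those maxima are summed.

```java
/** Hypothetical sketch of "sum of max similarity" scoring over multi-vectors. */
public class SumMaxSimSketch {

  /** Raw dot product of two equal-length vectors (assumed similarity for this sketch). */
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** For each query vector, take its best-matching doc vector, then sum those maxima. */
  static float sumMaxSim(float[][] query, float[][] doc) {
    float total = 0f;
    for (float[] q : query) {
      float best = Float.NEGATIVE_INFINITY;
      for (float[] d : doc) {
        best = Math.max(best, dot(q, d));
      }
      total += best;
    }
    return total;
  }

  public static void main(String[] args) {
    float[][] query = {{1f, 0f}, {0f, 1f}};
    float[][] doc = {{1f, 0f}, {0f, 2f}, {1f, 1f}};
    // best match for {1,0} scores 1, best match for {0,1} scores 2
    System.out.println(sumMaxSim(query, doc)); // prints 3.0
  }
}
```

The cost grows with (query vectors x document vectors) per document, which is one way to see why this style of scoring is usually applied only to a re-ranked subset of results.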
+
+
+== Indexing Dense Vectors in Navigable Small-world Graphs
+
+Apache Solr supports multiple related field types designed to support dense
vector search with navigable small world graphs:
=== DenseVectorField
The dense vector field gives the possibility of indexing and searching dense
vectors of float elements.
@@ -76,9 +84,9 @@ s|Required |Default: none
The dimension of the dense vector to pass in.
+
Accepted values:
-Any integer.
+Any positive integer.
-`similarityFunction`::
+[[similarityFunction-caveats]]`similarityFunction`::
+
[%autowidth,frame=none]
|===
@@ -87,24 +95,18 @@ Any integer.
+
Vector similarity function; used in search to return top K most similar
vectors to a target vector.
+
-Accepted values: `euclidean`, `dot_product` or `cosine`.
-
+Accepted values:
++
* `euclidean`: https://en.wikipedia.org/wiki/Euclidean_distance[Euclidean
distance]
-* `dot_product`: https://en.wikipedia.org/wiki/Dot_product[Dot product]
-
-[NOTE]
-this similarity is intended as an optimized way to perform cosine similarity.
In order to use it, all vectors must be of unit length, including both document
and query vectors. Using dot product with vectors that are not unit length can
result in errors or poor search results.
-
* `cosine`: https://en.wikipedia.org/wiki/Cosine_similarity[Cosine similarity]
+* `dot_product`: https://en.wikipedia.org/wiki/Dot_product[Dot product]
[NOTE]
-the cosine similarity scores returned by Solr are normalized like this : `(1 +
cosine_similarity) / 2`.
-
-[NOTE]
-the preferred way to perform cosine similarity is to normalize all vectors to
unit length, and instead use DOT_PRODUCT. You should only use this function if
you need to preserve the original vectors and cannot normalize them in advance.
-
-[NOTE]
-The HNSW parameters `hnswM` and `hnswEfConstruction`, previously known as
`hnswMaxConnections` and `hnswBeamWidth` respectively.
+====
+* the preferred way to perform `cosine` similarity is to normalize all vectors
to unit length, and instead use `dot_product`. You should only specify `cosine`
if you need to preserve the original vectors and cannot normalize them in
advance.
+* `dot_product` is intended as an optimized way to perform `cosine`
similarity. In order to use it, all vectors must be of unit length, including
both document and query vectors. Using dot product with vectors that are not
unit length can result in errors or poor search results.
+* the cosine similarity scores returned by Solr are normalized like this : `(1
+ cosine_similarity) / 2`.
+====
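
The relationship between `cosine`, `dot_product`, and the `(1 + cosine_similarity) / 2` normalization in the notes above can be checked with a small Java sketch. The class and method names are hypothetical and the arithmetic is done directly here rather than through any Solr or Lucene API; it only demonstrates that, once vectors are scaled to unit length, their dot product equals their cosine similarity.

```java
/** Hypothetical sketch relating cosine similarity, unit-length dot product, and score normalization. */
public class CosineScoreSketch {

  static double dot(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += (double) a[i] * b[i];
    }
    return sum;
  }

  /** Cosine similarity in [-1, 1]; undefined (NaN) if either vector is all zeros. */
  static double cosine(float[] a, float[] b) {
    return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
  }

  /** The normalization quoted in the note: maps [-1, 1] onto [0, 1]. */
  static double normalizedCosineScore(float[] a, float[] b) {
    return (1 + cosine(a, b)) / 2;
  }

  /** Scales a vector to unit length so that dot product and cosine coincide. */
  static float[] toUnitLength(float[] v) {
    double norm = Math.sqrt(dot(v, v));
    float[] out = new float[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = (float) (v[i] / norm);
    }
    return out;
  }

  public static void main(String[] args) {
    float[] a = {3f, 4f};
    float[] b = {4f, 3f};
    System.out.println(cosine(a, b));
    System.out.println(dot(toUnitLength(a), toUnitLength(b)));
    System.out.println(normalizedCosineScore(a, b));
  }
}
```

For the vectors in `main`, the first two printed values agree up to floating-point rounding (about 0.96), and the normalized score is about 0.98.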
To use the following advanced parameters that customise the codec format
and the hyperparameter of the HNSW algorithm, make sure the
xref:configuration-guide:codec-factory.adoc[Schema Codec Factory], is in use.
@@ -124,7 +126,6 @@ Here's how `DenseVectorField` can be configured with the
advanced hyperparameter
+
(advanced) Specifies the underlying knn algorithm to use
+
-
Accepted values: `hnsw`, `cagra_hnsw` (requires GPU acceleration setup).
Please note that the `knnAlgorithm` accepted values may change in future
releases.
@@ -138,7 +139,6 @@ Please note that the `knnAlgorithm` accepted values may
change in future release
+
(advanced) Specifies the underlying encoding of the dense vector elements.
This affects memory/disk impact for both the indexed and stored fields (if
enabled)
+
-
Accepted values: `FLOAT32`, `BYTE`.
@@ -388,9 +388,10 @@ BinaryQuantizedDenseVectorField accepts the same
parameters as `DenseVectorField
`similarityFunction`. Bit quantization uses its own distance calculation and
so does not require nor use the `similarityFunction`
param.
-== Query Time
+[[query-hnsw-fields]]
+== Querying Vectors in Navigable Small-world Graphs
-Apache Solr provides three query parsers that work with dense vector fields,
that each support different ways of matching documents based on vector
similarity: The `knn` query parser, the `vectorSimilarity` query parser and the
`knn_text_to_vector` query parser.
+Apache Solr provides three query parsers that work with the `DenseVectorField`
family of field types, that each support different ways of matching documents
based on vector similarity: The `knn` query parser, the `vectorSimilarity`
query parser and the `knn_text_to_vector` query parser.
All parsers return scores for retrieved documents that are the approximate
distance to the target vector (defined by the similarityFunction configured at
indexing time) and both support "Pre-Filtering" the document graph to reduce
the number of candidate vectors evaluated (without needing to compute their
vector similarity distances).
@@ -743,11 +744,11 @@ Here's an example of a simple `vectorSimilarity` search:
The search results retrieved are all documents whose similarity with the input
vector `[1.0, 2.0, 3.0, 4.0]` is at least `0.7` based on the
`similarityFunction` configured at indexing time
-=== Which one to use?
+=== Which Query Parser to use?
Let's see when to use each of the dense retrieval query parsers available:
-== knn Query Parser
+==== knn Query Parser
You should use the `knn` query parser when:
@@ -756,7 +757,7 @@ You should use the `knn` query parser when:
* you want to a have a fine-grained control over the way you encode text to
vector and prefer to do it outside of Apache Solr
-== knn_text_to_vector Query Parser
+==== knn_text_to_vector Query Parser
You should use the `knn_text_to_vector` query parser when:
@@ -770,7 +771,7 @@ Apache Solr uses
https://github.com/langchain4j/langchain4j[LangChain4j] to inte
The integration is experimental and we are going to improve our stress-test
and benchmarking coverage of this query parser in future iterations: if you
care about raw performance you may prefer to encode the text outside of Solr
====
-== vectorSimilarity Query Parser
+==== vectorSimilarity Query Parser
You should use the `vectorSimilarity` query parser when:
@@ -877,6 +878,136 @@ The final ranked list of results will have the first pass
score(main query `q`)
Details about using the ReRank Query Parser can be found in the
xref:query-guide:query-re-ranking.adoc[Query Re-Ranking] section.
====
+
+== Indexing Multi-Vectors for Late Interaction
+
+For Late Interaction use cases, Solr provides a
`StrFloatLateInteractionVectorField` field type, which supports indexing a
variable length "Multi-Vector" of Float vectors, serialized as a single
String value.
+
+For example: `"[[1.0, 2, 3.7, 4.1], [2.2, -2.5, 7.3, 4.0]]"`
+
+Here's how `StrFloatLateInteractionVectorField` should be configured in the
schema:
+
+[source,xml]
+<fieldType name="late_vectors" class="solr.StrFloatLateInteractionVectorField"
vectorDimension="4" similarityFunction="cosine"/>
+<field name="my_late_vector" type="late_vectors" docValues="true"
stored="true"/>
+
+
+`vectorDimension`::
++
+[%autowidth,frame=none]
+|===
+s|Required |Default: none
+|===
++
+The dimension of the individual dense vectors that will be contained in the
Multi-Vectors indexed in this field.
++
+Accepted values:
+Any positive integer.
+
+`similarityFunction`::
++
+[%autowidth,frame=none]
+|===
+|Optional |Default: `euclidean`
+|===
++
+Vector similarity function; used in computing the similarity of indexed
vectors to a target vector.
++
+Accepted values: `euclidean`, `dot_product` or `cosine`.
++
+[NOTE]
+See <<similarityFunction-caveats,previous notes regarding
`similarityFunction`>> in `DenseVectorField`; they are also applicable here.
+
+`scoreFunction`::
++
+[%autowidth,frame=none]
+|===
+|Optional |Default: `sum_min_max`
+|===
++
+Multi-Vector scoring function, used to compute a single numeric score when
computing the similarity of multiple indexed vectors to multiple target
vectors.
++
+Accepted values: `sum_min_max`
+
+`StrFloatLateInteractionVectorField` also supports the standard attribute
`stored`.
+
+[NOTE]
+====
+* `StrFloatLateInteractionVectorField` defaults to (and requires)
`docValues="true" indexed="false" multiValued="false"`
+* Although the field type is used to index "Multi-Vectors", only a _single_
string value (containing the multiple vectors) may be indexed into each field.
+====
+
+Here's how a `StrFloatLateInteractionVectorField` named `my_late_vector`
should be indexed:
+
+[tabs#latevectorfield-index]
+======
+JSON::
++
+====
+[source,json]
+----
+[{ "id": "1",
+"my_late_vector": "[[1.0, 2, 3.7, 4.1], [2.2, -2.5, 7.3, 4.0]]"
+},
+{ "id": "2",
+"my_late_vector": "[[2.0, 5.6, -3.2, 1.4], [7.8, -2.5, 3.7, 0.0034], [-2.2, 5.5, 0.6, -0.030]]"
+}
+]
+----
+====
+
+XML::
++
+====
+[source,xml]
+----
+<add>
+<doc>
+<field name="id">1</field>
+<field name="my_late_vector">[[1.0, 2, 3.7, 4.1], [2.2, -2.5, 7.3,
4.0]]</field>
+</doc>
+<doc>
+<field name="id">2</field>
+<field name="my_late_vector">[[2.0, 5.6, -3.2, 1.4], [7.8, -2.5, 3.7, 0.0034],
[-2.2, 5.5, 0.6, -0.030]]</field>
+</doc>
+</add>
+----
+====
+
+SolrJ::
++
+====
+[source,java,indent=0]
+----
+final SolrClient client = getSolrClient();
+
+final SolrInputDocument d1 = new SolrInputDocument();
+d1.setField("id", "1");
+d1.setField("my_late_vector", "[[1.0, 2, 3.7, 4.1], [2.2, -2.5, 7.3, 4.0]]");
+
+final SolrInputDocument d2 = new SolrInputDocument();
+d2.setField("id", "2");
+d2.setField("my_late_vector", "[[2.0, 5.6, -3.2, 1.4], [7.8, -2.5, 3.7, 0.0034], [-2.2, 5.5, 0.6, -0.030]]");
+
+client.add(Arrays.asList(d1, d2));
+----
+====
+======
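When the vectors are held in client code as arrays rather than as a ready-made string, they have to be serialized into the bracketed form shown above before being set on the document. The following is a minimal sketch of one way to do that; `MultiVectorStrings` and `toFieldValue` are hypothetical names used only for illustration and are not part of SolrJ.

```java
import java.util.StringJoiner;

/** Hypothetical client-side helper; not part of Solr or SolrJ. */
public final class MultiVectorStrings {
  private MultiVectorStrings() {}

  /**
   * Serializes a multi-vector into the bracketed string form used in the
   * examples above, e.g. "[[1.0, 2.0], [3.5, -4.0]]".
   */
  public static String toFieldValue(float[][] vectors) {
    StringJoiner outer = new StringJoiner(", ", "[", "]");
    for (float[] vector : vectors) {
      StringJoiner inner = new StringJoiner(", ", "[", "]");
      for (float component : vector) {
        // Float.toString switches to scientific notation for very small or
        // very large magnitudes; this sketch does not check that the field
        // accepts that form.
        inner.add(Float.toString(component));
      }
      outer.add(inner.toString());
    }
    return outer.toString();
  }
}
```

The result can then be passed as the value in a call such as `d1.setField("my_late_vector", MultiVectorStrings.toFieldValue(vectors))`.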
+
+[[late-vector-queries]]
+== Using Late Interaction Vectors in Re-Ranking
+
+Late Interaction vector fields are poorly suited for querying (or filtering)
documents, but they can be very useful for
xref:query-guide:query-re-ranking.adoc[Re-Ranking] first pass results from
other queries (even <<query-hnsw-fields,other dense vector queries>>) by using
a `lateVector()`
xref:query-guide:function-queries.adoc#latevector-function[function query].
+
+Here is an example of re-ranking a query using a
`StrFloatLateInteractionVectorField` named `my_late_vector`:
+
+[source,text]
+?q=title:"Potato Chips"&rq={!rerank
reRankQuery=$rqq}&rqq={!func}lateVector(my_late_vector,"[[1.0,-2.0,3.0,4.0],[6.0,7,8.1,9.9]]")
+
+
+Details about using the ReRank Query Parser can be found in the
xref:query-guide:query-re-ranking.adoc[Query Re-Ranking] section.
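To give an intuition for how a single score can be derived from two Multi-Vectors, the sketch below implements the scoring conventionally used by late interaction models: for each query vector, take the best similarity against any document vector, then sum those maxima. This is an illustration under stated assumptions, not Solr's implementation. It uses a plain dot product rather than the configured `similarityFunction`, the class and method names are hypothetical, and whether this matches the exact behavior of the field's `scoreFunction` is not confirmed by this page.

```java
/** Hypothetical illustration of late interaction scoring; not Solr code. */
public final class LateInteractionSketch {
  private LateInteractionSketch() {}

  /** Plain dot product of two vectors of equal dimension. */
  static float dot(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /**
   * For each query vector, finds the maximum similarity against any
   * document vector, then sums those maxima into one score.
   */
  static float sumOfMaxSimilarity(float[][] query, float[][] doc) {
    if (doc.length == 0) {
      throw new IllegalArgumentException("document multi-vector is empty");
    }
    float total = 0f;
    for (float[] q : query) {
      float best = Float.NEGATIVE_INFINITY;
      for (float[] d : doc) {
        best = Math.max(best, dot(q, d));
      }
      total += best;
    }
    return total;
  }
}
```

Because every query vector is compared against every document vector, the cost grows with the product of the two Multi-Vector lengths, which is one reason this kind of scoring is typically applied only to a limited set of first pass results.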
+
+
== GPU Acceleration
[NOTE]
diff --git
a/solr/solr-ref-guide/modules/query-guide/pages/function-queries.adoc
b/solr/solr-ref-guide/modules/query-guide/pages/function-queries.adoc
index 05abe1a114f..7244db1428f 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/function-queries.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/function-queries.adoc
@@ -253,6 +253,11 @@ An expression can be any function which outputs boolean
values, or even function
* `if(termfreq (cat,'electronics'),popularity,42)`: This function checks each
document for to see if it contains the term "electronics" in the `cat` field.
If it does, then the value of the `popularity` field is returned, otherwise
the value of `42` is returned.
+=== lateVector Function
+Computes a Multi-Vector similarity score between a Late Interaction vector
field and a target Multi-Vector.
+
+See the xref:query-guide:dense-vector-search.adoc#late-vector-queries[Dense
Vector Search] section for more details.
+
=== linear Function
Implements `m*x+c` where `m` and `c` are constants and `x` is an arbitrary
function.
This is equivalent to `sum(product(m,x),c)`, but slightly more efficient as it
is implemented as a single function.