[
https://issues.apache.org/jira/browse/SOLR-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alessandro Benedetti updated SOLR-17815:
----------------------------------------
Description:
ACORN is an interesting approach to optimised filtered vector search:
https://arxiv.org/abs/2403.04871
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and
Structured Data
Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia
h1. LUCENE IMPLEMENTATION
ACORN was implemented in Lucene in https://github.com/apache/lucene/pull/14160
Specifically, it lives in org.apache.lucene.util.hnsw.FilteredHnswGraphSearcher,
which can be used in Solr via org.apache.lucene.search.knn.KnnSearchStrategy.Hnsw:
/**
 * Create a new Hnsw strategy
 *
 * @param filteredSearchThreshold threshold for filtered search, a percentage value from 0 to
 *     100 where 0 means never use filtered search and 100 means always use filtered search.
 */
public Hnsw(int filteredSearchThreshold) {
  if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
    throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
  }
  this.filteredSearchThreshold = filteredSearchThreshold;
}
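To make the threshold semantics concrete, here is a minimal, self-contained sketch. ThresholdSketch is a hypothetical stand-in written for this issue, not the Lucene class. The range check copies the constructor quoted above; the comparison in useFilteredSearch (percentage of documents passing the filter strictly below the threshold) is an assumption about how the threshold is applied, chosen so that 0 never selects filtered search, and should be checked against the Lucene source.

```java
/** Hypothetical stand-in for illustration only; NOT KnnSearchStrategy.Hnsw. */
final class ThresholdSketch {
  private final int filteredSearchThreshold;

  ThresholdSketch(int filteredSearchThreshold) {
    // Same range check as the Lucene constructor quoted in this issue.
    if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
      throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
    }
    this.filteredSearchThreshold = filteredSearchThreshold;
  }

  /**
   * ASSUMPTION: filtered (ACORN-style) search is chosen when the percentage of
   * documents passing the filter is strictly below the threshold.
   *
   * @param ratioPassingFilter fraction of documents accepted by the filter, in [0, 1]
   */
  boolean useFilteredSearch(float ratioPassingFilter) {
    return ratioPassingFilter * 100 < filteredSearchThreshold;
  }
}
```

Under this assumption, a threshold of 0 never selects the filtered searcher, while the default of 60 selects it only for filters that accept fewer than 60% of the documents.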
h1. DEFAULT
ACORN with a threshold of '60' becomes the default when we upgrade to Lucene 10.x.
ACORN can be disabled at the KnnSearchStrategy level by passing '0' as the
threshold.
public static class Hnsw extends KnnSearchStrategy {
  public static final Hnsw DEFAULT = new Hnsw(DEFAULT_FILTERED_SEARCH_THRESHOLD);
  // ...
}
h1. SCOPE OF THIS ISSUE
This issue should study when ACORN is useful and whether the default is good
enough for Solr.
If it is not, the expected outcome of this task is a detailed rationale and the
implementation of a parameter that lets users disable or tune the ACORN
behavior.
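As a starting point for discussion, the sketch below shows one way such a parameter could be resolved. Everything here is hypothetical: the class AcornParamSketch, the parameter name filteredSearchThreshold, and the use of a plain Map in place of Solr's request parameters are assumptions made for illustration. Only the 0..100 range and the default of 60 come from the Lucene code quoted in this issue.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of Solr or Lucene. */
final class AcornParamSketch {
  // Default value quoted in this issue (Lucene's DEFAULT_FILTERED_SEARCH_THRESHOLD).
  static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
  // Hypothetical parameter name; the real name is still to be decided.
  static final String PARAM = "filteredSearchThreshold";

  /** Returns the threshold to use: the default when absent, else a validated 0..100 value. */
  static int resolveThreshold(Map<String, String> params) {
    String raw = params.get(PARAM);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_FILTERED_SEARCH_THRESHOLD;
    }
    final int value;
    try {
      value = Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(PARAM + " must be an integer, got: " + raw, e);
    }
    // Mirror the range check of the Hnsw constructor so errors surface early.
    if (value < 0 || value > 100) {
      throw new IllegalArgumentException(PARAM + " must be >= 0 and <= 100");
    }
    return value;
  }
}
```

The resolved value would then be handed to new KnnSearchStrategy.Hnsw(threshold); passing 0 disables ACORN, and omitting the parameter keeps the Lucene default.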
was:
ACORN with a threshold of '60' becomes the default when we upgrade to Lucene 10.x.
ACORN can be disabled at the KnnSearchStrategy level.
public static class Hnsw extends KnnSearchStrategy {
  public static final Hnsw DEFAULT = new Hnsw(DEFAULT_FILTERED_SEARCH_THRESHOLD);
  // ...
}
This issue should study when ACORN is useful and whether the default is good
enough for Solr; if it is not, we should make it configurable and explain to
users how to use it.
More info will follow and be better organised soon:
An interesting paper in this area: https://arxiv.org/abs/2403.04871
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and
Structured Data
Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia
ACORN was implemented in Lucene in https://github.com/apache/lucene/pull/14160
Specifically, it lives in org.apache.lucene.util.hnsw.FilteredHnswGraphSearcher,
which can be used in Solr via org.apache.lucene.search.knn.KnnSearchStrategy.Hnsw:
/**
 * Create a new Hnsw strategy
 *
 * @param filteredSearchThreshold threshold for filtered search, a percentage value from 0 to
 *     100 where 0 means never use filtered search and 100 means always use filtered search.
 */
public Hnsw(int filteredSearchThreshold) {
  if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
    throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
  }
  this.filteredSearchThreshold = filteredSearchThreshold;
}
We may pass an additional parameter to disable it (filteredSearchThreshold=0);
the default is:
public static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
> Add parameter to regulate for ACORN-based filtering in vector search?
> ---------------------------------------------------------------------
>
> Key: SOLR-17815
> URL: https://issues.apache.org/jira/browse/SOLR-17815
> Project: Solr
> Issue Type: New Feature
> Components: vector-search
> Reporter: Alessandro Benedetti
> Priority: Major
>