[ https://issues.apache.org/jira/browse/SOLR-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Benedetti updated SOLR-17815:
----------------------------------------
Description:
ACORN with a threshold of 60 becomes the default when we upgrade to Lucene 10.x.
ACORN can be disabled at the KnnSearchStrategy level:
    public static class Hnsw extends KnnSearchStrategy {
      public static final Hnsw DEFAULT = new Hnsw(DEFAULT_FILTERED_SEARCH_THRESHOLD);
This issue should study when ACORN is useful and when it is not. If the default
is not good enough for Solr, we should make the threshold configurable and
explain to users how to use it.
More info will follow and will be better organised soon:
An interesting paper in this area: https://arxiv.org/abs/2403.04871
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and
Structured Data
Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia
This was implemented in Lucene in https://github.com/apache/lucene/pull/14160,
specifically in org.apache.lucene.util.hnsw.FilteredHnswGraphSearcher,
which can be used in Solr via org.apache.lucene.search.knn.KnnSearchStrategy.Hnsw:
    /**
     * Create a new Hnsw strategy
     *
     * @param filteredSearchThreshold threshold for filtered search, a percentage value from 0 to
     *     100 where 0 means never use filtered search and 100 means always use filtered search.
     */
    public Hnsw(int filteredSearchThreshold) {
      if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
        throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
      }
      this.filteredSearchThreshold = filteredSearchThreshold;
    }
We may pass an additional parameter to disable it (filteredSearchThreshold=0).
The default is:
    public static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
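As a starting point for discussion, here is a minimal sketch of how a per-request threshold could be resolved before it is handed to the Hnsw constructor quoted above. Everything in it is an assumption made for illustration: the class AcornThresholdParam, the request parameter name "filteredSearchThreshold", and the choice to fall back to 60 when the parameter is absent are hypothetical and are not existing Solr or Lucene API. Only the 0 to 100 range check and the default of 60 come from the Lucene code above.

```java
import java.util.Map;

/** Hypothetical helper, not part of Solr or Lucene: resolves a per-request ACORN threshold. */
final class AcornThresholdParam {
  // Mirrors Lucene's DEFAULT_FILTERED_SEARCH_THRESHOLD quoted in this issue.
  static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
  // Assumed request parameter name; the real name is still to be decided.
  static final String PARAM = "filteredSearchThreshold";

  private AcornThresholdParam() {}

  /** Parses a raw parameter value; null or blank falls back to the default. */
  static int parse(String raw) {
    if (raw == null || raw.isBlank()) {
      return DEFAULT_FILTERED_SEARCH_THRESHOLD;
    }
    final int value;
    try {
      value = Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(PARAM + " must be an integer, got: " + raw, e);
    }
    // Same range check as the Hnsw constructor, so bad input fails early.
    if (value < 0 || value > 100) {
      throw new IllegalArgumentException(PARAM + " must be >= 0 and <= 100");
    }
    return value;
  }

  /** Resolves the threshold from a flat map of request parameters. */
  static int fromParams(Map<String, String> params) {
    return parse(params.get(PARAM));
  }

  public static void main(String[] args) {
    // No parameter supplied: keep the Lucene default.
    System.out.println(fromParams(Map.of()));
    // Explicit 0: ACORN-style filtered search is never used.
    System.out.println(fromParams(Map.of(PARAM, "0")));
  }
}
```

The resolved value would then be passed to new KnnSearchStrategy.Hnsw(threshold) when the query is built. Validating in the parser keeps the error message tied to the user-facing parameter rather than to the Lucene constructor.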
was:
An interesting paper in this area: https://arxiv.org/abs/2403.04871
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and
Structured Data
Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia
This was implemented in Lucene in https://github.com/apache/lucene/pull/14160
Specifically in org.apache.lucene.util.hnsw.FilteredHnswGraphSearcher
that can be used in Solr via org.apache.lucene.search.knn.KnnSearchStrategy.Hnsw
    /**
     * Create a new Hnsw strategy
     *
     * @param filteredSearchThreshold threshold for filtered search, a percentage value from 0 to
     *     100 where 0 means never use filtered search and 100 means always use filtered search.
     */
    public Hnsw(int filteredSearchThreshold) {
      if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
        throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
      }
      this.filteredSearchThreshold = filteredSearchThreshold;
    }
We may pass an additional parameter to disable it (filteredSearchThreshold=0),
the default is:
public static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
> Add parameter to regulate for ACORN-based filtering in vector search?
> ---------------------------------------------------------------------
>
> Key: SOLR-17815
> URL: https://issues.apache.org/jira/browse/SOLR-17815
> Project: Solr
> Issue Type: Sub-task
> Reporter: Alessandro Benedetti
> Priority: Major