[ https://issues.apache.org/jira/browse/SOLR-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Benedetti updated SOLR-17815:
----------------------------------------
    Description: 
ACORN with a threshold of '60' is the default when we upgrade to Lucene 10.x.
ACORN can be disabled at the KnnSearchStrategy level:

 public static class Hnsw extends KnnSearchStrategy {
    public static final Hnsw DEFAULT = new Hnsw(DEFAULT_FILTERED_SEARCH_THRESHOLD);

This issue should study when ACORN is useful and when it is not. If the default
is not good enough for Solr, we should make it configurable and explain to users
how to use it.

More info will follow and be better organised soon:


An interesting paper in this area: https://arxiv.org/abs/2403.04871
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data

Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia

This was implemented in Lucene in https://github.com/apache/lucene/pull/14160

Specifically in org.apache.lucene.util.hnsw.FilteredHnswGraphSearcher
that can be used in Solr via org.apache.lucene.search.knn.KnnSearchStrategy.Hnsw

    /**
     * Create a new Hnsw strategy
     *
     * @param filteredSearchThreshold threshold for filtered search, a percentage value from 0 to
     *     100 where 0 means never use filtered search and 100 means always use filtered search.
     */
    public Hnsw(int filteredSearchThreshold) {
      if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
        throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
      }
      this.filteredSearchThreshold = filteredSearchThreshold;
    }
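
To make the percentage semantics of the Javadoc above more concrete, here is a minimal, self-contained sketch of how such a threshold could gate the choice between regular and filtered (ACORN) graph search. The exact rule Lucene applies is not shown in this issue, so the comparison below (use filtered search when the share of documents accepted by the filter is below the threshold) is an assumption, and the class and method names (FilteredSearchGate, useFilteredSearch) are hypothetical, not Lucene API.

```java
/**
 * Hypothetical illustration only: not Lucene code.
 * Shows one way a 0..100 percentage threshold could decide whether
 * a filtered (ACORN-style) graph search is used.
 */
final class FilteredSearchGate {
    private FilteredSearchGate() {}

    /**
     * Assumed rule: use filtered search when the percentage of documents
     * accepted by the filter is strictly below the threshold.
     * With this rule a threshold of 0 never selects filtered search.
     */
    static boolean useFilteredSearch(long acceptedDocs, long totalDocs, int filteredSearchThreshold) {
        // Same range check as the constructor quoted in this issue.
        if (filteredSearchThreshold < 0 || filteredSearchThreshold > 100) {
            throw new IllegalArgumentException("filteredSearchThreshold must be >= 0 and <= 100");
        }
        if (totalDocs <= 0 || acceptedDocs < 0 || acceptedDocs > totalDocs) {
            throw new IllegalArgumentException("expected 0 <= acceptedDocs <= totalDocs and totalDocs > 0");
        }
        // Integer form of: (acceptedDocs / totalDocs) * 100 < filteredSearchThreshold
        return acceptedDocs * 100 < (long) filteredSearchThreshold * totalDocs;
    }

    public static void main(String[] args) {
        // A filter accepting 10% of the documents, with the default threshold of 60.
        System.out.println(useFilteredSearch(100_000, 1_000_000, 60));
        // The same filter with the threshold set to 0 (filtered search disabled).
        System.out.println(useFilteredSearch(100_000, 1_000_000, 0));
    }
}
```

Under this assumed rule, restrictive filters go through the filtered searcher while permissive ones fall back to the regular HNSW search, which is the kind of trade-off this issue should measure for Solr.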

We may add a parameter that lets users disable it (filteredSearchThreshold=0).
The default is:
 public static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;
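
As a rough sketch of what such a parameter could look like on the Solr side, the snippet below resolves an optional threshold from request-style parameters, falling back to the default of 60 and applying the same 0..100 validation as the Lucene constructor. The parameter name "filteredSearchThreshold", the use of a plain Map, and the class FilteredSearchThresholdParam are assumptions made for illustration; no such Solr parameter exists yet. In a real implementation the resolved value would presumably be passed to new KnnSearchStrategy.Hnsw(threshold).

```java
import java.util.Map;

/** Hypothetical illustration only: not Solr code. */
final class FilteredSearchThresholdParam {
    /** Assumed parameter name, chosen to match the Lucene constructor argument. */
    static final String PARAM_NAME = "filteredSearchThreshold";

    /** Default quoted in this issue. */
    static final int DEFAULT_FILTERED_SEARCH_THRESHOLD = 60;

    private FilteredSearchThresholdParam() {}

    /** Returns the threshold to use, or the default when the parameter is absent or blank. */
    static int resolve(Map<String, String> params) {
        String raw = params.get(PARAM_NAME);
        if (raw == null || raw.isBlank()) {
            return DEFAULT_FILTERED_SEARCH_THRESHOLD;
        }
        final int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(PARAM_NAME + " must be an integer, got: " + raw, e);
        }
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException(PARAM_NAME + " must be >= 0 and <= 100");
        }
        return value;
    }

    /** A threshold of 0 means filtered (ACORN) search is never used. */
    static boolean acornDisabled(int threshold) {
        return threshold == 0;
    }

    public static void main(String[] args) {
        int byDefault = resolve(Map.of());
        int disabled = resolve(Map.of(PARAM_NAME, "0"));
        System.out.println("default=" + byDefault + ", explicit=" + disabled
            + ", acornDisabled=" + acornDisabled(disabled));
    }
}
```

Failing fast on out-of-range values keeps the error close to the user's request instead of surfacing later as an IllegalArgumentException from the Lucene constructor.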


> Add parameter to regulate for ACORN-based filtering in vector search?
> ---------------------------------------------------------------------
>
>                 Key: SOLR-17815
>                 URL: https://issues.apache.org/jira/browse/SOLR-17815
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Alessandro Benedetti
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
