This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new c9e9f6c9451 [fix](ann) Clamp nprobe/efSearch to prevent FAISS errors and recall loss (#64226)
c9e9f6c9451 is described below

commit c9e9f6c9451054bd344af41579b55f00b0870831
Author: yaoxiao <[email protected]>
AuthorDate: Wed Jun 10 16:55:08 2026 +0800

    [fix](ann) Clamp nprobe/efSearch to prevent FAISS errors and recall loss (#64226)
    
    ### What problem does this PR solve?
    
    Problem Summary:
    
    This PR fixes two correctness bugs in ANN vector search parameter handling:
    one silently degrades recall, the other crashes FAISS.
    
    1. **HNSW `efSearch < k` silently drops results (recall loss).**
       In FAISS HNSW the dynamic result set is capped at `efSearch`. When a query
       sets `hnsw_ef_search` smaller than the requested top-k (`LIMIT k`), FAISS
       produces fewer than `k` candidates and fills the rest with `-1` (invalid
       labels). For example, `hnsw_ef_search=1` with `LIMIT 50` returns only a few
       valid rows and the remaining slots are `-1`: recall drops sharply,
       **without any error**.
    
    2. **IVF `nprobe < 1` crashes FAISS.**
       FAISS asserts `nprobe > 0`, so forwarding `nprobe < 1` (e.g. `0`) directly
       to FAISS throws `nprobe > 0 failed`. Note that `nprobe > nlist` is **not**
       a crash: FAISS internally caps `nprobe` at the index's real `nlist`, so
       only the lower bound needs guarding.
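
The padding behaviour in item 1 can be illustrated with a toy model. This is a minimal sketch, not the FAISS implementation: `toy_topk_labels` and `count_valid` are hypothetical names, and the model simply assumes the search keeps at most `ef` candidates before filling a `k`-slot output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (NOT the FAISS implementation): the search keeps at most `ef`
// candidates, then copies them into a k-slot output padded with -1.
std::vector<int64_t> toy_topk_labels(int num_vectors, int ef, int k) {
    assert(num_vectors >= 0 && ef >= 0 && k >= 0);
    const int found = std::min({num_vectors, ef, k});
    std::vector<int64_t> labels(k, -1);  // -1 marks an invalid slot
    for (int i = 0; i < found; ++i) {
        labels[i] = i;  // pretend ids 0..found-1 were the matches
    }
    return labels;
}

// Counts the slots that hold a real label rather than the -1 padding.
int count_valid(const std::vector<int64_t>& labels) {
    return static_cast<int>(std::count_if(
            labels.begin(), labels.end(), [](int64_t id) { return id != -1; }));
}
```

In this model, `ef = 1` with `k = 50` leaves 49 slots at `-1`, while raising `ef` to `std::max(ef, k)` fills all 50. The exact number of valid rows returned by real FAISS may differ from the model.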
    
    ### Reproducing script:
    
    [verify_ann_bugs.sh](https://github.com/user-attachments/files/28745672/verify_ann_bugs.sh)
    
    ### Reproduction results
    <img width="2413" height="394" alt="image" src="https://github.com/user-attachments/assets/69161a4b-87b1-4df7-832c-490af9ff2171" />
    
    ### What is changed
    
    **BE** (`be/src/storage/index/ann/faiss_ann_index.cpp`):
    - `ann_topn_search`: boost `efSearch` to `max(ef_search, k)` so HNSW always
      returns `k` valid results instead of silently padding with `-1`.
    - `ann_topn_search` and `range_search`: guard `nprobe` with `max(nprobe, 1)`
      to prevent the `nprobe > 0` assertion crash. The upper bound is left to
      FAISS, which caps `nprobe` at the index's real `nlist`. (An earlier revision
      clamped against `_params.ivf_nlist`, but that field stays at the default
      1024 for loaded indexes and would wrongly cap `nprobe` for `nlist > 1024`;
      this was caught in review.)
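
The two BE guards, and the upper-bound clamp that was dropped in review, can be sketched as small helpers. The function names here are hypothetical (the commit inlines the `std::max` expressions), and `nprobe_clamped_to_field` only models the rejected earlier revision for comparison.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper names; the commit inlines these expressions.
inline int effective_ef_search(int ef_search, int k) {
    assert(k >= 0);
    return std::max(ef_search, k);  // never search with a pool smaller than k
}

inline int effective_nprobe(int nprobe) {
    return std::max(nprobe, 1);  // lower bound only; FAISS caps the upper bound
}

// Model of the rejected earlier revision: also clamp against a stored nlist
// field, which may be stale (default 1024) for a loaded index.
inline int nprobe_clamped_to_field(int nprobe, int nlist_field) {
    return std::min(std::max(nprobe, 1), nlist_field);
}
```

For an index whose real `nlist` is, say, 4096 but whose stored field is still 1024, `nprobe_clamped_to_field(2048, 1024)` yields 1024, whereas `effective_nprobe(2048)` passes 2048 through unchanged.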
    
    **FE** (`fe/fe-core/.../qe/SessionVariable.java`):
    - Add `checkHnswEfSearch` / `checkIvfNprobe` checkers that reject values `< 1`
      for `hnsw_ef_search` and `ivf_nprobe`, blocking illegal values before they
      reach BE.
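
The FE checkers are written in Java; purely as an illustration in the same language as the BE sketches, the validation logic could be mirrored as below. `check_at_least_one` and `is_rejected` are hypothetical names, and the exception type is an assumption chosen for the sketch, not what the FE throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical C++ mirror of the FE checkers: parse the raw SET value and
// reject anything below 1 with a message that names the variable.
int check_at_least_one(const std::string& name, const std::string& raw) {
    assert(!name.empty());
    const int value = std::stoi(raw);  // throws on non-numeric input
    if (value < 1) {
        throw std::invalid_argument(name + " must be >= 1, got: " + raw);
    }
    return value;
}

// Convenience wrapper: true when the value would be refused.
bool is_rejected(const std::string& name, const std::string& raw) {
    try {
        check_at_least_one(name, raw);
        return false;
    } catch (const std::exception&) {
        return true;
    }
}
```

Under this sketch `is_rejected("ivf_nprobe", "0")` is true and `check_at_least_one("hnsw_ef_search", "1")` returns 1, matching the accept/reject cases the FE unit test covers.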
    
    **Tests:**
    - BE UT: `NprobeClampedToNlist`, `EfSearchLinkedToK`; also fixed
      `CompareResultWithNativeFaiss2` (the native-FAISS reference now uses the
      same `efSearch = max(ef_search, k)` so both sides stay consistent).
    - FE UT: `testAnnSessionVariableChecker` covering both checkers (accept valid
      value, reject `0`).
    - Regression: `ann_index_p0/ann_search_params_clamp.groovy` (+ its `.out`
      baseline) verifying FE rejection, nprobe clamping, and efSearch boosting
      end-to-end.
    
    ### Release note
    
    Fixed ANN vector search parameter handling. HNSW `hnsw_ef_search` is now
    raised to at least the query top-k, so results are no longer silently
    truncated (filled with `-1`) when `hnsw_ef_search` is smaller than `LIMIT`.
    IVF `ivf_nprobe` is guarded against values `< 1`, which previously crashed
    FAISS. Both `hnsw_ef_search` and `ivf_nprobe` now reject values `< 1` at SET
    time.
    
    ### Check List (For Author)
    
    - Test
        - [x] Regression test
        - [x] Unit Test
        - [ ] Manual test (add detailed scripts or steps below)
        - [ ] No need to test or manual test. Explain why:
    
    - Behavior changed:
        - [x] Yes.
            - `hnsw_ef_search` / `ivf_nprobe` values `< 1` are now rejected at
              SET time.
            - At query time `efSearch` is raised to at least `k`, and `nprobe` is
              guarded with `max(nprobe, 1)` (upper bound left to FAISS). Queries
              whose results were previously truncated (filled with `-1`) or that
              crashed FAISS (`nprobe=0`) now return correctly.
    
    - Does this need documentation?
        - [x] No.
    
    ### Check List (For Reviewer who merge this PR)
    
    - [ ] Confirm the release note
    - [ ] Confirm test cases
    - [ ] Confirm document
    - [ ] Add branch pick label
    
    ---------
    
    Co-authored-by: yaoxiao <[email protected]>
---
 be/src/storage/index/ann/faiss_ann_index.cpp       |  15 ++-
 .../storage/index/ann/faiss_vector_index_test.cpp  | 112 ++++++++++++++++++-
 .../java/org/apache/doris/qe/SessionVariable.java  |  18 +++
 .../org/apache/doris/qe/SessionVariablesTest.java  |  29 +++++
 .../data/ann_index_p0/ann_search_params_clamp.out  |   7 ++
 .../ann_index_p0/ann_search_params_clamp.groovy    | 122 +++++++++++++++++++++
 6 files changed, 298 insertions(+), 5 deletions(-)

diff --git a/be/src/storage/index/ann/faiss_ann_index.cpp b/be/src/storage/index/ann/faiss_ann_index.cpp
index 7a8c538a200..bf14128dc78 100644
--- a/be/src/storage/index/ann/faiss_ann_index.cpp
+++ b/be/src/storage/index/ann/faiss_ann_index.cpp
@@ -692,7 +692,8 @@ doris::Status FaissVectorIndex::ann_topn_search(const float* query_vec, int k,
                     "HNSW search parameters should not be null for HNSW index");
         }
         faiss::SearchParametersHNSW* param = new faiss::SearchParametersHNSW();
-        param->efSearch = hnsw_params->ef_search;
+        // efSearch must be >= k to guarantee k results are returned
+        param->efSearch = std::max(hnsw_params->ef_search, k);
         param->check_relative_distance = hnsw_params->check_relative_distance;
         param->bounded_queue = hnsw_params->bounded_queue;
         param->sel = id_sel.get();
@@ -704,7 +705,11 @@ doris::Status FaissVectorIndex::ann_topn_search(const float* query_vec, int k,
                     "IVF search parameters should not be null for IVF index");
         }
         faiss::SearchParametersIVF* param = new faiss::SearchParametersIVF();
-        param->nprobe = ivf_params->nprobe;
+        // FAISS asserts nprobe > 0, so guard the lower bound against nprobe < 1.
+        // Do NOT clamp the upper bound here: FAISS internally caps nprobe at the
+        // index's real nlist, and _params.ivf_nlist is unreliable for loaded indexes
+        // (load() leaves it at the default 1024, see AnnIndexReader::load_index).
+        param->nprobe = std::max(ivf_params->nprobe, 1);
         param->sel = id_sel.get();
         search_param.reset(param);
     } else {
@@ -793,7 +798,11 @@ doris::Status FaissVectorIndex::range_search(const float* query_vec, const float
         {
             // Engine prepare: set search parameters and bind selector
             SCOPED_RAW_TIMER(&result.engine_prepare_ns);
-            param->nprobe = ivf_params->nprobe;
+            // FAISS asserts nprobe > 0, so guard the lower bound against nprobe < 1.
+            // Do NOT clamp the upper bound here: FAISS internally caps nprobe at the
+            // index's real nlist, and _params.ivf_nlist is unreliable for loaded
+            // indexes (load() leaves it at the default 1024, see load_index).
+            param->nprobe = std::max(ivf_params->nprobe, 1);
         }
         search_param.reset(param);
     } else {
diff --git a/be/test/storage/index/ann/faiss_vector_index_test.cpp b/be/test/storage/index/ann/faiss_vector_index_test.cpp
index 60c89951cb5..d60d83ecb17 100644
--- a/be/test/storage/index/ann/faiss_vector_index_test.cpp
+++ b/be/test/storage/index/ann/faiss_vector_index_test.cpp
@@ -407,10 +407,14 @@ TEST_F(VectorSearchTest, CompareResultWithNativeFaiss2) {
         IndexSearchResult doris_results;
         std::ignore = doris_index->ann_topn_search(query_vec, top_k, search_params, doris_results);
 
-        // Search in native Faiss index
+        // Search in native Faiss index using the same efSearch that Doris applies:
+        // ann_topn_search clamps to max(ef_search, k), so native must match.
         std::vector<float> native_distances(top_k, -1);
         std::vector<faiss::idx_t> native_indices(top_k, -1);
-        native_index->search(1, query_vec, top_k, native_distances.data(), native_indices.data());
+        faiss::SearchParametersHNSW native_params;
+        native_params.efSearch = std::max(search_params.ef_search, top_k);
+        native_index->search(1, query_vec, top_k, native_distances.data(), native_indices.data(),
+                             &native_params);
         size_t cnt = std::count_if(native_indices.begin(), native_indices.end(),
                                    [](faiss::idx_t idx) { return idx != -1; });
         for (size_t i = 0; i < cnt; ++i) {
@@ -1539,4 +1543,108 @@ TEST_F(VectorSearchTest, IVFOnDiskConcurrentSearchStampedeProtection) {
     EXPECT_GT(total_misses, 0) << "Expected cache misses from first-thread disk reads";
 }
 
+// NprobeClampedToNlist (name kept for history): after the fix nprobe only gets a
+// lower-bound guard. FAISS asserts nprobe > 0, so nprobe < 1 (e.g. 0) throws
+// "nprobe > 0 failed". The upper bound is intentionally NOT clamped: FAISS caps
+// nprobe at the index's real nlist internally, and _params.ivf_nlist is unreliable
+// after load() (stays at the default 1024). With nlist=4 this test cannot expose the
+// stale-nlist upper-bound bug (that needs nlist>1024); it covers the lower-bound
+// guard and that a huge nprobe is handled safely by FAISS.
+TEST_F(VectorSearchTest, NprobeClampedToNlist) {
+    const int dim = 32;
+    const int nlist = 4;
+    const int num_vectors = 200;
+
+    auto index = std::make_unique<FaissVectorIndex>();
+    FaissBuildParameter params;
+    params.dim = dim;
+    params.ivf_nlist = nlist;
+    params.index_type = FaissBuildParameter::IndexType::IVF;
+    params.quantizer = FaissBuildParameter::Quantizer::FLAT;
+    index->build(params);
+
+    std::vector<float> vecs;
+    vecs.reserve(num_vectors * dim);
+    for (int i = 0; i < num_vectors; i++) {
+        auto v = vector_search_utils::generate_random_vector(dim);
+        vecs.insert(vecs.end(), v.begin(), v.end());
+    }
+    ASSERT_TRUE(index->train(num_vectors, vecs.data()).ok());
+    ASSERT_TRUE(index->add(num_vectors, vecs.data()).ok());
+
+    ASSERT_TRUE(index->save(_ram_dir.get()).ok());
+    auto loaded = std::make_unique<FaissVectorIndex>();
+    loaded->set_type(AnnIndexType::IVF);
+    ASSERT_TRUE(loaded->load(_ram_dir.get()).ok());
+
+    auto roaring = std::make_unique<roaring::Roaring>();
+    for (int i = 0; i < num_vectors; ++i) roaring->add(i);
+
+    auto query = vector_search_utils::generate_random_vector(dim);
+    IndexSearchResult result;
+
+    // nprobe far above nlist: FAISS caps it at the real nlist on its own, so this
+    // is safe with no upper-bound clamp at all. Regression guard for results.
+    IVFSearchParameters search_params;
+    search_params.roaring = roaring.get();
+    search_params.rows_of_segment = num_vectors;
+    search_params.nprobe = 9999;
+    EXPECT_TRUE(loaded->ann_topn_search(query.data(), 5, search_params, result).ok())
+            << "nprobe > nlist is safe (FAISS caps at nlist) and must return results";
+
+    // nprobe = 0 (THIS is the real bug): FAISS asserts nprobe > 0, so without the
+    // lower-bound clamp this would throw "nprobe > 0 failed".
+    IndexSearchResult result2;
+    search_params.nprobe = 0;
+    EXPECT_TRUE(loaded->ann_topn_search(query.data(), 5, search_params, result2).ok())
+            << "nprobe = 0 should succeed after clamping to 1";
+}
+
+// efSearch < k silently returns fewer results; after fix efSearch is
+// boosted to max(ef_search, k) so we always get k valid results.
+TEST_F(VectorSearchTest, EfSearchLinkedToK) {
+    const int dim = 16;
+    const int num_vectors = 200;
+    const int k = 50;
+
+    auto index = std::make_unique<FaissVectorIndex>();
+    FaissBuildParameter params;
+    params.dim = dim;
+    params.max_degree = 16;
+    params.ef_construction = 64;
+    params.index_type = FaissBuildParameter::IndexType::HNSW;
+    index->build(params);
+
+    std::vector<float> vecs;
+    vecs.reserve(num_vectors * dim);
+    for (int i = 0; i < num_vectors; i++) {
+        auto v = vector_search_utils::generate_random_vector(dim);
+        vecs.insert(vecs.end(), v.begin(), v.end());
+    }
+    // HNSW does not need explicit train
+    ASSERT_TRUE(index->add(num_vectors, vecs.data()).ok());
+
+    ASSERT_TRUE(index->save(_ram_dir.get()).ok());
+    auto loaded = std::make_unique<FaissVectorIndex>();
+    loaded->set_type(AnnIndexType::HNSW);
+    ASSERT_TRUE(loaded->load(_ram_dir.get()).ok());
+
+    auto roaring = std::make_unique<roaring::Roaring>();
+    for (int i = 0; i < num_vectors; ++i) roaring->add(i);
+
+    auto query = vector_search_utils::generate_random_vector(dim);
+    IndexSearchResult result;
+
+    HNSWSearchParameters search_params;
+    search_params.roaring = roaring.get();
+    search_params.rows_of_segment = num_vectors;
+    // ef_search deliberately set below k; after fix should be raised to k
+    search_params.ef_search = 1;
+
+    ASSERT_TRUE(loaded->ann_topn_search(query.data(), k, search_params, result).ok());
+    // With the fix, efSearch = max(1, k) = k, so all k results must be valid (no -1 labels)
+    EXPECT_EQ(static_cast<int>(result.roaring->cardinality()), k)
+            << "efSearch boosted to max(ef_search, k): should return exactly k results";
+}
+
 } // namespace doris
diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
index d70401fc69c..febfe56721b 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
@@ -3565,6 +3565,7 @@ public class SessionVariable implements Serializable, Writable {
     public boolean enableAddIndexForNewData = false;
 
     @VarAttrDef.VarAttr(name = HNSW_EF_SEARCH, needForward = true,
+            checker = "checkHnswEfSearch",
             description = {"HNSW 索引的 EF 搜索参数,控制搜索的精度和速度",
                     "HNSW index EF search parameter, controls the precision and speed of the search"})
     public int hnswEFSearch = 32;
@@ -3580,6 +3581,7 @@ public class SessionVariable implements Serializable, Writable {
     public boolean hnswBoundedQueue = true;
 
     @VarAttrDef.VarAttr(name = IVF_NPROBE, needForward = true,
+            checker = "checkIvfNprobe",
             description = {"IVF 索引的 nprobe 参数,控制搜索时访问的聚类数量",
                     "IVF index nprobe parameter, controls the number of clusters to search"})
     public int ivfNprobe = 32;
@@ -6457,6 +6459,22 @@ public class SessionVariable implements Serializable, Writable {
         }
     }
 
+    public void checkHnswEfSearch(String efSearch) {
+        int value = Integer.valueOf(efSearch);
+        if (value < 1) {
+            throw new UnsupportedOperationException(
+                    "hnsw_ef_search must be >= 1, got: " + efSearch);
+        }
+    }
+
+    public void checkIvfNprobe(String nprobe) {
+        int value = Integer.valueOf(nprobe);
+        if (value < 1) {
+            throw new UnsupportedOperationException(
+                    "ivf_nprobe must be >= 1, got: " + nprobe);
+        }
+    }
+
     public boolean getDefaultEnableTypedPathsToSparse() {
         return defaultEnableTypedPathsToSparse;
     }
diff --git a/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java b/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
index 8a8e6f5c641..746c357ef60 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
@@ -289,4 +289,33 @@ public class SessionVariablesTest extends TestWithFeService {
 
         Assertions.assertTrue(sessionVariable.isEnablePreloadExternalMetadata());
     }
+
+    @Test
+    public void testAnnSessionVariableChecker() throws Exception {
+        SessionVariable sv = new SessionVariable();
+
+        // hnsw_ef_search: valid value accepted
+        VariableMgr.setVar(sv, new SetVar(SetType.SESSION, SessionVariable.HNSW_EF_SEARCH,
+                new IntLiteral(1)));
+        Assertions.assertEquals(1, sv.hnswEFSearch);
+
+        // hnsw_ef_search: zero rejected
+        DdlException hnswException = Assertions.assertThrows(DdlException.class,
+                () -> VariableMgr.setVar(sv, new SetVar(SetType.SESSION,
+                        SessionVariable.HNSW_EF_SEARCH, new IntLiteral(0))));
+        Assertions.assertTrue(hnswException.getMessage().contains("hnsw_ef_search must be >= 1"));
+        Assertions.assertEquals(1, sv.hnswEFSearch);
+
+        // ivf_nprobe: valid value accepted
+        VariableMgr.setVar(sv, new SetVar(SetType.SESSION, SessionVariable.IVF_NPROBE,
+                new IntLiteral(2)));
+        Assertions.assertEquals(2, sv.ivfNprobe);
+
+        // ivf_nprobe: zero rejected
+        DdlException nprobeException = Assertions.assertThrows(DdlException.class,
+                () -> VariableMgr.setVar(sv, new SetVar(SetType.SESSION,
+                        SessionVariable.IVF_NPROBE, new IntLiteral(0))));
+        Assertions.assertTrue(nprobeException.getMessage().contains("ivf_nprobe must be >= 1"));
+        Assertions.assertEquals(2, sv.ivfNprobe);
+    }
 }
diff --git a/regression-test/data/ann_index_p0/ann_search_params_clamp.out b/regression-test/data/ann_index_p0/ann_search_params_clamp.out
new file mode 100644
index 00000000000..a6c38a1696b
--- /dev/null
+++ b/regression-test/data/ann_index_p0/ann_search_params_clamp.out
@@ -0,0 +1,7 @@
+-- This file is automatically generated. You should know what you did if you want to edit this
+-- !nprobe_clamped_returns_10 --
+10
+
+-- !ef_search_boosted_to_k --
+50
+
diff --git a/regression-test/suites/ann_index_p0/ann_search_params_clamp.groovy b/regression-test/suites/ann_index_p0/ann_search_params_clamp.groovy
new file mode 100644
index 00000000000..b58f845872d
--- /dev/null
+++ b/regression-test/suites/ann_index_p0/ann_search_params_clamp.groovy
@@ -0,0 +1,122 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// Verify that ANN search params are guarded correctly:
+//   1. FE checker rejects ivf_nprobe=0 and hnsw_ef_search=0
+//   2. BE guards nprobe with max(nprobe, 1): nprobe < 1 would crash FAISS (asserts
+//      nprobe > 0). nprobe > nlist needs no clamp; FAISS caps it at the real nlist.
+//   3. BE boosts efSearch to max(ef_search, k) so LIMIT k always returns k results
+
+suite("ann_search_params_clamp", "nonConcurrent") {
+    sql "set enable_common_expr_pushdown=true;"
+    sql "set enable_ann_index_result_cache=false;"
+
+    // -----------------------------------------------------------------------
+    // 1. FE checker: zero values must be rejected
+    // -----------------------------------------------------------------------
+    test {
+        sql "set ivf_nprobe=0"
+        exception "ivf_nprobe must be >= 1"
+    }
+
+    test {
+        sql "set hnsw_ef_search=0"
+        exception "hnsw_ef_search must be >= 1"
+    }
+
+    // -----------------------------------------------------------------------
+    // 2. nprobe > nlist is safe in FAISS itself (it caps nprobe at the real nlist),
+    //    so this case passes with or without any BE-side handling; it just guards that
+    //    a huge nprobe still returns results. (The real crash case, nprobe < 1, is
+    //    rejected by the FE checker above, so it never reaches BE; BE UT covers the
+    //    BE-side lower-bound guard.)
+    //    Table: IVF nlist=8, 400 rows (>= 39*nlist training threshold)
+    //    Query: nprobe=99999 -> capped to 8 -> returns results normally
+    // -----------------------------------------------------------------------
+    sql "drop table if exists tbl_ann_nprobe_clamp"
+    sql """
+        CREATE TABLE tbl_ann_nprobe_clamp (
+            id INT NOT NULL,
+            v ARRAY<FLOAT> NOT NULL,
+            INDEX idx_v (v) USING ANN PROPERTIES(
+                "index_type" = "ivf",
+                "metric_type" = "l2_distance",
+                "nlist" = "8",
+                "dim" = "4"
+            )
+        ) ENGINE=OLAP
+        DUPLICATE KEY(id)
+        DISTRIBUTED BY HASH(id) BUCKETS 1
+        PROPERTIES ("replication_num" = "1", "disable_auto_compaction" = "true");
+    """
+
+    def rows = (1..400).collect { i ->
+        "(${i}, [${i}.0, ${i * 2}.0, ${i * 3}.0, ${i * 4}.0])"
+    }
+    sql "INSERT INTO tbl_ann_nprobe_clamp VALUES ${rows.join(',')};"
+    sql "sync"
+
+    sql "set ivf_nprobe=99999;"
+    // Capped to nlist=8 (FAISS-safe even without the clamp); must return 10 rows
+    qt_nprobe_clamped_returns_10 """
+        SELECT count(*) FROM (
+            SELECT id FROM tbl_ann_nprobe_clamp
+            ORDER BY l2_distance_approximate(v, [1.0, 2.0, 3.0, 4.0])
+            LIMIT 10
+        ) t;
+    """
+
+    // -----------------------------------------------------------------------
+    // 3. efSearch boosted to max(ef_search, k): LIMIT k must return k rows
+    //    Table: HNSW, 200 rows; ef_search=1 with LIMIT 50
+    //    Before fix: FAISS explores 1 candidate, returns 1 result (others -1)
+    //    After fix:  efSearch = max(1, 50) = 50, returns exactly 50 results
+    // -----------------------------------------------------------------------
+    sql "drop table if exists tbl_ann_ef_search_k"
+    sql """
+        CREATE TABLE tbl_ann_ef_search_k (
+            id INT NOT NULL,
+            v ARRAY<FLOAT> NOT NULL,
+            INDEX idx_v (v) USING ANN PROPERTIES(
+                "index_type" = "hnsw",
+                "metric_type" = "l2_distance",
+                "max_degree" = "16",
+                "ef_construction" = "64",
+                "dim" = "4"
+            )
+        ) ENGINE=OLAP
+        DUPLICATE KEY(id)
+        DISTRIBUTED BY HASH(id) BUCKETS 1
+        PROPERTIES ("replication_num" = "1", "disable_auto_compaction" = "true");
+    """
+
+    def rows2 = (1..200).collect { i ->
+        "(${i}, [${i}.0, ${i * 2}.0, ${i * 3}.0, ${i * 4}.0])"
+    }
+    sql "INSERT INTO tbl_ann_ef_search_k VALUES ${rows2.join(',')};"
+    sql "sync"
+
+    sql "set hnsw_ef_search=1;"
+    // efSearch boosted to max(1, 50)=50; must return exactly 50 rows
+    qt_ef_search_boosted_to_k """
+        SELECT count(*) FROM (
+            SELECT id FROM tbl_ann_ef_search_k
+            ORDER BY l2_distance_approximate(v, [1.0, 2.0, 3.0, 4.0])
+            LIMIT 50
+        ) t;
+    """
+}

