github-actions[bot] commented on code in PR #64226:
URL: https://github.com/apache/doris/pull/64226#discussion_r3381421600


##########
be/src/storage/index/ann/faiss_ann_index.cpp:
##########
@@ -704,7 +705,10 @@ doris::Status FaissVectorIndex::ann_topn_search(const 
float* query_vec, int k,
                     "IVF search parameters should not be null for IVF index");
         }
         faiss::SearchParametersIVF* param = new faiss::SearchParametersIVF();
-        param->nprobe = ivf_params->nprobe;
+        // FAISS asserts nprobe > 0, so the lower bound guards against a crash 
on
+        // nprobe < 1. nprobe > nlist is harmless (FAISS caps it at nlist 
internally);
+        // we clamp the upper bound only to keep the value semantically 
meaningful.

Review Comment:
   This upper-bound clamp is using `_params.ivf_nlist`, but production-loaded 
IVF indexes do not restore that field. `AnnIndexReader::load_index()` creates a 
fresh `FaissVectorIndex`, calls `set_metric()`/`set_type()`, and then `load()`; 
`load()` installs the FAISS index but leaves `_params.ivf_nlist` at its default 
`1024`. For an IVF or IVF_ON_DISK index created with `nlist=4096`, `SET 
ivf_nprobe=4096` now sends only 1024 probes to FAISS, reducing recall compared 
with the previous behavior where FAISS used/capped against the real nlist. The 
new save/load test with `nlist=4` also does not catch this because FAISS caps 
the still-forwarded 1024 internally. Please derive the upper bound from the 
loaded `faiss::IndexIVF::nlist` (or restore `_params.ivf_nlist` during 
`load()`) before clamping; the same issue applies to the `range_search` clamp 
below.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to