Copilot commented on code in PR #64243:
URL: https://github.com/apache/doris/pull/64243#discussion_r3379763912
##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -1056,15 +1056,20 @@ Status SegmentIterator::_apply_ann_topn_predicate() {
size_t pre_size = _row_bitmap.cardinality();
size_t rows_of_segment = _segment->num_rows();
- if (static_cast<double>(pre_size) < static_cast<double>(rows_of_segment) *
0.3) {
+ DORIS_CHECK(_opts.runtime_state != nullptr);
+ const auto user_params = _opts.runtime_state->get_vector_search_params();
+ if (user_params.should_fallback_ann_index_by_small_candidate(pre_size,
rows_of_segment)) {
Review Comment:
`StorageReadOptions::runtime_state` is explicitly nullable
(iterators.h:132-134), and this method already handles a null runtime_state
later when computing `enable_ann_index_result_cache`. The new
`DORIS_CHECK(_opts.runtime_state != nullptr)` will hard-abort BE in any call
path that sets `ann_topn_runtime` but not `runtime_state` (or in future
refactors/tests). Prefer falling back to default `VectorSearchUserParams` when
runtime_state is absent so the previous default behavior (0.3 ratio, 0 absolute
threshold) is preserved without crashing.
##########
fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:
##########
@@ -3584,6 +3588,38 @@ public boolean isEnableESParallelScroll() {
"IVF index nprobe parameter, controls the number of
clusters to search"})
public int ivfNprobe = 32;
+ @VarAttrDef.VarAttr(name = ANN_INDEX_CANDIDATE_ROWS_THRESHOLD, needForward
= true,
+ checker = "checkAnnIndexCandidateRowsThreshold",
+ description = {"Skip ANN index when candidate rows before ANN
search are less "
+ + "than this threshold. 0 disables the absolute row
threshold",
+ "Skip ANN index when candidate rows before ANN search are
less "
+ + "than this threshold. 0 disables the absolute
row threshold"})
+ public long annIndexCandidateRowsThreshold = 0;
+
+ @VarAttrDef.VarAttr(name = ANN_INDEX_CANDIDATE_ROWS_PERCENT_THRESHOLD,
needForward = true,
+ checker = "checkAnnIndexCandidateRowsPercentThreshold",
+ description = {"Skip ANN index when candidate row ratio before ANN
search is less "
+ + "than this threshold",
+ "Skip ANN index when candidate row ratio before ANN search
is less "
+ + "than this threshold"})
+ public double annIndexCandidateRowsPercentThreshold = 0.3;
+
+ public void checkAnnIndexCandidateRowsThreshold(String value) {
+ long threshold = Long.parseLong(value);
+ if (threshold < 0) {
+ throw new InvalidParameterException(
+ ANN_INDEX_CANDIDATE_ROWS_THRESHOLD + " should be greater
than or equal to 0");
+ }
+ }
+
+ public void checkAnnIndexCandidateRowsPercentThreshold(String value) {
+ double threshold = Double.parseDouble(value);
+ if (threshold < 0 || threshold > 1) {
+ throw new InvalidParameterException(
+ ANN_INDEX_CANDIDATE_ROWS_PERCENT_THRESHOLD + " should be
between 0 and 1");
+ }
+ }
Review Comment:
`checkAnnIndexCandidateRowsPercentThreshold` currently allows `NaN` (e.g.
`SET ann_index_candidate_rows_percent_threshold = 'NaN'`) because `NaN < 0` and
`NaN > 1` are both false. That silently disables the percent-threshold logic in
BE (`NaN > 0` is also false) and makes the variable behave unexpectedly. Reject
non-finite values explicitly.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]