yx-keith opened a new pull request, #64377:
URL: https://github.com/apache/doris/pull/64377
…h (IDSelector)
An ANN TopN query that carries a residual column predicate (one that cannot be
resolved by a zonemap, inverted, or bitmap index) currently abandons the ANN
index in _apply_ann_topn_predicate and degrades to an O(N) brute-force distance
scan. This is the largest performance cliff for filtered vector search, and it
also makes recall inconsistent across a distributed query: some segments return
approximate results via the index, while others return exact results via brute
force.
The FAISS-side IDSelector path is already wired up; the gap was that the
column predicate was never evaluated into the candidate bitmap. This PR adds
_eager_filter_predicates_into_bitmap(), which reuses the existing
_read_columns_by_index and vectorized/short-circuit predicate evaluation over
the candidate set, then intersects the survivors back into _row_bitmap. The
narrowed bitmap is then fed to the ANN index as an IDSelector instead of
falling back to brute force.
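The prefilter flow described above can be sketched as follows. This is a language-neutral illustration in Python, not the Doris BE code: `eager_filter_into_bitmap`, `search_with_selector`, and the sample data are hypothetical names, a Python `set` stands in for the row bitmap, and the "index search" is an exact scan over the allowed ids rather than a real FAISS traversal.

```python
import heapq
import math


def eager_filter_into_bitmap(row_bitmap, column_values, predicate):
    """Evaluate the predicate only over candidate rows and intersect the
    survivors back into the candidate bitmap (a set stands in for it)."""
    survivors = {rid for rid in row_bitmap if predicate(column_values[rid])}
    return row_bitmap & survivors


def search_with_selector(vectors, query, k, allowed_ids):
    """Stand-in for an index search restricted by an ID selector.

    A real ANN index would traverse its own structure and skip rows that
    are not in allowed_ids; this stand-in simply scans the allowed rows.
    """
    scored = ((math.dist(vectors[rid], query), rid) for rid in allowed_ids)
    return [rid for _, rid in heapq.nsmallest(k, scored)]


# Hypothetical sample data: row id -> vector, row id -> filter column value.
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 2.0), 3: (3.0, 3.0)}
category = {0: "a", 1: "b", 2: "a", 3: "a"}

row_bitmap = {0, 1, 2, 3}
narrowed = eager_filter_into_bitmap(row_bitmap, category, lambda v: v == "a")
top2 = search_with_selector(vectors, (0.0, 0.0), 2, narrowed)
```

Row 1 is the second-closest vector overall but fails the predicate, so it never reaches the restricted search; only the narrowed id set is handed to the search step.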
The new path is guarded by a new session variable,
enable_ann_topn_predicate_prefilter (default true). A common-expr push-down
(an arbitrary expression filter) still falls back to brute force for now.
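The gating described above amounts to a small decision table, sketched here in Python. The enum, function name, and parameters are hypothetical and only mirror the prose. In particular, treating a disabled session variable as "fall back to brute force" is an assumption based on the previous behavior described earlier, not something read from the patch.

```python
from enum import Enum


class AnnTopnPath(Enum):
    INDEX_ONLY = "index_only"                      # no residual filter at all
    INDEX_WITH_PREFILTER = "index_with_prefilter"  # narrowed bitmap as IDSelector
    BRUTE_FORCE = "brute_force"                    # O(N) distance scan


def choose_ann_topn_path(has_residual_column_predicate,
                         has_common_expr,
                         prefilter_enabled=True):
    """Pick the execution path for one ANN TopN query (illustrative only)."""
    if has_common_expr:
        # Arbitrary expression filters are not covered yet.
        return AnnTopnPath.BRUTE_FORCE
    if not has_residual_column_predicate:
        return AnnTopnPath.INDEX_ONLY
    if prefilter_enabled:
        return AnnTopnPath.INDEX_WITH_PREFILTER
    # Assumption: with the variable off, the old fallback applies.
    return AnnTopnPath.BRUTE_FORCE
```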
Adds regression test ann_topn_predicate_prefilter using the existing
segment_iterator._read_columns_by_index debug point to assert the predicated
TopN query no longer reads the vector column (index-only), and that its top-K
matches the brute-force result.
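The second assertion (top-K equals the brute-force result) can be illustrated with a reference computation like the one below. This is a Python sketch with hypothetical helper names and made-up rows, not the regression suite itself, and it does not model the debug-point check on the vector column.

```python
import math


def brute_force_topk(rows, query, k, predicate):
    """Exact reference: filter rows first, then rank by L2 distance.

    rows is a list of (row_id, attr, vector) tuples.
    """
    kept = [(math.dist(vec, query), rid)
            for rid, attr, vec in rows if predicate(attr)]
    kept.sort()
    return [rid for _, rid in kept[:k]]


def same_topk(candidate_ids, reference_ids):
    """Order-sensitive comparison of two top-K id lists."""
    return list(candidate_ids) == list(reference_ids)


# Hypothetical rows: (row_id, attr, vector).
rows = [
    (10, 5, (0.0, 0.0)),
    (11, 50, (0.1, 0.0)),  # closest after row 10, but fails attr < 10
    (12, 7, (1.0, 1.0)),
    (13, 9, (2.0, 2.0)),
]
reference = brute_force_topk(rows, (0.0, 0.0), 2, lambda attr: attr < 10)
```

An order-sensitive comparison is only stable when distances are distinct; data with ties would need a deterministic tie-break or a set comparison.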
Note: mechanism established by reading the code only; not yet built or
regression-tested. Build + run ann_index_p0 (incl. baseline reproduction on an
unpatched BE) before opening any upstream PR.
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->