github-actions[bot] commented on code in PR #63389:
URL: https://github.com/apache/doris/pull/63389#discussion_r3295767924
##########
be/src/format/parquet/vparquet_reader.cpp:
##########
@@ -1256,6 +1272,85 @@ Status ParquetReader::_process_page_index_filter(
}
RowRanges::ranges_intersection(*candidate_row_ranges, tmp_row_range,
candidate_row_ranges);
}
+ RETURN_IF_ERROR(_process_expr_zonemap_page_filter(&cached_page_index,
+ candidate_row_ranges));
+ return Status::OK();
Review Comment:
This page-level expr-zonemap pruning is unreachable for expression-only
predicates because `_process_page_index_filter()` returns earlier when
`push_down_pred.empty()` (lines 1159-1162). A query such as `WHERE
starts_with(col, 'z')` can populate `_lazy_read_ctx.conjuncts` for expr-zonemap
evaluation while producing no ordinary Parquet `push_down_pred`; in that case
the function calls `read_whole_row_group()` and returns before this new call.
Please move the expr-zonemap page filtering so that it runs whenever
page-index/min-max filtering is enabled and `_lazy_read_ctx.conjuncts` is
non-empty, not only after ordinary pushdown predicates have already run.
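To illustrate the control flow described above, here is a minimal, self-contained sketch. It is not the Doris code: `FilterInputs`, `FilterTrace`, and both functions are hypothetical stand-ins that only model which steps run, under the assumption that the early return behaves as described in this comment.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the reader state; these are NOT the Doris types.
struct FilterInputs {
    std::vector<std::string> push_down_pred;  // ordinary Parquet pushdown predicates
    std::vector<std::string> conjuncts;       // expressions usable for expr-zonemap evaluation
    bool page_index_enabled = true;
};

// Records which steps ran so the two shapes can be compared.
struct FilterTrace {
    bool ran_pushdown_filter = false;
    bool ran_expr_zonemap_filter = false;
    bool read_whole_row_group = false;
};

// Shape described in the comment: the early return on an empty
// push_down_pred makes the later expr-zonemap call unreachable.
FilterTrace process_page_index_filter_current(const FilterInputs& in) {
    FilterTrace t;
    if (!in.page_index_enabled || in.push_down_pred.empty()) {
        t.read_whole_row_group = true;
        return t;
    }
    t.ran_pushdown_filter = true;
    t.ran_expr_zonemap_filter = !in.conjuncts.empty();
    return t;
}

// One possible shape of the fix: fall back to reading the whole row group
// only when neither kind of predicate is available.
FilterTrace process_page_index_filter_suggested(const FilterInputs& in) {
    FilterTrace t;
    const bool has_pushdown = !in.push_down_pred.empty();
    const bool has_expr = !in.conjuncts.empty();
    if (!in.page_index_enabled || (!has_pushdown && !has_expr)) {
        t.read_whole_row_group = true;
        return t;
    }
    t.ran_pushdown_filter = has_pushdown;
    t.ran_expr_zonemap_filter = has_expr;
    return t;
}
```

With an expression-only input (empty `push_down_pred`, non-empty `conjuncts`), the first function reads the whole row group and never reaches expr-zonemap filtering, while the second still runs it.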
##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -1248,6 +1271,17 @@ Status SegmentIterator::_get_row_ranges_from_conditions(RowRanges* condition_row
_opts.stats->rows_stats_filtered += (pre_size -
condition_row_ranges->count());
}
+ {
+ SCOPED_RAW_TIMER(&_opts.stats->generate_row_ranges_by_zonemap_ns);
+ if (!_common_expr_ctxs_push_down.empty()) {
+ const auto pre_expr_zonemap_size = condition_row_ranges->count();
+ RETURN_IF_ERROR(_apply_expr_zonemap_to_row_ranges(_common_expr_ctxs_push_down, 0,
+ condition_row_ranges));
Review Comment:
This page-level expr-zonemap call is currently gated by the caller's
condition at lines 1000-1002, which only invokes
`_get_row_ranges_from_conditions()` when there are topn filters, column
predicates, or delete predicates. For the main new case, e.g. `WHERE
starts_with(v, 'm')`, `_common_expr_ctxs_push_down` is non-empty but those
other predicate collections can all be empty; if the segment-level zonemap does
not eliminate the whole segment, page-level expr pruning is skipped entirely.
Please include `_common_expr_ctxs_push_down` in the caller's condition or
otherwise invoke this page-pruning block for expression-only predicates.
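As a rough sketch of the first option, the caller's gate could be widened as follows. The actual condition at lines 1000-1002 is not shown in this diff, so `SegmentPredicates` and both functions below are hypothetical and only model the condition as described in this comment.

```cpp
#include <cstddef>

// Hypothetical summary of the predicate collections named in the comment;
// the counts stand in for the real containers held by SegmentIterator.
struct SegmentPredicates {
    std::size_t topn_filters = 0;
    std::size_t column_predicates = 0;
    std::size_t delete_predicates = 0;
    std::size_t common_expr_ctxs_push_down = 0;
};

// Gate as described: row ranges are computed only when one of the
// three "classic" predicate collections is non-empty.
bool should_get_row_ranges_current(const SegmentPredicates& p) {
    return p.topn_filters > 0 || p.column_predicates > 0 || p.delete_predicates > 0;
}

// Suggested gate: pushed-down common expressions also trigger the call,
// so the page-level expr-zonemap block is reached for expression-only queries.
bool should_get_row_ranges_suggested(const SegmentPredicates& p) {
    return should_get_row_ranges_current(p) || p.common_expr_ctxs_push_down > 0;
}
```

For a query such as `WHERE starts_with(v, 'm')`, where only `common_expr_ctxs_push_down` is non-zero, the first gate returns false and the second returns true.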
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]