github-actions[bot] commented on code in PR #63389:
URL: https://github.com/apache/doris/pull/63389#discussion_r3295767924


##########
be/src/format/parquet/vparquet_reader.cpp:
##########
@@ -1256,6 +1272,85 @@ Status ParquetReader::_process_page_index_filter(
         }
         RowRanges::ranges_intersection(*candidate_row_ranges, tmp_row_range, candidate_row_ranges);
     }
+    RETURN_IF_ERROR(_process_expr_zonemap_page_filter(&cached_page_index, candidate_row_ranges));
+    return Status::OK();

Review Comment:
   This page-level expr-zonemap pruning is unreachable for expression-only 
predicates because `_process_page_index_filter()` returns earlier when 
`push_down_pred.empty()` (lines 1159-1162). A query such as `WHERE 
starts_with(col, 'z')` can populate `_lazy_read_ctx.conjuncts` for expr-zonemap 
evaluation while producing no ordinary Parquet `push_down_pred`; in that case 
the function calls `read_whole_row_group()` and returns before this new call. 
   Please move the expr-zonemap page filtering so that it runs whenever 
page-index/min-max filtering is enabled and `_lazy_read_ctx.conjuncts` is 
non-empty, not only after ordinary pushdown predicates have run.
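
   A minimal, self-contained sketch of the gating being asked for is shown 
below. It only models the decision; `PageFilterInputs`, `PageFilterPlan`, and 
`plan_page_filter` are hypothetical names made up for illustration and do not 
exist in `vparquet_reader.cpp`, and the real function has more inputs than 
this.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for reader state; not the real Doris types.
struct PageFilterInputs {
    bool page_index_enabled = true;
    std::vector<std::string> push_down_pred;  // ordinary Parquet pushdown predicates
    std::vector<std::string> conjuncts;       // expressions usable for expr-zonemap evaluation
};

struct PageFilterPlan {
    bool run_ordinary_filter = false;
    bool run_expr_zonemap_filter = false;
    bool read_whole_row_group = false;
};

// Decide which page-level filters run. The early exit fires only when there is
// nothing at all to evaluate, so an expression-only predicate still reaches
// the expr-zonemap pass instead of falling back to reading the whole row group.
PageFilterPlan plan_page_filter(const PageFilterInputs& in) {
    PageFilterPlan plan;
    const bool nothing_to_evaluate = in.push_down_pred.empty() && in.conjuncts.empty();
    if (!in.page_index_enabled || nothing_to_evaluate) {
        plan.read_whole_row_group = true;
        return plan;
    }
    plan.run_ordinary_filter = !in.push_down_pred.empty();
    plan.run_expr_zonemap_filter = !in.conjuncts.empty();
    return plan;
}
```

   With only `conjuncts` populated, this plan runs the expr-zonemap filter and 
skips the ordinary one; with both collections empty it reads the whole row 
group, as before.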



##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -1248,6 +1271,17 @@ Status SegmentIterator::_get_row_ranges_from_conditions(RowRanges* condition_row
         _opts.stats->rows_stats_filtered += (pre_size - condition_row_ranges->count());
     }
 
+    {
+        SCOPED_RAW_TIMER(&_opts.stats->generate_row_ranges_by_zonemap_ns);
+        if (!_common_expr_ctxs_push_down.empty()) {
+            const auto pre_expr_zonemap_size = condition_row_ranges->count();
+            RETURN_IF_ERROR(_apply_expr_zonemap_to_row_ranges(_common_expr_ctxs_push_down, 0,
+                                                              condition_row_ranges));

Review Comment:
   This page-level expr-zonemap call is currently gated by the caller's 
condition at lines 1000-1002, which only invokes 
`_get_row_ranges_from_conditions()` when there are topn filters, column 
predicates, or delete predicates. For the main new case, e.g. `WHERE 
starts_with(v, 'm')`, `_common_expr_ctxs_push_down` is non-empty but those 
other predicate collections can all be empty; if the segment-level zonemap does 
not eliminate the whole segment, page-level expr pruning is skipped entirely. 
Please include `_common_expr_ctxs_push_down` in the caller's condition or 
otherwise invoke this page-pruning block for expression-only predicates.
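
   One way to express the widened caller condition is sketched below. 
`PredicateCounts` and `should_compute_condition_row_ranges` are hypothetical 
names used only for illustration; the actual check at lines 1000-1002 tests 
the iterator's own members, which are not reproduced here.

```cpp
#include <cstddef>

// Hypothetical summary of the predicate collections the caller inspects.
struct PredicateCounts {
    std::size_t topn_filters = 0;
    std::size_t column_predicates = 0;
    std::size_t delete_predicates = 0;
    std::size_t common_expr_ctxs_push_down = 0;  // expression-only predicates
};

// Returns true when row ranges should be computed from conditions. The last
// term is the addition: pushed-down common expressions alone are enough to
// enter the page-level pruning path.
bool should_compute_condition_row_ranges(const PredicateCounts& p) {
    return p.topn_filters > 0 || p.column_predicates > 0 || p.delete_predicates > 0 ||
           p.common_expr_ctxs_push_down > 0;
}
```

   Under this condition a query with only `starts_with(v, 'm')` still enters 
the row-range computation, so the new block is reached.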



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

