github-actions[bot] commented on code in PR #64413:
URL: https://github.com/apache/doris/pull/64413#discussion_r3397485967
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -964,6 +978,14 @@ private PlanFragment
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
context.addScanNode(olapScanNode, olapScan);
translateRuntimeFilter(olapScan, olapScanNode, context);
+ if (!storageAlignedScanSlots.filledKeyExprIds.isEmpty()) {
Review Comment:
This projection remaps every output `ExprId` in `PlanTranslatorContext` to
the new projection tuple slot ids. That breaks
`visitPhysicalLazyMaterializeOlapScan`: after
`computePhysicalOlapScan(lazyScan.getScan(), context)` returns, it computes
`scanIds` with `context.findSlotRef(...)` and then prunes
`olapScanNode.getTupleDesc()` (the original scan tuple) using those ids. When a
filled storage key is added for a non-MOW UNIQUE scan and another value column
is lazy-materialized, the ids in `scanIds` are projection tuple ids, so the
real scan tuple slots are removed while the project list still points at the
old scan slots. Please avoid this projection/remap for the lazy-materialized
scan path, or preserve/use the original scan tuple slot mapping when pruning.
##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -3159,32 +3037,10 @@ Status SegmentIterator::_process_eof(Block* block) {
Status SegmentIterator::_process_common_expr(uint16_t* sel_rowid_idx,
uint16_t& selected_size,
Block* block) {
- // Here we just use col0 as row_number indicator. when reach here, we will
calculate the predicates first.
- // then use the result to reduce our data read(that is, expr push down).
there's now row in block means the first
- // column is not in common expr. so it's safe to replace it temporarily
to provide correct `selected_size`.
VLOG_DEBUG << fmt::format("Execute common expr. block rows {}, selected
size {}", block->rows(),
_selected_size);
- bool need_mock_col = block->rows() != selected_size;
- MutableColumnPtr col0;
- if (need_mock_col) {
- col0 = std::move(*block->get_by_position(0).column).mutate();
- block->replace_by_position(
- 0,
block->get_by_position(0).type->create_column_const_with_default_value(
- _selected_size));
- }
-
- std::vector<VExprContext*> common_ctxs;
- common_ctxs.reserve(_common_expr_ctxs_push_down.size());
- for (auto& ctx : _common_expr_ctxs_push_down) {
- common_ctxs.push_back(ctx.get());
- }
- _output_index_result_column(common_ctxs, _sel_rowid_idx.data(),
_selected_size, block);
- RETURN_IF_ERROR(_execute_common_expr(_sel_rowid_idx.data(),
_selected_size, block));
Review Comment:
This drops rows for common-expression predicates when the first block column
has not been materialized. `_process_common_expr()` can still be called with
`selected_size > 0` while `block->rows() == 0`: for example, when the common
expr reads a later column and column 0 is neither a predicate/common-expr
column nor otherwise output before this point. The removed code immediately
above this hunk installed a temporary column 0 exactly to give the block
`selected_size` rows. In release builds this new code builds an empty filter
from `block->rows()`, `_evaluate_common_expr_filter(..., selected_size,
filter)` returns 0, and the whole batch is silently filtered out. Please
preserve a `selected_size`-sized row-count column, or otherwise make the
conjunct execution/filtering operate on `selected_size` rows instead of
`block->rows()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]