Re: [PR] [improvement](query) Align olap scan schema with storage keys [doris]

via GitHub Thu, 11 Jun 2026 09:20:24 -0700


github-actions[bot] commented on code in PR #64413:
URL: https://github.com/apache/doris/pull/64413#discussion_r3397485967



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -964,6 +978,14 @@ private PlanFragment 
computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         context.addScanNode(olapScanNode, olapScan);
 
         translateRuntimeFilter(olapScan, olapScanNode, context);
+        if (!storageAlignedScanSlots.filledKeyExprIds.isEmpty()) {

Review Comment:
   This projection remaps every output `ExprId` in `PlanTranslatorContext` to 
the new projection tuple slot ids. That breaks 
`visitPhysicalLazyMaterializeOlapScan`: after 
`computePhysicalOlapScan(lazyScan.getScan(), context)` returns, it computes 
`scanIds` with `context.findSlotRef(...)` and then prunes 
`olapScanNode.getTupleDesc()` (the original scan tuple) using those ids. When a 
filled storage key is added for a non-MOW UNIQUE scan and another value column 
is lazy-materialized, the ids in `scanIds` are projection tuple ids, so the 
real scan tuple slots are removed while the project list still points at the 
old scan slots. Please avoid this projection/remap for the lazy-materialized 
scan path, or preserve/use the original scan tuple slot mapping when pruning.



##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -3159,32 +3037,10 @@ Status SegmentIterator::_process_eof(Block* block) {
 
 Status SegmentIterator::_process_common_expr(uint16_t* sel_rowid_idx, 
uint16_t& selected_size,
                                              Block* block) {
-    // Here we just use col0 as row_number indicator. when reach here, we will 
calculate the predicates first.
-    //  then use the result to reduce our data read(that is, expr push down). 
there's now row in block means the first
-    //  column is not in common expr. so it's safe to replace it temporarily 
to provide correct `selected_size`.
     VLOG_DEBUG << fmt::format("Execute common expr. block rows {}, selected 
size {}", block->rows(),
                               _selected_size);
 
-    bool need_mock_col = block->rows() != selected_size;
-    MutableColumnPtr col0;
-    if (need_mock_col) {
-        col0 = std::move(*block->get_by_position(0).column).mutate();
-        block->replace_by_position(
-                0, 
block->get_by_position(0).type->create_column_const_with_default_value(
-                           _selected_size));
-    }
-
-    std::vector<VExprContext*> common_ctxs;
-    common_ctxs.reserve(_common_expr_ctxs_push_down.size());
-    for (auto& ctx : _common_expr_ctxs_push_down) {
-        common_ctxs.push_back(ctx.get());
-    }
-    _output_index_result_column(common_ctxs, _sel_rowid_idx.data(), 
_selected_size, block);
-    RETURN_IF_ERROR(_execute_common_expr(_sel_rowid_idx.data(), 
_selected_size, block));

Review Comment:
   This drops rows for common-expression predicates when the first block column 
has not been materialized. `_process_common_expr()` can still be called with 
`selected_size > 0` while `block->rows() == 0`: for example, when the common 
expr reads a later column and column 0 is neither a predicate/common-expr 
column nor otherwise output before this point. The removed code immediately 
above this hunk installed a temporary column 0 exactly to give the block 
`selected_size` rows. In release builds this new code builds an empty filter 
from `block->rows()`, `_evaluate_common_expr_filter(..., selected_size, 
filter)` returns 0, and the whole batch is silently filtered out. Please 
preserve a `selected_size`-sized row-count column, or otherwise make the 
conjunct execution/filtering operate on `selected_size` rows instead of 
`block->rows()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [improvement](query) Align olap scan schema with storage keys [doris]

Reply via email to