github-actions[bot] commented on code in PR #64413:
URL: https://github.com/apache/doris/pull/64413#discussion_r3452053782
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -1001,6 +1024,84 @@ private PlanFragment computePhysicalOlapScan(PhysicalOlapScan olapScan, PlanTran
         return planFragment;
     }
+    private StorageAlignedScanSlots computeStorageAlignedScanSlots(PhysicalOlapScan olapScan) {
+        if (!shouldAlignScanSlotsToStorageSchema(olapScan)) {
+            return new StorageAlignedScanSlots(olapScan.getOutput(), Collections.emptySet());
+        }
+
+        Set<ExprId> outputExprIds = olapScan.getOutput().stream()
+                .map(Slot::getExprId)
+                .collect(Collectors.toSet());
+        Map<Integer, Slot> slotByColumnUniqueId = new HashMap<>();
+        Map<String, Slot> slotByColumnName = new HashMap<>();
+        for (Slot slot : olapScan.getOutput()) {
+            Optional<Column> originalColumn = ((SlotReference) slot).getOriginalColumn();
+            if (originalColumn.isPresent()) {
+                Column column = originalColumn.get();
+                if (column.getUniqueId() == Column.COLUMN_UNIQUE_ID_INIT_VALUE) {
+                    slotByColumnName.put(column.getName(), slot);
+                } else {
+                    slotByColumnUniqueId.put(column.getUniqueId(), slot);
+                }
+            }
+        }
+
+        List<Slot> storageSlots = new ArrayList<>();
+        Set<ExprId> storageExprIds = new HashSet<>();
+        Set<ExprId> extraKeyExprIds = new HashSet<>();
+        long selectedIndexId = olapScan.getSelectedIndexId() == -1
+                ? olapScan.getTable().getBaseIndexId()
+                : olapScan.getSelectedIndexId();
+        for (Column column : olapScan.getTable().getSchemaByIndexId(selectedIndexId, true)) {
+            if (!column.isKey()) {
+                break;
+            }
+            Slot slot = column.getUniqueId() == Column.COLUMN_UNIQUE_ID_INIT_VALUE
+                    ? slotByColumnName.get(column.getName())
Review Comment:
`extraKeyExprIds` can currently never be populated. `outputExprIds` is built
from `olapScan.getOutput()`, and every storage-key `slot` above is looked up in
maps built from that same `olapScan.getOutput()`, so this test is always false
and `extra_key_column_slot_ids` stays empty. The parent `PhysicalProject` later
calls `updateScanSlotsMaterialization()`, which preserves only the slots from
`getExtraKeyColumnSlotIds()` before pruning the scan tuple. For an AGG or
non-MOW UNIQUE table with keys `(k1, k2)` and a query that projects only `k2`,
`k1` can therefore still be removed from the scan tuple even though BE
non-direct reads expand `return_columns` back to all keys for
merge/aggregation. Please mark required storage-key slots relative to the
projected/required output, or otherwise preserve these key slots during scan
materialization pruning, and add a translator test for `Project(OlapScan)` on
`(k1, k2)` with only `k2` projected.
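To illustrate the reasoning, here is a minimal, self-contained sketch. It is
not the Doris code: `SlotStub`, `extraKeysAgainstScanOutput`, and
`extraKeysAgainstRequired` are hypothetical names, plain integers stand in for
`ExprId`, and it assumes the test in question is a membership check of the
slot's `ExprId` against `outputExprIds`. The first method mirrors the reviewed
shape (set and lookup map both derived from the scan output); the second shows
one possible way to compare against the parent's required output instead.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ExtraKeySketch {
    // Hypothetical stand-in for a scan output slot: column name plus expression id.
    static final class SlotStub {
        final String columnName;
        final int exprId;

        SlotStub(String columnName, int exprId) {
            this.columnName = columnName;
            this.exprId = exprId;
        }
    }

    // Mirrors the reviewed shape: the membership set and the lookup map are both
    // built from scanOutput, so any slot found in the map is also in the set.
    static Set<Integer> extraKeysAgainstScanOutput(List<SlotStub> scanOutput, List<String> storageKeys) {
        Set<Integer> outputExprIds = new HashSet<>();
        Map<String, SlotStub> slotByColumnName = new HashMap<>();
        for (SlotStub slot : scanOutput) {
            outputExprIds.add(slot.exprId);
            slotByColumnName.put(slot.columnName, slot);
        }
        Set<Integer> extraKeyExprIds = new LinkedHashSet<>();
        for (String key : storageKeys) {
            SlotStub slot = slotByColumnName.get(key);
            if (slot != null && !outputExprIds.contains(slot.exprId)) {
                extraKeyExprIds.add(slot.exprId); // never executed
            }
        }
        return extraKeyExprIds;
    }

    // One possible alternative (an assumption, not the PR's fix): treat a storage key
    // as "extra" when the parent does not require it.
    static Set<Integer> extraKeysAgainstRequired(List<SlotStub> scanOutput, List<String> storageKeys,
            Set<Integer> requiredExprIds) {
        Map<String, SlotStub> slotByColumnName = new HashMap<>();
        for (SlotStub slot : scanOutput) {
            slotByColumnName.put(slot.columnName, slot);
        }
        Set<Integer> extraKeyExprIds = new LinkedHashSet<>();
        for (String key : storageKeys) {
            SlotStub slot = slotByColumnName.get(key);
            if (slot != null && !requiredExprIds.contains(slot.exprId)) {
                extraKeyExprIds.add(slot.exprId);
            }
        }
        return extraKeyExprIds;
    }

    public static void main(String[] args) {
        List<SlotStub> scanOutput = Arrays.asList(new SlotStub("k1", 1), new SlotStub("k2", 2));
        List<String> storageKeys = Arrays.asList("k1", "k2");
        Set<Integer> required = new HashSet<>(Arrays.asList(2)); // query projects only k2

        System.out.println(extraKeysAgainstScanOutput(scanOutput, storageKeys));
        System.out.println(extraKeysAgainstRequired(scanOutput, storageKeys, required));
    }
}
```

With keys `(k1, k2)` and only `k2` required, the first method yields an empty
set while the second reports `k1`, which is the slot the requested translator
test should see preserved in the scan tuple.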
##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -1270,8 +1151,9 @@ Status SegmentIterator::_extract_common_expr_columns(const VExprSPtr& expr) {
     auto node_type = expr->node_type();
     if (node_type == TExprNodeType::SLOT_REF) {
         auto slot_expr = std::dynamic_pointer_cast<doris::VSlotRef>(expr);
-        _is_common_expr_column[_schema->column_id(slot_expr->column_id())] = true;
-        _common_expr_columns.insert(_schema->column_id(slot_expr->column_id()));
+        auto cid = _schema->column_id(slot_expr->column_id());
+        _is_common_expr_column[cid] = true;
+        _common_expr_columns.insert(cid);
     } else if (node_type == TExprNodeType::VIRTUAL_SLOT_REF) {
Review Comment:
This new no-read path should also honor `enable_no_need_read_data_opt`.
Right now `_need_read_data()` can return `false` via
`_can_skip_reading_extra_column()` before the session kill switch is checked,
and `_prune_column()` will then synthesize defaults for the extra key. The
existing no-read-data paths below and in `_no_need_read_key_data()` both
consult `query_options().enable_no_need_read_data_opt` first, so with this
change, setting `enable_no_need_read_data_opt=false` no longer disables all
default-fill/no-read behavior. Please move the session-variable check before
this extra-column shortcut, or add the same guard inside
`_can_skip_reading_extra_column()`.
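The requested ordering can be modeled with a small sketch. It is written in
Java purely for illustration (the real code is BE C++), and every name in it
(`NoReadGuardSketch`, `needReadData`, and its three boolean parameters) is
hypothetical: the flags simply stand in for the session option, the new
extra-column shortcut, and the pre-existing no-read conditions.

```java
final class NoReadGuardSketch {
    // Hypothetical model of the decision "must this column be read from storage?".
    // The kill switch is consulted first, so disabling the option turns off
    // every no-read shortcut, including the new extra-column one.
    static boolean needReadData(boolean enableNoNeedReadDataOpt,
                                boolean canSkipExtraColumn,
                                boolean otherNoReadPathApplies) {
        if (!enableNoNeedReadDataOpt) {
            return true; // optimization disabled: always read
        }
        if (canSkipExtraColumn) {
            return false; // new shortcut, now behind the guard
        }
        if (otherNoReadPathApplies) {
            return false; // pre-existing no-read paths
        }
        return true;
    }

    public static void main(String[] args) {
        // Option off: the column is read even though the shortcut would apply.
        System.out.println(needReadData(false, true, false));
        // Option on: the shortcut may skip the read.
        System.out.println(needReadData(true, true, false));
    }
}
```

Placing the option check inside the shortcut helper instead would give the same
result in this model; the point is only that no shortcut is evaluated when the
option is off.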
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]