github-actions[bot] commented on PR #61813:
URL: https://github.com/apache/doris/pull/61813#issuecomment-4141423278

   ## Code Review Summary
   
   ### Overview
   
   This PR fixes a bug where using a scalar subquery as the value in a simple 
CASE expression (e.g., `CASE (SELECT ...) WHEN val THEN ...`) caused a 
"groupExpression already exists in memo" planning error due to duplicate 
`RelationId`s created by duplicating the subquery expression at parse time.
   
   The fix correctly defers the `EqualTo` wrapping from parse time 
(`LogicalPlanBuilder`) to analysis time (`ExpressionAnalyzer`), ensuring the 
subquery is analyzed exactly once.
   
   ### Critical Checkpoint Conclusions
   
   **Goal achievement:** The code achieves its stated goal. The simple-case 
value expression is now stored as-is in `CaseWhen`, and 
`ExpressionAnalyzer.visitCaseWhen` constructs `EqualTo(analyzedValue, 
analyzedOperand)` per WhenClause during analysis, producing a standard 
searched-case `CaseWhen`. The test coverage proves correctness.
   
   **Modification size and focus:** Well-scoped -- only 3 files modified for 
the fix, plus 3 files for tests. Each change is minimal and focused.
   
   **Concurrency:** Not applicable -- no concurrent access patterns introduced.
   
   **Lifecycle management:** The `value` field is transient: present only 
between parsing and analysis. After analysis, the simple-case form is desugared 
into searched-case form. All downstream rules (rewrite phase) only ever see 
searched-case `CaseWhen` nodes. This is safe because analysis always completes 
before rewriting begins (`NereidsPlanner.analyze()` runs before `rewrite()`).
   
   **Configuration items:** None added.
   
   **Incompatible changes:** None -- the `value` field is internal to the 
Nereids expression tree, consumed during analysis, and never serialized to 
Thrift or sent to BE. Post-analysis behavior is identical to before.
   
   **Parallel code paths:** Verified. The searched-case path 
(`visitSearchedCase`) in `LogicalPlanBuilder` is unmodified and still 
constructs `CaseWhen` without a value. All downstream CaseWhen rewrite rules 
(`FoldConstantRuleOnFE`, `CaseWhenToIf`, `CaseWhenToCompoundPredicate`, 
`NestedCaseWhenCondToLiteral`, `PushIntoCaseWhenBranch`, `ConditionRewrite`, 
`ExpressionTranslator`) only use `getWhenClauses()` and `getDefaultValue()` -- 
never `getValue()` -- and only ever construct searched-case `CaseWhen` objects. 
This is correct since they run in the rewrite phase after analysis.
   
   **Special conditional checks:** The `withChildren` method uses `instanceof 
WhenClause` to distinguish value/whenClause/defaultValue in the children list. 
This is sound -- the children layout `[value?, WhenClause+, defaultValue?]` is 
unambiguous because the value (if present) is always a non-WhenClause 
expression before any WhenClauses.
   
   **Test coverage:** Comprehensive. 7 unit tests covering constructor 
semantics, `withChildren` round-trip, `toSql` output, parsing, subquery 
planning (the key bug scenario), searched-case regression, and column-reference 
simple case. 7 regression test cases covering the exact bug reproduction, 
subquery with/without ELSE, column references, nested cases, and searched-case 
baseline. Uses `order_qt_` for deterministic output. Tables dropped before use. 
Table names hardcoded.
   
   **Observability:** Not applicable -- no new critical paths requiring logging 
or metrics.
   
   **Transaction/persistence:** Not applicable.
   
   **Data writes:** Not applicable.
   
   **FE-BE variable passing:** Not applicable -- the `value` field is consumed 
during FE-internal analysis and never reaches BE.
   
   **Performance:** No performance concerns. The analysis-time `EqualTo` 
construction is equivalent work to what was done at parse time before, just 
deferred. No redundant copies or allocations.
   
   **Other observations:**
   - The `getDataType()` change from `child(0).getDataType()` to 
`whenClauses.get(0).getDataType()` is a necessary and correct fix -- with the 
new children layout, `child(0)` could be the value expression rather than a 
WhenClause.
   - The `dataTypesForCoercion` field was correctly changed from non-final to 
`final` (minor cleanup).
   - The `AnalysisException` import was correctly removed since it is no longer 
used.
   - `CaseWhen` does not override `equals()`/`hashCode()`, relying on 
children-based comparison from the base `Expression` class. This is correct 
since the `value` field is part of the children list when present.
   
   ### Verdict
   
   **No issues found.** The design is clean, the implementation is correct, and 
the test coverage is thorough. The approach of keeping the simple-case form as 
a transient pre-analysis representation that is desugared into the canonical 
searched-case form during analysis is elegant and avoids any downstream changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to