github-actions[bot] commented on PR #61813: URL: https://github.com/apache/doris/pull/61813#issuecomment-4141423278
## Code Review Summary ### Overview This PR fixes a bug where using a scalar subquery as the value in a simple CASE expression (e.g., `CASE (SELECT ...) WHEN val THEN ...`) caused a "groupExpression already exists in memo" planning error due to duplicate `RelationId`s created by duplicating the subquery expression at parse time. The fix correctly defers the `EqualTo` wrapping from parse time (`LogicalPlanBuilder`) to analysis time (`ExpressionAnalyzer`), ensuring the subquery is analyzed exactly once. ### Critical Checkpoint Conclusions **Goal achievement:** The code achieves its stated goal. The simple-case value expression is now stored as-is in `CaseWhen`, and `ExpressionAnalyzer.visitCaseWhen` constructs `EqualTo(analyzedValue, analyzedOperand)` per WhenClause during analysis, producing a standard searched-case `CaseWhen`. The test coverage proves correctness. **Modification size and focus:** Well-scoped -- only 3 files modified for the fix, plus 3 files for tests. Each change is minimal and focused. **Concurrency:** Not applicable -- no concurrent access patterns introduced. **Lifecycle management:** The `value` field is transient: present only between parsing and analysis. After analysis, the simple-case form is desugared into searched-case form. All downstream rules (rewrite phase) only ever see searched-case `CaseWhen` nodes. This is safe because analysis always completes before rewriting begins (`NereidsPlanner.analyze()` runs before `rewrite()`). **Configuration items:** None added. **Incompatible changes:** None -- the `value` field is internal to the Nereids expression tree, consumed during analysis, and never serialized to Thrift or sent to BE. Post-analysis behavior is identical to before. **Parallel code paths:** Verified. The searched-case path (`visitSearchedCase`) in `LogicalPlanBuilder` is unmodified and still constructs `CaseWhen` without a value. All downstream CaseWhen rewrite rules (`FoldConstantRuleOnFE`, `CaseWhenToIf`, `CaseWhenToCompoundPredicate`, `NestedCaseWhenCondToLiteral`, `PushIntoCaseWhenBranch`, `ConditionRewrite`, `ExpressionTranslator`) only use `getWhenClauses()` and `getDefaultValue()` -- never `getValue()` -- and only ever construct searched-case `CaseWhen` objects. This is correct since they run in the rewrite phase after analysis. **Special conditional checks:** The `withChildren` method uses `instanceof WhenClause` to distinguish value/whenClause/defaultValue in the children list. This is sound -- the children layout `[value?, WhenClause+, defaultValue?]` is unambiguous because the value (if present) is always a non-WhenClause expression before any WhenClauses. **Test coverage:** Comprehensive. 7 unit tests covering constructor semantics, `withChildren` round-trip, `toSql` output, parsing, subquery planning (the key bug scenario), searched-case regression, and column-reference simple case. 7 regression test cases covering the exact bug reproduction, subquery with/without ELSE, column references, nested cases, and searched-case baseline. Uses `order_qt_` for deterministic output. Tables dropped before use. Table names hardcoded. **Observability:** Not applicable -- no new critical paths requiring logging or metrics. **Transaction/persistence:** Not applicable. **Data writes:** Not applicable. **FE-BE variable passing:** Not applicable -- the `value` field is consumed during FE-internal analysis and never reaches BE. **Performance:** No performance concerns. The analysis-time `EqualTo` construction is equivalent work to what was done at parse time before, just deferred. No redundant copies or allocations. **Other observations:** - The `getDataType()` change from `child(0).getDataType()` to `whenClauses.get(0).getDataType()` is a necessary and correct fix -- with the new children layout, `child(0)` could be the value expression rather than a WhenClause. - The `dataTypesForCoercion` field was correctly changed from non-final to `final` (minor cleanup). - The `AnalysisException` import was correctly removed since it is no longer used. - `CaseWhen` does not override `equals()`/`hashCode()`, relying on children-based comparison from the base `Expression` class. This is correct since the `value` field is part of the children list when present. ### Verdict **No issues found.** The design is clean, the implementation is correct, and the test coverage is thorough. The approach of keeping the simple-case form as a transient pre-analysis representation that is desugared into the canonical searched-case form during analysis is elegant and avoids any downstream changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
