[ 
https://issues.apache.org/jira/browse/FLINK-39720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-39720:
-----------------------------------
    Labels: pull-request-available  (was: )

> SubQueryDecorrelator produces incorrect plans for correlated EXISTS with 
> HAVING on aggregate outputs
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39720
>                 URL: https://issues.apache.org/jira/browse/FLINK-39720
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.20.4, 2.3.0, 2.2.1
>            Reporter: lincoln lee
>            Assignee: lincoln lee
>            Priority: Critical
>              Labels: pull-request-available
>
> SubQueryDecorrelator.decorrelateRel(LogicalFilter) reattaches the
> non-correlated remainder of a Filter condition to the rewritten input without
> remapping its RexInputRefs through frame.oldToNewOutputs. When the child
> LogicalAggregate has had correlated columns injected into its group key (which
> shifts the position of aggregate-output fields), surviving HAVING / Filter
> predicates silently point at the wrong column. The resulting plan is
> structurally valid but semantically wrong.
>
> Reproduction
>   Schema (matches SubQuerySemiJoinTest):
>     l(a INT, b BIGINT, c VARCHAR)
>     r(d INT, e BIGINT, f VARCHAR)
>   SELECT * FROM l
>   WHERE EXISTS (
>     SELECT 1 FROM r
>     WHERE l.a = r.d            -- correlated WHERE
>     GROUP BY r.f
>     HAVING SUM(r.e) >= 3       -- non-correlated HAVING on aggregate output
>   );
>   Expected: HAVING applies to the SUM(r.e) column.
>   Actual (before fix): HAVING applies to the injected r.d group-key column
>   (>=($1, 3), where $1 is now r.d, not SUM(r.e)). The plan is silently wrong.
>   Other shapes that trigger the same drift:
>   - Compound HAVING: HAVING SUM(r.e) >= 3 AND MAX(r.e) < 100
>   - Mixed agg + COUNT: HAVING SUM(r.e) >= 3 AND COUNT(*) > 1
>   - Multiple correlated cols: WHERE l.a = r.d AND l.b = r.e ... HAVING COUNT(r.d) >= 2
>  
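The index drift described above can be illustrated with a small standalone sketch. It does not use Calcite or Flink classes: HavingRemapSketch, greaterOrEqual and remap are hypothetical names, and the field layouts ([f, SUM(e)] before the rewrite, [f, d, SUM(e)] after) are assumptions inferred from the description, where $1 ends up pointing at r.d. The map plays the role that frame.oldToNewOutputs plays in the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the planner's actual code.
final class HavingRemapSketch {

    /** Evaluates the predicate ">=($index, literal)" against one output row. */
    static boolean greaterOrEqual(Object[] row, int index, long literal) {
        return ((Number) row[index]).longValue() >= literal;
    }

    /** Translates an old field index into its position in the rewritten output. */
    static int remap(int oldIndex, Map<Integer, Integer> oldToNew) {
        Integer newIndex = oldToNew.get(oldIndex);
        if (newIndex == null) {
            throw new IllegalArgumentException("no mapping for field $" + oldIndex);
        }
        return newIndex;
    }

    public static void main(String[] args) {
        // Assumed layout before the rewrite: [f, SUM(e)], so HAVING is >=($1, 3).
        // Assumed layout after injecting r.d into the group key: [f, d, SUM(e)].
        Map<Integer, Integer> oldToNew = new HashMap<>();
        oldToNew.put(0, 0); // f stays at $0
        oldToNew.put(1, 2); // SUM(e) moves from $1 to $2

        // One rewritten aggregate row: f = "x", d = 7, SUM(e) = 2.
        Object[] row = {"x", 7, 2L};

        // Stale index: compares d (7) with 3, so the row is kept by mistake.
        boolean stale = greaterOrEqual(row, 1, 3);
        // Remapped index: compares SUM(e) (2) with 3, so the row is filtered out.
        boolean remapped = greaterOrEqual(row, remap(1, oldToNew), 3);

        System.out.println("stale index keeps row:    " + stale);
        System.out.println("remapped index keeps row: " + remapped);
    }
}
```

With this sample row the stale predicate evaluates to true and the remapped one to false, which is the kind of silent divergence the report describes: both plans are well-formed, but only the remapped predicate tests the aggregate.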



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
