[ 
https://issues.apache.org/jira/browse/CALCITE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18064464#comment-18064464
 ] 

Bruno Volpato commented on CALCITE-7440:
----------------------------------------

Hi [~jensen], thanks for the comment!



Yes, I believe every transformation chain should preserve enough correlation 
information. In practice, though, the `*_TO_CORRELATE` and `*_TO_MARK_CORRELATE` 
rules usually belong to different decorrelation paths and are not always 
composed together.

 

I re-checked this and tightened the test to use RuleSets.ofList() (an empty 
rule set, so no explicit rules), and I can still reproduce the failure.

The real issue is that the correlation context can be missing by the time 
RelToSql resolves the correlation variable. This patch is intentionally narrow:

  - No arbitrary non-empty map matching
  - Fallback only when the correlation map is empty
  - Otherwise fail fast
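
To make the three rules above concrete, here is a minimal sketch of the lookup policy. It is not the code from the patch: the class, method, and parameter names (`CorrelationLookupSketch`, `resolve`, `fallbackScope`) are hypothetical, and plain strings stand in for Calcite's correlation ids and alias contexts.

```java
import java.util.Map;

/** Hypothetical illustration only; none of these names are Calcite API. */
final class CorrelationLookupSketch {

  /**
   * Resolves a correlation variable id (for example "$cor1") to a scope.
   *
   * @param corId         id of the correlation variable being looked up
   * @param scopes        known correlation scopes, keyed by variable id
   * @param fallbackScope scope to use only when no correlation context exists
   */
  static String resolve(String corId, Map<String, String> scopes, String fallbackScope) {
    String scope = scopes.get(corId);
    if (scope != null) {
      return scope; // normal path: exact match on the variable id
    }
    if (scopes.isEmpty()) {
      // Narrow fallback: the correlation context was lost entirely.
      return fallbackScope;
    }
    // The map has entries, but none for this id. Picking one of them could
    // bind the variable to the wrong scope, so fail fast instead.
    throw new IllegalStateException("variable " + corId + " is not found");
  }

  public static void main(String[] args) {
    System.out.println(resolve("$cor0", Map.of("$cor0", "t4"), "current"));
    System.out.println(resolve("$cor1", Map.of(), "current"));
    try {
      resolve("$cor1", Map.of("$cor0", "t4"), "current");
    } catch (IllegalStateException e) {
      System.out.println("failed fast: " + e.getMessage());
    }
  }
}
```

The third call in `main` is the important case: a non-empty map that lacks the requested id is treated as an error and is never matched against some other entry.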

 

So I'd say losing info isn't the expected behavior. This is a defensive fix 
that avoids NPEs and the risk of wrong bindings while keeping semantics 
consistent.

> RelToSqlConverter throws NPE (variable $cor1 not found) for correlated 
> projection after semi-join rewrites
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-7440
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7440
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Bruno Volpato
>            Priority: Minor
>              Labels: pull-request-available
>
> Reproduction query
> {code:sql}
> WITH product_keys AS (
>   SELECT p."product_id",
>     (SELECT MAX(p3."product_id")
>      FROM "foodmart"."product" p3
>      WHERE p3."product_id" = p."product_id") AS "mx"
>   FROM "foodmart"."product" p
> )
> SELECT DISTINCT pk."product_id"
> FROM product_keys pk
> LEFT JOIN "foodmart"."product" p2 USING ("product_id")
> WHERE pk."product_id" IN (
>   SELECT p4."product_id"
>   FROM "foodmart"."product" p4
> )
> {code}
> Optimizer rules
> {code:java}
> RuleSets.ofList()
> {code}
> Generated SQL
> {code:sql}
> SELECT "t4"."product_id"
> FROM (SELECT "$cor0"."product_id", "t1"."EXPR$0" AS "mx"
> FROM "foodmart"."product" AS "$cor0",
> LATERAL (SELECT MAX("product_id") AS "EXPR$0"
> FROM "foodmart"."product"
> WHERE "product_id" = "$cor0"."product_id") AS "t1"
> WHERE EXISTS (SELECT 1
> FROM (SELECT "product_id"
> FROM "foodmart"."product") AS "t3"
> WHERE "$cor0"."product_id" = "t3"."product_id")) AS "t4"
> LEFT JOIN "foodmart"."product" AS "product2" ON "t4"."product_id" = "product2"."product_id"
> GROUP BY "t4"."product_id"
> {code}
> Actual behavior
> RelToSql conversion fails with:
> {code:java}
> java.lang.NullPointerException: variable $cor1 is not found
>   at org.apache.calcite.rel.rel2sql.SqlImplementor.getAliasContext(SqlImplementor.java:1590)
> {code}
> Expected behavior
> RelToSqlConverter should preserve correlation scope and generate valid SQL 
> for this rule pipeline, without missing correlation variables.
> Environment
>  - Calcite main branch (1.42.0-SNAPSHOT)
>  - RelToSqlConverterTest with PostgreSQL dialect
>  - Java 21



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
