[
https://issues.apache.org/jira/browse/CALCITE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18064464#comment-18064464
]
Bruno Volpato commented on CALCITE-7440:
----------------------------------------
Hi [~jensen], thanks for the comment!
Yes, I believe every transformation chain should preserve enough correlation
info. In practice, `*_TO_CORRELATE` and `*_TO_MARK_CORRELATE` are usually
different decorrelation paths and not always composed together.
I re-checked this and tightened the test to RuleSets.ofList() (no explicit
rules) and I can still reproduce!
The real issue is correlation context can be missing by the time RelToSql
resolves it. This patch is intentionally narrow:
- No arbitrary non-empty map matching
- Fallback only when the correlation map is empty
- Otherwise fail fast
So I'd say losing info isn’t the expected behavior -- this is a defensive fix
to avoid NPEs and potentially risking wrong bindings while keeping semantics
consistent.
> RelToSqlConverter throws NPE (variable $cor1 not found) for correlated
> projection after semi-join rewrites
> ----------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7440
> URL: https://issues.apache.org/jira/browse/CALCITE-7440
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.41.0
> Reporter: Bruno Volpato
> Priority: Minor
> Labels: pull-request-available
>
> Reproduction query
> {code:sql}
> WITH product_keys AS (
> SELECT p."product_id",
> (SELECT MAX(p3."product_id")
> FROM "foodmart"."product" p3
> WHERE p3."product_id" = p."product_id") AS "mx"
> FROM "foodmart"."product" p
> )
> SELECT DISTINCT pk."product_id"
> FROM product_keys pk
> LEFT JOIN "foodmart"."product" p2 USING ("product_id")
> WHERE pk."product_id" IN (
> SELECT p4."product_id"
> FROM "foodmart"."product" p4
> )
> {code}
> Optimizer rules
> {code:java}
> RuleSets.ofList() {code}
> Generated SQL
> {code:sql}
> SELECT "t4"."product_id"
> FROM (SELECT "$cor0"."product_id", "t1"."EXPR$0" AS "mx"
> FROM "foodmart"."product" AS "$cor0",
> LATERAL (SELECT MAX("product_id") AS "EXPR$0"
> FROM "foodmart"."product"
> WHERE "product_id" = "$cor0"."product_id") AS "t1"
> WHERE EXISTS (SELECT 1
> FROM (SELECT "product_id"
> FROM "foodmart"."product") AS "t3"
> WHERE "$cor0"."product_id" = "t3"."product_id")) AS "t4"
> LEFT JOIN "foodmart"."product" AS "product2" ON "t4"."product_id" =
> "product2"."product_id"
> GROUP BY "t4"."product_id"
> {code}
> Actual behavior
> RelToSql conversion fails with:
> {code:java}
> java.lang.NullPointerException: variable $cor1 is not found
> at
> org.apache.calcite.rel.rel2sql.SqlImplementor.getAliasContext(SqlImplementor.java:1590)
> {code}
> Expected behavior
> RelToSqlConverter should preserve correlation scope and generate valid SQL
> for this rule pipeline, without missing correlation variables.
> Environment
> - Calcite main branch (1.42.0-SNAPSHOT)
> - RelToSqlConverterTest with PostgreSQL dialect
> - Java 21
--
This message was sent by Atlassian Jira
(v8.20.10#820010)