[
https://issues.apache.org/jira/browse/CALCITE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065736#comment-18065736
]
Alessandro Solimando commented on CALCITE-7440:
-----------------------------------------------
I am under the impression that we are trying to fix the problem too late in the
call stack, and that "$cor" should be replaced with a table alias earlier on.
I think that "visit(Correlate)" shouldn't use the internal correlation variable
name
[here|https://github.com/apache/calcite/blob/607126846b12348beae54bbcbb7d1f8ea4a44a0d/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L510-L512],
as the SQL table alias. Correlation variable names are planner-internal
identifiers that should never appear in generated SQL.
I think that generating a clean table alias at that point (like what
"visit(Join)" does
[here|https://github.com/apache/calcite/blob/607126846b12348beae54bbcbb7d1f8ea4a44a0d/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L317-L318]),
the problem would go away.
Something like:
{code:java}
final Result visitedLeft = visitInput(e, 0);
final String correlAlias = SqlValidatorUtil.uniquify("t", aliasSet,
SqlValidatorUtil.EXPR_SUGGESTER);
final Result leftResult = visitedLeft.resetAlias(correlAlias,
e.getInput(0).getRowType());{code}
> RelToSqlConverter throws NPE (variable $cor1 not found) for correlated
> projection after semi-join rewrites
> ----------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7440
> URL: https://issues.apache.org/jira/browse/CALCITE-7440
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.41.0
> Reporter: Bruno Volpato
> Priority: Minor
> Labels: pull-request-available
>
> Reproduction query
> {code:sql}
> WITH product_keys AS (
> SELECT p."product_id",
> (SELECT MAX(p3."product_id")
> FROM "foodmart"."product" p3
> WHERE p3."product_id" = p."product_id") AS "mx"
> FROM "foodmart"."product" p
> )
> SELECT DISTINCT pk."product_id"
> FROM product_keys pk
> LEFT JOIN "foodmart"."product" p2 USING ("product_id")
> WHERE pk."product_id" IN (
> SELECT p4."product_id"
> FROM "foodmart"."product" p4
> )
> {code}
> Optimizer rules
> {code:java}
> RuleSets.ofList() {code}
> Generated SQL
> {code:sql}
> SELECT "t4"."product_id"
> FROM (SELECT "$cor0"."product_id", "t1"."EXPR$0" AS "mx"
> FROM "foodmart"."product" AS "$cor0",
> LATERAL (SELECT MAX("product_id") AS "EXPR$0"
> FROM "foodmart"."product"
> WHERE "product_id" = "$cor0"."product_id") AS "t1"
> WHERE EXISTS (SELECT 1
> FROM (SELECT "product_id"
> FROM "foodmart"."product") AS "t3"
> WHERE "$cor0"."product_id" = "t3"."product_id")) AS "t4"
> LEFT JOIN "foodmart"."product" AS "product2" ON "t4"."product_id" =
> "product2"."product_id"
> GROUP BY "t4"."product_id"
> {code}
> Actual behavior
> RelToSql conversion fails with:
> {code:java}
> java.lang.NullPointerException: variable $cor1 is not found
> at
> org.apache.calcite.rel.rel2sql.SqlImplementor.getAliasContext(SqlImplementor.java:1590)
> {code}
> Expected behavior
> RelToSqlConverter should preserve correlation scope and generate valid SQL
> for this rule pipeline, without missing correlation variables.
> Environment
> - Calcite main branch (1.42.0-SNAPSHOT)
> - RelToSqlConverterTest with PostgreSQL dialect
> - Java 21
--
This message was sent by Atlassian Jira
(v8.20.10#820010)