[ 
https://issues.apache.org/jira/browse/CALCITE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783885#comment-17783885
 ] 

Ulrich Kramer edited comment on CALCITE-6090 at 11/8/23 6:08 AM:
-----------------------------------------------------------------

My initial assumption is not correct. Removing [this block 
title|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2105]
 will not fix the problem. Without this block, the generated logical plan 
contains a identity projection, which is optimized away. 


was (Author: kramerul):
My initial assumption is not correct. Removing [this block 
title|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2105]
 will not fix the problem. Without this block, the generated logical plan 
contains a identity projection, which is optimized away. The problem is that 
the generated {{LogicalAggregate}} already has a row type with a column name 
{{latest_store_type}}. This is not correct. It should have an anonymous column 
name.

> Jdbc adapter may create wrong sql for joins
> -------------------------------------------
>
>                 Key: CALCITE-6090
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6090
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.35.0
>            Reporter: Ulrich Kramer
>            Priority: Major
>
> Adding the following unit test to {{JdbcAdapterTest}} leads to an error
> {code:java}
>   @Test void testAmbiguousColumn() {
>     CalciteAssert.model(JdbcTest.FOODMART_SCOTT_MODEL)
>         .query("select\n" +
>             "                  \"store_id\" \"latest_id\",\n" +
>             "                  max(\"store_type\") \"latest_store_type\"\n" +
>             "                from\n" +
>             "                  ( SELECT \"store_id\",\"store_type\" FROM 
> \"foodmart\".\"store\") \n" +
>             "                group by\n" +
>             "                  \"store_id\"")
>         .runs()
>         .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB)
>         .explainContains("JdbcToEnumerableConverter\n  JdbcProject" )
>         .planHasSql("SELECT MAX(\"region_id\") AS \"latest_region_id\", 
> \"store_id\" AS \"latest_id\"\nFROM \"foodmart\".\"store\"\nGROUP BY 
> \"store_id\"");
>   }
> {code}
> The projection for the column {{latest_id}} is missing. The problem is 
> located 
> [here|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2095].
>  If the aggregation doesn't add, remove or twist columns, 
> {{RexUtil.isIdentity}} returns true. Additionally the {{inputRowType}} 
> already contains the names of the output columns. Therefore 
> {{fieldNameList.equals(inputRowType.getFieldNames())}} also returns true.
> Normally, this doesn't cause issues. But if this statement is used within a 
> join like this
> {code:sql}
> SELECT
>   DISTINCT "region_id",
>   "store_id",
>   "dummy"
> FROM
>   (
>     SELECT
>       "region_id",
>       "store_id",
>       "dummy"
>     FROM
>       (
>         select
>           B."store_city" "store_city",
>           B."store_id" "store_id",
>           B."region_id" "region_id",
>           cast(null as integer) "dummy"
>         from
>           (
>             select
>               A."store_city" "store_city",
>               A."region_id" "region_id",
>               A."store_id" "store_id"
>             from
>               (
>                 select
>                   max("region_id") "latest_region_id",
>                   "store_id" "latest_id"
>                 from
>                   ( SELECT "region_id", "store_id" FROM "foodmart"."store") 
> "C"
>                 group by
>                   "store_id"
>               ) "D"
>               left outer join "foodmart"."store" A on (
>                 A."store_id" = "D"."latest_id"
>                 and A."region_id" = "D"."latest_region_id"
>               )
>           ) B
>       )
>     WHERE
>       "store_city" IS NOT NULL
>   )
> {code}
> an invalid SQL will be generated because the left and the right side of the 
> join has a column "store_id". This will be fixed in 
> {{SqlValidatorUtil::addFields}} by appending a unique number. But in the end 
> this will result in a statement, which contains a {{GROUP BY ... store_id0}} 
> and will never be successful.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to