[ https://issues.apache.org/jira/browse/CALCITE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ulrich Kramer updated CALCITE-6090: ----------------------------------- Description: Adding the following unit test to {{JdbcAdapterTest}} leads to an error {code:java} @Test void testAmbiguousColumn() { CalciteAssert.model(JdbcTest.FOODMART_SCOTT_MODEL) .query("select\n" + " \"store_id\" \"latest_id\",\n" + " max(\"store_type\") \"latest_store_type\"\n" + " from\n" + " ( SELECT \"store_id\",\"store_type\" FROM \"foodmart\".\"store\") \n" + " group by\n" + " \"store_id\"") .runs() .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB) .explainContains("JdbcToEnumerableConverter\n JdbcProject" ) .planHasSql("SELECT MAX(\"region_id\") AS \"latest_region_id\", \"store_id\" AS \"latest_id\"\nFROM \"foodmart\".\"store\"\nGROUP BY \"store_id\""); } {code} The projection for the column {{latest_id}} is missing. The problem is located [here|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2095]. If the aggregation doesn't add, remove or twist columns, {{RexUtil.isIdentity}} returns true. Additionally the {{inputRowType}} already contains the names of the output columns. Therefore {{fieldNameList.equals(inputRowType.getFieldNames())}} also returns true. Normally, this doesn't cause issues. But if this statement is used within a join like this {code:sql} SELECT DISTINCT "region_id", "store_id", "dummy" FROM ( SELECT "region_id", "store_id", "dummy" FROM ( select B."store_city" "store_city", B."store_id" "store_id", B."region_id" "region_id", cast(null as integer) "dummy" from ( select A."store_city" "store_city", A."region_id" "region_id", A."store_id" "store_id" from ( select max("region_id") "latest_region_id", "store_id" "latest_id" from ( SELECT "region_id", "store_id" FROM "foodmart"."store") "C" group by "store_id" ) "D" left outer join "foodmart"."store" A on ( A."store_id" = "D"."latest_id" and A."region_id" = "D"."latest_region_id" ) ) B ) WHERE "store_city" IS NOT NULL ) {code} an invalid SQL will be generated because the left and the right side of the join has a column "store_id". This will be fixed in {{SqlValidatorUtil::addFields}} by appending a unique number. But in the end this will result in a statement, which contains a {{GROUP BY ... store_id0}} and will never be successful. was: Adding the following unit test to {{JdbcAdapterTest}} leads to an error {code:java} @Test void testAmbiguousColumn() { CalciteAssert.model(JdbcTest.FOODMART_SCOTT_MODEL) .query("select\n" + " \"store_id\" \"latest_id\",\n" + " max(\"store_type\") \"latest_store_type\"\n" + " from\n" + " ( SELECT \"store_id\",\"store_type\" FROM \"foodmart\".\"store\") \n" + " group by\n" + " \"store_id\"") .runs() .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB) .explainContains("JdbcToEnumerableConverter\n JdbcProject" ) .planHasSql("SELECT MAX(\"region_id\") AS \"latest_region_id\", \"store_id\" AS \"latest_id\"\nFROM \"foodmart\".\"store\"\nGROUP BY \"store_id\""); } {code} The projection for the column {{latest_id}} is missing. The problem is related [here|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2095]. If the aggregation doesn't add, remove or twist columns, {{RexUtil.isIdentity}} returns true. Additionally the {{inputRowType}} already contains the names of the output columns. Therefore {{fieldNameList.equals(inputRowType.getFieldNames())}} also returns true. Normally, this doesn't cause issues. But if this statement is used within a join like this {code:sql} SELECT DISTINCT "region_id", "store_id", "dummy" FROM ( SELECT "region_id", "store_id", "dummy" FROM ( select B."store_city" "store_city", B."store_id" "store_id", B."region_id" "region_id", cast(null as integer) "dummy" from ( select A."store_city" "store_city", A."region_id" "region_id", A."store_id" "store_id" from ( select max("region_id") "latest_region_id", "store_id" "latest_id" from ( SELECT "region_id", "store_id" FROM "foodmart"."store") "C" group by "store_id" ) "D" left outer join "foodmart"."store" A on ( A."store_id" = "D"."latest_id" and A."region_id" = "D"."latest_region_id" ) ) B ) WHERE "store_city" IS NOT NULL ) {code} an invalid SQL will be generated because the left and the right side of the join has a column "store_id". This will be fixed in {{SqlValidatorUtil::addFields}} by appending a unique number. But in the end this will result in a statement, which contains a {{GROUP BY ... store_id0}} and will never be successful. > Jdbc adapter may create wrong sql for joins > ------------------------------------------- > > Key: CALCITE-6090 > URL: https://issues.apache.org/jira/browse/CALCITE-6090 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.35.0 > Reporter: Ulrich Kramer > Priority: Major > > Adding the following unit test to {{JdbcAdapterTest}} leads to an error > {code:java} > @Test void testAmbiguousColumn() { > CalciteAssert.model(JdbcTest.FOODMART_SCOTT_MODEL) > .query("select\n" + > " \"store_id\" \"latest_id\",\n" + > " max(\"store_type\") \"latest_store_type\"\n" + > " from\n" + > " ( SELECT \"store_id\",\"store_type\" FROM > \"foodmart\".\"store\") \n" + > " group by\n" + > " \"store_id\"") > .runs() > .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB) > .explainContains("JdbcToEnumerableConverter\n JdbcProject" ) > .planHasSql("SELECT MAX(\"region_id\") AS \"latest_region_id\", > \"store_id\" AS \"latest_id\"\nFROM \"foodmart\".\"store\"\nGROUP BY > \"store_id\""); > } > {code} > The projection for the column {{latest_id}} is missing. The problem is > located > [here|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2095]. > If the aggregation doesn't add, remove or twist columns, > {{RexUtil.isIdentity}} returns true. Additionally the {{inputRowType}} > already contains the names of the output columns. Therefore > {{fieldNameList.equals(inputRowType.getFieldNames())}} also returns true. > Normally, this doesn't cause issues. But if this statement is used within a > join like this > {code:sql} > SELECT > DISTINCT "region_id", > "store_id", > "dummy" > FROM > ( > SELECT > "region_id", > "store_id", > "dummy" > FROM > ( > select > B."store_city" "store_city", > B."store_id" "store_id", > B."region_id" "region_id", > cast(null as integer) "dummy" > from > ( > select > A."store_city" "store_city", > A."region_id" "region_id", > A."store_id" "store_id" > from > ( > select > max("region_id") "latest_region_id", > "store_id" "latest_id" > from > ( SELECT "region_id", "store_id" FROM "foodmart"."store") > "C" > group by > "store_id" > ) "D" > left outer join "foodmart"."store" A on ( > A."store_id" = "D"."latest_id" > and A."region_id" = "D"."latest_region_id" > ) > ) B > ) > WHERE > "store_city" IS NOT NULL > ) > {code} > an invalid SQL will be generated because the left and the right side of the > join has a column "store_id". This will be fixed in > {{SqlValidatorUtil::addFields}} by appending a unique number. But in the end > this will result in a statement, which contains a {{GROUP BY ... store_id0}} > and will never be successful. -- This message was sent by Atlassian Jira (v8.20.10#820010)