Ulrich Kramer created CALCITE-6090:
--------------------------------------

             Summary: Jdbc adapter may create wrong sql for joins
                 Key: CALCITE-6090
                 URL: https://issues.apache.org/jira/browse/CALCITE-6090
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.35.0
            Reporter: Ulrich Kramer


Adding the following unit test to {{JdbcAdapterTest}} leads to an error
{code:java}
  @Test void testAmbiguousColumn() {
    CalciteAssert.model(JdbcTest.FOODMART_SCOTT_MODEL)
        .query("select\n" +
            "                  \"store_id\" \"latest_id\",\n" +
            "                  max(\"store_type\") \"latest_store_type\"\n" +
            "                from\n" +
            "                  ( SELECT \"store_id\",\"store_type\" FROM 
\"foodmart\".\"store\") \n" +
            "                group by\n" +
            "                  \"store_id\"")
        .runs()
        .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB)
        .explainContains("JdbcToEnumerableConverter\n  JdbcProject" )
        .planHasSql("SELECT MAX(\"region_id\") AS \"latest_region_id\", 
\"store_id\" AS \"latest_id\"\nFROM \"foodmart\".\"store\"\nGROUP BY 
\"store_id\"");
  }
{code}

The projection for the column {{latest_id}} is missing. The problem is related 
[here|https://github.com/apache/calcite/blob/590ec85f0fcff7173c288c350c3f60e640976a34/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2094-L2095].
 If the aggregation doesn't add, remove or twist columns, 
{{RexUtil.isIdentity}} returns true. Additionally the {{inputRowType}} already 
contains the names of the output columns. Therefore 
{{fieldNameList.equals(inputRowType.getFieldNames())}} also returns true.

Normally, this doesn't cause issues. But if this statement is used within a 
join like this

{code:sql}
SELECT
  DISTINCT "region_id",
  "store_id",
  "dummy"
FROM
  (
    SELECT
      "region_id",
      "store_id",
      "dummy"
    FROM
      (
        select
          B."store_city" "store_city",
          B."store_id" "store_id",
          B."region_id" "region_id",
          cast(null as integer) "dummy"
        from
          (
            select
              A."store_city" "store_city",
              A."region_id" "region_id",
              A."store_id" "store_id"
            from
              (
                select
                  max("region_id") "latest_region_id",
                  "store_id" "latest_id"
                from
                  ( SELECT "region_id", "store_id" FROM "foodmart"."store") "C"
                group by
                  "store_id"
              ) "D"
              left outer join "foodmart"."store" A on (
                A."store_id" = "D"."latest_id"
                and A."region_id" = "D"."latest_region_id"
              )
          ) B
      )
    WHERE
      "store_city" IS NOT NULL
  )
{code}

an invalid SQL will be generated because the left an the right side of the join 
has a column "store_id". This will be fixed in {{SqlValidatorUtil::addFields}} 
by appending a unique number. But in the end this will result in a statement, 
which contains a {{GROUP BY ... store_id0 }} and will never be successful.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to