[ https://issues.apache.org/jira/browse/CALCITE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jess Balint closed CALCITE-5209. -------------------------------- Resolved in release 1.33.0 (2023-02-06) > Proper sub-query handling if it is used inside select list and group by > ----------------------------------------------------------------------- > > Key: CALCITE-5209 > URL: https://issues.apache.org/jira/browse/CALCITE-5209 > Project: Calcite > Issue Type: Bug > Affects Versions: 1.30.0 > Reporter: Ian Bertolacci > Assignee: Dmitry Sysolyatin > Priority: Major > Labels: pull-request-available > Fix For: 1.33.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > SqlRelConverter.getSubQuery uses reference comparing in order to find > SubQuery instance. > But it does not work in case when select list has column which refers to the > column inside `GROUP BY` clause. > For example: > {code}SELECT deptno IN (1,2) FROM emp.deptno GROUP BY deptno IN (1,2);{code} > In this case `SqlNode` inside select list and `SqlNode` inside `GROUP BY` are > different SqlNode instances. The reference comparing won't work. > The following bug is a result of this behaviour: > For these examples assume that: > - {{`SqlToRelConverter.Config.config().withInSubQueryThreshold(5)`}} has > been set. > - The source table has 4 columns (Column_0 through Column_3) > - Each column is of type BIGINT NOT NULL > The failing query is: > {code:java} > select > case > when Column_3 in (1, 2, 3, 4, 5) THEN 1 > else 0 > end > from T1000 > group by > case > when Column_3 in (1, 2, 3, 4, 5) THEN 1 > else 0 > end > {code} > The exception is: > {code:java} > 3 > java.lang.ArrayIndexOutOfBoundsException: 3 > at > com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:75) > at org.apache.calcite.tools.RelBuilder.inferAlias(RelBuilder.java:2163) > at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:1956) > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1797) > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1769) > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1758) > at > org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.register(SqlToRelConverter.java:4680) > at > org.apache.calcite.sql2rel.SqlToRelConverter.substituteSubQuery(SqlToRelConverter.java:1268) > at > org.apache.calcite.sql2rel.SqlToRelConverter.replaceSubQueries(SqlToRelConverter.java:1127) > at > org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:3325) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:3192) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:738) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:664) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3589) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:589) > at org.apache.calcite.prepare.PlannerImpl.rel(PlannerImpl.java:259) > {code} > The table could contain N >= 4 columns, and the use of any column at or after > index 3 will also cause this exception. > The use of any column before index 3 gives the RelNode tree: > {code:java} > 66:LogicalProject(EXPR$0=[$0]) > 65:LogicalJoin(condition=[=($3, $4)], joinType=[left]) > 61:LogicalProject(EXPR$0=[$0], $f0=[$1], $f1=[$2], $f10=[$2]) > 60:LogicalJoin(condition=[true], joinType=[inner]) > 56:LogicalAggregate(group=[{0}]) > 55:LogicalProject(EXPR$0=[CASE(CAST(OR(AND(IS NOT NULL($8), <>($4, > 0)), AND(<($5, $4), null, <>($4, 0), IS NULL($8)))):BOOLEAN NOT NULL, 1, 0)]) > 54:LogicalJoin(condition=[=($6, $7)], joinType=[left]) > 50:LogicalProject(Column_0=[$0], Column_1=[$1], Column_2=[$2], > Column_3=[$3], $f0=[$4], $f1=[$5], Column_20=[$2]) > 49:LogicalJoin(condition=[true], joinType=[inner]) > 45:TableScan(...) > 48:LogicalAggregate(group=[{}], agg#0=[COUNT()], > agg#1=[COUNT($0)]) > 47:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 46:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { > 5 }]]) > 53:LogicalAggregate(group=[{0}], agg#0=[MIN($1)]) > 52:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 51:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 > }]]) > 59:LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)]) > 58:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 57:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }]]) > 64:LogicalAggregate(group=[{0}], agg#0=[MIN($1)]) > 63:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 62:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }]]) > {code} > If the number of predicates is less than the subquery threshold, there are no > issues. > Additionally, the below queries also produce RelNode trees > Using an alias: > {code:java} > select > case > when Column_3 in (1, 2, 3, 4, 5) THEN 1 > else 0 > end as CASE_ALIAS > from T1000 > group by > CASE_ALIAS > {code} > {code:java} > 90:LogicalAggregate(group=[{0}]) > 89:LogicalProject(CASE_ALIAS=[CASE(CAST(OR(AND(IS NOT NULL($8), <>($4, 0)), > AND(<($5, $4), null, <>($4, 0), IS NULL($8)))):BOOLEAN NOT NULL, 1, 0)]) > 88:LogicalJoin(condition=[=($6, $7)], joinType=[left]) > 84:LogicalProject(Column_0=[$0], Column_1=[$1], Column_2=[$2], > Column_3=[$3], $f0=[$4], $f1=[$5], Column_30=[$3]) > 83:LogicalJoin(condition=[true], joinType=[inner]) > 79:TableScan(...) > 82:LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)]) > 81:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 80:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }]]) > 87:LogicalAggregate(group=[{0}], agg#0=[MIN($1)]) > 86:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 85:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }]]) > {code} > Not using the group-by value in the projection: > {code:java} > select > count(*) > from T1000 > group by > case > when Column_3 in (1, 2, 3, 4, 5) THEN 1 > else 0 > end > {code} > {code:java} > 116:LogicalProject(EXPR$0=[$1]) > 115:LogicalAggregate(group=[{0}], EXPR$0=[COUNT()]) > 114:LogicalProject($f0=[CASE(CAST(OR(AND(IS NOT NULL($8), <>($4, 0)), > AND(<($5, $4), null, <>($4, 0), IS NULL($8)))):BOOLEAN NOT NULL, 1, 0)]) > 113:LogicalJoin(condition=[=($6, $7)], joinType=[left]) > 109:LogicalProject(Column_0=[$0], Column_1=[$1], Column_2=[$2], > Column_3=[$3], $f0=[$4], $f1=[$5], Column_30=[$3]) > 108:LogicalJoin(condition=[true], joinType=[inner]) > 104:TableScan(...) > 107:LogicalAggregate(group=[{}], agg#0=[COUNT()], > agg#1=[COUNT($0)]) > 106:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 105:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 > }]]) > 112:LogicalAggregate(group=[{0}], agg#0=[MIN($1)]) > 111:LogicalProject(ROW_VALUE=[$0], $f1=[true]) > 110:LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }]]) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)