Brachi Packter created CALCITE-6357:
---------------------------------------
Summary: Calcite enforces select argument count to be same as row
schema fields which causes aliases to be ignored
Key: CALCITE-6357
URL: https://issues.apache.org/jira/browse/CALCITE-6357
Project: Calcite
Issue Type: Bug
Reporter: Brachi Packter
Calcite's RelBuilder.projectNamed checks whether the row size in the select is
identical to the number of schema fields. If it is not, it creates a project
with the fields as they appear in the select, meaning that if they have
aliases, they are returned with their aliases.
Here it checks if they are identical:
https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
using RexUtil.isIdentity method:
```
public static boolean isIdentity(List<? extends RexNode> exps,
    RelDataType inputRowType) {
  return inputRowType.getFieldCount() == exps.size()
      && containIdentity(exps, inputRowType, Litmus.IGNORE);
}
```
The problematic part is `inputRowType.getFieldCount() == exps.size()`.
The alias is then ignored later on in the "rename" method:
https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2125
and the alias is skipped:
https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2137
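To make the size comparison concrete, here is a minimal toy model of the identity test. It is a sketch only and does not use Calcite's classes: the class name `IdentityCheckSketch`, the use of plain field ordinals in place of `RexNode` expressions, and the three-field input row are all assumptions for illustration. It assumes that the second half of the check (`containIdentity`) requires expression i to be a plain reference to input field i.

```java
import java.util.List;

/** Toy model of the identity test; hypothetical, not Calcite's actual classes. */
public class IdentityCheckSketch {

    /**
     * Returns true only when the projection lists every input field exactly
     * once, in input order. Each entry of projectedIndexes is the ordinal of
     * the input field that the select expression refers to.
     */
    public static boolean isIdentity(List<Integer> projectedIndexes, int inputFieldCount) {
        // Same shape as the quoted check: the sizes must match first...
        if (projectedIndexes.size() != inputFieldCount) {
            return false;
        }
        // ...and then expression i must refer to input field i (assumed behavior).
        for (int i = 0; i < projectedIndexes.size(); i++) {
            if (projectedIndexes.get(i) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int inputFieldCount = 3; // hypothetical input row: (id, name, amount)

        // SELECT id, name, amount: counts and order match, so this is an identity
        System.out.println(isIdentity(List.of(0, 1, 2), inputFieldCount)); // true
        // SELECT id, amount: the counts differ, so this is not an identity
        System.out.println(isIdentity(List.of(0, 2), inputFieldCount));    // false
        // SELECT name, id, amount: the order differs, so this is not an identity
        System.out.println(isIdentity(List.of(1, 0, 2), inputFieldCount)); // false
    }
}
```

In this model, only a select that names every input field in order takes the identity branch, so whether aliases survive depends on how many fields the query happens to select.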
This doesn't impact Calcite queries, but Apache Beam applies an optimization
on top of it,
https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamAggregateProjectMergeRule.java
which causes aliases to be ignored, so data is suddenly returned without the
correct column names.
I believe the isIdentity check can cause more issues if not fixed. We need to
understand why it is enforced. Isn't it valid for the select to have a
different number of fields from what we have in the schema?
In our case we have one big row and we run different queries on it, each with
different fields in the select.
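The behavior we expect in that setup can be sketched with a small stand-alone example. This is only an illustration of the expectation, not Calcite or Beam code: the class `AliasedProjectionSketch`, the `project` helper, and the field names are hypothetical. Each query selects a different subset of one wide row, and the output columns should always carry the aliases from the select.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration: different aliased projections over one wide row. */
public class AliasedProjectionSketch {

    /**
     * Applies a projection given as alias -> input field name.
     * The output row is keyed by the aliases, in select order.
     */
    public static Map<String, Object> project(Map<String, Object> row,
                                              Map<String, String> aliasToField) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : aliasToField.entrySet()) {
            out.put(e.getKey(), row.get(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // One wide input row shared by all queries (made-up fields)
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "alice");
        row.put("amount", 9.5);

        // SELECT name AS user_name
        Map<String, String> q1 = new LinkedHashMap<>();
        q1.put("user_name", "name");
        System.out.println(project(row, q1)); // {user_name=alice}

        // SELECT id AS user_id, amount AS total
        Map<String, String> q2 = new LinkedHashMap<>();
        q2.put("user_id", "id");
        q2.put("total", "amount");
        System.out.println(project(row, q2)); // {user_id=1, total=9.5}
    }
}
```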
Beam issue
https://github.com/apache/beam/issues/30498