[
https://issues.apache.org/jira/browse/CALCITE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835858#comment-17835858
]
Brachi Packter edited comment on CALCITE-6357 at 4/10/24 7:09 PM:
--
> If the number of fields does not match, that's probably a problem on your
> end. RelBuilder almost always requires number of fields to match.
Why? Isn't it valid for a row to have a wider schema than what you actually
need to select (e.g. a group-by query that selects one dimension from the row
and computes a count/sum on it)?
> At RelBuilder#2125 it seems that force is false. For the behavior you want,
> force would need to be true
I can't see where I can pass force, only here:
[https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063]
but it looks like it should be false in order for the fields to be renamed
later on (and isIdentity should return true).
> Calcite enforces select arguments count to be same as row schema fields which
> causes aliases to be ignored
> --
>
> Key: CALCITE-6357
> URL: https://issues.apache.org/jira/browse/CALCITE-6357
> Project: Calcite
> Issue Type: Bug
> Reporter: Brachi Packter
> Priority: Major
>
> Calcite's RelBuilder.projectNamed checks whether the number of expressions
> in the select is identical to the number of schema fields. If not, it creates
> a project with the fields as they appear in the select, meaning that if they
> have aliases, they are returned with their aliases.
> Here, it checks whether they are identical:
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
> using the RexUtil.isIdentity method:
> ```
> public static boolean isIdentity(List<? extends RexNode> exps,
>     RelDataType inputRowType) {
>   return inputRowType.getFieldCount() == exps.size()
>       && containIdentity(exps, inputRowType, Litmus.IGNORE);
> }
> ```
> This is the problematic part: `inputRowType.getFieldCount() == exps.size()`
> If they are identical and the fields come with aliases, the alias is ignored
> in the "rename" method later on
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2125
> and the alias is skipped
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2137
> This doesn't impact Calcite queries, but Apache Beam does some optimization
> on top of it,
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamAggregateProjectMergeRule.java
> which causes aliases to be ignored, so data is suddenly returned without
> the correct column field names.
> I believe the isIdentity check can cause more issues if it is not fixed. We
> need to understand why it is enforced: isn't it valid for the select to have
> a different number of fields from what we have in the schema?
> In our case we have one big row and we run different queries on it, each
> with different fields in the select.
> Beam issue
> https://github.com/apache/beam/issues/30498
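
To make the field-count condition discussed above concrete, here is a minimal, self-contained sketch. It is not Calcite code: the class IdentityCheckSketch is hypothetical, and expressions are modelled as plain input-field indexes instead of RexNode/RelDataType, so it only mirrors the shape of the quoted isIdentity check (size equality plus a per-position identity test), under those assumptions.

```java
import java.util.List;

/** Hypothetical stand-in for the check quoted above; not part of Calcite. */
public class IdentityCheckSketch {

  /**
   * Simplified model of the per-position test: expression i must be a plain
   * reference to input field i. Types and field names are not modelled here.
   */
  public static boolean containIdentity(List<Integer> exps, int inputFieldCount) {
    if (exps.size() > inputFieldCount) {
      return false;
    }
    for (int i = 0; i < exps.size(); i++) {
      if (exps.get(i) != i) {
        return false;
      }
    }
    return true;
  }

  /** Same shape as the quoted method: the field counts must match as well. */
  public static boolean isIdentity(List<Integer> exps, int inputFieldCount) {
    return inputFieldCount == exps.size()
        && containIdentity(exps, inputFieldCount);
  }

  public static void main(String[] args) {
    // Select covers all three input fields in order: treated as identity.
    System.out.println(isIdentity(List.of(0, 1, 2), 3)); // true
    // Select takes two fields from a three-field row: not identity.
    System.out.println(isIdentity(List.of(0, 1), 3));    // false
    // Same number of fields but reordered: not identity.
    System.out.println(isIdentity(List.of(1, 0, 2), 3)); // false
  }
}
```

In this simplified model, a select over a subset of a wider row always fails the size comparison, so which branch is taken depends on whether the select happens to cover every input field in order.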