[jira] [Comment Edited] (CALCITE-6357) Calcite enforces select arguments count to be same as row schema fields which causes aliases to be ignored

2024-04-10 Thread Brachi Packter (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835858#comment-17835858
 ] 

Brachi Packter edited comment on CALCITE-6357 at 4/10/24 7:09 PM:
--

> If the number of fields does not match, that's probably a problem on your 
> end. RelBuilder almost always requires number of fields to match.

why? isn't it a valid case to have a row with wider schema from what you 
actually need to select? (e.g group by queries, select one dimension from the 
row and make some count/sum on it)

> At RelBuilder#2125 it seems that force is false. For the behavior you want, 
> force would need to be true
can't see where I can pass force, only here 
[https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063]
but it looks like it should be false in order to be renamed later on (and is 
identical should return true)


was (Author: brachi_packter):
> If the number of fields does not match, that's probably a problem on your 
> end. RelBuilder almost always requires number of fields to match.

why? isn't it a valid case to have a row with wider schema from you actually 
need to select? (e.g group by queries, select one dimension from the row and 
make some count/sum on it)

> At RelBuilder#2125 it seems that force is false. For the behavior you want, 
> force would need to be true
can't see where I can pass force, only here 
https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
but it looks like it should be false in order to be renamed later on (and is 
identical should return true)

> Calcite enforces select arguments count to be same as row schema fields which 
> causes aliases to be ignored
> --
>
> Key: CALCITE-6357
> URL: https://issues.apache.org/jira/browse/CALCITE-6357
> Project: Calcite
>  Issue Type: Bug
>Reporter: Brachi Packter
>Priority: Major
>
> Calcite RelBuilder.ProjectNamed checks if row size in the select is identical 
> to schema fields, if no, it creates a project with fields as they appear in 
> the select , meaning if they have aliases, they are returning with their 
> aliases.
> Here, it checks if they are identical:
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
> using RexUtil.isIdentity method:
> ```
>  public static boolean isIdentity(List exps,
>   RelDataType inputRowType) {
> return inputRowType.getFieldCount() == exps.size()
> && containIdentity(exps, inputRowType, Litmus.IGNORE);
>   }
> ```
> This is the problematic part `inputRowType.getFieldCount() == exps.size()`
> If they are identical, and return with their aliases, it is ignored in the 
> "rename" method later on
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2125
> and alias is skipped
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2137
> This doesn't impact calcite queries, but in Apache Beam they are doing some 
> optimization on top of it, 
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamAggregateProjectMergeRule.java
> which causes aliases to be ignored, and data is returning suddenly without 
> correct column field.
> I believe the isIdentity check can causes more issues if not fixed, we need 
> to understand why is it enforced? isn't it valid to have different size of 
> fields in select from what we have in the schema?
> In our case we have a one big row and we run on it different queries, each 
> with different fields in the select.
> Beam issue 
> https://github.com/apache/beam/issues/30498 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6357) Calcite enforces select arguments count to be same as row schema fields which causes aliases to be ignored

2024-04-10 Thread Brachi Packter (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835858#comment-17835858
 ] 

Brachi Packter edited comment on CALCITE-6357 at 4/10/24 7:09 PM:
--

> If the number of fields does not match, that's probably a problem on your 
> end. RelBuilder almost always requires number of fields to match.

why? isn't it a valid case to have a row with wider schema from you actually 
need to select? (e.g group by queries, select one dimension from the row and 
make some count/sum on it)

> At RelBuilder#2125 it seems that force is false. For the behavior you want, 
> force would need to be true
can't see where I can pass force, only here 
https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
but it looks like it should be false in order to be renamed later on (and is 
identical should return true)


was (Author: brachi_packter):
> If the number of fields does not match, that's probably a problem on your 
> end. RelBuilder almost always requires number of fields to match.

why? isn't it a valid case to have a row with wider schema from you actually 
need to select? (e.g group by queries, select one dimension from the row and 
make some count/sum on it)

> Calcite enforces select arguments count to be same as row schema fields which 
> causes aliases to be ignored
> --
>
> Key: CALCITE-6357
> URL: https://issues.apache.org/jira/browse/CALCITE-6357
> Project: Calcite
>  Issue Type: Bug
>Reporter: Brachi Packter
>Priority: Major
>
> Calcite RelBuilder.ProjectNamed checks if row size in the select is identical 
> to schema fields, if no, it creates a project with fields as they appear in 
> the select , meaning if they have aliases, they are returning with their 
> aliases.
> Here, it checks if they are identical:
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2063
> using RexUtil.isIdentity method:
> ```
>  public static boolean isIdentity(List exps,
>   RelDataType inputRowType) {
> return inputRowType.getFieldCount() == exps.size()
> && containIdentity(exps, inputRowType, Litmus.IGNORE);
>   }
> ```
> This is the problematic part `inputRowType.getFieldCount() == exps.size()`
> If they are identical, and return with their aliases, it is ignored in the 
> "rename" method later on
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2125
> and alias is skipped
> https://github.com/apache/calcite/blob/f14cf4c32b9079984a988bbad40230aa6a59b127/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2137
> This doesn't impact calcite queries, but in Apache Beam they are doing some 
> optimization on top of it, 
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamAggregateProjectMergeRule.java
> which causes aliases to be ignored, and data is returning suddenly without 
> correct column field.
> I believe the isIdentity check can causes more issues if not fixed, we need 
> to understand why is it enforced? isn't it valid to have different size of 
> fields in select from what we have in the schema?
> In our case we have a one big row and we run on it different queries, each 
> with different fields in the select.
> Beam issue 
> https://github.com/apache/beam/issues/30498 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)