[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157580#comment-17157580 ] Julian Hyde commented on CALCITE-3936: -- Will do. > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.23.0 >Reporter: Steven Talbot >Assignee: Julian Hyde >Priority: Major > Labels: pull-request-available > Fix For: 1.24.0 > > Time Spent: 20m > Remaining Estimate: 0h > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157119#comment-17157119 ] Chunwei Lei commented on CALCITE-3936: -- I saw the PR was approved. Could you please squash commits and merge it, [~julianhyde]? > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.23.0 >Reporter: Steven Talbot >Assignee: Julian Hyde >Priority: Major > Labels: pull-request-available > Fix For: 1.24.0 > > Time Spent: 20m > Remaining Estimate: 0h > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147576#comment-17147576 ] Julian Hyde commented on CALCITE-3936: -- Please review: [PR 2047|https://github.com/apache/calcite/pull/2047] > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Assignee: Julian Hyde >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090180#comment-17090180 ] Jin Xing commented on CALCITE-3936: --- Yes [~julianhyde], the generated SQL is invalid. We need to fix the bug when convert rel to sql. > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090051#comment-17090051 ] Julian Hyde commented on CALCITE-3936: -- [~jinxing6...@126.com], Your example has initial SQL, plan, and generated SQL. The plan is valid. The generated SQL is invalid. Therefore the problem exists between steps 2 and 3. Whether the initial SQL is valid depends on the value of isHavingAlias, but that's not the point. The problem could be reproduced by creating the plan via RelBuilder. > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086382#comment-17086382 ] Jin Xing commented on CALCITE-3936: --- Add a simpler test case: {code:java} // Sql: isHaving = false SELECT DEPTNO, SUM(SAL) SAL FROM EMP GROUP BY DEPTNO HAVING SUM(SAL) > 100 // Plan: LogicalFilter(condition=[>($1, 100)]) LogicalAggregate(group=[{0}], SAL=[SUM($1)]) LogicalProject(DEPTNO=[$7], SAL=[$5]) JdbcTableScan(table=[[JDBC_SCOTT, EMP]]) // Converted: isHaving = true SELECT DEPTNO, SUM(SAL) AS SAL FROM SCOTT.EMP GROUP BY DEPTNO HAVING SUM(SAL) > 100{code} [1] Calcite doesn't check if the filtering clause conflicts with the alias in Aggregate and convert the filtering condition directly, that's where the bug comes from; We need to add a checking logic when converting a Filter/Aggregate pattern. BTW, [~swtalbot], in the Jira title, do you mean "... for dialects with SqlConformance.isHavingAlias=true" ? I think the issue happen when parse the sql with config (isHaving=false) and convert rel with config (isHaving=true). [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L315] > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086044#comment-17086044 ] Steven Talbot commented on CALCITE-3936: Yes, the only bummer with (1) here is that probably very frequently, this rel structure will have a Project on top of it that effectively drops away the aggregates used for the HAVING. In that case, if we force a subselect when we see the Filter/Aggregate/Project without considering what's above it, we'd end up writing SQL that was less pretty and understandable than it could have been. Perhaps there's a creative solution to that. > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086026#comment-17086026 ] Julian Hyde commented on CALCITE-3936: -- I agree with your analysis. 1 seems to be the best fix. 2 would have worked for 3593 but was presumably too difficult. 3 would fail if the aggregate were not in the SELECT clause (and therefore could not be aliased). > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)