Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8801 )
Change subject: IMPALA-5191: Standardize column alias behavior ...................................................................... Patch Set 7: (6 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java: http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@282 PS7, Line 282: * Substitute exprs of the form "<number>" with the corresponding > Update this comment to describe our new alias/ordinal substitution behavior Updated the comment. http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@548 PS7, Line 548: havingClause_; > To simplify and be more internally consistent I think we should use substit substituteOrdinalsAliases() takes a list of Exprs as param, while we only have a single expr here. I introduced the substituteOrdinalAlias() method, which takes a single expr, and now substituteOrdinalsAliases() calls this method in a loop. I can confirm the bug, and added a new check to TestAliasesAndOrdinals(). http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@985 PS7, Line 985: // IMPALA-5191: In GROUP BY, HAVING, ORDER BY, aliases and ordinals must only be > Now that aliases and ordinals are treated consistently, can you combine the I found it cumbersome to combine the tests, because some of them pass, some of them fails with different error messages. But added more tests by hand. ORDER BY ordinal works a bit funny currently. If the expression is an integer and can be interpreted as an ordinal, then the corresponding column is used. If the expression is a floating point number, then this float value is used in the order by, which basically means no sorting. E.g.: select int_col / 2 as x, max(int_col) from functional.alltypes group by 1 order by 1 * 1; // order by first col select int_col / 2 as x, max(int_col) from functional.alltypes group by 1 order by 1 * 3; // error, no 3rd col select int_col / 2 as x, max(int_col) from functional.alltypes group by 1 order by 1 * 3.0; // order by 3.0 => no sorting (PostgreSQL allows writing constant arithmetic expressions in the ORDER BY clause, and an expression like 'ORDER BY 1 * 2' has no effect on the sorting.) The bug hides in AnalysisContext.analyze(): In the first round of analysis ORDER BY gets the ArithmeticExpr '1 * 1'. Then the statement rewriter rewrites it to 1. After that the whole select statement is re-analyzed, so ORDER BY gets NumericLiteral 1 in the second round, which will be interpreted as an ordinal. GROUP BY and HAVING generate error for 1 * 1, but accept 1. It's because GROUP BY and HAVING throw exception during the first round of the analysis. What solution do you suggest for this particular bug? Should I track if the statement is re-analyzed and not substitute ordinals? Should I not rewrite the expressions of ORDER BY? http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@996 PS7, Line 996: + "group by nb having nb"); > you can leave out group by Done http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1007 PS7, Line 1007: // select count(*) a ... GROUP BY, ORDER BY, HAVING > Comment doesn't add much since we can see that from the test query test. A Done http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1019 PS7, Line 1019: // select sum(id) over(order by id) a ... GROUP BY, ORDER BY, HAVING > Alias referring to analytic output in GROUP BY ... Done -- To view, visit http://gerrit.cloudera.org:8080/8801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8 Gerrit-Change-Number: 8801 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Mon, 18 Dec 2017 16:19:57 +0000 Gerrit-HasComments: Yes