Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8801 )

Change subject: IMPALA-5191: Standardize column alias behavior
......................................................................


Patch Set 7:

(6 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@282
PS7, Line 282:    * Substitute exprs of the form "<number>"  with the 
corresponding
> Update this comment to describe our new alias/ordinal substitution behavior
Updated the comment.


http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@548
PS7, Line 548: havingClause_;
> To simplify and be more internally consistent I think we should use substit
substituteOrdinalsAliases() takes a list of Exprs as param, while we only have 
a single expr here.
I introduced the substituteOrdinalAlias() method, which takes a single expr, 
and now substituteOrdinalsAliases() calls this method in a loop.

I can confirm the bug, and added a new check to TestAliasesAndOrdinals().


http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@985
PS7, Line 985:     // IMPALA-5191: In GROUP BY, HAVING, ORDER BY, aliases and 
ordinals must only be
> Now that aliases and ordinals are treated consistently, can you combine the
I found it cumbersome to combine the tests, because some of them pass, some of 
them fails with different error messages.
But added more tests by hand.

ORDER BY ordinal works a bit funny currently. If the expression is an integer 
and can be interpreted as an ordinal, then the corresponding column is used.
If the expression is a floating point number, then this float value is used in 
the order by, which basically means no sorting.

E.g.:
 select int_col / 2 as x, max(int_col)
 from functional.alltypes
 group by 1
 order by 1 * 1; // order by first col


 select int_col / 2 as x, max(int_col)
 from functional.alltypes
 group by 1
 order by 1 * 3; // error, no 3rd col


 select int_col / 2 as x, max(int_col)
 from functional.alltypes
 group by 1
 order by 1 * 3.0; // order by 3.0 => no sorting

(PostgreSQL allows writing constant arithmetic expressions in the ORDER BY 
clause, and an expression like 'ORDER BY 1 * 2' has no effect on the sorting.)

The bug hides in AnalysisContext.analyze():
In the first round of analysis ORDER BY gets the ArithmeticExpr '1 * 1'.
Then the statement rewriter rewrites it to 1.
After that the whole select statement is re-analyzed, so ORDER BY gets 
NumericLiteral 1 in the second round, which will be interpreted as an ordinal.

GROUP BY and HAVING generate error for 1 * 1, but accept 1. It's because GROUP 
BY and HAVING throw exception during the first round of the analysis.

What solution do you suggest for this particular bug? Should I track if the 
statement is re-analyzed and not substitute ordinals? Should I not rewrite the 
expressions of ORDER BY?


http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@996
PS7, Line 996:         + "group by nb having nb");
> you can leave out group by
Done


http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1007
PS7, Line 1007:     // select count(*) a ... GROUP BY, ORDER BY, HAVING
> Comment doesn't add much since we can see that from the test query test. A
Done


http://gerrit.cloudera.org:8080/#/c/8801/7/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1019
PS7, Line 1019:     // select sum(id) over(order by id) a ... GROUP BY, ORDER 
BY, HAVING
> Alias referring to analytic output in GROUP BY ...
Done



--
To view, visit http://gerrit.cloudera.org:8080/8801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
Gerrit-Change-Number: 8801
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 18 Dec 2017 16:19:57 +0000
Gerrit-HasComments: Yes

Reply via email to