Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12001 )

Change subject: IMPALA-7902: NumericLiteral fixes, refactoring
......................................................................


Patch Set 8:

(10 comments)

Thanks Tim for your review comments. I've addressed them via explanations here 
or by code changes.

This change is a step along the way to doing single-pass analysis. The 
"explicit type" thing will go away once we get there. But, until then, this 
prevents the bugs that otherwise crop up as I do the work.

The tests exist because of the old maxim to wrap code with unit tests before 
mucking with it.

http://gerrit.cloudera.org:8080/#/c/12001/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12001/7//COMMIT_MSG@33
PS7, Line 33: IMPALA-7891: Analyzer does not detect numeric overflow in CAST
> This behaviour of ignoring overflow doesn't change does it?
This logic turns out to be highly complex. We have two paths:

* Analyzer code does a type conversion. This is the case that was ignoring 
overflow and is fixed.
* BE does the type conversion as part of constant folding. The BE reports the 
error and the FE retains the cast so that the query will fail when run.

Example: SELECT CAST(1000 AS TINYINT)

Ideally, the analyzer would fail the query due to an invalid cast. Looks like 
there is some subtlety here and the code decided it is better to have the BE 
handle the error. (Though, if you have a union with 1B rows, followed by an 
overflow, not sure it is helpful to grind through the 1B rows then fail...)

A better solution overall is to let the BE handle all type conversions via 
constant folding. That is not in this patch to limit size, but might be a good 
change in the future.


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/Expr.java@377
PS7, Line 377:   public long getNumDistinctValues() { return 
numDistinctValues_; }
> Do we even need this on the Expr base class? It seems like it's only actual
You are right, the explicit type is a bit of a hack. It works around the 
"repeated type widening" issue without changing much code. Without this, each 
time we analyze a + b, we pick a size for the result that is larger than the 
inputs:

TINYINT + TINYINT --> SMALLINT

We then cast the inputs to that type. Then if we analyze again:

SMALLINT + SMALLINT --> INT

And we cast the same inputs to INT.

Having this explicit type breaks that loop; a condition unique to numbers. 
(Though, strings have a similar problem.)

The ultimate fix is to run the type propagation logic only once, which is the 
ultimate goal. Will take a few more moves to get there, so this is a temporary 
fix.

That said, I have refined this code a bit more, so I'll update the patch with 
those changes; that will remove the method here in the Expr class.


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java@66
PS7, Line 66: t
> extra space
Done


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
File fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@53
PS7, Line 53:  *
> Are "natural type" and "implicit type" the same thing?
Natural type is computed from input: 1:TINYINT, 1000:SMALLINT, etc.

The explicit type is the result of a user-defined cast: CAST(1 AS SMALLINT)

An implicit type is the result of parameter matching:

1:TINYINT + 1:TINYINT matches the SMALLINT addition, so we cast the inputs to 
that type.

If we repeat the process (because we analyze twice in the current version), 
types can widen again.

The code to handle this at present is very fragile. The explicit type provides 
a temporary fix that keeps the wheels on long enough to move to a single-pass 
analysis.


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@131
PS7, Line 131:
> double? I don't think this needs to be nullable, right?
Fixed.


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@283
PS7, Line 283:
> I find these function names potentially confusing. Something like isInFloat
Renamed to fitsInMumble. Trying to keep noise wording to a minimum...


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@332
PS7, Line 332:   /**
> It looks like you're preserving some existing behaviour, but I'm not sure t
Yeah, I thought this was odd given how little extra range FLOAT handles on top 
of DECIMAL: both support up to 10^38. (Oh, typo in the comment: "within the 
range of DECIMAL" (not BIGINT)...

Have made the change to always use DOUBLE. Let's see if the tests complain.


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/common/SqlCastException.java
File fe/src/main/java/org/apache/impala/common/SqlCastException.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/main/java/org/apache/impala/common/SqlCastException.java@26
PS7, Line 26:
> typo: planner
Done


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@492
PS7, Line 492:     try {
> nit: Blank line seems unnecessary
Done


http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/test/java/org/apache/impala/analysis/NumericLiteralTest.java
File fe/src/test/java/org/apache/impala/analysis/NumericLiteralTest.java:

http://gerrit.cloudera.org:8080/#/c/12001/7/fe/src/test/java/org/apache/impala/analysis/NumericLiteralTest.java@445
PS7, Line 445:     {
> I think it would help to have a very brief comment for some of these tests.
Done



--
To view, visit http://gerrit.cloudera.org:8080/12001
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I484600747b2871d3a6fe9153751973af9a8534f2
Gerrit-Change-Number: 12001
Gerrit-PatchSet: 8
Gerrit-Owner: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fwij...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Dec 2018 01:11:38 +0000
Gerrit-HasComments: Yes

Reply via email to