[ 
https://issues.apache.org/jira/browse/IMPALA-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned IMPALA-7865:
-----------------------------------

    Assignee:     (was: Paul Rogers)

> Repeated type widening of arithmetic expressions
> ------------------------------------------------
>
>                 Key: IMPALA-7865
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7865
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> An issue related to IMPALA-7855 occurs in {{ExprRewriterTest.TestToSql()}} in 
> the CTAS test. (This test will be made into a separate method, 
> {{TestCTASToSql()}}). When run with the "integrated rewrite" feature enabled, 
> we get into this odd situation:
>  * Analyze the {{CreateTableAsSelect}} statement. Create a temporary copy of 
> the associated {{SELECT}} statement.
>  * Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both operands 
> {{TINYINT}}, with {{SMALLINT}} for the {{+}} operation) to {{SELECT 2}} (as 
> type {{TINYINT}}).
>  * After constant folding, the rule checks the original type of the 
> expression ({{SMALLINT}}) and casts the result ({{TINYINT}}) to the original 
> type ({{SMALLINT}}) using an implicit cast.
>  * Perform column substitutions, reset and reanalyze. This process discards 
> implicit casts. Because the value is 2, it takes the type {{TINYINT}}.
>  * Create the base table expressions using the newly rewritten value 
> ({{TINYINT}}) though the result expression is still {{SMALLINT}}.
>  * Use the base expressions from the above (typed as {{TINYINT}}) to declare 
> the target table column.
>  * Now, try to map the result expression ({{SMALLINT}}) into the newly 
> created table column ({{TINYINT}}). This fails with an overflow error.
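
The steps above can be modeled in a few lines. This is only a sketch under stated assumptions: {{CtasMismatchSketch}}, {{FoldedLiteral}}, {{reanalyzed()}} and {{fitsColumn()}} are hypothetical names invented for illustration and are not Impala classes or methods. The sketch shows how dropping the implicit cast on one copy of the expression leaves the column type and the result type out of step.

```java
// Hypothetical model of the CTAS scenario; these are not Impala's classes.
class CtasMismatchSketch {
    // Integer types ordered from narrowest to widest.
    enum IntType { TINYINT, SMALLINT, INT, BIGINT }

    // A folded literal: its natural type plus an optional implicit cast.
    static final class FoldedLiteral {
        final long value;
        final IntType naturalType;
        final IntType implicitCast; // null when no implicit cast is attached

        FoldedLiteral(long value, IntType naturalType, IntType implicitCast) {
            this.value = value;
            this.naturalType = naturalType;
            this.implicitCast = implicitCast;
        }

        // The effective type is the implicit cast target when one is present.
        IntType type() {
            return implicitCast != null ? implicitCast : naturalType;
        }

        // Models "reset and reanalyze": the implicit cast is discarded.
        FoldedLiteral reanalyzed() {
            return new FoldedLiteral(value, naturalType, null);
        }
    }

    // An expression fits a column only if the column is at least as wide.
    static boolean fitsColumn(IntType exprType, IntType columnType) {
        return exprType.ordinal() <= columnType.ordinal();
    }

    public static void main(String[] args) {
        // SELECT 1 + 1 folded to 2 (TINYINT), implicitly cast back to SMALLINT.
        FoldedLiteral result = new FoldedLiteral(2, IntType.TINYINT, IntType.SMALLINT);
        // The base table expression is the reanalyzed copy, which lost the cast.
        FoldedLiteral base = result.reanalyzed();
        IntType columnType = base.type();
        // Prints "SMALLINT into TINYINT: false"
        System.out.println(result.type() + " into " + columnType + ": "
            + fitsColumn(result.type(), columnType));
    }
}
```
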
> While IMPALA-7855 describes how types are widened unnecessarily due to a 
> single expression, the problem here occurs over time, due to repeated 
> analysis of the same numeric expression:
>  * The analyzer implements a set of type propagation rules that generates a 
> resulting type for arithmetic expressions that is wider than the types of the 
> arguments. For example, for {{tinyint_col + 1}}, {{tinyint_col}} and {{1}} 
> are {{TINYINT}}, but the result of the expression is promoted to {{SMALLINT}}.
>  * The planner then sets the type of the constant (1 here) to {{SMALLINT}}.
>  * Repeat the process on the next cycle. {{tinyint_col}} is {{TINYINT}}, 
> {{1}} is {{SMALLINT}}. Now the result of the expression is {{INT}} and {{1}} 
> is retyped to be {{INT}}.
>  * Repeat again and the expression (and constant) are promoted to {{BIGINT}}.
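
The widening cycle in the list above can be reproduced with a small model. This is a sketch only: {{WideningSketch}}, {{addResultType()}} and {{reanalyze()}} are hypothetical names, and the promotion rule (one step wider than the wider operand, capped at {{BIGINT}}) is an assumption inferred from the example in this description, not a copy of the analyzer's actual rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the widening cycle; not Impala's actual classes.
class WideningSketch {
    // Integer types ordered from narrowest to widest.
    enum IntType { TINYINT, SMALLINT, INT, BIGINT }

    // Assumed promotion rule: the result of '+' is one step wider than the
    // wider operand, capped at BIGINT.
    static IntType addResultType(IntType left, IntType right) {
        IntType wider = left.ordinal() >= right.ordinal() ? left : right;
        int next = Math.min(wider.ordinal() + 1, IntType.BIGINT.ordinal());
        return IntType.values()[next];
    }

    // Analyze `column + literal` for the given number of passes, retyping the
    // literal to the result type after each pass. Returns the result type
    // observed on each pass.
    static List<IntType> reanalyze(IntType columnType, IntType literalType, int passes) {
        List<IntType> history = new ArrayList<>();
        IntType currentLiteralType = literalType;
        for (int i = 0; i < passes; i++) {
            IntType result = addResultType(columnType, currentLiteralType);
            history.add(result);
            currentLiteralType = result; // the constant is retyped to match
        }
        return history;
    }

    public static void main(String[] args) {
        // tinyint_col + 1, analyzed three times.
        // Prints [SMALLINT, INT, BIGINT]
        System.out.println(reanalyze(IntType.TINYINT, IntType.TINYINT, 3));
    }
}
```

The column type never changes; only the retyped literal drives the result wider on each pass.
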
> Meanwhile, analysis has taken a clone of the expression with the old types. 
> As a result, the types of columns in the result list for a {{SELECT}} 
> statement can differ from the same columns recorded in the {{SELECT}} list.
>  * After the above, the base table expression for a {{SELECT}} statement has 
> one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).
> While the inconsistency in types may seem a minor issue, it does lead to 
> analysis failures and does need to be addressed.
> Perhaps two fixes are needed:
>  * When rewriting a numeric literal in the constant folding rule, apply the 
> rules from {{NumericLiteral}} to override the type guessed by the constant 
> evaluation.
>  * Modify the {{substituteImpl}} method so that it a) does not reset numeric 
> literals or, more generally, b) does not reset expressions that did not 
> change (and whose children did not change).
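
The first proposed fix could look roughly like the following. This is a sketch under assumptions: {{LiteralTypeSketch}}, {{narrowestTypeFor()}} and {{foldAddType()}} are hypothetical names, and the rule "use the narrowest integer type that holds the value" is inferred from the statement above that the value 2 takes the type {{TINYINT}}. It is not taken from the {{NumericLiteral}} source.

```java
// Hypothetical sketch of typing a folded literal from its value.
class LiteralTypeSketch {
    // Integer types ordered from narrowest to widest.
    enum IntType { TINYINT, SMALLINT, INT, BIGINT }

    // Assumed literal rule: pick the narrowest integer type that holds the value.
    static IntType narrowestTypeFor(long value) {
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) return IntType.TINYINT;
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) return IntType.SMALLINT;
        if (value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE) return IntType.INT;
        return IntType.BIGINT;
    }

    // Fold `left + right` and type the result from its value, rather than
    // from the promoted type of the '+' operator.
    static IntType foldAddType(long left, long right) {
        // addExact throws ArithmeticException on 64-bit overflow.
        return narrowestTypeFor(Math.addExact(left, right));
    }

    public static void main(String[] args) {
        // 1 + 1 folds to 2, which is typed TINYINT on every pass.
        System.out.println(foldAddType(1, 1));
    }
}
```

Because the type depends only on the folded value, repeating the analysis yields the same type each time.
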
> Longer term, the implicit cast mechanism is overly fragile: we add it, then 
> discard it, resulting in subtle type inconsistencies.
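
One way to avoid discarding analysis state, in the spirit of option b) above, is for substitution to return the original node whenever neither it nor any descendant was replaced. The sketch below is illustrative only: {{SubstituteSketch}}, {{Node}} and the {{analyzed}} flag are hypothetical stand-ins, and the code does not reflect how {{substituteImpl}} is actually written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of substitution that preserves unchanged subtrees.
class SubstituteSketch {
    // Minimal expression node. `analyzed` stands in for analysis state such
    // as resolved types and implicit casts.
    static final class Node {
        final String label;
        final List<Node> children;
        boolean analyzed;

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = children;
        }
    }

    // Returns the same instance (analysis state intact) when neither the node
    // nor any descendant was substituted. Otherwise returns a fresh,
    // unanalyzed copy that reuses the unchanged children.
    static Node substitute(Node node, Map<String, Node> replacements) {
        Node replacement = replacements.get(node.label);
        if (replacement != null) return replacement;

        boolean changed = false;
        List<Node> newChildren = new ArrayList<>();
        for (Node child : node.children) {
            Node newChild = substitute(child, replacements);
            changed |= (newChild != child);
            newChildren.add(newChild);
        }
        return changed ? new Node(node.label, newChildren) : node;
    }
}
```

With this shape, a numeric literal that is not itself a substitution target is never rebuilt, so its type and any cast attached to it survive the pass.
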



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
