[ https://issues.apache.org/jira/browse/IMPALA-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Rogers reassigned IMPALA-7865:
-----------------------------------

    Assignee:     (was: Paul Rogers)

> Repeated type widening of arithmetic expressions
> ------------------------------------------------
>
>                 Key: IMPALA-7865
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7865
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> An issue related to IMPALA-7855 occurs in {{ExprRewriterTest.TestToSql()}} in
> the CTAS test. (This test will be made into a separate method,
> {{TestCTASToSql()}}.) When run with the "integrated rewrite" feature enabled,
> we get into this odd situation:
> * Analyze the {{CreateTableAsSelect}} statement. Create a temporary copy of
> the associated {{SELECT}} statement.
> * Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}},
> with {{SMALLINT}} for the {{+}} operation) to {{SELECT 2}} (as type
> {{TINYINT}}).
> * After constant folding, the rule checks the original type of the
> expression ({{SMALLINT}}) and casts the result ({{TINYINT}}) to the original
> type ({{SMALLINT}}) using an implicit cast.
> * Perform column substitutions, reset and reanalyze. This process discards
> implicit casts. Because the value is 2, it takes the type {{TINYINT}}.
> * Create the base table expressions using the newly rewritten value
> ({{TINYINT}}), though the result expression is still {{SMALLINT}}.
> * Use the base expressions from the above (typed as {{TINYINT}}) to declare
> the target table column.
> * Now, try to map the result expression ({{SMALLINT}}) into the newly created
> table column ({{TINYINT}}). This fails with an overflow error.
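
The CTAS sequence above can be traced with a small toy model. This is only a sketch under stated assumptions: {{literal_type}}, {{add_result_type}}, {{FoldedLiteral}} and {{fits}} are hypothetical names invented for illustration, not Impala classes or methods, and the widening rule is simplified to "one step wider than the widest operand".

```python
from dataclasses import dataclass
from typing import Optional

# Toy model only: these names are illustrative, not Impala classes.
TYPES = ["TINYINT", "SMALLINT", "INT", "BIGINT"]
MAX_VALUE = {"TINYINT": 2**7 - 1, "SMALLINT": 2**15 - 1,
             "INT": 2**31 - 1, "BIGINT": 2**63 - 1}


def literal_type(value: int) -> str:
    """Return the narrowest integer type that can hold the value."""
    for name in TYPES:
        if -MAX_VALUE[name] - 1 <= value <= MAX_VALUE[name]:
            return name
    raise OverflowError(f"{value} does not fit in BIGINT")


def add_result_type(lhs: str, rhs: str) -> str:
    """'+' yields the type one step wider than its widest operand (capped)."""
    widest = max(TYPES.index(lhs), TYPES.index(rhs))
    return TYPES[min(widest + 1, len(TYPES) - 1)]


@dataclass
class FoldedLiteral:
    value: int
    implicit_cast: Optional[str] = None  # set by the constant folding step

    def type(self) -> str:
        return self.implicit_cast or literal_type(self.value)

    def reset_and_reanalyze(self) -> "FoldedLiteral":
        # Reset discards implicit casts; the type is re-derived from the value.
        return FoldedLiteral(self.value)


def fits(source_type: str, target_type: str) -> bool:
    """True if storing source_type into target_type needs no narrowing."""
    return TYPES.index(source_type) <= TYPES.index(target_type)


# SELECT 1 + 1: both operands are TINYINT, so the '+' is SMALLINT.
original_type = add_result_type(literal_type(1), literal_type(1))

# Constant folding produces 2 and casts it back to the original type.
result_expr = FoldedLiteral(1 + 1, implicit_cast=original_type)

# Substitution resets and reanalyzes a copy; the implicit cast is lost.
base_table_expr = result_expr.reset_and_reanalyze()

# The target column is declared from the base table expression.
column_type = base_table_expr.type()

print(result_expr.type(), "->", column_type,
      "fits:", fits(result_expr.type(), column_type))
```

In this model the result expression keeps its {{SMALLINT}} cast while the column is declared from the reset copy as {{TINYINT}}, so the final mapping is a narrowing one, which corresponds to the overflow failure reported above.
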
> While IMPALA-7855 describes how types are widened unnecessarily due to a
> single expression, the problem here occurs over time, due to repeated
> analysis of the same numeric expression:
> * The analyzer implements a set of type propagation rules that generates a
> resulting type for arithmetic expressions that is wider than the types of the
> arguments. For example, for {{tinyint_col + 1}}, {{tinyint_col}} and {{1}} are
> {{TINYINT}}, but the result of the expression is promoted to {{SMALLINT}}.
> * The planner then sets the type of the constant (1 here) to {{SMALLINT}}.
> * Repeat the process on the next cycle. {{tinyint_col}} is {{TINYINT}},
> {{1}} is {{SMALLINT}}. Now the result of the expression is {{INT}} and {{1}}
> is retyped to be {{INT}}.
> * Repeat again and the expression (and constant) are promoted to {{BIGINT}}.
> Meanwhile, analysis has taken a clone of the expression with the old types.
> As a result, the types of columns in the result list for a {{SELECT}} statement
> can differ from the same columns recorded in the {{SELECT}} list.
> * After the above, the base table expression for a {{SELECT}} statement has
> one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).
> While the inconsistency in types may seem a minor issue, it does lead to
> analysis failures and does need to be addressed.
> Perhaps two fixes are needed:
> * When rewriting a numeric literal in the constant folding rule, apply the
> rules from {{NumericLiteral}} to override the type guessed by the constant
> evaluation.
> * Modify the {{substituteImpl}} method so that it a) does not reset numeric
> literals, or, more generally, b) does not reset expressions that did not
> change (or whose children did not change).
> Longer term, the implicit cast mechanism is overly fragile: we add it then
> discard it, resulting in subtle type inconsistencies.
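
The repeated widening of {{tinyint_col + 1}} described above can be reproduced with a toy loop. This is a sketch, not Impala code: {{add_result_type}} and {{analyze_repeatedly}} are hypothetical helpers, and the {{keep_literal_type}} flag is an assumed stand-in for the idea of leaving an unchanged numeric literal at its natural type between analysis passes.

```python
# Toy model only: names are illustrative, not Impala classes.
TYPES = ["TINYINT", "SMALLINT", "INT", "BIGINT"]


def add_result_type(lhs: str, rhs: str) -> str:
    """'+' yields the type one step wider than its widest operand (capped)."""
    widest = max(TYPES.index(lhs), TYPES.index(rhs))
    return TYPES[min(widest + 1, len(TYPES) - 1)]


def analyze_repeatedly(column_type: str, literal_natural_type: str,
                       passes: int, keep_literal_type: bool) -> list:
    """Return the type of `column + literal` after each analysis pass."""
    literal = literal_natural_type
    history = []
    for _ in range(passes):
        expr_type = add_result_type(column_type, literal)
        history.append(expr_type)
        if not keep_literal_type:
            # Reported behavior: the constant is retyped to the result type,
            # so the next pass starts from a wider operand.
            literal = expr_type
    return history


# tinyint_col + 1, analyzed three times.
widening = analyze_repeatedly("TINYINT", "TINYINT", 3, keep_literal_type=False)
stable = analyze_repeatedly("TINYINT", "TINYINT", 3, keep_literal_type=True)

print(widening)
print(stable)
```

When the literal is retyped on every pass, the expression climbs one type per pass; when the literal keeps its natural type, every pass yields the same result type, which is the property the two proposed fixes aim for.
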