[ 
https://issues.apache.org/jira/browse/IMPALA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16764632#comment-16764632
 ] 

Paul Rogers commented on IMPALA-6590:
-------------------------------------

[~tarmstrong], I have no hard numbers; my guess about the performance comes
from inspecting the code. The expression rewriter recursively descends through
the expression tree. For each node, it spins through a dozen rules. For each
rule, it applies a series of checks which boil down to an {{instanceof}} check, 
plus additional checks if the node type matches. If a rewrite occurs, we again 
descend down the subtree applying all the rewrite rules again on the chance 
that subexpressions changed.
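
To make the cost concrete, here is a minimal sketch of that rule loop. It is an illustration only: the class and method names ({{RuleLoopSketch}}, {{Rule}}, {{dozenRules}}, and so on) are made up and are not Impala's actual classes, and visiting children before the parent is an assumption. The counter records how many rule checks run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these names are Impala's real classes.
final class RuleLoopSketch {
    abstract static class Expr {
        final List<Expr> children = new ArrayList<>();
    }

    static final class Literal extends Expr {
        final long value;
        Literal(long value) { this.value = value; }
    }

    static final class Add extends Expr {
        Add(Expr left, Expr right) {
            children.add(left);
            children.add(right);
        }
    }

    // A rule returns the same object when it does not apply.
    interface Rule {
        Expr apply(Expr e);
    }

    // Counts every rule check, matching or not.
    static int ruleChecks = 0;

    // Stand-in for a rule whose type check fails on this node.
    static final Rule NO_MATCH = e -> {
        ruleChecks++;
        return e;
    };

    // Folds Add(Literal, Literal) into a single Literal.
    static final Rule FOLD_ADD = e -> {
        ruleChecks++;
        if (!(e instanceof Add)) return e;
        Expr left = e.children.get(0);
        Expr right = e.children.get(1);
        if (left instanceof Literal && right instanceof Literal) {
            return new Literal(((Literal) left).value + ((Literal) right).value);
        }
        return e;
    };

    // Twelve rules in total: eleven that never match, plus the fold rule.
    static List<Rule> dozenRules() {
        List<Rule> rules = new ArrayList<>(Collections.nCopies(11, NO_MATCH));
        rules.add(FOLD_ADD);
        return rules;
    }

    static Expr rewrite(Expr e, List<Rule> rules) {
        // Rewrite the children first (an assumption of this sketch).
        for (int i = 0; i < e.children.size(); i++) {
            e.children.set(i, rewrite(e.children.get(i), rules));
        }
        // Try every rule on this node, whatever its type.
        for (Rule rule : rules) {
            Expr result = rule.apply(e);
            if (result != e) {
                // A rewrite occurred: run all rules again on the new subtree.
                return rewrite(result, rules);
            }
        }
        return e;
    }

    // Builds the left-deep expression 1 + 1 + ... + 1 with n literals.
    static Expr chainOfOnes(int n) {
        Expr e = new Literal(1);
        for (int i = 1; i < n; i++) {
            e = new Add(e, new Literal(1));
        }
        return e;
    }
}
```

In this sketch, rewriting a chain of just four literals takes 120 rule checks, most of them failed type checks on nodes that no rule can match.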

Only some of these rules invoke constant folding. Generally we'll constant
fold one node at a time from the root up, so we could end up making many calls
if the expression is something like {{1 + 1 + ... + 1}}.

The fix will be to apply rewrite rules during analysis by calling a method (a 
"virtual function" in C++ terminology) that will auto-call the correct rewrite 
rule for the node class without the need to spin through all possible rewrite 
functions.
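
A rough sketch of what that could look like, assuming a {{rewrite()}} method on the expression base class that subclasses override. All names here are hypothetical and the real design may differ. Each node type carries only the rewrite that applies to it, and a node with no applicable rewrite inherits a default that does no work.

```java
// Hypothetical sketch: names are illustrative, not Impala's API.
final class DispatchSketch {
    abstract static class Expr {
        // Default: this node type has nothing to rewrite.
        Expr rewrite() {
            return this;
        }
    }

    static final class Literal extends Expr {
        final long value;
        Literal(long value) { this.value = value; }
    }

    static final class Add extends Expr {
        Expr left;
        Expr right;

        Add(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        // Only the Add-specific rewrite runs here; no rule list is scanned.
        @Override
        Expr rewrite() {
            left = left.rewrite();
            right = right.rewrite();
            if (left instanceof Literal && right instanceof Literal) {
                return new Literal(((Literal) left).value + ((Literal) right).value);
            }
            return this;
        }
    }

    // Builds the left-deep expression 1 + 1 + ... + 1 with n literals.
    static Expr chainOfOnes(int n) {
        Expr e = new Literal(1);
        for (int i = 1; i < n; i++) {
            e = new Add(e, new Literal(1));
        }
        return e;
    }
}
```

Calling {{DispatchSketch.chainOfOnes(5).rewrite()}} visits each node once and yields a single literal; the method dispatch selects the rewrite for the node class, so nothing loops over rules that cannot match.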

> Disable expr rewrites and codegen for VALUES() statements
> ---------------------------------------------------------
>
>                 Key: IMPALA-6590
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6590
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>            Reporter: Alexander Behm
>            Priority: Major
>              Labels: perf, planner, ramp-up, regression
>
> The analysis of statements with big VALUES clauses like INSERT INTO <tbl> 
> VALUES is slow due to expression rewrites like constant folding. The 
> performance of such statements has regressed since the introduction of expr 
> rewrites and constant folding in IMPALA-1788.
> We should skip expr rewrites for VALUES altogether since it mostly provides 
> no benefit but can have a large overhead due to evaluation of expressions in 
> the backend (constant folding). These expressions are ultimately evaluated 
> and materialized in the backend anyway, so there's no point in folding them 
> during analysis.
> Similarly, there is no point in doing codegen for these exprs in the backend 
> union node.
> *Workaround*
> {code}
> SET ENABLE_EXPR_REWRITES=FALSE;
> SET DISABLE_CODEGEN=TRUE;
> {code}


