[jira] [Comment Edited] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

Paul Rogers (JIRA) Wed, 24 Oct 2018 19:55:56 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662835#comment-16662835
 ]


Paul Rogers edited comment on IMPALA-7655 at 10/25/18 2:54 AM:
---------------------------------------------------------------

The devil, on this kind of change, is always in the details. Here are some that 
must be resolved

h4. Rule Ordering Dependencies

Prior to this change, several functions were rewritten to use {{if}}, and 
{{if}} was simplified to handle degenerate cases. Now, those functions are 
rewritten to use {{CASE}}. This then leads to a rule-ordering issue. If we 
simplify {{ifNull(null, y)}} to {{CASE WHEN null is not null THEN null ELSE 
y}}, we'd expect this to further simplify to {{y}}.

In unit test path, each simplification rule is applied once. The 
constant-folding rule is applied before the if-rewrite. Thus, if we generate 
{{null is null}}, the constant folding rule won't be applied a second time. 
Instead, the {{ifnull()}} rewrite code must handle these optimizations.

Oddly, outside of the rule rewrite unit tests, rules are applied repeatedly, 
and so the upstream rewrites might be done. This seems a limitation in the 
testing framework.

In general, the conditional function rewrites must handle their own null-based 
simplifications; they can't assume {{CASE}} rewrites will handle them.

h4. Special Aggregate Handling

Another issue is how IMPALA-5125 was implemented: if rewriting an expression 
drops the last aggregate, the fix simply throws away the rewrite. This is fine 
for simplifications, but not for the conditional rewrites. Since there is no BE 
implementation for the conditional functions, we can't revert to the original 
form. This means the the conditional rewrites must be aware of aggregates.

h4. Conditional Rewrites Always Enabled

Next, {{BETWEEN}} is rewritten as {{IF()}}, but appears to be done *after* the 
conditional function rewrite rules, leaving an {{if()}} call in the final plan.

Turns out that this issue can occur because, in the {{Analyzer}}, the user can 
disable optimization. At present, the conditional rewrite is part of the 
{{SimplifyConditionalsRule}} which is optional. But, the conditional rewrite is 
not optional, it must always be done whether optimizations are on or not.

Further, if optimizations are on, we want to apply constant folding and other 
rules before the conditional rewrite.

As a result, the conditional rewrites need to be pulled out into a new rule. 
And, that rule must always be applied. If optimizations are on, apply the above 
optimizations first. Otherwise, just apply the rewrites directly.

h4. Handling {{NVL2()}} in the Function Registry

Finally, the change creates function registry entries for {{nvl2()}} so that it 
can be rewritten in the same place as the other conditional functions (directly 
to {{CASE}} without first going through {{if()}}). Since the signature of this 
function is n^2 in types, the function entry would be huge (which is probably 
why it was rewritten very early in the parse process originally.) A special 
handling of {{NVL2()}} (and, later, {{DECODE()}}, see IMPALA-7747) is needed. 
The special handling can exploit the fact that {{NVL2()}} is a useful fiction; 
a short-hand for a {{CASE}}, rather than being a real function that must 
enforce stricter rules.

h4. Issues with IS [NOT] DISTINCT FROM Optimizations

Testing revealed one issue and one limitation of the {{IS [NOT] DISTINCT FROM}} 
optimizations in {{SimplifyDistinctFromRule}}:

* The rule can simplify {{sum(id) <=> sum(id)}} to {{false}}. But, according to 
IMPALA-5125, the rewrite cannot be applied if it would remove the only 
aggregate. This is a bug.
* A simplification is missing: {{id IS DISTINCT FROM NULL}} &rarr; {{id IS NOT 
NULL}}.

h4. Incomplete Analysis After Rules Fire

The analyzer has a chain of rules which fire in order without (as noted above) 
repeats. The result of rule A (rewriting conditional functions) is fed into 
rule B (simplify CASE). Each rule requires that analysis be done so that 
attributes of expressions can be picked out.

As it turns out, in the current code, this is rather ad-hoc. The 
{{SimplifyConditionalsRule}} re-analyzes its result as part of the fix for 
IMPALA-5125, but others do not, leading to optimizations not working. In 
particular, in a chain of rewrites for {{IS DISTINCT FROM}}, certain rules 
didn't fire because previous rules left new expressions in an un-analyzed 
state. This is a bug.

The fix is to analyze the result any time a rule fires, before passing the 
result to the next rule.Then, in rules that simply refused to run if an 
expression is to analyzed:

{code:java}
public class SimplifyDistinctFromRule implements ExprRewriteRule {
  public Expr apply(Expr expr, Analyzer analyzer) {
    if (!expr.isAnalyzed()) return expr;
{code}

Replace this with an assertion that analysis must have been done:

{code:java}
public class SimplifyDistinctFromRule implements ExprRewriteRule {
  public Expr apply(Expr expr, Analyzer analyzer) {
    assert expr.isAnalyzed();
{code}

To be safe, the assertion fires only in "debug" mode, not in production.

h4. Rewrite of the Wrong Order By Expressions

The select statement represents the ORDER BY clause in two distinct ways. 
First, there is a list of the "original" ordering expressions, 
{{orderByElements_}}. Second, there is a semantisized list in {{sortInfo_}}. 
The explanation is:

{code:java}
      // create copies, we don't want to modify the original parse node, in case
      // we need to print it
{code}

Later, we apply rewrite rules to the {{ORDER BY}} expression, but we do so 
using the original version, not the copy:

{code:java}
      for (OrderByElement orderByElem: orderByElements_) {
        orderByElem.setExpr(rewriteCheckOrdinalResult(rewriter, 
orderByElem.getExpr()));
      }
{code}

The result is that we apply rewrite rules to expressions which have not been 
analyzed, triggering the assertion mentioned above. This assertion is telling 
us something: we skipped a step. Here, it turns out we are rewriting the wrong 
set of expressions. Modifying the code to rewrite those in {{sortInfo_}} solves 
the problem. The current behavior is a bug as the rewrites currently do 
nothing, and the expressions we thought we were rewriting are never touched.

h4. Invalid Logic to Reset a SELECT Statement

The test {{ExprRewriteTest}} has the following logic:

{code:java}
  public void RewritesOk(String stmt, int expectedNumChanges,
      int expectedNumExprTrees) throws ImpalaException {
    // Analyze without rewrites since that's what we want to test here.
    StatementBase parsedStmt = (StatementBase) ParsesOk(stmt);
    ...
    parsedStmt.rewriteExprs(exprToTrue_);
    ...

    // Make sure the stmt can be successfully re-analyzed.
    parsedStmt.reset();
    AnalyzesOkNoRewrite(parsedStmt);
  }
{code}

Basically, this replaces all expressions with a Boolean constant, then counts 
the number of replacements. A fine test. Then, the {{reset())}} call is 
supposed to put things back the way they were.

The problem is, the rewrite rule rewrite the one and only copy of the 
{{SELECT}} list expressions. The second time around, we get a failure because 
the {{ORDER BY}} clause (which was kept as an original copy) refers to the 
now-gone {{SELECT}} clause.

This error was not previously seen because the prior bug masked it.

This is an odd bug as {{reset()}} is called only from this one place.

The premise of test itself is invalid: we want to know that, after we rewrite 
the query from

{code:sql}
select a.int_col a, 10 b, 20.2 c, ...
order by a.int_col, 4 limit 10
{code}

To

{code:sql}
select FALSE a, FALSE b, FALSE c, ...
order by a.int_col, 4 limit 10
{code}

We assert that the query should again analyze correctly. This is an unrealistic 
expectation. Once the above bug is fixed, we verify that the new query is 
actually invalid, which, in fact, it is.

Two fixes are possible:

# Create copies of all lists that are rewritten ({{SELECT}}, {{HAVING}}, etc.)
# Remove the {{reset()}} test and (since this is the only use), the {{reset()}} 
code since it cannot actually do what it is advertised to do.



was (Author: paul.rogers):
The devil, on this kind of change, is always in the details. Here are some that 
must be resolved

h4. Rule Ordering Dependencies

Prior to this change, several functions were rewritten to use {{if}}, and 
{{if}} was simplified to handle degenerate cases. Now, those functions are 
rewritten to use {{CASE}}. This then leads to a rule-ordering issue. If we 
simplify {{ifNull(null, y)}} to {{CASE WHEN null is not null THEN null ELSE 
y}}, we'd expect this to further simplify to {{y}}.

In unit test path, each simplification rule is applied once. The 
constant-folding rule is applied before the if-rewrite. Thus, if we generate 
{{null is null}}, the constant folding rule won't be applied a second time. 
Instead, the {{ifnull()}} rewrite code must handle these optimizations.

Oddly, outside of the rule rewrite unit tests, rules are applied repeatedly, 
and so the upstream rewrites might be done. This seems a limitation in the 
testing framework.

In general, the conditional function rewrites must handle their own null-based 
simplifications; they can't assume {{CASE}} rewrites will handle them.

h4. Special Aggregate Handling

Another issue is how IMPALA-5125 was implemented: if rewriting an expression 
drops the last aggregate, the fix simply throws away the rewrite. This is fine 
for simplifications, but not for the conditional rewrites. Since there is no BE 
implementation for the conditional functions, we can't revert to the original 
form. This means the the conditional rewrites must be aware of aggregates.

h4. Conditional Rewrites Always Enabled

Next, {{BETWEEN}} is rewritten as {{IF()}}, but appears to be done *after* the 
conditional function rewrite rules, leaving an {{if()}} call in the final plan.

Turns out that this issue can occur because, in the {{Analyzer}}, the user can 
disable optimization. At present, the conditional rewrite is part of the 
{{SimplifyConditionalsRule}} which is optional. But, the conditional rewrite is 
not optional, it must always be done whether optimizations are on or not.

Further, if optimizations are on, we want to apply constant folding and other 
rules before the conditional rewrite.

As a result, the conditional rewrites need to be pulled out into a new rule. 
And, that rule must always be applied. If optimizations are on, apply the above 
optimizations first. Otherwise, just apply the rewrites directly.

h4. Handling {{NVL2()}} in the Function Registry

Finally, the change creates function registry entries for {{nvl2()}} so that it 
can be rewritten in the same place as the other conditional functions (directly 
to {{CASE}} without first going through {{if()}}). Since the signature of this 
function is n^2 in types, the function entry would be huge (which is probably 
why it was rewritten very early in the parse process originally.) A special 
handling of {{NVL2()}} (and, later, {{DECODE()}}, see IMPALA-7747) is needed. 
The special handling can exploit the fact that {{NVL2()}} is a useful fiction; 
a short-hand for a {{CASE}}, rather than being a real function that must 
enforce stricter rules.

h4. Issues with IS [NOT] DISTINCT FROM Optimizations

Testing revealed one issue and one limitation of the {{IS [NOT] DISTINCT FROM}} 
optimizations in {{SimplifyDistinctFromRule}}:

* The rule can simplify {{sum(id) <=> sum(id)}} to {{false}}. But, according to 
IMPALA-5125, the rewrite cannot be applied if it would remove the only 
aggregate. This is a bug.
* A simplification is missing: {{id IS DISTINCT FROM NULL}} &rarr; {{id IS NOT 
NULL}}.

h4. Incomplete Analysis After Rules Fire

The analyzer has a chain of rules which fire in order without (as noted above) 
repeats. The result of rule A (rewriting conditional functions) is fed into 
rule B (simplify CASE). Each rule requires that analysis be done so that 
attributes of expressions can be picked out.

As it turns out, in the current code, this is rather ad-hoc. The 
{{SimplifyConditionalsRule}} re-analyzes its result as part of the fix for 
IMPALA-5125, but others do not, leading to optimizations not working. In 
particular, in a chain of rewrites for {{IS DISTINCT FROM}}, certain rules 
didn't fire because previous rules left new expressions in an un-analyzed 
state. This is a bug.

The fix is to analyze the result any time a rule fires, before passing the 
result to the next rule.Then, in rules that simply refused to run if an 
expression is to analyzed:

{code:java}
public class SimplifyDistinctFromRule implements ExprRewriteRule {
  public Expr apply(Expr expr, Analyzer analyzer) {
    if (!expr.isAnalyzed()) return expr;
{code}

Replace this with an assertion that analysis must have been done:

{code:java}
public class SimplifyDistinctFromRule implements ExprRewriteRule {
  public Expr apply(Expr expr, Analyzer analyzer) {
    assert expr.isAnalyzed();
{code}

To be safe, the assertion fires only in "debug" mode, not in production.

h4. Rewrite of the Wrong Order By Expressions

The select statement represents the ORDER BY clause in two distinct ways. 
First, there is a list of the "original" ordering expressions, 
{{orderByElements_}}. Second, there is a semantisized list in {{sortInfo_}}. 
The explanation is:

{code:java}
      // create copies, we don't want to modify the original parse node, in case
      // we need to print it
{code}

Later, we apply rewrite rules to the {{ORDER BY}} expression, but we do so 
using the original version, not the copy:

{code:java}
      for (OrderByElement orderByElem: orderByElements_) {
        orderByElem.setExpr(rewriteCheckOrdinalResult(rewriter, 
orderByElem.getExpr()));
      }
{code}

The result is that we apply rewrite rules to expressions which have not been 
analyzed, triggering the assertion mentioned above. This assertion is telling 
us something: we skipped a step. Here, it turns out we are rewriting the wrong 
set of expressions. Modifying the code to rewrite those in {{sortInfo_}} solves 
the problem. The current behavior is a bug as the rewrites currently do 
nothing, and the expressions we thought we were rewriting are never touched.


> Codegen output for conditional functions (if,isnull, coalesce) is very 
> suboptimal
> ---------------------------------------------------------------------------------
>
>                 Key: IMPALA-7655
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7655
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Tim Armstrong
>            Priority: Major
>              Labels: codegen, perf, performance
>
> https://gerrit.cloudera.org/#/c/11565/ provided a clue that an aggregation 
> involving an if() function was very slow, 10x slower than the equivalent 
> version using a case:
> {noformat}
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select count(case 
> when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(case when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:17:31 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=274b2a6f35cefe31:95a1964200000000
> +----------------------------------------------------------+
> | count(case when l_orderkey is null then 1 else null end) |
> +----------------------------------------------------------+
> | 0                                                        |
> +----------------------------------------------------------+
> Fetched 1 row(s) in 0.51s
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | Operator     | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail                  |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | 01:AGGREGATE | 1      | 44.03ms  | 44.03ms  | 1      | 1          | 25.00 
> KB | 10.00 MB      | FINALIZE                |
> | 00:SCAN HDFS | 1      | 411.57ms | 411.57ms | 59.99M | -1         | 16.61 
> MB | 88.00 MB      | tpch10_parquet.lineitem |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select 
> count(if(l_orderkey is NULL, 1, NULL)) from tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(if(l_orderkey is NULL, 1, NULL)) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:23:07 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=8e46ab1b84c4dbff:2786ca2600000000
> +----------------------------------------+
> | count(if(l_orderkey is null, 1, null)) |
> +----------------------------------------+
> | 0                                      |
> +----------------------------------------+
> Fetched 1 row(s) in 1.01s
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | Operator     | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail                  |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | 01:AGGREGATE | 1      | 422.07ms | 422.07ms | 1      | 1          | 25.00 
> KB | 10.00 MB      | FINALIZE                |
> | 00:SCAN HDFS | 1      | 511.13ms | 511.13ms | 59.99M | -1         | 16.61 
> MB | 88.00 MB      | tpch10_parquet.lineitem |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> {noformat}
> It turns out that this is because we don't have good codegen support for 
> ConditionalFunction, and just fall back to emitting a call to the interpreted 
> path: 
> https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28
> See CaseExpr for an example of much better codegen support: 
> https://github.com/apache/impala/blob/master/be/src/exprs/case-expr.cc#L178



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Comment Edited] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

Reply via email to