Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
......................................................................


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@43
PS8, Line 43: (IMPALA-7737)
> are there examples of these fns in our benchmarks to quantify the regressio
Done


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@45
PS8, Line 45: Still, the fix provides most of what the JIRA ticket requested
> I'd skip these next three paragraphs.
Done


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@57
PS8, Line 57:
> pls make a section for this called "Testing" so its easier to jump to. also
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h
File be/src/exprs/conditional-functions.h:

http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h@76
PS8, Line 76: since
            : /// various bugs mean that this implementation is still sometimes 
used. But
            : /// the goal is to remove these classes at some point.
> simpler: "until their use is eliminated by the frontend".
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@37
PS8, Line 37: vanish
> is this accurate given the comments in the commit message about order by?
Yes, they vanish after the rewrite. If the rewrite does not occur (due to bugs 
or options), then the functions do not vanish.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@48
PS8, Line 48: planner runs the rule to simplify CASE
            :  * after this rule. Where that other rule can perform 
simplifications,
            :  * those simplifications are omitted here
> simplify and use the specific rule name for concreteness.
Rewrote. Since many rules apply, did not seem to add value to try to enumerate 
them.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@106
PS8, Line 106:    return rewriteCoalesceFn(e
> clarify whether you think this happens after the rewrite or before. If its
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@130
PS8, Line 130:    Lists.ne
> isn't this all that's done here (the most general case) and we'll depend on
Rewrote comment


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@172
PS8, Line 172: ESCE(a, literal) when literal
> The simplest rewrite here would be to not look at the child exprs for the v
I found that I could strip this function down as you suggest, but the results 
were sub-optimal due to weaknesses in other parts of the code.

coalesce(id) --> CASE ELSE id END (expected: id)
coalesce(id, null) --> CASE WHEN id IS NOT NULL THEN ID ELSE NULL END 
(expected: id)
coalesce(null, id) --> same as above
coalesce(null is null, id) --> 1 (expected: TRUE)
coalesce(10 + null, id) --> CASE WHEN NULL IS NOT NULL THEN NULL ELSE id END 
(expected: id)
coalesce(42, min(distinct tinyint_col)) --> CASE WHEN FALSE THEN NULL WHEN TRUE 
THEN 42 ELSE min(distinct tinyint_col, 42)) END (Expected: CASE WHEN 
min(distinct tinyint_col) IS NOT NULL THEN min(distinct tinyint_col) ELSE 42 
END)

And so on.

One solution is to restore the original coalesce() function rewrite which 
produces somewhat better results. Even that function has holes which this 
version fixed. So, we'd lose those optimizations.

As it turns out, all this is moot because the BE CASE implementation has a bug 
so we can't even use CASE to replace coalesce()...

For now, I did rework the function to the simple form requested, then disabled 
it. We can tackle any other concerns if/when it is a priority.


http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1137
PS8, Line 1137: |  other predicates: functional.alltypestiny.tinyint_col + 
functional.alltypestiny.smallint_col + functional.alltypestiny.int_col > 10, 
CASE WHEN functional.alltypestiny.tinyint_col + 
functional.alltypestiny.bigint_col IS NULL THEN 1 ELSE 
functional.alltypestiny.tinyint_col + functional.alltypestiny.bigint_col END = 1
> so this is the example of the performance regression (same work on multiple
Yes. See IMPALA-7737. My concern with the coalesce() rewrite is that, without 
IMPALA-7737, and without VERY clever optimizations in LLVM, we'd end up doing 
duplicate work.

As noted elsewhere, the BE has a bug that prevents us from rewriting 
coalesce(), so I had to revert many of these test revisions.

May be the best path is to fix the underlying issues, then do a clean version 
of this change.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 10
Gerrit-Owner: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Wed, 31 Oct 2018 21:38:24 +0000
Gerrit-HasComments: Yes

Reply via email to