Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Fix bound predicates optimization doesn't work for 
InlineView
......................................................................


Patch Set 4:

(7 comments)

Thanks for digging into this!

If I understand this solution correctly, the key change of this patch is inside 
SingleNodePlanner#migrateConjunctsToInlineView(). Before this change, we skip 
predicate b.upper_val='1' since canEvalPredicate() returns false on it. Because 
it doesn't satisfy isLastOjMaterializedByTupleIds(). So we can't migrate it 
inside the inline view. However, we can be more aggressive. It's correct to 
duplicate(not migrate) this predicate inside the inline view since it's not 
evaluted to true with null slots. This patch picks up the predicates that being 
skipped due to not satisfying isLastOjMaterializedByTupleIds() and duplicates 
them if they satisfy !isTrueWithNullSlots().

I left some comments for refactoring. Hope we can make this part clearer.

http://gerrit.cloudera.org:8080/#/c/15047/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15047/4//COMMIT_MSG@7
PS4, Line 7: Fix bound predicates optimization doesn't work for
           : InlineView
This is an improvement rather than a fix. I think this is better: "Propagate 
predicates of outer-joined InlineView".


http://gerrit.cloudera.org:8080/#/c/15047/4//COMMIT_MSG@10
PS4, Line 10: This is improvement that migrated predicates of the nullable side 
of
            : the outer join into inline view.We should duplicate it and 
evaluate
            : after join.
Not all the predicates of the nullable side could be duplicated into the inline 
view, e.g. "b.upper_val is null" can only be evaluated at the JOIN node. Hope 
you can explain your change in more details.


http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1623
PS4, Line 1623:       if (isAntiJoinedConjunct(e)) return 
canEvalAntiJoinedConjunct(e, tupleIds);
              :
              :       if (isIjConjunct(e) || isSjConjunct(e)) {
              :         if (!containsOuterJoinedTid(tids)) return true;
              :         // If the predicate references an outer-joined tuple, 
then evaluate it at
              :         // the join that the On-clause belongs to.
              :         TableRef onClauseTableRef = null;
              :         if (isIjConjunct(e)) {
              :           onClauseTableRef = 
globalState_.ijClauseByConjunct.get(e.getId());
              :         } else {
              :           onClauseTableRef = 
globalState_.sjClauseByConjunct.get(e.getId());
              :         }
              :         Preconditions.checkNotNull(onClauseTableRef);
              :         return 
tupleIds.containsAll(onClauseTableRef.getAllTableRefIds());
              :       }
              :
              :       if (isFullOuterJoined(e)) return 
canEvalFullOuterJoinedConjunct(e, tupleIds);
              :       if (isOjConjunct(e)) {
              :         // Force this predicate to be evaluated by the 
corresponding outer join node.
              :         // The join node will pick up the predicate later via 
getUnassignedOjConjuncts().
              :         if (tids.size() > 1) return false;
              :         // Optimization for single-tid predicates: Legal to 
assign below the outer join
              :         // if the predicate is from the same On-clause that 
makes tid nullable
              :         // (otherwise e needn't be true when that tuple is set).
              :         TupleId tid = tids.get(0);
              :         return globalState_.ojClauseByConjunct.get(e.getId()) 
== getLastOjClause(tid);
              :       }
              :
              :       // Should have returned in one of the cases above.
              :       Preconditions.checkState(false);
The new overload of canEvalPredicate() is not quite clear in my opinion. We can 
encapsulate these into a function 'canEvalOnClauseConjunct(tupleIds, e)' and 
use it directly in SingleNodePlanner#migrateConjunctsToInlineView().


http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@184
PS4, Line 184:       Expr.removeDuplicates(conjuncts_);
Can we move this into MultiAggregateInfo#collectConjuncts()?


http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1191
PS4, Line 1191:       if (analyzer.canEvalPredicate(tupleIds, e, false)) {
              :         if (e.isOnClauseConjunct() ||
              :             analyzer.isLastOjMaterializedByTupleIds(tupleIds, 
e)) {
              :           preds.add(e);
              :         } else {
              :           // Deal predicate belong to an outer-joined tuple
              :           try {
              :             if (analyzer.isTrueWithNullSlots(e)) continue;
              :           } catch (InternalException ex) {
              :             // Expr evaluation failed in the backend. Skip 'e' 
since we cannot
              :             // determine whether propagation is safe.
              :             LOG.warn("Skipping propagation of conjunct because 
backend evaluation "
              :                 + "failed: " + e.toSql(), ex);
              :             continue;
              :           }
              :           evalAfterJoinPreds.add(e);
              :         }
Althouth the first if-clause is semantically equal to the old logics, it's not 
quite clear since e.isOnClauseConjunct() and 
analyzer.isLastOjMaterializedByTupleIds(tupleIds, e) which are part of 
canEvalPredicate(). I recommend refactoring it to not invoke canEvalPredicate() 
like this (with my comment in Analyzer.java):

      if (e.isOnClauseConjunct()) {
        if (analyzer.canEvalOnClauseConjunct(tupleIds, e)) preds.add(e);
      } else if (analyzer.isLastOjMaterializedByTupleIds(tupleIds, e)) {
        preds.add(e);
      } else {
        // Deal with other predicates bounded to an outer-joined tuple
        try {
          if (analyzer.isTrueWithNullSlots(e)) continue;
        } catch (InternalException ex) {
          // Expr evaluation failed in the backend. Skip 'e' since we cannot
          // determine whether propagation is safe.
          LOG.warn("Skipping propagation of conjunct because backend evaluation 
"
              + "failed: " + e.toSql(), ex);
          continue;
        }
        evalAfterJoinPreds.add(e);
      }

Also add some trace logging for better debugging. What do you think?


http://gerrit.cloudera.org:8080/#/c/15047/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1219
PS4, Line 1219:     // Migrate the conjucts evaluate the nullable side of 
outer-join. They should also
              :     // evaluate after join, not mark as assigned.
"Propagate the conjuncts evaluating the nullable side of outer-join. Don't mark 
them as assigned so they would be assigned at the JOIN node."


http://gerrit.cloudera.org:8080/#/c/15047/4/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/15047/4/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1002
PS4, Line 1002: 730
This is actually 73 after adding the predicate. Note that the test passes since 
it doesn't verify the cardinality (run without 
PlannerTestOption.VALIDATE_CARDINALITY). But it's good to correct the value as 
well.



--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He <hexianqing...@126.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Xianqing He <hexianqing...@126.com>
Gerrit-Comment-Date: Mon, 20 Jan 2020 14:46:53 +0000
Gerrit-HasComments: Yes

Reply via email to