Hello Internal Jenkins, Alex Behm, I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/4239 to review the following change. Change subject: IMPALA-3063: Separate join inversion from join ordering. ...................................................................... IMPALA-3063: Separate join inversion from join ordering. Before this change joins were inverted while doing join ordering. That approach was unnecessarily complex because it required modifying the global analysis state for correct conjunct placement, etc. However, join inversion is independent of join ordering, and the existing approach could lead to generating invalid plans with distributed non-equi right outer/semi joins, which we cannot execute in the backend. After this change joins are inverted in a separate pass over the single-node plan. This simplifies the inversion logic and allows us to avoid generating those invalid plans. Note that this change is not only a separation of functionality for the following reasons: 1. Our join cardinality estimation is not symmetric, i.e., A JOIN B may not give the same estimate as B JOIN A due to our FK/PK detection heuristic. In the context of this patch this means that an inverted join may have a different cardinality estimate, so plans may change depending on whether the inversion is done during join ordering of after. 2. We currently only invert outer/semi/anti joins based on the rhs table ref join op. In this patch I want to preserve the existing behavior as much as possible, but when doing the join ordering in a separate pass we may see a join opn in a JoinNode that is different from the rhs table ref. So in some situations the inversion behavior based on the join op could be different and there are some examples in this patch. This patch also moves the logic of converting hash joins to nested-loop joins into a separate pass over the single-node plan. Change-Id: If86db7753fc585bb4c69612745ec0103278888a4 Reviewed-on: http://gerrit.cloudera.org:8080/3846 Reviewed-by: Alex Behm <alex.b...@cloudera.com> Tested-by: Internal Jenkins (cherry picked from commit 532b1fe1186725b8e81fff93b59fc7cebf563c8b) --- M be/src/exec/blocking-join-node.h M fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java M fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java M fe/src/main/java/com/cloudera/impala/analysis/SelectList.java M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java M fe/src/main/java/com/cloudera/impala/planner/AggregationNode.java M fe/src/main/java/com/cloudera/impala/planner/AnalyticEvalNode.java M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java M fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java M fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java M fe/src/main/java/com/cloudera/impala/planner/HdfsScanNode.java M fe/src/main/java/com/cloudera/impala/planner/JoinNode.java M fe/src/main/java/com/cloudera/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/com/cloudera/impala/planner/PlanNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/planner/SelectNode.java M fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java M fe/src/main/java/com/cloudera/impala/planner/SingularRowSrcNode.java M fe/src/main/java/com/cloudera/impala/planner/SortNode.java M fe/src/main/java/com/cloudera/impala/planner/SubplanNode.java M fe/src/main/java/com/cloudera/impala/planner/UnionNode.java M fe/src/main/java/com/cloudera/impala/planner/UnnestNode.java M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test 29 files changed, 661 insertions(+), 363 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/39/4239/1 -- To view, visit http://gerrit.cloudera.org:8080/4239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If86db7753fc585bb4c69612745ec0103278888a4 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: anujphadke <apha...@cloudera.com>