Steve Carlin created IMPALA-14745:
-------------------------------------
Summary: Calcite planner: Join optimizer should factor in
parallelism of nodes
Key: IMPALA-14745
URL: https://issues.apache.org/jira/browse/IMPALA-14745
Project: IMPALA
Issue Type: Sub-task
Reporter: Steve Carlin
Comment from join optimization code review of
[https://gerrit.cloudera.org/#/c/23924/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@1930]
One of the things that Impala's Planner.isInvertedJoinsCheaper() [1] considers,
besides cost, is parallelism. If the decision is based on cost alone, the probe
side can end up with much lower parallelism after swapping than without it.
This hurts performance because the HashJoin node's parallelism depends on the
parallelism of its left (probe) input. Is the numNodes value available for the
left and right children at this stage of Calcite planning? If so, it would be
useful to incorporate, either in this patch or as a follow-up enhancement.
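
To make the concern concrete, here is a minimal sketch of a swap check that
weighs cost against probe-side parallelism. It is not the logic of
Planner.isInvertedJoinsCheaper() or of ImpalaLoptOptimizeJoinRule. The class
name, method names, and the cost-divided-by-numNodes heuristic are all
hypothetical. The sketch assumes that an estimated total cost for each join
order and a numNodes estimate for each input are available.

```java
// Hypothetical sketch only: none of these names are Impala or Calcite APIs.
final class JoinSwapSketch {

    // Assumed heuristic: work is spread evenly across the nodes of the probe
    // (left) input, because the join's parallelism follows its probe side.
    static double perNodeCost(double totalCost, int probeNumNodes) {
        // Guard against a missing or zero estimate by assuming one node.
        return totalCost / Math.max(1, probeNumNodes);
    }

    // Returns true only if swapping lowers the estimated per-node cost.
    // If the inputs are kept, the left input is the probe side.
    // If they are swapped, the right input becomes the probe side.
    static boolean shouldSwapInputs(double costKept, int leftNumNodes,
                                    double costSwapped, int rightNumNodes) {
        return perNodeCost(costSwapped, rightNumNodes)
                < perNodeCost(costKept, leftNumNodes);
    }

    public static void main(String[] args) {
        // Swapping is cheaper in total (80 < 100), but the new probe side
        // would run on 1 node instead of 10, so the swap is rejected.
        System.out.println(shouldSwapInputs(100.0, 10, 80.0, 1));   // false
        // With equal parallelism, the cheaper swapped order wins.
        System.out.println(shouldSwapInputs(100.0, 10, 80.0, 10));  // true
    }
}
```

A cost-only comparison would accept the swap in both cases. Dividing by the
probe side's numNodes is one simple way to penalize a plan that gives up
parallelism, and other weightings are possible.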