Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20498 )


Change subject: IMPALA-12018: Consider runtime filter for cardinality reduction
......................................................................

IMPALA-12018: Consider runtime filter for cardinality reduction

Currently Impala creates a plan first and then looks for runtime filters
based on the complete plan. This means the cardinality estimates in the
query plan do not incorporate runtime filter selectivity. The actual scan
cardinality at runtime is often much lower than the estimate because of
the runtime filters.

This patch applies runtime filter selectivity to lower the cardinality
estimates of scan nodes and certain join nodes above them, after runtime
filter generation and before resource requirement computation. The
algorithm selects a contiguous probe pipeline consisting of a scan node,
exchanges, and reducing join nodes. Depending on whether a join node
produces a runtime filter and on the type of that filter, it then
applies the runtime filter selectivity to the scan node to reduce its
cardinality and input cardinality estimates.
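
As a rough illustration only, the pipeline walk can be sketched as
below. This is not the code in the patch: the class, field, and method
names are hypothetical, join cardinalities are left untouched, and
combining several filters by taking the most selective one is an
assumption made to keep the sketch short.

```java
/** Hypothetical sketch; none of these names come from the Impala code base. */
final class ProbePipelineSketch {
  enum Kind { SCAN, EXCHANGE, JOIN }

  /** Minimal stand-in for a plan node on the probe side of a pipeline. */
  static final class Node {
    final Kind kind;
    final Node probeChild;          // null for a scan node
    long cardinality;               // estimated output rows
    final double filterSelectivity; // in (0, 1]; 1.0 means no runtime filter

    Node(Kind kind, Node probeChild, long cardinality, double filterSelectivity) {
      this.kind = kind;
      this.probeChild = probeChild;
      this.cardinality = cardinality;
      this.filterSelectivity = filterSelectivity;
    }
  }

  /**
   * Walks from the top node down through probe children (joins and
   * exchanges) until it reaches the scan node, then lowers the scan
   * cardinality. Assumption: when several joins produce a filter, only
   * the most selective one is applied. Returns the reduced scan
   * cardinality, or -1 if the pipeline does not end in a scan node.
   */
  static long reduceScanCardinality(Node top) {
    double selectivity = 1.0;
    Node node = top;
    while (node != null && node.kind != Kind.SCAN) {
      if (node.kind == Kind.JOIN) {
        selectivity = Math.min(selectivity, node.filterSelectivity);
      }
      node = node.probeChild;
    }
    if (node == null) return -1L;
    // Never raise the estimate, and keep at least one row for non-empty scans.
    long reduced = Math.max(1L, Math.round(node.cardinality * selectivity));
    node.cardinality = Math.min(node.cardinality, reduced);
    return node.cardinality;
  }
}
```

With a scan estimated at 1000 rows below a join whose filter has an
assumed selectivity of 0.1, the sketch lowers the scan estimate to 100
rows; a pipeline with no filter leaves the estimate unchanged.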

This cardinality reduction is currently applied only if the
COMPUTE_PROCESSING_COST option is true. This is because a setup with
multiple executor group sets benefits the most from reduced scan
cardinality: it can lead to lower ProcessingCost, lower scan fragment
parallelism, and a higher chance of the query being assigned to a
smaller executor group set. We can consider enabling this for all cases
after a more thorough performance evaluation (which requires more
planner test changes).

Testing:
- Pass test_executor_groups.py.
- Pass PlannerTest#testProcessingCost.

Change-Id: I033789c9b63a8188484e3afde8e646563918b3e1
---
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
5 files changed, 456 insertions(+), 252 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20498/1
--
To view, visit http://gerrit.cloudera.org:8080/20498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I033789c9b63a8188484e3afde8e646563918b3e1
Gerrit-Change-Number: 20498
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
