Hello Andrew Sherman, Kurt Deschler, Abhishek Rawat, Wenzhe Zhou, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19656

to look at the new patch set (#8).

Change subject: IMPALA-12029: Relax scan fragment parallelism on first planning
......................................................................

IMPALA-12029: Relax scan fragment parallelism on first planning

In a setup with multiple executor group sets, the Frontend tries to match a query with the smallest executor group set that can fit the memory and CPU requirements of the compiled query. For some kinds of queries, the compiled plan fits any executor group set but does not necessarily deliver the best performance there. An example is Impala's COMPUTE STATS query. It does a full table scan and aggregates the stats, and has a fairly simple query plan shape, but can benefit from higher scan parallelism.

This patch relaxes the scan fragment parallelism in the first round of query planning. This allows a scan fragment to increase its parallelism based on its ProcessingCost estimate. If the relaxed plan fits in an executor group set, we replan once more with that executor group set, but with scan fragment parallelism set back to MT_DOP. This extra round of query planning adds a couple of milliseconds of overhead, depending on the complexity of the query plan, but it is necessary because the backend scheduler still expects at most MT_DOP scan fragment instances. We can remove the extra replanning in the future once we can fully manage scan node parallelism without MT_DOP.

This patch also tunes computeScanProcessingCost() to guard against scheduling too many scan fragments by comparing against the actual scan range count that the Planner knows.

Testing:
- Passed test_executor_groups.py.
- Added a test case in test_min_processing_per_thread_small.
- Raised impala.admission-control.max-query-mem-limit.root.small from 64MB to 70MB in llama-site-3-groups.xml so that the new grouping query can fit in the root.small pool.
Change-Id: I7a2276fbd344d00caa67103026661a3644b9a1f9
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/resources/llama-site-3-groups.xml
M tests/custom_cluster/test_executor_groups.py
10 files changed, 207 insertions(+), 50 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/19656/8
--
To view, visit http://gerrit.cloudera.org:8080/19656
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7a2276fbd344d00caa67103026661a3644b9a1f9
Gerrit-Change-Number: 19656
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
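
For reviewers who want a quick mental model of the two planning rounds described in the commit message, here is a minimal sketch. It is not the code in this patch: the class, the GroupSet type, the scanParallelism() and chooseGroupSet() helpers, the "cores" capacity measure, and all numbers are hypothetical stand-ins. It only assumes what the message states: the first round lets scan parallelism follow the processing-cost estimate, capped by the scan range count, and the second round keeps at most MT_DOP scan instances.

```java
import java.util.List;

/** Hypothetical, simplified model of the two-round planning; not Impala code. */
final class RelaxedScanPlanningSketch {

  /** An executor group set reduced to a name and a core count (assumed shape). */
  static final class GroupSet {
    final String name;
    final int totalCores;
    GroupSet(String name, int totalCores) {
      this.name = name;
      this.totalCores = totalCores;
    }
  }

  /**
   * Scan parallelism for one planning round. The cost-based estimate is always
   * capped by the scan range count; when not relaxed it is also capped by MT_DOP.
   */
  static int scanParallelism(long processingCost, long minCostPerThread,
      int scanRangeCount, int mtDop, boolean relaxed) {
    // Ceiling division: number of threads needed to cover the estimated cost.
    long costBased =
        Math.max(1L, (processingCost + minCostPerThread - 1) / minCostPerThread);
    // Never schedule more scan instances than there are scan ranges to read.
    long capped = Math.min(costBased, Math.max(1L, (long) scanRangeCount));
    long result = relaxed ? capped : Math.min(capped, (long) mtDop);
    return (int) result;
  }

  /** Round 1: smallest group set whose cores cover the relaxed parallelism. */
  static GroupSet chooseGroupSet(List<GroupSet> smallestFirst, int relaxedParallelism) {
    for (GroupSet gs : smallestFirst) {
      if (gs.totalCores >= relaxedParallelism) return gs;
    }
    // Nothing is large enough: fall back to the largest group set.
    return smallestFirst.get(smallestFirst.size() - 1);
  }

  public static void main(String[] args) {
    List<GroupSet> groups =
        List.of(new GroupSet("small", 4), new GroupSet("large", 48));
    int mtDop = 4;

    // Round 1: relaxed parallelism drives the choice of executor group set.
    int relaxed = scanParallelism(3_000_000L, 100_000L, 24, mtDop, true);
    GroupSet chosen = chooseGroupSet(groups, relaxed);

    // Round 2: replan for the chosen set with scan parallelism back at MT_DOP.
    int finalParallelism = scanParallelism(3_000_000L, 100_000L, 24, mtDop, false);

    System.out.println(chosen.name + " relaxed=" + relaxed
        + " final=" + finalParallelism);
  }
}
```

With these made-up inputs the cost estimate asks for 30 threads, the 24 scan ranges cap it at 24, the "large" set is chosen, and the final plan runs 4 scan instances. The point of the sketch is that the relaxed estimate influences only which group set is picked, while the plan handed to the scheduler still respects MT_DOP.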