Hello Kurt Deschler, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20024

to look at the new patch set (#6).

Change subject: IMPALA-12192: Fix scaling bug in scan fragment
......................................................................

IMPALA-12192: Fix scaling bug in scan fragment

IMPALA-12091 has a bug where scan fragment parallelism is always
limited solely by the ScanNode cost. If the ScanNode is colocated with
other query node operators that have a higher processing cost, the
Planner will not scale the fragment up beyond what the ScanNode cost
allows.

This patch fixes the problem in two ways. First, it allows the scan
fragment to scale up higher as long as it stays within the total
fragment cost and the number of effective scan ranges. Second, it adds
a missing Math.max() in CostingSegment.java; without it, fragment
parallelism stayed low even when the total fragment cost was high.

This patch also fixes the "max-parallelism" value in the explain string
and turns a constant in ScanNode.rowMaterializationCost() into a
backend flag named scan_range_cost_factor for experimental purposes.

Testing:
- Manually run TPCDS Q84 over tpcds10_parquet and confirm that the
  leftmost scan fragment parallelism is raised from 12 (before patch)
  to 18 (after patch).
- Add a test in PlannerTest.testProcessingCost that reproduces the
  issue.
- Update the compute stats test in test_executor_groups.py to maintain
  the test assertion.
- Pass core tests.
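
For readers who want a concrete picture of the scaling rule described above, here is a minimal Java sketch. It is a toy model, not the actual planner code: the class name, the method names, the COST_PER_INSTANCE constant, and the input numbers are all hypothetical, and the Math.max() shown here only illustrates "use the larger of the two costs" rather than reproducing the specific fix in CostingSegment.java.

```java
// Toy model of the scan fragment scaling rule; all names and numbers are hypothetical.
final class ScanScalingSketch {
    // Assumed amount of processing cost that one fragment instance can absorb.
    static final long COST_PER_INSTANCE = 1_000_000L;

    // Number of instances needed to cover a given cost (ceiling division, at least 1).
    static int instancesForCost(long cost) {
        long n = (cost + COST_PER_INSTANCE - 1) / COST_PER_INSTANCE;
        return (int) Math.max(1L, Math.min(n, (long) Integer.MAX_VALUE));
    }

    // Behavior described as the bug: only the ScanNode cost drives the bound.
    static int parallelismBefore(long scanCost, int maxParallelism) {
        return Math.max(1, Math.min(maxParallelism, instancesForCost(scanCost)));
    }

    // Behavior described as the fix: scale by the total fragment cost,
    // but never beyond the number of effective scan ranges.
    static int parallelismAfter(long scanCost, long totalFragmentCost,
                                int effectiveScanRanges, int maxParallelism) {
        int byCost = instancesForCost(Math.max(scanCost, totalFragmentCost));
        int bounded = Math.min(maxParallelism, Math.min(effectiveScanRanges, byCost));
        return Math.max(1, bounded);
    }

    public static void main(String[] args) {
        // Made-up inputs: a cheap scan colocated with a costlier operator.
        System.out.println(parallelismBefore(4_000_000L, 16));               // prints 4
        System.out.println(parallelismAfter(4_000_000L, 10_000_000L, 8, 16)); // prints 8
    }
}
```

With these made-up inputs the fragment cost alone would justify 10 instances, but the 8 effective scan ranges cap the result, which mirrors the "within total fragment cost and number of effective scan ranges" condition in the description.
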

Change-Id: I7010f6c3bc48ae3f74e8db98a83f645b6c157226
---
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
M tests/custom_cluster/test_executor_groups.py
12 files changed, 277 insertions(+), 151 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/20024/6
--
To view, visit http://gerrit.cloudera.org:8080/20024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7010f6c3bc48ae3f74e8db98a83f645b6c157226
Gerrit-Change-Number: 20024
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>