Hello Aman Sinha, Abhishek Rawat, Wenzhe Zhou, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/23258
to look at the new patch set (#4).
Change subject: IMPALA-14263: Add broadcast_cost_scale_factor option
......................................................................
IMPALA-14263: Add broadcast_cost_scale_factor option
This commit enhances the distributed planner's costing model for
broadcast joins by introducing the `broadcast_cost_scale_factor` query
option. This option enables users to fine-tune the planner's decision
between broadcast and partitioned joins.
Key changes:
- The total broadcast cost is scaled by the new
`broadcast_cost_scale_factor` query option, allowing users to favor or
penalize broadcast joins as needed when setting a query hint is not
feasible.
- Updated the planner logic and test cases to reflect the new costing
model and options.
This addresses scenarios where the default costing could lead to
suboptimal join distribution choices, particularly in a large-scale
cluster where the number of executors can increase broadcast cost, while
choosing a partitioned strategy can lead to data skew. An admin can set
`broadcast_cost_scale_factor` to less than 1.0 to make DistributedPlanner
favor broadcast joins over partitioned joins (with the possible downside
of higher memory usage per query and more network traffic).
Existing query hints still take precedence over this option. Note that
this option is applied independently of the
`broadcast_to_partition_factor` option (see IMPALA-10287). In an
MT_DOP>1 setup, it should be sufficient to set
`use_dop_for_costing=True` and tune `broadcast_to_partition_factor`
only.
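As a rough illustration of the behavior described above, the sketch
below shows how a scale factor below 1.0 can flip the choice from
partitioned to broadcast, and how a hint still wins. The class, the
method names, and both cost formulas are hypothetical simplifications
made up for this example; they are not the actual DistributedPlanner
code or Impala's real cost model.

```java
// Hypothetical, simplified model; NOT the real DistributedPlanner logic.
final class BroadcastCostSketch {
    enum DistMode { BROADCAST, PARTITIONED }

    // Assumed cost: the build side is shipped to every executor,
    // then the total is multiplied by the scale factor.
    static double broadcastCost(double rhsBytes, int numExecutors,
                                double scaleFactor) {
        return scaleFactor * rhsBytes * numExecutors;
    }

    // Assumed cost: both join inputs are repartitioned once.
    static double partitionedCost(double lhsBytes, double rhsBytes) {
        return lhsBytes + rhsBytes;
    }

    // A non-null hint takes precedence over the cost comparison.
    static DistMode choose(DistMode hint, double lhsBytes, double rhsBytes,
                           int numExecutors, double scaleFactor) {
        if (hint != null) {
            return hint;
        }
        double bcast = broadcastCost(rhsBytes, numExecutors, scaleFactor);
        double part = partitionedCost(lhsBytes, rhsBytes);
        return bcast <= part ? DistMode.BROADCAST : DistMode.PARTITIONED;
    }

    public static void main(String[] args) {
        // Made-up sizes: 1000-byte probe side, 100-byte build side,
        // 50 executors.
        System.out.println(choose(null, 1000.0, 100.0, 50, 1.0));
        System.out.println(choose(null, 1000.0, 100.0, 50, 0.2));
        System.out.println(
            choose(DistMode.PARTITIONED, 1000.0, 100.0, 50, 0.2));
    }
}
```

In this toy model, a large executor count inflates the broadcast cost,
and lowering the factor counteracts that without a per-query hint.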
Testing:
Added FE tests.
Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-dist-method.test
6 files changed, 454 insertions(+), 3 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/23258/4
--
To view, visit http://gerrit.cloudera.org:8080/23258
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
Gerrit-Change-Number: 23258
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>