Hello Aman Sinha, Abhishek Rawat, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/23258

to look at the new patch set (#2).

Change subject: IMPALA-14263: Add broadcast_cost_scale to tweak broadcast join cost
......................................................................

IMPALA-14263: Add broadcast_cost_scale to tweak broadcast join cost

This commit enhances the distributed planner's costing model for
broadcast joins by introducing the `broadcast_cost_scale` query option.
This option enables users to fine-tune the planner's decision between
broadcast and partitioned joins.

Key changes:
- The total broadcast cost is scaled by the new `broadcast_cost_scale`
  query option, allowing users to favor or penalize broadcast joins as
  needed when setting a query hint is not feasible.
- Updated the planner logic and test cases to reflect the new costing
  model and options.

This addresses scenarios where the default costing could lead to
suboptimal join distribution choices, particularly in a large-scale
cluster where the number of executors can increase broadcast cost, while
choosing a partitioned strategy can lead to data skew. An admin can set
`broadcast_cost_scale` to a value less than 1.0 to make DistributedPlanner
favor broadcast joins over partitioned joins (with the possible downside
of higher memory usage per query and more network transmission).
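
To illustrate the effect described above, here is a minimal sketch in Java.
It is not the actual DistributedPlanner code: the class `JoinCostSketch`,
its methods, and both cost formulas are hypothetical simplifications. It
assumes only that broadcast cost grows with the number of executors and
that the total broadcast cost is multiplied by `broadcast_cost_scale`.

```java
// Hypothetical, simplified cost model for illustration only.
// The real formulas in DistributedPlanner.java are more involved.
public class JoinCostSketch {

    /** Assumed broadcast cost: the build side is sent to every executor, then scaled. */
    static double broadcastCost(double buildBytes, int numExecutors,
                                double broadcastCostScale) {
        return buildBytes * numExecutors * broadcastCostScale;
    }

    /** Assumed partitioned cost: both join inputs are shuffled once. */
    static double partitionedCost(double probeBytes, double buildBytes) {
        return probeBytes + buildBytes;
    }

    /** Returns true when the scaled broadcast cost does not exceed the partitioned cost. */
    static boolean prefersBroadcast(double probeBytes, double buildBytes,
                                    int numExecutors, double broadcastCostScale) {
        return broadcastCost(buildBytes, numExecutors, broadcastCostScale)
                <= partitionedCost(probeBytes, buildBytes);
    }

    public static void main(String[] args) {
        double probeBytes = 10e9;   // 10 GB probe side (made-up figure)
        double buildBytes = 200e6;  // 200 MB build side (made-up figure)
        int numExecutors = 100;     // large cluster inflates broadcast cost

        // Scale 1.0: broadcast cost 2e10 exceeds partitioned cost 1.02e10.
        System.out.println(prefersBroadcast(probeBytes, buildBytes, numExecutors, 1.0));
        // Scale 0.25: broadcast cost 5e9 is now below 1.02e10.
        System.out.println(prefersBroadcast(probeBytes, buildBytes, numExecutors, 0.25));
    }
}
```

Under these made-up numbers, a scale of 1.0 selects the partitioned join,
while a scale of 0.25 tips the same comparison toward broadcast.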

Existing query hints still take precedence over this option. Note that
this option is applied whether or not the `broadcast_to_partition_factor`
option takes effect (see IMPALA-10287). In an MT_DOP>0 setup, it should be
sufficient to set `use_dop_for_costing=True` and tune
`broadcast_to_partition_factor` only.

Testing:
Added FE tests.

Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-dist-method.test
6 files changed, 561 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/23258/2
--
To view, visit http://gerrit.cloudera.org:8080/23258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
Gerrit-Change-Number: 23258
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
