[ https://issues.apache.org/jira/browse/IMPALA-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839697#comment-17839697 ]
ASF subversion and git services commented on IMPALA-12988:
----------------------------------------------------------

Commit c8149d14127968eb9a4d26a623fa6cc82762216e in impala's branch refs/heads/branch-4.4.0 from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c8149d141 ]

IMPALA-12988: Calculate an unbounded version of CpuAsk

Planner calculates CpuAsk through a recursive call beginning at Planner.computeBlockingAwareCores(), which is called after Planner.computeEffectiveParallelism(). It does blocking operator analysis over the selected degree of parallelism that was decided during the computeEffectiveParallelism() traversal. That selected degree of parallelism, however, is already bounded by the min and max parallelism config, derived from the PROCESSING_COST_MIN_THREADS and MAX_FRAGMENT_INSTANCES_PER_NODE options, respectively.

This patch calculates an unbounded version of CpuAsk that is not bounded by the min and max parallelism config. It is based purely on the fragment's ProcessingCost and the query plan relationship constraints (for example, the number of JOIN BUILDER fragments should equal the number of destination JOIN fragments for a partitioned join).

Frontend will receive both the bounded and unbounded CpuAsk values from TQueryExecRequest on each executor group set selection round. The unbounded CpuAsk is then scaled down once using an nth-root based sublinear function, controlled by the total CPU count of the smallest executor group set and the bounded CpuAsk number. Another linear scaling is then applied to both the bounded and unbounded CpuAsk using the QUERY_CPU_COUNT_DIVISOR option. Frontend then compares the unbounded CpuAsk after scaling against CpuMax to avoid assigning a query to a small executor group set too soon. The last executor group set stays as the "catch-all" executor group set.
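The selection flow described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the commit message does not give the exact nth-root formula, so scale_down_sublinear() below uses an assumed shape (values up to a pivot pass through unchanged, values above it grow as the nth root). The Round type, the function names, the root degree n, and the use of division for QUERY_CPU_COUNT_DIVISOR are all hypothetical choices made for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Round:
    """One executor group set selection round (hypothetical container)."""
    group_set: str          # illustrative group set name
    cpu_max: int            # total CPU count of this group set (CpuMax)
    cpu_ask_bounded: int    # bounded CpuAsk planned against this group set
    cpu_ask_unbounded: int  # unbounded CpuAsk planned against this group set


def scale_down_sublinear(unbounded, bounded, smallest_group_cpus, n=2.0):
    """ASSUMED formula, not Impala's actual one.

    The pivot is the larger of the bounded CpuAsk and the CPU count of the
    smallest executor group set. Asks at or below the pivot are unchanged;
    asks above it grow only as the nth root of their ratio to the pivot.
    """
    pivot = max(bounded, smallest_group_cpus)
    if unbounded <= pivot:
        return unbounded
    return math.ceil(pivot * (unbounded / pivot) ** (1.0 / n))


def select_group_set(rounds, divisor=1.0, n=2.0):
    """Return (group_set, cpu_ask, cpu_ask_bounded) for the chosen round.

    'rounds' must be ordered from the smallest to the largest group set.
    The linear scaling is assumed to be a division by the divisor.
    """
    smallest_group_cpus = rounds[0].cpu_max
    for i, r in enumerate(rounds):
        scaled = scale_down_sublinear(
            r.cpu_ask_unbounded, r.cpu_ask_bounded, smallest_group_cpus, n)
        cpu_ask = math.ceil(scaled / divisor)
        cpu_ask_bounded = math.ceil(r.cpu_ask_bounded / divisor)
        is_last = i == len(rounds) - 1
        # Accept the first group set that fits the scaled unbounded ask;
        # the last group set is the catch-all and is accepted regardless.
        if cpu_ask <= r.cpu_max or is_last:
            return r.group_set, cpu_ask, cpu_ask_bounded


rounds = [
    Round("small", cpu_max=16, cpu_ask_bounded=12, cpu_ask_unbounded=64),
    Round("large", cpu_max=64, cpu_ask_bounded=40, cpu_ask_unbounded=64),
]
print(select_group_set(rounds))
```

In this made-up input the bounded CpuAsk of 12 would fit the 16-core group set, but the scaled unbounded CpuAsk does not, so the query moves on to the larger group set. That is the kind of premature assignment the patch aims to avoid.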
After this patch, setting COMPUTE_PROCESSING_COST=True will show the following changes in the query profile:
- The "max-parallelism" fields in the query plan will all be set to the maximum parallelism based on ProcessingCost.
- The CpuAsk counter is changed to show the unbounded CpuAsk after scaling.
- A new counter CpuAskBounded shows the bounded CpuAsk after scaling. If QUERY_CPU_COUNT_DIVISOR=1 and the PLANNER_CPU_ASK slot counting strategy is selected, this CpuAskBounded is also the minimum total admission slots given to the query.
- A new counter MaxParallelism shows the unbounded CpuAsk before scaling.
- The EffectiveParallelism counter remains unchanged, showing the bounded CpuAsk before scaling.

Testing:
- Update and pass FE tests TpcdsCpuCostPlannerTest and PlannerTest#testProcessingCost.
- Pass EE test tests/query_test/test_tpcds_queries.py
- Pass custom cluster test tests/custom_cluster/test_executor_groups.py

Change-Id: I5441e31088f90761062af35862be4ce09d116923
Reviewed-on: http://gerrit.cloudera.org:8080/21277
Reviewed-by: Kurt Deschler <kdesc...@cloudera.com>
Reviewed-by: Abhishek Rawat <ara...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Calculate an unbounded version of CpuAsk
> ----------------------------------------
>
>                 Key: IMPALA-12988
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12988
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.4.0
>
>
> CpuAsk is calculated through a recursive call beginning at
> Planner.computeBlockingAwareCores(), which is called after
> Planner.computeEffectiveParallelism(). It does blocking operator analysis
> over the selected degree of parallelism that was decided during the
> computeEffectiveParallelism() traversal. That selected degree of parallelism,
> however, is already bounded by the min and max parallelism config, derived from
> the PROCESSING_COST_MIN_THREADS and MAX_FRAGMENT_INSTANCES_PER_NODE options,
> respectively.
> It is beneficial to have another version of CpuAsk that is not bounded by the
> min and max parallelism config. It should be based purely on the fragment's
> ProcessingCost and the query plan relationship constraints (e.g., the number of
> JOIN BUILDER fragments should equal the number of JOIN fragments for a
> partitioned join). During executor group set selection, Frontend should use the
> unbounded CpuAsk number to avoid assigning a query to a small executor group
> set prematurely.