Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: IMPALA-11604 Planner changes for CPU usage ...................................................................... Patch Set 38: (17 comments) Thank you for your review, Qifan! http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@266 PS37, Line 266: cheduler::CheckEffective > nit. is positive Done. Moved to scheduler.h. http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@272 PS37, Line 272: || IsExceedMaxFsWriters(fragment_state, state)) { > Is it possible to exclude the checking for scan fragment and certain table Yes! Consolidated the case checking in this method. http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc File be/src/util/backend-gflag-util.cc: http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc@201 PS37, Line 201: 0.5 > Just wonder why not directly use a value of 2 here. My intention is to associate the scaling flag to CPU requirement of the query, not the total CPU cores of executor group set. Hence, this means halving the CpuRequirement returned by costing algorithm (oversubscribing the executor group). http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift@87 PS37, Line 87: // > Miss the comment here Added comment. http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java@473 PS37, Line 473: public long getNumDistinctValues() { return n > This is unused. However, this point me to existing evalCost_ added in IMPAL Removed. http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@364 PS37, Line 364: 'partitionByEq_' and 'orderBy > This does not match with the code at line 368: ExprUtil.computeExprCost(par Removed partitionByEq_ from cost calculation. http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@365 PS37, Line 365: titioned and sorte > This does not match with the code at line 368 ExprUtil.computeExprCost(orde Removed orderByEq_ from cost calculation. http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@33 PS37, Line 33: childP > May use the name childProcessingCost_ or add a comment to indicate this is Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47 PS37, Line 47: onditions.checkState( > Code duplication with line at 40. Replaced preconditions in line 40 with single call to getMultiple() http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java File fe/src/main/java/org/apache/impala/planner/CoreRequirement.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@29 PS37, Line 29: > Comment is missing. Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@31 PS37, Line 31: * A container class that represent CPU core requirement of certain subtree of a query or : * the query itself. : */ > A comment for each data member. Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java File fe/src/main/java/org/apache/impala/planner/DataSink.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@49 PS37, Line 49: Set in computeProc > nit Set in computeProcessingCost() for a meaningful value. Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@131 PS37, Line 131: ws p > nit. data fields in Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@141 PS37, Line 141: setNumRowToProduce(Mat > This is missing in the comment at line 131 above. Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java File fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java@26 PS37, Line 26: multiplie > multiplier_ is better. Done http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@1025 PS37, Line 1025: ocessingCost totalChildCost = totalExchangingChildCost; > I'm having a second thought on this join-build recomputation on non-shared Refactored this recursion in ps38 to traverse non-join build child first. This result in recompute elimination. http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java File fe/src/main/java/org/apache/impala/planner/ProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java@40 PS37, Line 40: public static long perInstanceCost(ProcessingCost cost, > This should be a backend flag instead of constant so that tuning is possibl Removed. This is now controlled by min_processing_cost_per_thread flag. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 38 Gerrit-Owner: Qifan Chen <qfc...@hotmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Tue, 31 Jan 2023 21:15:39 +0000 Gerrit-HasComments: Yes