Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
......................................................................


Patch Set 38:

(17 comments)

Thank you for your review, Qifan!

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@266
PS37, Line 266: cheduler::CheckEffective
> nit. is positive
Done. Moved to scheduler.h.


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@272
PS37, Line 272:         || IsExceedMaxFsWriters(fragment_state, state)) {
> Is it possible to exclude the checking for scan fragment and certain table
Yes! Consolidated the case checking in this method.


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc@201
PS37, Line 201: 0.5
> Just wonder why not directly use a value of 2 here.
My intention is to associate the scaling flag to CPU requirement of the query, 
not the total CPU cores of executor group set.
Hence, this means halving the CpuRequirement returned by costing algorithm 
(oversubscribing the executor group).


http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift@87
PS37, Line 87: //
> Miss the comment here
Added comment.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java@473
PS37, Line 473: public long getNumDistinctValues() { return n
> This is unused. However, this point me to existing evalCost_ added in IMPAL
Removed.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@364
PS37, Line 364: 'partitionByEq_' and 'orderBy
> This does not match with the code at line 368: ExprUtil.computeExprCost(par
Removed partitionByEq_ from cost calculation.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@365
PS37, Line 365: titioned and sorte
> This does not match with the code at line 368 ExprUtil.computeExprCost(orde
Removed orderByEq_ from cost calculation.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@33
PS37, Line 33: childP
> May use the name childProcessingCost_  or add a comment to indicate this is
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47
PS37, Line 47: onditions.checkState(
> Code duplication with line at 40.
Replaced preconditions in line 40 with single call to getMultiple()


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java
File fe/src/main/java/org/apache/impala/planner/CoreRequirement.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@29
PS37, Line 29:
> Comment is missing.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@31
PS37, Line 31: * A container class that represent CPU core requirement of 
certain subtree of a query or
             :  * the query itself.
             :  */
> A comment for each data member.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@49
PS37, Line 49: Set in computeProc
> nit Set in computeProcessingCost() for a meaningful value.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@131
PS37, Line 131: ws p
> nit. data fields in
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@141
PS37, Line 141: setNumRowToProduce(Mat
> This is missing in the comment at line 131 above.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java@26
PS37, Line 26: multiplie
> multiplier_ is better.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@1025
PS37, Line 1025: ocessingCost totalChildCost = totalExchangingChildCost;
> I'm having a second thought on this join-build recomputation on non-shared
Refactored this recursion in ps38 to traverse non-join build child first. This 
result in recompute elimination.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/ProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java@40
PS37, Line 40: public static long perInstanceCost(ProcessingCost cost,
> This should be a backend flag instead of constant so that tuning is possibl
Removed. This is now controlled by min_processing_cost_per_thread flag.



--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 38
Gerrit-Owner: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 31 Jan 2023 21:15:39 +0000
Gerrit-HasComments: Yes

Reply via email to