Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
......................................................................


Patch Set 49:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/19033/43//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19033/43//COMMIT_MSG@213
PS43, Line 213: overlapping between fragment execution and blocking operators. We
> The upper bound for each fragment should be the number of threads or someth
min_processing_per_thread=10M seems to be a good upper bound on my local
machine.
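
To illustrate how a threshold like min_processing_per_thread could bound
parallelism, here is a minimal sketch. It assumes the fragment's instance count
is derived as its total processing cost divided by the threshold, rounded up
and clamped to a thread limit. The class name, method name, and the clamping
rule are hypothetical and are not taken from the patch.

```java
public final class ParallelismSketch {
  /**
   * Hypothetical helper: upper bound on instances for one fragment.
   * Assumes each instance should handle at least minProcessingPerThread
   * cost units, and that threadLimit caps the result.
   */
  public static int maxInstances(
      long fragmentCost, long minProcessingPerThread, int threadLimit) {
    if (fragmentCost < 0 || minProcessingPerThread <= 0 || threadLimit < 1) {
      throw new IllegalArgumentException("invalid arguments");
    }
    // Ceiling division written to avoid overflow for large costs.
    long byCost = fragmentCost / minProcessingPerThread
        + (fragmentCost % minProcessingPerThread == 0 ? 0 : 1);
    // Always keep at least one instance; never exceed the thread limit.
    long clamped = Math.max(1L, Math.min(byCost, (long) threadLimit));
    return (int) clamped;
  }

  public static void main(String[] args) {
    // A fragment costing 25M units with a 10M threshold and 8 threads.
    System.out.println(maxInstances(25_000_000L, 10_000_000L, 8));
  }
}
```

Under these assumptions, a larger threshold lowers the instance count for
cheap fragments while leaving expensive fragments capped by the thread limit.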


http://gerrit.cloudera.org:8080/#/c/19033/47//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19033/47//COMMIT_MSG@140
PS47, Line 140: a subtree of PlanNodes/DataSink in the fragment with a DataSink or
Added a SegmentCost class as the segment abstraction.
Also added TPCDS-Q49 to tpcds-processing-cost.test to test against a union
fragment.


http://gerrit.cloudera.org:8080/#/c/19033/48//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19033/48//COMMIT_MSG@325
PS48, Line 325: sing_per_th
> As the comments in https://github.com/apache/impala/blob/master/fe/src/main
My thought as well. Should we revert the default back to True?


http://gerrit.cloudera.org:8080/#/c/19033/48//COMMIT_MSG@346
PS48, Line 346:
> Could you attach the bench mark which show effective parallelism improvemen
Our single-node benchmark script mainly measures query latency. I don't expect
this patch to improve query latency, since the default combination of all the
new query options and backend flags will actually reduce parallelism in some
fragments rather than increase it. As long as latency does not regress
severely compared to regular MT_DOP mode, I take it as a good outcome.

The improvement is probably best expressed as a reduction in memory usage and
thread count.


http://gerrit.cloudera.org:8080/#/c/19033/43/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/43/be/src/util/backend-gflag-util.cc@210
PS43, Line 210:
> Would rather keep this as cost as cost of a row is a highly variable metric
Changed to min_processing_per_thread in ps49.


http://gerrit.cloudera.org:8080/#/c/19033/43/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/43/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@263
PS43, Line 263:   // Returns the total estimated size (in bytes) of the row batch queues by
> This assume the total cost for a row batch is 1. Is it right estimation?
Changed in ps49 to model the cost as 1 per 1KB of average serialized row size.
That seems good enough to increase DataStreamSink and ExchangeNode cost.
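
As a rough illustration of the "1 per 1KB" model, the sketch below computes a
per-row cost from the average serialized row size and scales it by row count.
The class and method names are hypothetical, and the patch may round or weight
the value differently.

```java
public final class RowCostSketch {
  // One cost unit per 1KB (1024 bytes) of average serialized row size.
  private static final double BYTES_PER_COST_UNIT = 1024.0;

  /** Hypothetical helper: cost of processing a single row. */
  public static double costPerRow(double avgSerializedRowSizeBytes) {
    if (avgSerializedRowSizeBytes < 0) {
      throw new IllegalArgumentException("row size must be non-negative");
    }
    return avgSerializedRowSizeBytes / BYTES_PER_COST_UNIT;
  }

  /** Hypothetical helper: total cost for numRows rows of the given size. */
  public static double totalCost(long numRows, double avgSerializedRowSizeBytes) {
    return numRows * costPerRow(avgSerializedRowSizeBytes);
  }

  public static void main(String[] args) {
    // One million rows averaging 512 bytes each.
    System.out.println(totalCost(1_000_000L, 512.0));
  }
}
```

With this model, wider rows raise the cost of DataStreamSink and ExchangeNode
in proportion to the bytes they move, instead of charging a flat 1 per row.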


http://gerrit.cloudera.org:8080/#/c/19033/48/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/19033/48/fe/src/main/java/org/apache/impala/planner/Planner.java@470
PS48, Line 470: ot = postOrderFra
> nit: this result seems not used now. Add "TODO" comment
Done


http://gerrit.cloudera.org:8080/#/c/19033/49/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/ProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/49/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java@20
PS49, Line 20: com.google.cloud.hadoop.repackaged.gcs.com.google.common.math.LongMath
Does not look like the right class to import.


http://gerrit.cloudera.org:8080/#/c/19033/48/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/48/fe/src/main/java/org/apache/impala/planner/ScanNode.java@359
PS48, Line 359:
> In ExchangeNode.estimateProcessingCostPerRow(), the cost per row is calcula
Changed in ps49 to model the cost as 1 per 1KB of average row size.



--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 49
Gerrit-Owner: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 17 Feb 2023 00:32:47 +0000
Gerrit-HasComments: Yes
