[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19033 )


Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed per core to be used as a new factor in the definition
and selection of an executor group.

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
6 files changed, 83 insertions(+), 20 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/1
--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11415/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 22 Sep 2022 21:04:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is evaluated as follows.

  input cardinality * average row size / number of instances
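
For illustration, the estimate above could be computed along these
lines. This is a minimal sketch, not code from the patch: the class
name, the method name, and the convention of returning -1 for unknown
inputs are all assumptions.

```java
// Hypothetical helper for illustration; none of these names come from the patch.
final class DataProcessedSketch {
  // Estimated bytes processed per instance:
  //   input cardinality * average row size / number of instances.
  // Returns -1 when an input is unknown or invalid (an assumed convention).
  static long perInstanceBytes(long inputCardinality, float avgRowSize, int numInstances) {
    if (inputCardinality < 0 || numInstances <= 0) return -1;
    double totalBytes = (double) inputCardinality * avgRowSize;
    return (long) Math.ceil(totalBytes / numInstances);
  }

  public static void main(String[] args) {
    // 1,000,000 rows of 64 bytes each, spread over 8 instances.
    System.out.println(perInstanceBytes(1_000_000L, 64.0f, 8));
  }
}
```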

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
25 files changed, 158 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 4:

Basic functionality of CPU-usage-based auto-scaling.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Fri, 23 Sep 2022 16:38:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11423/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Fri, 23 Sep 2022 16:59:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is the sum of the cpu usage of every
fragment, which in turn is the sum of the cpu usage of every node in
the fragment. For each node, the cpu usage is evaluated as

  input cardinality * average row size / number of instances
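
The roll-up described above (node to fragment to query) might look
roughly like the following. This is only a sketch under stated
assumptions: Node and the list-based fragments are hypothetical
stand-ins, not the planner's PlanNode or PlanFragment classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for plan nodes and fragments; not the patch's classes.
final class QueryDataProcessedSketch {
  static final class Node {
    final long inputCardinality;
    final float avgRowSize;
    final int numInstances;

    Node(long inputCardinality, float avgRowSize, int numInstances) {
      this.inputCardinality = inputCardinality;
      this.avgRowSize = avgRowSize;
      this.numInstances = numInstances;
    }

    // Per-node estimate: input cardinality * average row size / instances.
    long dataProcessed() {
      return (long) ((double) inputCardinality * avgRowSize / numInstances);
    }
  }

  // A fragment's estimate is the sum over its nodes.
  static long fragmentDataProcessed(List<Node> fragment) {
    long sum = 0;
    for (Node n : fragment) sum += n.dataProcessed();
    return sum;
  }

  // The query's estimate is the sum over its fragments.
  static long queryDataProcessed(List<List<Node>> fragments) {
    long sum = 0;
    for (List<Node> f : fragments) sum += fragmentDataProcessed(f);
    return sum;
  }

  public static void main(String[] args) {
    List<Node> scanFragment =
        Arrays.asList(new Node(1000, 10.0f, 2), new Node(400, 5.0f, 1));
    List<Node> joinFragment = Arrays.asList(new Node(100, 8.0f, 4));
    System.out.println(queryDataProcessed(Arrays.asList(scanFragment, joinFragment)));
  }
}
```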

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
25 files changed, 170 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 5:

Patch set 5 improves how the number of instances is used to compute cpu usage 
for scan nodes, taking the query option mt_dop into account.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Fri, 23 Sep 2022 18:31:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/11427/ : Initial code 
review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 23 Sep 2022 18:44:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift@873
PS5, Line 873: cpu_usage_bytes
cpu usage is proportional to the total bytes of data for the query. Should we 
consider other factors, like the operators or functions applied on the data?


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2147
PS5, Line 2147: int numOfInstancesToComputeCpuUsage =
  : (queryOptions.getMt_dop() >= 1) > numInstances_ : maxScannerThreads;
  :
  : (queryOptions.getMt_dop() >= 1) ? long cpuUsageBytesPerInstance = inputCardinality_
  : * ((long) (getAvgRowSize())) / numOfInstancesToComputeCpuUsage;
I don't understand these statements.


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@421
PS5, Line 421: >
?


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@96
PS5, Line 96: probeCpuUsage = probePerInstanceMemEstimate;
: long buildCpuUsage
Why are these two variables needed?




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 25 Sep 2022 13:30:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is the sum of the cpu usage of every
fragment, which in turn is the sum of the cpu usage of every node in
the fragment. For each node, the cpu usage is evaluated as

  input cardinality * expression evaluation cost per row
    * average row size / number of instances
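
As a rough illustration of how the new cost factor enters the
estimate, consider the sketch below. The names are assumptions made
for this example only, and how the patch actually derives the
per-row evaluation cost from Expr is not shown here.

```java
// Illustrative only; the names below are assumptions, not the patch's API.
final class CostedDataProcessedSketch {
  // input cardinality * expression evaluation cost per row
  //   * average row size / number of instances
  static long perInstance(
      long inputCardinality, float evalCostPerRow, float avgRowSize, int numInstances) {
    double total = (double) inputCardinality * evalCostPerRow * avgRowSize;
    return (long) (total / numInstances);
  }

  public static void main(String[] args) {
    // A cost of 1.0 leaves the size-only estimate unchanged; 2.0 doubles it.
    System.out.println(perInstance(1000, 1.0f, 16.0f, 4));
    System.out.println(perInstance(1000, 2.0f, 16.0f, 4));
  }
}
```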

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
27 files changed, 202 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 6:

(4 comments)

Addressed the review comments and improved the computation for scan nodes and the exchange node.

http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift@873
PS5, Line 873: cpu_usage_bytes
> cpu usage is proportional to the total bytes of data for the query. Should
Yes, the evaluation cost per row is being worked on.


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2147
PS5, Line 2147: long cpuUsageBytesPerInstance = computeCpuUsageBytes(numOfInstancesToComputeCpuUsage);
  :
  : nodeResourceProfile_ = new ResourceProfileBuilder()
  : .setMemEstimateBytes(perInstanceMemEstimate)
  : .setMinMemReservationBytes(computeMinMemReservation(
> I don't understand these statements.
Sorry about the copy/paste error. The formula has been improved further. Done.


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@421
PS5, Line 421:
> ?
Fixed.


http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@96
PS5, Line 96: probeCpuUsage = probePerInstanceMemEstimate / numInstances_;
: long buildCpuUsage
> Why are these two variables needed?
Good catch. Fixed.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 26 Sep 2022 19:06:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11446/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 26 Sep 2022 19:28:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is the sum of the cpu usage of every
fragment, which in turn is the sum of the cpu usage of every node in
the fragment. For each node, the cpu usage is computed as

  I * C * W / N

  where I is the input cardinality,
        C is the expression evaluation cost per row,
        W is the average row size, and
        N is the number of instances.

A description of the computation for each kind of plan node is
listed below.

1. HDFS and Kudu scans:
   N is mt_dop when query option mt_dop >= 1; otherwise
   N is number of nodes * max scan threads.

2. HBase scans:
   N is 1.

3. Hash join and nested loop joins:
   probe side cpu usage:
   build side cpu usage:
     C is the sum of the evaluation cost for equi-join predicates and
     the evaluation cost for other join predicates.
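
The choice of N for scan nodes listed above can be sketched as
follows. The method and parameter names are hypothetical and chosen
for this example; only the rule itself (mt_dop when mt_dop >= 1,
otherwise number of nodes * max scan threads, and 1 for HBase) comes
from the description.

```java
// Sketch of the choice of N described above; method names are hypothetical.
final class ScanInstancesSketch {
  // HDFS and Kudu scans: mt_dop when mt_dop >= 1, otherwise
  // number of nodes * max scan threads.
  static int hdfsOrKuduInstances(int mtDop, int numNodes, int maxScanThreads) {
    return mtDop >= 1 ? mtDop : numNodes * maxScanThreads;
  }

  // HBase scans: N is always 1.
  static int hbaseInstances() {
    return 1;
  }

  public static void main(String[] args) {
    System.out.println(hdfsOrKuduInstances(4, 10, 16)); // mt_dop set
    System.out.println(hdfsOrKuduInstances(0, 10, 16)); // mt_dop not set
    System.out.println(hbaseInstances());
  }
}
```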

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
27 files changed, 234 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 7:

Refined the formula for scans, hash joins, and nested loop joins.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 01:56:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11448/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 02:20:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift@747
PS7, Line 747: The optional max_cpu_usage_limit to determine which executor 
group set to run for
 :   // a query.
Add a more detailed description of how this variable is measured.


http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift@872
PS7, Line 872: stimated cpu usage per instance in bytes
Traditionally, CPU usage is measured in time. Here we measure it by the number 
of bytes processed, so the variable name "cpu_usage_bytes" is not very clear 
to me.
In the future, we may also consider CPU frequency.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 07:44:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is the sum of that in every fragment,
which is the sum of the data processed in every node in the fragment.
For each node, the data processed is computed as

  I * C * W / N

  where I is the input cardinality,
        C is the expression evaluation cost per row,
        W is the average row size, and
        N is the number of instances.

A description of the computation for each kind of plan node is
listed below.

1. HDFS and Kudu scan nodes:
   N is mt_dop when query option mt_dop >= 1; otherwise
   N is number of nodes * max scan threads.

2. HBase scan nodes:
   N is 1.

3. Hash join and nested loop join nodes:
   C is the sum of the evaluation costs for the equi-join predicates
   and for the other join predicates, for both the probe and build
   sides.

4. Aggregation nodes:
   C and W are the sums of the costs and partial row widths,
   respectively, over each aggregate (AggregateInfo).
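
For aggregation nodes, the summation of C and W could be sketched as
below. AggInfo here is a hypothetical stand-in with two plain fields;
it is not the real AggregateInfo class, whose accessors are not shown
in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch; AggInfo is a hypothetical stand-in for AggregateInfo.
final class AggregationCostSketch {
  static final class AggInfo {
    final float evalCost;        // evaluation cost per row for this aggregate
    final float partialRowWidth; // partial row width in bytes for this aggregate

    AggInfo(float evalCost, float partialRowWidth) {
      this.evalCost = evalCost;
      this.partialRowWidth = partialRowWidth;
    }
  }

  // I * C * W / N, where C and W are summed over all aggregates.
  static long perInstance(long inputCardinality, List<AggInfo> aggInfos, int numInstances) {
    double c = 0;
    double w = 0;
    for (AggInfo a : aggInfos) {
      c += a.evalCost;
      w += a.partialRowWidth;
    }
    return (long) ((double) inputCardinality * c * w / numInstances);
  }

  public static void main(String[] args) {
    List<AggInfo> aggs = Arrays.asList(new AggInfo(1.0f, 8.0f), new AggInfo(2.0f, 16.0f));
    // C = 3, W = 24, I = 100, N = 2.
    System.out.println(perInstance(100, aggs, 2));
  }
}
```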

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
29 files changed, 303 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift@743
PS8, Line 743:   // a value for each of the limit variable dimension and 
compares it with the limit. An
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java@1862
PS8, Line 1862:   i = Long.compare(e1.getMax_data_processed_limit(), 
e2.getMax_data_processed_limit());
line too long (95 > 90)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 19:16:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 8:

(2 comments)

Reworked and improved the calculation for AggregationNode.

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift@747
PS7, Line 747: The memory limit variable provides the per host estimated-memory 
limit.
 :   4: optional
> Add the more detailed description for measuring this variable
Done


http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift@872
PS7, Line 872: stimated total data processed per instan
> Traditionally CPU usage is measured in time. Here we use computing/processi
Yeah. Calling it CPU usage is somewhat implicit. Changed the name to data 
processed throughout the patch.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 19:16:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data
processed to be used as a new factor in the definition and
selection of an executor group.

The amount of data processed is the sum of that in every fragment,
which is the sum of data processed in every node in the fragment.
For each node, the data processed is computed as

  I * C * W / N

  where I is input cardinality
        C is expression evaluation cost per row
        W is average row size
        N is number of instances

A description of computation for each kind of plan node is listed
below.

1. Hdfs and Kudu scan nodes:
N is mt_dop when query option mt_dop >= 1, otherwise
N is number of nodes * max scan threads;

2. Hbase scan nodes:
N is 1

3. Hash join and nested loop join nodes:
C is sum of the evaluation cost for equi-join predicate and for
other join predicate, for both probe and build side;

4. Aggregation nodes:
C and W are the sum of the costs and partial row widths for each
aggregate (AggregateInfo).
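
The general formula and the scan-node rule for N described above can be sketched in Java as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, and clamping a non-positive instance count to 1 is an assumption made to keep the division safe.

```java
// Hypothetical sketch of the per-node formula I * C * W / N.
// Names are illustrative and do not come from the patch.
final class DataProcessedSketch {
  private DataProcessedSketch() {}

  /** Data processed by one node: I * C * W / N. */
  static double nodeDataProcessed(long inputCardinality, double evalCostPerRow,
      double avgRowSize, int numInstances) {
    // Assumption: treat a non-positive instance count as 1 to avoid dividing by zero.
    int n = Math.max(numInstances, 1);
    return inputCardinality * evalCostPerRow * avgRowSize / n;
  }

  /** N for HDFS and Kudu scans: mt_dop if >= 1, else nodes * max scan threads. */
  static int scanInstances(int mtDop, int numNodes, int maxScanThreads) {
    return mtDop >= 1 ? mtDop : numNodes * maxScanThreads;
  }

  /** Data processed by a fragment: the sum over its nodes. */
  static double fragmentDataProcessed(double[] perNode) {
    double total = 0.0;
    for (double d : perNode) {
      total += d;
    }
    return total;
  }
}
```

For example, a node with 1000 input rows, a per-row cost of 2.0, an average row size of 8.0 and 4 instances would contribute 1000 * 2.0 * 8.0 / 4 = 4000 under this sketch.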

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
29 files changed, 305 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/9

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift@743
PS8, Line 743:   // the limit dimension and compares it with the limit. An 
executor group is chosen
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java@1862
PS8, Line 1862:   i = Long.compare(
> line too long (95 > 90)
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 19:25:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11459/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 8
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 19:36:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11460/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 27 Sep 2022 19:45:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-28 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the weighted amount
of data processed to be used as a new factor in the definition
and selection of an executor group.

The weighted amount of data processed is the sum of that in every
fragment, which is the sum of that in every node in the fragment.
For each node, the weighted amount of data processed is computed
with a general formula as follows.

  D = I * C * W / N

  where D is the weighted amount of data processed
        I is input cardinality
        C is expression evaluation cost per row
        W is average row size
        N is number of instances

A description of the computation for each kind of plan node is
given below.

1. Aggregation node:
C and W are the sum of the costs and partial row widths for each
AggregateInfo object.

2. AnalyticEval node:
C is the sum of the evaluation costs for the analytic functions and
for the partition-by and order-by equality predicates;

3. CardinalityCheck node:
Both C and I are 1;

4. DataSource scan node:
C is computed from a subset of the selection predicates excluding
data source accepted predicates;

5. EmptySet node:
I is 0;

6. Exchange node:
A modification of the general formula when in broadcast mode:
D = (I * C * W / N) * number of receivers;

7. Hash join node:
C is sum of the evaluation cost for equi-join predicate and for
other join predicate, for both probe and build side;

8. Hbase scan node:
N is 1

9. Hdfs and Kudu scan node:
N is mt_dop when query option mt_dop >= 1, otherwise
N is number of nodes * max scan threads;

10. Nested loop join node:
When the right child is not a SingularRowSrc node, C is sum of
the evaluation cost for equi-join predicate and for other join
predicate, for both probe and build side.

When the right child is a SingularRowSrc node, the cost for build
side is multiplied by the cardinality from the probe side;

11. Select node:
Use the general formula;

12. SingularRowSrc node:
I is 1. Since the node is involved once per input in nested loop
join, the total cost of this node is computed in nested loop join.

13. Sort node:

14. Subplan node:

15. Union node:

16. Unnest node:
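
Two of the node-specific variations listed above, the broadcast exchange and the nested loop join with a SingularRowSrc right child, can be sketched as below. The names are hypothetical and the nested loop join method is one possible reading of the description (probe-side cost plus build-side cost, with the build side scaled by the probe cardinality); it is not taken from the patch.

```java
// Hypothetical sketch of two node-specific variations; names are illustrative only.
final class NodeVariationSketch {
  private NodeVariationSketch() {}

  /** Exchange node: in broadcast mode, D = (I * C * W / N) * number of receivers. */
  static double exchangeDataProcessed(double generalResult, boolean isBroadcast,
      int numReceivers) {
    return isBroadcast ? generalResult * numReceivers : generalResult;
  }

  /**
   * Nested loop join: C as probe-side cost plus build-side cost. Assumption: when the
   * right child is a SingularRowSrc node, only the build-side cost is scaled by the
   * probe cardinality.
   */
  static double nestedLoopJoinEvalCost(double probeSideCost, double buildSideCost,
      boolean rightChildIsSingularRowSrc, long probeCardinality) {
    double build = rightChildIsSingularRowSrc
        ? buildSideCost * probeCardinality
        : buildSideCost;
    return probeSideCost + build;
  }
}
```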

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
---
M common/thrift/Frontend.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
29 files changed, 336 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/10

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-28 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 10:

Added notes to the commit message about each node (up to SingularRowSrc) and 
improved the corresponding classes.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 28 Sep 2022 17:10:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11470/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 28 Sep 2022 17:30:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage

2022-09-28 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..


Patch Set 10:

(7 comments)

Good start. I think the following changes are necessary and worth spending 
extra time on:
1) Fragment cost should conceptually be an amount of time required to process 
the estimated data for the fragment. That way, we can divide by cores available 
and factor in SLA times through direct calculations. Any functions and 
variables should probably be named like CPUCost or ProcessingCost.
2) It's probably worth spending time making a new set of Expr costing functions 
instead of multiplying the EXPR constants by the row width. Comments in 
Expr.java state that those constants are not suitable for absolute cost 
computation. It should be reasonable to add a mechanism to do actual time 
measurements for Expr evaluation during init and use measurements to compute 
accurate time costs.
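
As a rough illustration of point 2, a timing measurement taken at initialization could look like the sketch below. It is a generic Java micro-measurement and not Impala's Expr API: the `LongUnaryOperator` merely stands in for an expression, and the class name, warm-up count and iteration count are all assumptions.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical sketch: estimate the time cost of one evaluation by timing many calls.
final class ExprTimingSketch {
  // Written after the timed loop so the JIT cannot discard the evaluations.
  static volatile long sink;

  private ExprTimingSketch() {}

  /** Returns the average wall-clock nanoseconds per call of the stand-in expression. */
  static double nanosPerCall(LongUnaryOperator expr, int warmupCalls, int timedCalls) {
    long acc = 0;
    // Warm up so that later calls are more likely to run compiled code.
    for (int i = 0; i < warmupCalls; i++) {
      acc += expr.applyAsLong(i);
    }
    long start = System.nanoTime();
    for (int i = 0; i < timedCalls; i++) {
      acc += expr.applyAsLong(i);
    }
    long elapsed = System.nanoTime() - start;
    sink = acc;
    return (double) elapsed / Math.max(timedCalls, 1);
  }
}
```

A call such as `ExprTimingSketch.nanosPerCall(x -> x * 31 + 7, 10_000, 1_000_000)` yields a per-call estimate that could replace a relative cost constant; the result varies by machine and run, so it would need to be averaged or cached in practice.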

http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift@752
PS10, Line 752:   5: optional i64 max_data_processed_limit
It's probably best to stick with vcores here as what the executor group 
provides. The coordinator can work backwards from the query costing to 
determine vcore sizing for a query.


http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift@873
PS10, Line 873:   13: optional i64 data_processed_bytes;
Vcores here also.


http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@727
PS10, Line 727: * getIntermediateTupleDesc().getByteSize()) / 
Math.max(numInstances, 1);
Don't divide by numInstances in these functions. Instead accumulate the 
fragment cost then divide by a (cost/core) number to arrive at a number of 
cores required. SLA can also be factored in later for the fragment that way.
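
The suggestion above, accumulating the fragment cost first and only then converting it to a core count, can be sketched as follows. The names, the use of `long` for costs, and the floor of one core are assumptions for illustration and are not part of the patch.

```java
// Hypothetical sketch: sum per-node costs, then derive the cores required.
final class CoreSizingSketch {
  private CoreSizingSketch() {}

  /** Fragment cost: the sum of per-node costs, with no division by instance count. */
  static long fragmentCost(long[] nodeCosts) {
    long total = 0;
    for (long c : nodeCosts) {
      total += c;
    }
    return total;
  }

  /** Cores required: fragment cost divided by the cost one core can absorb, rounded up. */
  static long coresRequired(long fragmentCost, long costPerCore) {
    if (costPerCore <= 0) {
      throw new IllegalArgumentException("costPerCore must be positive");
    }
    if (fragmentCost <= 0) {
      return 1; // Assumption: a fragment always needs at least one core.
    }
    return (fragmentCost + costPerCore - 1) / costPerCore;
  }
}
```

With node costs of 400, 350 and 250 and a cost-per-core figure of 300, the fragment cost is 1000 and the sketch asks for 4 cores. An SLA factor could then be applied by adjusting `costPerCore` rather than by changing the per-node functions.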


http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@627
PS10, Line 627: .setDataProcessedBytes(
Use (abstract) processingCost naming instead of processedBytes.


http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@362
PS10, Line 362:   public long computeDataProcessedBytes() {
rename all of these to computeProcessingCost or similar


http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@308
PS10, Line 308: float eualJoinPredicateEvalCost =
eqJoinPredicateEvalCost


http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java@109
PS10, Line 109:   public static float computeConjuctsTotalCost(List 
conjuncts) {
Use List to share the same function.
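
The type parameters in the quoted line and in the comment appear to have been lost in this archive, so the exact suggestion is not recoverable here. One common way to let a single Java method accept lists of different element subtypes is a bounded wildcard, sketched below with hypothetical types (`Costed`, `Predicate`, `FunctionCall`) that are not Impala classes.

```java
import java.util.List;

// Hypothetical sketch: one cost-summing method shared across element subtypes.
final class SharedCostSketch {
  private SharedCostSketch() {}

  interface Costed {
    float cost();
  }

  static final class Predicate implements Costed {
    private final float cost;
    Predicate(float cost) { this.cost = cost; }
    @Override public float cost() { return cost; }
  }

  static final class FunctionCall implements Costed {
    private final float cost;
    FunctionCall(float cost) { this.cost = cost; }
    @Override public float cost() { return cost; }
  }

  /** Accepts List<Predicate>, List<FunctionCall>, or List<Costed> alike. */
  static float totalCost(List<? extends Costed> items) {
    float total = 0f;
    for (Costed item : items) {
      total += item.cost();
    }
    return total;
  }
}
```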




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 28 Sep 2022 19:17:43 +
Gerrit-HasComments: Yes