[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19033 Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. [WIP] IMPALA-11604 Planner changes for CPU usage This patch augments IMPALA-10992 by allowing the amount of data processed per core to be used as a new factor in the definition and selection of an executor group. Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java 6 files changed, 83 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/1 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11415/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 22 Sep 2022 21:04:12 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
[WIP] IMPALA-11604 Planner changes for CPU usage
This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.
The amount of data processed is evaluated as follows.
  input cardinality * average row size / number of instances
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java 
M fe/src/main/java/org/apache/impala/service/Frontend.java 25 files changed, 158 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/4 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins
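The per-node formula in the patch set 4 commit message (input cardinality * average row size / number of instances) can be sketched as follows. This is a minimal illustration and not the code in the patch: the class name `DataProcessedPerNode`, the method name `estimate`, the use of -1 for an unknown cardinality, and the clamping of the instance count to at least 1 are all assumptions made for the example.

```java
// Hypothetical sketch; names are illustrative and not taken from the patch.
final class DataProcessedPerNode {
  /**
   * Returns input cardinality * average row size / number of instances, in bytes.
   * Assumption: a negative cardinality means "unknown" and yields -1.
   */
  static long estimate(long inputCardinality, double avgRowSizeBytes, int numInstances) {
    if (inputCardinality < 0) return -1;
    // Assumption: guard against a zero or negative instance count.
    int n = Math.max(1, numInstances);
    return (long) Math.ceil(inputCardinality * avgRowSizeBytes / n);
  }

  public static void main(String[] args) {
    // One million 32-byte rows spread over 4 instances.
    System.out.println(estimate(1_000_000L, 32.0, 4));
  }
}
```

Dividing by the instance count turns a query-wide byte count into a per-instance figure, which is what a per-executor limit would be compared against.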
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 4: Basic functionality of cpu-usage based auto-scaling. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Fri, 23 Sep 2022 16:38:50 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11423/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Fri, 23 Sep 2022 16:59:16 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
[WIP] IMPALA-11604 Planner changes for CPU usage
This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.
The amount of data processed is the sum of the cpu usage of every fragment, which in turn is the sum of the cpu usage of every node in the fragment. For each node, the cpu usage is evaluated as
  input cardinality * average row size / number of instances
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M 
fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java 25 files changed, 170 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/5 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
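The two-level roll-up described in the patch set 5 commit message (a query total is the sum over fragments, and a fragment total is the sum over its nodes) could look roughly like this. The class and method names are hypothetical, the per-node values are assumed to be non-negative, and saturating at `Long.MAX_VALUE` on overflow is a choice made for the sketch, not something the patch states.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class QueryDataProcessed {
  /** Sums non-negative per-node estimates, saturating instead of overflowing. */
  static long fragmentTotal(List<Long> nodeEstimates) {
    long total = 0;
    for (long v : nodeEstimates) {
      try {
        total = Math.addExact(total, v);
      } catch (ArithmeticException e) {
        return Long.MAX_VALUE;  // assumption: clamp on overflow
      }
    }
    return total;
  }

  /** Sums the fragment totals of a query; each inner list is one fragment. */
  static long queryTotal(List<List<Long>> fragments) {
    long total = 0;
    for (List<Long> fragment : fragments) {
      try {
        total = Math.addExact(total, fragmentTotal(fragment));
      } catch (ArithmeticException e) {
        return Long.MAX_VALUE;  // assumption: clamp on overflow
      }
    }
    return total;
  }

  public static void main(String[] args) {
    List<List<Long>> fragments =
        Arrays.asList(Arrays.asList(100L, 200L), Arrays.asList(50L));
    System.out.println(queryTotal(fragments));
  }
}
```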
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 5: Patch set 5 improves how the number of instances is used to compute cpu usage for scan nodes, considering query option mt_dop. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Fri, 23 Sep 2022 18:31:40 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/11427/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 23 Sep 2022 18:44:37 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
Patch Set 5:
(4 comments)
http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift
File common/thrift/Query.thrift:
http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift@873
PS5, Line 873: cpu_usage_bytes
CPU usage is assumed to be proportional to the total bytes of data for the query. Should we also consider other factors, such as the operators or functions applied to the data?
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2147
PS5, Line 2147: int numOfInstancesToComputeCpuUsage =
: (queryOptions.getMt_dop() >= 1) > numInstances_ : maxScannerThreads;
:
: (queryOptions.getMt_dop() >= 1) ? long cpuUsageBytesPerInstance = inputCardinality_
: * ((long) (getAvgRowSize())) / numOfInstancesToComputeCpuUsage;
I don't understand these statements.
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@421
PS5, Line 421: >
?
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java:
http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@96
PS5, Line 96: probeCpuUsage = probePerInstanceMemEstimate;
: long buildCpuUsage
Why are these two variables needed?
-- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sun, 25 Sep 2022 13:30:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
[WIP] IMPALA-11604 Planner changes for CPU usage
This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.
The amount of data processed is the sum of the cpu usage of every fragment, which in turn is the sum of the cpu usage of every node in the fragment. For each node, the cpu usage is evaluated as
  input cardinality * expression evaluation cost per row * average row size / number of instances
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M 
fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java 27 files changed, 202 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/6 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 6: (4 comments) Addressed the review comments and improved on scan nodes and exchange node. http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/19033/5/common/thrift/Query.thrift@873 PS5, Line 873: cpu_usage_bytes > cpu usage is proportional to the total bytes of data for the query. Should Yes, the evaluation cost per row is being worked on. http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2147 PS5, Line 2147: long cpuUsageBytesPerInstance = computeCpuUsageBytes(numOfInstancesToComputeCpuUsage); : : nodeResourceProfile_ = new ResourceProfileBuilder() : .setMemEstimateBytes(perInstanceMemEstimate) : .setMinMemReservationBytes(computeMinMemReservation( > Don't understand these statement. Sorry about the copy/paste error. The formula is improved further. Done. http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@421 PS5, Line 421: > ? Fixed. http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java: http://gerrit.cloudera.org:8080/#/c/19033/5/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@96 PS5, Line 96: probeCpuUsage = probePerInstanceMemEstimate / numInstances_; : long buildCpuUsage > Why need these two variables? Good catch. 
Fixed. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 26 Sep 2022 19:06:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11446/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 26 Sep 2022 19:28:15 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
[WIP] IMPALA-11604 Planner changes for CPU usage
This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.
The amount of data processed is the sum of the cpu usage of every fragment, which in turn is the sum of the cpu usage of every node in the fragment. For each node, the cpu usage is computed as
  I * C * W / N
where
  I is input cardinality
  C is expression evaluation cost per row
  W is average row size
  N is number of instances
A description of the computation for each kind of plan node is listed below.
1. HDFS and Kudu scans: N is mt_dop when the query option mt_dop >= 1; otherwise N is number of nodes * max scan threads;
2. HBase scans: N is 1
3. Hash joins and nested loop joins:
   probe side cpu usage:
   build side cpu usage:
   C is the sum of the evaluation cost for equi-join predicates and the evaluation cost for other join predicates;
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M 
fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java 27 files changed, 234 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/7 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
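A rough sketch of the patch set 7 formula I * C * W / N, together with the rule for choosing N on HDFS and Kudu scans (mt_dop when mt_dop >= 1, otherwise number of nodes * max scan threads). The class, method, and parameter names are hypothetical and do not come from the patch; clamping N to at least 1 is an added assumption.

```java
// Hypothetical sketch; names are illustrative and not taken from the patch.
final class ScanDataProcessed {
  /** N for HDFS and Kudu scans, following the rule in the commit message. */
  static int scanInstanceCount(int mtDop, int numNodes, int maxScannerThreads) {
    if (mtDop >= 1) return mtDop;
    // Assumption: never return less than one instance.
    return Math.max(1, numNodes * maxScannerThreads);
  }

  /** N for HBase scans is stated to be 1. */
  static int hbaseInstanceCount() {
    return 1;
  }

  /** I * C * W / N, truncated to a whole number of bytes. */
  static long dataProcessed(
      long inputCardinality, double evalCostPerRow, double avgRowSize, int numInstances) {
    int n = Math.max(1, numInstances);
    return (long) (inputCardinality * evalCostPerRow * avgRowSize / n);
  }

  public static void main(String[] args) {
    int n = scanInstanceCount(0, 10, 16);  // mt_dop unset: 10 nodes * 16 threads
    System.out.println(dataProcessed(1_000_000L, 2.0, 8.0, n));
  }
}
```

With mt_dop set to 4 the same scan would use N = 4 regardless of the cluster size, so the per-instance figure grows as the divisor shrinks.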
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 7: Refined formula for scans, hash and nested joins. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 01:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11448/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 02:20:48 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
Patch Set 7:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift@747
PS7, Line 747: The optional max_cpu_usage_limit to determine which executor group set to run for
: // a query.
Add a more detailed description of how this variable is measured.
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift
File common/thrift/Query.thrift:
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift@872
PS7, Line 872: stimated cpu usage per instance in bytes
Traditionally, CPU usage is measured in time. Here we measure it in bytes computed/processed, so the variable name "cpu_usage_bytes" is not very clear to me. In the future, we may also consider CPU frequency.
-- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 07:44:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
[WIP] IMPALA-11604 Planner changes for CPU usage
This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.
The amount of data processed is the sum of that in every fragment, which in turn is the sum of the data processed in every node in the fragment. For each node, the data processed is computed as
  I * C * W / N
where
  I is input cardinality
  C is expression evaluation cost per row
  W is average row size
  N is number of instances
A description of the computation for each kind of plan node is listed below.
1. HDFS and Kudu scan nodes: N is mt_dop when the query option mt_dop >= 1; otherwise N is number of nodes * max scan threads;
2. HBase scan nodes: N is 1
3. Hash join and nested loop join nodes: C is the sum of the evaluation costs of the equi-join predicates and of the other join predicates, for both the probe and build sides;
4. Aggregation nodes: C and W are the sums of the costs and partial row widths, respectively, over all aggregates (AggregateInfo).
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/ExprUtil.java 29 files changed, 303 insertions(+), 55 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/8 -- To view, visit 
http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 8 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
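For aggregation nodes, the patch set 8 commit message says that C and W are sums over the aggregates. One way to read that is sketched below. `AggCost` is a hypothetical stand-in for whatever `AggregateInfo` exposes, and its two fields are assumptions made for illustration, not the real class's API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the per-aggregate cost and width; not the real AggregateInfo.
final class AggCost {
  final double evalCostPerRow;
  final double partialRowWidthBytes;

  AggCost(double evalCostPerRow, double partialRowWidthBytes) {
    this.evalCostPerRow = evalCostPerRow;
    this.partialRowWidthBytes = partialRowWidthBytes;
  }
}

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class AggregationDataProcessed {
  /** Computes I * C * W / N where C and W are summed over all aggregates. */
  static long estimate(long inputCardinality, List<AggCost> aggs, int numInstances) {
    double c = 0;
    double w = 0;
    for (AggCost a : aggs) {
      c += a.evalCostPerRow;
      w += a.partialRowWidthBytes;
    }
    int n = Math.max(1, numInstances);  // assumption: at least one instance
    return (long) (inputCardinality * c * w / n);
  }

  public static void main(String[] args) {
    List<AggCost> aggs = Arrays.asList(new AggCost(1.5, 8.0), new AggCost(0.5, 16.0));
    System.out.println(estimate(100L, aggs, 2));
  }
}
```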
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift@743 PS8, Line 743: // a value for each of the limit variable dimension and compares it with the limit. An line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java@1862 PS8, Line 1862: i = Long.compare(e1.getMax_data_processed_limit(), e2.getMax_data_processed_limit()); line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 8 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 19:16:11 + Gerrit-HasComments: Yes
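The flagged `Long.compare` call orders executor group sets by their data-processed limit. A sketch of that ordering, kept under the 90-column limit, is shown below with a simple selection step on top. Only the getter name `getMax_data_processed_limit()` comes from the quoted line; `GroupSet`, `GroupSetSelector`, the use of `Comparator.comparingLong` in place of an explicit `Long.compare`, and the "first set whose limit fits" rule are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the Thrift-generated executor group set type.
final class GroupSet {
  private final String name;
  private final long maxDataProcessedLimit;

  GroupSet(String name, long maxDataProcessedLimit) {
    this.name = name;
    this.maxDataProcessedLimit = maxDataProcessedLimit;
  }

  String getName() { return name; }

  // Getter name mirrors the one quoted in the lint comment.
  long getMax_data_processed_limit() { return maxDataProcessedLimit; }
}

// Hypothetical sketch; the selection rule is an assumption, not the patch's logic.
final class GroupSetSelector {
  // Ascending order by limit, equivalent to Long.compare on the two getters.
  static final Comparator<GroupSet> BY_DATA_PROCESSED_LIMIT =
      Comparator.comparingLong(GroupSet::getMax_data_processed_limit);

  /** Returns the smallest group set whose limit covers the estimate, if any. */
  static Optional<GroupSet> select(List<GroupSet> groupSets, long estimate) {
    List<GroupSet> sorted = new ArrayList<>(groupSets);
    sorted.sort(BY_DATA_PROCESSED_LIMIT);
    for (GroupSet g : sorted) {
      if (estimate <= g.getMax_data_processed_limit()) return Optional.of(g);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<GroupSet> sets = Arrays.asList(
        new GroupSet("large", 10_000L),
        new GroupSet("small", 100L),
        new GroupSet("medium", 1_000L));
    System.out.println(select(sets, 500L).map(GroupSet::getName).orElse("none"));
  }
}
```

If the explicit comparison is kept instead, breaking after `Long.compare(` and placing both getter calls on the continuation line would also bring it under 90 columns.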
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage
..
Patch Set 8:
(2 comments)
Reworked and improved the calculation for AggregationNode.
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Frontend.thrift@747
PS7, Line 747: The memory limit variable provides the per host estimated-memory limit.
: 4: optional
> Add the more detailed description for measuring this variable
Done
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift
File common/thrift/Query.thrift:
http://gerrit.cloudera.org:8080/#/c/19033/7/common/thrift/Query.thrift@872
PS7, Line 872: stimated total data processed per instan
> Traditionally CPU usage is measured in time. Here we use computing/processi
Yeah, calling this "cpu usage" left the actual unit implicit. Changed the name to "data processed" throughout the patch.
-- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 8 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 19:16:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage ..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the amount of data processed to be used as a new factor in the definition and selection of an executor group.

The amount of data processed is the sum over all fragments, where the amount for each fragment is the sum over every node in the fragment. For each node, the data processed is computed as

  I * C * W / N

where
  I is input cardinality
  C is expression evaluation cost per row
  W is average row size
  N is number of instances

A description of the computation for each kind of plan node is listed below.

1. Hdfs and Kudu scan nodes: N is mt_dop when query option mt_dop >= 1, otherwise N is number of nodes * max scan threads;
2. Hbase scan nodes: N is 1;
3. Hash join and nested loop join nodes: C is the sum of the evaluation costs for the equi-join predicates and for the other join predicates, for both probe and build side;
4. Aggregation nodes: C and W are the sums of the costs and partial row widths for each aggregate (AggregateInfo).
Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/ExprUtil.java 29 files changed, 305 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/9 -- To view, visit 
http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/8/common/thrift/Frontend.thrift@743 PS8, Line 743: // the limit dimension and compares it with the limit. An executor group is chosen > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/19033/8/fe/src/main/java/org/apache/impala/service/Frontend.java@1862 PS8, Line 1862: i = Long.compare( > line too long (95 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 19:25:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11459/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 8 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 19:36:21 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11460/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 27 Sep 2022 19:45:50 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage ..

[WIP] IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by allowing the weighted amount of data processed to be used as a new factor in the definition and selection of an executor group.

The weighted amount of data processed is the sum over all fragments, where the amount for each fragment is the sum over every node in the fragment. For each node, the weighted amount of data processed is computed with the following general formula.

  D = I * C * W / N

where
  D is the weighted amount of data processed
  I is input cardinality
  C is expression evaluation cost per row
  W is average row size
  N is number of instances

A description of the computation for each kind of plan node is given below.

1. Aggregation node: C and W are the sums of the costs and partial row widths for each AggregateInfo object.
2. AnalyticEval node: C is the sum of the evaluation costs for the analytic functions, the partition-by equality predicate and the order-by equality predicate;
3. CardinalityCheck node: Both C and I are 1;
4. DataSource scan node: C is computed from a subset of the selection predicates, excluding the predicates accepted by the data source;
5. EmptySet node: I is 0;
6. Exchange node: In broadcast mode, the general formula is modified to D = (I * C * W / N) * number of receivers;
7. Hash join node: C is the sum of the evaluation costs for the equi-join predicates and for the other join predicates, for both probe and build side;
8. Hbase scan node: N is 1;
9. Hdfs and Kudu scan node: N is mt_dop when query option mt_dop >= 1, otherwise N is number of nodes * max scan threads;
10. Nested loop join node: When the right child is not a SingularRowSrc node, C is the sum of the evaluation costs for the equi-join predicates and for the other join predicates, for both probe and build side. When the right child is a SingularRowSrc node, the cost for the build side is multiplied by the cardinality from the probe side;
11. Select node: Use the general formula;
12. SingularRowSrc node: I is 1. Since the node is invoked once per input in a nested loop join, the total cost of this node is computed in the nested loop join.
13. Sort node:
14. Subplan node:
15. Union node:
16. Unnest node:

Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a --- M common/thrift/Frontend.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M
fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/ExprUtil.java 29 files changed, 336 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/10 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
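The general formula in the patch set 10 commit message can be illustrated with a minimal Java sketch. This is not the code from the patch: the class name `DataProcessedSketch` and its method names are hypothetical, chosen only for illustration. The clamp of N to at least 1 mirrors the `Math.max(numInstances, 1)` expression quoted from AggregateInfo.java in the review of this patch set.

```java
import java.util.List;

/** Hypothetical illustration of D = I * C * W / N; not Impala's planner code. */
final class DataProcessedSketch {
  private DataProcessedSketch() {}

  /**
   * General formula: D = I * C * W / N.
   * N is clamped to at least 1 so that a missing instance count
   * cannot cause a division by zero.
   */
  static double nodeCost(long inputCardinality, double evalCostPerRow,
      double avgRowSize, int numInstances) {
    return inputCardinality * evalCostPerRow * avgRowSize
        / Math.max(numInstances, 1);
  }

  /** Broadcast exchange: the general formula times the number of receivers. */
  static double broadcastExchangeCost(long inputCardinality, double evalCostPerRow,
      double avgRowSize, int numInstances, int numReceivers) {
    return nodeCost(inputCardinality, evalCostPerRow, avgRowSize, numInstances)
        * numReceivers;
  }

  /**
   * Sums a list of values. Applied to node values it yields a fragment value;
   * applied to fragment values it yields the query total.
   */
  static double sum(List<Double> costs) {
    double total = 0.0;
    for (double c : costs) {
      total += c;
    }
    return total;
  }
}
```

For example, a node with I = 1000, C = 2.0, W = 8.0 and N = 4 gets D = 4000, and an EmptySet node (I = 0) contributes nothing regardless of the other factors.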
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 10: Added notes to the commit message about each node (up to SingularRowSrc) and improved the computation in the corresponding classes. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 28 Sep 2022 17:10:08 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/11470/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 28 Sep 2022 17:30:46 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-11604 Planner changes for CPU usage
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage .. Patch Set 10: (7 comments) Good start. I think the following changes are necessary and worth spending extra time on: 1) Fragment cost should conceptually be the amount of time required to process the estimated data for the fragment. That way, we can divide by the cores available and factor in SLA times through direct calculations. Any functions and variables should probably be named something like CPUCost or ProcessingCost. 2) It's probably worth spending time making a new set of Expr costing functions instead of multiplying the EXPR constants by the row width. Comments in Expr.java state that those constants are not suitable for absolute cost computation. It should be reasonable to add a mechanism to do actual time measurements for Expr evaluation during init and use those measurements to compute accurate time costs. http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift@752 PS10, Line 752: 5: optional i64 max_data_processed_limit It's probably best to stick with vcores here as what the executor group provides. The coordinator can work backwards from the query costing to determine vcore sizing for a query. http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift@873 PS10, Line 873: 13: optional i64 data_processed_bytes; Vcores here also. 
http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@727 PS10, Line 727: * getIntermediateTupleDesc().getByteSize()) / Math.max(numInstances, 1); Don't divide by numInstances in these functions. Instead, accumulate the fragment cost, then divide by a (cost/core) number to arrive at the number of cores required. SLA can also be factored in later for the fragment that way. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@627 PS10, Line 627: .setDataProcessedBytes( Use (abstract) processingCost naming instead of processedBytes. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@362 PS10, Line 362: public long computeDataProcessedBytes() { Rename all of these to computeProcessingCost or similar. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@308 PS10, Line 308: float eualJoinPredicateEvalCost = eqJoinPredicateEvalCost http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java File fe/src/main/java/org/apache/impala/util/ExprUtil.java: 
http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java@109 PS10, Line 109: public static float computeConjuctsTotalCost(List conjuncts) { Use List to share the same function. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 28 Sep 2022 19:17:43 + Gerrit-HasComments: Yes
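The alternative suggested in the review above (accumulate an undivided fragment cost, then divide by a cost-per-core figure to get a core count) might look roughly like the following Java sketch. Everything here is an assumption for illustration: the class `CoreSizingSketch`, the method `coresRequired`, the `costPerCore` parameter, and the choices to round up and to require at least one core are not taken from the patch or from Impala.

```java
/** Hypothetical illustration of sizing cores from an accumulated fragment cost. */
final class CoreSizingSketch {
  private CoreSizingSketch() {}

  /**
   * Sums the per-node processing costs of a fragment WITHOUT dividing by the
   * number of instances, then divides by an assumed cost-per-core constant.
   * Rounding up and the minimum of one core are assumptions of this sketch.
   */
  static int coresRequired(double[] nodeProcessingCosts, double costPerCore) {
    if (costPerCore <= 0) {
      throw new IllegalArgumentException("costPerCore must be positive");
    }
    double fragmentCost = 0.0;
    for (double cost : nodeProcessingCosts) {
      fragmentCost += cost;
    }
    return (int) Math.max(1, Math.ceil(fragmentCost / costPerCore));
  }
}
```

With node costs of 4000 and 6000 and an assumed `costPerCore` of 3000, the fragment cost is 10000 and the sketch asks for 4 cores. An SLA factor, as mentioned in the review, could be folded into `costPerCore` later without touching the per-node cost functions.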