Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21279 )
Change subject: IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator ...................................................................... Patch Set 16: (8 comments) http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java File fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java@30 PS13, Line 30: public BaseProcessingCost( : long cardinality, float exprsCost, float materializationCost) { : // TODO: materializationCost accommodates ProcessingCost where row width should : // factor in. Currently, ProcessingCost of base class ScanNode (not HdfsScanNode), : // has row width factored in through materialization parameter here. : // If we convert the remaining non-Hdfs scan node types to use benchmark-based : // cost coefficients we should remove or si > There are still some scan node types (e.g. Kudu, HBase) that don't use the Ack http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@582 PS16, Line 582: Preconditions.checkState(rhsTree instanceof ExchangeNode); I'm having a doubt if rhsTree and lhsTree is guaranteed to be an ExchangeNode at this point. Maybe we should revert this back to static method. Please double check. http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java: http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@230 PS13, Line 230: return getAvgRowSize() + (getTupleIds().size > Changed both methods to non-static. Ack http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java: http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@222 PS16, Line 222: public double getAvgSerializedRowSize( On second thought, maybe this should stay as static method. That way, DistributedPlanner.java does not need to cast PlanNode to ExchangeNode. http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java: http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@342 PS13, Line 342: double totalProbeCost = : (getProbeCardinalityForCosting() * COST_COEFFICIENT_PROBE_INPUT) : + (getCardinality() * COST_COEFFICIENT_HASH_JOIN_OUTPUT); > Not as a separate factor this point. Conjunct eval cost is included in the Ack http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java: http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@117 PS13, Line 117: (long) Math.ceil(buildCardinality / fragment_.getNumInst > Done Done http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/21279/13/fe/src/main/java/org/apache/impala/planner/PlanNode.java@251 PS13, Line 251: public float getAvgRowSize() { return avgRowSize_; } : public float getRowPadSize() { return rowPadSize_; } > Done Done http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java File fe/src/main/java/org/apache/impala/planner/ProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/21279/16/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java@24 PS16, Line 24: import org.apache.impala.thrift.TQueryOptions; Looks like this is unused and can be removed. Sorry I miss this earlier. -- To view, visit http://gerrit.cloudera.org:8080/21279 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca Gerrit-Change-Number: 21279 Gerrit-PatchSet: 16 Gerrit-Owner: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Wed, 17 Apr 2024 19:21:47 +0000 Gerrit-HasComments: Yes