[ https://issues.apache.org/jira/browse/IMPALA-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839698#comment-17839698 ]
ASF subversion and git services commented on IMPALA-12657: ---------------------------------------------------------- Commit af2d28b54d4fb6b182b27a278c2cfa6277f3bbbb in impala's branch refs/heads/branch-4.4.0 from David Rorke [ https://gitbox.apache.org/repos/asf?p=impala.git;h=af2d28b54 ] IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator This patch improves the accuracy of the CPU ProcessingCost estimates for several of the CPU intensive operators by basing the costs on benchmark data. The general approach for a given operator was to run a set of queries that exercised the operator under various conditions (e.g. large vs small row sizes and row counts, varying NDV, different file formats, etc) and capture the CPU time spent per unit of work (the unit of work might be measured as some number of rows, some number of bytes, some number of predicates evaluated, or some combination of these). The data was then analyzed in an attempt to fit a simple model that would allow us to predict CPU consumption of a given operator based on information available at planning time. For example, the CPU ProcessingCost for a Parquet scan is estimated as: TotalCost = (0.0144 * BytesMaterialized) + (0.0281 * Rows * Predicate Count) The coefficients (0.0144 and 0.0281) are derived from benchmarking scans under a variety of conditions. Similar cost functions and coefficients were derived for all of the benchmarked operators. The coefficients for all the operators are normalized such that a single unit of cost equates to roughly 100 nanoseconds of CPU time on a r5d.4xlarge instance. So we would predict an operator with a cost of 10,000,000 would complete in roughly one second on a single core. Limitations: * Costing only addresses CPU time spent and doesn't account for any IO or other wait time. * Benchmarking scenarios didn't provide comprehensive coverage of the full range of data types, distributions, etc. More thorough benchmarking could improve the costing estimates further. * This initial patch only covers a subset of the operators, focusing on those that are most common and most CPU intensive. Specifically the following operators are covered by this patch. All others continue to use the previous ProcessingCost code: AggregationNode DataStreamSink (exchange sender) ExchangeNode HashJoinNode HdfsScanNode HdfsTableSink NestedLoopJoinNode SortNode UnionNode Benchmark-based costing of the remaining operators will be covered by a future patch. Future patches will automate the collection and analysis of the benchmark data and the computation of the cost coefficients to simplify maintenance of the costing as performance changes over time. Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca Reviewed-on: http://gerrit.cloudera.org:8080/21279 Reviewed-by: Riza Suminto <riza.sumi...@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> > Improve ProcessingCost of ScanNode and NonGroupingAggregator > ------------------------------------------------------------ > > Key: IMPALA-12657 > URL: https://issues.apache.org/jira/browse/IMPALA-12657 > Project: IMPALA > Issue Type: Improvement > Components: Frontend > Affects Versions: Impala 4.3.0 > Reporter: Riza Suminto > Assignee: David Rorke > Priority: Major > Fix For: Impala 4.4.0 > > Attachments: profile_1f4d7a679a3e12d5_4223115700000000.txt > > > Several benchmark run measuring Impala scan performance indicates some > costing improvement opportunity around ScanNode and NonGroupingAggregator. > [^profile_1f4d7a679a3e12d5_4223115700000000.txt] shows an example of simple > count query. > Key takeaway: > # There is a strong correlation between total materialized bytes (row-size * > cardinality) with total materialized tuple time per fragment. Row > materialization cost should be adjusted to be based on this row-sized instead > of equal cost per scan range. > # NonGroupingAggregator should have much lower cost that GroupingAggregator. > In example above, the cost of NonGroupingAggregator dominates the scan > fragment even though it only does simple counting instead of hash table > operation. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org