[ 
https://issues.apache.org/jira/browse/IMPALA-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-12657.
-----------------------------------
    Resolution: Fixed

> Improve ProcessingCost of ScanNode and NonGroupingAggregator
> ------------------------------------------------------------
>
>                 Key: IMPALA-12657
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12657
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.3.0
>            Reporter: Riza Suminto
>            Assignee: David Rorke
>            Priority: Major
>             Fix For: Impala 4.4.0
>
>         Attachments: profile_1f4d7a679a3e12d5_4223115700000000.txt
>
>
> Several benchmark run measuring Impala scan performance indicates some 
> costing improvement opportunity around ScanNode and NonGroupingAggregator.
> [^profile_1f4d7a679a3e12d5_4223115700000000.txt] shows an example of simple 
> count query.
> Key takeaway:
>  # There is a strong correlation between total materialized bytes (row-size * 
> cardinality) with total materialized tuple time per fragment. Row 
> materialization cost should be adjusted to be based on this row-sized instead 
> of equal cost per scan range.
>  # NonGroupingAggregator should have much lower cost that GroupingAggregator. 
> In example above, the cost of NonGroupingAggregator dominates the scan 
> fragment even though it only does simple counting instead of hash table 
> operation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to