[jira] [Updated] (SPARK-43232) Improve ObjectHashAggregateExec performance for high cardinality

2023-04-23 Thread XiDuo You (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiDuo You updated SPARK-43232:
--
Summary: Improve ObjectHashAggregateExec performance for high cardinality  
(was: Improve ObjectHashAggregateExec performance with high cardinality)

> Improve ObjectHashAggregateExec performance for high cardinality
> 
>
> Key: SPARK-43232
> URL: https://issues.apache.org/jira/browse/SPARK-43232
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: XiDuo You
> Priority: Major
>
> `ObjectHashAggregateExec` has three performance issues:
>  - heavy overhead from Scala syntactic sugar in `createNewAggregationBuffer`
>  - unnecessary grouping key comparison after falling back to the sort-based 
> aggregator
>  - the aggregation buffer in the sort-based aggregator is not actually reused
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-43232) Improve ObjectHashAggregateExec performance for high cardinality

2023-04-23 Thread XiDuo You (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiDuo You updated SPARK-43232:
--
Description: 
`ObjectHashAggregateExec` has three performance issues:
 - heavy overhead from Scala syntactic sugar in `createNewAggregationBuffer`
 - unnecessary grouping key comparison after falling back to the sort-based aggregator
 - the aggregation buffer in the sort-based aggregator is not reused for all 
remaining rows

 

  was:
`ObjectHashAggregateExec` has three performance issues:
 - heavy overhead from Scala syntactic sugar in `createNewAggregationBuffer`
 - unnecessary grouping key comparison after falling back to the sort-based aggregator
 - the aggregation buffer in the sort-based aggregator is not actually reused

 


> Improve ObjectHashAggregateExec performance for high cardinality
> 
>
> Key: SPARK-43232
> URL: https://issues.apache.org/jira/browse/SPARK-43232
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: XiDuo You
> Priority: Major
>
> `ObjectHashAggregateExec` has three performance issues:
>  - heavy overhead from Scala syntactic sugar in `createNewAggregationBuffer`
>  - unnecessary grouping key comparison after falling back to the sort-based 
> aggregator
>  - the aggregation buffer in the sort-based aggregator is not reused for all 
> remaining rows
>  
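
For readers unfamiliar with sort-based aggregation, the last two points in the description can be pictured with a generic sketch. The Python below is not Spark's implementation and does not reproduce `ObjectHashAggregateExec` or its fallback path. The function name `sort_based_aggregate`, the one-element list used as a buffer, and the sum aggregate are all hypothetical choices made for illustration. It assumes the input is already sorted by grouping key, compares keys only to detect a group boundary, and resets a single buffer in place rather than allocating a new one per group.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def sort_based_aggregate(rows: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Sum values per key; `rows` must already be sorted by key.

    Hypothetical illustration only, not Spark's sort-based aggregator.
    """
    buffer: List[int] = [0]            # single aggregation buffer, reused for every group
    current_key: Optional[str] = None
    has_group = False

    for key, value in rows:
        # One key comparison per row, used only to detect a group boundary.
        if has_group and key != current_key:
            yield current_key, buffer[0]
            buffer[0] = 0              # reset in place instead of allocating a new buffer
        current_key = key
        has_group = True
        buffer[0] += value

    # Emit the final group, if any input was seen.
    if has_group:
        yield current_key, buffer[0]


result = list(sort_based_aggregate([("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]))
```

With high-cardinality input almost every row starts a new group, so any per-group cost (allocating a buffer, an extra key comparison) is paid close to once per row. That is presumably why the issue singles out these code paths.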


