[ 
https://issues.apache.org/jira/browse/SPARK-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated SPARK-3196:
-----------------------------

    Description: 
Expression ID generation relies on an AtomicLong internally, which causes 
performance to drop dramatically under multi-threaded execution.

I'd like to create 2 sub-tasks (maybe more) for the improvements:

1) Reduce expression tree object creation in the aggregation functions 
(min/max), which currently create expression trees for every single row.
2) Improve the expression ID generation algorithm, either by not using the 
AtomicLong or by generating the expression ID only when necessary.

In addition, remove as much expression object creation as possible wherever 
expressions are evaluated. (I will create a couple of sub-tasks soon.)
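
To illustrate item 2, here is a minimal Java sketch of generating an ID only when necessary. It is not Spark's actual code: `ExprIds`, `LazyExpr`, and `LazyIdDemo` are hypothetical names, and the sketch assumes each expression node is owned by a single thread, so the lazily filled field needs no synchronization. The point is only that nodes whose ID is never read do not touch the shared counter at all.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a process-wide expression ID generator.
final class ExprIds {
    private static final AtomicLong COUNTER = new AtomicLong();

    // Every call is a contended atomic update when many threads share it.
    static long next() { return COUNTER.getAndIncrement(); }

    // Number of IDs handed out so far (used only by the demo).
    static long issued() { return COUNTER.get(); }
}

// Hypothetical expression node that asks for an ID on first access only.
final class LazyExpr {
    private static final long UNSET = -1L;

    // Assumption: a node is owned by one thread, so a plain field is enough.
    private long id = UNSET;

    long id() {
        if (id == UNSET) {
            id = ExprIds.next(); // the only place the shared counter is touched
        }
        return id;
    }
}

class LazyIdDemo {
    public static void main(String[] args) {
        long before = ExprIds.issued();

        // Creating many nodes does not consume any IDs.
        LazyExpr[] nodes = new LazyExpr[1000];
        for (int i = 0; i < nodes.length; i++) {
            nodes[i] = new LazyExpr();
        }
        System.out.println(ExprIds.issued() - before); // prints 0

        // Reading the ID twice consumes exactly one ID.
        nodes[0].id();
        nodes[0].id();
        System.out.println(ExprIds.issued() - before); // prints 1
    }
}
```

Whether this helps in practice depends on how often IDs are actually read during evaluation, which would need to be measured.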



  was:
Expression ID generation relies on an AtomicLong internally, which causes 
performance to drop dramatically under multi-threaded execution.

I'd like to create 2 sub-tasks (maybe more) for the improvements:

1) Reduce expression tree object creation in the aggregation functions 
(min/max), which currently create expression trees for every single row.
2) Improve the expression ID generation algorithm by not using the AtomicLong.

In addition, remove as much expression object creation as possible wherever 
expressions are evaluated. (I will create a couple of sub-tasks soon.)




> Expression Evaluation Performance Improvement
> ---------------------------------------------
>
>                 Key: SPARK-3196
>                 URL: https://issues.apache.org/jira/browse/SPARK-3196
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Cheng Hao
>
> Expression ID generation relies on an AtomicLong internally, which causes 
> performance to drop dramatically under multi-threaded execution.
> I'd like to create 2 sub-tasks (maybe more) for the improvements:
> 1) Reduce expression tree object creation in the aggregation functions 
> (min/max), which currently create expression trees for every single row.
> 2) Improve the expression ID generation algorithm, either by not using the 
> AtomicLong or by generating the expression ID only when necessary.
> In addition, remove as much expression object creation as possible wherever 
> expressions are evaluated. (I will create a couple of sub-tasks soon.)
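
To illustrate item 1, here is a minimal Java sketch contrasting a max aggregate that builds a comparison object for every row with one that compares values directly. It is not Spark's actual code: `GreaterThan`, `MaxPerRowTree`, `MaxDirect`, and `MaxDemo` are hypothetical names, and the allocation counter exists only to make the per-row cost visible in a single-threaded demo.

```java
// Hypothetical comparison node, standing in for an expression tree that is
// built at evaluation time.
final class GreaterThan {
    static int created = 0; // allocation counter; single-threaded demo only

    private final long left;
    private final long right;

    GreaterThan(long left, long right) {
        this.left = left;
        this.right = right;
        created++;
    }

    boolean eval() { return left > right; }
}

// Builds a new comparison object for every input row.
final class MaxPerRowTree {
    private long current = Long.MIN_VALUE;

    void update(long value) {
        if (new GreaterThan(value, current).eval()) {
            current = value;
        }
    }

    long result() { return current; }
}

// Keeps only mutable state and compares directly, allocating nothing per row.
final class MaxDirect {
    private long current = Long.MIN_VALUE;

    void update(long value) {
        if (value > current) {
            current = value;
        }
    }

    long result() { return current; }
}

class MaxDemo {
    public static void main(String[] args) {
        long[] rows = {3L, 9L, 4L};
        MaxPerRowTree slow = new MaxPerRowTree();
        MaxDirect fast = new MaxDirect();
        for (long row : rows) {
            slow.update(row);
            fast.update(row);
        }
        // Both produce the same answer; only the first allocates per row.
        System.out.println(slow.result() == fast.result());
    }
}
```

The two variants return the same result; the difference is that the first allocates one object per row, which is the kind of overhead item 1 aims to remove.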



--
This message was sent by Atlassian JIRA
(v6.2#6252)

