[ 
https://issues.apache.org/jira/browse/SPARK-35787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vidmantas Drasutis updated SPARK-35787:
---------------------------------------
    Description: 
Hello.

We were using Spark 3.0.2, and our query executed in ~100 seconds.

After we upgraded to Spark 3.1.1 (we also tried 3.1.2, with the same slow
performance), the execution time rose to ~260 seconds. That is a huge
increase: the query now takes roughly 2.5-3x as long.

The query is quite simple.

It uses a UDF (*org.apache.spark.sql.functions*) that explodes the data and
performs a polygon hit test. Nothing changed in our code from the query
perspective.
The cluster is a single VM.

Has anyone faced a similar issue?

I have attached some details from the Spark dashboard.

*This looks like a UDF-related slowdown: queries that do not use UDFs perform
the same as before, while queries that use UDFs have been slower since 3.1.*

  was:
Hello.

We were using Spark 3.0.2, and our query executed in ~100 seconds.

After we upgraded to Spark 3.1.1 (we also tried 3.1.2, with the same slow
performance), the execution time rose to ~260 seconds. That is a huge
increase: the query now takes roughly 2.5-3x as long.

The query is quite simple.

It uses a UDF that explodes the data and performs a polygon hit test. Nothing
changed in our code from the query perspective.
The cluster is a single VM.

Has anyone faced a similar issue?

I have attached some details from the Spark dashboard.

*This looks like a UDF-related slowdown: queries that do not use UDFs perform
the same as before, while queries that use UDFs have been slower since 3.1.*


> Does anyone have performance issues after upgrading from 3.0 to 3.1?
> --------------------------------------------------------------------
>
>                 Key: SPARK-35787
>                 URL: https://issues.apache.org/jira/browse/SPARK-35787
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 3.1.2
>            Reporter: Vidmantas Drasutis
>            Priority: Critical
>         Attachments: Execution_plan_difference.png, 
> spark_3.0_execution_plan_details_fast.txt, 
> spark_3.1_execution_plan_details_slow.txt, spark_job_info_1.png, 
> spark_job_info_2.png
>
>
> Hello.
>
> We were using Spark 3.0.2, and our query executed in ~100 seconds.
> After we upgraded to Spark 3.1.1 (we also tried 3.1.2, with the same slow
> performance), the execution time rose to ~260 seconds. That is a huge
> increase: the query now takes roughly 2.5-3x as long.
>
> The query is quite simple.
> It uses a UDF (*org.apache.spark.sql.functions*) that explodes the data and
> performs a polygon hit test. Nothing changed in our code from the query
> perspective.
> The cluster is a single VM.
>
> Has anyone faced a similar issue?
> I have attached some details from the Spark dashboard.
>
> *This looks like a UDF-related slowdown: queries that do not use UDFs
> perform the same as before, while queries that use UDFs have been slower
> since 3.1.*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)