Re: Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-20 Thread Senthil Kumar
ain/resources/tpcds_2_4/q4.sql spark.time(sql(q4).collect) // note q4 result set is only 100 rows ``` Spark 2.4.5: Time taken: 256812 ms Time taken: 226571 ms Time taken: 305508 ms Spark 3.1.2

RE: Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-20 Thread Luca Canali
github.com/databricks/spark-sql-perf/blob/master/src/main/resources/tpcds_2_4/q4.sql spark.time(sql(q4).collect) // note q4 result set is only 100 rows ``` Spark 2.4.5: Time taken: 256812 ms Time taken: 226571 ms Time taken: 305508 ms Spark 3.1.2 spark.time(sql(q4).collect) T

Re: Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-20 Thread Senthil Kumar
e when tried with Spark 3.0.2 and 3.1.x. Once we corrected it, we observed that the queries were executed much faster. Thanks and Regards, Abhishek *From:* Senthil Kumar *Sent:* Sunday, December 1

Re: Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-19 Thread Senthil Kumar
nday, December 19, 2021 11:58 PM *To:* dev *Subject:* Spark 3 is Slower than Spark 2 for TPCDS Q04 query. Hi All, We are comparing Spark 2.4.5 and Spark 3 (without enabling Spark 3 additional features) with TPCDS queries and found that Spark 3's

RE: Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-19 Thread Rao, Abhishek (Nokia - IN/Bangalore)
Regards, Abhishek From: Senthil Kumar Sent: Sunday, December 19, 2021 11:58 PM To: dev Subject: Spark 3 is Slower than Spark 2 for TPCDS Q04 query. Hi All, We are comparing Spark 2.4.5 and Spark 3 (without enabling Spark 3 additional features) with TPCDS queries and found that Spark 3's perfor

Spark 3 is Slower than Spark 2 for TPCDS Q04 query.

2021-12-19 Thread Senthil Kumar
Hi All, We are comparing Spark 2.4.5 and Spark 3 (without enabling Spark 3 additional features) with TPCDS queries and found that Spark 3's performance is reduced by at least 30-40% compared to Spark 2.4.5. E.g., with a data size of 1 TB, Spark 2.4.5 finishes Q4 in 1.5 min, but Spark 3.* takes at-lea