Yuan Zhou created SPARK-32184:
---------------------------------

             Summary: Performance regression on TPCH Q18
                 Key: SPARK-32184
                 URL: https://issues.apache.org/jira/browse/SPARK-32184
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
         Environment: Spark 2.4 and Spark 3.0 are using the same configurations
 * spark.driver.memory 20g
 * spark.executor.memory 20g
 * spark.executor.cores 7
 * spark.executor.memoryOverhead 3g
 * spark.sql.shuffle.partitions 384
            Reporter: Yuan Zhou

Hi Spark developers,

While testing the new Spark 3.0.0, I found a performance regression on TPCH Q18. Spark 2.4 appears to "reuse" the HashAgg (hash aggregate) results in two SMJs (sort-merge joins), while Spark 3.0.0 needs to calculate these results twice.
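
One way to check the reuse described above is to compare the physical plan text from both versions and count the relevant operators. The following is only a minimal sketch: the helper name `count_plan_nodes` is hypothetical and not part of Spark, and the query text is the standard TPC-H Q18 with its validation threshold of 300, assumed to match the query used in this report.

```python
import re

# Standard TPC-H Q18 ("Large Volume Customer"); the threshold 300 is the
# validation value and is an assumption about the query used in this report.
Q18 = """
SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, SUM(l_quantity)
FROM customer, orders, lineitem
WHERE o_orderkey IN (
        SELECT l_orderkey FROM lineitem
        GROUP BY l_orderkey
        HAVING SUM(l_quantity) > 300)
  AND c_custkey = o_custkey
  AND o_orderkey = l_orderkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
ORDER BY o_totalprice DESC, o_orderdate
LIMIT 100
"""


def count_plan_nodes(plan_text: str) -> dict:
    """Count selected operator names in the text of a physical plan.

    Hypothetical helper: it only does whole-word matching on the plan text,
    so it does not need a Spark installation to run.
    """
    names = ("ReusedExchange", "HashAggregate", "SortMergeJoin")
    return {n: len(re.findall(r"\b" + n + r"\b", plan_text)) for n in names}
```

Assuming a session named `spark` with the TPC-H tables registered, the plan text could be obtained with something like `spark.sql("EXPLAIN " + Q18).collect()[0][0]` on each version and passed to the helper. A plan that reuses the aggregated subquery would be expected to contain a `ReusedExchange` node, while a plan that recomputes it would show additional `HashAggregate` nodes instead.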