[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235635#comment-15235635 ]

JESSE CHEN commented on SPARK-13307:
------------------------------------

Performance back on track on Spark 2.0. Closing this.

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> ---------------------------------------------------------
>
>                 Key: SPARK-13307
>                 URL: https://issues.apache.org/jira/browse/SPARK-13307
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average
> about 9% faster. There are a few degraded, and one that is definitely not
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
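For reference, the "30% worse" figure in the quoted report can be checked from the two timings it gives. The helper below is only an illustrative sketch; the function name `slowdown_pct` is made up for this note and is not part of Spark or the TPC-DS kit.

```python
def slowdown_pct(baseline_seconds, new_seconds):
    """Relative slowdown of the new run versus the baseline, in percent."""
    return (new_seconds - baseline_seconds) / baseline_seconds * 100.0

# Timings quoted in the report: 699 s on 1.4.1, 918 s on 1.6.0.
q66 = slowdown_pct(699, 918)
print(f"{q66:.1f}% slower")  # 31.3% slower
```

So the reported 30% is a slight round-down of roughly 31%.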
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174501#comment-15174501 ]

Xiao Li commented on SPARK-13307:
---------------------------------

Thank you very much for your response.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174473#comment-15174473 ]

Davies Liu commented on SPARK-13307:
------------------------------------

In the plan, I saw that column pruning does not work well; this was fixed recently in master.

I ran it locally with scale factor 10. It took 35 seconds (with 1 CPU), and all joins are BroadcastHashJoin.

Could you check out master? It should be faster than 1.4.

It's true that SortMergeJoin could be slower than ShuffleHashJoin; we may revisit that later.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160260#comment-15160260 ]

Xiao Li commented on SPARK-13307:
---------------------------------

You need to check the plan to see which join type it is using. I guess the 100 MB threshold is still too low.

BTW, the 1.6 build also has a statistics-related issue. [~davies] recently delivered a fix for this problem.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160256#comment-15160256 ]

Xiao Li commented on SPARK-13307:
---------------------------------

First, I am not sure that using a broadcast join makes sense in this query, especially when your table size is huge.

Second, are your queries written in SQL or with the DataFrame API? Spark SQL does not provide a broadcast hint for SQL users. If you are using the DataFrame API, you can do something like
{code}
import org.apache.spark.sql.functions.broadcast
df1.join(broadcast(df2), $"key" === $"key2", "leftsemi")
{code}

Third, I think some performance regression is expected as the price of avoiding OOM. SortMergeJoin consumes less memory than ShuffleHashJoin, so it might make more sense to choose SortMergeJoin as the default join type.
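The memory trade-off described in the comment above can be illustrated with a toy, single-machine sketch. This is not Spark's implementation: the function names and the `(key, value)` row layout are assumptions made for this note. The hash join must hold one whole side in a hash table, while the merge join only needs both inputs sorted and then walks them with two cursors. In a real engine the sort step can typically spill to disk, which this in-memory toy does not model.

```python
from collections import defaultdict

def hash_join(left, right):
    """Inner join: build a hash table on `right`, then probe it with `left`."""
    table = defaultdict(list)  # holds ALL rows of `right` in memory
    for key, value in right:
        table[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in table.get(key, [])]

def sort_merge_join(left, right):
    """Inner join: sort both sides by key, then advance two cursors in step."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-side rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left-side row in the matching run with that run.
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], rv) for _, rv in right[j:j_end])
                i += 1
            j = j_end
    return out

left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
# Both strategies produce the same set of joined rows.
same = sorted(hash_join(left_rows, right_rows)) == sorted(sort_merge_join(left_rows, right_rows))
```

Both functions return the same rows; they differ in what must be resident in memory while the join runs, which is the point being made about choosing SortMergeJoin as the default.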
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159636#comment-15159636 ]

JESSE CHEN commented on SPARK-13307:
------------------------------------

I raised autoBroadcastJoinThreshold to 100MB, and it made no difference:

spark.sql.autoBroadcastJoinThreshold 104857600

Query time is 913 seconds.

Now, how would you change the query to add the [broadcast] hint? I will try it if you provide me with the modified query. Thanks.
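As a quick check on the value above: the setting is expressed in bytes, and 100 MB in binary units is exactly the number used. The sketch below only does the unit conversion and formats a `spark-defaults.conf`-style line; the helper name `mb_to_bytes` is made up for this note.

```python
def mb_to_bytes(megabytes):
    """Convert binary megabytes (MiB) to bytes."""
    return megabytes * 1024 * 1024

threshold = mb_to_bytes(100)
# One configuration line in "key value" form.
conf_line = f"spark.sql.autoBroadcastJoinThreshold {threshold}"
print(conf_line)  # spark.sql.autoBroadcastJoinThreshold 104857600
```

As far as I understand, the threshold is compared against Spark's own size estimate for a table, so raising it has no effect when that estimate is larger than the new limit or is unavailable. That would be consistent with the unchanged 913-second timing reported here.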
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146765#comment-15146765 ]

Xiao Li commented on SPARK-13307:
---------------------------------

Shuffle hash join was removed from Spark SQL in the following PR: https://github.com/apache/spark/pull/9645

Please check whether broadcast join works in this test case. You can also use a hint to force the broadcast join.

Let me CC [~rxin] [~yhuai] [~marmbrus]
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146740#comment-15146740 ]

JESSE CHEN commented on SPARK-13307:
------------------------------------

Uploaded newly collected plans (logical, analyzed, optimized and physical). The links are the same:
https://ibm.box.com/spark-sql-q66-debug-160plan
https://ibm.box.com/spark-sql-q66-debug-141plan

Please let me know if you need me to collect any additional info. Thanks.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146755#comment-15146755 ]

Xiao Li commented on SPARK-13307:
---------------------------------

Please tune "spark.sql.autoBroadcastJoinThreshold" to enable the broadcast join. Thanks!
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146751#comment-15146751 ]

Xiao Li commented on SPARK-13307:
---------------------------------

1.6.1 is using SortMergeJoin, but 1.4.1 is using ShuffleHashJoin. I believe this is the major cause of the performance difference.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146138#comment-15146138 ]

JESSE CHEN commented on SPARK-13307:
------------------------------------

Have you looked at the plans? I provided them at the links above.
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146140#comment-15146140 ]

Xiao Li commented on SPARK-13307:
---------------------------------

Could you provide the logical plans, as suggested above? The attachment only contains the physical plans. Thanks!
[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
[ https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145829#comment-15145829 ]

Xiao Li commented on SPARK-13307:
---------------------------------

Please use explain(true). The analysis will be much easier with the logical plans to read as well. Thanks!