[jira] [Commented] (SPARK-26155) Spark SQL performance degradation after apply SPARK-21052 with Q19 of TPC-DS in 3TB scale

Ke Jia (JIRA) Thu, 22 Nov 2018 23:49:59 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-26155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696476#comment-16696476
 ]


Ke Jia commented on SPARK-26155:
--------------------------------

*Cluster info:*
| |*Master Node*|*Worker Nodes* |
|*Node*|1x |7x|
|*Processor*|Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz|Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz|
|*Memory*|192 GB|384 GB|
|*Storage Main*|8 x 960G SSD|8 x 960G SSD|
|*Network*|10 GbE|
|*Role*|CM Management, NameNode, Secondary NameNode, Resource Manager, Hive Metastore Server|DataNode, NodeManager|
|*OS Version*|CentOS 7.2|
|*Hadoop*|Apache Hadoop 2.7.5|
|*Hive*|Apache Hive 2.2.0|
|*Spark*|Apache Spark 2.1.0 vs. Apache Spark 2.3.0|
|*JDK version*|1.8.0_112|

*Related parameter settings:*
|*Component*|*Parameter*|*Value*|
|*Yarn Resource Manager*|yarn.scheduler.maximum-allocation-mb|40GB|
| |yarn.scheduler.minimum-allocation-mb|1GB|
| |yarn.scheduler.maximum-allocation-vcores|121|
| |yarn.resourcemanager.scheduler.class|Fair Scheduler|
|*Yarn Node Manager*|yarn.nodemanager.resource.memory-mb|40GB|
| |yarn.nodemanager.resource.cpu-vcores|121|
|*Spark*|spark.executor.memory|34GB|
| |spark.executor.cores|40|
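
As a rough check of how these settings fit together, the sketch below estimates the YARN container size requested for one executor. It assumes the memory overhead was left at Spark's default of max(384 MiB, 10% of spark.executor.memory), which the comment does not state, and it ignores YARN's rounding of requests to its allocation increment. The function names are illustrative only.

```python
MIB_PER_GIB = 1024


def executor_container_mib(executor_memory_mib, overhead_mib=None):
    """Estimate the container request for one executor, in MiB.

    Assumption: when no explicit overhead is given, use Spark's default
    rule of max(384 MiB, 10% of executor memory).
    """
    if overhead_mib is None:
        overhead_mib = max(384, executor_memory_mib * 10 // 100)
    return executor_memory_mib + overhead_mib


def executors_per_node(node_memory_mib, node_vcores, container_mib, executor_cores):
    """How many executors fit on one NodeManager, by memory and by vcores."""
    return min(node_memory_mib // container_mib, node_vcores // executor_cores)


# Values taken from the parameter table above.
container = executor_container_mib(34 * MIB_PER_GIB)
fits = container <= 40 * MIB_PER_GIB  # yarn.scheduler.maximum-allocation-mb
per_node = executors_per_node(40 * MIB_PER_GIB, 121, container, 40)
print(container, fits, per_node)
```

Under these assumptions a 34GB executor stays within the 40GB allocation limit, and memory rather than vcores caps the number of executors per worker.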

> Spark SQL  performance degradation after apply SPARK-21052 with Q19 of TPC-DS 
> in 3TB scale
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26155
>                 URL: https://issues.apache.org/jira/browse/SPARK-26155
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.4.0
>            Reporter: Ke Jia
>            Priority: Major
>
> In our test environment, we found a serious performance degradation in 
> Spark 2.3 when running TPC-DS on SKX 8180. Several queries are affected. 
> For example, on 3TB of data, TPC-DS Q19 takes 126 seconds with Spark 2.3 
> but only 29 seconds with Spark 2.1. We investigated this problem and found 
> that the root cause is the community patch SPARK-21052, which adds metrics 
> to the hash join process. The affected code is 
> [L486|https://github.com/apache/spark/blob/1d3dd58d21400b5652b75af7e7e53aad85a31528/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala#L486]
>  and 
> [L487|https://github.com/apache/spark/blob/1d3dd58d21400b5652b75af7e7e53aad85a31528/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala#L487]
>   . Q19 takes about 30 seconds without these two lines of code and 126 
> seconds with them.
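
To illustrate the kind of overhead described in the issue, here is a minimal sketch of a lookup path with and without per-lookup metric updates. It is not the Spark code: the class, the method names, and the two counters are hypothetical, and it only assumes that the two referenced lines are bookkeeping of the sort a metrics patch would add to every hash lookup. Timings depend on the machine and runtime, so none are shown.

```python
import time


class DenseLongMap:
    """Toy dense map: key k is stored at slot k - min_key (hypothetical)."""

    def __init__(self, keys, values):
        self.min_key = min(keys)
        self.slots = [None] * (max(keys) - self.min_key + 1)
        for k, v in zip(keys, values):
            self.slots[k - self.min_key] = v
        # Illustrative metric counters, updated only by get_with_metrics.
        self.num_key_lookups = 0
        self.num_probes = 0

    def get(self, key):
        """Plain lookup with no bookkeeping."""
        idx = key - self.min_key
        if 0 <= idx < len(self.slots):
            return self.slots[idx]
        return None

    def get_with_metrics(self, key):
        """Same lookup, plus two field updates on every call."""
        self.num_key_lookups += 1
        self.num_probes += 1
        return self.get(key)


def time_lookups(with_metrics, n_keys=1000, n_lookups=200_000):
    """Run n_lookups lookups; return (seconds, checksum, counted lookups)."""
    m = DenseLongMap(list(range(n_keys)), list(range(n_keys)))
    lookup = m.get_with_metrics if with_metrics else m.get
    start = time.perf_counter()
    total = 0
    for i in range(n_lookups):
        total += lookup(i % n_keys)
    return time.perf_counter() - start, total, m.num_key_lookups
```

Calling `time_lookups(False)` and `time_lookups(True)` compares the two paths on the same keys. For reference, the figures reported in the issue correspond to a slowdown of roughly 126 / 29 ≈ 4.3x on Q19.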



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org