[ https://issues.apache.org/jira/browse/SPARK-26632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcelo Vanzin reassigned SPARK-26632:
--------------------------------------

    Assignee: jiafu zhang

> Separate Thread Configurations of Driver and Executor
> -----------------------------------------------------
>
>                 Key: SPARK-26632
>                 URL: https://issues.apache.org/jira/browse/SPARK-26632
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: jiafu zhang
>            Assignee: jiafu zhang
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> While benchmarking Spark 2.4.0 on HPC (High Performance Computing), we
> identified an area that can be optimized to improve RPC performance on a
> large number of HPC nodes with Omni-Path NICs: the driver and the executors
> currently share the same thread configurations. Our tests show that the
> driver and the executors should have different thread configurations,
> because the driver handles far more RPC messages than a single executor.
>
> These configurations are:
> ||Config Key||for Driver||for Executor||
> |spark.rpc.io.serverThreads|spark.driver.rpc.io.serverThreads|spark.executor.rpc.io.serverThreads|
> |spark.rpc.io.clientThreads|spark.driver.rpc.io.clientThreads|spark.executor.rpc.io.clientThreads|
> |spark.rpc.netty.dispatcher.numThreads|spark.driver.rpc.netty.dispatcher.numThreads|spark.executor.rpc.netty.dispatcher.numThreads|
>
> When Spark reads a thread configuration, it first tries the driver-specific
> or executor-specific key, then falls back to the common key.
>
> After the separation, performance improves by 18.8% on 256 nodes and by
> 24.9% on 512 nodes. See the SimpleMapTask test results below.
> ||Cluster Size||spark.driver.rpc.io.serverThreads||spark.driver.rpc.io.clientThreads||spark.driver.rpc.netty.dispatcher.numThreads||spark.executor.rpc.netty.dispatcher.numThreads||Overall Time (s)||Overall Time without Separation (s)||Improvement||
> |128 nodes|15|15|10|30|107|108|0.9%|
> |256 nodes|12|15|10|30|159|196|18.8%|
> |512 nodes|12|15|10|30|283|377|24.9%|
>
> The implementation is almost done.
> We are working on the code merge.
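The lookup order described in the issue (role-specific key first, then the common key) could be sketched roughly as follows. This is only an illustration in Python, not Spark's actual implementation: the helper name `resolve_thread_conf` and the plain-dict configuration are hypothetical, and the common value of 8 is made up for the example. The values 15 and 30 come from the 128-node row of the results table.

```python
from typing import Mapping, Optional


def resolve_thread_conf(conf: Mapping[str, str], role: str, suffix: str) -> Optional[int]:
    """Hypothetical helper: resolve a thread count for "driver" or "executor".

    `suffix` is the part of the key after "spark.", e.g. "rpc.io.serverThreads".
    The role-specific key is tried first, then the common key.
    """
    if role not in ("driver", "executor"):
        raise ValueError(f"unknown role: {role!r}")
    for key in (f"spark.{role}.{suffix}", f"spark.{suffix}"):
        if key in conf:
            return int(conf[key])
    return None  # neither key is set; the caller picks its own default


# Illustrative settings only; the common value "8" is an assumption.
conf = {
    "spark.rpc.io.serverThreads": "8",
    "spark.driver.rpc.io.serverThreads": "15",
    "spark.executor.rpc.netty.dispatcher.numThreads": "30",
}

driver_server = resolve_thread_conf(conf, "driver", "rpc.io.serverThreads")
executor_server = resolve_thread_conf(conf, "executor", "rpc.io.serverThreads")
executor_dispatcher = resolve_thread_conf(conf, "executor", "rpc.netty.dispatcher.numThreads")
driver_client = resolve_thread_conf(conf, "driver", "rpc.io.clientThreads")
```

Here the driver picks up its own `serverThreads` setting, the executor falls back to the common one, and a key that is set nowhere resolves to `None`.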