Zhang, Liye created SPARK-4740:
----------------------------------

             Summary: Netty's network bandwidth is much lower than NIO in 
spark-perf and Netty takes longer running time
                 Key: SPARK-4740
                 URL: https://issues.apache.org/jira/browse/SPARK-4740
             Project: Spark
          Issue Type: Improvement
          Components: Shuffle, Spark Core
            Reporter: Zhang, Liye


When testing the current Spark master (1.3.0-snapshot) with spark-perf 
(sort-by-key, aggregate-by-key, etc.), the Netty-based shuffle transferService 
takes much longer than the NIO-based shuffle transferService. The network 
throughput of Netty is only about half that of NIO. 

We tested in standalone mode. The test data set has 20 billion records with a 
total size of about 400GB. The spark-perf tests ran on a 4-node cluster with a 
10G NIC and 48 CPU cores per node, and each executor has 64GB of memory. The 
number of reduce tasks is set to 1000. 
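
For scale, the figures above can be turned into a rough back-of-the-envelope 
estimate. This is only a sketch under stated assumptions, not a measurement: 
it reads "GB" as 10**9 bytes and "10G" as 10 Gbit/s per node, and it assumes 
the shuffled data is about the same size as the input (no compression) and is 
spread evenly across the 4 nodes. All variable names are illustrative.

```python
# Rough estimate derived from the reported test setup.
# Assumptions (not from the report): GB = 10**9 bytes, 10G = 10 Gbit/s,
# shuffle size ~= input size, data evenly distributed over the nodes.

RECORDS = 20 * 10**9         # 20 billion records
TOTAL_BYTES = 400 * 10**9    # about 400GB in total
NODES = 4                    # 4-node cluster
NIC_GBIT_PER_S = 10          # one 10G NIC per node
REDUCE_TASKS = 1000          # number of reduce tasks

# Average record size and average input per reduce task.
bytes_per_record = TOTAL_BYTES / RECORDS
bytes_per_reduce_task = TOTAL_BYTES / REDUCE_TASKS

# NIC line rate in bytes per second.
nic_bytes_per_s = NIC_GBIT_PER_S * 10**9 / 8

# With an even spread, each node must fetch the share of its reduce
# input that lives on the other nodes: (NODES - 1) / NODES of it.
remote_fraction = (NODES - 1) / NODES
bytes_in_per_node = TOTAL_BYTES / NODES * remote_fraction

# Lower bound on pure transfer time if the NIC ran at full line rate.
min_transfer_s = bytes_in_per_node / nic_bytes_per_s

print(f"bytes per record:      {bytes_per_record:.0f}")
print(f"bytes per reduce task: {bytes_per_reduce_task:.0f}")
print(f"inbound per node:      {bytes_in_per_node:.0f}")
print(f"min transfer time (s): {min_transfer_s:.0f}")
```

Under these assumptions each record averages about 20 bytes, each reduce task 
handles about 400MB, and each node has to pull roughly 75GB over the network, 
so any shortfall in sustained shuffle throughput shows up directly in the 
total running time.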



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

