Zhang, Liye created SPARK-14242:
-----------------------------------

             Summary: avoid too many copies in network when a network frame is large
                 Key: SPARK-14242
                 URL: https://issues.apache.org/jira/browse/SPARK-14242
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.6.1, 1.6.0, 2.0.0
            Reporter: Zhang, Liye
            Priority: Critical


When a shuffle block is huge, say a large array of more than 128MB, fetching remote blocks suffers a performance problem. The network frame is large, and the composite buffer in the underlying Netty layer consolidates its components whenever their number reaches the maximum (default is 16). Each consolidation copies all the bytes accumulated so far into a fresh buffer, so there are far too many memory copies inside Netty's *CompositeByteBuf*.
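
A minimal sketch of this consolidation behavior, assuming Netty 4.0.x (the line Spark 1.6 ships) is on the classpath; later Netty releases changed this code path:

{code}
import io.netty.buffer.Unpooled

// Sketch: CompositeByteBuf consolidation under Netty 4.0.x semantics.
object CompositeBufferCopies {
  def main(args: Array[String]): Unit = {
    // maxNumComponents = 16 is Netty's default
    val composite = Unpooled.compositeBuffer(16)

    // Add 17 small components, simulating a large frame that arrives
    // as many small chunks from the wire.
    for (_ <- 0 until 17) {
      composite.addComponent(Unpooled.wrappedBuffer(new Array[Byte](1024)))
    }

    // The 17th addition pushed the count past the maximum, so Netty
    // copied all 17 components into one consolidated buffer; under
    // 4.0.x this prints 1 instead of 17.
    println(composite.numComponents())
  }
}
{code}

Every time the component count hits the cap, the consolidation copies everything received so far, so the total bytes copied grows roughly quadratically with the frame size.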

How to reproduce:
{code}
sc.parallelize(Array(1, 2, 3), 3)
  .mapPartitions(a => Array(new Array[Double](1024 * 1024 * 50)).iterator)
  .reduce((a, b) => a)
  .length
{code}
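
Each of the three partitions emits a single Array[Double] of 1024 * 1024 * 50 elements; at 8 bytes per double that is 400MB per task result, well above the 128MB mentioned above.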

In this case the serialized result of each task is about 400MB, so the result is transferred to the driver as an *IndirectTaskResult*. After the data has been transferred, the driver still needs a lot of time to process it: the 3 CPUs (parallelism is 3 in this case) are fully utilized, with very high system-call time. And this processing time is counted as result-getting time on the web UI.

Such cases are very common in ML applications, where each executor returns a large array.


