Sital Kedia created SPARK-21867:
-----------------------------------

             Summary: Support async spilling in UnsafeShuffleWriter
                 Key: SPARK-21867
                 URL: https://issues.apache.org/jira/browse/SPARK-21867
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Sital Kedia
            Priority: Minor


Currently, Spark tasks are single-threaded, but we believe job performance could be 
greatly improved by multi-threading some parts of them. For example, profiling our 
map tasks, which read large amounts of data from HDFS and spill to disk, shows 
that they are blocked on HDFS reads and spilling for the majority of their 
runtime. Since both operations are IO-intensive, the average CPU utilization 
during the map phase is quite low. In principle, HDFS reads and spilling could 
proceed in parallel if we had additional memory to buffer data read from HDFS 
while the previous batch is being spilled.

Let's say we have 1 GB of shuffle memory available per task. Currently, a map 
task reads from HDFS and stores the records in the available memory buffer. Once 
we hit the memory limit and there is no more space for records, we sort the 
buffer's contents and spill them to disk. While we are spilling to disk, since 
we have no memory available, we cannot read from HDFS concurrently.

Here we propose supporting async spilling in UnsafeShuffleWriter, so that we can 
keep reading from HDFS while sorting and spilling happen asynchronously. Let's 
say the total 1 GB of shuffle memory is split into two regions - an active 
region and a spilling region - each of size 500 MB. We start by reading from 
HDFS into the active region. Once we hit the limit of the active region, we 
issue an asynchronous spill and flip the roles of the active and spilling 
regions. While the spill is happening asynchronously, we still have 500 MB of 
memory available to read data from HDFS. This way we can amortize the high 
disk/network IO cost of spilling.
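A minimal sketch of the double-buffering scheme above, in Java since 
UnsafeShuffleWriter is Java. All names here (Region, spill(), the tiny 
4-record capacity standing in for 500 MB) are illustrative assumptions, not 
actual Spark APIs; the point is only the flip: wait for the previous async 
spill, swap the two regions, and keep ingesting into the newly active one 
while a background thread spills the full one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncSpillSketch {
    // Stand-in for the 500 MB region limit.
    static final int REGION_CAPACITY = 4;

    // Hypothetical memory region holding buffered records.
    static class Region {
        final List<String> records = new ArrayList<>();
        boolean isFull() { return records.size() >= REGION_CAPACITY; }
    }

    static int spilledRecords = 0;

    // Simulates sorting a region and spilling its contents to disk.
    static void spill(Region r) {
        r.records.sort(String::compareTo);
        spilledRecords += r.records.size();
        r.records.clear();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService spiller = Executors.newSingleThreadExecutor();
        Region active = new Region();
        Region spilling = new Region();
        Future<?> pendingSpill = null;

        // Stand-in for the stream of records read from HDFS.
        for (int i = 0; i < 10; i++) {
            if (active.isFull()) {
                // Ensure the other region's previous spill has finished...
                if (pendingSpill != null) pendingSpill.get();
                // ...then flip the regions and spill the full one asynchronously.
                Region full = active;
                active = spilling;
                spilling = full;
                pendingSpill = spiller.submit(() -> spill(full));
            }
            // Reading continues while the background spill is in flight.
            active.records.add("record-" + i);
        }
        if (pendingSpill != null) pendingSpill.get();
        spill(active);  // final synchronous spill of the remaining records
        spiller.shutdown();
        System.out.println("spilled " + spilledRecords + " records");
    }
}
```

The Future.get() before each flip is what bounds memory at two regions: a new 
spill is never issued until the previous one has drained its region.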

We implemented a prototype of this feature and saw our map tasks run as much as 
40% faster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
