[ https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287461#comment-16287461 ]

Eyal Farago commented on SPARK-21867:
-------------------------------------

[~ericvandenbergfb], looks good, a few questions though:
1. What is the initial number of sorters?
2. Assuming the initial number of sorters is > 1, doesn't this potentially 
increase the number of spills? If a single sorter would spill after N records, a 
multi-sorter setup will spill after N/k records (where k is the number of 
sorters), so doesn't this mean more spills and merges?
3. When a sorter hits the spill threshold, does it immediately spill, or will it 
keep going if there's available execution memory? It might make sense to spill 
and raise the threshold in this condition (see the sketch after this list)...
4. Merging spills may require some attention: compression, encryption, etc. 
Also, avoid having too many open files/buffers at the same time.
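
To make question 3 concrete, here is a rough, hypothetical sketch of one way the 
threshold check could work: keep raising the per-sorter threshold while execution 
memory can still be acquired, and only spill once it can't. The class and the 
MemoryPool abstraction are made up for illustration and are not the actual 
ShuffleExternalSorter API.

{code:java}
// Hypothetical sketch only: decide between raising the spill threshold
// and spilling, depending on whether more execution memory is available.
class SorterSpillPolicy {
  private long spillThresholdBytes;
  private final MemoryPool executionPool;   // assumed abstraction over the task memory manager

  SorterSpillPolicy(long initialThresholdBytes, MemoryPool executionPool) {
    this.spillThresholdBytes = initialThresholdBytes;
    this.executionPool = executionPool;
  }

  /** Returns true if the caller should spill now, false if the threshold was raised instead. */
  boolean shouldSpill(long bytesInMemory, long requestedBytes) {
    if (bytesInMemory < spillThresholdBytes) {
      return false;                          // still under threshold, keep inserting records
    }
    long granted = executionPool.tryAcquire(requestedBytes);
    if (granted > 0) {
      spillThresholdBytes += granted;        // memory still available: raise threshold, keep going
      return false;
    }
    return true;                             // no memory left: spill this sorter now
  }
}

interface MemoryPool {
  /** Try to acquire up to {@code bytes}; returns the amount actually granted (0 if none). */
  long tryAcquire(long bytes);
}
{code}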

Looking forward to the PR.

> Support async spilling in UnsafeShuffleWriter
> ---------------------------------------------
>
>                 Key: SPARK-21867
>                 URL: https://issues.apache.org/jira/browse/SPARK-21867
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Sital Kedia
>            Priority: Minor
>         Attachments: Async ShuffleExternalSorter.pdf
>
>
> Currently, Spark tasks are single-threaded. But we see that multi-threading 
> some parts of a task could greatly improve job performance. For example, 
> profiling our map tasks, which read a large amount of data from HDFS and spill 
> to disk, we see that we are blocked on HDFS reads and spilling the majority of 
> the time. Since both of these operations are IO-intensive, the average CPU 
> consumption during the map phase is significantly low. In theory, HDFS reads 
> and spilling could be done in parallel if we had additional memory to store 
> data read from HDFS while we are spilling the last batch read.
> Let's say we have 1 GB of shuffle memory available per task. Currently, a map 
> task reads from HDFS and stores the records in the available memory buffer. 
> Once we hit the memory limit and there is no more space to store records, we 
> sort and spill the contents to disk. While we are spilling to disk, since we 
> have no available memory, we cannot read from HDFS concurrently. 
> Here we propose supporting async spilling for UnsafeShuffleWriter, so that we 
> can keep reading from HDFS while the sort and spill happen asynchronously. 
> Let's say the total 1 GB of shuffle memory is split into two regions - an 
> active region and a spilling region - each of size 500 MB. We start by reading 
> from HDFS and filling the active region. Once we hit the limit of the active 
> region, we issue an asynchronous spill while flipping the active and spilling 
> regions. While the spill is happening asynchronously, we still have 500 MB of 
> memory available to read data from HDFS. This way we can amortize the high 
> disk/network IO cost during spilling.
> We made a prototype hack to implement this feature and saw our map tasks run 
> as much as 40% faster. 
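
Below is a minimal sketch of the active/spilling region flip described in the 
quoted proposal, not the actual patch: two fixed-size in-memory regions, one 
filled by the reader while the other is sorted and spilled on a single background 
thread. All names are illustrative; the real change would live inside 
UnsafeShuffleWriter / ShuffleExternalSorter.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

class DoubleBufferedSpiller<T> {
  private final int regionCapacity;                 // records fitting in one region (e.g. ~500 MB)
  private final ExecutorService spillExecutor = Executors.newSingleThreadExecutor();
  private List<T> activeRegion;
  private Future<?> pendingSpill = CompletableFuture.completedFuture(null);

  DoubleBufferedSpiller(int regionCapacity) {
    this.regionCapacity = regionCapacity;
    this.activeRegion = new ArrayList<>(regionCapacity);
  }

  /** Called by the HDFS reader for every record. */
  void insert(T record) throws Exception {
    if (activeRegion.size() >= regionCapacity) {
      // Wait for the previous async spill (at most one outstanding spill),
      // then flip regions and spill the full one in the background.
      pendingSpill.get();
      List<T> spillingRegion = activeRegion;
      activeRegion = new ArrayList<>(regionCapacity);
      pendingSpill = spillExecutor.submit(() -> sortAndSpill(spillingRegion));
    }
    activeRegion.add(record);
  }

  /** Placeholder for the sort + write-to-disk step of a real sorter. */
  private void sortAndSpill(List<T> region) {
    // sort records and write a spill file; omitted in this sketch
  }

  /** Flush remaining records and wait for the outstanding spill before merging. */
  void close() throws Exception {
    pendingSpill.get();
    if (!activeRegion.isEmpty()) {
      sortAndSpill(activeRegion);
    }
    spillExecutor.shutdown();
  }
}
{code}

With at most one outstanding spill, the reader only blocks when it fills the 
active region faster than the previous region can be spilled, which is the 
amortization of disk/network IO that the proposal describes.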


