[ https://issues.apache.org/jira/browse/SPARK-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099909#comment-14099909 ]

Mridul Muralidharan commented on SPARK-3019:
--------------------------------------------


I have yet to go through the proposal in detail, so I will defer comments on it 
for now; but to add some clarity to the discussion around Sandy's point:

- Until we have read from all mappers, the shuffle cannot actually start.
Even if a single mapper's output is small enough to fit in memory (which it 
need not be), num_mappers * avg_size_of_map_output_per_reducer can exceed 
available memory by orders of magnitude. (This is fairly common for us, for 
example.)
This was actually the reason we worked on the 2G fix, by the way - individual 
blocks in a mapper, and also the per-reducer data from a mapper, were larger 
than 2G :-)
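A rough, purely illustrative calculation (the figures are made up, not from 
this ticket) shows the scale involved:

    // Hypothetical back-of-envelope numbers, only to illustrate the scale:
    val numMappers = 20000
    val avgMapOutputPerReducerBytes = 64L * 1024 * 1024   // 64 MB from each mapper to one reducer
    val bytesFetchedPerReducer = numMappers * avgMapOutputPerReducerBytes
    // well over a terabyte to be fetched by a single reducer - orders of
    // magnitude more than any executor heap, so the fetched blocks cannot
    // all be held in memory at once.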

- While reading data off the network, we cannot assess whether the data being 
read will fit in memory (since there are other parallel read requests pending 
for this and other cores in the same executor).
So spooling intermediate data to disk becomes necessary on both the mapper 
side (which we already do) and the reducer side (which we do not do currently - 
we assume a block fits in reducer memory as part of doing a remote fetch). This 
becomes more relevant when we want to target bigger blocks of data and tackle 
skew in the data (for shuffle).
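For illustration, the kind of reducer-side spooling I mean could look roughly 
like the sketch below; the names and the threshold are made up, this is not 
Spark's actual fetch path:

    import java.io.{ByteArrayOutputStream, File, FileOutputStream, InputStream, OutputStream}

    // Hypothetical helper: copy a fetched block either into memory or onto
    // local disk, depending on its advertised size.
    def receiveBlock(blockId: String,
                     in: InputStream,
                     sizeHint: Long,
                     maxInMemoryBlockBytes: Long = 48L * 1024 * 1024): Either[Array[Byte], File] = {
      def copy(out: OutputStream): Unit = {
        val buf = new Array[Byte](64 * 1024)
        var n = in.read(buf)
        while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
      }
      if (sizeHint <= maxInMemoryBlockBytes) {
        val out = new ByteArrayOutputStream(sizeHint.toInt)
        copy(out)
        Left(out.toByteArray)                     // small block stays in memory, as today
      } else {
        val file = File.createTempFile(s"shuffle-$blockId-", ".spill")
        val out = new FileOutputStream(file)
        try copy(out) finally out.close()
        Right(file)                               // large block is spooled to local disk
      }
    }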

> Pluggable block transfer (data plane communication) interface
> -------------------------------------------------------------
>
>                 Key: SPARK-3019
>                 URL: https://issues.apache.org/jira/browse/SPARK-3019
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>         Attachments: PluggableBlockTransferServiceProposalforSpark - draft 
> 1.pdf
>
>
> The attached design doc proposes a standard interface for block transferring, 
> which will make future engineering of this functionality easier, allowing the 
> Spark community to provide alternative implementations.
> Block transferring is a critical function in Spark. All of the following 
> depend on it:
> * shuffle
> * torrent broadcast
> * block replication in BlockManager
> * remote block reads for tasks scheduled without locality
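To make the quoted description concrete, a pluggable contract of this kind 
could look roughly like the trait below; the names and signatures are 
illustrative guesses only, the actual interface is the one in the attached 
design doc:

    import java.nio.ByteBuffer
    import scala.concurrent.Future

    // Illustrative sketch of a minimal pluggable block-transfer contract.
    trait BlockTransferService {
      /** Start the service, binding to a port if the implementation needs one. */
      def init(hostname: String): Unit

      /** Fetch one or more blocks from a remote executor, asynchronously. */
      def fetchBlocks(host: String, port: Int, blockIds: Seq[String]): Future[Seq[ByteBuffer]]

      /** Upload a block to a remote executor (e.g. for replication). */
      def uploadBlock(host: String, port: Int, blockId: String, data: ByteBuffer): Future[Unit]

      /** Release any resources (sockets, threads) held by the service. */
      def close(): Unit
    }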



