[ 
https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204939#comment-15204939
 ] 

Mridul Muralidharan commented on SPARK-1239:
--------------------------------------------

[~tgraves] For the last part (the waiting bit): why not set the threshold at 
which you switch from direct serialization to Broadcast low enough that the 
problem 'goes away'? In my case I was using a fairly high number, but nothing 
stops us from using, say, 1 MB. The number of outstanding requests needed to 
cause a memory issue then becomes so high that it is practically impossible to 
reach.
In general, I don't like the idea of waiting for IO to complete: different 
nodes might have different loads, so over time slow nodes can hold up responses 
and keep the driver from responding to fast nodes.
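
To make the threshold argument concrete, here is a rough sketch in Python (not 
Spark code). All names are hypothetical, and the 1 MB cutoff is just the figure 
floated above. It shows the size-based choice between a direct reply and a 
broadcast, plus the arithmetic behind "the number of outstanding requests 
becomes extremely high".

```python
# Hypothetical cutoff: serialized statuses at or above this size would be
# shipped via broadcast instead of being copied into each direct reply.
BROADCAST_THRESHOLD_BYTES = 1024 * 1024  # 1 MB, an assumed value


def choose_reply(serialized_statuses, threshold=BROADCAST_THRESHOLD_BYTES):
    """Pick the transport for an already-serialized map output status payload.

    Returns "direct" when the payload is below the threshold, else "broadcast".
    """
    if len(serialized_statuses) < threshold:
        return "direct"
    return "broadcast"


def max_outstanding_direct(memory_budget_bytes,
                           threshold=BROADCAST_THRESHOLD_BYTES):
    """Number of in-flight direct replies that still fit in the budget.

    Every direct reply is smaller than `threshold`, so this many pending
    replies together hold less than `memory_budget_bytes`.
    """
    return memory_budget_bytes // threshold


if __name__ == "__main__":
    print(choose_reply(b"x" * 1000))          # small payload
    print(choose_reply(b"x" * (2 * 1024 * 1024)))  # large payload
    print(max_outstanding_direct(1024 ** 3))  # budget of 1 GiB
```

Lowering the threshold raises the count returned by `max_outstanding_direct` 
for a given memory budget, which is the point of the suggestion. How the 
broadcast itself is created and cleaned up is outside this sketch.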

> Don't fetch all map output statuses at each reducer during shuffles
> -------------------------------------------------------------------
>
>                 Key: SPARK-1239
>                 URL: https://issues.apache.org/jira/browse/SPARK-1239
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Patrick Wendell
>            Assignee: Thomas Graves
>
> Instead we should modify the way we fetch map output statuses to take both a 
> mapper and a reducer - or we should just piggyback the statuses on each task. 



