[ https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205279#comment-15205279 ]
Thomas Graves commented on SPARK-1239:
--------------------------------------

I do like the idea of broadcast, and when I originally tried it I ran into the issue mentioned in the second bullet point, but as long as we synchronize on the requests so that we only broadcast once we should be OK (see the rough sketch at the end of this message). It does seem to have some further constraints, though. With a sufficiently large job I don't think it matters, but what if we only have a small number of reducers? We broadcast to all executors when only a couple need it. I guess that doesn't hurt much unless the other executors start going to the executors your reducers are on and add more load to them. That should be pretty minimal, though.

Broadcast also seems to make less sense with dynamic allocation. At least I've seen issues when executors go away: the fetch from that executor fails, has to retry, etc., adding extra time. We recently fixed one specific issue with this so that it goes and gets the locations again after a certain number of failures. That time should be smaller now that the fix is in, but I'll have to run the numbers. I'll do some more analysis/testing of this and see whether it really matters.

With a sufficient number of threads I don't think a few slow nodes would make much of a difference here; if you have that many slow nodes, the shuffle itself is going to be impacted, which I would see as the larger effect. The slow nodes could just as easily affect the broadcast. Hopefully you skip them as it takes longer for them to serve a chunk, but it's possible that once a slow node has a chunk or two, more and more executors start going to it for the broadcast data instead of the driver, slowing down more transfers. It's a good point, though, and my current method would truly block (for a certain time) rather than just being slow. Note that there is a timeout on waiting for the send to happen, and when it fires the connection is closed and the executor retries. You don't have to worry about that with broadcast. I'll do some more analysis with that approach. I wish Netty had some other built-in mechanisms for flow control.

> Don't fetch all map output statuses at each reducer during shuffles
> -------------------------------------------------------------------
>
>                 Key: SPARK-1239
>                 URL: https://issues.apache.org/jira/browse/SPARK-1239
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Patrick Wendell
>            Assignee: Thomas Graves
>
> Instead we should modify the way we fetch map output statuses to take both a
> mapper and a reducer - or we should just piggyback the statuses on each task.
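Rough sketch of the "broadcast the statuses only once" synchronization mentioned in the comment above. This is only illustrative: the names (MapStatusCache, serializeAndBroadcast, getOrBroadcast) are made up here and are not the actual Spark code.

{code:scala}
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache ensuring concurrent reducer requests for the same shuffle
// trigger at most one (expensive) serialize + broadcast of the map statuses.
final class MapStatusCache {
  // shuffleId -> serialized statuses that have already been broadcast
  private val cached = new ConcurrentHashMap[Int, Array[Byte]]()

  // Stand-in for serializing the MapStatus array and creating the broadcast;
  // this is the expensive step we only want to do once per shuffle.
  private def serializeAndBroadcast(shuffleId: Int): Array[Byte] =
    Array.emptyByteArray

  // All status requests funnel through here. The fast path hits the cache;
  // otherwise we synchronize so only the first request broadcasts and the
  // rest wait for it instead of each starting their own broadcast.
  def getOrBroadcast(shuffleId: Int): Array[Byte] = {
    val existing = cached.get(shuffleId)
    if (existing != null) return existing
    this.synchronized {
      val again = cached.get(shuffleId) // re-check under the lock
      if (again != null) again
      else {
        val bytes = serializeAndBroadcast(shuffleId)
        cached.put(shuffleId, bytes)
        bytes
      }
    }
  }
}
{code}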