[ https://issues.apache.org/jira/browse/HADOOP-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490118 ]
Doug Cutting commented on HADOOP-1270:
--------------------------------------
> randomize the list before firing the fetches
Yes, that was the original reason for randomizing, to avoid overloading nodes
as their maps complete. +1
> Randomize the fetch of map outputs
> ----------------------------------
>
> Key: HADOOP-1270
> URL: https://issues.apache.org/jira/browse/HADOOP-1270
> Project: Hadoop
> Issue Type: Improvement
> Components: mapred
> Reporter: Arun C Murthy
> Fix For: 0.13.0
>
>
> HADOOP-248 did away with random probing of maps for locating map outputs;
> instead we now rely on TaskCompletionEvents for the same.
> However, we lost a side benefit of the randomized probing: it kept the
> map's jetty from being overloaded with requests for its outputs. We now have
> a situation where a map completes, the JT is notified, *all* the reduces get
> the TaskCompletionEvent and pretty much swamp the poor map's jetty, and this
> repeats for each map.
> I propose we make a minor change where we collect a set of
> TaskCompletionEvents and randomize the list before firing the fetches. Should
> help fix this mass-hysteria at the map's jetty.
> Thoughts?
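
For illustration only, the proposal above (collect a batch of completion events, then randomize the list before firing the fetches) might be sketched roughly as follows. This is not the actual patch: MapOutputLocation and randomizeFetchOrder are hypothetical names, and the class stands in for whatever the reduce extracts from each TaskCompletionEvent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledFetchSketch {

    /** Hypothetical stand-in for the data a reduce needs to fetch one map output. */
    static final class MapOutputLocation {
        final String mapTaskId;
        final String host;

        MapOutputLocation(String mapTaskId, String host) {
            this.mapTaskId = mapTaskId;
            this.host = host;
        }
    }

    /**
     * Returns a randomly ordered copy of a batch of collected locations.
     * The input list is left untouched. Each reduce would use its own
     * Random, so different reduces visit the same maps in different orders.
     */
    static List<MapOutputLocation> randomizeFetchOrder(
            List<MapOutputLocation> batch, Random rng) {
        List<MapOutputLocation> shuffled = new ArrayList<>(batch);
        Collections.shuffle(shuffled, rng);
        return shuffled;
    }

    public static void main(String[] args) {
        // Assumed batch: locations gathered from several completion events.
        List<MapOutputLocation> batch = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            batch.add(new MapOutputLocation("map_" + i, "host" + i));
        }
        // The println stands in for starting the real fetch.
        for (MapOutputLocation loc : randomizeFetchOrder(batch, new Random())) {
            System.out.println("fetch " + loc.mapTaskId + " from " + loc.host);
        }
    }
}
```

Shuffling only changes the order within a collected batch, so how well the load is spread depends on how many events are gathered before the fetches start; that batch size is left open here.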