[jira] [Commented] (MAPREDUCE-5891) Improved shuffle error handling across NM restarts

Ming Ma (JIRA) Mon, 08 Sep 2014 22:46:56 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126643#comment-14126643
 ]


Ming Ma commented on MAPREDUCE-5891:
------------------------------------

Thanks, Junping. Regarding the default values: 
mapreduce.reduce.shuffle.fetch.retry.enabled defaults to true, while NM 
recovery defaults to false. So by default the fetcher will retry even though 
the ShuffleHandler won't be able to serve map outputs after an NM restart. It 
doesn't seem like a big deal; I just want to check whether that is 
intentional. Do we foresee other scenarios where fetch retry will be useful? If 
not, reducers could ask YARN, or ask the ShuffleHandler, whether NM recovery is 
enabled, and we would not need to define this retry property at all.
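
To make the mismatch concrete, here is a minimal sketch of the two options, 
not code from any patch. It assumes the NM recovery flag is the YARN property 
yarn.nodemanager.recovery.enabled (the comment only says "NM recovery"), models 
the configuration as a plain dict, and uses the defaults stated above. The 
helper names are hypothetical.

```python
# Property names. FETCH_RETRY comes from the comment; NM_RECOVERY is assumed
# to be the YARN property that controls NM recovery.
FETCH_RETRY = "mapreduce.reduce.shuffle.fetch.retry.enabled"
NM_RECOVERY = "yarn.nodemanager.recovery.enabled"

# Defaults as described in the comment (not read from a real cluster).
DEFAULTS = {FETCH_RETRY: "true", NM_RECOVERY: "false"}


def get_bool(conf, key):
    """Read a boolean property, falling back to the assumed default."""
    return conf.get(key, DEFAULTS[key]).strip().lower() == "true"


def retry_can_succeed(conf):
    """Current approach: a separate retry property.

    Retrying only pays off if the NM can serve map outputs again after
    it restarts, i.e. if NM recovery is also enabled.
    """
    return get_bool(conf, FETCH_RETRY) and get_bool(conf, NM_RECOVERY)


def derived_fetch_retry(conf):
    """Alternative: no retry property; follow the NM recovery setting."""
    return get_bool(conf, NM_RECOVERY)


if __name__ == "__main__":
    default_conf = {}
    # With the defaults, retry is on but cannot succeed after an NM restart.
    print(get_bool(default_conf, FETCH_RETRY), retry_can_succeed(default_conf))
    # With NM recovery turned on, the derived setting enables retry.
    print(derived_fetch_retry({NM_RECOVERY: "true"}))
```

With the defaults, the first function reports that retrying cannot succeed even 
though retry is enabled, which is the case being called out. The second 
function removes that inconsistency by construction.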

> Improved shuffle error handling across NM restarts
> --------------------------------------------------
>
>                 Key: MAPREDUCE-5891
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-5891-demo.patch, MAPREDUCE-5891-v2.patch, 
> MAPREDUCE-5891-v3.patch, MAPREDUCE-5891-v4.patch, MAPREDUCE-5891.patch
>
>
> To minimize the number of map fetch failures reported by reducers across an 
> NM restart, it would be nice if reducers only reported a fetch failure after 
> trying for a specified period of time to retrieve the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
