[ https://issues.apache.org/jira/browse/MAPREDUCE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139364#comment-14139364 ]
Jason Lowe commented on MAPREDUCE-5891:
---------------------------------------

Thanks for updating the patch, Junping.

The property in mapred-default.xml should be updated to use ${yarn.nodemanager.recovery.enabled}; otherwise, in practice, we'll never fall back to the code default that tries to look up the value from the NM property.

Nit: huffleFetchEnabledDefault should be shuffleFetchEnabledDefault

[~mingma], do you have additional comments or concerns?

> Improved shuffle error handling across NM restarts
> --------------------------------------------------
>
>                 Key: MAPREDUCE-5891
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-5891-demo.patch, MAPREDUCE-5891-v2.patch,
> MAPREDUCE-5891-v3.patch, MAPREDUCE-5891-v4.patch, MAPREDUCE-5891-v5.patch,
> MAPREDUCE-5891.patch
>
>
> To minimize the number of map fetch failures reported by reducers across an
> NM restart it would be nice if reducers only reported a fetch failure after
> trying for a specified period of time to retrieve the data.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
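
For readers unfamiliar with the behavior the issue description asks for, here is a minimal sketch of "report a fetch failure only after retrying for a specified period". It is not the code from any of the attached patches: the class name FetchRetryWindow, its methods, and the injected clock are all hypothetical and exist only for illustration, and the timeout value in main is arbitrary.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch only; not the Fetcher code from the MAPREDUCE-5891 patches.
 * Tracks how long fetches from one host have been failing and says when the
 * failure should finally be reported.
 */
public class FetchRetryWindow {
    private final long retryTimeoutMs;  // how long to keep retrying before reporting
    private final LongSupplier clock;   // injectable clock so the logic can be tested
    private long firstFailureMs = -1;   // -1 means no failure is currently outstanding

    public FetchRetryWindow(long retryTimeoutMs, LongSupplier clock) {
        this.retryTimeoutMs = retryTimeoutMs;
        this.clock = clock;
    }

    /**
     * Call after each failed fetch attempt. Returns true only once the retry
     * window, measured from the first failure in the current streak, has elapsed.
     */
    public boolean shouldReportFailure() {
        long now = clock.getAsLong();
        if (firstFailureMs < 0) {
            firstFailureMs = now;       // start of a new failure streak
        }
        return now - firstFailureMs >= retryTimeoutMs;
    }

    /** Call after a successful fetch; the next failure starts a fresh window. */
    public void onSuccess() {
        firstFailureMs = -1;
    }

    public static void main(String[] args) {
        long[] now = {0L};              // fake clock, in milliseconds
        FetchRetryWindow window = new FetchRetryWindow(180_000L, () -> now[0]);

        System.out.println(window.shouldReportFailure()); // first failure: keep retrying
        now[0] = 60_000L;
        System.out.println(window.shouldReportFailure()); // still inside the window
        now[0] = 180_000L;
        System.out.println(window.shouldReportFailure()); // window elapsed: report
    }
}
```

With a timeout of 0 the first failure is reported immediately, which matches the behavior one would expect when retrying is disabled. An NM that comes back within the window leads to onSuccess() and no fetch failure is ever reported for that restart.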