[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261811#comment-13261811 ]
Siddharth Seth commented on MAPREDUCE-3921:
-------------------------------------------

Thanks for the updated patch, Bikas. Will take a look.

Still waiting for input from the MR veterans on some of the previous comments - how things were handled in 20 - specifically for killing map/reduce tasks on unhealthy nodes, and treating 'node unhealthy' similarly to 'fetch failure' (state Killed / Failed, as well as counting towards max_attempts).

bq. About the OBSOLETE part. I get how it is used. What I dont get is why we are marking a previously successful task as obsolete and invalid upon the completion of a new task without first checking if the new task was itself successful or not.

Are you considering leaving the task in the SUCCESSFUL state, even if it's being retried, so that the Reduce *may* be able to pull data before there's a new SUCCESSFUL attempt? Otherwise, marking the attempt as OBSOLETE and removing the task from successAttemptCompletionEventNoMap (which tracks only SUCCESSFUL attempts) seems like the correct thing to do.

> MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3921
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Bikas Saha
>             Fix For: 0.23.2
>
>         Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, MAPREDUCE-3921-6.patch, MAPREDUCE-3921-7.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch
>
>

--
This message is automatically generated by JIRA.