[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261811#comment-13261811
 ] 

Siddharth Seth commented on MAPREDUCE-3921:
-------------------------------------------

Thanks for the updated patch, Bikas. Will take a look. Still waiting for input 
from the MR veterans on some of the previous comments - how things were handled 
in 20 - specifically for killing map/reduce tasks on unhealthy nodes, and 
treating 'node unhealthy' similarly to 'fetch failure' (state Killed / Failed, 
as well as counting towards max_attempts). 

bq. About the OBSOLETE part. I get how it is used. What I don't get is why we 
are marking a previously successful task as obsolete and invalid upon the 
completion of a new task without first checking if the new task was itself 
successful or not.
Are you considering leaving the task in SUCCESSFUL state, even if it's being 
retried, so that the Reduce *may* be able to pull data before there's a new 
SUCCESSFUL attempt?
Otherwise, marking the attempt as OBSOLETE and removing the task from 
successAttemptCompletionEventNoMap (tracks only SUCCESSFUL attempts) seems like 
the correct thing to do.
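
To make the behaviour under discussion concrete, here is a rough sketch of the 
bookkeeping as I understand it. This is not the code from the patch: the class 
name, the Status enum, the String ids and the helper methods are all 
hypothetical stand-ins, and only successAttemptCompletionEventNoMap is a name 
taken from the discussion above. It assumes any new completion event for a 
task, whatever its outcome, invalidates the earlier successful one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the AM's completion-event bookkeeping.
public class CompletionEventSketch {

    // Assumed set of event statuses; the real enum may differ.
    public enum Status { SUCCEEDED, FAILED, KILLED, OBSOLETE }

    // One entry per attempt completion, in arrival order.
    private static final class Event {
        final String taskId;
        final String attemptId;
        Status status;

        Event(String taskId, String attemptId, Status status) {
            this.taskId = taskId;
            this.attemptId = attemptId;
            this.status = status;
        }
    }

    private final List<Event> events = new ArrayList<>();

    // Task id -> index of its current successful event. Tracks only
    // successful attempts.
    private final Map<String, Integer> successAttemptCompletionEventNoMap =
        new HashMap<>();

    // Record a completion event for an attempt of the given task.
    public void onAttemptCompletion(String taskId, String attemptId,
                                    Status status) {
        // Assumption: the earlier successful event is marked OBSOLETE without
        // first checking whether the new attempt itself succeeded.
        Integer previous = successAttemptCompletionEventNoMap.remove(taskId);
        if (previous != null) {
            events.get(previous).status = Status.OBSOLETE;
        }
        events.add(new Event(taskId, attemptId, status));
        if (status == Status.SUCCEEDED) {
            successAttemptCompletionEventNoMap.put(taskId, events.size() - 1);
        }
    }

    // Status of the event at the given index in arrival order.
    public Status statusOf(int eventNo) {
        return events.get(eventNo).status;
    }

    // True if the task currently has a successful attempt on record.
    public boolean hasSuccessfulAttempt(String taskId) {
        return successAttemptCompletionEventNoMap.containsKey(taskId);
    }
}
```

With this shape, a KILLED event for a previously successful map (say, because 
its node went unhealthy) leaves the task with no entry in the map until a 
retry succeeds, which is the window the question above is about.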
                
> MR AM should act on the nodes liveliness information when nodes go 
> up/down/unhealthy
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3921
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Bikas Saha
>             Fix For: 0.23.2
>
>         Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
> MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, MAPREDUCE-3921-6.patch, 
> MAPREDUCE-3921-7.patch, MAPREDUCE-3921-branch-0.23.patch, 
> MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
> MAPREDUCE-3921.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
