Bilwa S T created MAPREDUCE-7314:
------------------------------------

             Summary: Job will hang if NM is restarted while it is running
                 Key: MAPREDUCE-7314
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7314
             Project: Hadoop Map/Reduce
          Issue Type: Sub-task
            Reporter: Bilwa S T
            Assignee: Bilwa S T


The hang has three causes:
 # Containers with PRIORITY_FAST_FAIL_MAP priority should be considered for 
reuse.
 # When CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it does not kill 
the current attempt assigned to the container, because the task attempt is not 
updated in the ContainerLauncherImpl#Container class.
 # A container is assigned to a task attempt even after it has stopped running, 
i.e. after its container-completed event has been processed. This happens 
because the containers in the reuse map are added to the allocated list. 
makeRemoteRequest then gets the container in the allocation response even 
though the RM has sent the same container in the finished-container list. To 
avoid this, we need to make sure the allocated list does not contain any 
containers that have finished.
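
The check described in the third cause could look roughly like the sketch 
below. It is an illustration only, not the actual patch: container IDs are 
modeled as plain strings, and FinishedContainerFilter and dropFinished are 
hypothetical names that do not exist in Hadoop.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop. Container IDs are modeled as
// plain strings purely for illustration.
class FinishedContainerFilter {

    // Returns the allocated containers that were not also reported as
    // finished, preserving the original order of the allocated list.
    static List<String> dropFinished(List<String> allocated, List<String> finished) {
        Set<String> finishedIds = new HashSet<>(finished);
        List<String> usable = new ArrayList<>();
        for (String id : allocated) {
            if (!finishedIds.contains(id)) {
                usable.add(id);
            }
        }
        return usable;
    }
}
```

With an allocated list of c1, c2, c3 and a finished list containing c2, only 
c1 and c3 remain eligible for assignment to a task attempt.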



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
