[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5888:
----------------------------------

    Attachment: MAPREDUCE-5888.patch

Quick patch to fix the issue.  Manually tested it with a failing job: without 
the change the MRAppMaster hung after unregistering, and with the patch it 
does not hang.

> Failed job leaves hung AM after it unregisters 
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5888
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.2.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: MAPREDUCE-5888.patch
>
>
> When a job fails, the AM hangs during shutdown.  A non-daemon thread pool 
> executor thread prevents the JVM teardown from completing, and the AM lingers 
> on the cluster in the FINISHING state for the AM expiry interval, until the 
> RM eventually expires it and kills the container.  If application limits on 
> the queue are relatively low (e.g., a small queue or small cluster), this can 
> cause unnecessary delays in resource scheduling on the cluster.
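
For illustration only (this is not taken from the attached patch, and the 
class and method names are hypothetical): a minimal Java sketch of the failure 
mode described above.  A pool built with the default thread factory gets 
non-daemon workers, which keep the JVM alive until the pool is shut down; a 
pool whose factory marks workers as daemon threads cannot block JVM exit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class DaemonPoolSketch {

    // Hypothetical helper: wraps the default factory and marks each worker
    // as a daemon thread, so an idle worker cannot hold the JVM open.
    static ThreadFactory daemonFactory(String prefix) {
        ThreadFactory base = Executors.defaultThreadFactory();
        return runnable -> {
            Thread t = base.newThread(runnable);
            t.setName(prefix + "-" + t.getName());
            t.setDaemon(true); // must be set before the thread is started
            return t;
        };
    }

    // Runs one probe task on the pool, reports whether the worker thread
    // was a daemon, and always shuts the pool down afterwards.
    static boolean workerIsDaemon(ExecutorService pool) throws Exception {
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        try {
            return pool.submit(probe).get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Default factory: workers are non-daemon threads.
        ExecutorService plain = Executors.newFixedThreadPool(1);
        System.out.println("default factory daemon? " + workerIsDaemon(plain));

        // Daemon factory: workers do not prevent JVM teardown.
        ExecutorService daemon =
                Executors.newFixedThreadPool(1, daemonFactory("sketch"));
        System.out.println("daemon factory daemon?  " + workerIsDaemon(daemon));
    }
}
```

Either approach (daemon workers, or an explicit shutdown of the executor on 
every exit path, including job failure) would avoid the lingering thread in 
this sketch; which one the patch uses is not stated in this message.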



--
This message was sent by Atlassian JIRA
(v6.2#6252)
