[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364487#comment-16364487
 ] 

Jason Lowe commented on MAPREDUCE-7053:
---------------------------------------

The easiest "fix" for this issue is to have the AM ignore unknown tasks, as it 
did before. The drawback is that if a task somehow "escapes," it could linger 
on the cluster far longer than it should.
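
To make the trade-off concrete, here is a minimal sketch of the window between 
removing a task from the known list and the thread dump signal arriving. 
Everything in it is hypothetical: the class, the method names, and the Reply 
values are illustrative and not taken from the MapReduce code. It also assumes 
that the AM's reply to an unknown task is what causes the task to exit.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the AM's task bookkeeping; none of these names
// come from the MapReduce code base.
class HeartbeatRaceSketch {
    public enum Reply { CONTINUE, DIE, IGNORED }

    private final Set<String> knownTasks = ConcurrentHashMap.newKeySet();
    private final boolean ignoreUnknown;

    public HeartbeatRaceSketch(boolean ignoreUnknown) {
        this.ignoreUnknown = ignoreUnknown;
    }

    public void register(String taskId) {
        knownTasks.add(taskId);
    }

    // Timeout path, first step: forget the task. The thread dump and kill
    // requests to the NM are sent afterwards and are not modeled here.
    public void timeOut(String taskId) {
        knownTasks.remove(taskId);
    }

    // Heartbeat from a task. Assumption: an unknown task is told to exit
    // unless the AM is set up to ignore unknown tasks.
    public Reply heartbeat(String taskId) {
        if (knownTasks.contains(taskId)) {
            return Reply.CONTINUE;
        }
        return ignoreUnknown ? Reply.IGNORED : Reply.DIE;
    }

    public static void main(String[] args) {
        for (boolean ignore : new boolean[] {false, true}) {
            HeartbeatRaceSketch am = new HeartbeatRaceSketch(ignore);
            am.register("task_1");
            am.timeOut("task_1");
            // The heartbeat lands before the thread dump signal would arrive.
            System.out.println("ignoreUnknown=" + ignore + " -> " + am.heartbeat("task_1"));
        }
    }
}
```

With ignoreUnknown set to false, the late heartbeat is answered with DIE, which 
models the task exiting before it can produce a thread dump. With it set to 
true the timed-out task is left alone, but so is any task the AM never knew 
about, which is the "escape" case mentioned above.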

> Timed out tasks can fail to produce thread dump
> -----------------------------------------------
>
>                 Key: MAPREDUCE-7053
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>            Reporter: Jason Lowe
>            Priority: Major
>
> TestMRJobs#testThreadDumpOnTaskTimeout has been failing sporadically 
> recently.  When the AM times out a task, it immediately removes the task 
> from the list of known tasks and then connects to the NM to request a 
> thread dump followed by a kill.  If the task heartbeats in after it has 
> been removed from the list of known tasks but before the thread dump 
> signal arrives, the task can exit with an 
> "org.apache.hadoop.mapred.Task: Parent died." message and no thread dump.


