[jira] [Commented] (MAPREDUCE-5841) uber job doesn't terminate on getting mapred job kill

Sangjin Lee (JIRA) Wed, 16 Apr 2014 20:29:15 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972267#comment-13972267 ]


Sangjin Lee commented on MAPREDUCE-5841:
----------------------------------------

This happens because LocalContainerLauncher.SubtaskRunner both handles the 
events and runs the mapper/reducer tasks on a single thread.

Once a task is under way, the subtask runner cannot process any of the events 
that are delivered to the event queue. As a result, the key event 
(CONTAINER_REMOTE_CLEANUP) sits in the queue until the mapper/reducer task 
finishes.

I think a possible fix is to separate event handling and task running into 
their own threads. On receiving the CONTAINER_REMOTE_CLEANUP event, the event 
handling thread can shut down the task running thread and continue with the 
rest of the state transition.
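
To make the idea concrete, here is a rough standalone sketch of the 
two-thread arrangement. It is not the actual LocalContainerLauncher code and 
not the proposed patch: the class UberKillSketch, its fields, and its methods 
are hypothetical names invented for illustration, and a Thread.sleep() stands 
in for the mapper/reducer task. Only the event name CONTAINER_REMOTE_CLEANUP 
comes from the description above; CONTAINER_REMOTE_LAUNCH is assumed here as 
the event that starts a task.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only; none of these names are real Hadoop classes.
class UberKillSketch {
    enum EventType { CONTAINER_REMOTE_LAUNCH, CONTAINER_REMOTE_CLEANUP }

    private final BlockingQueue<EventType> eventQueue = new LinkedBlockingQueue<>();
    // Dedicated thread for running the task, separate from event handling.
    private final ExecutorService taskRunner = Executors.newSingleThreadExecutor();
    private final CountDownLatch taskStarted = new CountDownLatch(1);
    private final CountDownLatch taskStopped = new CountDownLatch(1);
    private volatile boolean taskInterrupted = false;
    private Future<?> runningTask; // touched only by the event handling thread

    // Stand-in for a long-running mapper/reducer task (similar to the sleep job).
    private void runTask(long millis) {
        taskStarted.countDown();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            taskInterrupted = true; // the kill reached the task before it finished
        } finally {
            taskStopped.countDown();
        }
    }

    // Event loop: it never runs the task itself, so it can always reach the queue.
    private void handleEvents(long taskMillis) throws InterruptedException {
        while (true) {
            EventType event = eventQueue.take();
            if (event == EventType.CONTAINER_REMOTE_LAUNCH) {
                runningTask = taskRunner.submit(() -> runTask(taskMillis));
            } else if (event == EventType.CONTAINER_REMOTE_CLEANUP) {
                if (runningTask != null) {
                    runningTask.cancel(true); // interrupts the task running thread
                }
                taskRunner.shutdownNow();
                return; // the rest of the state transition would continue here
            }
        }
    }

    // Launches a task, sends the cleanup event, and reports whether the task
    // was interrupted instead of running to completion.
    static boolean launchThenKill(long taskMillis) throws InterruptedException {
        UberKillSketch sketch = new UberKillSketch();
        Thread eventHandler = new Thread(() -> {
            try {
                sketch.handleEvents(taskMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "event-handler");
        eventHandler.start();
        sketch.eventQueue.put(EventType.CONTAINER_REMOTE_LAUNCH);
        sketch.taskStarted.await(); // wait until the task is under way
        sketch.eventQueue.put(EventType.CONTAINER_REMOTE_CLEANUP);
        eventHandler.join();
        sketch.taskStopped.await();
        return sketch.taskInterrupted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("task interrupted: " + launchThenKill(300_000));
    }
}
```

In this sketch the cleanup event is picked up right away because the event 
loop is never blocked by the task. Note that Future.cancel(true) only 
interrupts the thread; a real task that ignores interrupts would need some 
other way to be stopped.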

I'll come up with a proposed patch soon.

> uber job doesn't terminate on getting mapred job kill
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-5841
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5841
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.3.0
>            Reporter: Sangjin Lee
>
> If you issue a "mapred job -kill" against an uberized job, the job (and the 
> yarn application) state transitions to KILLED, but the application master 
> process continues to run. The job actually runs to completion.
> This can be easily reproduced by running a sleep job:
> {noformat}
> hadoop jar hadoop-mapreduce-client-jobclient-2.3.0-tests.jar sleep -m 1 -r 0 -mt 300000
> {noformat}
> Issue a kill with "mapred job -kill [job-id]". The UI will show the job 
> (app) is in the KILLED state. However, you can see the application master is 
> still running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
