[jira] [Commented] (MAPREDUCE-5841) uber job doesn't terminate on getting mapred job kill

Jason Lowe (JIRA) Wed, 23 Apr 2014 11:29:18 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978574#comment-13978574 ]


Jason Lowe commented on MAPREDUCE-5841:
---------------------------------------

bq. Since the uber job executes task attempts serially, we can have a situation 
where a mapper attempt is killed but the new mapper attempt will be queued 
behind an existing reducer attempt. In that case, the job will not be able to 
make progress if the reducer needs the killed mapper to finish. I think this 
happens already before these changes, and what we do here is likely not going 
to change that.

Agreed. I think fixing that is outside the scope of this JIRA. It sounds
like an uber job shouldn't queue up any reduce tasks until all the map tasks
have completed, which would avoid this.
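
As a rough illustration of that ordering idea only (not the patch on this JIRA, and not actual MR AM code), here is a minimal sketch of a serial queue that always hands out pending map attempts before any reduce attempt. The class name, method names and attempt IDs are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these names come from the Hadoop code base.
class SerialUberQueueSketch {
    private final Deque<String> pendingMaps = new ArrayDeque<>();
    private final Deque<String> pendingReduces = new ArrayDeque<>();

    void submitMap(String attemptId) { pendingMaps.addLast(attemptId); }

    void submitReduce(String attemptId) { pendingReduces.addLast(attemptId); }

    // Next attempt to run serially: maps always go first, reduces only
    // when no map attempt is pending. Returns null when nothing is left.
    String next() {
        if (!pendingMaps.isEmpty()) {
            return pendingMaps.pollFirst();
        }
        return pendingReduces.pollFirst();
    }

    // Demo: the first map attempt is killed and a retry is submitted after
    // the reduce attempt was already queued. Returns the execution order.
    static List<String> demoOrder() {
        SerialUberQueueSketch queue = new SerialUberQueueSketch();
        queue.submitMap("m_0_attempt_0");
        queue.submitReduce("r_0_attempt_0");

        List<String> order = new ArrayList<>();
        order.add(queue.next());            // first map attempt starts
        queue.submitMap("m_0_attempt_1");   // it is killed, retry is queued

        String attempt;
        while ((attempt = queue.next()) != null) {
            order.add(attempt);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demoOrder());
    }
}
```

In this sketch the retried map attempt runs ahead of the reduce attempt even though it was submitted later, so the reducer never waits on a mapper that is stuck behind it.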

bq. Also, another situation is if mapper/reducer tasks do not respond to 
interrupt (i.e. uninterruptible). If a task does not respond to interrupt, 
-kill-task won't necessarily work.

Also agreed. We can't do much if we can't control the task thread.
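
To make the interrupt limitation concrete, here is a small standalone sketch using plain java.lang.Thread (no Hadoop classes). It assumes a kill is delivered by interrupting the task thread, as discussed above. The class and method names are hypothetical. One task exits when interrupted; the other swallows the interrupt and keeps running until told to stop by other means.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: plain Java threads standing in for task threads.
class InterruptSketch {
    // Cooperative task: Thread.sleep throws InterruptedException when the
    // thread is interrupted, and the task exits in response.
    static Thread cooperative() {
        return new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Treat the interrupt as a request to stop.
            }
        });
    }

    // Uninterruptible task: catches the interrupt and carries on, so
    // interrupt() alone never stops it. Only the stop flag ends the loop.
    static Thread stubborn(AtomicBoolean stop) {
        return new Thread(() -> {
            while (!stop.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException ignored) {
                    // Swallow the interrupt and keep going.
                }
            }
        });
    }

    // Starts the thread, interrupts it, waits up to waitMillis, and
    // reports whether the thread has terminated.
    static boolean stopsAfterInterrupt(Thread t, long waitMillis)
            throws InterruptedException {
        t.setDaemon(true); // do not keep the JVM alive for the demo
        t.start();
        t.interrupt();
        t.join(waitMillis);
        return !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean stop = new AtomicBoolean(false);
        System.out.println("cooperative stopped: "
                + stopsAfterInterrupt(cooperative(), 2_000));
        System.out.println("stubborn stopped: "
                + stopsAfterInterrupt(stubborn(stop), 200));
        stop.set(true); // release the stubborn thread
    }
}
```

Thread.interrupt() only requests a stop; whether the thread actually stops depends on the code running in it, which is why a task that ignores interrupts cannot be killed this way.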

Thanks for updating the patch. I'll try to look at it later today. I'm
assuming the TestRMContainerAllocator failure is unrelated, but it'd be good if
you could verify.

> uber job doesn't terminate on getting mapred job kill
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-5841
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5841
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.3.0
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: mapreduce-5841.patch, mapreduce-5841.patch
>
>
> If you issue a "mapred job -kill" against an uberized job, the job (and the 
> yarn application) state transitions to KILLED, but the application master 
> process continues to run. The job actually runs to completion despite the 
> killed status.
> This can be easily reproduced by running a sleep job:
> {noformat}
> hadoop jar hadoop-mapreduce-client-jobclient-2.3.0-tests.jar sleep -m 1 -r 0 
> -mt 300000
> {noformat}
> Issue a kill with "mapred job -kill \[job-id\]". The UI will show the job 
> (app) is in the KILLED state. However, you can see the application master is 
> still running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
