[ 
https://issues.apache.org/jira/browse/HADOOP-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646787#action_12646787
 ] 

Vivek Ratan commented on HADOOP-4624:
-------------------------------------

The code does the right thing. It first looks for running maps that are not 
local to any node in the cluster, then for the others. It assumes, however, that a job may have some 
non-local maps (i.e., the JobInProgress object's _nonLocalRunningMaps_ 
structure is not empty), as well as other running maps (in the 
_runningMapCache_ structure). Amar informs me that these two are mutually 
exclusive, i.e., a job will have one or the other structure empty. So, the 
right thing to do is modify the comment in 
CapacityTaskScheduler.killTasksFromJob() to reflect this, and wrap the calls to 
_job.getNonLocalRunningMaps()_ and _job.getRunningMapCache()_ in an 
if...then...else block. 
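
A minimal sketch of the if...then...else wrapping proposed above, assuming the 
two structures really are mutually exclusive. JobSketch, KillOrderSketch and 
selectMapsToKill() are hypothetical stand-ins, not the actual JobInProgress or 
CapacityTaskScheduler code, and plain String task IDs and node names replace 
the real Hadoop types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for JobInProgress. Only the two accessors named in
// the comment are modelled; task IDs and node names are plain Strings.
class JobSketch {
    private final Set<String> nonLocalRunningMaps = new LinkedHashSet<>();
    private final Map<String, Set<String>> runningMapCache = new LinkedHashMap<>();

    Set<String> getNonLocalRunningMaps() { return nonLocalRunningMaps; }
    Map<String, Set<String>> getRunningMapCache() { return runningMapCache; }
}

class KillOrderSketch {
    // Picks up to 'limit' running maps to kill. Assumes (per the comment
    // above) that at most one of the two structures is non-empty, so only
    // one of them is consulted.
    static List<String> selectMapsToKill(JobSketch job, int limit) {
        List<String> victims = new ArrayList<>();
        if (!job.getNonLocalRunningMaps().isEmpty()) {
            // The job tracks its running maps in the non-local structure.
            for (String tip : job.getNonLocalRunningMaps()) {
                if (victims.size() >= limit) break;
                victims.add(tip);
            }
        } else {
            // Otherwise the running maps live in the per-node cache.
            for (Set<String> tips : job.getRunningMapCache().values()) {
                for (String tip : tips) {
                    if (victims.size() >= limit) return victims;
                    // Skip duplicates in case a task is listed under
                    // more than one node.
                    if (!victims.contains(tip)) victims.add(tip);
                }
            }
        }
        return victims;
    }

    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        Set<String> node1 = new LinkedHashSet<>();
        node1.add("m1");
        node1.add("m2");
        Set<String> node2 = new LinkedHashSet<>();
        node2.add("m2");
        node2.add("m3");
        job.getRunningMapCache().put("node1", node1);
        job.getRunningMapCache().put("node2", node2);
        System.out.println(selectMapsToKill(job, 5)); // prints [m1, m2, m3]
    }
}
```

With a single branch per job, the comment in killTasksFromJob() can state 
plainly that only one of the two structures is consulted for any given job.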

> CapacityTaskScheduler.MapSchedulingMgr.killTasksFromJob() will not work as 
> expected
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4624
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4624
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Amar Kamat
>
> Once capacity-scheduler decides on killing tasks, it selects running-jobs 
> from the queue and issues {{killTasksFromJob()}}. The order in which it kills 
> is as follows:
> - non-local maps
> - local maps
> _Killing non-local maps :_
> The code here uses {{JobInProgress.getNonLocalRunningMaps()}}. HADOOP-2119 
> introduced this to handle cases like _random-writer_. Hence this API will 
> return an empty structure if there are reducers in the job, and the code 
> fails to serve its purpose. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
