[
https://issues.apache.org/jira/browse/HADOOP-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646787#action_12646787
]
Vivek Ratan commented on HADOOP-4624:
-------------------------------------
The code does the right thing. It first looks for maps that are not local to
any node in the cluster, then for the others. It assumes, however, that a job
may have some non-local maps (i.e., the JobInProgress object's
_nonLocalRunningMaps_ structure is not empty) and other running maps (in the
_runningMapCache_ structure) at the same time. Amar informs me that these two
are mutually exclusive, i.e., a job will always have one or the other
structure empty. So the right thing to do is to update the comment in
CapacityTaskScheduler.killTasksFromJob() to reflect this, and to wrap the
calls to _job.getNonLocalRunningMaps()_ and _job.getRunningMapCache()_ in an
if...else block.
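As a rough illustration of the proposed if...else structure, here is a minimal,
self-contained sketch. It is not the actual CapacityTaskScheduler code: the
_Job_ class is a hypothetical stand-in for JobInProgress, tasks and nodes are
modeled as plain strings, and _selectTasksToKill()_ is an invented helper name.
Only the two accessor names come from the discussion above, and their return
types here are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KillTasksSketch {

    /** Hypothetical stand-in for JobInProgress; models only the two structures. */
    public static class Job {
        private final Set<String> nonLocalRunningMaps = new LinkedHashSet<>();
        private final Map<String, Set<String>> runningMapCache = new LinkedHashMap<>();

        public Set<String> getNonLocalRunningMaps() { return nonLocalRunningMaps; }
        public Map<String, Set<String>> getRunningMapCache() { return runningMapCache; }
    }

    /**
     * Picks up to tasksToKill running maps from the job. The two structures are
     * treated as mutually exclusive, so only one branch is ever consulted.
     */
    public static List<String> selectTasksToKill(Job job, int tasksToKill) {
        List<String> victims = new ArrayList<>();
        if (tasksToKill <= 0) {
            return victims;
        }
        if (!job.getNonLocalRunningMaps().isEmpty()) {
            // The job has only non-local maps (e.g. a random-writer style job).
            for (String task : job.getNonLocalRunningMaps()) {
                if (victims.size() == tasksToKill) {
                    break;
                }
                victims.add(task);
            }
        } else {
            // Otherwise every running map is tracked in the per-node cache.
            for (Set<String> tasksOnNode : job.getRunningMapCache().values()) {
                for (String task : tasksOnNode) {
                    if (victims.size() == tasksToKill) {
                        return victims;
                    }
                    // A task may be cached under several nodes; pick it once.
                    if (!victims.contains(task)) {
                        victims.add(task);
                    }
                }
            }
        }
        return victims;
    }
}
```

The else branch skips tasks it has already picked because, in this sketch, the
same task may appear under more than one node in the cache.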
> CapacityTaskScheduler.MapSchedulingMgr.killTasksFromJob() will not work as
> expected
> -----------------------------------------------------------------------------------
>
> Key: HADOOP-4624
> URL: https://issues.apache.org/jira/browse/HADOOP-4624
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Amar Kamat
>
> Once capacity-scheduler decides on killing tasks, it selects running-jobs
> from the queue and issues {{killTasksFromJob()}}. The order in which it kills
> is as follows:
> - non-local maps
> - local maps
> _Killing non-local maps:_
> The code here uses {{JobInProgress.getNonLocalRunningMaps()}}. HADOOP-2119
> introduced this for handling cases like _random-writer_, so this API will
> return an empty structure if there are reducers in the job. The code
> therefore fails to serve its purpose.