[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860147#action_12860147
 ] 

Hemanth Yamijala commented on MAPREDUCE-1723:
---------------------------------------------

Hmm. In HADOOP-3445 (God, I am surprised I still remember the number, *smile*), 
which introduced the capacity scheduler, Vivek had argued for separate 
percentages for map and reduce capacities. At the time, though, consensus drove 
towards having a single number. I think a big factor driving that decision was 
the absence of limits and the presence of pre-emption. At that time, queues 
could not impose limits, so spare capacity could always be used elsewhere, and 
pre-emption was meant to ensure that queues could get their 'guaranteed' 
capacity when required.

With time, limits have come in and pre-emption has gone out, and now this 
valid use case has come up. To me it seems like there are two ways to 
approach this problem. One is to do the enhancement proposed in the JIRA. Two 
is to re-introduce pre-emption. Clearly the first option is simple and easy to 
understand; I can think of ways we can keep the spec and implementation simple 
for the default case and still support this special requirement. The only thing 
bothering me is that it seems to be handling a specific type of cluster setup 
(i.e. the kind of queue and job profile that is described). The second option 
is clearly quite complicated. But we've had repeated cases from people asking 
for pre-emption in the scheduler, and I think it is a topic that's going to die 
only when it gets implemented. *smile*.
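
To make the first option concrete, here is a rough sketch in Python. It is not 
scheduler code: the Queue class, the reduce_allocation helper, the queue names, 
the 100-slot cluster and every percentage other than the 50% quota from the 
JIRA are made-up assumptions, and the capacity is treated as a hard limit as in 
the reported setup.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    map_pct: int        # share of map slots, in percent (hypothetical field)
    reduce_pct: int     # share of reduce slots, in percent (hypothetical field)
    reduce_demand: int  # reduce tasks the queue would like to run


def reduce_allocation(queues, reduce_slots):
    """Grant each queue min(demand, hard limit); return (grants, idle slots)."""
    grants = {}
    for q in queues:
        limit = reduce_slots * q.reduce_pct // 100
        grants[q.name] = min(q.reduce_demand, limit)
    idle = reduce_slots - sum(grants.values())
    return grants, idle


REDUCE_SLOTS = 100  # assumed cluster size

# Today: one percentage per queue, applied to map and reduce slots alike.
single = [
    Queue("big", 50, 50, reduce_demand=0),     # mostly map-only jobs
    Queue("small_a", 25, 25, reduce_demand=40),
    Queue("small_b", 25, 25, reduce_demand=40),
]

# Proposed: the reduce percentage is set independently of the map percentage.
split = [
    Queue("big", 50, 10, reduce_demand=0),
    Queue("small_a", 25, 45, reduce_demand=40),
    Queue("small_b", 25, 45, reduce_demand=40),
]

print(reduce_allocation(single, REDUCE_SLOTS))
print(reduce_allocation(split, REDUCE_SLOTS))
```

With the single percentage, half of the reduce slots sit idle while both small 
queues are capped below their demand; with independent percentages the same 
demand fits without touching the big queue's map share.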

As a side note while we are still discussing this, Subramaniam, what is the 
proportion of map and reduce slots in your cluster? Are they the same?

> Capacity Scheduler should allow configuration of Map & Reduce task slots 
> independently per queue
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1723
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1723
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>    Affects Versions: 0.20.1
>         Environment: all
>            Reporter: Subramaniam Krishnan
>             Fix For: 0.20.3
>
>
> The Capacity Scheduler allows configuration of percentage of task slots per 
> queue. We have a scenario in which our biggest queue (50% quota) has Jobs 
> with mainly Map tasks & we need to enforce strict capacity limits per queue 
> due to SLA requirements. So other smaller queues which require Reduce tasks 
> get starved even though the Reduce slots are idle. The Grid can be more 
> efficiently utilized if Capacity Scheduler allows configuration of Map & 
> Reduce task slots capacity independently per queue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
