[ 
https://issues.apache.org/jira/browse/HADOOP-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665044#action_12665044
 ] 

Owen O'Malley commented on HADOOP-5003:
---------------------------------------

*Sigh* No, I thought it was pretty clear this issue was resolved.

It should never be the case that the sum of the guaranteed capacities is 
higher than the cluster size. To do so is totally counterintuitive. If the 
cluster size changes, of course the guaranteed capacities change too. At any 
point in time, the JobTracker should have queues set so that there are enough 
slots to satisfy all of the guaranteed capacities. Otherwise, the name should 
be something other than *guaranteed*.
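
The invariant described above can be sketched as a simple check. This is only an illustration, not code from the Capacity Scheduler: the class and method names are hypothetical, and it assumes each queue's guarantee is derived from a percentage of the cluster's slots, either truncated or rounded to the nearest integer.

```java
// Hypothetical sketch; not the Capacity Scheduler's actual code.
class GuaranteeCheck {

    /** Sums the per-queue guaranteed slots derived from percentages of clusterSlots. */
    static int totalGuaranteed(float[] percents, int clusterSlots, boolean roundToNearest) {
        int total = 0;
        for (float pct : percents) {
            float exact = pct * clusterSlots / 100f;
            // Either round to the nearest int or truncate with a cast.
            total += roundToNearest ? Math.round(exact) : (int) exact;
        }
        return total;
    }

    /** The invariant: all guarantees together must fit within the cluster. */
    static boolean guaranteesFit(float[] percents, int clusterSlots, boolean roundToNearest) {
        return totalGuaranteed(percents, clusterSlots, roundToNearest) <= clusterSlots;
    }

    public static void main(String[] args) {
        float[] percents = {30f, 30f, 40f};
        int clusterSlots = 5;
        System.out.println("truncated fits: " + guaranteesFit(percents, clusterSlots, false));
        System.out.println("rounded fits:   " + guaranteesFit(percents, clusterSlots, true));
    }
}
```

With queues at 30%, 30% and 40% on a 5-slot cluster, truncation yields 1 + 1 + 2 = 4 slots, while rounding to nearest yields 2 + 2 + 2 = 6, which is more than the cluster has. Any rounding scheme would need a check of this kind for the guarantees to stay satisfiable.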

Please file a jira to remove the check. The SLA timers should take effect when 
the queue has demand that isn't being satisfied. Clearly, in most cases the 
check should be redundant and therefore irrelevant. (Since you shouldn't have 
an underserved queue for long without an overserved queue.) However, in the 
presence of bugs in the scheduler, it would be better to not prevent the timer 
from starting.

> When computing absolute guaranteed capacity (GC) from a percent value, 
> Capacity Scheduler should round up floats, rather than truncate them.
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5003
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5003
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>            Priority: Minor
>
> The Capacity Scheduler calculates a queue's absolute GC value by getting its 
> percent of the total cluster capacity (which is a float, since the configured 
> GC% is a float) and casting it to an int. Casting a float to an int 
> truncates toward zero, so a non-negative value is always rounded down. For 
> very small clusters, this can result in the GC of a queue 
> being one lower than what it should be. For example, if Q1 has a GC of 50%, 
> Q2 has a GC of 40%, and Q3 has a GC of 10%, and if the cluster capacity is 4 
> (as we have, in our test cases), Q1's GC works out to 2, Q2's to 1, and Q3's 
> to 0 with today's code. Q2's capacity should really be 2, since 40% of 4 is 
> 1.6, which rounds to 2. 
> A simple fix is to use Math.round() rather than casting to an int. 
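
The arithmetic in the description above can be reproduced with a short sketch. The class and method names here are hypothetical and this is not the scheduler's own code; it only contrasts an int cast with Math.round() for the 50% / 40% / 10% example on a cluster of capacity 4.

```java
// Hypothetical sketch; not the Capacity Scheduler's actual code.
class GcRounding {

    /** Absolute GC computed by casting, which truncates toward zero. */
    static int truncatedGc(float gcPercent, int clusterCapacity) {
        return (int) (gcPercent * clusterCapacity / 100f);
    }

    /** Absolute GC computed with Math.round(), which rounds to the nearest int. */
    static int roundedGc(float gcPercent, int clusterCapacity) {
        return Math.round(gcPercent * clusterCapacity / 100f);
    }

    public static void main(String[] args) {
        float[] gcPercents = {50f, 40f, 10f}; // Q1, Q2, Q3
        int clusterCapacity = 4;
        for (int i = 0; i < gcPercents.length; i++) {
            System.out.println("Q" + (i + 1)
                + ": cast=" + truncatedGc(gcPercents[i], clusterCapacity)
                + " round=" + roundedGc(gcPercents[i], clusterCapacity));
        }
    }
}
```

The cast gives 2, 1 and 0 for Q1, Q2 and Q3, while Math.round() gives 2, 2 and 0. Note that Math.round() rounds to the nearest integer rather than always rounding up, so Q3's 0.4 still becomes 0.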

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
