[
https://issues.apache.org/jira/browse/HADOOP-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662901#action_12662901
]
Owen O'Malley commented on HADOOP-5003:
---------------------------------------
With rounding instead of floor, the issue is the same, just with 11 slots
instead of 10.
Let's say there are 11 slots and three queues at 33% each, and all 3 queues
get jobs at the same moment. One of the queues gets 4 slots and the other two
get 3. Since they all have a guaranteed capacity of 4, the timers start. When
the timers go off, one of the tasks from queue A is killed and its slot is
given to B. A few seconds later, the task from queue B is killed and the slot
is given to queue C.
-1
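
To make the arithmetic in this comment concrete, here is a minimal sketch in
Java. It is not the Capacity Scheduler's code: the class and method names are
hypothetical, and it only assumes that each queue's GC is computed as
Math.round(percent * slots / 100). Under that assumption, three queues at 33%
on 11 slots each round up to a GC of 4, so the guarantees add up to 12, one
more than the cluster has.

```java
// Illustrative sketch only; names are hypothetical, not Hadoop APIs.
final class OvercommitSketch {

    // Assumed rounding rule: nearest integer, as Math.round(float) does.
    static int roundedGc(float percent, int slots) {
        return Math.round(percent * slots / 100.0f);
    }

    // Sum of the rounded guarantees across all queues.
    static int totalRoundedGc(float[] percents, int slots) {
        int total = 0;
        for (float p : percents) {
            total += roundedGc(p, slots);
        }
        return total;
    }

    public static void main(String[] args) {
        float[] percents = {33.0f, 33.0f, 33.0f};
        int slots = 11;
        System.out.println("per-queue GC: " + roundedGc(33.0f, slots));
        System.out.println("sum of GCs:   " + totalRoundedGc(percents, slots)
                + " (cluster has " + slots + " slots)");
    }
}
```

Whenever the sum of the rounded guarantees exceeds the slot count, at least one
queue must sit below its guarantee, which is the condition that keeps the
timers firing in the scenario above.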
> When computing absolute guaranteed capacity (GC) from a percent value,
> Capacity Scheduler should round up floats, rather than truncate them.
> --------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-5003
> URL: https://issues.apache.org/jira/browse/HADOOP-5003
> Project: Hadoop Core
> Issue Type: Bug
> Reporter: Vivek Ratan
> Priority: Minor
>
> The Capacity Scheduler calculates a queue's absolute GC value by getting its
> percent of the total cluster capacity (which is a float, since the configured
> GC% is a float) and casting it to an int. Casting a positive float to an int
> truncates it, which always rounds down. For very small clusters, this can
> result in the GC of a queue being one lower than what it should be. For
> example, if Q1 has a GC of 50%, Q2 has a GC of 40%, and Q3 has a GC of 10%,
> and if the cluster capacity is 4 (as we have in our test cases), Q1's GC
> works out to 2, Q2's to 1, and Q3's to 0 with today's code. Q2's capacity
> should really be 2, as 40% of 4 is 1.6, which should round up to 2.
> A simple fix is to use Math.round() rather than a cast to an int.
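
The difference described in the issue can be reproduced with a short Java
sketch. The class and method names are hypothetical and this is not the
scheduler's actual code; it only assumes the GC is computed as
percent * capacity / 100 and then either cast to an int or passed through
Math.round().

```java
// Illustrative sketch only; names are hypothetical, not Hadoop APIs.
final class GcRoundingSketch {

    // Cast to int: truncates the fractional part.
    static int truncatedGc(float percent, int clusterCapacity) {
        return (int) (percent * clusterCapacity / 100.0f);
    }

    // Math.round(float): rounds to the nearest int.
    static int roundedGc(float percent, int clusterCapacity) {
        return Math.round(percent * clusterCapacity / 100.0f);
    }

    public static void main(String[] args) {
        float[] percents = {50.0f, 40.0f, 10.0f}; // Q1, Q2, Q3 from the issue
        int clusterCapacity = 4;
        for (float p : percents) {
            System.out.println(p + "% -> cast: " + truncatedGc(p, clusterCapacity)
                    + ", Math.round: " + roundedGc(p, clusterCapacity));
        }
    }
}
```

For the 50/40/10 split on a capacity of 4, the cast yields 2, 1, and 0, while
Math.round() yields 2, 2, and 0. Note that Math.round() rounds to the nearest
integer rather than always up, so Q3's 0.4 still becomes 0.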