[ https://issues.apache.org/jira/browse/HADOOP-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666049#action_12666049 ]

Vivek Ratan commented on HADOOP-5003:
-------------------------------------

I really think we shouldn't tie the reclaim time limit to the time to detect 
TT failures. You can reclaim capacity when everything is running fine, but you 
can also reclaim capacity even when some TTs are failing. As long as some queue 
is running over capacity, you can reclaim it independently of TT failures. I 
can't see any fundamental connection between the two times. For large clusters, 
you probably do not want low timeouts for detecting TT failures, as there can be 
a relatively large number of transient failures. For large clusters with many 
queues, you also likely want smallish reclaim time limits, as there will be lots 
of spare capacity being moved around between the queues. Like I said earlier, 
setting reclaim time limits to larger than 10 minutes is not going to fly well 
with users, IMO. 

You reclaim capacity as best you can, in the presence or absence of TT 
failures, and I think that's in the spirit of the SLA requirement. 

bq. Whether or not 10 minutes is high time depends on the profile of the jobs - 
for small jobs this might be high, but for larger jobs (which take 
significantly more time to run), a difference of a few minutes doesn't seem 
like it would make a lot of difference.
I'm not sure I follow. I assume you're referring to the reclaim time limit here. 
Your queue will likely get a mixture of jobs. Plus, we're talking about 
reclaiming individual slots to run individual tasks. If I submit a job, large 
or small, long-running or not, I expect to start getting slots for my tasks 
within N minutes. I'm not sure why an acceptable value of N depends on what 
kind of job I submit. 

> When computing absolute guaranteed capacity (GC) from a percent value, 
> Capacity Scheduler should round up floats, rather than truncate them.
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5003
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5003
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>            Priority: Minor
>
> The Capacity Scheduler calculates a queue's absolute GC value by taking its 
> percent of the total cluster capacity (which is a float, since the configured 
> GC% is a float) and casting it to an int. Casting a positive float to an int 
> truncates it, i.e. rounds down. For very small clusters, this can leave a 
> queue's GC one lower than it should be. For example, if Q1 has a GC of 50%, 
> Q2 has a GC of 40%, and Q3 has a GC of 10%, and the cluster capacity is 4 
> (as we have in our test cases), Q1's GC works out to 2, Q2's to 1, and Q3's 
> to 0 with today's code. Q2's GC should really be 2, since 40% of 4 is 1.6, 
> which should round to 2. 
> A simple fix is to use Math.round() rather than casting to an int. 
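
As a rough illustration of the arithmetic described above (this is a standalone sketch, not the actual CapacityScheduler code; the class and method names are made up for the example):

{code}
// Minimal sketch of the percent-to-capacity computation on a 4-slot cluster.
// Names are illustrative only; the real scheduler code differs.
public class GcRoundingSketch {
    static void show(float gcPercent, int clusterCapacity) {
        float raw = gcPercent * clusterCapacity / 100.0f;
        int truncated = (int) raw;       // current behaviour: 1.6 -> 1
        int rounded = Math.round(raw);   // proposed fix:      1.6 -> 2
        System.out.println(gcPercent + "% of " + clusterCapacity
                + " slots: cast=" + truncated + ", round=" + rounded);
    }

    public static void main(String[] args) {
        show(50f, 4);  // cast=2, round=2
        show(40f, 4);  // cast=1, round=2
        show(10f, 4);  // cast=0, round=0
    }
}
{code}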
