[ 
https://issues.apache.org/jira/browse/HADOOP-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674038#action_12674038
 ] 

Matei Zaharia commented on HADOOP-5262:
---------------------------------------

I think the second config variable makes sense. When we describe it in the 
documentation, we can say "either you configure the pool to a percent of total 
cluster capacity, or specify an exact number of maps and reduces". Certainly 
nobody pays for 10% of reduces but 20% of maps.
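
To make the "percent or exact number" choice concrete, here is a minimal sketch of how a min share value such as "5%" or "42" might be parsed and then resolved against the current cluster capacity. The class MinShare, its methods, and the decision to round down are assumptions for illustration only; none of this is existing fair scheduler code.

```java
/** Hypothetical holder for a pool's min share; not part of the fair scheduler. */
final class MinShare {
    private final boolean percent; // true if the value was written as "N%"
    private final double value;    // a percentage (0-100) or an absolute slot count

    private MinShare(boolean percent, double value) {
        this.percent = percent;
        this.value = value;
    }

    /** Parses "5%" as a percentage of the cluster and "42" as an absolute slot count. */
    static MinShare parse(String text) {
        String s = text.trim();
        if (s.endsWith("%")) {
            double pct = Double.parseDouble(s.substring(0, s.length() - 1).trim());
            if (pct < 0 || pct > 100) {
                throw new IllegalArgumentException("Percentage out of range: " + text);
            }
            return new MinShare(true, pct);
        }
        int slots = Integer.parseInt(s);
        if (slots < 0) {
            throw new IllegalArgumentException("Negative slot count: " + text);
        }
        return new MinShare(false, slots);
    }

    /** Recomputes the share in slots; intended to be called whenever capacity changes. */
    int resolve(int totalSlots) {
        if (!percent) {
            return (int) value;
        }
        // Assumption: round down so percentages never promise more than the cluster has.
        return (int) Math.floor(totalSlots * value / 100.0);
    }
}
```

With this sketch, a pool configured at 5% gets 42 slots on an 840-slot cluster and 40 slots if the cluster shrinks to 800, while a pool configured as 42 keeps that number regardless of cluster size.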

The main difficulty I see with this and fair scheduler configuration in general 
is how to provide feedback to the operator. Right now, if the fair scheduler 
detects something wrong in the config file at startup, it throws a 
RuntimeException to prevent the JobTracker from starting, thus bringing the 
problem to the operator's attention. If it detects a problem when reloading a 
config file at runtime, it logs it as an ERROR in the JobTracker's log4j log 
but it keeps the old settings of the config variables so that the cluster can 
continue operating. Because most of the config info is visible on the web UI, 
the idea is that an operator will look at the web UI to see whether their 
change went through. There are several possible enhancements to improve this:
* Rather than making the config reloading something passive, maybe we can 
require the operator to say "reload the config file now" through either a 
command-line invocation or a button on the web UI. In response to this, we can 
display any errors.
* If things don't make sense at runtime, e.g. total min shares add up to more 
than 100% because nodes went away, maybe we can display a warning on the web UI 
or email the operator (?). (By the way, right now, if min shares exceed 100%, 
the scheduler doesn't crash or behave weirdly; it just treats them as weights 
and normalizes them to 100%; so it's not the end of the world, but the operator 
should know about it.)
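
The behavior described in the parenthetical above (min shares over 100% being treated as weights and normalized) can be sketched roughly as follows. This only illustrates the arithmetic, it is not the scheduler's actual code: the class MinShareNormalizer, the method normalize, and the representation of shares as a map of pool name to percentage are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; illustrates the normalization, not the real scheduler code. */
final class MinShareNormalizer {
    /**
     * Returns a copy of the min shares (in percent). If they sum to more than
     * 100, each share is treated as a weight and scaled so the total is 100;
     * otherwise the shares are returned unchanged.
     */
    static Map<String, Double> normalize(Map<String, Double> minSharePercents) {
        double total = 0.0;
        for (double p : minSharePercents.values()) {
            total += p;
        }
        Map<String, Double> result = new LinkedHashMap<>(minSharePercents);
        if (total <= 100.0) {
            return result; // nothing is oversubscribed, keep the shares as configured
        }
        for (Map.Entry<String, Double> e : result.entrySet()) {
            e.setValue(e.getValue() * 100.0 / total);
        }
        return result;
    }
}
```

For example, two pools configured at 60% and 90% (150% in total) would end up with 40% and 60%. The same check on the total is a natural place to raise the warning for the operator.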

> Allow specifying min shares as percentage of cluster
> ----------------------------------------------------
>
>                 Key: HADOOP-5262
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5262
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Priority: Minor
>
> Currently the guaranteed shares for pools in the fair scheduler are specified 
> as a number of slots. For organizations where a group pays X% of the cluster 
> and the actual number of nodes in the cluster varies due to failures, 
> expansion, etc over time, it would be useful to support a guaranteed share 
> given as a percentage too. This would just let you write in the config file 
> something like <minMaps>5%</minMaps> instead of <minMaps>42</minMaps>. The 
> scheduler would need to recompute what this means in terms of number of slots 
> on every update (probably through some kind of update(ClusterStatus) method 
> in PoolManager).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
