[
https://issues.apache.org/jira/browse/HADOOP-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674038#action_12674038
]
Matei Zaharia commented on HADOOP-5262:
---------------------------------------
I think the second config variable makes sense. When we describe it in the
documentation, we can say "either you configure the pool to a percent of total
cluster capacity, or specify an exact number of maps and reduces". Certainly
nobody pays for 10% of reduces but 20% of maps.
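To make the "percent or exact number" rule concrete, here is a minimal sketch of how a single min-share value could be parsed and turned into slots. The class MinShareSpec and its methods are hypothetical names for illustration, not existing fair scheduler code, and rounding percentages down is an assumption. The "5%" versus "42" syntax follows the issue description.

```java
// Hypothetical helper for illustration; not existing fair scheduler code.
final class MinShareSpec {
    private final boolean isPercent;
    private final double value;

    private MinShareSpec(boolean isPercent, double value) {
        this.isPercent = isPercent;
        this.value = value;
    }

    /** Parses "5%" as a percentage of capacity and "42" as an absolute slot count. */
    static MinShareSpec parse(String text) {
        String s = text.trim();
        if (s.endsWith("%")) {
            double pct = Double.parseDouble(s.substring(0, s.length() - 1).trim());
            if (Double.isNaN(pct) || pct < 0.0 || pct > 100.0) {
                throw new IllegalArgumentException("Percentage out of range: " + text);
            }
            return new MinShareSpec(true, pct);
        }
        int slots = Integer.parseInt(s);
        if (slots < 0) {
            throw new IllegalArgumentException("Negative slot count: " + text);
        }
        return new MinShareSpec(false, slots);
    }

    /** Slots guaranteed for the given total; percentages round down (an assumption). */
    int resolve(int totalSlots) {
        if (!isPercent) {
            return (int) value;
        }
        return (int) Math.floor(totalSlots * value / 100.0);
    }
}
```

With one percentage per pool, the same spec would be resolved once against the total map slots and once against the total reduce slots, so the two guarantees cannot drift apart.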
The main difficulty I see with this and fair scheduler configuration in general
is how to provide feedback to the operator. Right now, if the fair scheduler
detects something wrong in the config file at startup, it throws a
RuntimeException to prevent the JobTracker from starting, thus bringing the
problem to the operator's attention. If it detects a problem when reloading a
config file at runtime, it logs it as an ERROR in the JobTracker's log4j log
but it keeps the old settings of the config variables so that the cluster can
continue operating. Because most of the config info is visible on the web UI,
the idea is that an operator will look at the web UI to see whether their
change went through. There are several possible enhancements to improve this:
* Rather than making the config reloading something passive, maybe we can
require the operator to say "reload the config file now" through either a
command-line invocation or a button on the web UI. In response to this, we can
display any errors.
* If things don't make sense at runtime, e.g. total min shares add up to more
than 100% because nodes went away, maybe we can display a warning on the web UI
or email the operator (?). (By the way, right now, if min shares exceed 100%,
the scheduler doesn't crash or behave weirdly; it just treats them as weights
and normalizes them to 100%; so it's not the end of the world, but the operator
should know about it.)
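The "treat them as weights" behavior for oversubscribed min shares can be sketched roughly as follows. This is an illustration under assumptions, not the scheduler's actual code: MinShareNormalizer and its methods are hypothetical names, and shares are expressed as percentages of cluster capacity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the fair scheduler's actual code.
final class MinShareNormalizer {
    /** Sum of all pools' min shares, in percent of cluster capacity. */
    static double total(Map<String, Double> minSharePercent) {
        double sum = 0.0;
        for (double v : minSharePercent.values()) {
            sum += v;
        }
        return sum;
    }

    /** True when the min shares add up to more than the whole cluster. */
    static boolean isOversubscribed(Map<String, Double> minSharePercent) {
        return total(minSharePercent) > 100.0;
    }

    /**
     * If the shares exceed 100%, scales them proportionally so they sum to
     * 100%; otherwise returns them unchanged.
     */
    static Map<String, Double> normalize(Map<String, Double> minSharePercent) {
        double sum = total(minSharePercent);
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : minSharePercent.entrySet()) {
            double v = e.getValue();
            result.put(e.getKey(), sum > 100.0 ? v * 100.0 / sum : v);
        }
        return result;
    }
}
```

A check like isOversubscribed is the natural place to hook in the web UI warning suggested above, since scheduling itself keeps working on the normalized values.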
> Allow specifying min shares as percentage of cluster
> ----------------------------------------------------
>
> Key: HADOOP-5262
> URL: https://issues.apache.org/jira/browse/HADOOP-5262
> Project: Hadoop Core
> Issue Type: New Feature
> Components: contrib/fair-share
> Reporter: Matei Zaharia
> Priority: Minor
>
> Currently the guaranteed shares for pools in the fair scheduler are specified
> as a number of slots. For organizations where a group pays X% of the cluster
> and the actual number of nodes in the cluster varies due to failures,
> expansion, etc. over time, it would be useful to support a guaranteed share
> given as a percentage too. This would just let you write in the config file
> something like <minMaps>5%</minMaps> instead of <minMaps>42</minMaps>. The
> scheduler would need to recompute what this means in terms of number of slots
> on every update (probably through some kind of update(ClusterStatus) method
> in PoolManager).