On Thu, 6 Aug 2015, Reuti wrote:
Hi,
On 03.08.2015 at 18:20, Carl G. Riches <[email protected]> wrote:
On Sat, 1 Aug 2015, Reuti wrote:
Hi,
On 31.07.2015 at 23:00, Carl G. Riches wrote:
<snip>
Thanks for these pointers. After going through these and others, I think I
have a basic understanding of the share-tree and functional share policies. I'm
not clear on whether the two can be combined, but I believe they can.
The goal is to limit user access to CPU when there is contention to a queue.
The limit would apply to all users and the limit would be 12% of all CPUs in a
queue.
How did you come up with 12%? It will never add up to exactly 100%. Is the
intent then to have fair-share scheduling over a time frame?
The 12% comes from the users' goal for the fraction of all available CPUs that
a single user can hold when the queue(s) are full. Each user's share should be
measured at the time a job is dispatched (functional share?) but with some
fair-sharing over a time frame (share-tree?). That is, if a user has submitted
10,000 jobs (or 1000 10-core parallel jobs) to the queue at one time and the
queue doesn't have that many available slots/CPUs, the sharing policy should
dispatch other users' jobs in preference to the large-volume user (but still
allowing that user to have some slots available to process jobs). In the case
of users that dump huge numbers of jobs to the queues at a time, the sharing
policy should remember that high level of use for a short time (no more than a
month).
For the "weight_user" and so on values you could keep the default value of 0.25 (they are used for the
functional policy only). More important are the values for "weight_ticket" in relation to the alike entries
"weight_waiting_time". Just give the "weight ticktet" a higher value than the other entries.
You can achieve the 12% by defining a default user as a leaf in the share tree
with e.g. 12000 shares out of a total of 100000, and by setting
"weight_tickets_share" (e.g. 10000) in the scheduler configuration. In qmon's
share-tree dialog, add the leaf just below the "root" entry, i.e.
"Add Node" + "Add Leaf".
While "enforce_user auto" in SGE's configuration will take care that each user gets his
own entry automatically once he submitted a job, it's important to change the
"auto_user_delete_time 0" to zero. This is the actual object which will hold the values
recorded for the past usage. If the user object is deleted (either by hand or automatically), the
knowledge of the past usage is gone.
To record the reserved usage and not only the actual usage (i.e. when a user
requests more than one core but runs only a serial job), it's also necessary to
set:

   execd_params ACCT_RESERVED_USAGE=TRUE SHARETREE_RESERVED_USAGE=TRUE

in SGE's configuration.
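Taken together, the relevant lines of the global cluster configuration
("qconf -mconf") would look roughly like this:

   enforce_user              auto
   auto_user_delete_time     0
   execd_params              ACCT_RESERVED_USAGE=TRUE SHARETREE_RESERVED_USAGE=TRUE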
--
Reuti
Thanks for your help and advice. We're getting closer to understanding how
this works. I'm not sure how to create the user, though; our version of qmon
doesn't have a dialog that matches your description. We are using:
Open Grid Scheduler/Grid Engine 2011.11p1
If I were to do this
via qconf -auser, would the user configuration parameters look like this?
   name             default-user
   oticket          0
   fshare           12000
   delete_time      0
   default_project  NONE
Thanks again!
Carl
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users