[gridengine users] Some generic questions: binding, parallel, over-subscription

Arnau Bria Tue, 11 Dec 2012 11:07:55 -0800

Hi all,

I've configured our cluster so that slots and memory are consumable
resources. Each node has its limits defined, and jobs get default
resource requests at submission. This configuration should prevent
memory/processor over-subscription (at least, from what I've read),
along the lines of http://jeetworks.org/node/93 ... Is this the
recommended way of avoiding over-subscription?


I've also configured core binding, and the default for each job is 1
slot.

With this configuration I have some questions:

1.-) When submitting a job that requests more than 1 slot (-l slots=2
-binding linear:2), OGS fails and suggests using a parallel
environment. I've read somewhere that this is by design in OGS, so I
need a PE. I haven't found clear documentation on PEs (there is
material on how to create and manage them, but no complete definition
of each parameter and its implications), so could anyone share some
documentation? What is the minimum configuration I need to allow slot
requests at job submission? Something like:

$ qconf -sp smp
pe_name            smp
slots              1024
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE


Is this enough? (This is what I'm using, and it works fine.)
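
As a sketch of how a two-slot job could be submitted against the PE
above: the helper below only prints the qsub command line rather than
running it, so it can be tried without a cluster. The function name
build_qsub_cmd and the script name job.sh are placeholders, not part of
the original setup, and it assumes the smp PE is listed in the pe_list
of the target queue.

```shell
#!/bin/sh
# Hypothetical helper (not part of OGS): prints the qsub command for a
# job that needs N slots on a single host via the "smp" PE shown above.
build_qsub_cmd() {
    slots="$1"    # number of slots (and cores to bind) requested
    script="$2"   # job script; job.sh below is a placeholder name
    # -pe requests the slots through the parallel environment instead of
    # "-l slots=N"; -binding linear:N asks for N consecutive cores.
    printf 'qsub -pe smp %s -binding linear:%s %s\n' "$slots" "$slots" "$script"
}

build_qsub_cmd 2 job.sh
```

With allocation_rule $pe_slots, all requested slots are placed on one
host, which is what a threaded (SMP) job needs.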



2.-) I haven't configured share-based priorities yet, but once I do:
if a user requests more slots/memory than the default but does not
use them, is the request taken into account in the share calculation?
For example, user A requests 8GB and 1 CPU and uses 8GB and 3600 sec
of CPU time, while user B requests 16GB and 2 CPUs but also uses 8GB
and 3600 sec of CPU time. Are both users' priorities recalculated from
resource usage or from resource requests?

TIA,
Arnau
