Hi all,

I've configured our cluster so that slots and memory are consumable resources. Each node has its own limits, and default resource requirements are applied at job submission. As far as I've read, this configuration should prevent memory and processor oversubscription; it follows roughly what is described at http://jeetworks.org/node/93. Is this the recommended way of avoiding oversubscription?
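To make the setup concrete, here is a rough sketch of the kind of commands involved. It is illustrative only: the host name node01, the 8-slot/32G limits, the job script job.sh, and the choice of h_vmem as the memory complex are placeholder assumptions, and it presumes h_vmem has already been marked consumable in the complex configuration (qconf -mc). The run helper is also just a convenience for this sketch; it prints each command unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry run by default: print each command; set APPLY=1 to really execute it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

HOST=node01   # hypothetical execution host name

# Set the per-host capacity of the consumables (example values only).
run qconf -mattr exechost complex_values "slots=8,h_vmem=32G" "$HOST"

# Show the resulting execution host configuration.
run qconf -se "$HOST"

# Submit a 1-slot job that overrides the default memory request.
run qsub -l h_vmem=2G job.sh
```

With limits like these, the scheduler should stop placing jobs on a host once the sum of requested slots or h_vmem reaches the configured capacity.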
I've also configured core binding, and the default for each job is 1 slot. With this configuration I have some questions:

1.-) When I submit a job requesting more than one slot (-l slots=2 -binding linear:2), OGS rejects it and suggests using a parallel environment. I've read somewhere that this is by design in OGS, so I need a PE. I haven't found clear documentation about PEs (there is plenty on how to create and manage them, but no complete definition of each parameter and its implications), so could anyone share some? What is the minimum configuration I need to allow slot requests at job submission? Is something like this enough?

    $ qconf -sp smp
    pe_name            smp
    slots              1024
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    NONE
    stop_proc_args     NONE
    allocation_rule    $pe_slots
    control_slaves     FALSE
    job_is_first_task  TRUE
    urgency_slots      min
    accounting_summary FALSE

(This is what I'm using, and it works fine.)

2.-) I haven't configured share-based priorities yet, but once I do: if a user requests more slots/memory than the default but does not use them, is the request taken into account in the share calculation? For example, user A requests 8GB and 1 CPU and uses 8GB and 3600 sec of CPU, while user B requests 16GB and 2 CPUs but uses only 8GB and 3600 sec of CPU. Are both users' priorities recalculated from resource usage or from resource requests?

TIA,
Arnau
