On Mon, Oct 7, 2013 at 9:25 AM, Edrisse Chermak
<[email protected]> wrote:
>
> Dear Grid Engine developers and users,
>
> I would like to prevent jobs running when node memory is almost filled.
> Here is the typical situation I have:
>
> 'qhost':
> ================================================================
> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE
> ----------------------------------------------------------------
> global                  -               -     -       -       -
> node1                   linux-x64      64 12.00   52.4G   42.8G
> node2                   linux-x64      64 24.00   52.4G   12.8G
> ================================================================
>
> The memory on node1 is almost filled, so I would like the new job I'll
> launch to go on node2. (provided that the job I want to launch requires
> 2 CPUs and that I configured np_load_avg=0.80, so that CPU load doesn't
> matter here).
>

We have cases where we are constrained by memory rather than CPU
count. We solved this by changing the complex configuration.
We made h_vmem consumable, with a default tuned for our environment
(with qconf -mc), and then set a per-node h_vmem value a couple of GB
below total physical RAM (with qconf -me nodename or global). We also
aliased h_vmem to mem and got users to specify -l mem=xG on the qsub
command line when they need more than the default amount of memory.
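
For anyone following along, here is a rough sketch of those steps. It is
not our exact configuration: the 2G default, the 50G host limit, the use
of the shortcut column for the "mem" alias, and the helper name
h_vmem_cmd are all assumptions for illustration. The helper only prints
the qconf command, so it can be reviewed before anything is changed.

```shell
# 1. In "qconf -mc", the h_vmem line might end up looking like this
#    (columns: name shortcut type relop requestable consumable default urgency;
#    the values here are assumed, not taken from a real cluster):
#      h_vmem  mem  MEMORY  <=  YES  YES  2G  0

# 2. Per-host limit: total RAM minus a reserve, set on complex_values.
#    Hypothetical helper that prints the command instead of running it.
h_vmem_cmd() {
  host=$1            # execution host name, e.g. node1
  total_gb=$2        # physical RAM in whole GB
  reserve_gb=${3:-2} # GB held back for the OS (assumed default: 2)
  echo "qconf -mattr exechost complex_values h_vmem=$((total_gb - reserve_gb))G $host"
}

h_vmem_cmd node1 52

# 3. Users then request memory at submit time, for example:
#      qsub -l mem=4G job.sh
```

The same per-host value can be entered interactively with qconf -me
nodename, on the complex_values line.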

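This prevents us from overcommitting memory, because the scheduler stops
placing jobs on a node once its h_vmem is fully reserved. There may also
be a way to accomplish this with resource quota sets (RQS) or by
tweaking load_formula.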

Best,
Chris
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
