Hi Stephen,

Yes, SGE takes the current load of the nodes into account, even if the load 
was caused by a non-SGE job. 
I ran a test on my cluster of three machines (one master and two nodes). I 
stressed node001 without going through SGE:
[root@node001 ~]# uptime
 19:51:19 up 1 day,  7:41,  1 user,  load average: 12.08, 13.68, 9.81

[root@fadmin ~]# qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
fadmin                  lx24-amd64      4  1.10    7.8G  420.6M    4.0G     0.0
node001                 lx24-amd64      4  9.57    7.8G  155.1M   24.0G     0.0
node002                 lx24-amd64      4  0.03    7.8G  150.8M   24.0G     0.0

Now all jobs are scheduled to node002, because the load on node001 exceeds 
the load_thresholds value (np_load_avg=1.75 by default).
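
To make the arithmetic explicit, here is a minimal Python sketch of the 
comparison. It assumes that np_load_avg is the load average divided by the 
number of CPUs, and that the threshold counts as exceeded with a strict ">" 
(that detail is my assumption). The helper names are hypothetical; the NCPU 
and LOAD figures are copied from the qhost output above.

```python
# NCPU and LOAD per host, copied from the qhost output above.
HOSTS = {
    "fadmin": (4, 1.10),
    "node001": (4, 9.57),
    "node002": (4, 0.03),
}

# load_thresholds np_load_avg=1.75, as reported by "qconf -sq all.q".
LOAD_THRESHOLD = 1.75


def np_load_avg(ncpu, load):
    """Return the load average normalized by the number of CPUs."""
    return load / ncpu


def exceeds_threshold(ncpu, load, threshold=LOAD_THRESHOLD):
    """Return True if the normalized load is above the threshold.

    The strict '>' comparison is an assumption, not taken from SGE itself.
    """
    return np_load_avg(ncpu, load) > threshold


if __name__ == "__main__":
    for host, (ncpu, load) in HOSTS.items():
        status = "over threshold" if exceeds_threshold(ncpu, load) else "ok"
        print(f"{host}: np_load_avg={np_load_avg(ncpu, load):.2f} ({status})")
```

With these figures, node001 comes out at about 2.39 (9.57 / 4), which is 
above 1.75, while fadmin and node002 stay well below it.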

Best regards,
Farid.

--- On Tue, 7 May 2013, Stephen Spencer <[email protected]> wrote:

From: Stephen Spencer <[email protected]>
Subject: [gridengine users] Question about load average and slots and 
non-SGE-managed tasks...
To: [email protected]
Date: Tuesday, 7 May 2013, 17:09

Good morning.
I'm administering a cluster of machines with SGE (6.2u5, from the RHEL distro) 
and have a question concerning the scheduler's behavior. (I'm rather new to 
SGE.)

On this cluster, users can and do log in (via 'ssh') and run computational 
tasks on cluster nodes, which ties up resources but not an SGE 'slot' because 
the tasks aren't submitted through SGE.

My question is this: does SGE take into consideration the current load average 
on a node when assigning tasks? For example, given two nodes with equivalent 
numbers of slots, and one node has a load average of 10 and the other 0, will 
SGE send a waiting job to the node with less load?


I see "load_thresholds   np_load_avg=1.75" in the output of "qconf -sq all.q" 
and am guessing that if the value of "np_load_avg" on a given host, as SGE 
calculates it, is greater than 1.75, tasks will be assigned elsewhere first, 
but that's only a guess. Confirmation, or clarification of what this means, 
would be wonderful.


Thank you.
Best, -- 
Stephen Spencer
[email protected]



_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users