On 14.12.2011, at 10:19, William Hay wrote:

> On 13 December 2011 23:46, Gowtham <g...@mtu.edu> wrote:
>>
>> In some of our Rocks 5.4.2 clusters running SGE
>> 6.2u5, I have been noticing the load average on
>> several compute nodes being significantly higher
>> than on others, even though all cores/processors
>> in all compute nodes involved are doing about the
>> same amount of work.
>>
>> When I run
>>
>>   qconf -sq all.q
>>
>> I see
>>
>>   load_thresholds    np_load_avg=1.75
>>
>> Reading some documentation, I learned that
>> np_load_avg=1.75 means a given node can
>> support a load average of up to NCORES * 1.75
>> before SGE stops assigning new jobs
>> (please correct me if I'm wrong).
>>
>> If possible, I'd love to keep the load
>> average approximately equal to the number
>> of cores available on a given compute node.
>> In other words, SGE should not assign any
>> more jobs to a compute node with NCORES
>> processors if the load average is greater
>> than or equal to NCORES * 1.10, irrespective
>> of whether all NCORES are in use or not.
>>
>> How would I go about achieving this? Is it
>> as simple as changing 'np_load_avg' to 1.10?
>> Am I missing something?
>>
> That should do it. One thing to watch out for is that under Linux the
> load average includes processes in uninterruptible sleep as well as
> the run queue. Your node can therefore have a high load average even
> with idle processors, because the figure can include things waiting
> on disk or other hardware.
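[Editorial note, not part of the thread: np_load_avg is simply the 1-minute load average divided by the processor count. A minimal sketch of the comparison SGE would make with np_load_avg=1.10, assuming a Linux host:]

```shell
#!/bin/sh
# Recompute what SGE's np_load_avg check amounts to: the 1-minute
# load average divided by the core count, compared against 1.10.
ncores=$(getconf _NPROCESSORS_ONLN)
load1=$(cut -d' ' -f1 /proc/loadavg)
awk -v l="$load1" -v n="$ncores" 'BEGIN {
    np = l / n
    printf "np_load_avg is currently %.2f (threshold 1.10)\n", np
    if (np >= 1.10)
        print "above threshold: SGE would stop dispatching to this node"
    else
        print "below threshold: SGE would still dispatch jobs here"
}'
```

[The actual queue change should be along the lines of `qconf -mattr queue load_thresholds np_load_avg=1.10 all.q`, or editing the queue interactively with `qconf -mq all.q`.]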
Yep, I like to point to one of my former posts:

http://comments.gmane.org/gmane.comp.clustering.gridengine.users/21620

What about using a load sensor checking the SGE complex "cpu" instead
(it runs from 0 to 100) and triggering at 95 or higher?

OTOH: having slots = cores would make the load_thresholds setting
superfluous.

-- Reuti

> William
>
>> Thanks for your time and help.
>>
>> Best,
>> g
>>
>> --
>> Gowtham
>> Information Technology Services
>> Michigan Technological University
>>
>> (906) 487/3593
>> http://www.it.mtu.edu/
>>
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
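[Editorial note, not part of the thread: an SGE load sensor like the one Reuti suggests follows sge_execd's load-sensor protocol: the sensor loops, reading a line from stdin each reporting interval, and answers with a begin/end-delimited block of host:complex:value lines; a line reading "quit" terminates it. A minimal sketch, assuming a Linux host and a hypothetical custom complex named cpu_busy (both the script and the complex name are assumptions, not from the thread):]

```shell
#!/bin/sh
# Sketch of an SGE load sensor. "report" prints one measurement cycle
# in the load-sensor output format: begin / host:complex:value / end.
report() {
    load1=$(cut -d' ' -f1 /proc/loadavg)
    ncores=$(getconf _NPROCESSORS_ONLN)
    # express load as a busy percentage, capped at 100
    busy=$(awk -v l="$load1" -v n="$ncores" \
        'BEGIN { p = 100 * l / n; if (p > 100) p = 100; printf "%.0f", p }')
    echo "begin"
    echo "$(hostname):cpu_busy:$busy"
    echo "end"
}

# Protocol loop: sge_execd writes a line each interval; "quit" stops us.
while read -r line; do
    [ "$line" = "quit" ] && break
    report
done
```

[Such a script would be registered via the load_sensor parameter in the host or global configuration (qconf -mconf), after defining the complex with qconf -mc; the matching queue setting would then be along the lines of load_thresholds cpu_busy=95.]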