Hello,

We've started having a problem with the qmaster process on one of our SGE cells.

Basically, it starts to eat up large amounts of memory (and then dies). It seemed to happen more or less out of the blue: it ran fine for a couple of months, and the first failure was about a month ago. However, a couple of tests I ran today seem to indicate that the number of jobs being submitted is what triggers it (we managed to trigger it simply by submitting a lot of jobs in a short space of time).

Unfortunately, a lot of jobs being submitted in a short space of time is our standard use case :)

This is on SGE8.1.3.

It did make me think of the old 'schedd_job_info can cause immense memory consumption' issue, as I do currently collect job info, but I thought that had been fixed? Should it be fixed in SGE8.1.3?

Tina

--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442




