I'm on 8.1.1 and cannot use schedd_job_info. When I need that capability, I use
qalter -w v <jobid> and leave schedd_job_info off. I've always had memory leaks
with it. I haven't seen any mention in the changelogs between that release and
8.1.3 that suggests any of those leaks have been fixed, so that could very well
be your issue.

-Brian

On Tue, Oct 22, 2013 at 11:13 AM, Tina Friedrich <[email protected]> wrote:

> Hello,
>
> we've started to have a problem with the qmaster process on one of my SGE
> cells.
>
> Basically, it starts to eat up large amounts of memory (and then dies).
> Seemed to happen more or less out of the blue (i.e. running fine for a
> while, suddenly stops - in fact, it ran fine for a couple of months, and we
> had this happen for the first time about a month ago). However, a couple of
> tests I ran today seem to indicate that it's the number of jobs being
> submitted that triggers it (as in, we managed to trigger it simply by
> submitting a lot of jobs in a short space of time).
>
> Unfortunately, a lot of jobs being submitted in a short space of time is
> our standard use case :)
>
> This is on SGE8.1.3.
>
> It did make me think of the old 'schedd_job_info can cause immense memory
> consumption' - as I do currently collect job info - but I thought that that
> was fixed? Should that be fixed in SGE8.1.3?
>
> Tina
>
> --
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442

--
Brian Lindblom (Smith)
Assistant Director Research Computing, University of South Florida
4202 E. Fowler Ave. SVC4010
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
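For anyone who wants to script the workaround mentioned above (running qalter -w v <jobid> on demand while schedd_job_info stays off), here is a minimal sketch. The helper names verify_command and why_pending are made up for illustration, and it assumes qalter is on PATH with the SGE environment already sourced. It simply returns whatever qalter prints and does not try to parse it.

```python
import shutil
import subprocess


def verify_command(job_id):
    """Build the argv for the on-demand check: qalter -w v <jobid>.

    Hypothetical helper, not part of SGE.
    """
    return ["qalter", "-w", "v", str(job_id)]


def why_pending(job_id):
    """Run the verification for one job and return qalter's raw output.

    Assumption: qalter is on PATH and the SGE settings file has been sourced.
    """
    if shutil.which("qalter") is None:
        raise RuntimeError("qalter not found on PATH")
    result = subprocess.run(
        verify_command(job_id),
        capture_output=True,
        text=True,
        check=False,  # return the output to the caller even on a non-zero exit
    )
    return result.stdout + result.stderr
```

Calling why_pending("<jobid>") on a submit host should give the same text you would see from the command line, so nothing is collected in the background between calls.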
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
