I'm on 8.1.1 and cannot use schedd_job_info.  When I need that capability,
I use

qalter -w v <jobid>

and leave schedd_job_info off.  I've always had memory leaks with it.  I
haven't seen anything in the changelogs between that release and 8.1.3
suggesting that those leaks have been fixed, so that could very well
be your issue.
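
A minimal sketch of that workflow, assuming a POSIX shell on a submit host where qalter is on the PATH.  The verify_jobs function name and the QALTER override variable are hypothetical, added only so the loop can be dry-run without a live cell; the one Grid Engine command it runs is the qalter -w v call shown above.

```shell
#!/bin/sh
# Sketch: validate jobs one at a time with "qalter -w v" instead of
# enabling schedd_job_info in the scheduler configuration.
# QALTER is a hypothetical override for dry runs; it defaults to qalter.
verify_jobs() {
    for jobid in "$@"; do
        echo "== job $jobid =="
        "${QALTER:-qalter}" -w v "$jobid"
    done
}

# The current scheduler setting can be inspected with:
#   qconf -ssconf | grep schedd_job_info
```

Calling verify_jobs with one or more job ids runs the check for each job in turn, which keeps the per-job diagnostics on demand rather than having the scheduler collect them for every job.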

-Brian


On Tue, Oct 22, 2013 at 11:13 AM, Tina Friedrich <
[email protected]> wrote:

> Hello,
>
> we've started to have a problem with the qmaster process on one of my SGE
> cells.
>
> Basically, it starts to eat up large amounts of memory (and then dies).
> Seemed to happen more or less out of the blue (i.e. running fine for a
> while, suddenly stops - in fact, it ran fine for a couple of months, and we
> had this happen for the first time about a month ago). However, a couple of
> tests I ran today seem to indicate that it's the number of jobs being
> submitted that triggers it (as in, we managed to trigger it simply by
> submitting a lot of jobs in a short space of time).
>
> Unfortunately, a lot of jobs being submitted in a short space of time is
> our standard use case :)
>
> This is on SGE8.1.3.
>
> It did make me think of the old 'schedd_job_info can cause immense memory
> consumption' - as I do currently collect job info - but I thought that that
> was fixed? Should that be fixed in SGE8.1.3?
>
> Tina
>
> --
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
>



-- 
Brian Lindblom (Smith)
Assistant Director
Research Computing, University of South Florida
4202 E. Fowler Ave. SVC4010
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
