On Tue, Aug 16, 2016 at 10:23:45AM -0400, [email protected] wrote:
> On 08/16/2016 04:54 AM, William Hay wrote:
> > You may need to upgrade to 8.1.9; IIRC there were some cgroup/cpuset fixes
> > there.
> 
> Thanks William.  I had looked at the release notes for 8.1.9 and missed
> the cpuset reference.  Looks like ticket 1523 might apply to what I am
> seeing.  Unfortunately I am not in a position to upgrade the SGE core at
> the moment, so I will have to wait to try this.
> 
> > 
> > Not sure if ENABLE_ADDGRP_KILL=TRUE is compatible with USE_CGROUPS, as they
> > both provide ways to find processes that belong to the job and kill them.
> > Try using just USE_CGROUPS.
> 
> Tried this, but no change (and am not surprised given the above).
> 
> 
> > 
> > Also, is this job a serial one or a parallel job?  There are bugs in the SGE
> > cgroup support WRT some parallel libraries, IIRC.
> 
> Serial.
> 
> What bugs exist with the parallel libraries?  This might be a show
> stopper for using cgroups.

I was thinking of this:
https://arc.liv.ac.uk/trac/SGE/ticket/1512

I think it is worked around by recent versions of Intel MPI, but older versions
and other parallel libraries may still trigger it (most don't, though).

William


