On Tue, Aug 16, 2016 at 10:23:45AM -0400, [email protected] wrote:
> On 08/16/2016 04:54 AM, William Hay wrote:
> > You may need to upgrade to 8.1.9 IIRC there were some cgroup/cpuset fixes
> > there.
>
> Thanks William. I had looked at the release notes for 8.1.9 and missed
> the cpuset reference. Looks like ticket 1523 might apply to what I am
> seeing. Unfortunately I am not in a position to upgrade the SGE core at
> the moment, so I will have to wait to try this.
>
> > Not sure if ENABLE_ADDGRP_KILL=TRUE is compatible with USE_CGROUPS as they
> > both provide ways to find processes that belong to the job and kill them.
> > Try using just USE_CGROUPS.
>
> Tried this, but no change (and am not surprised given the above).
>
> > Also is this job a serial one or a parallel job? There are bugs in the SGE
> > cgroup support WRT some parallel libraries IIRC.
>
> Serial.
>
> What bugs exist with the parallel libraries? This might be a show
> stopper for using cgroups.

I was thinking of this:
https://arc.liv.ac.uk/trac/SGE/ticket/1512

I think it is worked around by recent versions of Intel MPI, but older
versions and other parallel libraries may still trigger it (most don't,
though).

William
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
