On Thu, Aug 10, 2017 at 10:31 AM, Kilian Cavalotti <kilian.cavalotti.w...@gmail.com> wrote:
> Do you use cgroups in your Slurm setup with pam_systemd on nodes? And
> if so, did you notice any issue with cgroups?

For what it's worth, I just checked again with Slurm 17.02 and CentOS 7.3, and can confirm that enabling pam_systemd.so in /etc/pam.d/slurm breaks cgroups, at least for device access. We enforce GPU isolation through Slurm's ConstrainDevices and /etc/slurm/cgroup_allowed_devices_file.conf, and as soon as pam_systemd is active, all GPUs are visible from any job.

Since this is far more important to us than XDG_* dirs, we disable pam_systemd on our systems. That seems to be the official SchedMD recommendation too (see https://bugs.schedmd.com/show_bug.cgi?id=3674 and https://bugs.schedmd.com/show_bug.cgi?id=3158).

Cheers,
-- 
Kilian
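
As a quick way to audit nodes for this, here is a minimal sketch in Python that checks whether a PAM service file still has an active pam_systemd.so line. The helper name pam_module_enabled is hypothetical, and the check is deliberately simple: it assumes the module is listed directly in the file and does not follow include or substack directives.

```python
def pam_module_enabled(pam_text: str, module: str = "pam_systemd.so") -> bool:
    """Return True if a non-comment line in a PAM config references `module`.

    Hypothetical helper for illustration; it does not resolve
    `include` or `substack` directives that pull in other files.
    """
    for line in pam_text.splitlines():
        # Everything after '#' is a comment in PAM service files.
        active = line.split("#", 1)[0]
        for token in active.split():
            # Match a bare module name or a full path ending in it.
            if token == module or token.endswith("/" + module):
                return True
    return False


if __name__ == "__main__":
    # /etc/pam.d/slurm is the file discussed above.
    with open("/etc/pam.d/slurm") as fh:
        if pam_module_enabled(fh.read()):
            print("pam_systemd is active: device cgroups may not be enforced")
        else:
            print("pam_systemd is not enabled in /etc/pam.d/slurm")
```

This only inspects the PAM configuration. The real test is still to run a job that requests one GPU and confirm that the others are not visible from inside it.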