Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Matthew BETTINGER writes:

> Just curious if this option or oom setting (which we use) can leave
> the nodes in CG "completing" state.

I don't think so. As far as I know, jobs go into completing state when Slurm is cancelling them or when they exit on their own, and stay in that state until any epilogs have run. In my experience, the most typical reasons for jobs hanging in CG are disk system failures or other failures that leave either the job processes or the epilog processes hanging in "disk wait".

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Just curious if this option or OOM setting (which we use) can leave the nodes in CG "completing" state. We see CG states quite often, and the only way to clear them is to reboot the node. I believe it occurs when the parent process dies, gets killed, or goes zombie (Z)?

Thanks.

MB

On 10/8/19, 6:11 AM, "slurm-users on behalf of Bjørn-Helge Mevik" wrote:

    Marcus Boden writes:

    > you're looking for KillOnBadExit in the slurm.conf:
    > KillOnBadExit
    [...]
    > this should terminate the job if a step or a process gets oom-killed.

    That is a good tip! But as I read the documentation (I haven't tested it),
    it will only kill the job step itself, it will not kill the whole job.
    Also, it will only have effect for things started with srun, mpirun or
    similar. However, in combination with "set -o errexit", I believe most
    OOM kills would get the job itself terminated.

    -- 
    Regards,
    Bjørn-Helge Mevik, dr. scient,
    Department for Research Computing, University of Oslo
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
----- Original Message -----
> Maybe I missed something else...

That's right. Thanks to Bjørn-Helge, who helped me. You must enable swapaccount in the kernel, as shown here:

https://unix.stackexchange.com/questions/531480/what-does-swapaccount-1-in-grub-cmdline-linux-default-do

Apparently it is not necessary to set this option explicitly on RHEL7, but I'm on Debian. Now everything works fine with this cgroup.conf configuration:

CgroupAutomount=yes
ConstrainCores=yes
ConstrainSwapSpace=yes

Thanks.

Best regards,
Jean-Mathieu
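For other Debian sites, a minimal sketch of the kernel change described in that link (the existing contents of GRUB_CMDLINE_LINUX_DEFAULT on your nodes will of course differ; "quiet" is only a placeholder):

    # /etc/default/grub on the compute node: append swapaccount=1
    GRUB_CMDLINE_LINUX_DEFAULT="quiet swapaccount=1"

    # then, as root, regenerate the GRUB configuration and reboot the node
    update-grub
    reboot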
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Marcus Boden writes:

> you're looking for KillOnBadExit in the slurm.conf:
> KillOnBadExit
[...]
> this should terminate the job if a step or a process gets oom-killed.

That is a good tip! But as I read the documentation (I haven't tested it), it will only kill the job step itself, it will not kill the whole job. Also, it will only have effect for things started with srun, mpirun or similar. However, in combination with "set -o errexit", I believe most OOM kills would get the job itself terminated.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Juergen Salk writes:

> that is interesting. We have a very similar setup as well. However, in
> our Slurm test cluster I have noticed that it is not the *job* that
> gets killed. Instead, the OOM killer terminates one (or more)
> *processes*

Yes, that is how the kernel OOM killer works. This is why we always tell users to use "set -o errexit" in their job scripts. Then at least the job script exits as soon as one of its processes is killed.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
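For reference, a minimal job-script sketch of what we mean (the program names are hypothetical):

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --mem-per-cpu=2G

    set -o errexit            # abort the batch script as soon as any command
                              # exits non-zero, e.g. after being OOM-killed

    srun ./preprocess         # hypothetical steps; launching them with srun
    srun ./main_computation   # also lets Slurm record the failed exit code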
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Hello,

thanks for your answers.

> - Does it work if you remove the space in "TaskPlugin=task/affinity,
>   task/cgroup"? (Slurm can be quite picky when reading slurm.conf.)

That was a copy/paste mistake on my part; there is no space in the actual file.

> - See in slurmd.log on the node(s) of the job whether cgroup actually
>   gets activated and starts limiting memory for the job, or whether
>   there are any errors related to cgroup.

Yes, for example:

    Launching batch job 1605839 for UID
    [1605839.batch] task/cgroup: /slurm/uid_/job_1605839: alloc=200MB mem.limit=200MB memsw.limit=200MB
    [1605839.batch] task/cgroup: /slurm/uid_/job_1605839/step_batch: alloc=200MB mem.limit=200MB memsw.limit=200MB

> - While a job is running, look in the cgroup memory directory for the
>   job on the compute node (typically
>   /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>). Do the values
>   there, for instance memory.limit_in_bytes and
>   memory.max_usage_in_bytes, make sense?

Yes, for the same job:

    cat /sys/fs/cgroup/memory/slurm/uid_/job_1605839/memory.limit_in_bytes
    209715200
    root@star190:~# cat /sys/fs/cgroup/memory/slurm/uid_/job_1605839/memory.max_usage_in_bytes
    209715200

But:

    cat /sys/fs/cgroup/memory/slurm/uid_/job_1605839/memory.usage_in_bytes
    209711104

is always below memory.max_usage_in_bytes. I think it's because of the ConstrainRAMSpace=yes field in cgroup.conf, and the process swaps (with ConstrainRAMSpace=no)...

I tried Michael Renfro's configuration from the previous email, but with ConstrainRAMSpace=no and ConstrainSwapSpace=no, the cgroup is not activated for the job (nothing appears in slurmd.log or /sys/fs/cgroup/memory/slurm/uid_/).

Setting MemLimitEnforce to no or yes seems to have no influence...

Maybe I missed something else...

Regards,

Jean-Mathieu

> -- 
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
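One way to double-check that the limit is really enforced is to submit a throwaway job that deliberately over-allocates and then look at its final state. A sketch, assuming python3 is available on the node; whether the final state shows OUT_OF_MEMORY or just FAILED depends on the Slurm version:

    # request 200 MB but try to allocate about 1 GiB
    sbatch --mem=200M --wrap "python3 -c 'x = bytearray(1024**3)'"

    # after the job has ended, check how it finished
    sacct -j <jobid> -o JobID,State,ExitCode,MaxRSS,ReqMem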
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
> On 19-10-08 10:36, Juergen Salk wrote:
> > * Bjørn-Helge Mevik [191008 08:34]:
> > > Jean-mathieu CHANTREIN writes:
> > >
> > > > I tried using, in slurm.conf
> > > > TaskPlugin=task/affinity, task/cgroup
> > > > SelectTypeParameters=CR_CPU_Memory
> > > > MemLimitEnforce=yes
> > > >
> > > > and in cgroup.conf:
> > > > CgroupAutomount=yes
> > > > ConstrainCores=yes
> > > > ConstrainRAMSpace=yes
> > > > ConstrainSwapSpace=yes
> > > > MaxSwapPercent=10
> > > > TaskAffinity=no
> > >
> > > We have a very similar setup, the biggest difference being that we have
> > > MemLimitEnforce=no, and leave the killing to the kernel's cgroup. For
> > > us, jobs are killed as they should. [...]
> >
> > that is interesting. We have a very similar setup as well. However, in
> > our Slurm test cluster I have noticed that it is not the *job* that
> > gets killed. Instead, the OOM killer terminates one (or more)
> > *processes* but keeps the job itself running in a potentially
> > unhealthy state.
> >
> > Is there a way to tell Slurm to terminate the whole job as soon as
> > the first OOM kill event takes place during execution?

* Marcus Boden [191008 10:46]:
>
> you're looking for KillOnBadExit in the slurm.conf:
>
> KillOnBadExit
>     If set to 1, a step will be terminated immediately if any task
>     is crashed or aborted, as indicated by a non-zero exit code.
>     With the default value of 0, if one of the processes is crashed
>     or aborted the other processes will continue to run while the
>     crashed or aborted process waits. The user can override this
>     configuration parameter by using srun's -K, --kill-on-bad-exit.
>
> this should terminate the job if a step or a process gets oom-killed.

Hi Marcus,

thank you. I did not consider `KillOnBadExit=1´ so far. It seems this does indeed kill the current job step if it hits the memory limit - but then happily proceeds with the next one.

I've also noticed that, in order to work as described above, this requires all the processes to be launched via srun from within the batch script. Right?

Admittedly, I am also somewhat scared about potential side effects of `KillOnBadExit=1´ in a production environment that has to cope with all sorts of batch scripts. A non-zero exit code of some process may or may not harm the batch job, whereas processes that get oom-killed most probably affect the job as a whole.

Is `KillOnBadExit=1´ commonly used?

Thanks again.

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471
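If the cluster-wide setting feels too risky, the same behaviour can also be requested per step from within the batch script. A minimal sketch (the program name is hypothetical):

    #!/bin/bash
    #SBATCH --mem-per-cpu=2G

    set -o errexit                           # stop the script after a failed step

    # -K / --kill-on-bad-exit overrides KillOnBadExit for this step only
    srun --kill-on-bad-exit=1 ./my_program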
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Hi Jürgen,

you're looking for KillOnBadExit in the slurm.conf:

KillOnBadExit
    If set to 1, a step will be terminated immediately if any task
    is crashed or aborted, as indicated by a non-zero exit code.
    With the default value of 0, if one of the processes is crashed
    or aborted the other processes will continue to run while the
    crashed or aborted process waits. The user can override this
    configuration parameter by using srun's -K, --kill-on-bad-exit.

this should terminate the job if a step or a process gets oom-killed.

Best,
Marcus

On 19-10-08 10:36, Juergen Salk wrote:
> * Bjørn-Helge Mevik [191008 08:34]:
> > Jean-mathieu CHANTREIN writes:
> >
> > > I tried using, in slurm.conf
> > > TaskPlugin=task/affinity, task/cgroup
> > > SelectTypeParameters=CR_CPU_Memory
> > > MemLimitEnforce=yes
> > >
> > > and in cgroup.conf:
> > > CgroupAutomount=yes
> > > ConstrainCores=yes
> > > ConstrainRAMSpace=yes
> > > ConstrainSwapSpace=yes
> > > MaxSwapPercent=10
> > > TaskAffinity=no
> >
> > We have a very similar setup, the biggest difference being that we have
> > MemLimitEnforce=no, and leave the killing to the kernel's cgroup. For
> > us, jobs are killed as they should. [...]
>
> Hello Bjørn-Helge,
>
> that is interesting. We have a very similar setup as well. However, in
> our Slurm test cluster I have noticed that it is not the *job* that
> gets killed. Instead, the OOM killer terminates one (or more)
> *processes* but keeps the job itself running in a potentially
> unhealthy state.
>
> Is there a way to tell Slurm to terminate the whole job as soon as
> the first OOM kill event takes place during execution?
>
> Best regards
> Jürgen
>
> -- 
> Jürgen Salk
> Scientific Software & Compute Services (SSCS)
> Kommunikations- und Informationszentrum (kiz)
> Universität Ulm
> Telefon: +49 (0)731 50-22478
> Telefax: +49 (0)731 50-22471

-- 
Marcus Vincent Boden, M.Sc.
Arbeitsgruppe eScience
Tel.: +49 (0)551 201-2191
E-Mail: mbo...@gwdg.de
Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
Am Fassberg 11, 37077 Goettingen
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
* Bjørn-Helge Mevik [191008 08:34]:
> Jean-mathieu CHANTREIN writes:
>
> > I tried using, in slurm.conf
> > TaskPlugin=task/affinity, task/cgroup
> > SelectTypeParameters=CR_CPU_Memory
> > MemLimitEnforce=yes
> >
> > and in cgroup.conf:
> > CgroupAutomount=yes
> > ConstrainCores=yes
> > ConstrainRAMSpace=yes
> > ConstrainSwapSpace=yes
> > MaxSwapPercent=10
> > TaskAffinity=no
>
> We have a very similar setup, the biggest difference being that we have
> MemLimitEnforce=no, and leave the killing to the kernel's cgroup. For
> us, jobs are killed as they should. [...]

Hello Bjørn-Helge,

that is interesting. We have a very similar setup as well. However, in our Slurm test cluster I have noticed that it is not the *job* that gets killed. Instead, the OOM killer terminates one (or more) *processes* but keeps the job itself running in a potentially unhealthy state.

Is there a way to tell Slurm to terminate the whole job as soon as the first OOM kill event takes place during execution?

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471
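For reference, the individual OOM kill events are visible in the kernel log on the affected compute node; a sketch (the exact message wording varies by kernel version):

    # messages like "Memory cgroup out of memory: Killed process ..." appear here
    dmesg -T | grep -i "killed process"

    # or, on systemd-based nodes
    journalctl -k | grep -iE "oom|out of memory"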
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Jean-mathieu CHANTREIN writes:

> I tried using, in slurm.conf
> TaskPlugin=task/affinity, task/cgroup
> SelectTypeParameters=CR_CPU_Memory
> MemLimitEnforce=yes
>
> and in cgroup.conf:
> CgroupAutomount=yes
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
> MaxSwapPercent=10
> TaskAffinity=no

We have a very similar setup, the biggest difference being that we have MemLimitEnforce=no, and leave the killing to the kernel's cgroup. For us, jobs are killed as they should be.

Here are a couple of things you could check:

- Does it work if you remove the space in "TaskPlugin=task/affinity, task/cgroup"? (Slurm can be quite picky when reading slurm.conf.)

- See in slurmd.log on the node(s) of the job whether cgroup actually gets activated and starts limiting memory for the job, or whether there are any errors related to cgroup.

- While a job is running, look in the cgroup memory directory for the job on the compute node (typically /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>). Do the values there, for instance memory.limit_in_bytes and memory.max_usage_in_bytes, make sense?

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
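For example, something along these lines on the compute node (a sketch; the slurmd log location depends on SlurmdLogFile in slurm.conf, and <uid>/<jobid> are placeholders):

    # look for task/cgroup activity and errors for the job
    grep -i 'task/cgroup' /var/log/slurm/slurmd.log

    # while the job is running, inspect the limits Slurm has set
    cat /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>/memory.limit_in_bytes
    cat /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>/memory.max_usage_in_bytes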
Re: [slurm-users] How to automatically kill a job that exceeds its memory limits (--mem-per-cpu)?
Our cgroup settings are quite a bit different, and we don’t allow jobs to swap, but the following works to limit memory here (I know, because I get frequent emails from users who don’t change their jobs from the default 2 GB per CPU that we use):

CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=no
CgroupReleaseAgentDir="/etc/slurm/cgroup"
AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"
ConstrainCores=yes     # Not the Slurm default
TaskAffinity=no        # Slurm default
ConstrainRAMSpace=no   # Slurm default
ConstrainSwapSpace=no  # Slurm default
ConstrainDevices=no    # Slurm default
AllowedRamSpace=100    # Slurm default
AllowedSwapSpace=0     # Slurm default
MaxRAMPercent=100      # Slurm default
MaxSwapPercent=100     # Slurm default
MinRAMSpace=30         # Slurm default

> On Oct 7, 2019, at 11:55 AM, Jean-mathieu CHANTREIN wrote:
>
> Hello,
>
> I tried using, in slurm.conf
> TaskPlugin=task/affinity, task/cgroup
> SelectTypeParameters=CR_CPU_Memory
> MemLimitEnforce=yes
>
> and in cgroup.conf:
> CgroupAutomount=yes
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
> MaxSwapPercent=10
> TaskAffinity=no
>
> But when the job reaches its limit, it passes alternately from R to D state
> without being killed, even when it exceeds the 10% of the swap partition
> allowed.
>
> Do you have an idea how to do this?
>
> Regards,
>
> Jean-Mathieu
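For completeness, a minimal sketch of a submission script using the --mem-per-cpu limit this thread is about (the binary name is hypothetical); with task/cgroup and ConstrainRAMSpace=yes, the job's cgroup memory limit is derived from this request:

    #!/bin/bash
    #SBATCH --ntasks=4
    #SBATCH --cpus-per-task=1
    #SBATCH --mem-per-cpu=2G   # per-CPU memory; the cgroup limit on each node is
                               # (CPUs allocated on that node) x 2G

    srun ./my_program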