Hello,

Recently I have had some issues with users complaining that their jobs are
being killed by Slurm due to memory limit constraints.

I have seen that their jobs have not exceeded their RSS limit. Instead, the
cached memory size was what caused the OOM killer to kill their job steps.

Also, I understand that cgroups computes a process's memory as the sum of
RSS + cache + semaphores + shared segments + swap.
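
To make the RSS vs. cache split easier to check, here is a minimal sketch
that parses a cgroup v1 memory.stat file and prints the main counters. It
assumes the memory controller is mounted at /sys/fs/cgroup/memory and that
the job's cgroup lives under a path like
slurm/uid_<uid>/job_<jobid>/memory.stat (this layout is an assumption and
may differ on other systems). The function names are my own, not part of
Slurm, and the "total" is only an approximation of what the kernel charges.

```python
import sys
from pathlib import Path


def parse_memory_stat(text):
    """Parse the contents of a cgroup v1 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Each valid line looks like "<field> <integer value>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(stats):
    """Return the rss, cache and swap counters (bytes) and their sum."""
    rss = stats.get("rss", 0)
    cache = stats.get("cache", 0)
    # "swap" only appears when swap accounting is enabled in the kernel.
    swap = stats.get("swap", 0)
    return {"rss": rss, "cache": cache, "swap": swap,
            "total": rss + cache + swap}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path to the job's memory.stat file.
    summary = summarize(parse_memory_stat(Path(sys.argv[1]).read_text()))
    for name, value in summary.items():
        print(f"{name:>6}: {value / 2**20:.1f} MiB")
```

Running it against the memory.stat of a job step that is close to its limit
should show whether "cache" rather than "rss" accounts for most of the usage.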

Question: Is it possible to tune the cgroup plugin so that only RSS is
taken into account?


Relevant info of my configuration:
slurm 15.08.2
openSUSE 13.2 (x86_64)

slurm.conf:
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SelectTypeParameters=CR_Core_Memory
JobAcctGatherFrequency=15
JobAcctGatherType=jobacct_gather/linux

cgroup.conf:
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
ConstrainRAMSpace=yes



Thank you!


*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - [email protected]
WebPage - http://lipix.ciutadella.es
