Hello,

Recently I have had some issues with users complaining that their jobs are being killed by Slurm due to memory limit constraints.

I have seen that their jobs had not exceeded the memory limit in terms of RSS. Instead, the cached size was what caused the OOM killer to kill their job steps. Also, I understand that cgroups computes a process's memory as the sum of RSS + Cached + Semaphores + Shared segments + Swap.

Question: Is it possible to tune the cgroup plugin so that it only takes RSS into account?

Relevant info about my configuration:

slurm 15.08.2
openSUSE 13.2 (x86_64)

slurm.conf:
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SelectTypeParameters=CR_Core_Memory
JobAcctGatherFrequency=15
JobAcctGatherType=jobacct_gather/linux

cgroup.conf:
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
ConstrainRAMSpace=yes

Thank you!

*--Felip Moll Marquès*
Computer Science Engineer
E-Mail - [email protected]
WebPage - http://lipix.ciutadella.es
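
For anyone who wants to check the RSS versus cache split described above, here is a minimal sketch that parses a cgroup v1 memory.stat file. It is an illustration only, not something taken from Slurm itself. The helper names (parse_memory_stat, rss_and_cache) are hypothetical, and the file location is an assumption: with the cgroup v1 memory controller it is usually something like /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>/memory.stat, but the actual path depends on the mount point and Slurm setup.

```python
import sys
from pathlib import Path


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v1 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def rss_and_cache(stats):
    """Return (rss, cache) in bytes.

    Prefers the hierarchical total_* counters when present and falls
    back to the per-cgroup counters otherwise.
    """
    rss = stats.get("total_rss", stats.get("rss", 0))
    cache = stats.get("total_cache", stats.get("cache", 0))
    return rss, cache


if __name__ == "__main__":
    # The path is passed on the command line because its location is
    # site-specific (assumption: cgroup v1 memory controller is mounted).
    stat_path = Path(sys.argv[1])
    rss, cache = rss_and_cache(parse_memory_stat(stat_path.read_text()))
    print(f"rss:   {rss / 2**20:.1f} MiB")
    print(f"cache: {cache / 2**20:.1f} MiB")
```

Run against the memory.stat file of a running job, this prints both values side by side, which makes it easy to see whether page cache rather than RSS accounts for most of the usage charged to the cgroup.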
