On Wednesday 19 April 2017 17:51:03 Mike Cammilleri wrote:
>
> Hi Slurm community,
>
> I have hopefully an easy question regarding cpu/partition configuration in
> slurm.conf.
>
> BACKGROUND:
>
> We are running slurm 16.05.6 built on Ubuntu 14.04 LTS (because 14.04 works
> with our current bcfg2 xml configuration management servers).
> Each node has two 12-core Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
> When you run 'cat /proc/cpuinfo' it returns 48 processors because each core
> consists of two threads.
>
> I want to make sure that we are defining our cpu and available cores to slurm
> appropriately. What slurm considers a cpu, and what a process considers a
> thread - all can get mixed up with the semantics.
>
>
> PROBLEM:
>
> Most users run R. R is single threaded so when someone submits a job it will
> take 1 thread and leave the other thread on the core empty. So although a
> user thinks there are 48 cores available, in actuality they only have the 24
> physical available to them. If however they are running an app that can use
> the multiple threads (Julia?) then things are different. We've been getting
> by up to this point until a user tried to run a numpy array in his python3.5
> app which has resulted in all kinds of cpu overload and memory swap. He's
> using job arrays of size 32, running one array in each job, and on one node
> for example 12 of his python apps are running but all 48 cpus are utilized.
> Load average is 300.0+. Sometimes memory is swapping and sometimes not.
Using cgroups will help to ensure that jobs cannot use more resources than
asked for. See: https://slurm.schedmd.com/cgroups.html

I have:

$ cat /etc/slurm-llnl/slurm.conf | grep -i cgroup
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
JobAcctGatherType=jobacct_gather/cgroup

$ cat /etc/slurm-llnl/cgroup.conf
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm-llnl/cgroup"
CgroupMountpoint=/sys/fs/cgroup
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes

regards
Markus Köberl
-- 
Markus Koeberl
Graz University of Technology
Signal Processing and Speech Communication Laboratory
E-mail: markus.koeb...@tugraz.at
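
With ConstrainCores=yes a job is confined to the cores it was allocated, but
a NumPy job may still start one BLAS/OpenMP thread per logical CPU it
detects, which would pile many threads onto those few cores. The following
is a minimal sketch, not part of the original thread, of how a Python job
could size its thread pools to its allocation. The helper names
allowed_cpu_count and cap_thread_pools are hypothetical. It assumes Linux,
and that the NumPy build honours OMP_NUM_THREADS, OPENBLAS_NUM_THREADS or
MKL_NUM_THREADS, which depends on the BLAS library it was linked against.

```python
import os


def allowed_cpu_count():
    """Return how many CPUs this process should use (hypothetical helper)."""
    # Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task was requested.
    requested = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if requested.isdigit() and int(requested) > 0:
        return int(requested)
    # On Linux the affinity mask reflects cpuset restrictions, such as
    # those applied by the cgroup task plugin.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for other platforms: all logical CPUs the OS reports.
    return os.cpu_count() or 1


def cap_thread_pools(n):
    """Set the common thread-count variables to n (hypothetical helper)."""
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)


# Must run before "import numpy", because the thread pools are typically
# sized when the BLAS library is loaded.
n_threads = allowed_cpu_count()
cap_thread_pools(n_threads)
```

The same effect can usually be had without touching the application by
exporting these variables in the batch script before starting Python.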