Yair, I tested this in our environment, where we require --mem and set conservative defaults. The behavior here is the same as what you are seeing (14.11.4). I consider this a bug, but would love to hear from someone who knows why it works this way.
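For anyone who wants to repeat the comparison, here is a minimal sketch of the check described in the quoted message: take the job's cgroup memory limit and the node's AllocMem, and flag the case where the limit exceeds what is accounted. The helper names `get_field` and `check_accounting` are made up for illustration, and the sample values are copied from the quoted output. It also assumes a single job on the node, since AllocMem is a per-node total.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a Key=Value field read from stdin,
# e.g. a line of "scontrol show node" output.
get_field() {
    # $1 = field name, e.g. AllocMem
    tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Hypothetical helper: compare a cgroup limit (bytes) with AllocMem (MB).
# $1 = contents of memory.limit_in_bytes, $2 = AllocMem in MB
check_accounting() {
    limit_mb=$(( $1 / 1048576 ))
    if [ "$2" -lt "$limit_mb" ]; then
        echo "unaccounted: limit ${limit_mb} MB, AllocMem $2 MB"
    else
        echo "ok"
    fi
}

# Sample values copied from the quoted message (assumption: one job on the node).
node_line='OS=Linux RealMemory=64504 AllocMem=0 Sockets=2 Boards=1'
alloc=$(printf '%s\n' "$node_line" | get_field AllocMem)
check_accounting 67637346304 "$alloc"
```

Inside a job, the two inputs would come from the same commands Yair used: the `memory.limit_in_bytes` file under the job's cgroup, and `scontrol show node ${SLURMD_NODENAME}`.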

Thanks,
jbh

On Wed, Mar 11, 2015 at 5:31 PM, Yair Yarom <[email protected]> wrote:
>
>
> Hi all,
>
> We are using slurm 14.03.1-2, with select/cons_res (CR_CPU_memory) and
> task/cgroup plugins, and Shared=NO partitions. I have noticed that when
> users specify --mem=0, slurm "grants" them all memory but not marking it
> as allocated, causing other users to be able to use it as well.
>
> example:
>
> $ sbatch --mem=0 -ptest -whm-39 --wrap xterm
>
> within the xterm:
>
> hm-39$ cat /sys/fs/cgroup/memory/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOBID}/memory.limit_in_bytes
> 67637346304
> hm-39$ scontrol show node ${SLURMD_NODENAME} | grep AllocMem
> OS=Linux RealMemory=64504 AllocMem=0 Sockets=2 Boards=1
>
> This somewhat makes the --mem parameter useless - as it doesn't
> guarantee allocated memory will not be shared, and everyone can just use
> --mem=0 and they won't be accountable for it. I do want to allow users
> to request all memory (preferably with --exclusive).
>
> Am I missing something?
> Is it just some misconfiguration on my part?
> Do I need to write a job_submit (or other) plugin that will verify that
> mem=0 was not set?
>
>
> Thanks in advance,
> Yair.
>
