Hi all, I sent this mail from a bad email address this weekend. I apologize if it gets published as a duplicate (I have not found it in the archive yet).
Maybe this is a basic question, but I'm stuck with it. I'm quite new to managing a small cluster with Slurm instead of a local batch scheduler. On the nodes I've set memory limits in slurm.conf:

DefMemPerCPU=2048
MaxMemPerCPU=4096

Requesting 1.2GB of RAM works:

srun --ntasks-per-node=1 --mem-per-cpu=1500M -p tenibre-gpu --pty bash -i

and my testcase can allocate up to 1.5GB:

./a.out
allocation de 1000Mo.........Ok
....
allocation de 1419Mo.........Ok
allocation de 1524Mo.........Ok
Killed

Now I would like to use more memory than MaxMemPerCPU:

srun --ntasks-per-node=1 --mem-per-cpu=12G -p tenibre-gpu --pty bash -i

If I understand the documentation correctly, since mem-per-cpu > MaxMemPerCPU the limit is applied at the task level and Slurm aggregates CPUs and memory. The squeue command shows 3 CPUs aggregated on the node to reach the requested 3*MaxMemPerCPU of memory, so everything seems correct:

JOBID PARTITION   NAME USER  ST TIME START_TIME          TIME_LIMIT CPUS NODELIST(REASON)
497   tenibre-gpu bash begou R  1:23 2021-03-20T14:42:47 12:00:00   3    tenibre-gpu-0

But my task is still unable to exceed the MaxMemPerCPU value:

./a.out
allocation de 1000Mo.........Ok
....
allocation de 4145Mo.........Ok
allocation de 4250Mo.........Ok
Killed

So I'm wrong somewhere, but where? Running the testcase in an ssh session (ssh as root, then su to a basic user) allows using more memory, so it is related to my bad Slurm setup/usage.

Patrick
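P.S. In case anyone wants to reproduce this, a testcase of this kind can be as simple as the sketch below. My a.out is essentially similar; the ~100 MB step size and the messages here are only illustrative, not the exact source:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t step = 100UL << 20;   /* grow the request by ~100 MB each round */
    size_t total = 0;

    for (;;) {
        total += step;
        char *p = malloc(total);
        if (p == NULL) {
            printf("allocation of %zu MB.........failed\n", total >> 20);
            return 1;
        }
        memset(p, 1, total);           /* touch every page so the resident memory really grows */
        printf("allocation of %zu MB.........Ok\n", total >> 20);
        free(p);
    }
}

When the resident memory passes the enforced limit the kernel kills the process, which is where the "Killed" line in the output above comes from.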