On Friday, 24 July 2020 9:48:35 AM PDT Paul Raines wrote:

> But when I run a job on the node it runs I can find no
> evidence in cgroups of any limits being set
>
> Example job:
>
> mlscgpu1[0]:~$ salloc -n1 -c3 -p batch --gres=gpu:quadro_rtx_6000:1 --mem=1G
> salloc: Granted job allocation 17
> mlscgpu1[0]:~$ echo $$
> 137112
> mlscgpu1[0]:~$
You're not actually running inside a job at that point unless you've defined "SallocDefaultCommand" in your slurm.conf, and I'm guessing that's not the case there.

You can make salloc fire up an srun for you in the allocation using that option; see the docs here:

https://slurm.schedmd.com/slurm.conf.html#OPT_SallocDefaultCommand

All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
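For what it's worth, a sketch of what that setting might look like (the exact srun options are site-dependent; this follows the pattern the slurm.conf man page gives as an example):

```
# slurm.conf -- example only, tune the srun options for your site.
# Makes salloc launch an interactive shell via srun inside the allocation,
# so the shell's processes land in the job's cgroup on the compute node.
SallocDefaultCommand="srun -n1 -N1 --mem-per-cpu=0 --pty --preserve-env --mpi=none $SHELL"
```

With that in place, the shell salloc hands you runs under the job's cgroup, and the limits should show up where Paul was looking.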