Hello,

Can the Slurm forum help with these questions, or should we seek help elsewhere?
We need help with salloc GPU allocation. Hopefully this clarifies things a bit. Given:

% salloc -n 12 -c 2 --gres=gpu:1
% srun env | grep CUDA
CUDA_VISIBLE_DEVICES=0
(12 times)

*Is this expected behavior if we have more than 1 GPU available (4 total) for the 12 tasks?*

We desire different behavior. *Is there a way to specify an salloc + srun combination to get:*

CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=2
CUDA_VISIBLE_DEVICES=3
And so on... (12 total print statements)?

That is, such that each task gets 1 GPU, but overall GPU usage is spread out among the 4 available devices (not one where every task sees device 0). That way each task is not waiting on device 0 to free up from other tasks, as is currently the case.

What are we missing or misunderstanding?

- An salloc / srun parameter?
- A slurm.conf or gres.conf setting?

Thank you!

On Tue, Aug 29, 2017 at 12:27 PM, charlie hemlock <charlieheml...@gmail.com> wrote:

> Hello,
>
> We're looking for any advice on an salloc/srun setup that uses 1 GPU per task
> but where the job makes use of all available GPUs.
>
> *Test #1:*
>
> We desire an salloc and srun such that each task gets 1 GPU, but the GPU
> usage for the job is spread out among the 4 available devices. See gres.conf
> below.
>
> % salloc -n 12 -c 2 --gres=gpu:1
> % srun env | grep CUDA
> CUDA_VISIBLE_DEVICES=0
> (12 times)
>
> Where we desire:
>
> CUDA_VISIBLE_DEVICES=0
> CUDA_VISIBLE_DEVICES=1
> CUDA_VISIBLE_DEVICES=2
> CUDA_VISIBLE_DEVICES=3
> And so on (12 times), such that each task still gets 1 GPU, but usage is
> spread out among the 4 available devices (see gres.conf below), not just
> device 0. That way each task is not waiting on device 0 to free up, as is
> currently the case.
>
> What are we missing or misunderstanding?
>
> - An salloc / srun parameter?
> - A slurm.conf or gres.conf setting?
>
> Also see the additional tests below that illustrate the current behavior:
>
> *Test #2*
>
> Here we believe each srun task will need 4 GPUs.
>
> % salloc -n 12 -c 2 --gres=gpu:4
> % srun env | grep CUDA
> CUDA_VISIBLE_DEVICES=0,1,2,3
> (12 times)
>
> This matches expectation.
>
> *Test #3*
>
> Another test, where we submit multiple sruns in succession. Here we use a
> simple sleepCUDA.py script, which sleeps a few seconds and then prints
> $CUDA_VISIBLE_DEVICES.
>
> % salloc -n 12 -c 2 --gres=gpu:4
> % srun --gres=gpu:1 sleepCUDA.py &
> % srun --gres=gpu:1 sleepCUDA.py &
> % srun --gres=gpu:1 sleepCUDA.py &
> % srun --gres=gpu:1 sleepCUDA.py &
>
> Result:
>
> CUDA_VISIBLE_DEVICES=0 (jobid 1)
> CUDA_VISIBLE_DEVICES=1 (jobid 2)
> CUDA_VISIBLE_DEVICES=2 (jobid 3)
> CUDA_VISIBLE_DEVICES=3 (jobid 4)
> And so on (but not necessarily in 0,1,2,3 order).
>
> A single srun submission would still only use 1 GPU (device 0), as before
> and as expected. This seems like a step in the right direction, since
> multiple devices were used, but it is not quite what we want.
>
> According to
> https://slurm.schedmd.com/archive/slurm-16.05.7/gres.html :
>
> *"By default, a job step will be allocated all of the generic resources
> allocated to the job. [Test #2]*
> *If desired, the job step may explicitly specify a different generic
> resource count than the job." [Test #3]*
>
> To run Test #3 non-interactively, should we look into creating an sbatch
> script (with multiple sruns) instead of salloc?
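> Roughly, we have in mind something like the following untested sketch,
> mirroring Test #3 (the ./sleepCUDA.py path and the choice of four steps
> are just placeholders on our part):
>
> #!/bin/bash
> #SBATCH -n 12
> #SBATCH -c 2
> #SBATCH --gres=gpu:4
>
> # Each step requests 1 of the job's 4 GPUs; the steps run concurrently
> # in the background.
> srun --gres=gpu:1 ./sleepCUDA.py &
> srun --gres=gpu:1 ./sleepCUDA.py &
> srun --gres=gpu:1 ./sleepCUDA.py &
> srun --gres=gpu:1 ./sleepCUDA.py &
>
> # Wait for all background steps to finish before the job exits.
> wait
>
> Is that the intended pattern, or is there a cleaner way?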
>
> *OS:* CentOS 7
> *Slurm version:* 16.05.6
>
> *gres.conf*
> Name=gpu File=/dev/nvidia0
> Name=gpu File=/dev/nvidia1
> Name=gpu File=/dev/nvidia2
> Name=gpu File=/dev/nvidia3
>
> *slurm.conf (truncated/partial/simplified)*
> NodeName=node1 Gres=gpu:4
> NodeName=node2 Gres=gpu:4
> NodeName=node3 Gres=gpu:4
> NodeName=node4 Gres=gpu:4
> GresTypes=gpu
>
> No cgroup.conf
>
> Posting the actual .conf files is not practical due to firewalls.
>
> Any advice will be greatly appreciated!
>
> Thank you!
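
P.S. One more thing we have wondered about but not yet tested: would binding each GPU to a CPU range in gres.conf affect how devices are handed out to tasks? Something like the following, where the CPU ranges are only guesses for our nodes, not our real topology:

Name=gpu File=/dev/nvidia0 CPUs=0-7
Name=gpu File=/dev/nvidia1 CPUs=8-15
Name=gpu File=/dev/nvidia2 CPUs=16-23
Name=gpu File=/dev/nvidia3 CPUs=24-31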