Minimal example w/ srun:

[bmooreii@gpu-interactive ~]$ salloc --gres=gpu:4 -n4
salloc: Granted job allocation 8868
salloc: Waiting for resource configuration
salloc: Nodes gpu-stage08 are ready for job
[bmooreii@gpu-interactive ~]$ cat gres_test.sh
#!/usr/bin/env bash
srun --gres=gpu:1 -n1 --exclusive bash get_cuda_vis.sh &
srun --gres=gpu:1 -n1 --exclusive bash get_cuda_vis.sh &
srun --gres=gpu:1 -n1 --exclusive bash get_cuda_vis.sh &
srun --gres=gpu:1 -n1 --exclusive bash get_cuda_vis.sh &
wait
[bmooreii@gpu-interactive ~]$ cat get_cuda_vis.sh
#!/usr/bin/env bash
echo $CUDA_VISIBLE_DEVICES
[bmooreii@gpu-interactive ~]$ bash gres_test.sh
2
0
3
1
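(Editor's note: a minimal sketch of the "index % NUM_GPUS" wrapper approach referenced in the quoted reply further down. The script name, the NUM_GPUS value, and ./my_app are illustrative, not from the thread.)

#!/usr/bin/env bash
# gpu_wrap.sh (illustrative name): run a command pinned to one GPU,
# chosen as the caller-supplied task index modulo the GPU count.
NUM_GPUS=4                  # assumes the 4-GPU nodes described in this thread
INDEX=$1; shift             # first argument is the task index
export CUDA_VISIBLE_DEVICES=$(( INDEX % NUM_GPUS ))
exec "$@"                   # run the remaining arguments as the real command

Looping over 12 tasks, e.g. for i in $(seq 0 11); do bash gpu_wrap.sh $i ./my_app & done; wait, would spread them over devices 0-3. Tasks launched by srun could instead read $SLURM_LOCALID (the node-local task id that srun exports) as the index, which is one way to apply the same idea to MPI ranks.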
On Sat, Sep 2, 2017 at 1:22 PM, charlie hemlock <charlieheml...@gmail.com> wrote:
> Barry,
> Thank you so much for the reply!
> I'm afraid I need more clarification on this comment:
>
> "CUDA_VISIBLE_DEVICES should only ever contain 1 integer between 0 and 3
> if you have 4 GPUs."
>
> We only ever get CUDA_VISIBLE_DEVICES = *0* (12 times), and devices 1, 2, 3
> are *never* used.
>
> Eventually we want to be able to use MPI such that each rank/task can
> use 1 gpu, but the job can spread the tasks/ranks among the 4 gpus.
> Currently it appears we are limited to device 0.
>
> *In an MPI context,* I'm not certain about the wrapper-based method
> provided in the link. I'll need to consult with the developer.
>
> Thanks again!
> -C
>
> On Sat, Sep 2, 2017 at 10:49 AM, Barry Moore <moore0...@gmail.com> wrote:
>
>> Charlie,
>>
>>> % salloc -n 12 -c 2 --gres=gpu:1
>>> % srun env | grep CUDA
>>> CUDA_VISIBLE_DEVICES=0
>>> (12 times)
>>> *Is this expected behavior if we have more than 1 gpu available (4
>>> total) for the 12 tasks?*
>>
>> This is absolutely expected. You only ask for 1 GPU. CUDA_VISIBLE_DEVICES
>> should only ever contain 1 integer between 0 and 3 if you have 4 GPUs.
>>
>> This comment might help you:
>> https://bugs.schedmd.com/show_bug.cgi?id=2626#c3
>>
>> Basically, loop over the tasks you want to run with an index, take the
>> index % NUM_GPUS, and use a wrapper like the one in that comment.
>>
>> - Barry
>>
>> On Fri, Sep 1, 2017 at 1:29 PM, charlie hemlock <charlieheml...@gmail.com> wrote:
>>
>>> Hello,
>>> Can the slurm forum help with these questions, or should we seek help
>>> elsewhere?
>>>
>>> We need help with salloc gpu allocation. Hopefully this clarifies things:
>>>
>>> % salloc -n 12 -c 2 --gres=gpu:1
>>> % srun env | grep CUDA
>>> CUDA_VISIBLE_DEVICES=0
>>> (12 times)
>>>
>>> *Is this expected behavior if we have more than 1 gpu available (4
>>> total) for the 12 tasks?*
>>>
>>> We desire different behavior. *Is there a way to specify an
>>> salloc+srun to get:*
>>>
>>> CUDA_VISIBLE_DEVICES=0
>>> CUDA_VISIBLE_DEVICES=1
>>> CUDA_VISIBLE_DEVICES=2
>>> CUDA_VISIBLE_DEVICES=3
>>> And so on... (12 total print statements)?
>>>
>>> such that each task gets 1 gpu, but overall gpu usage is spread out
>>> among the 4 available devices (not one where every task gets device 0).
>>>
>>> That way each task is not waiting on device 0 to free up from other
>>> tasks, as is currently the case.
>>>
>>> What are we missing or misunderstanding?
>>>
>>> - salloc / srun parameter?
>>> - slurm.conf or gres.conf setting?
>>>
>>> Thank you!
>>>
>>> On Tue, Aug 29, 2017 at 12:27 PM, charlie hemlock <charlieheml...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> We're looking for any advice on an salloc/srun setup that uses 1 gpu/task
>>>> but where the job makes use of all available gpus.
>>>>
>>>> *Test #1:*
>>>>
>>>> We desire an salloc and srun such that each task gets 1 GPU, but the
>>>> GPU usage for the job is spread out among the 4 available devices. See
>>>> gres.conf below.
>>>>
>>>> % salloc -n 12 -c 2 --gres=gpu:1
>>>> % srun env | grep CUDA
>>>> CUDA_VISIBLE_DEVICES=0
>>>> (12 times)
>>>>
>>>> Where we desire:
>>>>
>>>> CUDA_VISIBLE_DEVICES=0
>>>> CUDA_VISIBLE_DEVICES=1
>>>> CUDA_VISIBLE_DEVICES=2
>>>> CUDA_VISIBLE_DEVICES=3
>>>> And so on (12 times), such that each task still gets 1 gpu, but usage
>>>> is spread out among the 4 available devices (see gres.conf below), not
>>>> all on device 0. That way each task is not waiting on device 0 to free
>>>> up, as is currently the case.
>>>>
>>>> What are we missing or misunderstanding?
>>>>
>>>> - salloc / srun parameter?
>>>> - slurm.conf or gres.conf setting?
>>>>
>>>> The additional tests below also illustrate the current behavior:
>>>>
>>>> *Test #2*
>>>>
>>>> Here we believe each srun task will need all 4 gpus.
>>>>
>>>> % salloc -n 12 -c 2 --gres=gpu:4
>>>> % srun env | grep CUDA
>>>> CUDA_VISIBLE_DEVICES=0,1,2,3
>>>> (12 times)
>>>>
>>>> This matches expectation.
>>>>
>>>> *Test #3*
>>>>
>>>> Another test, where I submit multiple sruns in succession. Here we use
>>>> a simple sleepCUDA.py script, which sleeps a few seconds and then
>>>> prints $CUDA_VISIBLE_DEVICES.
>>>>
>>>> % salloc -n 12 -c 2 --gres=gpu:4
>>>> % srun --gres=gpu:1 sleepCUDA.py &
>>>> % srun --gres=gpu:1 sleepCUDA.py &
>>>> % srun --gres=gpu:1 sleepCUDA.py &
>>>> % srun --gres=gpu:1 sleepCUDA.py &
>>>>
>>>> Result:
>>>>
>>>> CUDA_VISIBLE_DEVICES=0 (jobid 1)
>>>> CUDA_VISIBLE_DEVICES=1 (jobid 2)
>>>> CUDA_VISIBLE_DEVICES=2 (jobid 3)
>>>> CUDA_VISIBLE_DEVICES=3 (jobid 4)
>>>> And so on (though not necessarily in 0,1,2,3 order).
>>>>
>>>> A single srun submission still uses only 1 gpu (device 0), as before
>>>> and as expected. This seems like a step in the right direction, since
>>>> multiple devices were used, but it is not quite what we want.
>>>>
>>>> And according to
>>>> https://slurm.schedmd.com/archive/slurm-16.05.7/gres.html:
>>>>
>>>> *"By default, a job step will be allocated all of the generic resources
>>>> allocated to the job. [Test #2]*
>>>> *If desired, the job step may explicitly specify a different generic
>>>> resource count than the job. [Test #3]"*
>>>>
>>>> To run Test #3 non-interactively, should we look into creating an
>>>> sbatch script (with multiple sruns) instead of salloc?
>>>>
>>>> *OS:* CentOS 7
>>>> *Slurm version:* 16.05.6
>>>>
>>>> *gres.conf*
>>>> Name=gpu File=/dev/nvidia0
>>>> Name=gpu File=/dev/nvidia1
>>>> Name=gpu File=/dev/nvidia2
>>>> Name=gpu File=/dev/nvidia3
>>>>
>>>> *slurm.conf (truncated/partial/simplified)*
>>>> NodeName=node1 Gres=gpu:4
>>>> NodeName=node2 Gres=gpu:4
>>>> NodeName=node3 Gres=gpu:4
>>>> NodeName=node4 Gres=gpu:4
>>>> GresTypes=gpu
>>>>
>>>> No cgroup.conf.
>>>>
>>>> Posting the actual .conf files is not practical due to firewalls.
>>>>
>>>> Any advice will be greatly appreciated!
>>>>
>>>> Thank you!
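(Editor's note: on the sbatch question in the message above, a minimal sketch of a batch-script version of Test #3. The #SBATCH values mirror the tests in the thread, sleepCUDA.py is the poster's own script, and the per-step flags follow the working example at the top of the thread; treat it as a sketch, not a verified configuration.)

#!/usr/bin/env bash
#SBATCH -n 12
#SBATCH -c 2
#SBATCH --gres=gpu:4
# Each backgrounded step requests 1 of the job's 4 GPUs; steps beyond the
# 4 devices should simply wait for one to free up, and "wait" keeps the
# job alive until every step has finished. $i is only a repeat counter.
for i in $(seq 1 12); do
    srun --gres=gpu:1 -n1 --exclusive sleepCUDA.py &
done
wait

Submitting this with sbatch is the non-interactive equivalent of typing the sruns inside an salloc session.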
>>
>> --
>> Barry E Moore II, PhD
>> E-mail: bmoor...@pitt.edu
>>
>> Assistant Research Professor
>> Center for Simulation and Modeling
>> University of Pittsburgh
>> Pittsburgh, PA 15260

--
Barry E Moore II, PhD
E-mail: bmoor...@pitt.edu

Assistant Research Professor
Center for Simulation and Modeling
University of Pittsburgh
Pittsburgh, PA 15260