Charlie,

> % salloc -n 12 -c 2 --gres=gpu:1
> % srun  env | grep CUDA
> CUDA_VISIBLE_DEVICES=0
> (12 times)
> *Is this expected behavior if we have more than 1 gpu available (4 total)
> for the 12 tasks?*


This is absolutely expected. You only asked for 1 GPU, so CUDA_VISIBLE_DEVICES
should only ever contain a single integer between 0 and 3 if you have 4 GPUs.
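
If you want all four devices available to the job, so that different steps can
land on different GPUs, the job itself has to request them, e.g. something like:

% salloc -n 12 -c 2 --gres=gpu:4

which is essentially what your Test #2 and Test #3 below already do.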

This comment might help you:
https://bugs.schedmd.com/show_bug.cgi?id=2626#c3

Basically, loop over the tasks you want to run with an index, take
index % NUM_GPUS as the GPU for each task, and set CUDA_VISIBLE_DEVICES in a
wrapper script like the one in that comment.
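
For example, a rough sketch of the kind of wrapper that comment describes (not
the exact script from the bug report; the wrapper name is made up, and it
assumes 4 GPUs per node and no cgroup device enforcement, which matches your
setup):

#!/bin/bash
# gpu_wrapper.sh - pin this task to one GPU chosen from an index argument
NUM_GPUS=4                                  # GPUs per node on your cluster
export CUDA_VISIBLE_DEVICES=$(( $1 % NUM_GPUS ))
shift
exec "$@"                                   # run the real command on that GPU

launched with something like:

% for i in $(seq 0 11); do srun -n 1 ./gpu_wrapper.sh $i ./sleepCUDA.py & done; wait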

- Barry


On Fri, Sep 1, 2017 at 1:29 PM, charlie hemlock <charlieheml...@gmail.com>
wrote:

> Hello,
> Can the slurm forum help with these questions, or should we seek help
> elsewhere?
>
> We need help with salloc GPU allocation.  Hopefully this clarifies things.
> Given:
>
> % salloc -n 12 -c 2 --gres=gpu:1
> % srun  env | grep CUDA
> CUDA_VISIBLE_DEVICES=0
> (12 times)
>
> *Is this expected behavior if we have more than 1 gpu available (4 total)
> for the 12 tasks?*
>
> We desire different behavior.  *Is there a way to specify an salloc+srun
> to get:*
>
> CUDA_VISIBLE_DEVICES=0
> CUDA_VISIBLE_DEVICES=1
> CUDA_VISIBLE_DEVICES=2
> CUDA_VISIBLE_DEVICES=3
> And so on... (12 total print statements)?
>
> such that each task gets 1 GPU, but overall GPU usage is spread out among
> the 4 available devices (rather than every task getting device=0).
>
> That way each task is not waiting on device 0 to free up from other tasks,
> as is currently the case.
>
> What are we missing or misunderstanding?
>
>    - salloc / srun parameter?
>    - slurm.conf or gres.conf setting?
>
> Thank you!
>
>
> On Tue, Aug 29, 2017 at 12:27 PM, charlie hemlock <
> charlieheml...@gmail.com> wrote:
>
>> Hello,
>>
>> We're looking for any advice on an salloc/srun setup that uses 1 GPU per
>> task but where the job makes use of all available GPUs.
>>
>>
>> *Test #1:*
>>
>> We desire an salloc and srun such that each task gets 1 GPU, but the GPU
>> usage for the job is spread out among 4 available devices.  See gres.conf
>> below.
>>
>>
>>
>> % salloc -n 12 -c 2 --gres=gpu:1
>>
>>
>>
>> % srun  env | grep CUDA
>>
>> CUDA_VISIBLE_DEVICES=0
>>
>> (12 times)
>>
>>
>>
>> Where we desire:
>>
>> CUDA_VISIBLE_DEVICES=0
>>
>> CUDA_VISIBLE_DEVICES=1
>>
>> CUDA_VISIBLE_DEVICES=2
>>
>> CUDA_VISIBLE_DEVICES=3
>>
>> And so on (12 times), such that each task still gets 1 GPU, but usage is
>> spread out among the 4 available devices (see gres.conf below), rather than
>> all tasks landing on device=0.
>>
>> That way each task is not waiting on device 0 to free up, as is currently
>> the case.
>>
>>
>> What are we missing or misunderstanding?
>>
>>    - salloc / srun parameter?
>>    - slurm.conf or gres.conf setting?
>>
>>
>>
>> Also see the additional tests below that illustrate the current behavior:
>>
>>
>>
>> *Test #2*
>>
>> Here we believe each srun task will get all 4 GPUs.
>>
>> % salloc -n 12 -c 2 --gres=gpu:4
>>
>> %  srun env | grep CUDA
>>
>> CUDA_VISIBLE_DEVICES=0,1,2,3
>>
>> (12 times)
>>
>>
>>
>> This matches expectation.
>>
>>
>>
>>
>>
>> *Test #3*
>>
>> Another test, where I submit multiple sruns in succession:
>>
>> Here we use a simple sleepCUDA.py script, which sleeps a few seconds and
>> then prints $CUDA_VISIBLE_DEVICES.
>>
>>
>>
>> % salloc -n 12 -c 2 --gres=gpu:4
>>
>> % srun --gres=gpu:1 sleepCUDA.py &
>>
>> % srun --gres=gpu:1 sleepCUDA.py &
>>
>> % srun --gres=gpu:1 sleepCUDA.py &
>>
>> % srun --gres=gpu:1 sleepCUDA.py &
>>
>>
>>
>> Result:
>>
>> CUDA_VISIBLE_DEVICES=0  (jobid 1)
>>
>> CUDA_VISIBLE_DEVICES=1  (jobid 2)
>>
>> CUDA_VISIBLE_DEVICES=2  (jobid 3)
>>
>> CUDA_VISIBLE_DEVICES=3  (jobid 4)
>>
>> And so on (but not necessarily in 0,1,2,3 order)
>>
>> Though a single srun submission would only use 1 GPU (device=0), as before
>> and as expected.
>>
>> This seems like a step in the right direction since multiple devices were
>> used, but it is not quite what we want.
>>
>>
>> And according to: https://slurm.schedmd.com/archive/slurm-16.05.7/gres.html
>>
>> *“By default, a job step will be allocated all of the generic resources
>> allocated to the job. [Test #2]*
>>
>> *If desired, the job step may explicitly specify a different generic
>> resource count than the job. [Test #3]”*
>>
>>
>>
>> To run Test #3 non-interactively, should we look into creating an sbatch
>> script (with multiple sruns) instead of salloc?
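>>
>> Something like this, perhaps (just a rough sketch of what we mean,
>> mirroring Test #3 with our sleepCUDA.py test script):
>>
>> #!/bin/bash
>> #SBATCH -n 12
>> #SBATCH -c 2
>> #SBATCH --gres=gpu:4
>>
>> srun --gres=gpu:1 ./sleepCUDA.py &
>> srun --gres=gpu:1 ./sleepCUDA.py &
>> srun --gres=gpu:1 ./sleepCUDA.py &
>> srun --gres=gpu:1 ./sleepCUDA.py &
>> wait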
>>
>>
>>
>>
>> *OS: *CentOS 7
>>
>> *Slurm version: *16.05.6
>>
>>
>> *gres.conf*
>>
>> Name=gpu File=/dev/nvidia0
>>
>> Name=gpu File=/dev/nvidia1
>>
>> Name=gpu File=/dev/nvidia2
>>
>> Name=gpu File=/dev/nvidia3
>>
>>
>>
>> *slurm.conf (truncated/partial/simplified)*
>>
>> NodeName=node1 Gres=gpu:4
>>
>> NodeName=node2 Gres=gpu:4
>>
>> NodeName=node3 Gres=gpu:4
>>
>> NodeName=node4 Gres=gpu:4
>>
>> GresTypes=gpu
>>
>>
>>
>> No cgroup.conf
>>
>>
>>
>> Posting the actual .conf files is not practical due to firewalls.
>>
>>
>> Any advice will be greatly appreciated!
>>
>> Thank you!
>>
>
>


-- 
Barry E Moore II, PhD
E-mail: bmoor...@pitt.edu

Assistant Research Professor
Center for Simulation and Modeling
University of Pittsburgh
Pittsburgh, PA 15260
