Hello,

I am trying to run multiple jobs simultaneously on a single node that has
two GPUs, but so far without success. Both jobs are accepted into the queue,
but only one runs at a time while the other stays pending. My SLURM
configuration and the commands I use are below. Any assistance is greatly
appreciated.

# slurm.conf
GresTypes=gpu,tfcpu

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

NodeName=Hulk NodeAddr=10.0.1.10 Gres=gpu:2,tfcpu:1 CPUs=12 \
    SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=15886 \
    State=UNKNOWN Weight=1

# gres.conf
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
Name=tfcpu

# invocation
sbatch --gres=gpu:1 myjob.bash
sbatch --gres=gpu:1 myjob.bash
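
For anyone reproducing this, the scheduler's stated reason for holding the second job can be listed with the standard SLURM client tools. The following is only a sketch: the function name show_pending_reason is made up for illustration, and the output columns are just one possible selection.

```shell
# Hypothetical helper (not part of SLURM): list the current user's jobs
# with their state, the scheduler's reason, and the requested GRES.
show_pending_reason() {
  # Bail out early if the SLURM client tools are not installed here.
  if ! command -v squeue >/dev/null 2>&1; then
    echo "squeue not found; run this on a host with SLURM client tools" >&2
    return 127
  fi
  # %i = job ID, %T = job state, %r = reason, %b = generic resources requested
  squeue -u "$(id -un)" -o "%.10i %.10T %.20r %b"
}
```

Calling show_pending_reason after the two sbatch commands prints one line per job. Running `scontrol show node Hulk` alongside it shows what the running job currently holds on the node.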

-- 
Marc Rollins
Software Engineer
o: (509) 863-9228
<https://gravityjack.com>
