Hello Daniel,
Do /dev/nvidia[0-1] exist on the machines?
If not, see
http://docs.nvidia.com/cuda/cuda-installation-guide-linux/
where there is a shell script that creates the device nodes for you. They are
not always created during startup, especially if there is no X server on the
system.
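
As a rough illustration of what such a script does, here is a minimal sketch.
It is not the script from the NVIDIA guide. The function name
nvidia_node_cmds is hypothetical, and the sketch assumes the usual NVIDIA
character-device numbering (major 195, one minor per GPU, /dev/nvidiactl at
minor 255). It only prints the commands, so you can inspect them before
running anything as root.

```shell
#!/bin/sh
# Sketch only: print the mknod commands needed for a given number of GPUs.
# Assumption: NVIDIA character devices use major number 195, GPU i uses
# minor i, and the control device /dev/nvidiactl uses minor 255.
# nvidia_node_cmds is a hypothetical helper name, not part of any NVIDIA tool.

nvidia_node_cmds() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        # Create the node only if it does not exist yet.
        echo "[ -e /dev/nvidia$i ] || mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    echo "[ -e /dev/nvidiactl ] || mknod -m 666 /dev/nvidiactl c 195 255"
}
```

For two GPUs, "nvidia_node_cmds 2" prints three lines (nvidia0, nvidia1,
nvidiactl). After loading the kernel module (modprobe nvidia), the output
could be piped to sh as root, for example from a boot script that runs before
slurmd starts.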

Kind regards,
Christian

On 09.02.2017 at 12:50, Daniel Ruiz Molina wrote:
>
> Hi,
>
> In my GPU cluster, the slurmd daemon doesn't start correctly because, when
> the daemon starts, it doesn't find the /dev/nvidia[0-1] devices (mapped in
> gres.conf). To solve this problem, I have added the attribute
> "ExecStartPre=@/usr/bin/nvidia-smi >/dev/null" to the service file, and now
> the daemon starts correctly. However, could anybody copy-paste their
> slurmd service file from a GPU cluster? I suppose there must be a better
> solution than mine.
>
> Thanks.

-- 
Dr. Christian Goll
HITS gGmbH
Schloss-Wolfsbrunnenweg 35
69118 Heidelberg
Germany
Phone: +49 6221 533 230
Fax: +49 6221 533 230
________________________________________________
Amtsgericht Mannheim / HRB 337446
Managing Director: Dr. Gesa Schönberger
