On Wednesday, 2 May 2018 11:04:34 PM AEST R. Paul Wiegand wrote:

> When I set "--gres=gpu:1", the slurmd log does have encouraging lines such
> as:
>
> [2018-05-02T08:47:04.916] [203.0] debug: Allowing access to device
> /dev/nvidia0 for job
> [2018-05-02T08:47:04.916] [203.0] debug: Not allowing access to
> device /dev/nvidia1 for job
>
> However, I can still "see" both devices from nvidia-smi, and I can
> still access both if I manually unset CUDA_VISIBLE_DEVICES.

The only thing I can think of is a bug that's been fixed since 17.11.0 (as
I know it works for us with 17.11.5) or a kernel bug (or missing device
cgroups).

Sorry I can't be more helpful!

All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
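
One way to check whether the device cgroup is actually being enforced, independent of CUDA_VISIBLE_DEVICES, is to try opening the device nodes directly from inside the job step. The following is only a rough sketch: the helper names `probe_device` and `probe_devices` are made up for illustration, and the two device paths are simply the ones from the slurmd log quoted above. If the constraint is working, I would expect the device that slurmd reports as "Not allowing access" to come back as "denied"; if both come back as "accessible", the restriction is not being applied.

```python
import errno
import os


def probe_device(path):
    """Try to open a device node and report what happened.

    Returns "accessible", "denied", "missing", or "error: <errno name>".
    """
    try:
        # Open read/write, which is roughly how a GPU client would open it.
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        # Covers both EACCES (file mode) and EPERM (e.g. device cgroup denial).
        return "denied"
    except OSError as exc:
        return "error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "accessible"


def probe_devices(paths):
    """Probe several device nodes and return a {path: status} mapping."""
    return {path: probe_device(path) for path in paths}


if __name__ == "__main__":
    # Device paths taken from the slurmd log lines quoted above.
    for path, status in probe_devices(["/dev/nvidia0", "/dev/nvidia1"]).items():
        print(path, status)
```

Run it inside the allocation (for example via srun with the same "--gres=gpu:1" request) so that it inherits the job step's cgroup. Running it from a plain login shell on the node only tells you about file permissions.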