On Thu, Aug 11, 2016 at 12:46 PM, Ryan Novosielski <novos...@rutgers.edu> wrote:
> I’ll try adding the Gres debugging, but is there some way to figure out what
> this alleged device “819275” is (this number will change with each job).

Weird, indeed. /dev/nv* devices should be 195:x, and slurmd should log
something like this:

    Allowing access to device c 195:0 rwm
    Not allowing access to device c 195:1 rwm
    Not allowing access to device c 195:2 rwm
    Not allowing access to device c 195:3 rwm

The fact that it's "actively" allowing access to bogus device 819275
makes me think it considers it as the actual GPU device. Except it got
the wrong major for it.

What does "ls -al /dev/nv*" look like on the GPU node? And which
version of Slurm is it?

Cheers,
--
Kilian
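
For reference, the major:minor pairs that "ls -al /dev/nv*" shows can also be read programmatically. Below is a minimal Python sketch, assuming a Linux node; the helper name char_device_numbers is made up for illustration and is not part of Slurm. It only reports what the device nodes say and does not explain where 819275 comes from.

```python
import glob
import os
import stat


def char_device_numbers(paths):
    """Return {path: (major, minor)} for each path that is a character device."""
    result = {}
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            # Missing or unreadable node: skip it rather than fail.
            continue
        if stat.S_ISCHR(st.st_mode):
            # st_rdev holds the device number of the node itself.
            result[path] = (os.major(st.st_rdev), os.minor(st.st_rdev))
    return result


if __name__ == "__main__":
    found = char_device_numbers(glob.glob("/dev/nv*"))
    for path, (major, minor) in sorted(found.items()):
        # Same "c major:minor" form as the slurmd log lines quoted above.
        print(f"{path}: c {major}:{minor}")
```

Run on the GPU node, each NVIDIA device node should print with major 195; anything else would point at the device nodes rather than at Slurm.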