On Tuesday, 11 February 2020 7:27:56 AM PST Dean Schulze wrote:

> No other errors in the logs. Identical slurm.conf on all nodes and
> controller. Only the node with gpus has the gres.conf (with the single
> line Autodetect=nvml).

It might be useful to post the output of "slurmd -C" and your slurm.conf for us to see (sorry if you've done that already and I've not seen it).

You can also increase the debug level for slurmctld and slurmd in slurm.conf (we typically run with SlurmctldDebug=debug; you may want to try SlurmdDebug=debug whilst experimenting).

Best of luck,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA