On 02/05/18 10:15, R. Paul Wiegand wrote:

> Yes, I am sure they are all the same. Typically, I just scontrol reconfig; however, I have also tried restarting all daemons.

Understood. Any diagnostics in the slurmd logs when trying to start
a GPU job on the node?

> We are moving to 7.4 in a few weeks during our downtime. We had a
> QDR -> OFED version constraint -> Lustre client version constraint
> issue that delayed our upgrade.

I feel your pain. BTW, RHEL 7.5 is out now, so you'll need that if
you need current security fixes.

> Should I just wait and test after the upgrade?

Well, 17.11.6 will be out by then, and it will include a fix for a
deadlock that some sites hit occasionally, so that will be worth
throwing into the mix too. Do read the RELEASE_NOTES carefully though,
especially if you're using slurmdbd!

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
