Re: [slurm-users] Down nodes

2021-07-30 Thread Brian Andrus
That 'not responding' is the issue, and it usually means one of two things:
1) slurmd is not running on the node
2) something on the network is blocking communication between the node and the master (firewall, SELinux, congestion, bad NIC, routes, etc.)
Brian Andrus
On 7/30/2021 3:51 PM, Soichi
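
One way to tell the two causes apart is to test, from the controller host, whether the node's slurmd port accepts a TCP connection. The following is a minimal sketch, not something from the thread: it assumes slurmd listens on Slurm's default SlurmdPort of 6818 (change it if slurm.conf sets another value), and the helper name `can_connect` and the hostname `node01` are hypothetical.

```python
import socket

# Assumption: Slurm's default SlurmdPort; override if slurm.conf sets a different one.
SLURMD_DEFAULT_PORT = 6818


def can_connect(host: str, port: int = SLURMD_DEFAULT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts, and DNS failures.
        return False


if __name__ == "__main__":
    # "node01" is a placeholder; substitute the name of the node reported as down.
    print(can_connect("node01"))
```

A `False` result does not by itself say which cause applies. A refused connection tends to point at slurmd not running, while a timeout is more typical of a firewall or routing problem, so checking the daemon on the node itself is still worthwhile.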

Re: [slurm-users] Down nodes

2021-07-30 Thread Soichi Hayashi
Brian, Thank you for your reply, and thanks for setting the email title. I forgot to edit it before I sent it! I am not sure how to reply to your reply, but I hope this makes it to the right place. I've updated slurm.conf to increase the controller debug level:
> SlurmctldDebug=5
I now