On Tue, 11 Sep 2012, Andrej N. Gritsenko wrote:


   Hello!

Evren Yurtesen IB wrote on Monday, 10 September, at 5:10:
I am trying to use scontrol reboot_nodes. I have configured the
RebootProgram as:

RebootProgram=/sbin/reboot

However, after the reboot, some nodes (seemingly at random) do not resume
and get stuck in an unexpected restart/reboot state. Is this normal?

Would you tell me the exact state (scontrol show node XXX) of those
nodes in SLURM, and whether they have been rebooted as they should? Has
slurmd restarted on them after the reboot?


I don't remember exactly what the state was. I think the node showed as down and the reason was 'unexpected reboot'. I should have copied and pasted the message; next time, perhaps.

SLURM was running on the nodes; I only had to 'resume' them using scontrol to return them to normal operation.
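
For reference, the manual recovery can be sketched roughly as below. This is only an illustration of what I did by hand, not a fix for the underlying problem. The function name resume_nodes, the DRY_RUN switch, and the node names are made up for the example; the only SLURM commands involved are "scontrol show node" (to inspect the state and reason) and "scontrol update NodeName=... State=RESUME".

```shell
#!/bin/sh
# Sketch: resume nodes that came back from a reboot but stayed DOWN.
# resume_nodes and DRY_RUN are hypothetical names, not part of SLURM.
# With DRY_RUN=1 the commands are only printed, not executed.
resume_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scontrol update NodeName=$node State=RESUME"
        else
            # Record the state and reason first, so they can be reported later.
            scontrol show node "$node"
            scontrol update NodeName="$node" State=RESUME
        fi
    done
}
```

For example, "DRY_RUN=1; resume_nodes node01 node02" prints the two update commands without touching the cluster; without DRY_RUN it would show each node's state and then resume it.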
