On 6/9/20 12:12 PM, Steve Brasier wrote:
Hi all, looking for some advice on the process to follow when doing one of the reconfigurations which requires a slurm daemon restart (as listed in docs for "scontrol reconfigure").

When reconfiguring slurm.conf, make sure to propagate that file to all nodes first!

The scontrol manual page explains when a restart of the daemons (and not just "scontrol reconfig") is required:

reconfigure
Instruct all Slurm daemons to re-read the configuration file. This command does not restart the daemons. This mechanism would be used to modify configuration parameters (Epilog, Prolog, SlurmctldLogFile, SlurmdLogFile, etc.). The Slurm controller (slurmctld) forwards the request to all other daemons (slurmd daemon on each compute node). Running jobs continue execution. Most configuration parameters can be changed by just running this command, however, Slurm daemons should be shut down and restarted if any of these parameters are to be changed: AuthType, ControlMach, PluginDir, StateSaveLocation, SlurmctldHost, SlurmctldPort, or SlurmdPort. The slurmctld daemon and all slurmd daemons must be restarted if nodes are added to or removed from the cluster.
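
To see whether an edited slurm.conf touches any of the parameters named in that list, something like the following might help. It is only a sketch: the function name restart_params_changed is made up for this example, the parameter names are copied from the manual text above and matched as case-insensitive prefixes, and it does not detect added or removed nodes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print each restart-requiring
# parameter whose line(s) differ between an old and a new slurm.conf.
# The names come from the scontrol manual text quoted above.
restart_params_changed() {
    old=$1
    new=$2
    for p in AuthType ControlMach PluginDir StateSaveLocation \
             SlurmctldHost SlurmctldPort SlurmdPort; do
        # Collect the uncommented lines that start with this parameter name.
        a=$(grep -i "^[[:space:]]*$p" "$old" || true)
        b=$(grep -i "^[[:space:]]*$p" "$new" || true)
        [ "$a" = "$b" ] || echo "$p"
    done
}

# Example (paths are placeholders):
#   restart_params_changed /etc/slurm/slurm.conf.bak /etc/slurm/slurm.conf
```

If it prints nothing, and no nodes were added or removed, "scontrol reconfigure" should be enough according to the manual text above.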


In this situation, is there any difference in terms of preservation of slurm's state etc between using "scontrol shutdown" or running "service slurmd/slurmctld stop" on each node?

The slurmctld state is preserved in the server's StateSaveLocation:

# scontrol show config | grep StateSaveLocation
StateSaveLocation       = /var/spool/slurmctld

It is essential not to disturb that folder! Make a backup after stopping slurmctld, just in case...
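
A minimal sketch of such a backup, assuming a tar-based archive is acceptable. The function name backup_state_dir and the destination directory are made up for this example, and the systemctl lines in the comment assume a systemd-managed slurmctld:

```shell
#!/bin/sh
# Hypothetical helper: archive a state directory into a timestamped tarball
# and print the path of the archive that was written.
backup_state_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/statesave-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C makes the paths inside the archive relative to the parent directory.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (as root on the controller; the destination is a placeholder):
#   systemctl stop slurmctld
#   backup_state_dir /var/spool/slurmctld /root/slurm-backups
#   systemctl start slurmctld
```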

Is there a recommended order in which to shut down and restart daemons?

Why do you want to shut down and restart in the first place? I think you can restart any daemon if necessary, but you have to consider Slurm's timeout parameters SlurmctldTimeout and SlurmdTimeout:

# scontrol show config | grep Timeout

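If any daemon is down for longer than its timeout, things will start failing! For example, once a slurmd has not responded for SlurmdTimeout seconds, slurmctld sets that node to DOWN.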
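
To compare a planned outage against those limits, the numeric values can be pulled out of the "scontrol show config" output. This is a rough sketch: get_timeout_secs is a made-up name, and it assumes lines of the form "Name = value sec", which may differ between Slurm versions.

```shell
#!/bin/sh
# Hypothetical helper: read "scontrol show config" output on stdin and print
# the numeric value of the named timeout (nothing if the name is not found).
get_timeout_secs() {
    awk -v key="$1" -F'=' '
        {
            name = $1
            gsub(/[ \t]/, "", name)      # strip the padding around the name
        }
        name == key {
            split($2, words, " ")        # " 300 sec" becomes "300" and "sec"
            print words[1]
            exit
        }
    '
}

# Example:
#   scontrol show config | get_timeout_secs SlurmdTimeout
```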

Best regards,
Ole
