FWIW, you may be interested in my Wiki on upgrading Slurm:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
You should also read the pages on upgrading in the presentation
"Technical: Field Notes From A MadMan" by Tim Wickberg (SchedMD), from
last month's Slurm User Group meeting:
https://slurm.schedmd.com/publications.html
/Ole
On 10/21/19 3:52 AM, Tony Racho wrote:
Hi:
We are planning to upgrade our Slurm cluster; however, we plan on NOT
doing it in one go.
We are on 18.08.7 at the moment (db, controller, clients).
We’d like to do it in a phased approach.
Stop communication between controller and slurmdbd while updating
slurmdbd to 19.05.X.
Concurrently, we will update our primary controller to 19.05.X while the
back-up controller takes over the primary’s chores (and then the
back-up controller will also be upgraded to 19.05.X).
Once the primary controller has been updated to 19.05.X, it will take
back control of the cluster, but the clients will still be on 18.08.7.
Will there be any issues with this set-up? If this works, we will
choose a subset of clients and upgrade them to 19.05.X while the
others stay on 18.08.7, until all the clients have been upgraded to
19.05.X.
My question is: will the process/set-up above work? Will the clients
still be able to communicate with the controller without any unintended
effects or issues? Has anyone done this before?
Once all the controllers and clients are upgraded to 19.05.X, resume
communication between the controllers and the slurmdbd.
While doing the upgrade, the following scenario will take place:
slurmdbd    - 19.05.X (but not communicating with the controllers)
slurmctld   - 19.05.X
{slurmd..}  - 18.08.7
{slurmd..}  - 19.05.X
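As a rough sanity check, the mixed-version scenario above can be compared
against the compatibility rule in the Slurm upgrade documentation as I
understand it for these releases: a daemon accepts RPCs from its own major
release and the two before it, with slurmdbd at least as new as slurmctld
and slurmctld at least as new as every slurmd. The sketch below is only an
illustration. The names RELEASES, major_release, accepts and check_plan are
hypothetical helpers and not part of Slurm, and the release list is
hard-coded by hand.

```python
# Hypothetical sanity check for a phased Slurm upgrade plan (not part of Slurm).
# Assumption: a newer daemon accepts RPCs from its own major release and the
# two major releases before it.

# Major releases in chronological order, hard-coded for this illustration.
RELEASES = ["17.02", "17.11", "18.08", "19.05"]


def major_release(version):
    """Return the major release of a version string, e.g. '18.08.7' -> '18.08'."""
    return ".".join(version.split(".")[:2])


def accepts(newer, older, window=2):
    """True if a daemon at `newer` should accept a peer at `older`.

    Raises ValueError if either release is missing from RELEASES.
    """
    gap = RELEASES.index(major_release(newer)) - RELEASES.index(major_release(older))
    return 0 <= gap <= window


def check_plan(slurmdbd, slurmctld, slurmd_versions):
    """Return a list of problems found in a mixed-version plan (empty if none)."""
    problems = []
    # slurmdbd must be the same release as slurmctld or newer, within the window.
    if not accepts(slurmdbd, slurmctld):
        problems.append(f"slurmdbd {slurmdbd} cannot serve slurmctld {slurmctld}")
    # slurmctld must be the same release as each slurmd or newer, within the window.
    for version in slurmd_versions:
        if not accepts(slurmctld, version):
            problems.append(f"slurmctld {slurmctld} cannot serve slurmd {version}")
    return problems


if __name__ == "__main__":
    # The scenario from this message: new slurmdbd and slurmctld, mixed slurmd.
    print(check_plan("19.05.X", "19.05.X", ["18.08.7", "19.05.X"]))
```

Under that assumption the check finds no version conflict in the scenario
listed above. It says nothing about the planned pause in communication
between slurmctld and slurmdbd, which needs to be assessed separately.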
Thanks all for your feedback/comments.