Re: [slurm-users] Rolling upgrade of compute nodes
I can confirm that what Ümit did worked for my setup as well. But as I mentioned before, if there's any doubt, try the upgrade in a test environment first.

Cheers,
Stephan

On 30.05.22 21:06, Ümit Seren wrote:
> We did a couple of major and minor SLURM upgrades without draining the
> compute nodes. Once slurmdbd and slurmctld were updated to the new major
> version, we did a package update on the compute nodes and restarted slurmd
> on them. The existing running jobs continued to run fine, and new jobs
> started on the same compute nodes by the updated slurmd daemon also worked
> fine. So, for us this worked smoothly.

--
ETH Zurich
Stephan Roth
Systems Administrator
IT Support Group (ISG) D-ITET
ETF D 104
Sternwartstrasse 7
8092 Zurich
Phone +41 44 632 30 59
stephan.r...@ee.ethz.ch
www.isg.ee.ethz.ch
Working days: Mon,Tue,Thu,Fri
Re: [slurm-users] Rolling upgrade of compute nodes
We did a couple of major and minor SLURM upgrades without draining the compute nodes. Once slurmdbd and slurmctld were updated to the new major version, we did a package update on the compute nodes and restarted slurmd on them. The existing running jobs continued to run fine, and new jobs started on the same compute nodes by the updated slurmd daemon also worked fine. So, for us this worked smoothly.

Best
Ümit

On Monday, 30. May 2022 at 20:58, Ole Holm Nielsen wrote:
> How about restarting all slurmd's at version 20.11 in one shot? No reboot
> will be required. There will be running 19.05 slurmstepd's for the running
> job steps, even though slurmd is at 20.11.
> [...]
> Question: Does anyone have bad experiences with upgrading slurmd while the
> cluster is running production?
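For anyone wanting a concrete picture, a rolling per-node update along the lines Ümit describes might look roughly like the sketch below. The node names, package names, and the use of pdsh are assumptions; adapt them to your own package manager and remote-execution tooling.

  # Assumes RPM-based nodes reachable via pdsh; host and package names are examples.
  # slurmdbd and slurmctld must already be running the new major version.
  for node in node0{01..10}; do
      # update the Slurm packages on the node
      pdsh -w "$node" 'yum -y update slurm slurm-slurmd'
      # restart slurmd; running jobs keep their old slurmstepd and keep running
      pdsh -w "$node" 'systemctl restart slurmd'
      # confirm the node registers with the new version and stays healthy
      scontrol show node "$node" | grep -E 'State=|Version='
  done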
Re: [slurm-users] Rolling upgrade of compute nodes
On 30-05-2022 19:34, Chris Samuel wrote:
> On 30/5/22 10:06 am, Chris Samuel wrote:
>
>> If you switch that symlink those jobs will pick up the 20.11 srun
>> binary and that's where you may come unstuck.
>
> Just to quickly fix that, srun talks to slurmctld (which would also be
> 20.11 for you), slurmctld will talk to the slurmd's running the job
> (which would be 19.05, so OK) but then the slurmd would try and launch a
> 20.11 slurmstepd and that is where I suspect things could come undone.

How about restarting all slurmd's at version 20.11 in one shot? No reboot will be required. There will be running 19.05 slurmstepd's for the running job steps, even though slurmd is at 20.11.

You could perhaps restart the 20.11 slurmd one partition at a time in order to see if it works correctly on a small partition of the cluster.

I think we have done this successfully when we install new RPMs on *all* compute nodes in one shot, and I'm not aware of any job crashes. Your mileage may vary depending on job types!

Question: Does anyone have bad experiences with upgrading slurmd while the cluster is running production?

/Ole
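A sketch of the partition-at-a-time variant Ole suggests, assuming pdsh-style remote execution and a small test partition named "small" (both are assumptions):

  # Restart slurmd only on the nodes of one partition first.
  NODES=$(sinfo -h -p small -o '%N')          # e.g. "node[001-016]"
  pdsh -w "$NODES" 'systemctl restart slurmd'
  # Watch node state and reported slurmd version before moving on to other partitions.
  sinfo -p small -N -o '%N %T %v'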
Re: [slurm-users] Rolling upgrade of compute nodes
On 30/5/22 10:06 am, Chris Samuel wrote:

> If you switch that symlink those jobs will pick up the 20.11 srun
> binary and that's where you may come unstuck.

Just to quickly fix that, srun talks to slurmctld (which would also be 20.11 for you), slurmctld will talk to the slurmd's running the job (which would be 19.05, so OK) but then the slurmd would try and launch a 20.11 slurmstepd and that is where I suspect things could come undone.

Sorry - hadn't had coffee when I was writing earlier. :-)

--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
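While the cluster is in this mixed-version state it can help to check which version each component actually reports. A few standard commands (the node name is an example):

  scontrol version                                        # controller / client tools, following the "default" symlink
  srun --version
  scontrol show node node001 | grep -o 'Version=[^ ]*'    # version the node's slurmd registered with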
Re: [slurm-users] Rolling upgrade of compute nodes
On 30/5/22 3:01 am, byron wrote:

> The one thing I'm unsure about is as much a Linux / NFS issue as a Slurm
> one. When I change the soft link for "default" to point to the new 20.11
> slurm install but all the compute nodes are still running the old 19.05
> version because they haven't been restarted yet, will that not cause any
> problems? Or will they still just see the same old 19.05 version of slurm
> that they are running until they are restarted?

That may cause issues. Whilst the ASAP flag to scontrol reboot guarantees no new jobs will start on the selected nodes until after they've rebooted, that doesn't (and shouldn't) stop new job steps from srun starting on them. If you switch that symlink those jobs will pick up the 20.11 srun binary and that's where you may come unstuck.

This is one of the reasons why we do everything with Slurm installed via RPM inside an image: you have a pretty straightforward A -> B transition.

If your symlink was node-local in some way (say created at boot time via some config management system before slurmd starts) then that could work around it, as the nodes would still see the appropriate slurm binaries for the running slurmd.

Best of luck!
Chris

--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
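A minimal sketch of the node-local symlink idea, assuming the versioned installs live under /opt/slurm on NFS and that a small local file records which version the node booted with (all paths are hypothetical):

  # Run at boot, before slurmd starts (e.g. from a config-management hook),
  # so the node keeps using the version its slurmd was started with.
  WANTED=$(cat /etc/slurm-version)                  # e.g. "slurm-19.05.8"
  ln -sfn "/opt/slurm/${WANTED}" /opt/slurm-local/default
  # slurmd's start script and PATH entries then reference /opt/slurm-local/default
  # instead of the shared NFS symlink that gets flipped during the upgrade.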
Re: [slurm-users] Rolling upgrade of compute nodes
Thanks for the feedback.

I've done the database dry run on a clone of our database / slurmdbd and that is all good. We have a reboot program defined.

The one thing I'm unsure about is as much a Linux / NFS issue as a Slurm one. When I change the soft link for "default" to point to the new 20.11 slurm install but all the compute nodes are still running the old 19.05 version because they haven't been restarted yet, will that not cause any problems? Or will they still just see the same old 19.05 version of slurm that they are running until they are restarted?

thanks

On Mon, May 30, 2022 at 8:18 AM Ole Holm Nielsen wrote:
> Adding to Stephan's note, it's strongly recommended to make a database
> dry-run upgrade test before upgrading the production slurmdbd. Many
> details are in
> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
> [...]
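On the symlink question, one way to see what a node is actually running versus what the shared link now points to (paths and node names are illustrative):

  readlink -f /opt/slurm/default                    # what the flipped link resolves to, e.g. .../slurm-20.11.9
  ssh node001 'ls -l /proc/$(pidof slurmd)/exe'     # the already-running slurmd still maps the 19.05 binary
  # New processes exec'd on that node (srun, newly launched slurmstepd's)
  # will follow the updated symlink, which is where version skew can bite.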
Re: [slurm-users] Rolling upgrade of compute nodes
Hi Byron,

Adding to Stephan's note, it's strongly recommended to make a database dry-run upgrade test before upgrading the production slurmdbd. Many details are in
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

If you have separate slurmdbd and slurmctld machines (recommended), the next step is to upgrade the slurmctld.

Finally you can upgrade the slurmd's while the cluster is running in production mode. Since you have Slurm on NFS, following Chris' recommendation of rebooting the nodes may be the safest approach.

After upgrading everything to 20.11, you should next upgrade to 21.08. Upgrade to the latest 22.05 should probably wait for a few minor releases.

/Ole

On 5/30/22 08:54, Stephan Roth wrote:
> If you have the means to set up a test environment to try the upgrade
> first, I recommend to do it.
> [...]
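A rough outline of such a dry-run test, assuming MariaDB/MySQL, the default slurm_acct_db database name, and a scratch test host; the linked Niflheim wiki page has the full procedure:

  # On the production database host: dump the accounting database.
  mysqldump --single-transaction slurm_acct_db > slurm_acct_db.sql
  # On a test host with the *new* Slurm version installed:
  mysql -e 'CREATE DATABASE slurm_acct_db'
  mysql slurm_acct_db < slurm_acct_db.sql
  # Run the new slurmdbd in the foreground and time the schema conversion.
  slurmdbd -D -vvv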
Re: [slurm-users] Rolling upgrade of compute nodes
Hi Byron,

If you have the means to set up a test environment to try the upgrade first, I recommend to do it.

The upgrade from 19.05 to 20.11 worked for two clusters I maintain with a similar NFS based setup, except we keep the Slurm configuration separated from the Slurm software accessible through NFS.

For updates staying between 2 major releases this should work well by restarting the Slurm daemons in the recommended order (see https://slurm.schedmd.com/SLUG19/Field_Notes_3.pdf) after switching the soft link to 20.11:

1. slurmdbd
2. slurmctld
3. individual slurmd on your nodes

To be able to revert back to 19.05 you should dump the database between stopping and starting slurmdbd, as well as backing up StateSaveLocation between stopping/restarting slurmctld.

slurmstepd's of running jobs will continue to run on 19.05 after restarting the slurmd's.

Check individual slurmd.log files for problems.

Cheers,
Stephan

On 30.05.22 00:09, byron wrote:
> I'm currently doing an upgrade from 19.05 to 20.11.
>
> All of our compute nodes have the same install of slurm NFS mounted. The
> system has been set up so that all the start scripts and configuration
> files point to the default installation, which is a soft link to the most
> recent installation of slurm.
> [...]
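Put together as one sequence, the order Stephan describes might look roughly like this on the slurmdbd/slurmctld host(s); the paths, database name, and systemd unit names are assumptions:

  # 1. slurmdbd: stop, back up the database, switch the soft link, start
  systemctl stop slurmdbd
  mysqldump --single-transaction slurm_acct_db > slurm_acct_db-pre-20.11.sql
  ln -sfn /opt/slurm/slurm-20.11.9 /opt/slurm/default
  systemctl start slurmdbd

  # 2. slurmctld: stop, back up StateSaveLocation, start on the new version
  systemctl stop slurmctld
  tar czf statesave-pre-20.11.tar.gz -C /var/spool/slurmctld .
  systemctl start slurmctld

  # 3. slurmd on the compute nodes, e.g. one partition at a time as above,
  #    then check each node's slurmd.log for problems.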
Re: [slurm-users] Rolling upgrade of compute nodes
On 5/29/22 3:09 pm, byron wrote:

> This is the first time I've done an upgrade of slurm and I had been
> hoping to do a rolling upgrade as opposed to waiting for all the jobs to
> finish on all the compute nodes and then switching across, but I don't
> see how I can do it with this setup. Does anyone have any experience of
> this?

We do rolling upgrades with:

scontrol reboot ASAP nextstate=resume reason="some-useful-reason" [list-of-nodes]

But you do need to have RebootProgram defined and an appropriate ResumeTimeout set to allow enough time for your node to reboot (and of course your system must be configured to boot into a production ready state when rebooted, including starting up slurmd).

All the best,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
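Concretely, that might look something like the following; the reboot script path, timeout value, and node list are assumptions to adjust per site:

  # slurm.conf (excerpt) - needed for "scontrol reboot" to work as intended:
  #   RebootProgram=/usr/sbin/reboot
  #   ResumeTimeout=600          # seconds a node may take to come back up

  # Drain-and-reboot these nodes as soon as their running jobs finish:
  scontrol reboot ASAP nextstate=resume reason="upgrade to 20.11" node[001-016]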
[slurm-users] Rolling upgrade of compute nodes
Hi

I'm currently doing an upgrade from 19.05 to 20.11.

All of our compute nodes have the same install of slurm NFS mounted. The system has been set up so that all the start scripts and configuration files point to the default installation, which is a soft link to the most recent installation of slurm.

This is the first time I've done an upgrade of slurm and I had been hoping to do a rolling upgrade as opposed to waiting for all the jobs to finish on all the compute nodes and then switching across, but I don't see how I can do it with this setup. Does anyone have any experience of this?

Thanks
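For readers picturing the setup being described, the shared layout might look something like this; the paths and version numbers are hypothetical:

  /opt/slurm/                      # NFS export mounted on all nodes
      slurm-19.05.8/
      slurm-20.11.9/
      default -> slurm-19.05.8     # soft link referenced by start scripts and configs
  # Upgrading means repointing "default", which is what the rest of the thread discusses.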