Re: [slurm-users] Slurm Upgrade from 17.02
When upgrading to 18.08 it is prudent to add the following lines to your /etc/my.cnf, as per
https://slurm.schedmd.com/accounting.html
https://slurm.schedmd.com/SLUG19/High_Throughput_Computing.pdf (slide #6)

    [mysqld]
    innodb_buffer_pool_size=1G
    innodb_log_file_size=64M
    innodb_lock_wait_timeout=900

If the node on which mysql is running has sufficient memory, you may want to increase innodb_buffer_pool_size beyond 1G. That's just the minimum threshold below which slurm complains. We use 8G, for example, because it fits our churn rate for {job arrival, job dispatch to run state} in RAM and our nodes have enough RAM to accommodate an 8G cache. (References on tuning are below.)

When you reset this, you will also need to remove the previous innodb caches, which are probably in /var/lib/mysql. When we did this we removed and recreated the slurm_acct_db, although that was partially motivated by the fact that this coincided with an OS and database patch upgrade and a major accounting and allocation cycle.

0. Stop slurmctld, slurmdbd.
1. Create a dump of your database. (mysqldump ...)
2. Verify that the dump is complete and valid.
3. Remove the slurm_acct_db. (mysql -e "drop database slurm_acct_db;")
4. Stop your mysql instance cleanly.
5. Check the logs. Verify that the mysql instance was stopped cleanly.
6. rm /var/lib/mysql/ib_logfile? /var/lib/mysql/ibdata1
7. Put the new lines as above into /etc/my.cnf with the log file sized appropriately.
8. Start mysql.
9. Verify it started cleanly.
10. Restart the slurm dbd manually, possibly in non-daemon mode. (slurmdbd -D -vv)
11. sacctmgr create cluster

If you want to restore the data back into the database, do it *before* step 10 so that the schema conversion can be performed. I like using multiple "-vv" so that I can see some of the messages as that conversion process proceeds.
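For step 2, one rough way to check that a dump finished is to look for the trailer comment that mysqldump writes at the end of its output when comments are enabled (the default). This is only a minimal sketch: the function name is made up for illustration, and a passing check shows that the dump was not truncated, not that it restores cleanly.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm or MySQL): returns 0 if FILE ends
# with the "-- Dump completed" trailer that mysqldump writes by default.
dump_looks_complete() {
    file=$1
    # An empty or missing file can never be a valid dump.
    if [ ! -s "$file" ]; then
        return 1
    fi
    # The trailer is normally the last line; scan a few trailing lines.
    tail -n 5 "$file" | grep -q '^-- Dump completed'
}
```

It could be called as `dump_looks_complete /path/to/dump.sql && echo "trailer found"`, where the path is whatever file you dumped to. Restoring the dump into a scratch database remains the stronger test.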
Some references on mysql innodb_buffer_pool_size tuning:
https://scalegrid.io/blog/calculating-innodb-buffer-pool-size-for-your-mysql-server/
https://mariadb.com/kb/en/innodb-system-variables/#innodb_buffer_pool_size
https://mariadb.com/kb/en/innodb-buffer-pool/
https://www.percona.com/blog/2015/06/02/80-ram-tune-innodb_buffer_pool_size/
https://dev.mysql.com/doc/refman/5.7/en/innodb-buffer-pool-resize.html

Hope this helps,
-Steve Senator

On Wed, Feb 19, 2020 at 7:12 AM Ricardo Gregorio wrote:
> hi all,
>
> I am putting together an upgrade plan for slurm on our HPC. We are currently
> running old version 17.02.11. Would you guys advise us upgrading to 18.08 or
> 19.05?
>
> I understand we will have to also upgrade the version of mariadb from 5.5 to
> 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and
> 'bug 6796' amongst other things.
>
> We would appreciate your comments/recommendations
>
> Regards,
> Ricardo Gregorio
> Research and Systems Administrator
> Operations ITS
>
> Rothamsted Research is a company limited by guarantee, registered in England
> at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175
> and a not for profit charity number 802038.
Re: [slurm-users] Slurm Upgrade from 17.02
Thank you Ole/Chris/Marcus. Your input was much appreciated.

Ole, I was (and am) basing my upgrade plan on the documentation found at the link you sent me. In fact your wiki is always my first stop when learning about or troubleshooting SLURM issues, even before the SLURM docs pages. Excellent work, well done.

Regards,
Ricardo Gregorio

-----Original Message-----
From: slurm-users On Behalf Of Ole Holm Nielsen
Sent: 19 February 2020 14:41
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Slurm Upgrade from 17.02

On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
> I am putting together an upgrade plan for slurm on our HPC. We are
> currently running old version 17.02.11. Would you guys advise us
> upgrading to 18.08 or 19.05?

You should be able to upgrade 2 Slurm major versions in one step. The 18.08 version is just about to become unsupported since 20.02 will be released shortly. We use 19.05.5.

I have collected a number of upgrading details in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

You really, really want to perform a dry-run Slurm database upgrade on a test machine before doing the real upgrade! See
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#make-a-dry-run-database-upgrade

> I understand we will have to also upgrade the version of mariadb from
> 5.5 to 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X'
> and 'bug 6796' amongst other things.

We use the default MariaDB 5.5 in CentOS 7.7. Upgrading to MariaDB 10 seems to have quite a number of unresolved installation issues, so I would skip that for now.

> We would appreciate your comments/recommendations

Slurm 19.05 works great for us. We're happy with our SchedMD support contract.

/Ole
Re: [slurm-users] Slurm Upgrade from 17.02
Hi Ricardo,

If I remember right, you can only upgrade two versions further. So you WILL have to upgrade to 18.08, even if you want to use 19.05 or the coming 20.02.

17.02 -> 17.11 -> 18.08 -> 19.05 -> 20.02
  ^                 ^
  |                 |
  |- you are here   |- "farthest jump" to a newer version in one step.

As SchedMD introduced cons_tres in 19.05, cons_res will become deprecated in future versions. The way you order GPUs is more consistent in the new version. So, I would upgrade to 19.05. You will still have to upgrade to 18.08 as a first step, though.

Best
Marcus

On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
> hi all,
>
> I am putting together an upgrade plan for slurm on our HPC. We are currently
> running old version 17.02.11. Would you guys advise us upgrading to 18.08 or
> 19.05?
>
> I understand we will have to also upgrade the version of mariadb from 5.5 to
> 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and
> 'bug 6796' amongst other things.
>
> We would appreciate your comments/recommendations
>
> Regards,
> *Ricardo Gregorio*
> Research and Systems Administrator
> Operations ITS

--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de
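The two-versions-per-step rule can be sketched as a small shell function. This is illustrative only: the release list is the one drawn in the diagram in this message, the function name is made up, and the "at most two releases per hop" limit is the rule as described in this thread, so check it against the release notes for your target version.

```shell
#!/bin/sh
# Release sequence as listed in this thread (oldest to newest).
RELEASES="17.02 17.11 18.08 19.05 20.02"

# Hypothetical helper: print the hops from FROM to TO, moving at most
# two releases forward per hop.
upgrade_path() {
    from=$1
    to=$2
    i=0
    from_idx=0
    to_idx=0
    # Find the 1-based position of both versions in the release list.
    for r in $RELEASES; do
        i=$((i + 1))
        if [ "$r" = "$from" ]; then from_idx=$i; fi
        if [ "$r" = "$to" ]; then to_idx=$i; fi
    done
    if [ "$from_idx" -eq 0 ] || [ "$to_idx" -eq 0 ] || [ "$from_idx" -gt "$to_idx" ]; then
        echo "unknown or backwards version pair: $from -> $to" >&2
        return 1
    fi
    path=$from
    cur=$from_idx
    while [ "$cur" -lt "$to_idx" ]; do
        # Jump two releases ahead, but never past the target.
        cur=$((cur + 2))
        if [ "$cur" -gt "$to_idx" ]; then cur=$to_idx; fi
        path="$path -> $(echo $RELEASES | cut -d' ' -f"$cur")"
    done
    echo "$path"
}
```

For the case discussed here, `upgrade_path 17.02 19.05` prints `17.02 -> 18.08 -> 19.05`.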
Re: [slurm-users] Slurm Upgrade from 17.02
On 19/2/20 6:10 am, Ricardo Gregorio wrote:
> I am putting together an upgrade plan for slurm on our HPC. We are currently
> running old version 17.02.11. Would you guys advise us upgrading to 18.08 or
> 19.05?

Slurm versions only support upgrading from 2 major versions back, so you could only upgrade from 17.02 to 17.11 or 18.08. I'd suggest going straight to 18.08.

Remember you have to upgrade slurmdbd first, then upgrade slurmctld and then finally the slurmd's.

Also, as Ole points out, 20.02 is due out soon at which point 18.08 gets retired from support, so you'd probably want to jump to 19.05 from 18.08.

Don't forget to take backups first! We do a mysqldump of the whole accounting DB and rsync backups of our state directories before an upgrade.

Best of luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
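The backup step described above (a mysqldump of the accounting DB plus an rsync of the state directories) might look roughly like the sketch below. The directory paths, variable names and the dry-run wrapper are assumptions for illustration, not taken from any site's setup; STATE_DIR should be whatever StateSaveLocation is set to in your slurm.conf, and mysqldump is assumed to find its credentials in the usual place (for example ~/.my.cnf).

```shell
#!/bin/sh
# Assumed locations; adjust to your site.
BACKUP_DIR="${BACKUP_DIR:-/root/slurm-upgrade-backup}"
STATE_DIR="${STATE_DIR:-/var/spool/slurmctld}"
DB_NAME="${DB_NAME:-slurm_acct_db}"

# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_slurm() {
    stamp=$(date +%Y%m%d-%H%M%S)
    run mkdir -p "$BACKUP_DIR" || return 1
    # Dump the accounting database into a timestamped file.
    run mysqldump --single-transaction --databases "$DB_NAME" \
        --result-file="$BACKUP_DIR/$DB_NAME-$stamp.sql" || return 1
    # Copy the state directory, preserving permissions and timestamps.
    run rsync -a "$STATE_DIR/" "$BACKUP_DIR/state-$stamp/" || return 1
}
```

Calling `backup_slurm` as-is only prints the commands it would run; setting `DRY_RUN=0` executes them. Stopping slurmctld and slurmdbd before taking the copy avoids backing up state that is still changing.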
Re: [slurm-users] Slurm Upgrade from 17.02
On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
> I am putting together an upgrade plan for slurm on our HPC. We are currently
> running old version 17.02.11. Would you guys advise us upgrading to 18.08 or
> 19.05?

You should be able to upgrade 2 Slurm major versions in one step. The 18.08 version is just about to become unsupported since 20.02 will be released shortly. We use 19.05.5.

I have collected a number of upgrading details in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

You really, really want to perform a dry-run Slurm database upgrade on a test machine before doing the real upgrade! See
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#make-a-dry-run-database-upgrade

> I understand we will have to also upgrade the version of mariadb from 5.5 to
> 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and
> 'bug 6796' amongst other things.

We use the default MariaDB 5.5 in CentOS 7.7. Upgrading to MariaDB 10 seems to have quite a number of unresolved installation issues, so I would skip that for now.

> We would appreciate your comments/recommendations

Slurm 19.05 works great for us. We're happy with our SchedMD support contract.

/Ole
[slurm-users] Slurm Upgrade from 17.02
hi all,

I am putting together an upgrade plan for slurm on our HPC. We are currently running old version 17.02.11. Would you guys advise us upgrading to 18.08 or 19.05?

I understand we will have to also upgrade the version of mariadb from 5.5 to 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and 'bug 6796' amongst other things.

We would appreciate your comments/recommendations

Regards,
Ricardo Gregorio
Research and Systems Administrator
Operations ITS