Re: [slurm-users] slurmdbd purge not working

2019-04-08 Thread Lech Nieroda
Hello Julien, the innodb engine may stop working if you change parameters such as innodb_log_file_size without rebuilding the database, as the expected values no longer correspond to the encountered ones. Try using the old parameters. In order to debug the archive dump error you might want to ru

Re: [slurm-users] slurmdbd purge not working

2019-04-08 Thread Julien Rey
Hi Ole, Thank you for your advice. As I said in my previous messages, this is how I set the my.cnf:
innodb_buffer_pool_size = 32G
innodb_log_file_size = 64M
innodb_lock_wait_timeout = 3600
I have read the thread "Extreme long db upgrade 16.05.6 -> 17.11.3". However I

Re: [slurm-users] slurmdbd purge not working

2019-04-05 Thread Ole Holm Nielsen
On 4/5/19 4:28 PM, Julien Rey wrote: The failure occurs after a few minutes (~10). And we are running out of space on the slurm controller. The mysql daemon is at 100% CPU usage all the time. This issue is becoming critical. ... Our slurm accounting database is growing bigger and bigger (more

Re: [slurm-users] slurmdbd purge not working

2019-04-05 Thread Ole Holm Nielsen
Hi Julien, Did you optimize the MySQL database, in particular InnoDB? I have collected some documentation in my Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_database#mysql-configuration and I also discuss database purging. Please note that we run Slurm 17.11 (and recently 18.08) on Cent

Re: [slurm-users] slurmdbd purge not working

2019-04-05 Thread Julien Rey
The failure occurs after a few minutes (~10). And we are running out of space on the slurm controller. The mysql daemon is at 100% CPU usage all the time. This issue is becoming critical. On 05/04/2019 16:10, Paul Edmon wrote: Did it just time out, or did that failure happen immediately? I

Re: [slurm-users] slurmdbd purge not working

2019-04-05 Thread Paul Edmon
Did it just time out, or did that failure happen immediately? If it was immediate, you may be hitting a bug. It "should" be safe to upgrade to a later version of 15.08.*. There may be fixes in there related to that. I would look at the changelog though just to see if ther

Re: [slurm-users] slurmdbd purge not working

2019-04-05 Thread Julien Rey
Hi Paul, thanks for your advice. Actually I already tried what you suggested. No matter what value I put after PurgeJobAfter, I always end up with the same error: sacctmgr archive dump Directory=/home/joule/archives/ PurgeJobAfter=1days sacctmgr: error: slurmdbd: Getting response to message t

Re: [slurm-users] slurmdbd purge not working

2019-04-04 Thread Paul Edmon
We ran into this problem in the past.  I know that fixes were put in to deal with large purges as a result of our problems but I don't recall what version they ended up in, likely newer than 15.08.0. A solution that can work is to walk up the time so that instead of one large purge you do seve
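
The stepwise approach described above could be sketched roughly as follows. This is only an illustration, not a tested procedure: the helper names `purge_steps` and `purge_commands` are made up for this example, the retention values (1440, 360, 90 days) are arbitrary assumptions, and the command form simply mirrors the `sacctmgr archive dump ... PurgeJobAfter=1days` call quoted elsewhere in this thread. The script only prints the commands; it does not run them.

```python
def purge_steps(start_days, target_days, step_days):
    """Return decreasing retention values from start_days down to target_days.

    Hypothetical helper: each value is meant to be used for one smaller purge,
    so that no single purge has to delete the whole backlog at once.
    """
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    if target_days > start_days:
        raise ValueError("target_days must not exceed start_days")
    steps = list(range(start_days, target_days, -step_days))
    steps.append(target_days)  # always finish exactly at the target retention
    return steps


def purge_commands(archive_dir, start_days, target_days, step_days):
    """Build one command string per step, without executing anything."""
    # The "<N>days" form is copied from the command quoted in this thread;
    # check the sacctmgr documentation for the units your version accepts.
    return [
        f"sacctmgr archive dump Directory={archive_dir} PurgeJobAfter={days}days"
        for days in purge_steps(start_days, target_days, step_days)
    ]


if __name__ == "__main__":
    # Assumed values: start by keeping 1440 days, end by keeping 360 days.
    for cmd in purge_commands("/home/joule/archives/", 1440, 360, 90):
        print(cmd)
```

Running each printed command by hand and waiting for it to complete before starting the next keeps every individual purge small, which seems to be the point of the suggestion.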

[slurm-users] slurmdbd purge not working

2019-04-04 Thread Julien Rey
Hello, Our slurm accounting database is growing bigger and bigger (more than 100 GB) and is never being purged. We are running slurm 15.08.0-0pre1. I would like to upgrade to a more recent version of the slurmdbd, but my fear is that it may break everything during the update of the database.