Re: [slurm-users] Best method to determine if a node is down

2021-06-27 Thread Marcus Boden
Hi Doug, Slurm has the strigger[1] mechanism, which can do exactly that; the man page even has your use case as an example. It works quite well for us. Best, Marcus [1] https://slurm.schedmd.com/strigger.html On 26.06.21 19:10, Doug Niven wrote: Hi Folks, I’d like to set up an email notificati
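
For readers of the archive, here is a minimal sketch of the kind of trigger registration Marcus refers to. It only builds the `strigger` command line; it does not run it. The options used (`--set`, `--node`, `--down`, `--program`, `--flags=PERM`) are taken from the strigger man page, while the helper name `build_strigger_command` and the script path are hypothetical and chosen purely for illustration. The program the trigger calls is assumed to be a site-provided script that sends the email itself.

```python
import shlex


def build_strigger_command(program, permanent=True):
    """Build the argv for a trigger that fires when any node goes DOWN.

    `program` is the path of a notification script (hypothetical here);
    the script is expected to send the email itself.
    """
    argv = ["strigger", "--set", "--node", "--down", f"--program={program}"]
    if permanent:
        # PERM keeps the trigger registered after it fires; without it
        # the trigger is purged once the event has occurred.
        argv.append("--flags=PERM")
    return argv


if __name__ == "__main__":
    # Hypothetical script path, for illustration only.
    cmd = build_strigger_command("/usr/local/sbin/notify_node_down.sh")
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Slurm; on a real cluster the resulting line would be run by a suitably privileged user.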

Re: [slurm-users] nodes that finished calculation do not become idle

2021-06-27 Thread Brian Andrus
I suspect you are misunderstanding how the flow works. 1. You request X nodes to do some work. 2. You start a job that uses all the nodes. 3. Job runs until everything is done. 4. Resources are released back to be used again. If your job allows it, you probably want an array job, which will be
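
The flow Brian describes can be illustrated with a toy model. This is a simplified sketch, not Slurm code: it assumes each node's share of the work finishes at a known time and compares when the node is released under one job spanning all nodes versus independent array tasks. The function names and the finish times are made up for illustration, and the array case only applies when the pieces of work are independent of each other.

```python
def release_times_single_job(finish_times):
    """One job holding all nodes: nothing is released until the
    slowest piece of work is done, so every node shares one release time."""
    job_end = max(finish_times)
    return [job_end for _ in finish_times]


def release_times_array_job(finish_times):
    """Independent array tasks: each task is its own job, so its
    resources are released as soon as that task finishes."""
    return list(finish_times)


if __name__ == "__main__":
    # Hypothetical per-node finish times, in minutes.
    finish = [10, 25, 40]
    print("single job:", release_times_single_job(finish))
    print("array job: ", release_times_array_job(finish))
```

In the single-job case the nodes that finish early sit allocated but unused until the last one is done, which matches the behaviour reported in the original question.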

Re: [slurm-users] slurmd running on IBM Power9 systems

2021-06-27 Thread Karl Lovink
Hi Yair, So this means it’s not some kind of fatal error. My jobs are not running, which means something else is not OK. Thanks for your reply. Sincerely yours, Karl > On 27 Jun 2021, at 09:18, Yair Yarom wrote: > > Hi, > > If it helps we have slurm 19.05 running on power8 and we are ignori

Re: [slurm-users] slurmd running on IBM Power9 systems

2021-06-27 Thread Yair Yarom
Hi, If it helps, we have Slurm 19.05 running on POWER8 and we have been ignoring these messages for quite a while now. I'm not sure what impact it has on the scheduler or the jobs, but we generally don't play with the frequency anyway. On Wed, Jun 23, 2021 at 7:16 PM Karl Lovink wrote: > Hello, > > I

[slurm-users] nodes that finished calculation do not become idle

2021-06-27 Thread Grigory Ptashko
Hello! Recently I've started using MPI on our HPC cluster. It has 40 nodes and runs Slurm. I'm new to MPI and Slurm, but so far everything works fine except for one thing. In short: nodes that have finished their calculation do not become idle. Only after all the nodes have finished their calculations do they all become idle.