Re: [slurm-users] WTERMSIG 15

2021-11-29 Thread Yair Yarom
Hi, There were two cases where this happened to us as well: 1. The systemd slurmd.service wasn't configured properly, so the jobs ran under the slurmd.slice. As a result, restarting slurmd causes systemd to send a signal to all of those processes. You can check if this is the case with 'systemctl status slurmd.se

[slurm-users] incorrectly added account and now get "AssocGrpCPUMinutesLimit" when trying to run job

2021-11-29 Thread byron
I’m trying to replicate the setup of a new account where there is a new “grouping” of accounts and a new account that will actually be used, so something like this when you run sacctmgr show assoc tree mycluster account1. (which is just being used to group accounts and so has no GrpTRE

[slurm-users] WTERMSIG 15

2021-11-29 Thread LEROY Christine 208562
Hello all, I made some modifications to my slurm.conf, then restarted slurmctld on the master and slurmd on the nodes. During this process I lost some jobs (*); curiously, all of these jobs were on Ubuntu nodes. These jobs were ok with the consumed resources (**). Any Idea what co

[slurm-users] Checkpoint

2021-11-29 Thread Alberto Morillas, Angelines
Hi! I need your help. How can I use checkpointing (DMTCP) with Slurm? Thanks in advance, Angelines