Hi Guys, I have successfully installed Slurm 15.08 on a small test cluster running CentOS 7.1. Everything seems to be running fine, and I can submit jobs without any issues. However, on the clients, the systemctl status slurm command reports errors that don't make sense:
--
# systemctl status slurm
slurm.service - LSB: slurm daemon management
   Loaded: loaded (/etc/rc.d/init.d/slurm)
   Active: failed (Result: timeout) since Sat 2016-03-12 09:14:22 PST; 7min ago

Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: run job script took usec=4
Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: prolog with lock for job 6 ran for 0 seconds
Mar 12 09:12:20 client1 slurmstepd[126069]: done with job
Mar 12 09:12:30 client1 slurmd[123729]: launch task 7.0 request from [email protected] (port 42986)
Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: run job script took usec=4
Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: prolog with lock for job 7 ran for 0 seconds
Mar 12 09:12:30 client1 slurmstepd[126100]: done with job
Mar 12 09:14:22 client1 systemd[1]: slurm.service operation timed out. Terminating.
Mar 12 09:14:22 client1 systemd[1]: Failed to start LSB: slurm daemon management.
Mar 12 09:14:22 client1 systemd[1]: Unit slurm.service entered failed state.
--

However, Slurm itself seems to be working fine:

--
# sinfo -lNe
Sat Mar 12 09:25:40 2016
NODELIST      NODES  PARTITION  STATE  CPUS  S:C:T   MEMORY  TMP_DISK  WEIGHT  FEATURES  REASON
client[1-10]  8      dev*       idle   40    2:10:2  257680  0         1       (null)    none

# srun hostname
client1
--

Any ideas why the slurm service on the clients might be reporting those timeout errors even though slurmd is running and launching jobs?

Thanks!
