Sorry, I misread the error message. Please ignore.

> On Mar 12, 2016, at 10:46 AM, Craig Yoshioka <[email protected]> wrote:
>
> I thought slurmctld is only meant to run on the head node? The clients just
> run slurmd?
>
>> On Mar 12, 2016, at 9:32 AM, Jagga Soorma <[email protected]> wrote:
>>
>> Hi Guys,
>>
>> I have successfully installed slurm 15.08 on a small test cluster
>> running CentOS 7.1. Everything seems like it is running fine and I
>> can submit jobs without any issues. However on the clients I am
>> seeing some errors on the systemctl status slurm command that don't
>> make sense:
>>
>> --
>> # systemctl status slurm
>> slurm.service - LSB: slurm daemon management
>>    Loaded: loaded (/etc/rc.d/init.d/slurm)
>>    Active: failed (Result: timeout) since Sat 2016-03-12 09:14:22 PST; 7min ago
>>
>> Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: run job script took usec=4
>> Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: prolog with lock for job 6 ran for 0 seconds
>> Mar 12 09:12:20 client1 slurmstepd[126069]: done with job
>> Mar 12 09:12:30 client1 slurmd[123729]: launch task 7.0 request from [email protected] (port 42986)
>> Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: run job script took usec=4
>> Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: prolog with lock for job 7 ran for 0 seconds
>> Mar 12 09:12:30 client1 slurmstepd[126100]: done with job
>> Mar 12 09:14:22 client1 systemd[1]: slurm.service operation timed out. Terminating.
>> Mar 12 09:14:22 client1 systemd[1]: Failed to start LSB: slurm daemon management.
>> Mar 12 09:14:22 client1 systemd[1]: Unit slurm.service entered failed state.
>> --
>>
>> However slurm seems to be working fine:
>>
>> --
>> # sinfo -lNe
>> Sat Mar 12 09:25:40 2016
>> NODELIST      NODES PARTITION STATE CPUS S:C:T  MEMORY TMP_DISK WEIGHT FEATURES REASON
>> client[1-10]      8 dev*      idle    40 2:10:2 257680        0      1 (null)   none
>> # srun hostname
>> client1
>> --
>>
>> Any ideas why the slurm service in the client might be throwing those
>> timed out errors?
>>
>> Thanks!
