That's actually what I did yesterday:

/etc/init.d/slurm-llnl start
[ ok ] Starting slurm central management daemon: slurmctld.
/usr/sbin/slurmctld already running.

----- Original message -----
> From: "Nikita Burtsev" <nikita.burt...@gmail.com>
> To: "slurm-dev" <slurm-dev@schedmd.com>
> Sent: Thursday, 22 August 2013 09:59:52
> Subject: [slurm-dev] Re: Required node not available (down or drained)
>
> You need to have slurmd running on all nodes that will execute jobs,
> so you should start it with the init script.
>
> --
> Nikita Burtsev
> Sent with Sparrow
>
> On Thursday, August 22, 2013 at 11:55 AM, Sivasangari Nandy wrote:
> > "check if the slurmd daemon is running with the command
> > 'ps -el | grep slurmd'."
> >
> > ps -el returns nothing:
> >
> > root@VM-667:~# ps -el | grep slurmd
> >
> > > From: "Nikita Burtsev" <nikita.burt...@gmail.com>
> > > To: "slurm-dev" <slurm-dev@schedmd.com>
> > > Sent: Wednesday, 21 August 2013 18:58:52
> > > Subject: [slurm-dev] Re: Required node not available (down or drained)
> > >
> > > slurmctld is the management process, and since you have access to
> > > squeue/sinfo information it is running just fine. You need to check
> > > whether slurmd (which is the agent part) is running on your nodes,
> > > i.e. VM-[669-671].
> > >
> > > --
> > > Nikita Burtsev
> > >
> > > On Wednesday, August 21, 2013 at 8:13 PM, Sivasangari Nandy wrote:
> > > > I have tried:
> > > >
> > > > /etc/init.d/slurm-llnl start
> > > > [ ok ] Starting slurm central management daemon: slurmctld.
> > > > /usr/sbin/slurmctld already running.
> > > >
> > > > And:
> > > >
> > > > scontrol show slurmd
> > > > scontrol: error: slurm_slurmd_info: Connection refused
> > > > slurm_load_slurmd_status: Connection refused
> > > >
> > > > How should I proceed to fix this problem?
> > > >
> > > > > From: "Danny Auble" <d...@schedmd.com>
> > > > > To: "slurm-dev" <slurm-dev@schedmd.com>
> > > > > Sent: Wednesday, 21 August 2013 15:36:53
> > > > > Subject: [slurm-dev] Re: Required node not available (down or drained)
> > > > >
> > > > > Check your slurmd log. It doesn't appear that slurmd is running.
> > > > >
> > > > > Sivasangari Nandy <sivasangari.na...@irisa.fr> wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I'm trying to use Slurm for the first time, and I have a
> > > > > > problem with the nodes, I think.
> > > > > >
> > > > > > I get this message when I run squeue:
> > > > > >
> > > > > > root@VM-667:~# squeue
> > > > > > JOBID PARTITION NAME    USER ST TIME NODES NODELIST(REASON)
> > > > > >    50 SLURM-deb test.sh root PD 0:00     1 (ReqNodeNotAvail)
> > > > > >
> > > > > > or this one from another squeue:
> > > > > >
> > > > > > root@VM-671:~# squeue
> > > > > > JOBID PARTITION NAME    USER ST TIME NODES NODELIST(REASON)
> > > > > >    50 SLURM-deb test.sh root PD 0:00     1 (Resources)
> > > > > >
> > > > > > sinfo gives me:
> > > > > >
> > > > > > PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> > > > > > SLURM-de* up    infinite      3 down  VM-[669-671]
> > > > > >
> > > > > > I have already used Slurm once with the same configuration
> > > > > > and I was able to run my job.
> > > > > >
> > > > > > But now, the second time, I always get:
> > > > > >
> > > > > > srun: Required node not available (down or drained)
> > > > > > srun: job 51 queued and waiting for resources
> > > > > >
> > > > > > Thanks in advance for your help,
> > > > > >
> > > > > > Siva
> > > >
> > > > --
> > > > Siva sangari NANDY - Plate-forme GenOuest
> > > > IRISA-INRIA, Campus de Beaulieu
> > > > 263 Avenue du Général Leclerc
> > > > 35042 Rennes cedex, France
> > > > Tel: +33 (0) 2 99 84 25 69
> > > > Office: D152

--
Siva sangari NANDY - Plate-forme GenOuest
IRISA-INRIA, Campus de Beaulieu
263 Avenue du Général Leclerc
35042 Rennes cedex, France
Tel: +33 (0) 2 99 84 25 69
Office: D152
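
For reference, the advice in this thread (find the nodes that sinfo reports as down, make sure slurmd is running on them, then return them to service) could be sketched in shell roughly as follows. This is only a sketch under assumptions: the helper name `down_nodes` is hypothetical, it assumes the default six-column sinfo layout quoted in this thread, and the commented-out commands assume root access and the Debian init script path used above.

```shell
#!/bin/sh
# Hypothetical helper: read default `sinfo` output on stdin and print the
# NODELIST of every row whose STATE column starts with "down".
# Assumed column order: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
down_nodes() {
    awk 'NR > 1 && $5 ~ /^down/ { print $6 }'
}

# Sample taken from the sinfo output quoted in this thread.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
SLURM-de* up infinite 3 down VM-[669-671]'

printf '%s\n' "$sample" | down_nodes   # prints: VM-[669-671]

# On a live cluster this might be used as follows (not executed here):
#   sinfo | down_nodes
#   # on each compute node, check for and start the slurmd daemon:
#   ps -el | grep slurmd
#   /etc/init.d/slurm-llnl start
#   # on the controller, once slurmd is up, return the nodes to service:
#   scontrol update NodeName='VM-[669-671]' State=RESUME
```

The node list is printed in Slurm's bracketed hostlist form, which `scontrol update NodeName=...` accepts directly, so the output should not need expanding before it is reused.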