That's what I did yesterday, actually:

/etc/init.d/slurm-llnl start 

[ ok ] Starting slurm central management daemon: slurmctld. 
/usr/sbin/slurmctld already running. 
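
Note that the init script output above only mentions slurmctld, while the replies quoted below ask about slurmd on the compute nodes. As a minimal sketch for checking the compute side, assuming a POSIX shell with awk and pgrep available: the helper names down_nodelists and is_running are hypothetical and not part of Slurm, and the column positions assume the default sinfo layout shown later in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they are not Slurm commands.

# Read default-format `sinfo` output on stdin and print the NODELIST of
# every row whose STATE column starts with "down" (this also matches
# "down*", the form sinfo prints for nodes that are not responding).
down_nodelists() {
    awk 'NR > 1 && $5 ~ /^down/ { print $6 }'
}

# Report whether a process with exactly the given name is running on the
# local machine.
is_running() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: NOT running"
    fi
}

# Typical use on a real cluster (nothing is executed by this file):
#   sinfo | down_nodelists    # nodes the controller currently sees as down
#   is_running slurmd         # run on each compute node, e.g. VM-669
```

Running is_running on the controller alone says nothing about the compute nodes; it has to be run on each node that is expected to execute jobs.
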
----- Original Message -----

> From: "Nikita Burtsev" <nikita.burt...@gmail.com>
> To: "slurm-dev" <slurm-dev@schedmd.com>
> Sent: Thursday, 22 August 2013 09:59:52
> Subject: [slurm-dev] Re: Required node not available (down or drained)

> You need to have slurmd running on all nodes that will execute jobs,
> so you should start it with the init script.

> --
> Nikita Burtsev

> On Thursday, August 22, 2013 at 11:55 AM, Sivasangari Nandy wrote:
> > "Check if the slurmd daemon is running with the command
> > 'ps -el | grep slurmd'."

> > Nothing is printed by ps -el ...

> > root@VM-667:~# ps -el | grep slurmd

> > > From: "Nikita Burtsev" < nikita.burt...@gmail.com >
> > > To: "slurm-dev" < slurm-dev@schedmd.com >
> > > Sent: Wednesday, 21 August 2013 18:58:52
> > > Subject: [slurm-dev] Re: Required node not available (down or drained)

> > > slurmctld is the management process, and since you have access to
> > > squeue/sinfo information it is running just fine. You need to check
> > > whether slurmd (which is the agent part) is running on your nodes,
> > > i.e. VM-[669-671].

> > > --
> > > Nikita Burtsev

> > > On Wednesday, August 21, 2013 at 8:13 PM, Sivasangari Nandy wrote:
> > > > I have tried:

> > > > /etc/init.d/slurm-llnl start

> > > > [ ok ] Starting slurm central management daemon: slurmctld.
> > > > /usr/sbin/slurmctld already running.

> > > > And:

> > > > scontrol show slurmd

> > > > scontrol: error: slurm_slurmd_info: Connection refused
> > > > slurm_load_slurmd_status: Connection refused

> > > > Hmm, how should I go about fixing this problem?

> > > > > From: "Danny Auble" < d...@schedmd.com >
> > > > > To: "slurm-dev" < slurm-dev@schedmd.com >
> > > > > Sent: Wednesday, 21 August 2013 15:36:53
> > > > > Subject: [slurm-dev] Re: Required node not available (down or drained)

> > > > > Check your slurmd log. It doesn't appear that slurmd is running.

> > > > > Sivasangari Nandy < sivasangari.na...@irisa.fr > wrote:
> > > > > > > > Hello,

> > > > > > > > I'm trying to use Slurm for the first time, and I think I
> > > > > > > > have a problem with the nodes.
> > > > > > > > I get this message when I run squeue:

> > > > > > > > root@VM-667:~# squeue
> > > > > > > > JOBID PARTITION NAME    USER ST TIME NODES NODELIST(REASON)
> > > > > > > > 50    SLURM-deb test.sh root PD 0:00 1     (ReqNodeNotAvail)

> > > > > > > > or this one from another squeue:

> > > > > > > > root@VM-671:~# squeue
> > > > > > > > JOBID PARTITION NAME    USER ST TIME NODES NODELIST(REASON)
> > > > > > > > 50    SLURM-deb test.sh root PD 0:00 1     (Resources)

> > > > > > > > sinfo gives me:

> > > > > > > > PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> > > > > > > > SLURM-de* up    infinite  3     down  VM-[669-671]

> > > > > > > > I have already used Slurm once with the same configuration,
> > > > > > > > and I was able to run my job.
> > > > > > > > But now, the second time, I always get:

> > > > > > > > srun: Required node not available (down or drained)
> > > > > > > > srun: job 51 queued and waiting for resources

> > > > > > > > Thanks in advance for your help,
> > > > > > > > Siva
> > > > --

> > > > Siva sangari NANDY - Plate-forme GenOuest
> > > > IRISA-INRIA, Campus de Beaulieu
> > > > 263 Avenue du Général Leclerc

> > > > 35042 Rennes cedex, France
> > > > Tel: +33 (0) 2 99 84 25 69

> > > > Office: D152


-- 

Siva sangari NANDY - Plate-forme GenOuest 
IRISA-INRIA, Campus de Beaulieu 
263 Avenue du Général Leclerc 

35042 Rennes cedex, France 
Tel: +33 (0) 2 99 84 25 69

Office: D152
