[slurm-dev] Re: Failed to allocate resource : Unable to contact slurm controller

2013-09-17 Thread Arjun J Rao
Umm... I don't have any file named slurm or slurm-llnl in my /etc/init.d
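[If Slurm was installed from source rather than from the distribution packages, there may be no init script at all; in that case the daemons can be started by hand. A sketch, assuming a standard source install with munge already running (paths and options as in the Slurm 2.x era of this thread):

```shell
# Start the controller and the compute daemon in the foreground with
# verbose logging, then check that the controller answers.
sudo slurmctld -D -vvv &
sudo slurmd -D -vvv &
scontrol ping
```
Running in the foreground with -vvv usually makes the reason for a startup failure visible directly on the terminal.]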


On Tue, Sep 17, 2013 at 4:46 PM, Sivasangari Nandy <
sivasangari.na...@irisa.fr> wrote:

> hello,
>
> try this:

> /etc/init.d/slurm-llnl start
> /etc/init.d/slurm-llnl stop
> /etc/init.d/slurm-llnl startclean
>
>
>
>
> --
>
> *From:* "Arjun J Rao" 
> *To:* "slurm-dev" 
> *Sent:* Tuesday, September 17, 2013 11:33:53
> *Subject:* [slurm-dev] Failed to allocate resource : Unable to contact
> slurm controller
>
>
> I want to run SLURM on a single machine as a proof of concept to run
> some trivial MPI programs on my machine.
> I keep getting the message:
>   Failed to allocate resources ; Unable to contact
> slurm controller
> In my slurm.conf file, I have set ControlMachine to localhost and
> ControlAddr to 127.0.0.1, and the compute NodeName to localhost with
> NodeAddr 127.0.0.1 as well.
>
> What am I doing wrong?
>
> Scientific Linux 6.4
> 64-bit
>
>
>
>
> --
> Sivasangari NANDY - Plate-forme *GenOuest*
> IRISA-INRIA, Campus de Beaulieu
> 263 Avenue du Général Leclerc
> 35042 Rennes cedex, France
> Tel: +33 (0) 2 99 84 25 69
> Office: D152
>
>


[slurm-dev] Re: Failed to allocate resource : Unable to contact slurm controller

2013-09-17 Thread Sivasangari Nandy
hello, 

try this:
/etc/init.d/slurm-llnl start 

/etc/init.d/slurm-llnl stop 

/etc/init.d/slurm-llnl startclean 

- Original Message -

> From: "Arjun J Rao" 
> To: "slurm-dev" 
> Sent: Tuesday, September 17, 2013 11:33:53
> Subject: [slurm-dev] Failed to allocate resource : Unable to contact
> slurm controller


> I want to run SLURM on a single machine as a proof of concept to
> run some trivial MPI programs on my machine.
> I keep getting the message:
> Failed to allocate resources ; Unable to contact slurm controller
> In my slurm.conf file, I have set ControlMachine to localhost and
> ControlAddr to 127.0.0.1, and the compute NodeName to localhost
> with NodeAddr 127.0.0.1 as well.

> What am I doing wrong?

> Scientific Linux 6.4
> 64-bit

-- 

Sivasangari NANDY - Plate-forme GenOuest 
IRISA-INRIA, Campus de Beaulieu 
263 Avenue du Général Leclerc 
35042 Rennes cedex, France 
Tel: +33 (0) 2 99 84 25 69 
Office: D152 


[slurm-dev] Failed to allocate resource : Unable to contact slurm controller

2013-09-17 Thread Arjun J Rao
I want to run SLURM on a single machine as a proof of concept to run
some trivial MPI programs on my machine.
I keep getting the message:
  Failed to allocate resources ; Unable to contact
slurm controller
In my slurm.conf file, I have set ControlMachine to localhost and
ControlAddr to 127.0.0.1, and the compute NodeName to localhost with
NodeAddr 127.0.0.1 as well.

What am I doing wrong?

Scientific Linux 6.4
64-bit
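
[The usual causes of "Unable to contact slurm controller" are that slurmctld is not actually running, or that ControlMachine does not match the short hostname (`hostname -s`) of the machine. A minimal single-host slurm.conf might look like the sketch below; the cluster name, CPU count, and spool paths are placeholders, and the keys are the ones from the Slurm 2.x era of this thread:

```
ClusterName=localcluster        # placeholder name
ControlMachine=localhost        # must match `hostname -s` on this box
ControlAddr=127.0.0.1
AuthType=auth/munge             # munged must be running
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
SlurmdSpoolDir=/var/spool/slurmd
# The single compute node is the same machine; CPUs=4 is a placeholder.
NodeName=localhost NodeAddr=127.0.0.1 CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP
```
After editing, restart both daemons and verify with `scontrol ping` and `sinfo`.]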


[slurm-dev] mpich2 to use multiple machines

2013-09-17 Thread Sivasangari Nandy
Hello, 


I have a small problem with mpich2 under slurm. I want to run my jobs on
more than one machine (for this test I only wanted VM-669, so the
machinefile Mname.txt contains just VM-669).
From the master (VM-667) I run:

mpiexec -machinefile Mname.txt -np 1 /bin/sleep 60

but I get these errors:



[proxy:0:0@VM-669] launch_procs (./pm/pmiserv/pmip_cb.c:687): unable to change wdir to /root/omaha-beach (No such file or directory)
[proxy:0:0@VM-669] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:935): launch_procs returned error
[proxy:0:0@VM-669] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@VM-669] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[mpiexec@VM-667] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
[mpiexec@VM-667] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@VM-667] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@VM-667] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion

Have you got any idea?
Thanks in advance,
Siva
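
[The first proxy line of the trace points at the cause: Hydra tries to chdir on every remote node into the directory mpiexec was launched from (/root/omaha-beach here), and VM-669 does not have it. A sketch of two possible fixes, using the paths from the trace:

```shell
# Fix 1: create the missing working directory on the remote node.
ssh VM-669 mkdir -p /root/omaha-beach

# Fix 2: tell Hydra to use a working directory that exists on every node.
mpiexec -machinefile Mname.txt -np 1 -wdir /tmp /bin/sleep 60
```
Fix 1 keeps the original invocation working; Fix 2 avoids having to mirror the launch directory on all nodes.]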


-- 

Sivasangari NANDY - Plate-forme GenOuest 
IRISA-INRIA, Campus de Beaulieu 
263 Avenue du Général Leclerc 
35042 Rennes cedex, France 
Tel: +33 (0) 2 99 84 25 69 
Office: D152 



[slurm-dev] Limiting number of CPUs per Job

2013-09-17 Thread Olaf Gellert


Hi,

I want to have a partition for serial jobs: each job running on
only a single CPU, with many jobs sharing a single node. How would I do that?

MaxCPUs and QOS seem to relate to associations (not to individual
partitions), or could I use them somehow?
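
[One common setup is to let the selection plugin schedule individual CPUs and rely on jobs in that partition being submitted with -n1. A slurm.conf sketch; the partition and node names are placeholders, and note that SelectType is cluster-wide, not per-partition:

```
SelectType=select/cons_res
SelectTypeParameters=CR_CPU     # allocate individual CPUs, not whole nodes

# Serial partition: jobs stay on one node; with CR_CPU an `srun -n1 -p serial`
# job gets a single CPU, so many such jobs can run on the same node.
PartitionName=serial Nodes=node[01-16] MaxNodes=1 State=UP
```
Enforcing the one-CPU-per-job limit (rather than trusting users to submit -n1) would need an accounting-side limit such as a QOS with a per-job CPU cap; the exact field names depend on the Slurm version, so check the sacctmgr documentation for your release.]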

Regards, Olaf

--
Dipl. Inform. Olaf Gellert        email  gell...@dkrz.de
Deutsches Klimarechenzentrum GmbH phone  +49 (0)40 460094 214
Bundesstrasse 45a                 fax    +49 (0)40 460094 270
D-20146 Hamburg, Germany          www    http://www.dkrz.de

Sitz der Gesellschaft: Hamburg
Geschäftsführer: Prof. Dr. Thomas Ludwig
Registergericht: Amtsgericht Hamburg, HRB 39784