[Wien] running wien2k on cluster

2017-04-26 Thread ahmed amine
Hello,

I have a problem running on a cluster: I can't run a parallel calculation.

dayfile
start   (mer. avril 26 21:48:01 CET 2017) with lapw0 (40/99 to go)

cycle 1 (mer. avril 26 21:48:01 CET 2017)   (40/99 to go)

>   lapw0 -p(21:48:01) starting parallel lapw0 at mer. avril 26 21:48:16 CET 2017
 .machine0 : 24 processors
0.030u 0.080s 0:12.95 0.8%  0+0k 0+304io 0pf+0w
>   lapw1  -p   -c  (21:48:24) starting parallel lapw1 at mer. avril 26 21:48:39 CET 2017
->  starting parallel LAPW1 jobs at mer. avril 26 21:48:44 CET 2017
running LAPW1 in parallel mode (using .machines)
running lapw1c in single mode

I get this error in job.out (the message is French for "command not found"):
/tmp/slurmd/job03057/slurm_script: line 12: hostlist : commande introuvable

The resulting .machines file:
lapw0: :24
granularity:1
extrafine:1

I'm using this Slurm script:

#!/bin/bash
#SBATCH --mem=1024
#SBATCH --ntasks=12
#SBATCH --nodes=2
#SBATCH --output=job.out

# set .machines for parallel job
# lapw0 running on one node
echo -n "lapw0: " > .machines
echo -n $(hostlist -e $SLURM_JOB_NODELIST | tail -1) >> .machines
echo "$i:24" >> .machines

echo granularity:1 >> .machines
echo extrafine:1   >> .machines

run_lapw -p -NI

___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] running wien2k on cluster

2017-04-26 Thread Laurence Marks
As constructed, your .machines file will only run lapw0 using MPI, and it
is missing the lines that say how to run lapw1. User error.
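
To illustrate what the missing lapw1 lines could look like, here is a minimal
sketch, not a tested production setup. The function name make_machines and the
host names in the usage note are hypothetical. It assumes an MPI build of
WIEN2k and uses the usual .machines conventions: a "lapw0:host:cores" line for
lapw0, and one "1:host:cores" line per node for lapw1/lapw2. The core count is
a placeholder and has to match what the Slurm job actually allocates per node.

```shell
#!/bin/bash
# Sketch (hypothetical helper): print a .machines file that has entries
# for lapw0 AND for lapw1/lapw2.
# Usage: make_machines CORES_PER_NODE HOST [HOST...]
make_machines() {
    cores=$1
    shift
    # lapw0 runs with MPI on a single host; the first host is used here
    # (the original script picked the last one with "tail -1").
    echo "lapw0:$1:$cores"
    # One lapw1/lapw2 line per host: weight 1, MPI over $cores cores.
    for host in "$@"; do
        echo "1:$host:$cores"
    done
    echo "granularity:1"
    echo "extrafine:1"
}
```

In a job script this would be called as, for example,
"make_machines 12 node01 node02 > .machines" before "run_lapw -p -NI". With
lines of the form "1:host" (no core count) the same layout would give k-point
parallelization instead of MPI.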

-- 
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what
nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
Co-Editor, Acta Cryst A


Re: [Wien] running wien2k on cluster

2017-04-26 Thread Gavin Abo

Also, it looks like job.out tells you that the problem is:

slurm_script: line 12: hostlist : command not found

My guess is that whoever set up Slurm didn't get hostlist from the Slurm
download website [1] and install it. If pip is installed, hostlist can likely
be installed using [2]:


sudo pip install python-hostlist
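
If installing a package is not possible, a possible workaround is to expand
the node list with "scontrol show hostnames", which is part of Slurm itself.
The sketch below is only an illustration; the function name expand_nodelist is
hypothetical. It also explains the "lapw0: :24" line in the posted .machines
file: the hostlist call printed nothing, and the variable $i is never set in
the script, so only ":24" was written.

```shell
#!/bin/bash
# Sketch (hypothetical helper): expand a compressed Slurm node list such as
# "node[01-02]" into one hostname per line.
# Usage: expand_nodelist "$SLURM_JOB_NODELIST"
expand_nodelist() {
    if command -v scontrol >/dev/null 2>&1; then
        # scontrol ships with Slurm, so it should exist on compute nodes
        scontrol show hostnames "$1"
    elif command -v hostlist >/dev/null 2>&1; then
        # fall back to the python-hostlist tool used in the original script
        hostlist -e "$1"
    else
        echo "neither scontrol nor hostlist found in PATH" >&2
        return 1
    fi
}
```

The host for the lapw0 line could then be taken with
"expand_nodelist "$SLURM_JOB_NODELIST" | tail -1", as in the original script.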

For one MPI job on each node using Slurm, refer to [3].

References

[1] https://slurm.schedmd.com/download.html
[2] https://packaging.python.org/installing/#use-pip-for-installing
[3] https://www.nsc.liu.se/systems/triolith/software/triolith-software-apps-wien2k.html

