Hello,

Our problem is more or less related to Wei Xie's postings of two weeks 
ago.  We cannot get Wien2k 10.1 running with the MPI setup; serial runs 
and ssh-based parallel runs do work.  Since his solution does not seem 
to work for us, I'll describe our problem and setup.

FYI: the Intel MPI setup works fine for lots of other programs on our 
cluster, so I suspect the problem is specific to the Intel 
MPI-Wien2k(-Torque-MOAB) combination.

Software environment:

icc/ifort: 11.1.073
impi:      4.0.0.028
imkl:      10.2.6.038
FFTW:      2.1.5
Torque/MOAB


$ cat parallel_options
setenv USE_REMOTE 1
setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -r ssh -np _NP_ _EXEC_"
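
In case it is relevant: as we understand the _NP_ and _EXEC_ 
placeholders, the command Wien2k issues for the first lapw1 job should 
expand to roughly the following (the lapw1_mpi name and the .def file 
are our assumption, not copied from a log):

mpirun -r ssh -np 8 $WIENROOT/lapw1_mpi lapw1_1.def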


Call:

clean_lapw -s
run_lapw -p -ec 0.00001 -i 1000


$ cat .machines
lapw0: cn002:8 cn004:8 cn016:8 cn018:8
1: cn002:8
1: cn004:8
1: cn016:8
1: cn018:8
granularity:1
extrafine:1


The appropriate .machine1, .machine2, etc. files are also generated.
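
If it helps: .machine1 should simply list cn002 eight times (one line 
per MPI process); the layout below is our assumption of the format 
rather than a verbatim copy:

$ cat .machine1
cn002
cn002
cn002
cn002
cn002
cn002
cn002
cn002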


$ cat TiC.dayfile
[...]
>   lapw0 -p    (09:59:34) starting parallel lapw0 at Sun Nov 14 09:59:34 CET 2010
-------- .machine0 : 32 processors
0.428u 0.255s 0:05.12 13.0%     0+0k 0+0io 0pf+0w
>   lapw1  -p   (09:59:39) starting parallel lapw1 at Sun Nov 14 09:59:39 CET 2010
->  starting parallel LAPW1 jobs at Sun Nov 14 09:59:39 CET 2010
running LAPW1 in parallel mode (using .machines)
4 number_of_parallel_jobs
      cn002 cn002 cn002 cn002 cn002 cn002 cn002 cn002(1) WARNING: Unable to 
read mpd.hosts or list of hosts isn't provided. MPI job will be run on the 
current machine only.
rank 5 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 5: killed by signal 9
rank 4 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 4: killed by signal 9
rank 3 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 3: killed by signal 9
[...]


Specifying -hostfile in the WIEN_MPIRUN variable results in the 
following error:

invalid "local" arg: -hostfile
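
Concretely, what we tried was something along these lines; _HOSTS_ is, 
as far as we understand it, the placeholder Wien2k replaces by the 
corresponding .machineN file:

setenv WIEN_MPIRUN "mpirun -r ssh -np _NP_ -hostfile _HOSTS_ _EXEC_"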


Thanks in advance for helping us run Wien2k in an MPI setup ;-)

Regards


Stefan Becuwe
