[Wien] Parallel Wien2k using Intel MPI?

2010-11-14 Thread Stefan Becuwe

Hello,

Our problem is more or less related to Wei Xie's postings of two weeks 
ago.  We can't get Wien2k 10.1 running using the MPI setup.  Serial 
versions and parallel versions based on ssh do work.  Since his solution 
does not seem to work for us, I'll describe our problem/setup.

FYI: the Intel MPI setup does work for lots of other programs on our 
cluster, so I guess it must be an Intel MPI-Wien2k(-Torque-MOAB) specific 
problem.

Software environment:

icc/ifort: 11.1.073
impi:  4.0.0.028
imkl:  10.2.6.038
FFTW:  2.1.5
Torque/MOAB


$ cat parallel_options
setenv USE_REMOTE 1
setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -r ssh -np _NP_ _EXEC_"


Call:

clean_lapw -s
run_lapw -p -ec 0.1 -i 1000


$ cat .machines
lapw0: cn002:8 cn004:8 cn016:8 cn018:8
1: cn002:8
1: cn004:8
1: cn016:8
1: cn018:8
granularity:1
extrafine:1


Also, the appropriate .machine1, .machine2, etc. are generated.


$ cat TiC.dayfile
[...]
>   lapw0 -p    (09:59:34) starting parallel lapw0 at Sun Nov 14 09:59:34 CET 2010
 .machine0 : 32 processors
0.428u 0.255s 0:05.12 13.0% 0+0k 0+0io 0pf+0w
>   lapw1 -p    (09:59:39) starting parallel lapw1 at Sun Nov 14 09:59:39 CET 2010
->  starting parallel LAPW1 jobs at Sun Nov 14 09:59:39 CET 2010
running LAPW1 in parallel mode (using .machines)
4 number_of_parallel_jobs
  cn002 cn002 cn002 cn002 cn002 cn002 cn002 cn002(1) WARNING: Unable to 
read mpd.hosts or list of hosts isn't provided. MPI job will be run on the 
current machine only.
rank 5 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 5: killed by signal 9
rank 4 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 4: killed by signal 9
rank 3 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 3: killed by signal 9
[...]


Specifying -hostfile in the WIEN_MPIRUN variable results in the following 
error

invalid "local" arg: -hostfile


Thanks in advance for helping us get Wien2k running in an MPI setup ;-)

Regards


Stefan Becuwe


[Wien] Parallel Wien2k using Intel MPI?

2010-11-14 Thread Laurence Marks
One addendum. Torque-MOAB probably sets up some default files for you
in many cases under the assumption that all you are doing is running a
single mpi task using all the nodes you asked for. You might be able
to get away with something like changing to

setenv WIEN_MPIRUN "mpirun _EXEC_"

and a machines file such as
lapw0: cn002:8 cn004:8 cn016:8 cn018:8
1: cn002:8 cn004:8 cn016:8 cn018:8
granularity:1
extrafine:1

so in effect you are running 1 mpi job on all the nodes with the MOAB
defaults. (You might need -np _NP_ in the WIEN_MPIRUN; you have to
experiment and read your mpirun instructions, e.g. "mpirun --help" or
"man mpirun".) However, this is not very efficient.

On Sun, Nov 14, 2010 at 9:53 AM, Laurence Marks wrote:

-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Electron crystallography is the branch of science that uses electron
scattering and imaging to study the structure of matter.

[Wien] Parallel Wien2k using Intel MPI?

2010-11-14 Thread Laurence Marks
I don't think that this has much to do with Wien2k; it is an issue
with how you are setting up your mpi. From the looks of it you are
using MPICH2, whereas most of the scripts in Wien2k are set up to use
MPICH1, which is rather simpler. For MPICH2 you have to set up the mpd
daemon and configuration files, which is very different from the
simpler hostfile structure of MPICH1.

(I personally have never got Wien2k running smoothly with MPICH2, but
have not tried too hard. If anyone has a detailed description this
would be a useful post.)

You can find some information about the other steps you need for
MPICH2 on the web, e.g.

http://developer.amd.com/documentation/articles/pages/HPCHighPerformanceLinpack.aspx

and google searches on "WARNING: Unable to read mpd.hosts or list of
hosts isn't provided"
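
For reference, the usual mpd bring-up on MPICH2-style MPIs goes roughly like
the sketch below. This is only an outline under the assumption that mpd is
actually the process manager in use and that Torque exports $PBS_NODEFILE as
usual; the exact key in .mpd.conf (MPD_SECRETWORD vs. secretword) and the
mpdboot flags should be checked against your own MPI's documentation or
"mpdboot --help".

# one-time setup: mpd refuses to start without a private config file
echo "MPD_SECRETWORD=choose_a_secret" > ~/.mpd.conf
chmod 600 ~/.mpd.conf

# list the nodes of the current job, one hostname per line
cat $PBS_NODEFILE | sort -u > ~/mpd.hosts

# start one mpd per node over ssh, verify the ring, then run
# -n = number of hosts in mpd.hosts (4 in the example above)
mpdboot -n 4 -f ~/mpd.hosts --rsh=ssh
mpdtrace
# ... run_lapw -p ...
mpdallexit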

On Sun, Nov 14, 2010 at 3:19 AM, Stefan Becuwe wrote:


-- 
Laurence Marks
Northwestern University


[Wien] parallel wien2k

2010-03-02 Thread Laurence Marks
Similar questions have come up before about mpi compilations. We need
more information to be able to help you, and there are some general
things that you have to check as well, mainly (but there may be more):
a) How was mpi (here mpich2) compiled, and what is its version?
b) Are all the libraries being picked up (use ldd) on the child nodes?
c) Have you checked against the online mkl linking advisor
(http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/
) -- I am not sure that you have.
d) What is the error message?
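
For point b), a quick check is to run ldd on the mpi binaries remotely, for
example (a sketch only; nx58 is just one of the node names from your earlier
.machines examples, and $WIENROOT expands on the local side, which is fine if
the install path is shared across nodes):

# any library a child node cannot resolve shows up as "not found"
ssh nx58 "ldd $WIENROOT/lapw0_mpi $WIENROOT/lapw1_mpi | grep 'not found'"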

On Mon, Mar 1, 2010 at 8:05 PM, Zhiyong Zhang  wrote:

[Wien] parallel wien2k

2010-03-01 Thread Zhiyong Zhang
Dear All, 

I think I have a problem with the compiler options/libraries for the parallel 
wien2k. I can run lapw0/lapw1 in k-point parallel mode but not the lapw0/1_mpi 
versions with mpi. Here are the options and libraries with which I built the wien2k: 

RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 -lmkl_solver_lp64_sequential 
-Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_lp64 
-Wl,--end-group -lpthread -L/home/zzhang/fftw/fftw-2.1.5/lib -lfftw_mpi -lfftw

FPOPT(par.comp.options): -FR -mp1 -w -prec_div -pc80 
-pad -ip -DINTEL_VML -traceback

I used fftw-2.1.5 for the parallel fftw. 

Does anybody see a problem with the options I used? 

Does anybody have a set of compiler options and libraries for working 
lapw0_mpi? I used mpich2 to compile the code and the architecture is Intel 
x86_64. 

Thanks in advance!

Zhiyong
   

[Wien] parallel wien2k

2010-02-24 Thread Zhiyong Zhang
Dear Laurence and All, 

Thank you very much for the information. It has been very helpful in clarifying 
some of the issues. Based on your input, I was able to prepare the .machines in 
the correct format, I believe: 

.machines:
#
lapw0: nx59:2 nx58:2
1:nx59
1:nx59
1:nx58
1:nx58
granularity:1
extrafine:1

and 

.machine0

nx59
nx59
nx58
nx58

However, I still got the same problem in TiC.vns, which presumably resulted in 
the crash in lapw1para. 

Are there any places in the output files that I can look for clues about the 
problem? For the same calculation, I can run lapw0 in serial mode and 
lapw1 in k-point parallel mode successfully.

Thanks in advance, 
Zhiyong 


[Wien] parallel wien2k

2010-02-23 Thread Yurko Natanzon
Try to remove the "lapw0" string from the .machines file, so it reads:

1:nx1
1:nx1
1:nx62
1:nx62
granularity:1
extrafine:1

If that does not work, also try running lapw0 in serial mode:
lapw0:nx1
1:nx1
1:nx1
1:nx62
1:nx62
granularity:1
extrafine:1

Also, take a look at the scripts which generate the proper .machines
file: http://www.wien2k.at/reg_user/faq/pbs.html
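
The core of those scripts is usually just a loop over the Torque node file; a
minimal k-point-parallel sketch (assuming $PBS_NODEFILE lists one line per
allocated core, and running lapw0 serially on the first node as above) could
look like this:

#!/bin/sh
# write a fresh .machines in the working directory of the job
echo '#'                             >  .machines
echo "lapw0:`head -1 $PBS_NODEFILE`" >> .machines
# one k-point-parallel job line per core in the allocation
for host in `cat $PBS_NODEFILE`
do
  echo "1:$host" >> .machines
done
echo 'granularity:1' >> .machines
echo 'extrafine:1'   >> .machines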

regards,
Yurko

On 23 February 2010 06:24, Zhiyong Zhang  wrote:

[Wien] parallel wien2k

2010-02-23 Thread Laurence Marks
Several points:

1) You only use "-fc X" for a structure with variable atomic
positions, and the TiC example has none so it will report that there
are no forces (but this should not stop the calculation).

2) The "NaN" in your case.vns file means that something went wrong in
the lapw0 call, which is why lapw1 is crashing. It is safer to delete
the case.vns file.

3) Are you using single-core CPUs or multicore? The normal format for
a parallel lapw0 call (using mpi) is

lapw0: nx1:2 nx62:2 -- please note the space after the ":", it often matters

To do this you have to have mpi installed and have compiled lapw0_mpi.
If you do not have it you can use

lapw0: nx1

This will run a serial lapw0 on nx1

4) All the above assumes that you have local control of what nodes you
can use rather than this being controlled by a queuing system such as
pbs. If you are using pbs or similar then you have to have a script to
generate the .machines file since you do not know what machines to use
(unless you are running interactively).

5) The script you have will run serial (i.e. not mpi) lapw1, 2
k-vectors on nx1 and 2 on nx62. If you want to have these run using
the parallel versions (i.e. lapw1_mpi) you would need to use

1:nx1:2
1:nx62:2

(Note no space after the ":").

Whether it is faster to run with 2 processors on nx1, as against 2
different k-points, will depend upon your CPUs. For a simple
calculation such as TiC it will be hard to see much difference, but
this can matter for larger ones. Be aware that if (for instance) you
had 4 processors on nx1 it may be a bad idea to use

1:nx1:2
1:nx1:2

because some variants of mpi will launch both lapw1_mpi jobs on the
same cores (CPU_AFFINITY is often the relevant flag, but this varies
with mpi flavor).
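
Putting points 3) and 5) together, a full mpi-parallel .machines file for two
dual-core nodes would look something like the following (same hostnames as in
the examples above; treat it as a sketch to adapt, not a drop-in file):

# mpi-parallel lapw0 on 2+2 cores, two mpi-parallel lapw1/lapw2 jobs
lapw0: nx1:2 nx62:2
1:nx1:2
1:nx62:2
granularity:1
extrafine:1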

2010/2/22 zyzhang :



-- 
Laurence Marks
Northwestern University


[Wien] parallel wien2k

2010-02-23 Thread Ricardo Faccio
Hi Zhiyong
What is your test case? Remember that forces are printed only if you have atoms
located in general positions. For example, Fe in the bcc space group will
not print forces, since all atoms have the same symmetric environment.
Regards
Ricardo

-- 
    Dr. Ricardo Faccio
  Mail: Cryssmat-Lab., Cátedra de Física, DETEMA
  Facultad de Química, Universidad de la República
   Av. Gral. Flores 2124, C.C. 1157
   C.P. 11800, Montevideo, Uruguay.
  E-mail: rfaccio at fq.edu.uy
  Phone: 598 2 9241860 Int. 109
         598 2 9290705
  Fax:   598 2 9241906
  Web:   http://cryssmat.fq.edu.uy/ricardo/ricardo.htm




[Wien] parallel wien2k

2010-02-22 Thread Zhiyong Zhang
OK. Here are some more clues about the problem: 

forrtl: severe (64): input conversion error, unit 19, file 
/home/zzhang/wien2k-runs/lapw/TiC/TiC.vns
Image              PC                Routine            Line        Source
lapw1              004E6F1E          Unknown            Unknown     Unknown
lapw1              004E611A          Unknown            Unknown     Unknown
lapw1              0049FB76          Unknown            Unknown     Unknown
lapw1              0046D75A          Unknown            Unknown     Unknown
lapw1              0046CD76          Unknown            Unknown     Unknown
lapw1              00486885          Unknown            Unknown     Unknown
lapw1              004540F8          rdswar_            29          rdswar_tmp_.F
lapw1              00435FD3          inilpw_            393         inilpw.f
lapw1              00438224          MAIN__             41          lapw1_tmp_.F
lapw1              00404422          Unknown            Unknown     Unknown
libc.so.6          003E1251C40B      Unknown            Unknown     Unknown
lapw1              0040436A          Unknown            Unknown     Unknown

I checked the TiC.vns in the parallel calculation and found the following 
(Please note the NaN entries): 

     TOTAL POTENTIAL IN INTERSTITIAL

               136 NUMBER OF PW
      0    0    0  NaN                0.E+00
     -1   -1   -1  0.966480192428E-08 0.E+00
      0    0   -2  0.237305964226E-06 0.E+00
      0   -2   -2  0.383070560427E-08 0.E+00
     -1   -1   -3 -0.108089242452E-08 0.E+00

However, in the TiC.vns from the serial run, which seems to have worked fine, I 
found the following: 

     TOTAL POTENTIAL IN INTERSTITIAL

               136 NUMBER OF PW
      0    0    0 -0.227173083856E-01 0.E+00
     -1   -1   -1  0.114592956480E-02 0.E+00
      0    0   -2 -0.115420958078E-01 0.E+00
      0   -2   -2  0.184312999415E-01 0.E+00
     -1   -1   -3 -0.137802961139E-03 0.E+00
     -2   -2   -2 -0.285539143809E-02 0.E+00
 
Does anybody have any clue about the problem? 

Thanks again, 

Zhiyong



[Wien] parallel wien2k

2010-02-22 Thread Zhiyong Zhang
Hello Ricardo and All, 

Thank you for the information. I think you are right that part of the problem 
is that no forces are printed. The example I am using is the TiC case from the 
user guide. When I used "run_lapw -i 40 0.001 -I" in serial mode it worked fine. 

The problem "/home/zzhang/wien2k/lapw1para lapw1.def" seems to be due to the 
.machines file definition. If I remove the "lapw1:nx1  nx1  nx62  nx62" from 
the .machines file ans use the following .machines file,  

lapw0:nx1  nx1  nx62  nx62  
1:nx1
1:nx1
1:nx62
1:nx62
granularity:1
extrafine:1

Then the LAPW1 can run in parallel. 

Does this mean that lapw1/2 can only be run in k-point parallel mode, not fine 
grain MPI mode? 

However, I still got the following error in TiC.dayfile: 

4 number_of_parallel_jobs
 nx1(11) 0.226u 0.017s 0.31 76.18%  0+0k 0+0io 0pf+0w
 nx1(11) 0.224u 0.009s 0.31 73.04%  0+0k 0+0io 0pf+0w
 nx62(11) 0.222u 0.008s 0.32 71.21%  0+0k 0+0io 0pf+0w
 nx62(11) 0.222u 0.010s 0.26 88.21%  0+0k 0+0io 0pf+0w
 nx1(1) 0.224u 0.008s 0.26 88.89%  0+0k 0+0io 0pf+0w
 nx1(1) 0.223u 0.008s 0.26 88.17%  0+0k 0+0io 0pf+0w
 nx62(1) 0.222u 0.009s 0.26 86.19%  0+0k 0+0io 0pf+0w
**  LAPW1 crashed!
0.062u 0.436s 0:11.45 4.2%  0+0k 0+0io 0pf+0w
error: command   /home/zzhang/wien2k/lapw1para lapw1.def   failed

Which files should I read to find possible causes of the crash? I looked at the 
*.error files but can't seem to find anything useful. 

Best, 
Zhiyong





[Wien] parallel wien2k

2010-02-22 Thread zyzhang
Dear All, 

I am trying to test wien2k in parallel mode and I got into some problem. I
am using 

run_lapw -p -i 40 -fc 0.001 -I

If I use a number of 0.001 for the option -fc above, I got the following
error:

Force-convergence not possible. Forces not present.

If I do not use a number for the -fc option, but use "run_lapw -p -i 40 -fc
-I" instead 

Then lapw0 finishes without a problem but the program doesn't branch to
lapw1. An error message is generated when doing the test 

"if ($fcut == "0") goto lapw1

I was able to do "run_lapw -p -i 40 -I", without the "-fc" option at all and
was able to finish "lapw0 -p" and then start "lapw1 -p" but got into the
following error: 

error: command   /home/zzhang/wien2k/lapw1para lapw1.def   failed

Does anybody have similar problems and know how to fix this? 

It does the following: 

running LAPW1 in parallel mode (using .machines)

and the .machines file is as follows: 

#
lapw0:nx1  nx1  nx62  nx62
lapw1:nx1  nx1  nx62  nx62
lapw2:nx1  nx1  nx62  nx62
1:nx1
1:nx1
1:nx62
1:nx62
granularity:1
extrafine:1

Thanks, 

Zhiyong

 
