Re: [Wien] .machines for several nodes

2020-10-12 Thread Peter Blaha

To run a single program for testing, do:

x lapw0 -p

(after creation of .machines.)

Then check all error files, and in particular also the slurm output 
(whatever it is called on your machines). It probably gives some messages 
like "library not found" or similar, which are needed for additional debugging.
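
(For example, a quick check along these lines; the name of the SLURM output 
file is an assumption and may differ on your cluster:)

cat *.error
grep -iE 'error|not found' slurm-*.out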


AND:

We still don't know how many cores your nodes have.

We still don't know your compiler options (WIEN2k_OPTIONS, 
parallel_options) and whether the compilation of e.g. lapw0_mpi worked at 
all (compile.msg in SRC_lapw0).
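
(A quick way to inspect these, assuming a standard $WIENROOT layout:)

cat $WIENROOT/WIEN2k_OPTIONS
cat $WIENROOT/parallel_options
grep -iE 'error|warning' $WIENROOT/SRC_lapw0/compile.msg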


On 12.10.2020 at 22:17, Christian Søndergaard Pedersen wrote:

Dear everybody


I am following up on this thread to report on two separate errors in my 
attempts to properly parallelize a calculation. For the first, a 
calculation utilized 0.00% of available CPU resources. My .machines file 
looks like this:



#
dstart:g004:8 g010:8 g011:8 g040:8
lapw0:g004:8 g010:8 g011:8 g040:8
1:g004:16
1:g010:16
1:g011:16
1:g040:16

With my submit script calling the following commands:


srun hostname -s > slurm.hosts

run_lapw -p

x qtl -p -telnes


Of course, the job didn't reach x qtl. The resultant case.dayfile is 
short, so I am dumping all of it here:



Calculating test-machines in /path/to/directory
on node.host.name.dtu.dk with PID X
using WIEN2k_19.1 (Release 25/6/2019) in 
/path/to/installation/directory/WIEN2k/19.1-intel-2019a



     start   (Mon Oct 12 19:04:06 CEST 2020) with lapw0 (40/99 to go)

     cycle 1 (Mon Oct 12 19:04:06 CEST 2020) (40/99 to go)


   lapw0   -p  (19:04:06) starting parallel lapw0 at Mon Oct 12 19:04:06 CEST 
2020

 .machine0 : 32 processors
[1] 16095


The .machine0 file displays the lines

g004 [repeated for 8 lines]
g010 [repeated for 8 lines]
g011 [repeated for 8 lines]
g040 [repeated for 8 lines]

which tells me that the .machines file works as intended, and that the 
cause of the problem is located somewhere else. Which brings me to the 
second error, which occurred when I tried calling mpirun explicitly like so:


srun hostname -s > slurm.hosts
mpirun run_lapw -p
mpirun qtl -p -telnes

from within the job script. This crashed the job right away. The 
lapw0.error file prints out "Error in Parallel lapw0" and "check ERROR 
FILES!" a number of times. The case.clmsum file is present and looks 
correct, and the .machines file looks like the one from before (with 
different node numbers). However, the .machine0 file now looks like:


g094
g094
g094
g081
g081
g08g094
g094
g094
g094
g094
[...]

I.e. there's an error on line 6, where a node is not properly named and 
a line break is missing. The dayfile repeatedly prints out "> stop 
error" a total of sixteen times. I don't know if the above .machine0 
file is the culprit, but it seems the obvious conclusion. Any help in 
this matter will be much appreciated.


Best regards
Christian

___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html



--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/tc_blaha


___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Christian Søndergaard Pedersen
Dear everybody


I am following up on this thread to report on two separate errors in my 
attempts to properly parallelize a calculation. For the first, a calculation 
utilized 0.00% of available CPU resources. My .machines file looks like this:


#
dstart:g004:8 g010:8 g011:8 g040:8
lapw0:g004:8 g010:8 g011:8 g040:8
1:g004:16
1:g010:16
1:g011:16
1:g040:16

With my submit script calling the following commands:


srun hostname -s > slurm.hosts

run_lapw -p

x qtl -p -telnes


Of course, the job didn't reach x qtl. The resultant case.dayfile is short, so 
I am dumping all of it here:

Calculating test-machines in /path/to/directory
on node.host.name.dtu.dk with PID X
using WIEN2k_19.1 (Release 25/6/2019) in 
/path/to/installation/directory/WIEN2k/19.1-intel-2019a


start   (Mon Oct 12 19:04:06 CEST 2020) with lapw0 (40/99 to go)

cycle 1 (Mon Oct 12 19:04:06 CEST 2020) (40/99 to go)

>   lapw0   -p  (19:04:06) starting parallel lapw0 at Mon Oct 12 19:04:06 CEST 
> 2020
 .machine0 : 32 processors
[1] 16095


The .machine0 file displays the lines

g004 [repeated for 8 lines]
g010 [repeated for 8 lines]
g011 [repeated for 8 lines]
g040 [repeated for 8 lines]

which tells me that the .machines file works as intended, and that the cause of 
the problem is located somewhere else. Which brings me to the second error, 
which occurred when I tried calling mpirun explicitly like so:

srun hostname -s > slurm.hosts
mpirun run_lapw -p
mpirun qtl -p -telnes

from within the job script. This crashed the job right away. The lapw0.error 
file prints out "Error in Parallel lapw0" and "check ERROR FILES!" a number of 
times. The case.clmsum file is present and looks correct, and the .machines 
file looks like the one from before (with different node numbers). However, the 
.machine0 file now looks like:

g094
g094
g094
g081
g081
g08g094
g094
g094
g094
g094
[...]

I.e. there's an error on line 6, where a node is not properly named and a line 
break is missing. The dayfile repeatedly prints out "> stop error" a total of 
sixteen times. I don't know if the above .machine0 file is the culprit, but it 
seems the obvious conclusion. Any help in this matter will be much appreciated.

Best regards
Christian
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Laurence Marks
Do not call mpirun yourself -- it is called by run_lapw.

What is your $WIENROOT/parallel_options file? It was set up during
installation, and needs to be correct for your srun environment.
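
(For orientation, such a file is a small csh fragment; the exact variables and 
the WIEN_MPIRUN template depend on your installation and on the answers given 
during siteconfig, so treat the following only as an illustration, not as your 
actual file:)

setenv USE_REMOTE 1
setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"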

What is in case.scf0, case.output000* and lapw0.error? These may indicate
what you did wrong.

_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Mon, Oct 12, 2020, 15:17 Christian Søndergaard Pedersen 
wrote:

> Dear everybody
>
>
> I am following up on this thread to report on two separate errors in my
> attempts to properly parallelize a calculation. For the first, a
> calculation utilized 0.00% of available CPU resources. My .machines file
> looks like this:
>
>
> #
> dstart:g004:8 g010:8 g011:8 g040:8
> lapw0:g004:8 g010:8 g011:8 g040:8
> 1:g004:16
> 1:g010:16
> 1:g011:16
> 1:g040:16
>
> With my submit script calling the following commands:
>
>
> srun hostname -s > slurm.hosts
>
> run_lapw -p
>
> x qtl -p -telnes
>
>
> Of course, the job didn't reach x qtl. The resultant case.dayfile is
> short, so I am dumping all of it here:
>
> Calculating test-machines in /path/to/directory
> on node.host.name.dtu.dk with PID X
> using WIEN2k_19.1 (Release 25/6/2019) in
> /path/to/installation/directory/WIEN2k/19.1-intel-2019a
>
>
> start   (Mon Oct 12 19:04:06 CEST 2020) with lapw0 (40/99 to go)
>
> cycle 1 (Mon Oct 12 19:04:06 CEST 2020) (40/99 to go)
>
> >   lapw0   -p  (19:04:06) starting parallel lapw0 at Mon Oct 12 19:04:06
> CEST 2020
>  .machine0 : 32 processors
> [1] 16095
>
>
> The .machine0 file displays the lines
>
> g004 [repeated for 8 lines]
> g010 [repeated for 8 lines]
> g011 [repeated for 8 lines]
> g040 [repeated for 8 lines]
>
> which tells me that the .machines file works as intended, and that the
> cause of the problem is located somewhere else. Which brings me to the
> second error, which occurred when I tried calling mpirun explicitly like so:
>
> srun hostname -s > slurm.hosts
> mpirun run_lapw -p
> mpirun qtl -p -telnes
>
> from within the job script. This crashed the job right away. The
> lapw0.error file prints out "Error in Parallel lapw0" and "check ERROR
> FILES!" a number of times. The case.clmsum file is present and looks
> correct, and the .machines file looks like the one from before (with
> different node numbers). However, the .machine0 file now looks like:
>
> g094
> g094
> g094
> g081
> g081
> g08g094
> g094
> g094
> g094
> g094
> [...]
>
> I.e. there's an error on line 6, where a node is not properly named and a
> line break is missing. The dayfile repeatedly prints out "> stop error" a
> total of sixteen times. I don't know if the above .machine0 file is the
> culprit, but it seems the obvious conclusion. Any help in this matter will
> be much appreciated.
>
> Best regards
> Christian
> ___
> Wien mailing list
> Wien@zeus.theochem.tuwien.ac.at
>
> https://urldefense.com/v3/__http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien__;!!Dq0X2DkFhyF93HkjWTBQKhk!HfFpsmCVflsbQjgThWcuh7q8HHICdTWqPtl-ppytordiJbv_sciV0av016hMB6ImR3gClg$
> SEARCH the MAILING-LIST at:
> https://urldefense.com/v3/__http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html__;!!Dq0X2DkFhyF93HkjWTBQKhk!HfFpsmCVflsbQjgThWcuh7q8HHICdTWqPtl-ppytordiJbv_sciV0av016hMB6J__JRiLw$
>
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


[Wien] Fwd: Postdoc ad mailing list

2020-10-12 Thread Peter Blaha

 Forwarded message 
Subject:    Postdoc ad mailing list
Date:   Mon, 12 Oct 2020 09:58:21 -0700
From:   Antia S. Botana 
To: Peter Blaha 



Postdoctoral position in Theoretical Condensed Matter Physics.

The Department of Physics at Arizona State University invites 
applications for a post-doctoral position in Condensed Matter Theory in 
the group of Prof. Antia Botana. Starting date is as early as January 
1st 2021. The position is for one year with the possibility of renewal 
up to one more year. Preference will be given to candidates with 
expertise and strong interests in density functional theory, dynamical 
mean-field theory, strongly correlated electron systems, and magnetic 
properties. Full consideration will be given to applications received 
before November 15, 2020. Applications will be considered thereafter 
until the position is filled. Applications should include a CV (with 
list of publications) and a statement of research interests. 
Additionally, candidates must arrange for two letters of recommendation 
to be submitted. To apply, email your documents to antia.bot...@asu.edu 
. Further information about the position 
can be obtained from Antia Botana, antia.bot...@asu.edu 



Thanks in advance!
Antia
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Christian Søndergaard Pedersen
Thanks for the suggestion regarding the .processes file; this will probably 
come in handy at a later stage. Regarding the qtl program, my end goal is to 
calculate an ELNES spectrum for the structures I am investigating. To this end, 
is there any difference between running 'x lapw2 -p -qtl' and 'x qtl -p 
-telnes' (assuming case.innes is present)? Specifically, will the workflow:


script 1:

run_lapw -p

x qtl -telnes

x telnes3


perform the same task as this:


script 1:

run_lapw


followed by script 2:

x lapw1 -p -d >&/dev/null

x lapw2 -p -qtl

x telnes3

My question is both related to the parallelization schemes (assuming I use the 
same setup of nodes and generate the .machines file in the same way) and 
related to the behaviour of the programs (will lapw2 -p -qtl calculate suitable 
input files for telnes3, assuming case.innes is present)? I realize that 
telnes3 can probably be run locally, but I included the command for 
completeness.

Best regards
Christian

From: Wien  on behalf of Peter Blaha 

Sent: 12 October 2020 11:58:22
To: wien@zeus.theochem.tuwien.ac.at
Subject: Re: [Wien] .machines for several nodes

Yes, this is ok when you have nodes with 16 cores !!!

(Only the lapw0 line could use :16 instead of 8 if you have 96 atoms,
but most likely this is fairly negligible).

Yes, the QTL calculation in lapw2 is also affected by the
parallelization, but it reads from a .processes file, which is created
by lapw1.

If you run x lapw2 -p -qtl   in an extra job, you should add the
following line to create a "correct" .processes file:

x lapw1 -p -d >&/dev/null  # Create .processes (necessary for
standalone-lapw2)

On 10/12/20 11:45 AM, Christian Søndergaard Pedersen wrote:
> This went a long way towards clearing up my confusion, thanks again. I
> will try starting an MPI-parallel calculation for 4 nodes with 16 cores
> each using the following .machines-file:
>
> 1:g008:16
> 1:g021:16
> 1:g025:16
> 1:g028:16
> lapw0: g008:8 g021:8 g025:8 g028:8
>
> dstart: g008:8 g021:8 g025:8 g028:8
>
>
> ... and see how it performs. If the matrix sizes are small, I understand
> that I could also have each node work on 2 (or more) k-points at the
> same time, by specifying:
>
>
> 1:g008:8
> 1:g008:8
> 1:g021:8
> 1:g021:8
> 1:g025:8
> 1:g025:8
> 1:g028:8
> 1:g028:8
>
> so that for instance g008 will work on 2 kpoints using 8 cores for each
> k point, am I right? And a (hopefully) final question, since qtl
> according to the manual runs in k-point parallel, is it also affected by
> the parallellization scheme specified for lapw1 and lapw2 (unless I
> deliberately change it)?
>
>
>
> 
> From: Wien  on behalf of Ruh,
> Thomas 
> Sent: 12 October 2020 10:59:09
> To: A Mailing list for WIEN2k users
> Subject: Re: [Wien] .machines for several nodes
>
> I am afraid, there is still some confusion.
>
>
> First about lapw1:
>
> Sorry for my unclear statement - I meant that you need one line per
> k-parallel job in the sense that #lines k-points are run simultaneously,
> i.e. if you specify this part of the machines file like this:
>
>
> 1:g008:16
>
> 1:g021:16
>
> 1:g025:16
>
> 1:g028:16
>
>
> your k-point list will be split into 4 parts of 56 k-points each [1],
> which will be processed step-by-step. Node g008 will work on its first
> k-point, while node g021 will do the same for its first k-point, and so on.
>
> You need the ":16" after the name of the node. Otherwise, on every node
> only one core would be used. Whether it is useful to use 16 mpi-parallel
> jobs per k-point (meaning that the matrices will be distributed on 16 cores
> with each core getting only 1/16 of the matrix elements) depends on your
> matrix sizes (which in turn depend on your rkmax). You should check that
> by grepping :rkm in your case.scf file. If the matrix size there is
> small, using OMP_NUM_THREADS 16 might be much faster (since MPI adds
> overhead to your calculation).
>
>
>
> Regarding lapw0/dstart:
>
> The way you set the calculation up could lead to (possibly severe)
> overloading of your nodes: WIEN2k will start 24 jobs on each node (so
> 1.5 times the number of cores) at the same time doing the calculation
> for 1 atom each.
>
> As one possible alternative, you could specify only 8 cores per node (e.g.
> "lapw0: g008:8" and so on), i.e. 8 jobs per node, which would lead to
> step-by-step calculations for 3 atoms per core.
>
> Which option is faster is hard to tell and depends a lot on your hardware.
>
>
> So what you could do - in principle - is to test multiple configurations
> (you can modify your .machines file on the fly during a SCF run) in the
> first cycles, compare the times (in case.dayfile), and use the faster
> one for the rest of the run.
>
>
>
> Regards,
> Thomas
>
>
> [1] Sidenote: This splitting is controlled by the first number - in this
> case 4 equal sublists will be set up - you could also specify diff

Re: [Wien] .machines for several nodes

2020-10-12 Thread Christian Søndergaard Pedersen
This went a long way towards clearing up my confusion, thanks again. I will try 
starting an MPI-parallel calculation for 4 nodes with 16 cores each using the 
following .machines-file:


1:g008:16
1:g021:16
1:g025:16
1:g028:16
lapw0: g008:8 g021:8 g025:8 g028:8

dstart: g008:8 g021:8 g025:8 g028:8


... and see how it performs. If the matrix sizes are small, I understand that I 
could also have each node work on 2 (or more) k-points at the same time, by 
specifying:


1:g008:8
1:g008:8
1:g021:8
1:g021:8
1:g025:8
1:g025:8
1:g028:8
1:g028:8

so that for instance g008 will work on 2 kpoints using 8 cores for each k 
point, am I right? And a (hopefully) final question, since qtl according to the 
manual runs in k-point parallel, is it also affected by the parallelization 
scheme specified for lapw1 and lapw2 (unless I deliberately change it)?




From: Wien  on behalf of Ruh, Thomas 

Sent: 12 October 2020 10:59:09
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes


I am afraid, there is still some confusion.


First about lapw1:

Sorry for my unclear statement - I meant that you need one line per k-parallel 
job in the sense that #lines k-points are run simultaneously, i.e. if you 
specify this part of the machines file like this:


1:g008:16

1:g021:16

1:g025:16

1:g028:16


your k-point list will be split into 4 parts of 56 k-points each [1], which 
will be processed step-by-step. Node g008 will work on its first k-point, while 
node g021 will do the same for its first k-point, and so on.

You need the ":16" after the name of the node. Otherwise, on every node only 
one core would be used. Whether it is useful to use 16 mpi-parallel jobs per k-point 
(meaning that the matrices will be distributed on 16 cores with each core getting 
only 1/16 of the matrix elements) depends on your matrix sizes (which in turn 
depend on your rkmax). You should check that by grepping :rkm in your case.scf 
file. If the matrix size there is small, using OMP_NUM_THREADS 16 might be much 
faster (since MPI adds overhead to your calculation).



Regarding lapw0/dstart:

The way you set the calculation up could lead to (possibly severe) overloading 
of your nodes: WIEN2k will start 24 jobs on each node (so 1.5 times the number 
of cores) at the same time doing the calculation for 1 atom each.

As one possible alternative, you could specify only 8 cores per node (e.g. 
"lapw0: g008:8" and so on), i.e. 8 jobs per node, which would lead to 
step-by-step calculations for 3 atoms per core.

Which option is faster is hard to tell and depends a lot on your hardware.


So what you could do - in principle - is to test multiple configurations (you 
can modify your .machines file on the fly during a SCF run) in the first 
cycles, compare the times (in case.dayfile), and use the faster one for the 
rest of the run.



Regards,
Thomas


[1] Sidenote: This splitting is controlled by the first number - in this case 4 
equal sublists will be set up - you could also specify different "weights", 
for instance, if your nodes are of different speeds, the machinesfile could 
then read for example:


3:g008:16

2:g021:16

2:g025:16

1:g028:16


In this case, the first node would "get" 3/8 of the k-points (84), nodes g021 
and g025 would get 2/8 each (56), and the last one (because it is very slow) 
would get only 28 k-points.



From: Wien  on behalf of Christian 
Søndergaard Pedersen 
Sent: Monday, 12 October 2020 10:24
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes


Thanks a lot for your answer. After re-reading the relevant pages in the User 
Guide, I am still left with some questions. Specifically, I am working with a 
system containing 96 atoms (as described in the case.struct-file) and 224 
inequivalent k points; i.e. 500 kpoints distributed as a 7x8x8 grid (448 total) 
reduced to 224 kpoints. Running on 4 nodes each with 16 cores, I want each of 
the 4 nodes to calculate 56 k points (224/4 = 56). Meanwhile, each node should 
handle 24 atoms (96/4 = 24).


Part of my confusion stems from your suggestion that I repeat the line 
"1:g008:4 [...]" a number of times equal to the number of k points I want to 
run in parallel, and that each repetition should refer to a different node. The 
reason is that the line in question already contains the names of all four 
nodes that were assigned to the job. However, combining your advice with the 
example on page 86, the lines should read:


1:g008

1:g021

1:g025

1:g028 # k points distributed over 4 jobs, running on 1 node each

extrafine:1


As for the parallelization over atoms for dstart and lapw0, I understand that 
the numbers assigned to each individual node should sum up to the number of 
atoms in the system, like this:


dstart:g008:24 g021:24 g025:24 g028:24

lapw0:g008:24 g021:24 g025:24 g028:24


so the final .machines-file would be a combination of the abov

Re: [Wien] optimization - 2D materials

2020-10-12 Thread Laurence Marks
Obviously you cannot optimize "c"; this is not meaningful. You will need to
vary a=b carefully by hand (not automated), use something like Excel or
Sheets to plot the energy, and also vary the internal parameters (MSR1a
or perhaps PORT). Be careful in your choice of symmetry as some 2D
materials have rumples along c.

_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Mon, Oct 12, 2020, 08:34 Brik Hamida  wrote:

> Hi
>
> I'm working on a 2D material (a=b and the vacuum is along the z direction).
> Can you help me with how to optimize the structure? I think the steps are not
> the same as those already used for the bulk. Thanks.
>
> best regards
> ___
> Wien mailing list
> Wien@zeus.theochem.tuwien.ac.at
>
> https://urldefense.com/v3/__http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien__;!!Dq0X2DkFhyF93HkjWTBQKhk!GOghCgX3UefSKpUbaX6F7HBV_aL0F5AV1o4pJTL9QvQ3Z6cRl0hBoXTQUSKh4r42VNXuyA$
> SEARCH the MAILING-LIST at:
> https://urldefense.com/v3/__http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html__;!!Dq0X2DkFhyF93HkjWTBQKhk!GOghCgX3UefSKpUbaX6F7HBV_aL0F5AV1o4pJTL9QvQ3Z6cRl0hBoXTQUSKh4r6OdpMhJQ$
>
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


[Wien] optimization - 2D materials

2020-10-12 Thread Brik Hamida
Hi

I'm working on a 2D material (a=b and the vacuum is along the z direction). Can
you help me with how to optimize the structure? I think the steps are not the
same as those already used for the bulk. Thanks.

best regards
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Christian Søndergaard Pedersen
Thanks a lot for your answer. After re-reading the relevant pages in the User 
Guide, I am still left with some questions. Specifically, I am working with a 
system containing 96 atoms (as described in the case.struct-file) and 224 
inequivalent k points; i.e. 500 kpoints distributed as a 7x8x8 grid (448 total) 
reduced to 224 kpoints. Running on 4 nodes each with 16 cores, I want each of 
the 4 nodes to calculate 56 k points (224/4 = 56). Meanwhile, each node should 
handle 24 atoms (96/4 = 24).


Part of my confusion stems from your suggestion that I repeat the line 
"1:g008:4 [...]" a number of times equal to the number of k points I want to 
run in parallel, and that each repetition should refer to a different node. The 
reason is that the line in question already contains the names of all four 
nodes that were assigned to the job. However, combining your advice with the 
example on page 86, the lines should read:


1:g008

1:g021

1:g025

1:g028 # k points distributed over 4 jobs, running on 1 node each

extrafine:1


As for the parallelization over atoms for dstart and lapw0, I understand that 
the numbers assigned to each individual node should sum up to the number of 
atoms in the system, like this:


dstart:g008:24 g021:24 g025:24 g028:24

lapw0:g008:24 g021:24 g025:24 g028:24


so the final .machines-file would be a combination of the above pieces. Have I 
understood this correctly, or am I missing the mark? Also, is there any 
difference between distributing the k points across four jobs (1 for each 
node), and across 224 jobs (by repeating each of the 1:gxxx lines 56 times)?


Best regards

Christian


From: Wien  on behalf of Ruh, Thomas 

Sent: 12 October 2020 09:29:37
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes


Hi,


your .machines is wrong.


The nodes for lapw1 are prefaced not with "lapw1:" but only with "1:". lapw2 
needs no line, as it takes the same nodes as lapw1 before.


So an example for your usecase would be:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

1:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The line starting with "1:" has to be repeated (with different nodes, of 
course) x times, if you want to run x k-points in parallel (you can find more 
details about this in the usersguide, pages 84-91).


Regards,

Thomas


PS: As a sidenote: Both dstart and lapw0 parallelize over atoms, so 16 nodes 
might not be the best choice for your example.


From: Wien  on behalf of Christian 
Søndergaard Pedersen 
Sent: Monday, 12 October 2020 09:06
To: wien@zeus.theochem.tuwien.ac.at
Subject: [Wien] .machines for several nodes


Hello everybody


I am new to WIEN2k, and am struggling with parallelizing calculations on our 
HPC cluster beyond what can be achieved using OMP. In particular, I want to 
execute run_lapw and/or runsp_lapw running on four identical nodes (16 cores 
each), parallelizing over k points (unless there's a more efficient scheme). 
To achieve this, I try to mimic the example from the User Guide (without the 
extra Alpha node), but my .machines-file does not work the way I intended. This 
is what I have:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

lapw1:g008:4 g021:4 g025:4 g028:4

lapw2:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The node names gxxx are read from SLURM_JOB_NODELIST in the submit script, and 
a couple of regular expressions generate the above lines. Afterwards, my job 
script does the following:


srun hostname -s > slurm.hosts
run_lapw -p

which results in a job that idles for the entire walltime and finishes with a 
CPU efficiency of 0.00%. I would appreciate any help in figuring out where I've 
gone wrong.


Best regards
Christian
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


[Wien] .machines for several nodes

2020-10-12 Thread Christian Søndergaard Pedersen
Hello everybody


I am new to WIEN2k, and am struggling with parallelizing calculations on our 
HPC cluster beyond what can be achieved using OMP. In particular, I want to 
execute run_lapw and/or runsp_lapw running on four identical nodes (16 cores 
each), parallelizing over k points (unless there's a more efficient scheme). 
To achieve this, I try to mimic the example from the User Guide (without the 
extra Alpha node), but my .machines-file does not work the way I intended. This 
is what I have:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

lapw1:g008:4 g021:4 g025:4 g028:4

lapw2:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The node names gxxx are read from SLURM_JOB_NODELIST in the submit script, and 
a couple of regular expressions generate the above lines. Afterwards, my job 
script does the following:


srun hostname -s > slurm.hosts
run_lapw -p

which results in a job that idles for the entire walltime and finishes with a 
CPU efficiency of 0.00%. I would appreciate any help in figuring out where I've 
gone wrong.


Best regards
Christian
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Peter Blaha

Yes, this is ok when you have nodes with 16 cores !!!

(Only the lapw0 line could use :16 instead of 8 if you have 96 atoms, 
but most likely this is fairly negligible).


Yes, the QTL calculation in lapw2 is also affected by the 
parallelization, but it reads from a .processes file, which is created 
by lapw1.


If you run x lapw2 -p -qtl   in an extra job, you should add the 
following line to create a "correct" .processes file:


x lapw1 -p -d >&/dev/null  # Create .processes (necessary for 
standalone-lapw2)


On 10/12/20 11:45 AM, Christian Søndergaard Pedersen wrote:
This went a long way towards clearing up my confusion, thanks again. I 
will try starting an MPI-parallel calculation for 4 nodes with 16 cores 
each using the following .machines-file:


1:g008:16
1:g021:16
1:g025:16
1:g028:16
lapw0: g008:8 g021:8 g025:8 g028:8

dstart: g008:8 g021:8 g025:8 g028:8


... and see how it performs. If the matrix sizes are small, I understand 
that I could also have each node work on 2 (or more) k-points at the 
same time, by specifying:



1:g008:8
1:g008:8
1:g021:8
1:g021:8
1:g025:8
1:g025:8
1:g028:8
1:g028:8

so that for instance g008 will work on 2 kpoints using 8 cores for each 
k point, am I right? And a (hopefully) final question, since qtl 
according to the manual runs in k-point parallel, is it also affected by 
the parallelization scheme specified for lapw1 and lapw2 (unless I 
deliberately change it)?





From: Wien  on behalf of Ruh, 
Thomas 

Sent: 12 October 2020 10:59:09
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes

I am afraid, there is still some confusion.


First about lapw1:

Sorry for my unclear statement - I meant that you need one line per 
k-parallel job in the sense that #lines k-points are run simultaneously, 
i.e. if you specify this part of the machines file like this:



1:g008:16

1:g021:16

1:g025:16

1:g028:16


your k-point list will be split into 4 parts of 56 k-points each [1], 
which will be processed step-by-step. Node g008 will work on its first 
k-point, while node g021 will do the same for its first k-point, and so on.


You need the ":16" after the name of the node. Otherwise, on every node 
only one core would be used. Whether it is useful to use 16 mpi-parallel 
jobs per k-point (meaning that the matrices will be distributed on 16 cores 
with each core getting only 1/16 of the matrix elements) depends on your 
matrix sizes (which in turn depend on your rkmax). You should check that 
by grepping :rkm in your case.scf file. If the matrix size there is 
small, using OMP_NUM_THREADS 16 might be much faster (since MPI adds 
overhead to your calculation).




Regarding lapw0/dstart:

The way you set the calculation up could lead to (possibly severe) 
overloading of your nodes: WIEN2k will start 24 jobs on each node (so 
1.5 times the number of cores) at the same time doing the calculation 
for 1 atom each.


As one possible alternative, you could specify only 8 cores per node (e.g. 
"lapw0: g008:8" and so on), i.e. 8 jobs per node, which would lead to 
step-by-step calculations for 3 atoms per core.


Which option is faster is hard to tell and depends a lot on your hardware.


So what you could do - in principle - is to test multiple configurations 
(you can modify your .machines file on the fly during a SCF run) in the 
first cycles, compare the times (in case.dayfile), and use the faster 
one for the rest of the run.




Regards,
Thomas


[1] Sidenote: This splitting is controlled by the first number - in this 
case 4 equal sublists will be set up - you could also specify different 
"weights", for instance, if your nodes are of different speeds, the 
machinesfile could then read for example:



3:g008:16

2:g021:16

2:g025:16

1:g028:16


In this case, the first node would "get" 3/8 of the k-points (84), nodes 
g021 and g025 would get 2/8 each (56), and the last one (because it is 
very slow) would get only 28 k-points.




From: Wien  on behalf of 
Christian Søndergaard Pedersen 

Sent: Monday, 12 October 2020 10:24
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes

Thanks a lot for your answer. After re-reading the relevant pages in the 
User Guide, I am still left with some questions. Specifically, I am 
working with a system containing 96 atoms (as described in the 
case.struct-file) and 224 inequivalent k points; i.e. 500 kpoints 
distributed as a 7x8x8 grid (448 total) reduced to 224 kpoints. Running 
on 4 nodes each with 16 cores, I want each of the 4 nodes to calculate 
56 k points (224/4 = 56). Meanwhile, each node should handle 24 atoms 
(96/4 = 24).



Part of my confusion stems from your suggestion that I repeat the line 
"1:g008:4 [...]" a number of times equal to the number of k points I 
want to run in parallel

Re: [Wien] Segmentation fault in w2w

2020-10-12 Thread Peter Blaha

Thank you very much for your report.

Next version will include these changes.

Best regards
Peter Blaha

On 10/6/20 1:24 AM, Niraj Aryal wrote:

Dear all,

Thank you all for your suggestions and for guiding me to the right 
directions.

I was able to solve the problem of segmentation fault in w2w.
It seems like this problem is not reproducible on all platforms, as 
verified by Prof. Rubel.


If you encounter such a problem, please apply the following patch for 
modules_rc.F file located in the SRC_w2w directory:


434c434,435
<   complex(C16) :: projection(inwf%bmax-inwf%bmin+1, inwf%Nproj, num_kpts)
---
>   ! making the projection array dynamic solved the seg fault
>   complex(C16), allocatable :: projection(:,:,:)
448a450,452
>
>   allocate(projection(Nb,inwf%Nproj, num_kpts))
>
661a666,668
>   !deallocate projection array
>   deallocate(projection)
>

i.e. please make the projection array dynamic.

Thank you.

Sincerely,
Niraj Aryal
Research Associate
Brookhaven National Lab.
Upton, NY



On Sat, Sep 12, 2020 at 5:15 PM Laurence Marks > wrote:


...please add -g ... (not -f, a typo).

On Sat, Sep 12, 2020 at 11:26 AM Laurence Marks
mailto:laurence.ma...@gmail.com>> wrote:

Is this compiled with -g ? If not, please add -f, recompile and
then repeat. (The reason is that optimizations can lead to
segmentation faults appearing at an inappropriate location in
the code.) If with -g it is still the same, please add before
line 449 of  modules_rc.F a write command, i.e. so it has:

   Nb = inwf%bmax-inwf%bmin+1
   write(*,*) Nb, inwf%Nproj, num_kpts
   projection = 0

I suspect that num_kpts is wrong, so the dimensions of
projection are incorrect. However, segmentation errors can be
hard to locate.

On Sat, Sep 12, 2020 at 10:20 AM Niraj Aryal
mailto:debonairni...@gmail.com>> wrote:

Thank you all for your suggestions.
I tried your suggestions but so far, the problem remains.

As per Prof. Rubel's request, I will share with you my
struct file privately along with the steps. Please watch out
for my email (aryalnir...@gmail.com
).
To simplify, I was able to reproduce the problem for the
paramagnetic case without spin-orbit coupling.

Prof. Marks, I have -traceback option in my compilation.
These are the line numbers where the seg fault occurs:

w2w                00432427  l2amn_m_mp_l2amn_         449  modules_rc.F
w2w                0042D68E  MAIN__                    245  main.f


This made me believe that the problem is in modules_rc.F
file in the amn routine.

Thank you for the patch link Gavin.
I applied modules_rc.patch but the problem persists.

I will continue trying to solve this problem. I will keep
you updated if something new comes up.
I look forward to your suggestions and feedback.

Sincerely,
Niraj Aryal

On Fri, Sep 11, 2020 at 8:08 PM Gavin Abo
mailto:gs...@crimson.ua.edu>> wrote:

Not sure if it is related, but are you using the w2w fix
that Jindrich previously posted for WIEN2k 19.2 [

https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19849.html


] .  Or modules_rc.patch and modules_rc_wplot.patch
related to w2web if you prefer to try them are at:

https://github.com/gsabo/WIEN2k-Patches/tree/master/19.2



On 9/11/2020 12:02 PM, Niraj Aryal wrote:

Dear w2w experts and developers,

I am using Wien2k version 19.2 in scientific linux 7.5
using intel compilers (2018).

I am trying to wannierize f-electron system with
antiferromagnetic magnetic ordering using w2w version
2.0 in the presence of SOC.

After self-consistent calculations, these are the
steps I am following for the wannierisation:

init_w2w (to write case.klist, case.inwf, case.win etc)

 x lapw1 -up -p
 x lapw1 -dn -p
 x lapwso -up -orb -p

x w2w -so -up -p  --> segmentation fault here
 

Re: [Wien] .machines for several nodes

2020-10-12 Thread Peter Blaha




On 10/12/20 10:24 AM, Christian Søndergaard Pedersen wrote:
Thanks a lot for your answer. After re-reading the relevant pages in the 
User Guide, I am still left with some questions. Specifically, I am 
working with a system containing 96 atoms (as described in the 
case.struct-file) and 224 inequivalent k points; i.e. 500 kpoints 
distributed as a 7x8x8 grid (448 total) reduced to 224 kpoints. Running 
on 4 nodes each with 16 cores, I want each of the 4 nodes to calculate 
56 k points (224/4 = 56). Meanwhile, each node should handle 24 atoms 
(96/4 = 24).


lapw1/2 does not really parallelize over "atoms" but over APWs. For any 
single k-point (and matrix element ij) you need to sum over all atoms.


You want to distribute 56 k-points to each of your nodes (therefore 4 
lines) and can probably use all cores of each node for each job (from 
your first .machines file I assumed you have 4 cores/node ??)






As for the parallelization over atoms for dstart and lapw0, I 
understand that the numbers assigned to each individual node should sum 
up to the number of atoms in the system, like this:



dstart:g008:24 g021:24 g025:24 g028:24


Yes, this line would span 96 mpi processes. However, the main question 
is what kind of nodes you have. How many cores (real not virtual) does 
each node have?  It does NOT make sense to overload a node heavily.


so the final .machines-file would be a combination of the above pieces. 
Have I understood this correctly, or am I missing the mark? Also, is 
there any difference between distributing the k points across four jobs 
(1 for each node), and across 224 jobs (by repeating each of the 1:gxxx 
lines 56 times)?


In "principle yes", but in practice: NO WAY !!

a) Do not overload your nodes. Spanning more processes on a single node 
than it has cores is not really beneficial in most cases.


b) Each parallelization has a certain overhead, and if you make a stupid 
parallelization, it can easily happen that your calculation runs 10 (or 
more) times SLOWER than in a less highly parallel (or even sequential) mode.
Even if you have 224 cores available, parallelization over 224 k-points 
would mean that all 224 jobs need a certain startup time and then try to 
read/write from your filesystem at the same time and this would most 
likely produce a tremendous overhead.


c) For an inexperienced user I'd suggest to
 i) learn the details of your hardware (cores/node; filesystem (is 
there a local scratch?), network speed, ...)
 ii) start out with medium parallelization and monitor the timing 
(case.dayfile, but also case.output1_*); see the example below this list. 
In the "ideal" world, using 2 cores instead of 1 should give a speedup of 2. 
If it does, increase the cores until you see a significant decrease of the 
speedup (but stop for sure before an INCREASE of run-time ("wall-time") occurs).
 iii) These considerations depend on the size of your calculations 
(large cases (a few hundred atoms) can run on 512 or more cores, our 
simple TiC "getting started" example only on 2-4 cores).
 iv) Reconsider your input: RKMAX (adapted to your elements/sphere 
sizes) and k-points. I would for instance NEVER start a 96 atom cell 
with 224 k-points, but probably with ONE !! (insulator) or maybe 10-64 
(metal). Once scf (and force minimization, ...) is reached, save_lapw 
and increase the k-mesh for checking convergence.
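
(An illustrative example of such a timing check, with made-up numbers: if one 
SCF cycle takes 600 s on 1 core and 330 s on 2 cores, the speedup is 600/330, 
i.e. about 1.8, or roughly 90% parallel efficiency, so doubling the cores was 
worthwhile. If 8 cores only bring it down to 250 s, the speedup is 2.4, i.e. 
30% efficiency, and most of the extra cores are wasted.)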






Best regards

Christian


From: Wien  on behalf of Ruh, 
Thomas 

Sent: 12 October 2020 09:29:37
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes

Hi,


your .machines is wrong.


The nodes for lapw1 are prefaced not with "lapw1:" but only with "1:". 
lapw2 needs no line, as it takes the same nodes as lapw1 before.



So an example for your usecase would be:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

1:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The line starting with "1:" has to be repeated (with different nodes, of 
course) x times, if you want to run x k-points in parallel (you can find 
more details about this in the usersguide, pages 84-91).



Regards,

Thomas


PS: As a sidenote: Both dstart and lapw0 parallelize over atoms, so 
16 nodes might not be the best choice for your example.



From: Wien  on behalf of 
Christian Søndergaard Pedersen 

Sent: Monday, 12 October 2020 09:06
To: wien@zeus.theochem.tuwien.ac.at
Subject: [Wien] .machines for several nodes

Hello everybody


I am new to WIEN2k, and am struggling with parallelizing calculations 
on our HPC cluster beyond what can be achieved using OMP. In particular, 
I want to execute run_lapw and/or runsp_lapw running on four identical 
nodes (16 cores each), parallelizing over k points (unless there's a 
more efficient scheme). To achieve this, I try to mimic the example from 
the

Re: [Wien] .machines for several nodes

2020-10-12 Thread Ruh, Thomas
I am afraid, there is still some confusion.


First about lapw1:

Sorry for my unclear statement - I meant that you need one line per k-parallel 
job in the sense that #lines k-points are run simultaneously, i.e. if you 
specify this part of the machines file like this:


1:g008:16

1:g021:16

1:g025:16

1:g028:16


your k-point list will be split into 4 parts of 56 k-points each [1], which 
will be processed step-by-step. Node g008 will work on its first k-point, while 
node g021 will do the same for its first k-point, and so on.

You need the ":16" after the name of the node. Otherwise, on every node only 
one core would be used. Whether it is useful to use 16 mpi-parallel jobs per k-point 
(meaning that the matrices will be distributed on 16 cores with each core getting 
only 1/16 of the matrix elements) depends on your matrix sizes (which in turn 
depend on your rkmax). You should check that by grepping :rkm in your case.scf 
file. If the matrix size there is small, using OMP_NUM_THREADS 16 might be much 
faster (since MPI adds overhead to your calculation).
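
(For example, how you would set this depends on your shell and job script; a 
minimal sketch:)

export OMP_NUM_THREADS=16    # in a bash/sh job script
setenv OMP_NUM_THREADS 16    # in csh/tcsh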



Regarding lapw0/dstart:

The way you set the calculation up could lead to (possibly severe) overloading 
of your nodes: WIEN2k will start 24 jobs on each node (so 1.5 times the number 
of cores) at the same time doing the calculation for 1 atom each.

As one possible alternative, you could specify only 8 cores per node (e.g. 
"lapw0: g008:8" and so on), i.e. 8 jobs per node, which would lead to 
step-by-step calculations for 3 atoms per core.

Which option is faster is hard to tell and depends a lot on your hardware.


So what you could do - in principle - is to test multiple configurations (you 
can modify your .machines file on the fly during a SCF run) in the first 
cycles, compare the times (in case.dayfile), and use the faster one for the 
rest of the run.



Regards,
Thomas


[1] Sidenote: This splitting is controlled by the first number - in this case 4 
equal sublists will be set up - you could also specify different "weights", 
for instance, if your nodes are of different speeds, the machinesfile could 
then read for example:


3:g008:16

2:g021:16

2:g025:16

1:g028:16


In this case, the first node would "get" 3/8 of the k-points (84), nodes g021 
and g025 would get 2/8 each (56), and the last one (because it is very slow) 
would get only 28 k-points.



From: Wien  on behalf of Christian 
Søndergaard Pedersen 
Sent: Monday, 12 October 2020 10:24
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes


Thanks a lot for your answer. After re-reading the relevant pages in the User 
Guide, I am still left with some questions. Specifically, I am working with a 
system containing 96 atoms (as described in the case.struct-file) and 224 
inequivalent k points; i.e. 500 kpoints distributed as a 7x8x8 grid (448 total) 
reduced to 224 kpoints. Running on 4 nodes each with 16 cores, I want each of 
the 4 nodes to calculate 56 k points (224/4 = 56). Meanwhile, each node should 
handle 24 atoms (96/4 = 24).


Part of my confusion stems from your suggestion that I repeat the line 
"1:g008:4 [...]" a number of times equal to the number of k points I want to 
run in parallel, and that each repetition should refer to a different node. The 
reason is that the line in question already contains the names of all four 
nodes that were assigned to the job. However, combining your advice with the 
example on page 86, the lines should read:


1:g008

1:g021

1:g025

1:g028 # k points distributed over 4 jobs, running on 1 node each

extrafine:1


As for the parallelization over atoms for dstart and lapw0, I understand that 
the numbers assigned to each individual node should sum up to the number of 
atoms in the system, like this:


dstart:g008:24 g021:24 g025:24 g028:24

lapw0:g008:24 g021:24 g025:24 g028:24


so the final .machines-file would be a combination of the above pieces. Have I 
understood this correctly, or am I missing the mark? Also, is there any 
difference between distributing the k points across four jobs (1 for each 
node), and across 224 jobs (by repeating each of the 1:gxxx lines 56 times)?


Best regards

Christian


From: Wien  on behalf of Ruh, Thomas 

Sent: 12 October 2020 09:29:37
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes


Hi,


your .machines is wrong.


The nodes for lapw1 are prefaced not with "lapw1:" but only with "1:". lapw2 
needs no line, as it takes the same nodes as lapw1 before.


So an example for your usecase would be:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

1:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The line starting with "1:" has to be repeated (with different nodes, of 
course) x times, if you want to run x k-points in parallel (you can find more 
details about this in the usersguide, pages 84-91).


Regards,

Thomas


PS: As a sid

Re: [Wien] NOMAD and Wien2k

2020-10-12 Thread Peter Blaha
Well, I agree that at present NOMAD (the "repository") might be useful as a 
"personal storage tool" (although any external 2TB harddisk will do this 
for 60 Euros) or as a "sharing tool" between 2 groups, if you tell your 
partner the details of how you uploaded the data, so that he can find "your" 
data easily.



Retrieving "real" information either from these data or from 
the "encyclopedia" seems difficult.
As a test I found e.g. for Au in the repository calculations by 9 
different codes, but in the encyclopedia only results for 3 codes are 
listed (VASP, exciting and FHI-aims). It seems they only parse a few 
codes, the rest is only in the repository, and if I search for one compound, 
I would usually get hundreds of results, but don't know which one I 
should actually take.


PS: The DOS/bands problem could be circumvented if you "save_lapw 
-dos/-band/-eels/-optic/-xspec" with the same name as the scf run.


On 10/9/20 11:56 AM, Pavel Ondračka wrote:

Dear Wien2k mailing list,

I'm experimenting with the NOMAD database (nomad-lab.eu) and since I
remembered some old post from prof. Blaha on this topic, I just thought
I would ask here for user experience, because so far its not really
working that well for me.

So first of all the most annoying thing is that NOMAD detects two
"mainfiles" per directory, specifically the scf and scf0 files, but as
far as I can see it can't parse anything useful from the scf0 file (not
even the potential). The main downside is that it thinks there are two
calculations, while there is just one in fact and therefore creates a lot
of useless entries.

Another thing which I'm not sure how to approach is what to do when the
scf file is missing. I often do the main scf loop, then save_lapw to
another directory and after that I generate a new denser k-grid (for
DOS or optics) and just run lapw1, 2 and the tetra (or optic) or so on.
In this case I don't have scf file in the directory with the DOS or
optic calculations, but I would still like to upload this. The upload
somehow works, because the scf0 file from the old run stays there, so
at least NOMAD recognizes there are some Wien2k data, but it really
can't parse anything from the scf0 file (and in general I think that
using the scf0 file is a bug). I can make it somehow detect some
metadata by artificially creating a new fake scf file by combining the
old and new scfxxx files with cat... so that at least the NOMAD can
detect the composition, potential and some other basic things, but this
is clearly not optimal.

In general the NOMAD Wien2k parser cannot detect pretty much anything
beyond maybe structure information, and I will try to report all this
to the NOMAD developers to improve the parser, but I would be curious
if someone here had some better experience or can share some tricks,
how to make it work better for Wien2k with the current NOMAD state.

Best regards
Pavel

___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html



--

  P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
--
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] finding density of states for individual bands

2020-10-12 Thread Peter Blaha
Actually, this should not be a problem for a dense k-mesh. Dense meshes 
have mostly "general k-points" (kx,ky,kz all different) and thus no 
symmetry. Such "bands" always split due to the non-crossing rule since 
all eigenvalues have the same irrep.


Crossing-/anti-crossing is important only along high-symmetry 
lines/points, but a "DOS" originates mostly from the volume of the BZ, 
not from high-symmetry points.


PS: I must however admit that I'm rather skeptical of "bandwise 
analysis". I'd prefer "energy ranges" for analysis (analyse the partial 
DOS for dominating regions of a certain partial DOS and run lapw2 
(maybe EFG-split) for this energy region).


This could produce an analysis (e.g. for a distorted octahedron) like: 
The distorted "t2g"-like bands contribute by +2 V/m**2 to the EFG, while 
the "eg"-like states contribute with -4 V/m**2, ..


On 10/12/20 9:17 AM, Pavel Ondračka wrote:

The problematic part is that while joint claims it can give you a DOS
for just one band (with the switch 2), this is actually not a DOS of a
single band but of a single band index. This will be the same thing
only if there is no band crossing (the difference will be obvious if
you do energy band structure plot with x spaghetti with and without
running x irrep before).

Best regards
Pavel

On Sun, 2020-10-11 at 05:58 +, Lee, Yongbin [A LAB] wrote:

I guess you can do it with "joint".
Check *.injoint which is at page 170 in UG.

Yongbin

From: Wien  on behalf of
Joseph Ross 
Sent: Saturday, October 10, 2020 4:40 PM
To: wien@zeus.theochem.tuwien.ac.at 
Subject: [Wien] finding density of states for individual bands
  
We have a semimetallic system which has an indirect overlap of some

rather convoluted bands at Ef. In order to better understand the
holes vs. electrons in this system we would like to find the density
of states (and partial densities if possible) associated with
individual bands, rather than the total. From my understanding &
reading through the users guide, I think this is not a feature
included in wien2k. However if we are overlooking something, or if
there is a separate package that we could use to extract this type of
information, we would be interested to know. Any suggestions on this
are welcome.
-Joe Ross
-
Joseph H. Ross Jr.
Professor
Department of Physics and Astronomy
Texas A&M University
4242 TAMU
College Station TX  77843-4242
979 845 3842 / 448 MPHY
jhr...@tamu.edu / http://faculty.physics.tamu.edu/ross
-

___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html



--

  P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
--
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Ruh, Thomas
Hi,


your .machines is wrong.


The nodes for lapw1 are prefaced not with "lapw1:" but only with "1:". lapw2 
needs no line, as it takes the same nodes as lapw1 before.


So an example for your usecase would be:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

1:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The line starting with "1:" has to be repeated (with different nodes, of 
course) x times, if you want to run x k-points in parallel (you can find more 
details about this in the usersguide, pages 84-91).
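
(For example, with the four nodes above, four k-parallel lines, one per node, 
would look like:)

1:g008:4
1:g021:4
1:g025:4
1:g028:4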


Regards,

Thomas


PS: As a sidenote: Both dstart and lapw0 parallelize over atoms, so 16 nodes 
might not be the best choice for your example.


From: Wien  on behalf of Christian 
Søndergaard Pedersen 
Sent: Monday, 12 October 2020 09:06
To: wien@zeus.theochem.tuwien.ac.at
Subject: [Wien] .machines for several nodes


Hello everybody


I am new to WIEN2k, and am struggling with parallelizing calculations on our 
HPC cluster beyond what can be achieved using OMP. In particular, I want to 
execute run_lapw and/or runsp_lapw running on four identical nodes (16 cores 
each), parallelizing over k points (unless there's a more efficient scheme). 
To achieve this, I try to mimic the example from the User Guide (without the 
extra Alpha node), but my .machines-file does not work the way I intended. This 
is what I have:


#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

lapw1:g008:4 g021:4 g025:4 g028:4

lapw2:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The node names gxxx are read from SLURM_JOB_NODELIST in the submit script, and 
a couple of regular expressions generate the above lines. Afterwards, my job 
script does the following:


srun hostname -s > slurm.hosts
run_lapw -p

which results in a job that idles for the entire walltime and finishes with a 
CPU efficiency of 0.00%. I would appreciate any help in figuring out where I've 
gone wrong.


Best regards
Christian
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] .machines for several nodes

2020-10-12 Thread Peter Blaha

Please study the syntax for the .machines files.

Only dstart, lapw0 or nlvdw have specific   program:xxx entries.

The lapw1/2 parallelization is done via lines like:

speed:hostname:nodes

So you need a line like:

1:g008:4 g021:4 g025:4 g028:4

This would span ONE lapw1/2 job with mpi on 16 cores

alternatively, if you want k-parallelism, lines like

1:g008:4
1:g021:4
1:g025:4
1:g028:4

should do 4 k-parallel jobs, each running with 4 mpi-tasks.

However, since mpi diagonalization may have some overhead compared to 
sequential lapw1/2 (in particular when using SCALAPack and not ELPA), an 
alternative is to use k-point + omp parallelization:


1:g008
1:g021
1:g025
1:g028
omp_global:4
omp_lapw0:1

Regards


On 10/12/20 9:06 AM, Christian Søndergaard Pedersen wrote:

Hello everybody


I am new to WIEN2k, and am struggling with parallelizing calculations 
on our HPC cluster beyond what can be achieved using OMP. In particular, 
I want to execute run_lapw and/or runsp_lapw running on four identical 
nodes (16 cores each), parallelizing over k points (unless there's a 
more efficient scheme). To achieve this, I try to mimic the example from 
the User Guide (without the extra Alpha node), but my .machines-file 
does not work the way I intended. This is what I have:



#

dstart:g008:4 g021:4 g025:4 g028:4

lapw0:g008:4 g021:4 g025:4 g028:4

lapw1:g008:4 g021:4 g025:4 g028:4

lapw2:g008:4 g021:4 g025:4 g028:4

granularity:1

extrafine:1


The node names gxxx are read from SLURM_JOB_NODELIST in the submit 
script, and a couple of regular expressions generate the above lines. 
Afterwards, my job script does the following:



srun hostname -s > slurm.hosts
run_lapw -p

which results in a job that idles for the entire walltime and finishes 
with a CPU efficiency of 0.00%. I would appreciate any help in figuring 
out where I've gone wrong.



Best regards
Christian


___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html



--

  P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
--
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] finding density of states for individual bands

2020-10-12 Thread Pavel Ondračka
The problematic part is that while joint claims it can give you a DOS
for just one band (with the switch 2), this is actually not a DOS of a
single band but of a single band index. This will be the same thing
only if there is no band crossing (the difference will be obvious if
you do energy band structure plot with x spaghetti with and without
running x irrep before).

Best regards
Pavel

On Sun, 2020-10-11 at 05:58 +, Lee, Yongbin [A LAB] wrote:
> I guess you can do it with "joint".
> Check *.injoint which is at page 170 in UG.
> 
> Yongbin
> 
> From: Wien  on behalf of
> Joseph Ross 
> Sent: Saturday, October 10, 2020 4:40 PM
> To: wien@zeus.theochem.tuwien.ac.at 
> Subject: [Wien] finding density of states for individual bands
>  
> We have a semimetallic system which has an indirect overlap of some
> rather convoluted bands at Ef. In order to better understand the
> holes vs. electrons in this system we would like to find the density
> of states (and partial densities if possible) associated with
> individual bands, rather than the total. From my understanding &
> reading through the users guide, I think this is not a feature
> included in wien2k. However if we are overlooking something, or if
> there is a separate package that we could use to extract this type of
> information, we would be interested to know. Any suggestions on this
> are welcome.
> -Joe Ross
> -
> Joseph H. Ross Jr.
> Professor
> Department of Physics and Astronomy
> Texas A&M University
> 4242 TAMU
> College Station TX  77843-4242
> 979 845 3842 / 448 MPHY
> jhr...@tamu.edu / http://faculty.physics.tamu.edu/ross
> -
> 
> ___
> Wien mailing list
> Wien@zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:  
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
> ___
> Wien mailing list
> Wien@zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:  
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html

___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html