[Wien] "so: Undefined variable" error

2023-02-12 Thread Tim Williams via Wien
Hi,

I have upgraded to 23.1 and encountered an error running qtl

running LAPW2 in parallel mode
STOP LAPW2 - FERMI; weights written
FERMI only
0.229u 0.063s 0:00.13 215.3%0+0k 0+1312io 0pf+0w
running QTL in parallel mode
calculating QTL's from parallel vectors
so: Undefined variable.
0.015u 0.005s 0:00.01 100.0%0+0k 0+24io 0pf+0w
error: command   /home/mcem-admin/wien2k/qtlpara qtl.def   failed

The SCF ran to convergence (parallel, 6 cores on one Intel machine).

The error somewhat resembles those previously present in x_lapw, which were fixed with the 
GitHub patches. “so” is defined in x_lapw (set so).
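
The message has the form that csh/tcsh prints when a script expands a variable that was never set. One rough way to look for such a case is to scan the script for $name references that occur before any "set name". The sketch below is a heuristic only: it ignores control flow and quoting, it catches only the first variable of a multi-variable "set", and the demo script is a made-up stand-in, not the contents of qtlpara. The names undefined_csh_variables and predefined are hypothetical; environment variables the script relies on would need to be listed in predefined.

```python
import re

# Matches "set so", "set so = 1", "setenv FOO bar" and "@ i = 1" at line start.
DEFINE_RE = re.compile(r"^\s*(?:setenv|set|@)\s+([A-Za-z_]\w*)")
# Matches the loop variable in "foreach i (...)".
FOREACH_RE = re.compile(r"^\s*foreach\s+([A-Za-z_]\w*)")
# Matches $name and ${name}, but not the tests $?name and $#name.
USE_RE = re.compile(r"\$(?![?#])\{?([A-Za-z_]\w*)")


def undefined_csh_variables(script_text, predefined=("argv", "cwd", "status")):
    """Return (name, line number) for variables expanded before any definition."""
    defined = set(predefined)
    first_use = {}
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip full-line comments (and the shebang line)
        # Uses are checked against definitions from earlier lines only.
        for name in USE_RE.findall(line):
            if name not in defined and name not in first_use:
                first_use[name] = lineno
        for pattern in (DEFINE_RE, FOREACH_RE):
            match = pattern.match(line)
            if match:
                defined.add(match.group(1))
    return sorted(first_use.items(), key=lambda item: item[1])


# Made-up example script: $so is tested on line 4 but only set on line 5.
demo_script = """#!/bin/csh -f
set updn
if ($?debug) echo "debug on"
if ($so == so) echo "spin-orbit"
set so
echo $so $updn
"""
print(undefined_csh_variables(demo_script))
```

Running the same function on the text of the failing script would give candidate lines to inspect by hand; a hit is a hint, not proof, because a variable may be set in a branch the scan does not follow.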

Happy to provide more details but is anyone aware of any need for a patch in 
23.1?

Many thanks,

Tim.



---  
Dr. Tim Williams  
Transmission Electron Microscope Manager

Monash Centre for Electron Microscopy (MCEM)
Monash University 
Room 103, 10 Innovation Walk, Clayton Campus
Wellington Road
Clayton VIC 3800
Australia
T: +61 (0) 3 9902 0721  
M: +61 (0) 401 853 850
e: timothy.willi...@monash.edu


___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] Parallel execution on new Intel CPUs

2023-02-12 Thread Laurence Marks
Don't use Intel Hyper-Threading. Unless something drastic has changed, it
gets in the way.

Beyond that, there is no single answer. For small problems k-point parallelization
is better, perhaps with 2 threads. For medium problems (10-25 unique atoms) MPI
with or without OpenMP is better. For a large slab (50+ unique atoms) MPI is
needed, but you may run out of memory.

Recommendation: install mpi & experiment.

---
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
"Research is to see what everybody else has seen, and to think what nobody
else has thought" Albert Szent-Györgyi

On Sun, Feb 12, 2023, 13:47 pluto via Wien wrote:

> Dear All,
>
> I am now using a machine with i7-13700K. This CPU has 8 performance
> cores (P-cores) and 8 efficient cores (E-cores). In addition each P-core
> has 2 threads, so there are 24 threads altogether. It is hard to find
> some reasonable info online, but probably a P-core is approx. 2x faster
> than an E-core:
>
> https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/10
> This will of course depend on what is being calculated...
>
> Do you have suggestions on how to optimize the .machines file for the
> parallel execution of an scf cycle?
>
> On my machine using OMP_NUM_THREADS leads to oscillations of the CPU use
> (for a large slab maybe 40% of time is spent on a single thread),
> suggesting that large OMP is not the optimal strategy.
>
> Some examples of strategies:
>
> One strategy would be to repeat the line
> 1:localhost
> 24 times, to have all the threads busy, and set OMP_NUM_THREADS=1.
>
> Another would be to set the line
> 1:localhost
> 8 times and set OMP_NUM_THREADS=2, this would mean using all 16 physical
> cores.
>
> Or perhaps one should better "overload" the CPU e.g. by doing
> 1:localhost 16 times and OMP=2 ?
>
> Over time I will try to benchmark some of the different options, but
> perhaps there is some logic of how one should think about this.
>
> In addition I have a comment on .machines file. It seems that for the
> FM+SOC (runsp -so) calculations the
>
> omp_global
>
> setting in .machines is ignored. The
>
> omp_lapw1
> omp_lapw2
>
> settings seem to work fine. So, I tried to set OMP for lapwso
> separately, by including a line like:
>
> omp_lapwso:2
>
> but this gives an error when executing parallel scf.
>
> Best,
> Lukasz


[Wien] Parallel execution on new Intel CPUs

2023-02-12 Thread pluto via Wien

Dear All,

I am now using a machine with i7-13700K. This CPU has 8 performance 
cores (P-cores) and 8 efficient cores (E-cores). In addition each P-core 
has 2 threads, so there are 24 threads altogether. It is hard to find 
some reasonable info online, but probably a P-core is approx. 2x faster 
than an E-core:

https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/10
This will of course depend on what is being calculated...

Do you have suggestions on how to optimize the .machines file for the 
parallel execution of an scf cycle?


On my machine using OMP_NUM_THREADS leads to oscillations of the CPU use 
(for a large slab maybe 40% of time is spent on a single thread), 
suggesting that large OMP is not the optimal strategy.


Some examples of strategies:

One strategy would be to repeat the line
1:localhost
24 times, to have all the threads busy, and set OMP_NUM_THREADS=1.

Another would be to set the line
1:localhost
8 times and set OMP_NUM_THREADS=2, this would mean using all 16 physical 
cores.


Or perhaps one should better "overload" the CPU e.g. by doing 
1:localhost 16 times and OMP=2 ?


Over time I will try to benchmark some of the different options, but 
perhaps there is some logic of how one should think about this.
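
One way to compare the three strategies above is to count how many threads each keeps busy against the 16 physical cores and 24 hardware threads of this CPU. The sketch below does that and also writes out the corresponding .machines lines. The function names are hypothetical, the "omp_global:N" line assumes the same "keyword:number" form as the omp_lapwso:2 line quoted later in this message, and the count treats P-cores and E-cores as equal, which they are not.

```python
def machines_lines(n_jobs, omp_threads=None, host="localhost"):
    """Return the lines of a k-point-parallel .machines file for one host."""
    lines = []
    if omp_threads is not None:
        # Assumed syntax; OMP_NUM_THREADS in the environment is the alternative.
        lines.append(f"omp_global:{omp_threads}")
    lines.extend([f"1:{host}"] * n_jobs)
    return lines


def thread_load(n_jobs, omp_threads, physical_cores=16, hardware_threads=24):
    """Summarize how many threads a strategy keeps busy on this CPU."""
    busy = n_jobs * omp_threads
    return {
        "busy_threads": busy,
        "per_physical_core": busy / physical_cores,
        "oversubscribed": busy > hardware_threads,
    }


# The three strategies from the text: (number of 1:localhost lines, OMP threads).
strategies = {
    "24 jobs x 1 thread": (24, 1),
    "8 jobs x 2 threads": (8, 2),
    "16 jobs x 2 threads": (16, 2),
}
for label, (n_jobs, omp) in strategies.items():
    print(label, thread_load(n_jobs, omp))

print("\n".join(machines_lines(8, omp_threads=2)))
```

Only the third strategy asks for more threads (32) than the hardware offers (24). Whether that helps or hurts still has to be measured, since the count says nothing about memory bandwidth or about which jobs land on E-cores.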


In addition I have a comment on .machines file. It seems that for the 
FM+SOC (runsp -so) calculations the


omp_global

setting in .machines is ignored. The

omp_lapw1
omp_lapw2

settings seem to work fine. So, I tried to set OMP for lapwso 
separately, by including a line like:


omp_lapwso:2

but this gives an error when executing parallel scf.

Best,
Lukasz


Re: [Wien] QTL quantization axis for Y_lm orbitals

2023-02-12 Thread pluto via Wien

Dear Prof. Blaha,

Thank you for your comments.

Are the functions u and u-dot provided in some output file? The manual 
mentions different types of u and u-dot for the cases of Psi^LO and 
Psi^lo. The manual also mentions that u and u-dot are obtained by numerical 
integration of the radial Schrödinger equation on the mesh. Are they all 
tabulated somewhere?


Having all the A_lm, B_lm, C_lm and all the u and u-dot would make it 
possible to obtain the full wave function Psi(r) inside the spheres as a 
function of wave vector and energy. That would make it possible to 
calculate numerically the matrix elements which I need, with the assumption 
of my favorite final state, and without any further assumptions. The only 
remaining problem would be the interstitial region, but it would also be 
under control by knowing how much charge leaks out of the spheres.
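
Written out, the in-sphere wave function and the matrix element meant here would take roughly the following form. This is a sketch in common LAPW notation, not a quotation from the manual: it assumes a single atomic sphere of radius R_MT, lumps the local-orbital part into one C_lm term with a second radial function at energy E_{2,l}, leaves out any WIEN2k-specific normalization or phase conventions, and uses a hypothetical final state f and light polarization vector epsilon.

```latex
\psi_{n\mathbf{k}}(\mathbf{r}) = \sum_{l,m} \Big[
    A_{lm}^{n\mathbf{k}}\, u_l(r, E_{1,l})
  + B_{lm}^{n\mathbf{k}}\, \dot{u}_l(r, E_{1,l})
  + C_{lm}^{n\mathbf{k}}\, u_l(r, E_{2,l})
\Big]\, Y_{lm}(\hat{\mathbf{r}}),
\qquad r < R_{\mathrm{MT}}

M_{f,n\mathbf{k}}^{\mathrm{sphere}} =
\int_{r < R_{\mathrm{MT}}} f^{*}(\mathbf{r})\,
\big(\hat{\boldsymbol{\epsilon}} \cdot \mathbf{r}\big)\,
\psi_{n\mathbf{k}}(\mathbf{r})\; d^{3}r
```

If the B_lm and C_lm terms are small, the first expression reduces to one complex coefficient A_lm per Y_lm with a single radial function per l.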


Best,
Lukasz




On 2023-02-09 18:06, Peter Blaha wrote:
Well, I'm not sure I understand all your problems, but a few 
comments:


a) XMCD is implemented in optics!

b) I do not see the problem with A_lm, B_lm, C_lm, ..., because in any
case A_lm (or for semicore a C_lm) will dominate and you can probably
neglect the B_lm and the corresponding u-dot radial function.

When you choose a good expansion energy for your radial wave function, you
more or less have this "hydrogenic orbital" with one fixed radial function.
Of course, this argument holds only when your states are "localized";
otherwise you will have a large interstitial (PW) contribution.

c) I'm not the real expert on Wannier functions, but I guess the WF
might be complicated linear combinations of different l,m.
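
The claim in comment b) that the u-dot term can be neglected could be checked for a given state by comparing the weights of the two terms. The sketch below assumes the usual LAPW convention that u_l is normalized to 1 inside the sphere and is orthogonal to u-dot_l, so the weight of one l channel is sum_m |A_lm|^2 + N * sum_m |B_lm|^2 with N = <u-dot|u-dot>. The function name is hypothetical, local orbitals (C_lm) are left out, and the coefficients and the value of N are invented numbers for illustration, not WIEN2k output.

```python
def udot_weight_fraction(a_lm, b_lm, udot_norm):
    """Fraction of one l channel's in-sphere weight carried by the u-dot term.

    a_lm, b_lm: complex coefficients for m = -l..l of a single l.
    udot_norm:  N = <u-dot|u-dot> integrated inside the sphere.
    """
    a_weight = sum(abs(a) ** 2 for a in a_lm)
    b_weight = udot_norm * sum(abs(b) ** 2 for b in b_lm)
    return b_weight / (a_weight + b_weight)


# Invented p-like (l = 1) coefficients, for illustration only.
a_lm = [0.6 + 0.2j, 0.1j, -0.3]
b_lm = [0.05, 0.02j, 0.01]
fraction = udot_weight_fraction(a_lm, b_lm, udot_norm=0.5)
print(f"u-dot share of the in-sphere weight: {fraction:.4f}")
```

A small fraction supports keeping only A_lm and a single radial function per l; a large one means the choice of expansion energy deserves a second look.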



On 09.02.2023 at 15:46, pluto via Wien wrote:

Dear Sylwia, dear Prof. Blaha, dear All,

Having these A_lm, B_lm, etc. is of course a problem if one wants to 
estimate interferences in the dipole optical matrix element due to the 
phases with which different Y_lm orbitals enter the wave function. It would 
be good to have a single complex number per Y_lm.


For this it would be good to have the LAPW wavefunction projected onto 
hydrogenic orbitals that just have a single radial component. Then 
there would be just one complex coefficient. For a particular l (i.e. 
s, p, or d) one would have a common radial part of the wave function, 
since the radial part does not depend on m. Then one would need to 
assume the final state expansion in Y_lm (can always be done even for 
free-electron final state) and do some estimation of the XMCD process 
within the simplified LCAO way of thinking.


Is there an existing tool to project the WIEN2k wave function 
onto hydrogenic orbitals?
I was thinking something like this might be a part of 
WIEN2Wannier, but I wanted to ask here before investing further time 
in this.


Best,
Lukasz


