[Wien] "so: Undefined variable" error
Hi,

I have upgraded to 23.1 and encountered an error running qtl

running LAPW2 in parallel mode
STOP LAPW2 - FERMI; weights written
FERMI only
0.229u 0.063s 0:00.13 215.3% 0+0k 0+1312io 0pf+0w
running QTL in parallel mode
calculating QTL's from parallel vectors
so: Undefined variable.
0.015u 0.005s 0:00.01 100.0% 0+0k 0+24io 0pf+0w
error: command /home/mcem-admin/wien2k/qtlpara qtl.def failed

The SCF ran to convergence (parallel, 6 cores on one Intel machine). The error somewhat resembles those previously present in x_lapw fixed with the Github patches. “so” is defined in x_lapw (set so).

Happy to provide more details but is anyone aware of any need for a patch in 23.1?

Many thanks,
Tim.

---
Dr. Tim Williams
Transmission Electron Microscope Manager
Monash Centre for Electron Microscopy (MCEM)
Monash University
Room 103, 10 Innovation Walk, Clayton Campus
Wellington Road
Clayton VIC 3800
Australia
T: +61 (0) 3 9902 0721
M: +61 (0) 401 853 850
e: timothy.willi...@monash.edu
CRICOS Provider: Monash University 8C/01857J
We acknowledge and pay respects to the Elders and Traditional Owners of the land on which our four Australian campuses stand.
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
Re: [Wien] Parallel execution on new Intel CPUs
Don't use Intel Hyper-Threading. Unless something drastic has changed, it gets in the way.

Beyond that there is no single answer. For small problems k-point parallel is better, perhaps with 2 threads. For medium problems (10-25 unique atoms) mpi with/without omp is better. For a large slab (50+ unique atoms) mpi is needed, but you may run out of memory.

Recommendation: install mpi & experiment.

---
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
"Research is to see what everybody else has seen, and to think what nobody else has thought" Albert Szent-Györgyi

On Sun, Feb 12, 2023, 13:47 pluto via Wien wrote:

> Dear All,
>
> I am now using a machine with i7-13700K. This CPU has 8 performance
> cores (P-cores) and 8 efficient cores (E-cores). In addition each P-core
> has 2 threads, so there are 24 threads altogether. It is hard to find
> some reasonable info online, but probably a P-core is approx. 2x faster
> than an E-core:
>
> https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/10
>
> This will of course depend on what is being calculated...
>
> Do you have suggestions on how to optimize the .machines file for the
> parallel execution of an scf cycle?
>
> On my machine using OMP_NUM_THREADS leads to oscillations of the CPU use
> (for a large slab maybe 40% of time is spent on a single thread),
> suggesting that large OMP is not the optimal strategy.
>
> Some examples of strategies:
>
> One strategy would be to repeat the line
> 1:localhost
> 24 times, to have all the threads busy, and set OMP_NUM_THREADS=1.
>
> Another would be to set the line
> 1:localhost
> 8 times and set OMP_NUM_THREADS=2, this would mean using all 16 physical
> cores.
>
> Or perhaps one should better "overload" the CPU e.g. by doing
> 1:localhost 16 times and OMP=2 ?
>
> Over time I will try to benchmark some of the different options, but
> perhaps there is some logic of how one should think about this.
>
> In addition I have a comment on .machines file. It seems that for the
> FM+SOC (runsp -so) calculations the
>
> omp_global
>
> setting in .machines is ignored. The
>
> omp_lapw1
> omp_lapw2
>
> settings seem to work fine. So, I tried to set OMP for lapwso
> separately, by including the line like:
>
> omp_lapwso:2
>
> but this gives an error when executing parallel scf.
>
> Best,
> Lukasz
[Wien] Parallel execution on new Intel CPUs
Dear All,

I am now using a machine with i7-13700K. This CPU has 8 performance cores (P-cores) and 8 efficient cores (E-cores). In addition each P-core has 2 threads, so there are 24 threads altogether. It is hard to find some reasonable info online, but probably a P-core is approx. 2x faster than an E-core:

https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/10

This will of course depend on what is being calculated...

Do you have suggestions on how to optimize the .machines file for the parallel execution of an scf cycle?

On my machine using OMP_NUM_THREADS leads to oscillations of the CPU use (for a large slab maybe 40% of time is spent on a single thread), suggesting that large OMP is not the optimal strategy.

Some examples of strategies:

One strategy would be to repeat the line
1:localhost
24 times, to have all the threads busy, and set OMP_NUM_THREADS=1.

Another would be to set the line
1:localhost
8 times and set OMP_NUM_THREADS=2, this would mean using all 16 physical cores.

Or perhaps one should better "overload" the CPU e.g. by doing
1:localhost 16 times and OMP=2 ?

Over time I will try to benchmark some of the different options, but perhaps there is some logic of how one should think about this.

In addition I have a comment on .machines file. It seems that for the FM+SOC (runsp -so) calculations the

omp_global

setting in .machines is ignored. The

omp_lapw1
omp_lapw2

settings seem to work fine. So, I tried to set OMP for lapwso separately, by including the line like:

omp_lapwso:2

but this gives an error when executing parallel scf.

Best,
Lukasz
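
The three strategies in the message above can be compared with a small sketch. This is only an illustration: `machines_text` and `total_threads` are hypothetical helper names, the `omp_global:N` value syntax is assumed by analogy with the `omp_lapwso:2` line quoted in the message, and the hardware thread count is taken from the message (8 P-cores with 2 threads each plus 8 E-cores). Since the message reports that `omp_global` appeared to be ignored for `runsp -so`, setting OMP_NUM_THREADS in the environment may be needed instead.

```python
def machines_text(n_lines, omp_threads, host="localhost"):
    """Build .machines content: one OMP line, then n_lines k-point lines.

    The 'omp_global:N' form is an assumption based on the 'omp_lapwso:2'
    line in the message; '1:localhost' is taken from the message as-is.
    """
    lines = [f"omp_global:{omp_threads}"]
    lines += [f"1:{host}"] * n_lines
    return "\n".join(lines) + "\n"


def total_threads(n_lines, omp_threads):
    """Threads requested: k-point jobs times OMP threads per job."""
    return n_lines * omp_threads


# 8 P-cores x 2 threads + 8 E-cores, as stated in the message.
HW_THREADS = 8 * 2 + 8

# (number of '1:localhost' lines, OMP threads per job)
strategies = {
    "24 lines, OMP=1": (24, 1),
    "8 lines, OMP=2": (8, 2),
    "16 lines, OMP=2": (16, 2),
}

for name, (n_lines, omp) in strategies.items():
    used = total_threads(n_lines, omp)
    note = "overloaded" if used > HW_THREADS else "fits"
    print(f"{name}: {used} of {HW_THREADS} hardware threads ({note})")
```

Calling `machines_text(8, 2)` returns the text for the second strategy, which can be written to `.machines` by hand. The sketch only counts requested threads; it says nothing about whether the operating system places them on P-cores or E-cores, which is the part that has to be benchmarked.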
Re: [Wien] QTL quantization axis for Y_lm orbitals
Dear Prof. Blaha,

Thank you for your comments.

Are the functions u and u-dot provided in some output file? The manual mentions different types of u and u-dot for the cases of Psi^LO and Psi^lo. The manual also mentions that u and u-dot are obtained by numerical integration of the radial Schrödinger equation on the mesh. Are they all tabulated somewhere?

Having all the A_lm, B_lm, C_lm and all the u and u-dot would make it possible to construct the full wave function Psi(r) inside the spheres as a function of wave vector and energy. That would allow me to numerically calculate the matrix elements which I need, with the assumption of my favorite final state, and without any further assumptions. The only remaining problem would be the interstitial region, but it would also be under control by knowing how much charge leaks out of the spheres.

Best,
Lukasz

On 2023-02-09 18:06, Peter Blaha wrote:

> Well, I'm not sure I do understand all your problems, but a few comments:
>
> a) XMCD is implemented in optics!
>
> b) I do not see the problem with A_lm, B_lm, C_lm, ..., because in any
> case A_lm (or for semicore a C_lm) will dominate and you can probably
> neglect the B_lm and the corresponding u-dot radial function. When you
> choose a good expansion energy for your radial wf., you more or less have
> this "hydrogenic orbital" with one fixed radial function. Of course, this
> argument holds only when your states are "localized", otherwise you will
> have a large interstitial (PW) contribution.
>
> c) I'm not the real expert of Wannier functions, but I guess the WF
> might be complicated linear combinations of different l,m
>
> On 09.02.2023 at 15:46, pluto via Wien wrote:
>
>> Dear Sylwia, dear Prof. Blaha, dear All,
>>
>> Having these A_lm, B_lm etc. is of course a problem if one wants to
>> estimate interferences in the dipole optical matrix element due to the
>> phases at which different Y_lm orbitals enter the wave function. It
>> would be good to have a single complex number per Y_lm.
>>
>> For this it would be good to have the LAPW wave function projected onto
>> hydrogenic orbitals that just have a single radial component. Then there
>> would be just one complex coefficient. For a particular l (i.e. s, p,
>> or d) one would have a common radial part of the wave function, since
>> the radial part does not depend on m. Then one would need to assume the
>> final state expansion in Y_lm (can always be done even for free-electron
>> final state) and do some estimation of the XMCD process within the
>> simplified LCAO way of thinking.
>>
>> Is there any tool already existing to project WIEN2k wave function onto
>> hydrogenic orbitals? I was thinking something like this might be a part
>> of the WIEN2Wannier, but I wanted to ask here before investing further
>> time into this.
>>
>> Best,
>> Lukasz
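
For readers following the A_lm, B_lm, C_lm discussion in this thread, the expansion inside one atomic sphere can be sketched as below. This is the general textbook form for an LAPW basis with local orbitals, written in assumed notation: E_{1,l} and E_{2,l} stand for the linearization energies, n and k label band and wave vector, and the exact set of radial functions depends on which basis type (LAPW, APW+lo, LO) is used for a given l. The second line is the approximation suggested in the reply above, valid only when A_lm dominates and the state is localized inside the sphere.

```latex
\begin{align}
\psi_{n\mathbf{k}}(\mathbf{r})
  &= \sum_{l,m}\Bigl[
       A_{lm}^{n\mathbf{k}}\,u_l(r,E_{1,l})
     + B_{lm}^{n\mathbf{k}}\,\dot{u}_l(r,E_{1,l})
     + C_{lm}^{n\mathbf{k}}\,u_l(r,E_{2,l})
     \Bigr]\,Y_{lm}(\hat{\mathbf{r}}) \\
  &\approx \sum_{l,m} A_{lm}^{n\mathbf{k}}\,u_l(r,E_{1,l})\,Y_{lm}(\hat{\mathbf{r}})
\end{align}
```

In the approximate form there is one complex coefficient per Y_lm and one radial function per l, which is the "single radial component" picture asked for in the original question.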