Dear Prof. Blaha, Prof. Marks, dear All,

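Below are some benchmark results. For the serial benchmark, 8 OMP threads seems to be optimal: the wall time drops from 12.8 s with 1 thread to 4.5 s with 8 threads, and rises again to about 5.5 s with 12, 16, or 24 threads. This probably has something to do with the CPU having 8 fast and 8 slow cores.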

Hardware:
13th Gen Intel(R) Core(TM) i7-13700K
64 GB of RAM DDR4-3600
2 TB drive Samsung NVMe
ASUS Z690-P D4 mainboard

I also looked at the mpi-benchmark case, but I don't have MPI installed, so I only ran it serially with 1 OMP thread (log at the end). I don't think that test is meaningful without MPI.

Let me know if I should add something to this.

Best,
Lukasz



bash-5.1$ pwd
(...)/WIEN2k_benchmark/Serial/test_case

bash-5.1$ export OMP_NUM_THREADS=1
bash-5.1$ echo $OMP_NUM_THREADS
1
bash-5.1$ x lapw1
 LAPW1 END
12.567u 0.216s 0:12.82 99.6%    0+0k 464+37840io 2pf+0w

bash-5.1$ export OMP_NUM_THREADS=2
bash-5.1$ echo $OMP_NUM_THREADS
2
bash-5.1$ x lapw1
 LAPW1 END
14.844u 0.248s 0:07.65 197.1%   0+0k 0+37840io 2pf+0w


bash-5.1$ export OMP_NUM_THREADS=4
bash-5.1$ echo $OMP_NUM_THREADS
4
bash-5.1$ x lapw1
 LAPW1 END
21.091u 0.372s 0:05.51 389.4%   0+0k 0+37840io 10pf+0w

bash-5.1$ export OMP_NUM_THREADS=6
bash-5.1$ echo $OMP_NUM_THREADS
6
bash-5.1$ x lapw1
 LAPW1 END
27.765u 0.490s 0:04.87 580.0%   0+0k 0+37824io 19pf+0w

bash-5.1$ export OMP_NUM_THREADS=8
bash-5.1$ echo $OMP_NUM_THREADS
8
bash-5.1$ x lapw1
 LAPW1 END
34.099u 0.605s 0:04.51 769.1%   0+0k 0+37824io 27pf+0w
bash-5.1$ x lapw1
 LAPW1 END
34.087u 0.616s 0:04.51 769.1%   0+0k 0+37824io 33pf+0w
bash-5.1$ x lapw1
 LAPW1 END
34.119u 0.629s 0:04.52 768.3%   0+0k 0+37824io 26pf+0w
bash-5.1$ x lapw1
 LAPW1 END
34.234u 0.579s 0:04.53 768.2%   0+0k 0+37824io 26pf+0w

bash-5.1$ export OMP_NUM_THREADS=12
bash-5.1$ echo $OMP_NUM_THREADS
12
bash-5.1$ x lapw1
 LAPW1 END
61.638u 2.193s 0:05.54 1151.9%  0+0k 0+37840io 44pf+0w

bash-5.1$ export OMP_NUM_THREADS=16
bash-5.1$ echo $OMP_NUM_THREADS
16
bash-5.1$ x lapw1
 LAPW1 END
82.629u 2.636s 0:05.55 1536.0%  0+0k 0+37840io 63pf+0w

bash-5.1$ export OMP_NUM_THREADS=24
bash-5.1$ echo $OMP_NUM_THREADS
24
bash-5.1$ x lapw1
 LAPW1 END
86.794u 3.724s 0:05.48 1651.6%  0+0k 0+37840io 57pf+0w
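
The wall-clock times of the serial runs above can be turned into speedup and parallel efficiency relative to the 1-thread run. What follows is only a minimal Python sketch and is not part of WIEN2k. The names TIMINGS, parse_time_line and scaling_table are made up for this illustration, and the parser assumes the tcsh-style time line printed after "LAPW1 END" above (user seconds, system seconds, elapsed as [h:]m:s).

```python
import re

# Time lines copied from the serial runs above, keyed by OMP_NUM_THREADS.
# For 8 threads only the first of the four repeated runs is used.
TIMINGS = {
    1: "12.567u 0.216s 0:12.82 99.6%",
    2: "14.844u 0.248s 0:07.65 197.1%",
    4: "21.091u 0.372s 0:05.51 389.4%",
    6: "27.765u 0.490s 0:04.87 580.0%",
    8: "34.099u 0.605s 0:04.51 769.1%",
    12: "61.638u 2.193s 0:05.54 1151.9%",
    16: "82.629u 2.636s 0:05.55 1536.0%",
    24: "86.794u 3.724s 0:05.48 1651.6%",
}

# user "u", system "s", then elapsed time as [hours:]minutes:seconds
TIME_RE = re.compile(r"([\d.]+)u\s+([\d.]+)s\s+(?:(\d+):)?(\d+):([\d.]+)")


def parse_time_line(line):
    """Return (user_s, sys_s, wall_s) parsed from one time line."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised time line: {line!r}")
    user, system, hours, minutes, seconds = match.groups()
    wall = int(hours or 0) * 3600 + int(minutes) * 60 + float(seconds)
    return float(user), float(system), wall


def scaling_table(timings):
    """Return rows of (threads, wall_s, speedup, efficiency) vs. 1 thread."""
    base_wall = parse_time_line(timings[1])[2]
    rows = []
    for threads in sorted(timings):
        wall = parse_time_line(timings[threads])[2]
        speedup = base_wall / wall
        rows.append((threads, wall, speedup, speedup / threads))
    return rows


if __name__ == "__main__":
    print("threads  wall[s]  speedup  efficiency")
    for threads, wall, speedup, efficiency in scaling_table(TIMINGS):
        print(f"{threads:7d} {wall:8.2f} {speedup:8.2f} {efficiency:11.2f}")
```

With these numbers the best wall time is at 8 threads, a speedup of roughly 2.8 over the 1-thread run. The user CPU time grows from 34 s at 8 threads to 62 s at 12 threads while the wall time gets worse, which fits the guess that the extra threads end up on the slower cores.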




bash-5.1$ pwd
(...)/WIEN2k_benchmark/mpi-benchmark
bash-5.1$ export OMP_NUM_THREADS=1
bash-5.1$ echo $OMP_NUM_THREADS
1
bash-5.1$ x lapw1
 LAPW1 END
117.827u 0.921s 1:58.88 99.8%   0+0k 432+162616io 2pf+0w




On 2023-02-15 01:11, Laurence Marks wrote:
Two things:

1) The CPU you have looks interesting. Can you please run and post the
benchmark from the Wien2k page for different omp (and mpi would be
good). It would be good to know what the "Hybrid Core" architecture
does with Wien2k. For mpi elpa is much better -- it can also be better
for non-mpi.

2) It is established lore in the DFT community that increasing the
"smearing" assists convergence. However, not all lore is true. I am
aware of zero evidence for this with the current Wien2k mixer, so I
suggest sticking with room temperature rather than 1500K. More
important is a well-posed problem. For more see
http://www.numis.northwestern.edu/Presentations/DFT_Mixing_For_Dummies.pdf

On Tue, Feb 14, 2023 at 5:18 PM pluto via Wien
<wien@zeus.theochem.tuwien.ac.at> wrote:

Dear Prof. Blaha,

Thank you for comments.

At the moment I have 56 k-points in a big slab of one of the ternary
magnetic 2D materials. Perhaps I can reduce k-points, something to test.
Also now I see that my 56 k-points are compatible with 1:localhost lines
:-)

Also, for now it does not want to converge after 40 iterations with
TEMP 0.002, for a while I was trying TEMP 0.004, and now I am trying
TEMP 0.01. Maybe I should start with a smaller slab...

Some info you asked for:

The i7-13700K CPU has 8 P-cores (fast) and 8 E-cores (slow), so 16 total
physical cores. Each P-core has 2 threads, so there are total of 24
threads. Many other new Intel CPUs are the same. I don't think there is
an easy way to enforce certain task on a certain core, and probably it
makes no sense, because the CPU for sure has thermal control over
different cores etc.

 --

Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
https://scholar.google.com/citations?user=zmHhI9gAAAAJ&hl=en [1]
"Research is to see what everybody else has seen, and to think what
nobody else has thought", Albert Szent-Györgyi

Links:
------
[1] http://www.numis.northwestern.edu
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html