Dear Michal,
3.7x with respect to what?
The line you pasted reports the total CPU time per MPI task and the
wall time; they differ because you are using thread parallelism, so the
CPU time accumulates over the threads of each task.
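As a rough check, the 3.7x figure can be reproduced as the ratio of the two times in the quoted PWSCF line. The sketch below is only an illustration: to_seconds is a hypothetical helper written for this message, not part of Quantum ESPRESSO, and it assumes whole-number d/h/m/s fields.

```python
import re

def to_seconds(stamp: str) -> int:
    """Convert a duration such as '1d 4h27m' or '7h43m' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    # Assumption: only whole-number fields followed by d, h, m or s appear.
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)\s*([dhms])", stamp))

cpu = to_seconds("1d 4h27m")   # CPU time from the quoted PWSCF line
wall = to_seconds("7h43m")     # wall time from the same line
threads = 5                    # threads per MPI process, as stated in the question

ratio = cpu / wall
print(f"CPU/WALL = {ratio:.2f}")                                  # CPU/WALL = 3.69
print(f"fraction of the ideal {threads}x = {ratio / threads:.0%}")  # 74%
```

Read this way, the number says how busy the 5 threads of each task were on average, not how much faster the run is than a serial one.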
If you don't have memory issues, I would try increasing the number of
MPI processes and decreasing the number of threads. Also, when the
number of MPI tasks is smaller than the dimensions of the FFT grid, it
is usually better to avoid using -nt.
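To see which combinations are available, here is a minimal sketch that lists the ways of splitting the same cores between MPI processes and threads. It assumes the total stays at 100 cores (20 processes x 5 threads, from the question) and that the number of MPI processes should be a multiple of the 5 pools; layouts is a hypothetical helper, and the sketch does not say which split is fastest.

```python
def layouts(total_cores: int, npool: int):
    """Yield (mpi_processes, threads_per_process, processes_per_pool)."""
    for threads in range(1, total_cores + 1):
        if total_cores % threads:
            continue  # threads must divide the core count exactly
        mpi = total_cores // threads
        if mpi % npool == 0:  # assumption: processes split evenly over pools
            yield mpi, threads, mpi // npool

# 100 cores and 5 pools are taken from the original question.
for mpi, threads, per_pool in layouts(100, 5):
    print(f"{mpi:3d} MPI x {threads:2d} threads -> {per_pool:2d} MPI per pool")
```

The run in the question corresponds to the 20 x 5 entry; the entries with more MPI processes and fewer threads are the ones to try first, memory permitting.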
Hope it helps.
Regards,
Pietro
On 30/05/19 16:42, Michal Krompiec wrote:
Hello,
I am trying to run a calculation on a 2D slab with a bit of adsorbate
(119 atoms in total), and I would like to parallelize it as much as
possible. I am using a 3 3 1 Monkhorst-Pack grid (so I have 5 k-points).
I tried -npool 5 -nt 4 with 20 MPI processes and 5 threads per
process but, as it seems, the speedup was just 3.7x:
PWSCF : 1d 4h27m CPU 7h43m WALL
What could have gone wrong? Is there anything "obvious" I can do to
diagnose the problem? I am using QE 6.4rc, compiled with gcc and
OpenMPI, without ELPA.
Best regards,
Michal Krompiec
Merck KGaA and University of Southampton
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users