Dear Michal

3.7x with respect to what?

The line you pasted reports the wall time and the total CPU time per MPI task; they differ because you are using thread parallelism.
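As a rough check of where the 3.7x figure may come from, the sketch below parses the two times from the PWSCF line quoted further down and takes their ratio. The helper name to_minutes is made up for illustration, and the parsing only covers the "1d 4h27m" style seen in that line.

```python
import re

def to_minutes(text):
    """Convert a duration such as '1d 4h27m' or '7h43m' to minutes."""
    units = {"d": 24 * 60, "h": 60, "m": 1}
    # Each match is a (number, unit) pair, e.g. ('1', 'd'), ('4', 'h'), ('27', 'm').
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)\s*([dhm])", text))

cpu = to_minutes("1d 4h27m")   # CPU time from the quoted PWSCF line
wall = to_minutes("7h43m")     # wall time from the same line
ratio = cpu / wall

print(f"CPU = {cpu} min, WALL = {wall} min, CPU/WALL = {ratio:.2f}")
```

If this ratio is what was read as the speedup, it says roughly how many of the 5 threads per MPI task were busy on average, not how much faster the run is than a serial one.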

If you don't have memory issues, I would try increasing the number of MPI processes and decreasing the number of threads. Also, when the number of MPI tasks is smaller than the dimensions of the FFT grid, it is usually better to avoid using -nt.
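To make the trade-off concrete, here is a small bookkeeping sketch: for a fixed core budget, it shows how moving from threads to MPI processes changes the number of MPI tasks in each k-point pool. The function name layout and the candidate thread counts are illustrative assumptions; the 100 cores are simply 20 MPI processes times 5 threads from the original message. It only does arithmetic and does not call Quantum ESPRESSO.

```python
def layout(total_cores, threads_per_task, npool):
    """Return the MPI/thread split for a fixed number of cores."""
    mpi_tasks, leftover = divmod(total_cores, threads_per_task)
    if leftover:
        raise ValueError("threads_per_task must divide total_cores")
    # Assumption: the MPI tasks are split evenly among the k-point pools.
    if mpi_tasks % npool:
        raise ValueError("npool must divide the number of MPI tasks")
    return {
        "threads_per_task": threads_per_task,
        "mpi_tasks": mpi_tasks,
        "tasks_per_pool": mpi_tasks // npool,
    }

# Same 100 cores and 5 pools, with fewer threads and more MPI tasks each time.
candidates = [layout(100, threads, 5) for threads in (5, 2, 1)]
for entry in candidates:
    print(entry)
```

Which split is fastest depends on the machine and on the memory available per task, so each candidate would still need a short timing run.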

Hope it helps

regards

Pietro

On 30/05/19 16:42, Michal Krompiec wrote:
Hello,
I am trying to run a calculation on a 2D slab with a bit of adsorbate (119 atoms in total), and I would like to parallelize it as much as possible. I am using a 3 3 1 Monkhorst-Pack grid (so I have 5 k-points). I tried using -npool 5 -nt 4 with 20 MPI processes and 5 threads per process but, it seems, the speedup was just 3.7x:
   PWSCF        :   1d 4h27m CPU      7h43m WALL
What could have gone wrong? Is there anything "obvious" I can do to diagnose the problem? I am using QE 6.4rc, compiled with gcc and OpenMPI, without ELPA.

Best regards,

Michal Krompiec

Merck KGaA and University of Southampton

_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
