Thank you, Paolo
I've now compiled with ScaLAPACK, and the problem has changed to
something else. When launching the code with the command
mpirun -np 32 pw.x -nd 1 < NiOHsupercell1.in > NiOHsupercell1.out
I get the following error:
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 9 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[ec4302-MZ01-CE1-00:111052] PMIX ERROR: UNREACHABLE in file
server/pmix_server.c at line 2198
[ec4302-MZ01-CE1-00:111052] PMIX ERROR: UNREACHABLE in file
server/pmix_server.c at line 2198
[ec4302-MZ01-CE1-00:111052] PMIX ERROR: UNREACHABLE in file
server/pmix_server.c at line 2198
[ec4302-MZ01-CE1-00:111052] PMIX ERROR: UNREACHABLE in file
server/pmix_server.c at line 2198
[ec4302-MZ01-CE1-00:111052] PMIX ERROR: UNREACHABLE in file
server/pmix_server.c at line 2198
[ec4302-MZ01-CE1-00:111052] 5 more processes have sent help message
help-mpi-api.txt / mpi-abort
[ec4302-MZ01-CE1-00:111052] Set MCA parameter "orte_base_help_aggregate"
to 0 to see all help / error messages
For some reason, 22 is the maximum number of MPI processes I can use, but
this does not happen when running parallel jobs with other software.
Changing the -nd value does not change anything.
During configuration of QE, this was the command I used:
./configure MPIF90=mpif90 CC=gcc --enable-parallel --with-scalapack
SCALAPACK_LIBS="-L/home/ec4302/scalapack-2.2.0 -lscalapack" BLAS_LIBS
="-L/usr/lib/x86_64-linux-gnu/blas -lblas" LAPACK_LIBS ="-L/usr/lib
I don't know much about PMIX, but it seems to be used only with Open MPI,
which isn't the case here.
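One way to check which MPI runtime printed these messages is to scan the console output for strings specific to that runtime. The sketch below is a minimal illustration in Python, not part of QE or of any MPI distribution: the function name guess_mpi_runtime and the marker list are hypothetical, and the list is deliberately short. It treats the literal text "Open MPI" and parameter names starting with "orte_" (both present in the output quoted above) as Open MPI markers, and does not treat "PMIX" alone as conclusive, since PMIx is a separate project that other launchers can use as well.

```python
import re

# Strings expected only from Open MPI's runtime (assumption: this short
# list is illustrative, not exhaustive).
OPEN_MPI_MARKERS = (
    re.compile(r"\bOpen MPI\b"),
    re.compile(r"\borte_\w+"),  # ORTE is Open MPI's runtime layer
)


def guess_mpi_runtime(console_text: str) -> str:
    """Return "Open MPI" if any marker is found, else "unknown"."""
    for pattern in OPEN_MPI_MARKERS:
        if pattern.search(console_text):
            return "Open MPI"
    return "unknown"


if __name__ == "__main__":
    # Two lines modelled on the console output quoted in this message;
    # the host name is a placeholder.
    sample = (
        "NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.\n"
        '[host:111052] Set MCA parameter "orte_base_help_aggregate"\n'
    )
    print(guess_mpi_runtime(sample))  # prints "Open MPI"
```

If a check like this reports Open MPI, it may be worth confirming that the mpirun found first in PATH belongs to the same MPI installation as the mpif90 wrapper passed to configure.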
Álvaro
On 2023-03-22 16:38, Paolo Giannozzi wrote:
After two independent reports of a similar problem, several tests and
a considerable amount of head-scratching, I came to the conclusion
that the problem is no longer present in the development version (to
be released as v.7.2 no later than this week). For more explanations,
see issue https://gitlab.com/QEF/q-e/-/issues/572, the related
comments and the fix provided by Miroslav Iliaš
Paolo
On 15/03/2023 09:44, a.pramos wrote:
Dear everyone,
I am running QE 7.1 in an AMD EPYC 7763 build and Ubuntu 22.04 LTS.
When launching an input with the following command:
mpirun -np 24 pw.x < NiOH3.in > NiOH3.out
The program fails after writing these lines in the output:
Estimated max dynamical RAM per process > 4.68 GB
Estimated total dynamical RAM > 112.25 GB
Check: negative core charge= -0.000002
Generating pointlists ...
new r_m : 0.0031 (alat units) 0.0555 (a.u.) for type 1
new r_m : 0.0031 (alat units) 0.0555 (a.u.) for type 2
new r_m : 0.0031 (alat units) 0.0555 (a.u.) for type 3
Initial potential from superposition of free atoms
starting charge 809.9994, renormalised to 864.0000
Starting wfcs are 621 randomized atomic wfcs
No CRASH file is generated and the lines in the console mention a
problem with MPI communication:
[ec4302-MZ01-CE1-00:56280] *** An error occurred in MPI_Comm_free
[ec4302-MZ01-CE1-00:56280] *** reported by process [507510785,1]
[ec4302-MZ01-CE1-00:56280] *** on communicator MPI_COMM_WORLD
[ec4302-MZ01-CE1-00:56280] *** MPI_ERR_COMM: invalid communicator
[ec4302-MZ01-CE1-00:56280] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
[ec4302-MZ01-CE1-00:56280] *** and potentially your MPI job)
[ec4302-MZ01-CE1-00:56275] 4 more processes have sent help message
help-mpi-errors.txt / mpi_errors_are_fatal
[ec4302-MZ01-CE1-00:56275] Set MCA parameter
"orte_base_help_aggregate" to 0 to see all help / error messages
However, this only happens when using 24 cores. With 12 cores the
code runs as usual, and using the same parallelization with other
software has not given any trouble, which suggests to me that this is
not an MPI problem.
Best regards,
Álvaro
_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian
people and expresses its concerns about the devastating
effects that the Russian military offensive has on their
country and on the free and peaceful scientific, cultural,
and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (http://www.max-centre.eu/)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users