Dear Quantum ESPRESSO users,

I am trying to wannierize the electronic structure of ZnMgHf. The structure is 
large: it contains 174 atoms, and I calculate 1400 spin-unpolarized bands. I run 
the calculations on a computational cluster. The problem appears when I try to 
use pw2wannier90.x: the job is terminated, and the joberr file states that the 
system is out of memory. I even tried 64 nodes with 180 GB of RAM each and still 
received this message. Is there a proper way to run pw2wannier90.x on a cluster? 
I launch it with the command: 
mpirun -np 3072 pw2wannier90.x -in pw2wan_ZnMgHf_m1.in > pw2wan_ZnMgHf_m1.out, 
where pw2wan_ZnMgHf_m1.in is the input file. Details of the Slurm commands are 
in the run_qe_wannier.sh file. I had no problems with the scf and nscf 
calculations and never needed such a large number of nodes. I usually use just 4 
nodes for this system, and the amount of memory is sufficient.
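
For reference, here is a rough sketch of the per-rank memory budget implied by 
the launch above. It assumes that the 3072 ranks are spread evenly over the 64 
nodes and that the full 180 GB of each node is available to the job. Both are 
assumptions; the actual placement is set in run_qe_wannier.sh.

```python
# Rough per-rank memory budget for the pw2wannier90.x launch quoted above.
# Assumptions (not confirmed by the job script): the 3072 MPI ranks are
# spread evenly over 64 nodes, and all 180 GB of each node is usable.
n_ranks = 3072
n_nodes = 64
mem_per_node_gb = 180.0

ranks_per_node = n_ranks // n_nodes
mem_per_rank_gb = mem_per_node_gb / ranks_per_node

print(f"ranks per node: {ranks_per_node}")
print(f"memory per rank: {mem_per_rank_gb:.2f} GB")
```

Under these assumptions each rank has only a few GB, so any array that is 
replicated on every rank rather than distributed counts 48 times against a 
node's memory.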

I include the .win and pw2wannier90 input files and the run script for the 
cluster. The output files are also included.
I used Quantum ESPRESSO 6.7 and Wannier90 3.1.0.


I would appreciate any help.

Best regards,
Ireneusz Buganski
AGH University of Science and Technology, Krakow, Poland


 gcccore/10.3.0 loaded.
 zlib/1.2.11-gcccore-10.3.0 loaded.
 binutils/2.36.1-gcccore-10.3.0 loaded.
 intel-compilers/2021.2.0 loaded.
 numactl/2.0.14-gcccore-10.3.0 loaded.
 ucx/1.10.0-gcccore-10.3.0 loaded.
 impi/2021.2.0-intel-compilers-2021.2.0 loaded.
 iimpi/2021a loaded.
 imkl/2021.2.0-iimpi-2021a loaded.
 intel/2021a loaded.
 szip/2.1.1-gcccore-10.3.0 loaded.
 hdf5/1.10.7-iimpi-2021a loaded.
 elpa/2021.05.001-intel-2021a loaded.
 libxc/5.1.5-intel-compilers-2021.2.0 loaded.
 quantumespresso/6.7-intel-2021a loaded.
 intel/2021a unloaded.
 gcccore/10.3.0 unloaded.
 gcccore/11.2.0 loaded.
 zlib/1.2.11-gcccore-10.3.0 unloaded.
 binutils/2.36.1-gcccore-10.3.0 unloaded.
 zlib/1.2.11-gcccore-11.2.0 loaded.
 binutils/2.37-gcccore-11.2.0 loaded.
 intel-compilers/2021.2.0 unloaded.
 intel-compilers/2021.4.0 loaded.
 impi/2021.2.0-intel-compilers-2021.2.0 unloaded.
 ucx/1.10.0-gcccore-10.3.0 unloaded.
 numactl/2.0.14-gcccore-10.3.0 unloaded.
 numactl/2.0.14-gcccore-11.2.0 loaded.
 ucx/1.11.2-gcccore-11.2.0 loaded.
 impi/2021.4.0-intel-compilers-2021.4.0 loaded.
 imkl/2021.2.0-iimpi-2021a unloaded.
 imkl/2021.4.0 loaded.
 iimpi/2021a unloaded.
 iimpi/2021b loaded.
 imkl-fftw/2021.4.0-iimpi-2021b loaded.
 intel/2021b loaded.
 wannier90/3.1.0-intel-2021b loaded.

The following have been reloaded with a version change:
  1) binutils/2.36.1-gcccore-10.3.0 => binutils/2.37-gcccore-11.2.0
  2) gcccore/10.3.0 => gcccore/11.2.0
  3) iimpi/2021a => iimpi/2021b
  4) imkl/2021.2.0-iimpi-2021a => imkl/2021.4.0
  5) impi/2021.2.0-intel-compilers-2021.2.0 => impi/2021.4.0-intel-compilers-2021.4.0
  6) intel-compilers/2021.2.0 => intel-compilers/2021.4.0
  7) intel/2021a => intel/2021b
  8) numactl/2.0.14-gcccore-10.3.0 => numactl/2.0.14-gcccore-11.2.0
  9) ucx/1.10.0-gcccore-10.3.0 => ucx/1.11.2-gcccore-11.2.0
 10) zlib/1.2.11-gcccore-10.3.0 => zlib/1.2.11-gcccore-11.2.0

slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.9. Some of the step tasks have been OOM Killed.
srun: error: ac0519: task 0: Out Of Memory
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.6. Some of the step tasks have been OOM Killed.
srun: error: ac0074: task 1: Out Of Memory
[proxy:0:40@ac0517] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
srun: error: ac0517: task 10: Exited with exit code 5
[proxy:0:4@ac0043] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
srun: error: ac0043: task 1: Exited with exit code 5
srun: error: ac0662: task 14: Broken pipe
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0705: task 15: Broken pipe
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0620: task 0: Out Of Memory
srun: error: ac0599: task 12: Exited with exit code 5
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.5. Some of the step tasks have been OOM Killed.
[proxy:0:48@ac0599] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0416: task 5: Out Of Memory
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.0. Some of the step tasks have been OOM Killed.
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
MPI startup(): PMI server not found. Please set I_MPI_PMI_LIBRARY variable if it is not a singleton case.

Attachment: pw2wan_ZnMgHf_m1.in
Description: Binary data

Attachment: pw2wan_ZnMgHf_m1.out
Description: Binary data

Attachment: run_qe_wannier.sh
Description: Unix shell archive

Attachment: ZnMgHf_m1.win
Description: Binary data

Attachment: ZnMgHf_m1.wout
Description: Binary data

