Dear Wannier90 users,

I am trying to wannierize the electronic structure of ZnMgHf. The structure is
large: it contains 174 atoms, and I calculate 1400 spin-unpolarized bands. I
carry out the calculations on a computational cluster. The problem arises when
I run pw2wannier90.x: the job is terminated, and the joberr file states that
the system is out of memory. I even tried 64 nodes with 180 GB of RAM each and
still received this message. Is there a proper way to run pw2wannier90.x on a
cluster? I launch it with the command

mpirun -np 3072 pw2wannier90.x -in pw2wan_ZnMgHf_m1.in > pw2wan_ZnMgHf_m1.out

where pw2wan_ZnMgHf_m1.in is the input file. Details of the Slurm commands are
in the run_qe_wannier.sh file. I had no problems with the scf and nscf
calculations and never needed such a high number of nodes. I usually use just
4 nodes for this system, and the amount of memory is sufficient.
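
As a rough sanity check on the figures quoted above, the memory available to each MPI process in the 64-node run can be estimated as follows. This is only a sketch: it assumes the 3072 ranks are spread evenly over the 64 nodes and that the full 180 GB per node is usable, neither of which is confirmed by the job script. The helper name ram_per_rank_gb is made up for illustration.

```python
# Figures quoted in the post.
nodes = 64
ram_per_node_gb = 180
mpi_ranks = 3072


def ram_per_rank_gb(ram_per_node_gb, ranks_per_node):
    """Memory share per MPI rank, assuming an even split within a node."""
    return ram_per_node_gb / ranks_per_node


# Assumption: ranks are distributed evenly across the nodes.
ranks_per_node = mpi_ranks // nodes
per_rank = ram_per_rank_gb(ram_per_node_gb, ranks_per_node)

print(f"{ranks_per_node} ranks per node, about {per_rank:.2f} GB per rank")
# prints "48 ranks per node, about 3.75 GB per rank"
```

Under these assumptions each process has less than 4 GB to itself, so any data that every rank holds in full would have to fit in that share. Whether that is what triggers the OOM kill here cannot be told from these numbers alone.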

I attach the .win and pw2wannier90 input files and the run script for the
cluster, along with the output files.
I used Quantum ESPRESSO 6.7 and Wannier90 3.1.0.


I would appreciate any help.

Best regards,
Ireneusz Buganski
AGH University of Science and Technology, Krakow, Poland

Attachment: run_qe_wannier.sh
Description: Unix shell archive

Attachment: pw2wan_ZnMgHf_m1.out
Description: Binary data

Attachment: pw2wan_ZnMgHf_m1.in
Description: Binary data

 gcccore/10.3.0 loaded.
 zlib/1.2.11-gcccore-10.3.0 loaded.
 binutils/2.36.1-gcccore-10.3.0 loaded.
 intel-compilers/2021.2.0 loaded.
 numactl/2.0.14-gcccore-10.3.0 loaded.
 ucx/1.10.0-gcccore-10.3.0 loaded.
 impi/2021.2.0-intel-compilers-2021.2.0 loaded.
 iimpi/2021a loaded.
 imkl/2021.2.0-iimpi-2021a loaded.
 intel/2021a loaded.
 szip/2.1.1-gcccore-10.3.0 loaded.
 hdf5/1.10.7-iimpi-2021a loaded.
 elpa/2021.05.001-intel-2021a loaded.
 libxc/5.1.5-intel-compilers-2021.2.0 loaded.
 quantumespresso/6.7-intel-2021a loaded.
 intel/2021a unloaded.
 gcccore/10.3.0 unloaded.
 gcccore/11.2.0 loaded.
 zlib/1.2.11-gcccore-10.3.0 unloaded.
 binutils/2.36.1-gcccore-10.3.0 unloaded.
 zlib/1.2.11-gcccore-11.2.0 loaded.
 binutils/2.37-gcccore-11.2.0 loaded.
 intel-compilers/2021.2.0 unloaded.
 intel-compilers/2021.4.0 loaded.
 impi/2021.2.0-intel-compilers-2021.2.0 unloaded.
 ucx/1.10.0-gcccore-10.3.0 unloaded.
 numactl/2.0.14-gcccore-10.3.0 unloaded.
 numactl/2.0.14-gcccore-11.2.0 loaded.
 ucx/1.11.2-gcccore-11.2.0 loaded.
 impi/2021.4.0-intel-compilers-2021.4.0 loaded.
 imkl/2021.2.0-iimpi-2021a unloaded.
 imkl/2021.4.0 loaded.
 iimpi/2021a unloaded.
 iimpi/2021b loaded.
 imkl-fftw/2021.4.0-iimpi-2021b loaded.
 intel/2021b loaded.
 wannier90/3.1.0-intel-2021b loaded.

The following have been reloaded with a version change:
  1) binutils/2.36.1-gcccore-10.3.0 => binutils/2.37-gcccore-11.2.0
  2) gcccore/10.3.0 => gcccore/11.2.0
  3) iimpi/2021a => iimpi/2021b
  4) imkl/2021.2.0-iimpi-2021a => imkl/2021.4.0
  5) impi/2021.2.0-intel-compilers-2021.2.0 => impi/2021.4.0-intel-compilers-2021.4.0
  6) intel-compilers/2021.2.0 => intel-compilers/2021.4.0
  7) intel/2021a => intel/2021b
  8) numactl/2.0.14-gcccore-10.3.0 => numactl/2.0.14-gcccore-11.2.0
  9) ucx/1.10.0-gcccore-10.3.0 => ucx/1.11.2-gcccore-11.2.0
 10) zlib/1.2.11-gcccore-10.3.0 => zlib/1.2.11-gcccore-11.2.0

slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.9. Some of the step tasks have been OOM Killed.
srun: error: ac0519: task 0: Out Of Memory
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.6. Some of the step tasks have been OOM Killed.
srun: error: ac0074: task 1: Out Of Memory
[proxy:0:40@ac0517] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
srun: error: ac0517: task 10: Exited with exit code 5
[proxy:0:4@ac0043] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
srun: error: ac0043: task 1: Exited with exit code 5
srun: error: ac0662: task 14: Broken pipe
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0705: task 15: Broken pipe
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0620: task 0: Out Of Memory
srun: error: ac0599: task 12: Exited with exit code 5
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.5. Some of the step tasks have been OOM Killed.
[proxy:0:48@ac0599] main (../../../../../src/pm/i_hydra/proxy/proxy.c:1189): assert (proxy_params.immediate.proxy.pid_hash == NULL) failed
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
srun: error: ac0416: task 5: Out Of Memory
slurmstepd: error: Detected 1 oom_kill event in StepId=9760000.0. Some of the step tasks have been OOM Killed.
[mpiexec@ac0001] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)
MPI startup(): PMI server not found. Please set I_MPI_PMI_LIBRARY variable if it is not a singleton case.