Dear Antonio
> The actual time spent per scf cycle is about 33 minutes.
This is not so bad. :-)
> The relevant parameters in the input file are the following:
Some relevant parameters are not shown.
> input_dft= 'pz'
> ecutwfc= 25
Which kind of pseudopotential? You didn't set ecutrho...
What about ibrav and celldm?
I suppose that you really want to perform LDA calculations for some reason.
> occupations= 'smearing'
> smearing= 'cold'
> degauss= 0.05 ! I know it's quite large, but necessary to
> stabilize the SCF at this preliminary stage (no geometry step done
> yet)
> mixing_beta= 0.4
If you want to stabilize the SCF, it is better to use Gaussian
smearing and to reduce degauss (to 0.01) and mixing_beta (to 0.1, or
even 0.05-0.01). In the case of a relax calculation with a difficult
first step, try scf_must_converge=.false. together with a reasonable
electron_maxstep (30-50). It often helps when the SCF is not
going completely astray.
> nbnd= 2010
> diagonalization= 'ppcg'
Davidson (diagonalization= 'david') should be faster.
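For convenience, the suggestions above can be collected as in the small Python sketch below. It only prints namelist-style lines to be merged by hand into the existing &SYSTEM and &ELECTRONS namelists; it is not a complete input file. The keywords are standard pw.x input variables, while the values are simply the ones mentioned above (mixing_beta and electron_maxstep are picked from the suggested ranges), so treat them as a starting point rather than a recipe.

```python
# Sketch only: collect the settings suggested in this message.
# Values are assumptions taken from the suggested ranges; tune them
# for the actual system.
suggested = {
    "SYSTEM": {
        "occupations": "'smearing'",
        "smearing": "'gaussian'",
        "degauss": "0.01",
    },
    "ELECTRONS": {
        "mixing_beta": "0.1",             # or lower, down to 0.05-0.01
        "electron_maxstep": "50",         # suggested range: 30-50
        "scf_must_converge": ".false.",   # useful for a difficult first relax step
        "diagonalization": "'david'",
    },
}


def render(namelists):
    """Return the settings as Fortran-namelist-style text."""
    lines = []
    for name, params in namelists.items():
        lines.append(f"&{name}")
        for key, value in params.items():
            lines.append(f"  {key} = {value}")
        lines.append("/")
    return "\n".join(lines)


print(render(suggested))
```

The printed fragment lists only the changed keywords; everything else in the original input (nbnd, ecutwfc, and so on) stays where it is.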
> And, if possible, also to reduce the number of nodes?
> Estimated total dynamical RAM > 1441.34 GB
you may try with 7-8 nodes according to this estimate.
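The arithmetic behind this kind of estimate can be sketched in a few lines of Python. The two RAM figures are taken from the pw.x output quoted in this thread; the 192 GB of usable memory per node is a purely hypothetical number, to be replaced with the real value for the machine. Keep in mind that pw.x reports these estimates as lower bounds (note the ">"), and that per-process memory changes when the number of processes changes, so this is a rough guide only.

```python
import math

# Figures reported by pw.x in this thread (lower-bound estimates)
total_ram_gb = 1441.34        # "Estimated total dynamical RAM"
max_per_process_gb = 3.60     # "Estimated max dynamical RAM per process"

# total / per-process recovers the number of MPI processes of the run
n_processes = round(total_ram_gb / max_per_process_gb)


def nodes_needed(total_gb, usable_gb_per_node):
    """Smallest number of nodes whose combined memory covers total_gb."""
    return math.ceil(total_gb / usable_gb_per_node)


# ASSUMPTION: 192 GB usable per node is hypothetical, not from the thread
usable_gb_per_node = 192
print("MPI processes implied by the estimate:", n_processes)
print("nodes needed:", nodes_needed(total_ram_gb, usable_gb_per_node))
```

With the reported numbers the implied process count is 400, which matches 10 nodes with 40 cores each.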
HTH
Giuseppe
Quoting Antonio Cammarata via users <users@lists.quantum-espresso.org>:
I did some tests. For 1000 Si atoms, I use 2010 bands because I need
to get the band gap value; moreover, since this is a cluster, the
surface states of the truncated bonds might close the gap, especially
in the first steps of the geometry optimization, so it is better to
include a few empty bands. I managed to run the calculation using 10
nodes and a maximum of 40 cores per node. My question now is: can you
suggest optimal command-line options and/or input settings to speed up
the calculation? And, if possible, also to reduce the number of nodes?
The relevant parameters in the input file are the following:
input_dft= 'pz'
ecutwfc= 25
occupations= 'smearing'
smearing= 'cold'
degauss= 0.05 ! I know it's quite large, but necessary to
stabilize the SCF at this preliminary stage (no geometry step done
yet)
nbnd= 2010
diagonalization= 'ppcg'
mixing_mode= 'plain'
mixing_beta= 0.4
The actual time spent per SCF cycle is about 33 minutes. I use QE v.
7.3 compiled with OpenMPI and ScaLAPACK. I also have access to the
Intel compilers, but I did some tests and the difference is just tens
of seconds. I use only the Gamma point; here is some information
about the grid and the estimated RAM usage:
Dense grid: 24616397 G-vectors FFT dimensions: ( 375, 375, 375)
Dynamical RAM for wfc: 235.91 MB
Dynamical RAM for wfc (w. buffer): 235.91 MB
Dynamical RAM for str. fact: 0.94 MB
Dynamical RAM for local pot: 0.00 MB
Dynamical RAM for nlocal pot: 2112.67 MB
Dynamical RAM for qrad: 0.80 MB
Dynamical RAM for rho,v,vnew: 6.04 MB
Dynamical RAM for rhoin: 2.01 MB
Dynamical RAM for rho*nmix: 15.03 MB
Dynamical RAM for G-vectors: 3.99 MB
Dynamical RAM for h,s,v(r/c): 0.46 MB
Dynamical RAM for <psi|beta>: 552.06 MB
Dynamical RAM for wfcinit/wfcrot: 1305.21 MB
Estimated static dynamical RAM per process > 2.31 GB
Estimated max dynamical RAM per process > 3.60 GB
Estimated total dynamical RAM > 1441.34 GB
Thanks a lot in advance for your kind help.
All the best
Antonio
On 10. 05. 24 12:01, Paolo Giannozzi wrote:
On 5/10/24 08:58, Antonio Cammarata via users wrote:
> pw.x -nk 1 -nt 1 -nb 1 -nd 768 -inp qe.in > qe.out
Too many processors for linear-algebra parallelization. 1000 Si
atoms = 2000 bands (assuming an insulator with no spin
polarization). Use a few tens of processors at most.
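A related detail: as far as I know, pw.x arranges the linear-algebra group on a square grid of processes, so the number actually used is the largest perfect square not exceeding the value given to -nd. The Python sketch below just illustrates that arithmetic; the value 40 is a hypothetical example of "a few tens", not a recommendation from this thread.

```python
import math


def largest_square_at_most(n):
    """Largest perfect square <= n (n must be a non-negative integer)."""
    root = math.isqrt(n)
    return root * root


# 768 is the value used on the command line above; 484 appears elsewhere
# in this thread; 40 is a hypothetical "few tens" example.
for nd in (768, 484, 40):
    print(f"-nd {nd} -> square grid of {largest_square_at_most(nd)} processes")
```

So asking for 768 would give at most a 27 x 27 grid, while a request of a few tens ends up on something like a 6 x 6 grid.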
"some processors have no G-vectors for symmetrization".
This sounds strange to me: with the Gamma point, symmetrization is
not even needed.
> Dense grid: 30754065 G-vectors FFT dimensions: ( 400, 400, 400)
This is what a 256-atom Si supercell with 30 Ry cutoff yields:
Dense grid: 825897 G-vectors FFT dimensions: ( 162, 162, 162)
I guess you may reduce the size of your supercell
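To make the comparison concrete, here is a minimal Python sketch that divides the two G-vector counts quoted above by the number of atoms. The counts come from the outputs in this thread; the cutoff of the 1000-atom run is not stated next to its grid, and the density cutoff also depends on the pseudopotential type, so the ratio is only an indication of how much of the cell is vacuum, not an exact volume ratio.

```python
# (number of dense-grid G-vectors, number of atoms), as quoted in the thread
cases = {
    "1000-atom cluster": (30754065, 1000),
    "256-atom supercell, 30 Ry": (825897, 256),
}

# G-vectors per atom for each case
per_atom = {name: g / n for name, (g, n) in cases.items()}
for name, value in per_atom.items():
    print(f"{name}: {value:.0f} G-vectors per atom")

# How many times more G-vectors per atom the cluster cell carries
ratio = per_atom["1000-atom cluster"] / per_atom["256-atom supercell, 30 Ry"]
print(f"ratio: {ratio:.1f}")
```

The cluster cell carries roughly an order of magnitude more G-vectors per atom than the bulk supercell, which is consistent with the suggestion that the cell can be made smaller.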
Paolo
Dynamical RAM for wfc: 153.50 MB
Dynamical RAM for wfc (w. buffer): 153.50 MB
Dynamical RAM for str. fact: 0.61 MB
Dynamical RAM for local pot: 0.00 MB
Dynamical RAM for nlocal pot: 1374.66 MB
Dynamical RAM for qrad: 0.87 MB
Dynamical RAM for rho,v,vnew: 5.50 MB
Dynamical RAM for rhoin: 1.83 MB
Dynamical RAM for rho*nmix: 9.78 MB
Dynamical RAM for G-vectors: 2.60 MB
Dynamical RAM for h,s,v(r/c): 0.25 MB
Dynamical RAM for <psi|beta>: 552.06 MB
Dynamical RAM for wfcinit/wfcrot: 977.20 MB
Estimated static dynamical RAM per process > 1.51 GB
Estimated max dynamical RAM per process > 2.47 GB
Estimated total dynamical RAM > 1900.41 GB
I managed to run the simulation with 512 atoms, cg diagonalization,
and 3 nodes on the same machine with the command line
pw.x -nk 1 -nt 1 -nd 484 -inp qe.in > qe.out
Do you have any suggestions on how to set optimal
parallelization parameters to avoid the memory issue and run the
calculation? I am also planning to run simulations on nanoclusters
with more than 1000 atoms.
Thanks a lot in advance for your kind help.
Antonio
--
_______________________________________________
Antonio Cammarata, PhD in Physics
Associate Professor in Applied Physics
Advanced Materials Group
Department of Control Engineering - KN:G-204
Faculty of Electrical Engineering
Czech Technical University in Prague
Karlovo Náměstí, 13
121 35, Prague 2, Czech Republic
Phone: +420 224 35 5711
Fax: +420 224 91 8646
ORCID: orcid.org/0000-0002-5691-0682
WoS ResearcherID: A-4883-2014
_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian
people and expresses its concerns about the devastating
effects that the Russian military offensive has on their
country and on the free and peaceful scientific, cultural,
and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users
GIUSEPPE MATTIOLI
CNR - ISTITUTO DI STRUTTURA DELLA MATERIA
Via Salaria Km 29,300 - C.P. 10
I-00015 - Monterotondo Scalo (RM)
Mob (*preferred*) +39 373 7305625
Tel + 39 06 90672342 - Fax +39 06 90672316
E-mail: <giuseppe.matti...@ism.cnr.it>