Dear Julien,

To solve both problems, I would define a new atomic type for the pair of atoms
of interest and apply the inter-site V between these two "Hubbard atoms". All
the other atoms will then be treated as non-Hubbard, and hence the bottleneck
in alloc_neigh should disappear; a sketch of the corresponding input is below.
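
For example, the relevant parts of the input could look as follows. This is
only a minimal sketch, assuming for concreteness a Ti-Ti dimer coupled through
the 3d manifolds and ortho-atomic projectors; the labels, pseudopotential file
names, positions, and the 0.5 eV value are placeholders to be adapted to your
system:

   &SYSTEM
      ...
      ntyp = 3   ! one extra type (Ti1) in addition to Ti and O
      ...
   /
   ATOMIC_SPECIES
   ! Ti1 uses the same mass and pseudopotential as plain Ti
   Ti   47.867  Ti.pbe.UPF
   Ti1  47.867  Ti.pbe.UPF
   O    15.999  O.pbe.UPF
   ATOMIC_POSITIONS (crystal)
   ! only the two dimer atoms carry the new type Ti1
   Ti1  0.1000  0.2000  0.3000
   Ti1  0.1500  0.2500  0.3500
   Ti   ...
   O    ...
   HUBBARD (ortho-atomic)
   ! 1 and 2 are the indices of the two Ti1 atoms in the
   ! ATOMIC_POSITIONS list; strictly, the second index runs over
   ! the 3x3x3 virtual supercell used for the neighbor search, so
   ! it differs if the relevant neighbor is a periodic image
   V Ti1-3d Ti1-3d 1 2 0.5

Since only the two Ti1 atoms are then Hubbard atoms, the projectors and the
neighbor list are built for them alone, which should also answer your second
question about the wasted projectors.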

Greetings,
Iurii

----------------------------------------------------------
Dr. Iurii TIMROV
Tenure-track scientist
Laboratory for Materials Simulations (LMS)
Paul Scherrer Institut (PSI)
CH-5232 Villigen, Switzerland
+41 56 310 62 14
https://www.psi.ch/en/lms/people/iurii-timrov
________________________________
From: users <[email protected]> on behalf of 
[email protected] <[email protected]>
Sent: Monday, May 5, 2025 01:10
To: [email protected] <[email protected]>
Subject: [QE-users] DFT+U+V neighbour allocation extremely slow and not 
parallelized


Dear QE users,



I am currently trying to run calculations on a 324-atom crystal system using
DFT+U+V (I'm running a recompiled version of QE 7.2 with an increased natx
parameter). More specifically, I am applying only the inter-site V parameter
to a pair of specific atomic orbitals that are supposed to form a polaronic
dimer (after tuning, the Hubbard potential helps to localize the polaronic
charge on the dimer). This worked well on smaller systems.

The issue that I'm facing with the large system is that the calculation takes
very long to initialize (~20 000 seconds), while the iterations themselves are
relatively fast (~3 000 seconds). It seems strange that the preliminary
calculations take so much longer than the actual iterations. On smaller
systems (96 atoms) I did not observe this trend: the bottleneck was the
iterations, as expected, and there was barely any initialization time.

The most concerning aspect is that this bottleneck does not seem to be
distributed in parallel at all: no matter how many nodes I use, the
pre-iteration calculations always take about 20 000 seconds, whereas the
iteration time does decrease when using more nodes.

The routine taking up all that time is the Hubbard routine "alloc_neigh". I
thought it might be due to memory issues, but according to Slurm's "seff"
utility the job only used about 30% of the available memory. I have also
experimented with different I/O settings to try to reduce memory usage,
without success.



Is there a way to speed up that part of the calculation, or at least to 
distribute it over several nodes? Am I doing something wrong in my input?

Additionally, it seems that the program computes Hubbard projectors for every
single atom of the species involved, even though I am only applying a V
parameter to two hand-picked atoms, which seems quite wasteful. Is there a way
to force the program to skip the Hubbard calculations for the atoms of the
same species that do not receive a V value?



I have attached the input and output files of an example job (the charge has
been set to 0 in that particular one, but the same problem occurs when the
polaronic charge is added).



Thanks in advance!

Julien

