The qq2rr code is serial, right? After running it in /scratch (because of the memory issue) it shows a bus error:
/var/spool/slurm/slurmd.spool/job3829394/slurm_script: line 38:
 179584 Done          ls anh*
 179585 Bus error   | /gpfs/home/kghosh/kanka/qe-6.5/bin/d3_qq2rr.x 1 1 1
Job finished
Any specific reason for this error?
It is parallelized with OpenMP (not MPI), although I have not tested it
in a while. I do not know what causes a bus error; it is not something I
had seen since the nineties. Maybe out of memory? If you are running it
on a cluster, it may be better to submit it as a batch job even if it is serial.
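For what it's worth, a minimal sketch of such a Slurm job script (the memory, time, thread count, and scratch path are placeholders to adapt to your cluster; the binary path and the pipeline are the ones from your log, and since d3_qq2rr.x is OpenMP-only, the script asks for a single task with several threads):

```shell
#!/bin/bash
#SBATCH --job-name=qq2rr
#SBATCH --nodes=1
#SBATCH --ntasks=1               # serial/OpenMP code: one task only
#SBATCH --cpus-per-task=8        # OpenMP threads
#SBATCH --mem=64G                # be generous, in case the bus error is out-of-memory
#SBATCH --time=24:00:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
cd /scratch/$USER/d3q_run        # placeholder: wherever the anh* files are
ls anh* | /gpfs/home/kghosh/kanka/qe-6.5/bin/d3_qq2rr.x 1 1 1
```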
Regards,
Kanka
Kanka Ghosh
Postdoctoral Researcher
I2M-Bordeaux
University of Bordeaux, CNRS UMR 5295
Site: Ecole Nationale Supérieure des Arts et Métiers
Bordeaux-Talence 33400
------------------------------------------------------------------------
*From: *"Lorenzo Paulatto" <[email protected]>
*To: *"users" <[email protected]>
*Sent: *Friday, December 4, 2020 1:12:09 PM
*Subject: *Re: [QE-users] D3Q code stopped due to davcio error
Yes, it took a little more than 5 days to compute only the first
q-point. Anyway, it seems that I should use a 1x1x1 grid instead of
2x2x2. But are you suggesting doing the single-mode calculation
with the 1x1x1 grid, or the "mode=full" run using the 1x1x1 grid?
Yes, but there is no need to do it: you have done it already. You can just call
d3_qq2rr.x and specify "1 1 1" as the grid size:
ls anh* | d3_qq2rr.x 1 1 1
and it will automatically compute the force constants from the
calculation at (0,0,0). This way you can immediately test how it works.
If you want to try the 2x2x2 grid, I would use 10 pools and maybe try
with *fewer* CPUs per pool: at the moment you are using 128, which
requires a lot of communication. If the calculation fits in RAM, I
would recommend keeping each pool on a single compute node.
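Something along these lines is what I mean (the node and task counts are just an example to adapt to your machine; d3q.x accepts the usual QE parallelization flags):

```shell
# Hypothetical layout: 10 pools, one pool per node,
# 32 MPI tasks per pool instead of 128 (less communication inside each pool)
mpirun -np 320 d3q.x -npool 10 -in d3q.in > d3q.out
```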
You may try to use some local scratch in order to avoid running out of
disk space (ask the cluster managers what to use).
Finally, if you manage to get everything running, you can run all the
q-point triplets simultaneously as different batch jobs by setting
"first" and "last". You can use the same outdir and prefix: as long
as the jobs work on different triplets, they will not interfere (this is
true for d3q, but not in general for other linear response codes).
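The input fragment for one such job would look something like this (a sketch only: I am assuming the &d3_input namelist of d3q, so check the d3q examples for the exact keyword names; the prefix and outdir values are placeholders):

```
&d3_input
  mode   = 'full'
  prefix = 'mysystem'   ! same prefix in every job
  outdir = './tmp'      ! same outdir in every job
  first  = 2            ! first q-point triplet this job computes
  last   = 2            ! last q-point triplet this job computes
/
```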
hth
Regards,
Kanka
------------------------------------------------------------------------
*From: *"Lorenzo Paulatto" <[email protected]>
*To: *"users" <[email protected]>
*Sent: *Friday, December 4, 2020 9:09:50 AM
*Subject: *Re: [QE-users] D3Q code stopped due to davcio error
Thanks for pointing out the storage issue. Yes, I am running
it at a French computing centre: the University of Bordeaux's
cluster system (Curta, MCIA). I am attaching the d3q
output file here. Indeed, it was in the process of computing the
second q-point triplet.
I do not have access to the Bordeaux cluster, but I could ask for it if
you need me to look at the run. That said, I see that computing
the first q-point took about 5 days; it will take at least a
month to do the second point! Because it has less symmetry, the
code needs to compute 2x more k-points and 3x more perturbations.
"Maybe for such a large system you can get some decent
force-constants already from (0,0,0) alone"
In that case, do you mean using the "mode=gamma-only" tag?
Not really: the triplet (0,0,0) is in itself the 1x1x1 grid, and
you can treat it as such. Thanks to some Fourier interpolation
trickery, you can use it to get the D3 matrices at any point.
Also, the d3_qq2rr code is not particularly optimized and is not
parallelized; I'm not sure you would manage to compute the Fourier
transform of the 2x2x2 grid anyway.
You have to keep in mind that the three-body force constants become
huge very quickly with the number of atoms and the size of the
grid: each D3 matrix has (3*nat)^3 complex elements, and an
n x n x n grid contains n^6 triplets.
In your case, the 2x2x2 grid would use about 2.2 GB of RAM, which
is probably still feasible, but I would try the 1x1x1 grid first.
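For the record, the estimate can be reproduced in a couple of lines (nat = 44 from your system, and I am assuming 16 bytes per double-precision complex element):

```shell
# (3*nat)^3 complex elements per D3 matrix, n^6 triplets for an n x n x n grid
nat=44; n=2
per_matrix=$(( (3*nat)*(3*nat)*(3*nat)*16 ))   # bytes per D3 matrix
triplets=$(( n*n*n*n*n*n ))                    # n^6 = 64 triplets for 2x2x2
total=$(( per_matrix * triplets ))             # bytes for the whole grid
awk -v b="$total" 'BEGIN{printf "%.2f GiB\n", b/2^30}'   # prints 2.19 GiB
```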
cheers
Regards,
Kanka
------------------------------------------------------------------------
*From: *"Lorenzo Paulatto" <[email protected]>
*To: *"users" <[email protected]>
*Sent: *Thursday, December 3, 2020 11:23:58 PM
*Subject: *Re: [QE-users] D3Q code stopped due to davcio error
task # 71
from davcio : error # 5011
error while writing from file
".//D3_Q1.0_0_0_Q2.0_0_-1o2_Q3.0_0_1o2/scf.d1.dq1pq1.72"
I guess it may have run out of space; d3q uses a ton of disk
space and there is no easy way to avoid this. If you are
running on any of the French computing centres I can try to
have a look directly.
I do not think the change in the number of CPUs could cause this
problem, but if you provide the full output I can check. Also,
44 atoms is a lot for the d3q code. It seems like you're
running the second q-point triplet, which is of kind (0,q,-q);
it takes much more time and disk space than the triplet
(0,0,0). Maybe for such a large system you can get some decent
force constants already from (0,0,0) alone.
cheers
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users