Re: [QE-users] MD runs out of memory with increasing number of cores

2021-06-24 Thread Paolo Giannozzi
It's not easy. There is a trick (see dev-tools/mem_counter) to track the
allocated memory, but it requires some recompilation (possibly after some
tweaking) and it reports only memory allocated with the Fortran "allocate"
statement. I have tried it, of course, and it does not show anything
suspicious. Otherwise, monitor memory usage with "memstat".
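
As an illustration of such external monitoring (a generic shell sketch, not a
QE feature; the process name pw.x, the 60-second interval and the file names
are assumptions), one can log the resident set size of the running pw.x
processes alongside the MD run and check whether it grows with the number of
time steps:

    # poll resident memory (RSS, in kB) of all local pw.x processes every 60 s
    while true; do
        date '+%s' >> pw_rss.log
        ps -C pw.x -o pid=,rss= >> pw_rss.log
        sleep 60
    done &
    MONITOR_PID=$!
    mpirun -np 144 pw.x -in md.in > md.out
    kill $MONITOR_PID

A roughly constant per-process RSS across MD steps argues against a leak
inside pw.x; a steady growth points toward one, in the code or in the linked
libraries.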

Paolo

On Wed, Jun 23, 2021 at 9:23 PM Lenz Fiedler  wrote:

> Dear Prof. Giannozzi,
>
> Ah, I understand, that makes sense. Do you have any advice on how best to
> track down such a memory leak in this case? The behavior is reproducible
> with my setup.
>
> Kind regards
> Lenz
>
>
>
> Am Mi., 23. Juni 2021 um 14:02 Uhr schrieb Paolo Giannozzi <
> p.gianno...@gmail.com>:
>
>> On Wed, Jun 23, 2021 at 1:31 PM Lenz Fiedler 
>> wrote:
>>
>>> (and the increase in the number of processes is most likely the reason for
>>> the error, if I understand you correctly?)
>>>
>>
>> Not exactly. Too many processes may result in too much global memory
>> usage, because some arrays are replicated on each process. If you exceed
>> the available global memory, the code will crash. BUT: it will do so during
>> the first MD step, not after 2000 MD steps. The memory usage should not
>> increase with the number of MD time steps. If it does, there is a memory
>> leak, either in the code or somewhere else (libraries etc.).
>>
>> Paolo
>>
>>> this is not a problem. For my Beryllium calculation it is more
>>> problematic, since the 144-processor case really gives the best performance
>>> (I have uploaded a file called performance_Be128.png to show my timing
>>> results), but I still run out of memory after 2700 time steps. This is
>>> also manageable, though, since I can always restart the calculation and
>>> perform another 2700 time steps. With this approach I was able to perform
>>> 10,000 time steps in just over a day. I am running more calculations on
>>> larger Be and Fe cells and will investigate this behavior there.
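
Such a restart is typically done by rerunning pw.x with restart_mode='restart'
in &CONTROL, keeping the same prefix and outdir as the interrupted run; a
minimal sketch (the names and values below are placeholders, not taken from
the actual Be128 input):

    &CONTROL
       calculation  = 'md'
       restart_mode = 'restart'   ! continue from the last saved configuration
       prefix       = 'be128'     ! must match the interrupted run
       outdir       = './tmp/'    ! must match the interrupted run
       nstep        = 2700        ! MD steps to perform in this run
       max_seconds  = 85000       ! optional: stop cleanly before the queue limit
    /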
>>>
>>> I have also used the "gamma" option for the k-points to get the
>>> performance benefits you outlined. For the Fe128 cell, I achieved the best
>>> performance with 144 processors and the "gamma" option (about 90 s per SCF
>>> cycle). I am still not within my personal target of ~30 s per SCF cycle,
>>> but I will start looking into the choice of my PSP and cutoffs (along with
>>> considering OpenMP and task-group parallelization) rather than blindly
>>> throwing more and more processors at the problem.
>>>
>>> Kind regards
>>> Lenz
>>>
>>> PhD Student (HZDR / CASUS)
>>>
>>>
>>> Am Sa., 19. Juni 2021 um 09:25 Uhr schrieb Paolo Giannozzi <
>>> p.gianno...@gmail.com>:
>>>
 I tried your Fe job on a 36-core machine (with Gamma point to save time
 and memory) and found no evidence of memory leaks after more than 100 
 steps.

> The best performance I was able to achieve so far was with 144 cores
> defaulting to -nb 144, so am I correct to assume that I should try
> e.g. -nb 144 -ntg 2 for 288 cores?
>

 You should not use option -nb except in some rather special cases.

 Paolo


> PhD Student (HZDR / CASUS)
>
> Am Mi., 16. Juni 2021 um 07:33 Uhr schrieb Paolo Giannozzi <
> p.gianno...@gmail.com>:
>
>> Hard to say without knowing exactly what goes out of which memory
>> limits. Note that not all arrays are distributed across processors, so a
>> considerable number of arrays are replicated on all processes. As a
>> consequence the total amount of required memory will increase with the
>> number of MPI processes. Also note that a 128-atom cell is not "large"
>> and
>> 144 cores are not "a small number of processors". You will not get any
>> advantage by just increasing the number of processors any more, quite the
>> opposite. If you have too many idle cores, you should consider
>> - "task group" parallelization (option -ntg)
>> - MPI+OpenMP parallelization (configure --enable-openmp)
>> Please also note that ecutwfc=80 Ry is a rather large cutoff for a
>> USPP (while ecutrho=320 is fine) and that running with K_POINTS Gamma
>> instead of 1 1 1 0 0 0 will be faster and take less memory.
>>
>> Paolo
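
To make the suggestions above concrete, here is a minimal sketch of how they
appear in practice (the file names, the 144-process count and the
2-thread/2-task-group values are placeholders for illustration, not
recommendations for this specific system):

    (pw.x input file)
    K_POINTS gamma

    (job script; the OpenMP part requires a build configured with --enable-openmp)
    export OMP_NUM_THREADS=2
    mpirun -np 144 pw.x -ntg 2 -in fe128.in > fe128.out

With "K_POINTS gamma" the Gamma-only code path is used (real wavefunctions,
roughly half the memory for the wavefunction arrays, faster FFTs), whereas an
automatic 1 1 1 0 0 0 mesh goes through the general k-point machinery.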
>>
>> On Mon, Jun 14, 2021 at 4:22 PM Lenz Fiedler 
>> wrote:
>>
>>> Dear users,
>>>
>>> I am trying to perform an MD simulation for a large cell (128 Fe
>>> atoms, gamma point) using pw.x and I get a strange scaling behavior. To
>>> test the performance I ran the same MD simulation with an increasing
>>> number of nodes (2, 4, 6, 8, etc.) using 24 cores per node. The simulation
>>> is successful when using 2, 4, and 6 nodes, i.e. 48, 96 and 144 cores
>>> respectively (albeit slow, which is within my expectations for such a
>>> small number of processors).
>>> Going to 8 or more nodes, I run into an out-of-memory error after
>>> about two time steps.
>>> I am a little bit con

Re: [QE-users] CUDA-compiled version of Quantum Espresso

2021-06-24 Thread Filippo Spiga
Hello Chiara,

Maxwell and Turing architectures are not a good fit due to their reduced
double-precision support; they target a different market segment. Kepler,
Pascal, Volta and Ampere are a good fit. Kepler and Pascal are old, so I
strongly suggest Volta or Ampere (the latest being better, obviously). If
you have Pascal, it can work.

Independently of the GPU architecture, some GPU SKUs are more suitable for
double precision than others. Quadro cards are not a good fit, with two
exceptions: the Quadro GP100 (very old, you can't buy it new anymore) and
the GV100 (you may still be able to find some, but it has reached EOL). For
serious HPC double-precision calculations you need the Tesla product line:
A100 PCIe or A100 SXM (HGX or DGX products).

In short: don't buy Quadro cards and don't buy GeForce; performance will
not be optimal. If you are procuring a proper HPC system, you should target
A100 products.
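
As a rough sketch of how such a GPU build is configured (assuming a recent QE
release with GPU support and the NVIDIA HPC SDK compilers in the path; the
compute capability and CUDA runtime values below are examples for an A100
with CUDA 11 and must match the actual system):

    ./configure --with-cuda=$CUDA_HOME --with-cuda-cc=80 \
                --with-cuda-runtime=11.0 --enable-openmp
    make pw

The A100 corresponds to compute capability 8.0, hence cc=80; for a V100 one
would use 70 instead.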

HTH

--
Filippo SPIGA ~ http://fspiga.github.io ~ skype: filippo.spiga


On Sun, 20 Jun 2021 at 16:29, Chiara Biz  wrote:

> Dear QE team,
>
> we would like to compile GPU-QE, since we learnt that it is faster for
> spin-polarized wavefunctions.
> We would like to ask you which NVIDIA architecture is supported by QE:
> Maxwell, Pascal, Turing, Ampere?
> Can QE support SLY?
>
> These are crucial aspects for us because they will determine our choices
> on the market.
>
> Thank you very much for your attention and have a nice day.
>
> Yours Sincerely,
> Chiara Biz (MagnetoCat SL/SpinCat consortium)
>
___
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users

Re: [QE-users] Top surface layer seems to be isolated from the rest of the system

2021-06-24 Thread Dr. K. C. Bhamu
Dear QE Users,
Could you please suggest any possible solution?

Regards
Bhamu

On Tue, Jun 22, 2021 at 1:44 PM Dr. K. C. Bhamu  wrote:

> Dear QE Users,
>
> I am trying to relax the (100) surface of WO3 (SG Pm-3m: 221).
> The relaxation for both the bulk and the surface has converged smoothly.
> The problem I am facing for the surface is that the top layer (in a 5-layer
> system with the bottom 3 layers kept fixed) is dissociating from the
> surface. The vertical W-O bond length is elongated from 1.91 Ang to 2.21
> Ang, and the top surface layer seems to be isolated from the rest of the
> system.
>
> When I use a 6-layer system with the bottom 3 layers kept fixed, the top
> two layers seem to be isolated from the rest of the system, and the two
> isolated layers are also separated from each other.
> The input and output files can be accessed from here:
> https://we.tl/t-AlHUBg1DwE.
>
> I am wondering whether the high-symmetry atomic positions of WO3 are the
> culprit?
> Any suggestions will be appreciated.
>
> Regards
>
> K C Bhamu
>
> University of Ulsan,
> Republic of Korea
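
For readers unfamiliar with the setup described above: keeping the bottom
layers of a slab fixed in pw.x is done with the optional if_pos flags after
each atomic coordinate (a generic sketch; the coordinates below are
placeholders and do not come from the poster's input, which is in the linked
archive):

    ATOMIC_POSITIONS angstrom
    W   0.000  0.000  0.000   0 0 0
    O   1.910  0.000  0.000   0 0 0
    ...
    W   0.000  0.000  7.640   1 1 1
    O   1.910  0.000  7.640   1 1 1

The trailing "0 0 0" zeroes the force components on the bottom-layer atoms so
they stay frozen, while "1 1 1" (the default) lets the top-layer atoms relax.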

Re: [QE-users] Projected Band structure (Mayuri Bora)

2021-06-24 Thread Marcelo Albuquerque
Dear Mayuri Bora,

There are some examples inside the PP directory of the QE package, for
example example04, example05, fermisurf_example, ForceTheorem_example,
and MolDos_example.
I believe the first two are sufficient.
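
The projections themselves are computed with projwfc.x; its input is a short
&projwfc namelist along these lines (a sketch only; the prefix, outdir and
file names are placeholders that must match the preceding pw.x run, and a
band-structure calculation along the desired k-path is assumed to have been
done first):

    &projwfc
       prefix  = 'hetero'
       outdir  = './tmp/'
       filproj = 'hetero.proj'
    /

projwfc.x writes the projection of each Kohn-Sham state onto atomic orbitals,
which can then be used to weight ("fatten") the bands of the semimetal and of
the ferromagnetic insulator separately.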

Hope it helps.

Best wishes,

Marcelo Albuquerque

Ph.D. Candidate

Institute of Physics

Fluminense Federal University
NiterĂ³i/RJ - Brazil



On Thu, Jun 24, 2021 at 7:01 AM Mayuri Bora wrote:

>
>
> I am waiting for the reply.
>
> Thanks in advance.
>
>
> > Dear QE Users,
> >
> > Currently I am working on a heterostructure consisting of a semimetal and
> > a ferromagnetic insulator. I am stuck on projecting the band structure: I
> > am not sure on which portion of the system I have to project it. I have
> > given nband=150 for k-points 6 6 1.
> >
> > I will be thankful for the help.
> >
> > Regards
> > Mayuri
> >
> > Mayuri Bora
> > INSPIRE Fellow
> > Advanced Functional Material Laboratory
> > Tezpur University
> > Napaam
> > http://www.tezu.ernet.in/afml/
> >
> >
>
>
> Mayuri Bora
> INSPIRE Fellow
> Advanced Functional Material Laboratory
> Tezpur University
> Napaam
> http://www.tezu.ernet.in/afml/