[QE-users] [QE-GPU] Significant Slowdown in GPU Phonon Calculation Preparation

2024-04-29 Thread Yin-Ying Ting

Dear Quantum Espresso Experts,

I've been running a phonon calculation using QE-7.2 on both CPU and GPU and 
have noticed significantly different runtimes in the preparation step. I'd 
appreciate insights on how to improve the GPU performance.


CPU Setup:
   Nodes: 8
   Cores per Node: 128
   Parallelization: 8 images, 4 npools (calculation has 4 k-points)

GPU Setup:
   Nodes: 8
   GPUs per Node: 4
   OMP_NUM_THREADS: 1
   Parallelization: 8 images, 4 npools (calculation has 4 k-points)
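For reference, both runs were launched roughly as follows (a reconstructed sketch, not the actual job scripts; the MPI rank counts, binary paths, and input/output file names are my assumptions):

# CPU run: 8 nodes x 128 cores = 1024 MPI ranks, split into 8 images and 4 k-point pools
srun -n 1024 ph.x -ni 8 -nk 4 -inp ph.in > ph_cpu.out

# GPU run: 8 nodes x 4 GPUs = 32 MPI ranks (one per GPU), same 8 images and 4 pools
srun -n 32 ph.x -ni 8 -nk 4 -inp ph.in > ph_gpu.out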



The preparation steps, including "Compute atoms," the Ewald sum, and the charge-density calculation, took around 9 hours on the CPU setup but over 24 hours on the GPU setup. Below is the relevant section of the CPU calculation output, printed before the SCF of each representation/mode is computed:



Compute atoms:     1,    2,    3,    4,    5,    6,    7,    8,
    9,   10,   11,   12,   13,   14,   15,   16,
   17,   18,   19,   20,   21,   22,   23,   24,



Alpha used in Ewald sum =   2.8000

negative rho (up, down):  9.781E-01 0.000E+00
PHONON   :  8h51m CPU  8h56m WALL



Representation #   1 mode #   1

Self-consistent Calculation
---


Why might the preparation step for the GPU-based calculation be taking 
significantly longer? Are there specific optimizations or configurations I can 
apply to improve GPU performance in QE phonon calculations?

Thank you in advance for your help!


Best regards,

PhD student at Forschungszentrum Jülich
Yin-Ying Ting

--

Forschungszentrum Jülich GmbH
Institute of Energy and Climate Research
Theory and Computation of Energy Materials (IEK-13)
E-mail: y.t...@fz-juelich.de





[QE-users] Hubbard U Parameters in pmw.x

2024-01-24 Thread Yin-Ying Ting

Dear Quantum Espresso Community,

I am currently working with the poor man's Wannier code pmw.x, as demonstrated in example05 of the PP package. In this example, the procedure begins with a self-consistent field (SCF) calculation without Wannier functions and with very small Hubbard U parameters (around 0.001). This is followed by running pmw.x and then a second SCF calculation, this time including the Wannier functions and the full
Hubbard U parameters.
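For context, the example's workflow is roughly the following (a simplified sketch; the input file names are placeholders, not the actual example scripts):

# Step 1: SCF run with very small Hubbard U (~0.001) and no Wannier functions
pw.x  < scf_smallU.in > scf_smallU.out
# Step 2: construct the poor man's Wannier functions from the converged run
pmw.x < pmw.in > pmw.out
# Step 3: second SCF run, now with the Wannier functions and the full Hubbard U
pw.x  < scf_U.in > scf_U.out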

My question pertains to the rationale behind this two-step SCF
calculation approach, particularly concerning the Hubbard U parameters.
Why is it not recommended or feasible to employ the Hubbard U parameters
right from the first SCF calculation? Understanding the underlying
reasoning for this process would greatly enhance my comprehension of the
method and its applications.

I appreciate any insights or explanations you can provide. Thank you for
your time and assistance.

Best regards,

PhD student at Forschungszentrum Jülich

Yin-Ying Ting

--

Forschungszentrum Jülich GmbH
Institute of Energy and Climate Research
Theory and Computation of Energy Materials (IEK-13)
E-mail: y.t...@fz-juelich.de






Re: [QE-users] [QE-GPU] High GPU oversubscription detected

2023-11-30 Thread Yin-Ying Ting

Dear Paolo,

Thank you for your prompt response. Your suggestion was very helpful.

I have reviewed the numbers and found that, regardless of the value set in --gres=gpu:X, ndev is always reported as 1. Our HPC documentation states that --gres=gpu:X is the correct way to request GPUs, with each node having 4 GPUs. Here is the output when I set --gres=gpu:4:

GPU acceleration is ACTIVE.
GPU-aware MPI enabled

nproc (MPI process): 4
ndev (GPU per Node): 1
nnode (Nodes):   1
Message from routine print_cuda_info:
High GPU oversubscription detected. Are you sure this is what you want?

I monitored GPU usage every 10 seconds, and it appears that all 4 GPUs are active when --gres=gpu:4 is set:

utilization.gpu [%], utilization.memory [%]
96 %, 37 %
95 %, 76 %
95 %, 50 %
95 %, 76 %
time = 70 s

For reference, here is my sbatch submission script:

---

#!/bin/bash -x
#SBATCH --gres=gpu:4 --partition=dc-gpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:00:20

export OMP_NUM_THREADS=1

module load NVHPC/23.7-CUDA-12
module load CUDA/12
module load OpenMPI/4.1.5
module load mpi-settings/CUDA
module load imkl/2023.2.0

monitor_gpu_usage() {
   while true; do
      nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv \
         >> gpu_usage_$SLURM_JOB_ID.csv
      sleep 10
   done
}
monitor_gpu_usage &
srun -n 4 pw.x -nk 4 -nd 1 -nb 1 -nt 1 < inp_pwscf > out_pwscf

-
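As an additional cross-check (a hypothetical one-liner, not part of the submitted script), one can print which GPUs each MPI rank actually sees and compare that with the ndev value reported by pw.x:

srun -n 4 bash -c 'echo "rank $SLURM_PROCID sees CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'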


Could you please provide guidance on resolving the oversubscription issue? 
Thank you very much in advance.

Kind regards,

Yin-Ying Ting


On 29.11.23 15:53, Paolo Giannozzi wrote:
On 11/27/23 11:32, Yin-Ying Ting wrote:

Based on the environment.f90 file, this message is triggered when nproc > ndev * nnode * 2. If I understand correctly, I have nproc (number of parallel processes) = 4, ndev (number of GPU devices per node) = 4, and nnode (number of nodes) = 1, so the condition should be false (4 > 8 does not hold). Despite this, the message still appears. All 4 GPUs were active during the run.

funny. Even funnier, the number of GPUs actually used does not seem to be 
written anywhere on output.

Add a line printing nproc, ndev, nnode just before the warning is issued, 
recompile and re-run. One (at least) of those numbers is not what you expect. 
Computers are not among the most reliable machines, but they should be able to 
find out who is larger between 4 and 8

Paolo
--

Forschungszentrum Jülich GmbH
Institute of Energy and Climate Research
Theory and Computation of Energy Materials (IEK-13)
E-mail: y.t...@fz-juelich.de





[QE-users] [QE-GPU] High GPU oversubscription detected

2023-11-27 Thread Yin-Ying Ting

Dear Quantum Espresso Experts,

I am encountering an unexpected message about GPU oversubscription when running pw.x with GPU support on an HPC system. The system has 4 GPUs per node. I did not receive any warning when running with 1 node and 1 GPU (ntasks=1, npools=1) or with 1 node and 2 GPUs (ntasks=2, npools=2). However, when running with 1 node and 4 GPUs (ntasks=4, npools=4), I receive the following message: "Message from routine print_cuda_info: High GPU oversubscription detected. Are you sure this is what you want?"

Here's the SLURM batch script I used:

#!/bin/bash -x
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --output=TbHf.%j
#SBATCH --error=mpi-err.%j
#SBATCH --gres=gpu:4

export OMP_NUM_THREADS=1

srun ~/qe-7.2-gpu/bin/pw.x -nk 4 -nb 1 -nt 1 -nd 1 < inp_pwscf > out_pwscf


Based on the environment.f90 file, this message is triggered when nproc > ndev * nnode * 2. If I understand correctly, I have nproc (number of parallel processes) = 4, ndev (number of GPU devices per node) = 4, and nnode (number of nodes) = 1, so the condition should be false (4 > 8 does not hold). Despite this, the message still appears. All 4 GPUs
were active during the run.
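Plugging these numbers into the condition from environment.f90 makes the check explicit (a trivial shell sanity check, not QE code):

nproc=4; ndev=4; nnode=1
if (( nproc > ndev * nnode * 2 )); then echo oversubscribed; else echo "no warning expected"; fi
# prints "no warning expected", since 4 > 8 is false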

Could you please help me understand why this message is appearing under these 
conditions? Any insights or suggestions to address this would be greatly 
appreciated.

Thank you in advance for your help!


PhD student at Forschungszentrum Jülich

Yin-Ying Ting


--

Forschungszentrum Jülich GmbH
Institute of Energy and Climate Research
Theory and Computation of Energy Materials (IEK-13)
E-mail: y.t...@fz-juelich.de



