Dear developers and users,

I tried to run the GPU version of QE for an electron-phonon coupling
calculation on an A100 card. The structural relaxation and the self-consistent
calculation both completed successfully. However, when I ran the phonon
calculation, the job crashed with a strange error:


##################
[m005:65520:0:65520] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffffffffffc)


/fs08/home/js_luqing/src/qe-7.2/PHonon/PH/phq_setup.f90: [ phq_setup_() ]
      ...
      322   !     nat_todo, atomo, comp_irr
      323
      324   DO irr=0,nirr
==>   325      comp_irr(irr)=comp_irr_iq(irr,current_iq)
      326      IF (elph .AND. irr>0) comp_elph(irr)=comp_irr(irr)
      327   ENDDO
      328   !


==== backtrace (tid:  65520) ====
 0 0x00000000004a2780 phq_setup_()  /fs08/home/js_luqing/src/qe-7.2/PHonon/PH/phq_setup.f90:325
 1 0x00000000004700e1 initialize_ph_()  /fs08/home/js_luqing/src/qe-7.2/PHonon/PH/initialize_ph.f90:79
 2 0x000000000041a811 do_phonon_()  /fs08/home/js_luqing/src/qe-7.2/PHonon/PH/do_phonon.f90:100
 3 0x0000000000413d25 MAIN_()  /fs08/home/js_luqing/src/qe-7.2/PHonon/PH/phonon.f90:78
 4 0x0000000000413c71 main()  ???:0
 5 0x0000000000022555 __libc_start_main()  ???:0
 6 0x000000000040cd8d _start()  ???:0
=================================
[m005:65520] *** Process received signal ***
[m005:65520] Signal: Segmentation fault (11)
[m005:65520] Signal code:  (-6)
[m005:65520] Failing at address: 0x6a80000fff0
[m005:65520] [ 0] /lib64/libpthread.so.0(+0xf630)[0x2ac4edf09630]
[m005:65520] [ 1] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x4a2780]
[m005:65520] [ 2] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x4700e1]
[m005:65520] [ 3] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x41a811]
[m005:65520] [ 4] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x413d25]
[m005:65520] [ 5] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x413c71]
[m005:65520] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2ac4ee9e7555]
[m005:65520] [ 7] /fs08/home/js_luqing/src/qe-7.2/bin/ph.x[0x40cd8d]
[m005:65520] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node m005 exited on signal 11 (Segmentation fault).

#########################


I am not an expert in coding, but the crash appears to occur at line 325 of
phq_setup.f90, which is fairly strange. I do not know how to solve this
problem and would be grateful for any help.
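
In case it helps with the diagnosis, the faulting address reported in the
first line of the error can be decoded with the short Python sketch below. The
helper name as_signed_64 is my own and is not part of QE. Read as a 64-bit
two's-complement number, the address is a small negative offset. My guess,
which is only an assumption and which I have not verified, is that this points
to an array that was not allocated or was indexed out of bounds, rather than to
a problem with the source line itself.

```python
def as_signed_64(address: int) -> int:
    """Interpret an unsigned 64-bit address as a two's-complement signed value."""
    if not 0 <= address < 2**64:
        raise ValueError("address must fit in 64 bits")
    # Values with the top bit set represent negative numbers.
    return address - 2**64 if address >= 2**63 else address


# Faulting address copied from the "Caught signal 11" line of the log.
fault_address = 0xFFFFFFFFFFFFFFFC
print(as_signed_64(fault_address))  # -4
```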


Yours,
Qing Lu


lq1998
1148330...@qq.com



 
_______________________________________________
The Quantum ESPRESSO community stands by the Ukrainian
people and expresses its concerns about the devastating
effects that the Russian military offensive has on their
country and on the free and peaceful scientific, cultural,
and economic cooperation amongst peoples
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users
