Hmm... That's quite annoying... :( Thanks for reporting back!
On Wed, 30 Jun 2021 at 15:39, Karen Fidanyan <karen.fidan...@mpsd.mpg.de> wrote:

> Dear Nick,
>
> I've tried Divide-and-Conquer, Expert and QR; they all fail with the same
> backtrace. I couldn't compile MRRR; I think my ScaLAPACK is missing some
> routines.
>
> But, following your idea about a bug in ScaLAPACK, I recompiled Siesta
> with the MKL libraries from the Debian-9 repo. They are from 2019, so not
> that old. It also failed for Divide-and-Conquer, but MRRR, Expert and QR
> work fine. I think that is enough for my purposes, so I'll use MRRR, but I
> don't know what is wrong with D&C. I attach the output of the D&C run with
> MKL; maybe you will find it useful.
>
> Thank you for your help!
>
> Best,
> Karen
>
> On 6/30/21 11:25 AM, Nick Papior wrote:
>
> I have now tried to rerun it with 4.1, and I get no error, even in debug
> mode.
>
> My bet is that the ScaLAPACK library is an old and buggy one, but I could
> be wrong.
>
> Could you rerun with the different possibilities for Diag.Algorithm?
> I.e. try them all, see which ones work and which don't, and then report
> back.
>
> On Wed, 30 Jun 2021 at 11:16, Karen Fidanyan <karen.fidan...@mpsd.mpg.de> wrote:
>
>> Dear Nick,
>>
>> thanks for helping!
>>
>> I redid it with the -Og flag. The input, the *.psf files and the output
>> are attached. I also attach the debug.* files obtained with -DDEBUG.
>> I ran it as `mpirun -np 2 ~/soft/siesta-4.1/Obj-dbg-Og/siesta control.fdf
>> 2>&1 | tee siesta.out`.
>>
>> Sincerely,
>> Karen Fidanyan
>>
>> On 6/28/21 10:22 PM, Nick Papior wrote:
>>
>> I can't rerun without the psf files.
>>
>> Could you try to compile with -Og -g -fbacktrace (without -fcheck=all),
>> and then try again? :)
>>
>> On Mon, 28 Jun 2021 at 22:01, Karen Fidanyan <karen.fidan...@mpsd.mpg.de> wrote:
>>
>>> Dear Siesta users,
>>>
>>> I'm having a hard time trying to run SIESTA on my Debian-9 laptop.
>>> I have:
>>>
>>> GNU Fortran (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
>>> OpenMPI 2.0.2-2
>>> libblas 3.7.0-2, liblapack 3.7.0-2
>>> libscalapack-openmpi1 1.8.0-13
>>>
>>> My arch.make is the following:
>>> **********************************************************************
>>> .SUFFIXES:
>>> .SUFFIXES: .f .F .o .a .f90 .F90 .c
>>>
>>> SIESTA_ARCH = gfortran_openMPI
>>>
>>> FPP = $(FC) -E -P -x c
>>> FC = mpifort
>>> FC_SERIAL = gfortran
>>> FFLAGS = -O0 -g -fbacktrace -fcheck=all #-Wall
>>> FFLAGS_DEBUG = -g -O0
>>>
>>> PP = gcc -E -P -C
>>> CC = gcc
>>> CFLAGS = -O0 -g -Wall
>>>
>>> AR = ar
>>> RANLIB = ranlib
>>> SYS = nag
>>>
>>> LDFLAGS = -static-libgcc -ldl
>>>
>>> BLASLAPACK_LIBS = -llapack -lblas \
>>>                   -lscalapack-openmpi -lblacs-openmpi -lblacsF77init-openmpi \
>>>                   -lblacsCinit-openmpi \
>>>                   -lpthread -lm
>>>
>>> MPI_INTERFACE = libmpi_f90.a
>>> MPI_INCLUDE = .
>>>
>>> FPPFLAGS_MPI = -DMPI -DMPI_TIMING -D_DIAG_WORK
>>> FPPFLAGS = $(DEFS_PREFIX) -DFC_HAVE_FLUSH -DFC_HAVE_ABORT $(FPPFLAGS_MPI)
>>>
>>> INCFLAGS = $(MPI_INCLUDE)
>>>
>>> LIBS = $(BLASLAPACK_LIBS) $(MPI_LIBS)
>>>
>>> atom.o: atom.F
>>>         $(FC) -c $(FFLAGS_DEBUG) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_fixed_F) $<
>>>
>>> .c.o:
>>>         $(CC) -c $(CFLAGS) $(INCFLAGS) $(CPPFLAGS) $<
>>> .F.o:
>>>         $(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_fixed_F) $<
>>> .F90.o:
>>>         $(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_free_F90) $<
>>> .f.o:
>>>         $(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_fixed_f) $<
>>> .f90.o:
>>>         $(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_free_f90) $<
>>> **********************************************************************
>>>
>>> The code compiles without errors.
>>> If I run with Diag.ParallelOverK True, I can run on multiple cores, no
>>> errors.
>>> With Diag.ParallelOverK False, I can run `mpirun -np 1` without errors,
>>> but if I try to use >=2 cores, it fails with:
>>> **********************************************************************
>>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>>> reference.
>>>
>>> Backtrace for this error:
>>> #0  0x2ba6eb754d1d in ???
>>> #1  0x2ba6eb753f7d in ???
>>> #2  0x2ba6ec95405f in ???
>>> #3  0x2ba70ec1cd8c in ???
>>> #4  0x2ba6eab438a4 in ???
>>> #5  0x2ba6eab44336 in ???
>>> #6  0x563b3f1cfead in __m_diag_MOD_diag_c
>>>         at /home/fidanyan/soft/siesta-4.1/Src/diag.F90:709
>>> #7  0x563b3f1d2ef9 in cdiag_
>>>         at /home/fidanyan/soft/siesta-4.1/Src/diag.F90:2253
>>> #8  0x563b3ebc7c8d in diagk_
>>>         at /home/fidanyan/soft/siesta-4.1/Src/diagk.F:195
>>> #9  0x563b3eb9d714 in __m_diagon_MOD_diagon
>>>         at /home/fidanyan/soft/siesta-4.1/Src/diagon.F:265
>>> #10 0x563b3ed897cb in __m_compute_dm_MOD_compute_dm
>>>         at /home/fidanyan/soft/siesta-4.1/Src/compute_dm.F:172
>>> #11 0x563b3edbfaa5 in __m_siesta_forces_MOD_siesta_forces
>>>         at /home/fidanyan/soft/siesta-4.1/Src/siesta_forces.F:315
>>> #12 0x563b3f9a4005 in siesta
>>>         at /home/fidanyan/soft/siesta-4.1/Src/siesta.F:73
>>> #13 0x563b3f9a408a in main
>>>         at /home/fidanyan/soft/siesta-4.1/Src/siesta.F:10
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 0 with PID 0 on node fenugreek exited
>>> on signal 11 (Segmentation fault).
>>> **********************************************************************
>>>
>>> I ran it by
>>> `mpirun -np 2 ~/soft/siesta-4.1/Obj-debug-O0/siesta control.fdf | tee siesta.out`
>>>
>>> The header of the broken calculation:
>>> --------------------------------------------------------------------------------------------
>>> Siesta Version  : v4.1.5-1-g384057250
>>> Architecture    : gfortran_openMPI
>>> Compiler version: GNU Fortran (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
>>> Compiler flags  : mpifort -O0 -g -fbacktrace -fcheck=all
>>> PP flags        : -DFC_HAVE_FLUSH -DFC_HAVE_ABORT -DMPI -DMPI_TIMING -D_DIAG_WORK
>>> Libraries       : -llapack -lblas -lscalapack-openmpi -lblacs-openmpi
>>>                   -lblacsF77init-openmpi -lblacsCinit-openmpi -lpthread -lm
>>> PARALLEL version
>>>
>>> * Running on 2 nodes in parallel
>>> --------------------------------------------------------------------------------------------
>>>
>>> I also attach the fdf file and the full output with the error.
>>> Do you have an idea what is wrong?
>>>
>>> Sincerely,
>>> Karen Fidanyan
>>> PhD student
>>> Max Planck Institute for the Structure and Dynamics of Matter
>>> Hamburg, Germany
>>>
>>> --
>>> SIESTA is supported by the Spanish Research Agency (AEI) and by the
>>> European H2020 MaX Centre of Excellence (http://www.max-centre.eu/)
>>
>> --
>> Kind regards Nick
>
> --
> Kind regards Nick

--
Kind regards Nick
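For reference, the diagonalization settings discussed in this thread are chosen in the fdf input file. Below is a minimal sketch only; the exact value strings (e.g. for the MRRR and divide-and-conquer solvers) should be checked against the SIESTA 4.1 manual, and Diag.ParallelOverK is the separate workaround mentioned above.

**********************************************************************
# Sketch of an fdf fragment selecting the ScaLAPACK eigensolver.
# Value names assumed from the thread: divide-and-conquer, expert,
# QR, MRRR (verify spelling in the SIESTA 4.1 manual).
Diag.Algorithm        MRRR

# Alternative workaround from the thread: distribute k-points over
# MPI processes and diagonalize each k-point serially instead of
# using parallel ScaLAPACK diagonalization.
# Diag.ParallelOverK    true
**********************************************************************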
-- SIESTA is supported by the Spanish Research Agency (AEI) and by the European H2020 MaX Centre of Excellence (http://www.max-centre.eu/)
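Since the resolution reported in the thread was to rebuild against MKL's ScaLAPACK instead of the Debian libscalapack-openmpi1 package, a possible arch.make fragment is sketched below. The MKLROOT path and the exact library names are assumptions for an LP64 MKL build with gfortran and OpenMPI; they depend on the MKL version and installation and should be verified (e.g. with Intel's link-line advisor) before use.

**********************************************************************
# Sketch only: linking against MKL (gfortran + OpenMPI, LP64 layer).
# MKLROOT and library names are assumptions; verify for your system.
MKLROOT         = /opt/intel/mkl
BLASLAPACK_LIBS = -L$(MKLROOT)/lib/intel64 \
                  -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
                  -lmkl_gf_lp64 -lmkl_sequential -lmkl_core \
                  -lpthread -lm -ldl
**********************************************************************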