[petsc-users] random SLEPc segfault using openmpi-3.0.1
Hi, I'm using SLEPc to diagonalize a huge sparse matrix and I've encountered random segmentation faults. I'm actually using the slepc example 4 without modifications to rule out errors due to coding. Concretely, I use the command line

mpirun -n 28 ex4 \
  -file amatrix.bin -eps_tol 1e-6 -eps_target 0 -eps_nev 18 \
  -eps_harmonic -eps_ncv 40 -eps_max_it 10 \
  -eps_monitor -eps_view -eps_view_values -eps_view_vectors 2>&1 | tee -a $LOGFILE

The program runs for some time (about half a day) and then stops with the error message

[13]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range

There is definitely enough memory, because I'm using less than 4% of the available 128 GB. Since everything worked fine on a slower computer with a different setup, and based on previous mailing list comments, I have the feeling that this might be due to some issue with MPI. Unfortunately, I have to share the computer with other people and cannot uninstall the current MPI implementation, and I've also heard that there are issues if you install more than one MPI implementation. For your information: I've configured PETSc with

./configure --with-mpi-dir=/home/applications/builds/intel_2018/openmpi-3.0.1/ --with-scalar-type=complex --download-mumps --download-scalapack --with-blas-lapack-dir=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl

I wanted to ask a few things:

- Is there a known issue with openmpi causing random segmentation faults?

- I've also tried to install everything needed by configuring PETSc with

./configure \
  --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-scalar-type=complex \
  --download-mumps --download-scalapack --download-mpich --download-fblaslapack

Here, the problem is that the checks performed after "make" stop after the check with 1 MPI process, i.e., the check using 2 MPI processes never finishes. Is that a known conflict between the downloaded mpich and the installed openmpi?
Do you know a way to install mpich alongside openmpi, without conflicts and without actually removing openmpi?

- Some time ago I posted a question to the mailing list about how to compile SLEPc/PETSc with OpenMP only instead of MPI. After some time, I was able to get MPI to work on a different computer, but I was never really able to use OpenMP with slepc, which would be very useful in the present situation. The programs compile, but they never take more than 100% CPU load as displayed by top. The answers to my question contained the recommendations that I should configure with --download-openblas and have the OMP_NUM_THREADS variable set when executing the program. I did that, but it didn't help either. So, my question: has anyone ever managed to find a configure line that disables MPI but enables OpenMP, so that the slepc ex4 program uses significantly more than 100% CPU when executing the standard Krylov-Schur method? Regards, Moritz
Re: [petsc-users] STFILTER in slepc
Thank you for the response. This example is exactly the kind of thing I thought about. But, as you say, convergence is indeed not better. What's the best format to send the matrix to slepc-maint? My small test case is a sparse matrix of dimension 30720. I currently read it from three files (containing IA, JA, and the values in CSR format). Because everything is contained in a larger framework, it's not very easy to extract the reader. Is there a generic way to read in sparse matrices from files for PETSc?

Maybe you can already give me a hint about the best method if I tell you the properties of the spectrum: The problem is the calculation of electronic states in a quantum dot. There is a band gap from around 0 to 2 and I am interested in the first 20 or 40 eigenvectors above and below the gap. The first few states are separated by about 0.1, but for higher states the energies come closer and closer together. Additionally, there are a number of states with energies around 1000, which are artificial and originate from the way we treat the boundary conditions. Also, I know that all states come in pairs with the same energy (Kramers degeneracy).

For the small test case, the conduction band states, i.e. eigenvectors with energies close to 2, converge very fast (about 5 min on a laptop computer). However, the states with energies around 0 converge much more slowly, and that's one of my major problems. For those states harmonic extraction seems to be better suited, but I have the impression that it is not extremely stable. For example, applied to the states close to 2, I see that some states are skipped, which can be seen from the fact that the degeneracies are sometimes wrong. Also, with harmonic extraction, the program sometimes stops claiming that the requested number of eigenpairs has been reached, but the calculated relative error of most states is way larger than the tolerance. Maybe you know from experience which method is better suited to tackle these kinds of problems? 
Eventually, I intend to do calculations with dimensions of ~10 million distributed over a few hundred CPUs. Regards, Moritz From: Jose E. Roman Sent: Thursday, October 11, 2018 5:55 AM To: Matthew Knepley; Moritz Cygorek Cc: PETSc; Carmen Campos Subject: Re: [petsc-users] STFILTER in slepc The filter technique must be used with an interval in EPS, e.g. -eps_interval 2.,2.7 (This interval will be passed to STFILTER, so no need to specify -st_filter_interval). Therefore, it does not require -eps_target_magnitude or similar. Regarding the simpler (A-tau*I)^2, we call it "spectrum folding" (see section 3.4.6 of SLEPc's manual) and it is implemented in ex24.c http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex24.c.html My experience is that spectrum folding will not give good convergence except for easy problems. For more difficult problems with clustered eigenvalues you need a polynomial filter. The polynomial filter technique trades iterations of the Krylov subspace for matrix-vector products required by a high-degree polynomial. If you want, send us the matrix to slepc-maint and we will have a try. Jose > El 11 oct 2018, a las 2:43, Matthew Knepley escribió: > > On Wed, Oct 10, 2018 at 8:41 PM Moritz Cygorek wrote: > Thank you very much. Apparently, I've misunderstood what the filter actually > does. I thought about the much simpler process, where you diagonalize > > > > -(A - tau*I)^2 + offset*I > > > > where tau is my target and offset is large enough so that the global maximum > is reached for eigenvalues around tau. > > > Is this different from -eps_target_magnitude? > > Thanks, > > Matt > > Then you look for the largest eigenvalue of the modified problem and either > calculate the Ritz value of the original matrix or calculate back from the > eigenvalues of the modified problem. > > > > Now, it looks to me like -st_type filter activates something like the package > FILTLAN. 
> > > > I guess I can define a MatShell to do the thing I intended in the first place. > > But, I guess, this is a common thing, so I am wondering whether it is already > implemented somewhere and I just didn't find it in the documentation. Can > you say something about this? > > > > Regards, > > Moritz > > > > > > From: Jose E. Roman > Sent: Wednesday, October 10, 2018 3:48 PM > To: Moritz Cygorek > Cc: petsc-users@mcs.anl.gov > Subject: Re: [petsc-users] STFILTER in slepc > > This type of method requires a very high degree polynomial; suggest using > degree=100 at least (this is the default value), but larger values may be > necessary. Also, for this particular filter the "range" must be approximately > equal to the numerical range; if you have no clue where your first and last eigenvalues are, you may use EPSSolve() calls with EPS_LARGEST_REAL and EPS_SMALLEST_REAL.
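On the question of a generic way to read sparse matrices from files: MatLoad() reads PETSc's own binary format, which is also what ex4's -file option uses, so one option is to dump the existing IA/JA/values arrays into that format once and load them from then on. Below is a small Python sketch; the layout used (big-endian int32 header [1211216, rows, cols, nnz], then nonzeros per row, column indices, and float64 values) is my reading of PETSc's binary I/O documentation, so please verify it against your PETSc version, and note that a --with-scalar-type=complex build expects interleaved (real, imag) float64 pairs instead of plain float64 values.

```python
import numpy as np

def write_petsc_binary_csr(fname, ia, ja, vals, n_cols):
    """Write a CSR matrix (0-based ia/ja) in PETSc's binary Mat format.

    Assumed layout (big-endian): int32 header [1211216 (MAT_FILE_CLASSID),
    rows, cols, nnz], then int32 nonzeros-per-row, int32 column indices,
    and float64 values. Complex-scalar builds use (real, imag) pairs.
    """
    ia = np.asarray(ia, dtype=np.int64)
    n_rows = len(ia) - 1
    nnz = len(ja)
    with open(fname, "wb") as f:
        np.array([1211216, n_rows, n_cols, nnz], dtype=">i4").tofile(f)
        np.diff(ia).astype(">i4").tofile(f)      # nonzeros per row
        np.asarray(ja, dtype=">i4").tofile(f)    # column indices, CSR order
        np.asarray(vals, dtype=">f8").tofile(f)  # values, CSR order

# 2x2 example: [[1.0, 2.0], [0.0, 3.0]]  (filename as in the thread)
write_petsc_binary_csr("amatrix.bin", [0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0], 2)
```

A file written this way should then be loadable with MatCreate()/MatLoad(), or passed directly to the slepc example programs via -file.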
Re: [petsc-users] STFILTER in slepc
Thank you very much. Apparently, I've misunderstood what the filter actually does. I thought about the much simpler process, where you diagonalize

-(A - tau*I)^2 + offset*I

where tau is my target and offset is large enough so that the global maximum is reached for eigenvalues around tau. Then you look for the largest eigenvalue of the modified problem and either calculate the Ritz value of the original matrix or calculate back from the eigenvalues of the modified problem. Now, it looks to me like -st_type filter activates something like the package FILTLAN. I guess I can define a MatShell to do the thing I intended in the first place. But, I guess, this is a common thing, so I am wondering whether it is already implemented somewhere and I just didn't find it in the documentation. Can you say something about this? Regards, Moritz From: Jose E. Roman Sent: Wednesday, October 10, 2018 3:48 PM To: Moritz Cygorek Cc: petsc-users@mcs.anl.gov Subject: Re: [petsc-users] STFILTER in slepc This type of method requires a very high degree polynomial; suggest using degree=100 at least (this is the default value), but larger values may be necessary. Also, for this particular filter the "range" must be approximately equal to the numerical range; if you have no clue where your first and last eigenvalues are, you may use EPSSolve() calls with EPS_LARGEST_REAL and EPS_SMALLEST_REAL. Jose > El 10 oct 2018, a las 21:10, Moritz Cygorek escribió: > > Thank you for the fast reply. > > I've tried running my program (using the default Krylov-Schur method for > sparse MPI matrices) with the additional options: > > -st_type filter -st_filter_degree 2 -st_filter_interval 2.,2.7 > -st_filter_range -2000,2000 > > and I get the following error message: > > [0]PETSC ERROR: STFILTER cannot get the filter specified; please adjust your > filter parameters (e.g. 
increasing the polynomial degree) > > [0]PETSC ERROR: #1 FILTLAN_GetIntervals() line 451 in > /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filtlan.c > [0]PETSC ERROR: #2 STFilter_FILTLAN_setFilter() line 1016 in > /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filtlan.c > [0]PETSC ERROR: #3 STSetUp_Filter() line 42 in > /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filter.c > [0]PETSC ERROR: #4 STSetUp() line 271 in > /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/interface/stsolve.c > [0]PETSC ERROR: #5 EPSSetUp() line 263 in > /home/applications/sources/libraries/slepc-3.9.2/src/eps/interface/epssetup.c > [0]PETSC ERROR: #6 EPSSolve() line 135 in > /home/applications/sources/libraries/slepc-3.9.2/src/eps/interface/epssolve.c > > > > Do you have a clue what I've missed? > > > Moritz > > > From: Jose E. Roman > Sent: Wednesday, October 10, 2018 2:30 PM > To: Moritz Cygorek > Cc: petsc-users@mcs.anl.gov > Subject: Re: [petsc-users] STFILTER in slepc > > > > El 10 oct 2018, a las 19:54, Moritz Cygorek escribió: > > > > Hi, > > > > in the list of changes to SLEPc version 3.8, it is stated that there is a > > preliminary implementation of polynomial filtering using STFILTER. > > > > Because I am struggling to obtain interior eigenvalues and harmonic > > extraction seems not to be stable enough in my case, I wanted to give it a > > try, but I could not find any documentation yet. > > > > Does anybody have an example of how to use STFILTER or any documentation > > about it? > > > > Thanks in advance, > > Moritz > > There are no examples. You just set the type to STFILTER and set some > parameters such as the interval of interest or the polynomial degree. See > functions starting with STFilter > here:http://slepc.upv.es/documentation/current/docs/manualpages/ST/index.html > > In some problems it works well, but don't expect too much. 
It is still in our > to-do list to make it more usable. It will be good to have your feedback. If > you want, send results to slepc-maint, maybe we can help tuning the > parameters. > > Jose
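The transformation described in the message above, -(A - tau*I)^2 + offset*I followed by recovering the Ritz value, is what SLEPc calls spectrum folding. Its mechanics can be checked with a tiny NumPy sketch (the matrix and its eigenvalues below are made up for illustration; in SLEPc one would apply the folded operator through a MatShell rather than forming it explicitly, as ex24.c does):

```python
import numpy as np

# Small symmetric test matrix with known, made-up eigenvalues;
# tau = 0 plays the role of a target inside a spectral gap.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
evals = np.array([-3.0, -1.2, -0.4, 0.1, 0.9, 2.0, 2.3, 5.0])
A = Q @ np.diag(evals) @ Q.T

tau = 0.0
n = A.shape[0]
# Folded operator B = -(A - tau*I)^2 + offset*I: any offset larger than
# max (lambda - tau)^2 makes the eigenvalue of A closest to tau become
# the LARGEST eigenvalue of B, which Krylov methods find easily.
offset = np.max((evals - tau) ** 2) + 1.0
B = -(A - tau * np.eye(n)) @ (A - tau * np.eye(n)) + offset * np.eye(n)

w, V = np.linalg.eigh(B)
x = V[:, -1]        # eigenvector of the largest folded eigenvalue
ritz = x @ A @ x    # Ritz value of A recovers the eigenvalue near tau (0.1)
```

Note that squaring also squares the relative gaps between eigenvalues near tau, which is consistent with the observation in the thread that folding converges poorly except for easy problems.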
Re: [petsc-users] STFILTER in slepc
Thank you for the fast reply. I've tried running my program (using the default Krylov-Schur method for sparse MPI matrices) with the additional options:

-st_type filter -st_filter_degree 2 -st_filter_interval 2.,2.7 -st_filter_range -2000,2000

and I get the following error message:

[0]PETSC ERROR: STFILTER cannot get the filter specified; please adjust your filter parameters (e.g. increasing the polynomial degree)
[0]PETSC ERROR: #1 FILTLAN_GetIntervals() line 451 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filtlan.c
[0]PETSC ERROR: #2 STFilter_FILTLAN_setFilter() line 1016 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filtlan.c
[0]PETSC ERROR: #3 STSetUp_Filter() line 42 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/impls/filter/filter.c
[0]PETSC ERROR: #4 STSetUp() line 271 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/st/interface/stsolve.c
[0]PETSC ERROR: #5 EPSSetUp() line 263 in /home/applications/sources/libraries/slepc-3.9.2/src/eps/interface/epssetup.c
[0]PETSC ERROR: #6 EPSSolve() line 135 in /home/applications/sources/libraries/slepc-3.9.2/src/eps/interface/epssolve.c

Do you have a clue what I've missed? Moritz From: Jose E. Roman Sent: Wednesday, October 10, 2018 2:30 PM To: Moritz Cygorek Cc: petsc-users@mcs.anl.gov Subject: Re: [petsc-users] STFILTER in slepc > El 10 oct 2018, a las 19:54, Moritz Cygorek escribió: > > Hi, > > in the list of changes to SLEPc version 3.8, it is stated that there is a > preliminary implementation of polynomial filtering using STFILTER. > > Because I am struggling to obtain interior eigenvalues and harmonic > extraction seems not to be stable enough in my case, I wanted to give it a > try, but I could not find any documentation yet. > > Does anybody have an example of how to use STFILTER or any documentation > about it? > > Thanks in advance, > Moritz There are no examples. 
You just set the type to STFILTER and set some parameters such as the interval of interest or the polynomial degree. See functions starting with STFilter here: http://slepc.upv.es/documentation/current/docs/manualpages/ST/index.html In some problems it works well, but don't expect too much. It is still in our to-do list to make it more usable. It will be good to have your feedback. If you want, send results to slepc-maint, maybe we can help tuning the parameters. Jose
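As a rough illustration of why the failed run above (degree 2 for the interval [2, 2.7] inside the range [-2000, 2000]) could not build a filter: a polynomial filter approximates the indicator function of the target interval over the whole spectral range, and a narrow interval inside a wide range needs a very high degree. The NumPy sketch below uses a simple least-squares Chebyshev fit, not FILTLAN's actual base filter, and a much narrower (made-up) range so that degree 200 is enough:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

lo, hi = -10.0, 10.0  # assumed numerical range (far narrower than -2000,2000)
a, b = 2.0, 2.7       # target interval from the thread

def to_unit(x):
    """Map [lo, hi] onto [-1, 1], where Chebyshev polynomials live."""
    return (2.0 * np.asarray(x) - (hi + lo)) / (hi - lo)

# Fit a degree-200 Chebyshev expansion to the indicator function of [a, b],
# sampled at Chebyshev-spaced points for good conditioning. Applying p(A)
# then amplifies eigenvalues inside [a, b] and damps all the others.
deg = 200
x = np.cos(np.pi * np.arange(4001) / 4000)
ind = ((x >= to_unit(a)) & (x <= to_unit(b))).astype(float)
coef = C.chebfit(x, ind, deg)

inside = C.chebval(to_unit(2.35), coef)   # mid-interval: close to 1
outside = C.chebval(to_unit(-5.0), coef)  # far away: close to 0
```

With the thread's actual range [-2000, 2000] the interval occupies less than 0.02% of the range, so a degree-2 polynomial (or even degree 100) has no chance of separating it, which matches the advice to increase the degree.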
[petsc-users] STFILTER in slepc
Hi, in the list of changes to SLEPc version 3.8, it is stated that there is a preliminary implementation of polynomial filtering using STFILTER. Because I am struggling to obtain interior eigenvalues and harmonic extraction seems not to be stable enough in my case, I wanted to give it a try, but I could not find any documentation yet. Does anybody have an example of how to use STFILTER or any documentation about it? Thanks in advance, Moritz
[petsc-users] memory corruption when using harmonic extraction with SLEPc
Hi, I want to diagonalize a huge sparse matrix and I'm using the Krylov-Schur method with harmonic extraction (command line option -eps_harmonic) implemented in SLEPc. I manually distribute a sparse matrix across several CPUs and everything works fine when:

- I do _not_ use harmonic extraction
- I use harmonic extraction on only a single CPU

If I try to use harmonic extraction on multiple CPUs, I get a memory corruption. I'm not quite sure where to look, but somewhere in the output, I find:

[1]PETSC ERROR: PetscMallocValidate: error detected at PetscSignalHandlerDefault() line 145 in /home/applications/sources/libraries/petsc-3.9.3/src/sys/error/signal.c
[1]PETSC ERROR: Memory [id=0(9072)] at address 0x145bcd0 is corrupted (probably write past end of array)
[1]PETSC ERROR: Memory originally allocated in DSAllocateWork_Private() line 74 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/ds/interface/dspriv.c

Now, I have the feeling that this might be a bug in SLEPc because, if I had messed up the matrix initialization and distribution, I should also get a memory corruption when I don't use harmonic extraction, right? Any suggestions what might be going on? Regards, Moritz
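For context on the PetscMallocValidate message above: PETSc's debugging malloc surrounds each allocation with sentinel ("guard") bytes and later checks that they are intact; "write past end of array" means a trailing guard was overwritten. A toy Python sketch of the scheme (the constant and helper names are made up for illustration, not PETSc's internals):

```python
GUARD = b"\xfc" * 8  # sentinel pattern (made-up value, not PETSc's)

def guarded_alloc(n):
    """Return a buffer with guard regions around n user bytes."""
    return bytearray(GUARD) + bytearray(n) + bytearray(GUARD)

def validate(buf, n):
    """True iff neither guard region has been overwritten."""
    return bytes(buf[:8]) == GUARD and bytes(buf[8 + n:]) == GUARD

buf = guarded_alloc(16)
ok_before = validate(buf, 16)  # guards intact right after allocation
buf[8 + 16] = 0x00             # simulate a write one byte past the end
ok_after = validate(buf, 16)   # the corruption is now detected
```

This is why the report points at the allocation site (DSAllocateWork_Private) rather than the code that actually wrote out of bounds: the check only fires when the guards are next inspected.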
Re: [petsc-users] Documentation for different parallelization options
Thank you very much for your response. I have tested the --download-openblas option and it did not do what I expected. The total cpu-usage only moved to something like 105%, so it did not significantly make use of parallelization. I did not test MKL yet, because I'll first have to install it. However, compiling PETSc with MUMPS works very well and significantly speeds up the calculation for my full-MPI code. I will have to do some more testing, but MPI with MUMPS support seems to be the way to go for me. Thanks again, Moritz From: Jose E. Roman Sent: Tuesday, June 5, 2018 5:43:37 PM To: Moritz Cygorek Cc: petsc-users@mcs.anl.gov Subject: Re: [petsc-users] Documentation for different parallelization options For multi-threaded parallelism you have to use a multi-threaded BLAS such as MKL or OpenBLAS: $ ./configure --with-blaslapack-dir=$MKLROOT or $ ./configure --download-openblas For MPI parallelism, if you are solving linear systems within EPS you most probably need PETSc to be configured with a parallel linear solver such as MUMPS, see section 3.4.1 of SLEPc's user manual. Jose > El 5 jun 2018, a las 19:00, Moritz Cygorek escribió: > > Hi everyone, > > I'm looking for a document/tutorial/howto that describes the different > options to compile PETSc with parallelization. > > My problem is the following: > I'm trying to solve a large sparse eigenvalue problem using the Krylov-Schur > method implemented in SLEPc > When I install SLEPc/PETSc on my Ubuntu laptop via apt-get, everything works > smoothly and parallelization works automatically. > I see this by the fact that the CPU-load of the process (only one process, > not using mpiexec) is close to 400% according to "top" > Therefore, it seems that OpenMP is used. > > I have access to better computers and I would like to install SLEPc/PETSc > there, but I have to configure it manually. 
> I have tried different options, none of them satisfactory: > > When I compile PETSc with the --with-openmp flag, I see that the program > never runs with cpu load above 100%. > I use the same command to call the program as on my laptop where everything > works. So it seems that openmp is somehow not activated. > An old mailing list entry says that I am supposed to configure PETSc using > --with-threadcomm --with-openmp, which I did, but it also didn't help. > However that entry was from 2014 and I found in the list of changes for PETSc > in version 3.6: > "Removed all threadcomm support including --with-pthreadclasses and > --with-openmpclasses configure arguments" > > Does that mean that openmp is no longer supported in newer versions? > > > Given my resources, I would prefer OpenMP over MPI. Nevertheless, I then > spent some time to go full MPI without openmp and to split up the sparse > matrix across several processes. When I start the program using mpiexec, > I see indeed that multiple processes are started, but even when I use 12 > processes, the computation time is about the same as with only 1 process. > Is there anything I have to tell the EPS solver to activate parallelization? > > > So, all in all, I can't get anything to run faster on a large multi-core > computer than on my old crappy laptop. > > > I have no idea how to start debugging and assessing the performance and the > documentation on this issue on the website is not very verbose. > Can you give me a few hints? > > Regards, > Moritz > > >
[petsc-users] Documentation for different parallelization options
Hi everyone, I'm looking for a document/tutorial/howto that describes the different options to compile PETSc with parallelization.

My problem is the following: I'm trying to solve a large sparse eigenvalue problem using the Krylov-Schur method implemented in SLEPc. When I install SLEPc/PETSc on my Ubuntu laptop via apt-get, everything works smoothly and parallelization works automatically. I see this by the fact that the CPU load of the process (only one process, not using mpiexec) is close to 400% according to "top". Therefore, it seems that OpenMP is used.

I have access to better computers and I would like to install SLEPc/PETSc there, but I have to configure it manually. I have tried different options, none of them satisfactory:

When I compile PETSc with the --with-openmp flag, I see that the program never runs with a CPU load above 100%. I use the same command to call the program as on my laptop, where everything works. So it seems that OpenMP is somehow not activated. An old mailing list entry says that I am supposed to configure PETSc using --with-threadcomm --with-openmp, which I did, but it also didn't help. However, that entry was from 2014 and I found in the list of changes for PETSc in version 3.6: "Removed all threadcomm support including --with-pthreadclasses and --with-openmpclasses configure arguments" Does that mean that OpenMP is no longer supported in newer versions?

Given my resources, I would prefer OpenMP over MPI. Nevertheless, I then spent some time to go full MPI without OpenMP and to split up the sparse matrix across several processes. When I start the program using mpiexec, I see indeed that multiple processes are started, but even when I use 12 processes, the computation time is about the same as with only 1 process. Is there anything I have to tell the EPS solver to activate parallelization?

So, all in all, I can't get anything to run faster on a large multi-core computer than on my old crappy laptop. 
I have no idea how to start debugging and assessing the performance and the documentation on this issue on the website is not very verbose. Can you give me a few hints? Regards, Moritz