The functions called to set up the solver are (in this order): EPSCreate(); EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS); EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
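In case it helps, this is roughly how that sequence looks in code (a minimal sketch only: the helper name, the matrix name H and the error-handling style are illustrative, not copied from the actual application):

  #include <slepceps.h>

  /* Illustrative helper showing the call sequence listed above;
     H is the already-assembled Hermitian matrix. */
  PetscErrorCode SetupSolver(Mat H, EPS *eps)
  {
    PetscErrorCode ierr;

    ierr = EPSCreate(PETSC_COMM_WORLD, eps);CHKERRQ(ierr);
    ierr = EPSSetOperators(*eps, H, NULL);CHKERRQ(ierr);          /* standard (non-generalized) problem */
    ierr = EPSSetProblemType(*eps, EPS_HEP);CHKERRQ(ierr);        /* Hermitian eigenproblem */
    ierr = EPSSetType(*eps, EPSLANCZOS);CHKERRQ(ierr);            /* explicit Lanczos solver */
    ierr = EPSSetWhichEigenpairs(*eps, EPS_SMALLEST_REAL);CHKERRQ(ierr);
    ierr = EPSSetFromOptions(*eps);CHKERRQ(ierr);                 /* honours -eps_* / -bv_* run-time options */
    return 0;
  }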
The output of -eps_view for each run is:
=================================================================
EPS Object: 960 MPI processes
  type: lanczos
    LOCAL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 960 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: modified Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 960 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 960 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=================================================================
EPS Object: 1024 MPI processes
  type: lanczos
    LOCAL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 1024 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: modified Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 1024 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 1024 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=================================================================

I ran the same configurations again and got the same results in terms of the number of iterations. I also tried full reorthogonalization (always with the -bv_reproducible_random option), but I still get a different number of iterations: for 960 procs I get 172 iterations, and for 1024 I get 362 (a short sketch of how this variant is selected is included at the end of this message). The -eps_view output for this case (only for 960 procs; the other one shows the same information, except for the number of processes) is:
=================================================================
EPS Object: 960 MPI processes
  type: lanczos
    FULL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 960 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 960 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 960 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=================================================================

On Wed, Oct 24, 2018 at 10:52, Jose E. Roman (<jro...@dsic.upv.es>) wrote:

> This is very strange. Make sure you call EPSSetFromOptions in the code. Do iteration counts change also for two different runs with the same number of processes?
> Maybe Lanczos with default options is too sensitive (by default it does not reorthogonalize). Suggest using Krylov-Schur or Lanczos with full reorthogonalization (EPSLanczosSetReorthog).
> Also, send the output of -eps_view to see if there is anything abnormal.
>
> Jose
>
> On 24 Oct 2018, at 9:09, Ale Foggia <amfog...@gmail.com> wrote:
> >
> > I've tried the option that you gave me, but I still get a different number of iterations when changing the number of MPI processes: I did 960 procs and 1024 procs and I got 435 and 176 iterations, respectively.
> >
> > On Tue, Oct 23, 2018 at 16:48, Jose E. Roman (<jro...@dsic.upv.es>) wrote:
> > >
> > > On 23 Oct 2018, at 15:46, Ale Foggia <amfog...@gmail.com> wrote:
> > > >
> > > > On Tue, Oct 23, 2018 at 15:33, Jose E. Roman (<jro...@dsic.upv.es>) wrote:
> > > > >
> > > > > On 23 Oct 2018, at 15:17, Ale Foggia <amfog...@gmail.com> wrote:
> > > > > >
> > > > > > Hello Jose, thanks for your answer.
> > > > > >
> > > > > > On Tue, Oct 23, 2018 at 12:59, Jose E. Roman (<jro...@dsic.upv.es>) wrote:
> > > > > > >
> > > > > > > There is an undocumented option:
> > > > > > >
> > > > > > > -bv_reproducible_random
> > > > > > >
> > > > > > > It will force the initial vector of the Krylov subspace to be the same irrespective of the number of MPI processes. This should be used for scaling analyses such as the one you are trying to do.
> > > > > >
> > > > > > What about when I'm not doing the scaling? Now I would like to ask for computing time for bigger-size problems; should I also use this option in that case? Because, what happens if I have a "bad" configuration? Meaning, I ask for some time, enough if I take into account the "correct" scaling, but when I run it takes double the time/iterations, like it happened before when changing from 960 to 1024 processes?
> > > > >
> > > > > When you increase the matrix size the spectrum of the matrix changes and probably also the convergence, so the computation time is not easy to predict in advance.
> > > >
> > > > Okay, I'll keep that in mind. I thought that, even if the spectrum changes, if I had a behaviour/tendency for 6 or 7 smaller cases I could predict the time more or less. It was working that way until I found this "iterations problem", which doubled the execution time for the same problem size. To be completely sure, do you suggest that I use this run-time option when going into production or not? Can you elaborate a bit on the effect of this option? Is the (huge) difference I got in the number of iterations something expected?
> > >
> > > Ideally, if you have a rough approximation of the eigenvector, you set it as the initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates a random initial vector, that is, it starts the search blindly. The difference between using one random vector or another may be large, depending on the problem. Krylov-Schur is usually less sensitive to the initial vector.
> > >
> > > Jose
> > > > > > >
> > > > > > > An additional comment is that we strongly recommend using the default solver (Krylov-Schur), which will do Lanczos with implicit restart. It is generally faster and more stable.
> > > >
> > > > I will be doing Dynamical Lanczos, which means that I'll need the "matrix whose rows are the eigenvectors of the tridiagonal matrix" (so, according to the Lanczos Technical Report notation, I need the "matrix whose rows are the eigenvectors of T_m", which should be the same as the vectors y_i). I checked the Technical Report for Krylov-Schur as well, and I think I can get the same information from that solver too, but I'm not sure. Can you confirm this, please?
> > > > Also, as the vectors I want are given by V_m^(-1)*x_i = y_i (following the notation of the Report), my idea for getting them was to retrieve the invariant subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply it by the eigenvectors that I get with EPSGetEigenvector. Is there another, easier (or less computationally expensive) way to get this?
> > >
> > > In Krylov-Schur the tridiagonal matrix T_m becomes arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The relevant information can be obtained with EPSGetBV() and EPSGetDS(). But this is a "developer level" interface. We could help you get this running. Send a small problem matrix to slepc-maint together with a more detailed description of what you need to compute.
> >
> > Thanks! When I get to that part I'll write to slepc-maint for help.
> > >
> > > Jose
> > > > > > >
> > > > > > > Jose
> > > > > > >
> > > > > > > On 23 Oct 2018, at 12:13, Ale Foggia <amfog...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > I'm currently using the Lanczos solver (EPSLANCZOS) to get the smallest real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are the only options I set for the solver. My aim is to be able to predict/estimate the time-to-solution. To do so, I was doing a scaling study of the code for different matrix sizes and different numbers of MPI processes. As I was not observing good scaling, I checked the number of iterations of the solver (given by EPSGetIterationNumber). I've found that for the **same size** of matrix (meaning, the same problem), when I change the number of MPI processes the number of iterations changes, and the behaviour is not monotonic. These are the numbers I've got:
> > > > > > > >
> > > > > > > > # procs   # iters
> > > > > > > > 960       157
> > > > > > > > 992       189
> > > > > > > > 1024      338
> > > > > > > > 1056      190
> > > > > > > > 1120      174
> > > > > > > > 2048      136
> > > > > > > >
> > > > > > > > I've checked the mailing list for a similar situation and I've found another person with the same problem but with another solver ("[SLEPc] GD is not deterministic when using different number of cores", Nov 19 2015), but I think the solution that person found does not apply to my problem (removing the "-eps_harmonic" option).
> > > > > > > >
> > > > > > > > Can you give me any hint on what the reason for this behaviour is? Is there a way to prevent it? It's not possible to estimate/predict any time consumption for bigger problems if the number of iterations varies this much.
> > > > > > > >
> > > > > > > > Ale
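P.S. For reference, the full-reorthogonalization variant mentioned above can be selected either in code or from the command line. This is only a minimal sketch of standard SLEPc usage (I am not claiming the runs above were configured exactly this way):

  /* Sketch: switch the Lanczos solver to full reorthogonalization.
     Assumes an EPS object set up as in the call sequence at the top of this message. */
  ierr = EPSSetType(eps, EPSLANCZOS);CHKERRQ(ierr);
  ierr = EPSLanczosSetReorthog(eps, EPS_LANCZOS_REORTHOG_FULL);CHKERRQ(ierr);

  /* Run-time equivalent (together with the reproducible initial vector):
       -eps_type lanczos -eps_lanczos_reorthog full -bv_reproducible_random */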