Everything seems correct. I don't know, maybe your problem is very sensitive? Is the eigenvalue tiny? I would still try Krylov-Schur.

Jose
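Switching to Krylov-Schur only requires changing, or simply omitting, the EPSSetType() call in the setup sequence quoted below. A minimal sketch, assuming the same eps object and the usual ierr/CHKERRQ error-checking style:

    /* Select Krylov-Schur explicitly (it is also the default when EPSSetType is never called) */
    ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRQ(ierr);
    /* Keep EPSSetFromOptions so the type can still be overridden at run time with -eps_type */
    ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);

Equivalently, with the Lanczos code left unchanged, the run-time option -eps_type krylovschur selects the same solver.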
> El 24 oct 2018, a las 14:59, Ale Foggia <amfog...@gmail.com> escribió:
>
> The functions called to set the solver are (in this order): EPSCreate(); EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS); EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
>
> The output of -eps_view for each run is:
> =================================================================
> EPS Object: 960 MPI processes
>   type: lanczos
>     LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
> EPS Object: 1024 MPI processes
>   type: lanczos
>     LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 1024 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 1024 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 1024 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
>
> I ran the same configurations again and got the same result in terms of the number of iterations.
>
> I also tried full reorthogonalization (always with the -bv_reproducible_random option) but I still get a different number of iterations: for 960 procs I get 172 iters, and for 1024 I get 362 iters.
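A minimal sketch of the setup sequence listed above, with the full-reorthogonalization variant included (error handling is abbreviated; the assembled Hermitian matrix A and the variable names are illustrative):

    EPS            eps;   /* SLEPc eigensolver context */
    PetscErrorCode ierr;

    ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
    ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);               /* standard problem: second matrix is NULL */
    ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);            /* Hermitian eigenproblem */
    ierr = EPSSetType(eps,EPSLANCZOS);CHKERRQ(ierr);
    ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr);
    /* Full reorthogonalization instead of the default local variant */
    ierr = EPSLanczosSetReorthog(eps,EPS_LANCZOS_REORTHOG_FULL);CHKERRQ(ierr);
    /* Must be called, otherwise run-time options such as -bv_reproducible_random are ignored */
    ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);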
> The -eps_view output for this case (only for 960 procs; the other one has the same information, except the number of processes) is:
> =================================================================
> EPS Object: 960 MPI processes
>   type: lanczos
>     FULL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: classical Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
>
> El mié., 24 oct. 2018 a las 10:52, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> This is very strange. Make sure you call EPSSetFromOptions in the code. Do iteration counts also change for two different runs with the same number of processes?
> Maybe Lanczos with default options is too sensitive (by default it does not reorthogonalize). I suggest using Krylov-Schur or Lanczos with full reorthogonalization (EPSLanczosSetReorthog).
> Also, send the output of -eps_view to see if there is anything abnormal.
>
> Jose
>
> > El 24 oct 2018, a las 9:09, Ale Foggia <amfog...@gmail.com> escribió:
> >
> > I've tried the option that you gave me but I still get a different number of iterations when changing the number of MPI processes: I did 960 procs and 1024 procs and I got 435 and 176 iterations, respectively.
> >
> > El mar., 23 oct. 2018 a las 16:48, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > >
> > > El 23 oct 2018, a las 15:46, Ale Foggia <amfog...@gmail.com> escribió:
> > >
> > > El mar., 23 oct. 2018 a las 15:33, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > > >
> > > > El 23 oct 2018, a las 15:17, Ale Foggia <amfog...@gmail.com> escribió:
> > > >
> > > > Hello Jose, thanks for your answer.
> > > >
> > > > El mar., 23 oct. 2018 a las 12:59, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > > > There is an undocumented option:
> > > >
> > > >    -bv_reproducible_random
> > > >
> > > > It will force the initial vector of the Krylov subspace to be the same irrespective of the number of MPI processes. This should be used for scaling analyses such as the one you are trying to do.
> > > >
> > > > What about when I'm not doing the scaling? Now I would like to ask for computing time for bigger problems; should I also use this option in that case? Because, what happens if I have a "bad" configuration? Meaning, I ask for some amount of time, enough if I take into account the "correct" scaling, but when I run it takes double the time/iterations, as happened before when changing from 960 to 1024 processes?
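For reference, a typical invocation combining the run-time options mentioned in this exchange might look roughly as follows (the executable name and process count are placeholders; -eps_lanczos_reorthog is the command-line counterpart of EPSLanczosSetReorthog):

    mpiexec -n 960 ./solver -eps_type lanczos -eps_lanczos_reorthog full \
        -bv_reproducible_random -eps_nev 1 -eps_tol 1e-8 -eps_view

These options only take effect if the code calls EPSSetFromOptions() before EPSSolve().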
> > > When you increase the matrix size the spectrum of the matrix changes and probably also the convergence, so the computation time is not easy to predict in advance.
> > >
> > > Okay, I'll keep that in mind. I thought that, even if the spectrum changes, if I had a behaviour/tendency for 6 or 7 smaller cases I could predict the time more or less. It was working this way until I found this "iterations problem", which doubled the execution time for the same problem size. To be completely sure, do you suggest that I use this run-time option in production or not? Can you elaborate a bit on the effect of this option? Is the (huge) difference I got in the number of iterations something expected?
> >
> > Ideally, if you have a rough approximation of the eigenvector, you set it as the initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates a random initial vector, that is, it starts the search blindly. The difference between using one random vector or another may be large, depending on the problem. Krylov-Schur is usually less sensitive to the initial vector.
> >
> > Jose
> >
> > > > An additional comment is that we strongly recommend using the default solver (Krylov-Schur), which will do Lanczos with implicit restart. It is generally faster and more stable.
> > > >
> > > > I will be doing Dynamical Lanczos, which means that I'll need the "matrix whose rows are the eigenvectors of the tridiagonal matrix" (so, according to the Lanczos Technical Report notation, I need the "matrix whose rows are the eigenvectors of T_m", which should be the same as the vectors y_i). I checked the Technical Report for Krylov-Schur as well and I think I can get the same information from that solver too, but I'm not sure. Can you confirm this, please?
> > > > Also, as the vectors I want are given by V_m^(-1)*x_i = y_i (following the notation in the Report), my idea to get them was to retrieve the invariant subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply it with the eigenvectors that I get with EPSGetEigenvector. Is there an easier way (or one with fewer computations) to get this?
> > >
> > > In Krylov-Schur the tridiagonal matrix T_m becomes arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The relevant information can be obtained with EPSGetBV() and EPSGetDS(). But this is a "developer level" interface. We could help you get this running. Send a small problem matrix to slepc-maint together with a more detailed description of what you need to compute.
> > >
> > > Thanks! When I get to that part I'll write to slepc-maint for help.
> > >
> > > Jose
> > > >
> > > > Jose
> > > >
> > > > El 23 oct 2018, a las 12:13, Ale Foggia <amfog...@gmail.com> escribió:
> > > > >
> > > > > Hello,
> > > > >
> > > > > I'm currently using the Lanczos solver (EPSLANCZOS) to get the smallest real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are the only options I set for the solver. My aim is to be able to predict/estimate the time-to-solution. To do so, I was doing a scaling study of the code for different matrix sizes and different numbers of MPI processes.
> > > > > As I was not observing good scaling, I checked the number of iterations of the solver (given by EPSGetIterationNumber). I've encountered that for the **same size** of matrix (that is, the same problem), when I change the number of MPI processes the number of iterations changes, and the behaviour is not monotonic. These are the numbers I've got:
> > > > >
> > > > >   # procs   # iters
> > > > >       960       157
> > > > >       992       189
> > > > >      1024       338
> > > > >      1056       190
> > > > >      1120       174
> > > > >      2048       136
> > > > >
> > > > > I've checked the mailing list for a similar situation and I've found another person with the same problem but with another solver ("[SLEPc] GD is not deterministic when using different number of cores", Nov 19 2015), but I think the solution that person found (removing the "-eps_harmonic" option) does not apply to my problem.
> > > > >
> > > > > Can you give me any hint on what the reason for this behaviour is? Is there a way to prevent it? It's not possible to estimate/predict any time consumption for bigger problems if the number of iterations varies this much.
> > > > >
> > > > > Ale
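Putting the pieces of the thread together, a minimal sketch of supplying an approximate eigenvector through EPSSetInitialSpace(), as suggested above, and of querying the iteration count that the table refers to (variable names are illustrative; eps is assumed to be configured as in the earlier setup sequence and A to be the assembled operator):

    Vec            v0;
    PetscInt       its,nconv;
    PetscScalar    kr,ki;
    PetscErrorCode ierr;

    ierr = MatCreateVecs(A,&v0,NULL);CHKERRQ(ierr);
    ierr = VecSet(v0,1.0);CHKERRQ(ierr);                     /* placeholder for a known approximation */
    ierr = EPSSetInitialSpace(eps,1,&v0);CHKERRQ(ierr);      /* replaces the random initial vector */

    ierr = EPSSolve(eps);CHKERRQ(ierr);
    ierr = EPSGetIterationNumber(eps,&its);CHKERRQ(ierr);    /* the count reported in the table above */
    ierr = EPSGetConverged(eps,&nconv);CHKERRQ(ierr);
    if (nconv > 0) {
      ierr = EPSGetEigenvalue(eps,0,&kr,&ki);CHKERRQ(ierr);  /* smallest real eigenvalue requested */
    }
    ierr = PetscPrintf(PETSC_COMM_WORLD,"iterations %D, converged %D\n",its,nconv);CHKERRQ(ierr);
    ierr = VecDestroy(&v0);CHKERRQ(ierr);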