Everything seems correct. I don't know, maybe your problem is very sensitive? Is the eigenvalue tiny? I would still try Krylov-Schur.

Jose
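Switching to Krylov-Schur only requires changing, or simply omitting, the EPSSetType() call in the setup sequence quoted below. A minimal sketch, assuming the same eps object and the usual ierr/CHKERRQ error-checking style:

    /* Select Krylov-Schur explicitly (it is also the default when EPSSetType is never called) */
    ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRQ(ierr);
    /* Keep EPSSetFromOptions so the type can still be overridden at run time with -eps_type */
    ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);

Equivalently, with the Lanczos code left unchanged, the run-time option -eps_type krylovschur selects the same solver.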
> El 24 oct 2018, a las 14:59, Ale Foggia <amfog...@gmail.com> escribió:
>
> The functions called to set the solver are (in this order): EPSCreate(); EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS); EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
>
> The output of -eps_view for each run is:
> =================================================================
> EPS Object: 960 MPI processes
>   type: lanczos
>     LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
> EPS Object: 1024 MPI processes
>   type: lanczos
>     LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 1024 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 1024 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 1024 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
>
> I ran the same configurations again and got the same result in terms of the number of iterations.
>
> I also tried full reorthogonalization (always with the -bv_reproducible_random option) but I still get a different number of iterations: for 960 procs I get 172 iters, and for 1024 I get 362 iters.
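A minimal sketch of the setup sequence listed above, with the full-reorthogonalization variant included (error handling is abbreviated; the assembled Hermitian matrix A and the variable names are illustrative):

    EPS            eps;   /* SLEPc eigensolver context */
    PetscErrorCode ierr;

    ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
    ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);               /* standard problem: second matrix is NULL */
    ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);            /* Hermitian eigenproblem */
    ierr = EPSSetType(eps,EPSLANCZOS);CHKERRQ(ierr);
    ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr);
    /* Full reorthogonalization instead of the default local variant */
    ierr = EPSLanczosSetReorthog(eps,EPS_LANCZOS_REORTHOG_FULL);CHKERRQ(ierr);
    /* Must be called, otherwise run-time options such as -bv_reproducible_random are ignored */
    ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);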
> The -eps_view output for this case (only for 960 procs; the other one has the same information, except the number of processes) is:
> =================================================================
> EPS Object: 960 MPI processes
>   type: lanczos
>     FULL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: classical Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =================================================================
>
> El mié., 24 oct. 2018 a las 10:52, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> This is very strange. Make sure you call EPSSetFromOptions in the code. Do iteration counts also change for two different runs with the same number of processes?
> Maybe Lanczos with default options is too sensitive (by default it does not reorthogonalize). I suggest using Krylov-Schur or Lanczos with full reorthogonalization (EPSLanczosSetReorthog).
> Also, send the output of -eps_view to see if there is anything abnormal.
>
> Jose
>
> > El 24 oct 2018, a las 9:09, Ale Foggia <amfog...@gmail.com> escribió:
> >
> > I've tried the option that you gave me but I still get a different number of iterations when changing the number of MPI processes: I did 960 procs and 1024 procs and I got 435 and 176 iterations, respectively.
> >
> > El mar., 23 oct. 2018 a las 16:48, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > >
> > > El 23 oct 2018, a las 15:46, Ale Foggia <amfog...@gmail.com> escribió:
> > >
> > > El mar., 23 oct. 2018 a las 15:33, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > > >
> > > > El 23 oct 2018, a las 15:17, Ale Foggia <amfog...@gmail.com> escribió:
> > > >
> > > > Hello Jose, thanks for your answer.
> > > >
> > > > El mar., 23 oct. 2018 a las 12:59, Jose E. Roman (<jro...@dsic.upv.es>) escribió:
> > > > There is an undocumented option:
> > > >
> > > >    -bv_reproducible_random
> > > >
> > > > It will force the initial vector of the Krylov subspace to be the same irrespective of the number of MPI processes. This should be used for scaling analyses such as the one you are trying to do.
> > > >
> > > > What about when I'm not doing the scaling? Now I would like to ask for computing time for bigger problems; should I also use this option in that case? Because, what happens if I have a "bad" configuration? Meaning, I ask for some amount of time, enough if I take into account the "correct" scaling, but when I run it takes double the time/iterations, as happened before when changing from 960 to 1024 processes?
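For reference, a typical invocation combining the run-time options mentioned in this exchange might look roughly as follows (the executable name and process count are placeholders; -eps_lanczos_reorthog is the command-line counterpart of EPSLanczosSetReorthog):

    mpiexec -n 960 ./solver -eps_type lanczos -eps_lanczos_reorthog full \
        -bv_reproducible_random -eps_nev 1 -eps_tol 1e-8 -eps_view

These options only take effect if the code calls EPSSetFromOptions() before EPSSolve().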
> > > When you increase the matrix size the spectrum of the matrix changes and probably also the convergence, so the computation time is not easy to predict in advance.
> > >
> > > Okay, I'll keep that in mind. I thought that, even if the spectrum changes, if I had a behaviour/tendency for 6 or 7 smaller cases I could predict the time more or less. It was working this way until I found this "iterations problem", which doubled the execution time for the same problem size. To be completely sure, do you suggest that I use this run-time option in production or not? Can you elaborate a bit on the effect of this option? Is the (huge) difference I got in the number of iterations something expected?
> >
> > Ideally, if you have a rough approximation of the eigenvector, you set it as the initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates a random initial vector, that is, it starts the search blindly. The difference between using one random vector or another may be large, depending on the problem. Krylov-Schur is usually less sensitive to the initial vector.
> >
> > Jose
> >
> > > > An additional comment is that we strongly recommend using the default solver (Krylov-Schur), which will do Lanczos with implicit restart. It is generally faster and more stable.
> > > >
> > > > I will be doing Dynamical Lanczos, which means that I'll need the "matrix whose rows are the eigenvectors of the tridiagonal matrix" (so, according to the Lanczos Technical Report notation, I need the "matrix whose rows are the eigenvectors of T_m", which should be the same as the vectors y_i). I checked the Technical Report for Krylov-Schur as well and I think I can get the same information from that solver too, but I'm not sure. Can you confirm this, please?
> > > > Also, as the vectors I want are given by V_m^(-1)*x_i = y_i (following the notation in the Report), my idea to get them was to retrieve the invariant subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply it with the eigenvectors that I get with EPSGetEigenvector. Is there an easier way (or one with fewer computations) to get this?
> > >
> > > In Krylov-Schur the tridiagonal matrix T_m becomes arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The relevant information can be obtained with EPSGetBV() and EPSGetDS(). But this is a "developer level" interface. We could help you get this running. Send a small problem matrix to slepc-maint together with a more detailed description of what you need to compute.
> > >
> > > Thanks! When I get to that part I'll write to slepc-maint for help.
> > >
> > > Jose
> > > >
> > > > Jose
> > > >
> > > > El 23 oct 2018, a las 12:13, Ale Foggia <amfog...@gmail.com> escribió:
> > > > >
> > > > > Hello,
> > > > >
> > > > > I'm currently using the Lanczos solver (EPSLANCZOS) to get the smallest real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are the only options I set for the solver. My aim is to be able to predict/estimate the time-to-solution. To do so, I was doing a scaling study of the code for different matrix sizes and different numbers of MPI processes.
> > > > > As I was not observing good scaling, I checked the number of iterations of the solver (given by EPSGetIterationNumber). I've encountered that for the **same size** of matrix (that is, the same problem), when I change the number of MPI processes the number of iterations changes, and the behaviour is not monotonic. These are the numbers I've got:
> > > > >
> > > > >   # procs   # iters
> > > > >       960       157
> > > > >       992       189
> > > > >      1024       338
> > > > >      1056       190
> > > > >      1120       174
> > > > >      2048       136
> > > > >
> > > > > I've checked the mailing list for a similar situation and I've found another person with the same problem but with another solver ("[SLEPc] GD is not deterministic when using different number of cores", Nov 19 2015), but I think the solution that person found (removing the "-eps_harmonic" option) does not apply to my problem.
> > > > >
> > > > > Can you give me any hint on what the reason for this behaviour is? Is there a way to prevent it? It's not possible to estimate/predict any time consumption for bigger problems if the number of iterations varies this much.
> > > > >
> > > > > Ale
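Putting the pieces of the thread together, a minimal sketch of supplying an approximate eigenvector through EPSSetInitialSpace(), as suggested above, and of querying the iteration count that the table refers to (variable names are illustrative; eps is assumed to be configured as in the earlier setup sequence and A to be the assembled operator):

    Vec            v0;
    PetscInt       its,nconv;
    PetscScalar    kr,ki;
    PetscErrorCode ierr;

    ierr = MatCreateVecs(A,&v0,NULL);CHKERRQ(ierr);
    ierr = VecSet(v0,1.0);CHKERRQ(ierr);                     /* placeholder for a known approximation */
    ierr = EPSSetInitialSpace(eps,1,&v0);CHKERRQ(ierr);      /* replaces the random initial vector */

    ierr = EPSSolve(eps);CHKERRQ(ierr);
    ierr = EPSGetIterationNumber(eps,&its);CHKERRQ(ierr);    /* the count reported in the table above */
    ierr = EPSGetConverged(eps,&nconv);CHKERRQ(ierr);
    if (nconv > 0) {
      ierr = EPSGetEigenvalue(eps,0,&kr,&ki);CHKERRQ(ierr);  /* smallest real eigenvalue requested */
    }
    ierr = PetscPrintf(PETSC_COMM_WORLD,"iterations %D, converged %D\n",its,nconv);CHKERRQ(ierr);
    ierr = VecDestroy(&v0);CHKERRQ(ierr);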