Re: [petsc-users] CPR-AMG: SNES with two cores worse than with one
Thank you Hong! I've used GMRES via

mpirun \
  -n ${NP} pflotran \
  -pflotranin ${INPUTFILE}.pflinput \
  -flow_ksp_type gmres \
  -flow_pc_type bjacobi \
  -flow_sub_pc_type lu \
  -flow_sub_pc_factor_nonzeros_along_diagonal \
  -snes_monitor

and get:

NP 1

FLOW TS BE steps = 43 newton = 43 linear = 43 cuts = 0
FLOW TS BE Wasted Linear Iterations = 0
FLOW TS BE SNES time = 197.0 seconds

NP 2

FLOW TS BE steps = 43 newton = 43 linear = 770 cuts = 0
FLOW TS BE Wasted Linear Iterations = 0
FLOW TS BE SNES time = 68.7 seconds

Which looks OK to me.

Robert

On 07/07/17 15:49, h...@aspiritech.org wrote:
> What do you get with '-ksp_type gmres' or '-ksp_type bcgs' in parallel
> runs?
> Hong
>
> On Fri, Jul 7, 2017 at 6:05 AM, Robert Annewandter
> <robert.annewand...@opengosim.com> wrote:
>
>     Yes indeed, PFLOTRAN cuts the timestep after 8 failed iterations of SNES.
>
>     I've rerun with -snes_monitor (attached with canonical suffix);
>     their -pc_type is always PCBJACOBI + PCLU (though we'd like to try
>     SUPERLU in the future, however it works only with -mat_type aij..)
>
>     The sequential and parallel runs I did with
>
>         -ksp_type preonly -pc_type lu
>         -pc_factor_nonzeros_along_diagonal -snes_monitor
>
>     and
>
>         -ksp_type preonly -pc_type bjacobi -sub_pc_type lu
>         -sub_pc_factor_nonzeros_along_diagonal -snes_monitor
>
>     As expected, the two sequential runs are identical, and the parallel
>     run takes half the time of the sequential one.
>
>     On 07/07/17 01:20, Barry Smith wrote:
>>     Looks like PFLOTRAN has a maximum number of SNES iterations of 8 and
>>     cuts the timestep if that fails.
>>
>>     Please run with -snes_monitor; I don't understand the strange densely
>>     packed information that PFLOTRAN is printing.
>>
>>     It looks like the linear solver is converging fine in parallel;
>>     normally there is then absolutely no reason that Newton should behave
>>     differently on 2 processors than on 1, unless there is something
>>     wrong with the Jacobian.
>>     What is the -pc_type for the two cases, LU or your fancy thing?
>>
>>     Please run sequential and parallel with -pc_type lu and also with
>>     -snes_monitor. We need to fix all the knobs but one in order to
>>     understand what is going on.
>>
>>     Barry
>>
>>> On Jul 6, 2017, at 5:11 PM, Robert Annewandter
>>> <robert.annewand...@opengosim.com> wrote:
>>>
>>> Thanks Barry!
>>>
>>> I've attached log files for np = 1 (SNES time: 218 s) and np = 2 (SNES
>>> time: 600 s). PFLOTRAN final output:
>>>
>>> NP 1
>>>
>>> FLOW TS BE steps = 43 newton = 43 linear = 43 cuts = 0
>>> FLOW TS BE Wasted Linear Iterations = 0
>>> FLOW TS BE SNES time = 218.9 seconds
>>>
>>> NP 2
>>>
>>> FLOW TS BE steps = 67 newton = 176 linear = 314 cuts = 13
>>> FLOW TS BE Wasted Linear Iterations = 208
>>> FLOW TS BE SNES time = 600.0 seconds
>>>
>>> Robert
>>>
>>> On 06/07/17 21:24, Barry Smith wrote:
>>>> So on one process the outer linear solver takes a single iteration;
>>>> this is because block Jacobi with LU and one block is a direct solver.
>>>>
>>>>> 11 KSP preconditioned resid norm 1.131868956745e+00 true resid norm 1.526261825526e-05 ||r(i)||/||b|| 1.485509868409e-05
>>>>> [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 2.148515820410e-14 is less than relative tolerance 1.e-07 times initial right hand side norm 1.581814306485e-02 at iteration 1
>>>>> 1 KSP unpreconditioned resid norm 2.148515820410e-14 true resid norm 2.148698024622e-14 ||r(i)||/||b|| 1.358375642332e-12
>>>>
>>>> On two processes the outer linear solver takes a few iterations to
>>>> solve; this is to be expected.
>>>>
>>>> But what you sent doesn't give any indication about SNES not
>>>> converging. Please turn off all inner linear solver monitoring and just
>>>> run with -ksp_monitor_true_residual -snes_monitor -snes_linesearch_monitor
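Barry's remark above — block Jacobi with LU and a single block is a direct solver, so the one-process run converges in one outer iteration, while two blocks drop the inter-block coupling — can be reproduced outside PETSc. Below is a minimal numpy/scipy sketch; the sparse system and all names are made up for illustration, not the PFLOTRAN Jacobian.

```python
import numpy as np
from scipy.sparse import csc_matrix, identity, random as sparse_random
from scipy.sparse.linalg import splu

# Made-up diagonally dominant sparse system standing in for the Jacobian.
n = 60
A = csc_matrix(sparse_random(n, n, density=0.05, random_state=0) + 5.0 * identity(n))
b = np.ones(n)

def bjacobi_lu_apply(A, r, nblocks):
    """Apply a block-Jacobi/LU preconditioner: LU-solve each diagonal block,
    dropping all coupling between blocks (the rows each MPI rank would own)."""
    bounds = np.linspace(0, A.shape[0], nblocks + 1, dtype=int)
    y = np.empty_like(r)
    for s, e in zip(bounds[:-1], bounds[1:]):
        y[s:e] = splu(csc_matrix(A[s:e, s:e])).solve(r[s:e])
    return y

# One block: the preconditioner solve IS a full LU solve of A, so the
# residual vanishes after a single application (one outer KSP iteration).
r1 = b - A @ bjacobi_lu_apply(A, b, 1)

# Two blocks: the dropped off-diagonal coupling leaves a residual that the
# outer Krylov method has to iterate away.
r2 = b - A @ bjacobi_lu_apply(A, b, 2)
```

With one block the np = 1 behaviour of the logs above falls out (a single iteration per solve); with two blocks the leftover residual is what the extra GMRES iterations in the parallel runs are working off.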
[petsc-users] CPR-AMG: SNES with two cores worse than with one
Hi all,

I'd like to understand why the SNES of my CPR-AMG two-stage preconditioner (KSPFGMRES + multiplicative PCComposite: PCGalerkin with KSPGMRES + BoomerAMG, then PCBJacobi + PCLU, on a 24,000 x 24,000 matrix) struggles to converge when using two cores instead of one. Because of the adaptive time stepping of the Newton, this leads to severe cuts in time step.

This is how I run it with two cores:

mpirun \
  -n 2 pflotran \
  -pflotranin het.pflinput \
  -ksp_monitor_true_residual \
  -flow_snes_view \
  -flow_snes_converged_reason \
  -flow_sub_1_pc_type bjacobi \
  -flow_sub_1_sub_pc_type lu \
  -flow_sub_1_sub_pc_factor_pivot_in_blocks true \
  -flow_sub_1_sub_pc_factor_nonzeros_along_diagonal \
  -options_left \
  -log_summary \
  -info

With one core I get (after grepping the noise away from -info):

Step 32 Time= 1.8E+01 [...]

 0 2r: 1.58E-02 2x: 0.00E+00 2u: 0.00E+00 ir: 7.18E-03 iu: 0.00E+00 rsn: 0
[0] SNESComputeJacobian(): Rebuilding preconditioner
  Residual norms for flow_ solve.
  0 KSP unpreconditioned resid norm 1.581814306485e-02 true resid norm 1.581814306485e-02 ||r(i)||/||b|| 1.e+00
    Residual norms for flow_sub_0_galerkin_ solve.
    0 KSP preconditioned resid norm 5.697603110484e+07 true resid norm 5.175721849125e+03 ||r(i)||/||b|| 5.037527476892e+03
    1 KSP preconditioned resid norm 5.041509073319e+06 true resid norm 3.251596928176e+02 ||r(i)||/||b|| 3.164777657484e+02
    2 KSP preconditioned resid norm 1.043761838360e+06 true resid norm 8.957519558348e+01 ||r(i)||/||b|| 8.718349288342e+01
    3 KSP preconditioned resid norm 1.129189815646e+05 true resid norm 2.722436912053e+00 ||r(i)||/||b|| 2.649746479496e+00
    4 KSP preconditioned resid norm 8.829637298082e+04 true resid norm 8.026373593492e+00 ||r(i)||/||b|| 7.812065388300e+00
    5 KSP preconditioned resid norm 6.506021637694e+04 true resid norm 3.479889319880e+00 ||r(i)||/||b|| 3.386974527698e+00
    6 KSP preconditioned resid norm 6.392263200180e+04 true resid norm 3.819202631980e+00 ||r(i)||/||b|| 3.717228003987e+00
    7 KSP preconditioned resid norm 2.464946645480e+04 true resid norm 7.329964753388e-01 ||r(i)||/||b|| 7.134251013911e-01
    8 KSP preconditioned resid norm 2.603879153772e+03 true resid norm 2.035525412004e-02 ||r(i)||/||b|| 1.981175861414e-02
    9 KSP preconditioned resid norm 1.774410462754e+02 true resid norm 3.001214973121e-03 ||r(i)||/||b|| 2.921081026352e-03
    10 KSP preconditioned resid norm 1.664227038378e+01 true resid norm 3.413136309181e-04 ||r(i)||/||b|| 3.322003855903e-04
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 1.131868956745e+00 is less than relative tolerance 1.e-07 times initial right hand side norm 2.067297386780e+07 at iteration 11
    11 KSP preconditioned resid norm 1.131868956745e+00 true resid norm 1.526261825526e-05 ||r(i)||/||b|| 1.485509868409e-05
[0] KSPConvergedDefault(): Linear solver has converged.
Residual norm 2.148515820410e-14 is less than relative tolerance 1.e-07 times initial right hand side norm 1.581814306485e-02 at iteration 1
  1 KSP unpreconditioned resid norm 2.148515820410e-14 true resid norm 2.148698024622e-14 ||r(i)||/||b|| 1.358375642332e-12
[0] SNESSolve_NEWTONLS(): iter=0, linear solve iterations=1
[0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 3.590873180642e-01 near zero implies inconsistent rhs
[0] SNESSolve_NEWTONLS(): fnorm=1.5818143064846742e-02, gnorm=1.0695649833687331e-02, ynorm=4.6826522561266171e+02, lssucceed=0
[0] SNESConvergedDefault(): Converged due to small update length: 4.682652256127e+02 < 1.e-05 * 3.702480426117e+09
 1 2r: 1.07E-02 2x: 3.70E+09 2u: 4.68E+02 ir: 5.05E-03 iu: 4.77E+01 rsn: stol
Nonlinear flow_ solve converged due to CONVERGED_SNORM_RELATIVE iterations 1

But with two cores I get:

Step 32 Time= 1.8E+01 [...]

 0 2r: 6.16E-03 2x: 0.00E+00 2u: 0.00E+00 ir: 3.63E-03 iu: 0.00E+00 rsn: 0
[0] SNESComputeJacobian(): Rebuilding preconditioner
  Residual norms for flow_ solve.
  0 KSP unpreconditioned resid norm 6.162760088924e-03 true resid norm 6.162760088924e-03 ||r(i)||/||b|| 1.e+00
    Residual norms for flow_sub_0_galerkin_ solve.
    0 KSP preconditioned resid norm 8.994949630499e+08 true resid norm 7.982144380936e-01 ||r(i)||/||b|| 1.e+00
    1 KSP preconditioned resid norm 8.950556502615e+08 true resid norm 1.550138696155e+00 ||r(i)||/||b|| 1.942007839218e+00
    2 KSP preconditioned resid norm 1.044849684205e+08 true resid norm 2.166193480531e+00 ||r(i)||/||b|| 2.713798920631e+00
    3 KSP preconditioned resid norm 8.209708619718e+06 true resid norm 3.076045005154e-01 ||r(i)||/||b|| 3.853657436340e-01
    4 KSP preconditioned resid norm 3.027461352422e+05 true resid norm 1.207731865714e-02 ||r(i)||/||b|| 1.513041869549e-02
    5 KSP preconditioned resid norm 1.595302164817e+04 true resid norm 4.123713694368e-04
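For readers decoding the -info lines above: two different default tests fire here, KSP's relative-residual test (KSPConvergedDefault) and the SNES small-update test (CONVERGED_SNORM_RELATIVE, the 'stol' criterion). A small Python sketch of both checks, plugging in the norms and tolerances quoted verbatim from the one-core log:

```python
def ksp_converged_rtol(rnorm, rnorm0, rtol):
    # KSPConvergedDefault's relative test: stop once ||r|| < rtol * ||b||,
    # where ||b|| is the initial right-hand-side norm reported in the log.
    return rnorm < rtol * rnorm0

def snes_converged_stol(ynorm, xnorm, stol):
    # SNES CONVERGED_SNORM_RELATIVE: stop once the Newton update is small
    # relative to the current solution, ||dx|| < stol * ||x||.
    return ynorm < stol * xnorm

# "Residual norm 2.148515820410e-14 is less than relative tolerance 1.e-07
#  times initial right hand side norm 1.581814306485e-02 at iteration 1"
assert ksp_converged_rtol(2.148515820410e-14, 1.581814306485e-02, 1e-7)

# "Converged due to small update length:
#  4.682652256127e+02 < 1.e-05 * 3.702480426117e+09"
assert snes_converged_stol(4.682652256127e+02, 3.702480426117e+09, 1e-5)
```

Note that the stol test can declare convergence even when the nonlinear residual is still sizeable, which is worth keeping in mind when comparing the one- and two-core runs.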
Re: [petsc-users] Hypre BoomerAMG has slow convergence
Thank you Barry! I believe the subproblem can be singular, because the first preconditioner M1 in the CPR-AMG preconditioner

  Mcpr^(-1) = M2^(-1) [ I - J M1^(-1) ] + M1^(-1),  where  M1^(-1) = C [ W^T J C ]^(-1) W^T,

has prolongation C and restriction W^T of size 3N x N resp. N x 3N (with 3 the number of primary variables). Would that make sense?

On 04/07/17 18:24, Barry Smith wrote:
> The large preconditioned norm relative to the true residual norm is often
> a sign that the preconditioner is not happy.
>
>   0 KSP preconditioned resid norm 2.495457360562e+08 true resid norm 9.213492769259e-01 ||r(i)||/||b|| 1.e+00
>
> Is there any chance this subproblem is singular?
>
> Run with -flow_sub_0_galerkin_ksp_view_mat binary
> -flow_sub_0_galerkin_ksp_view_rhs binary for 18 stages. This will save the
> matrices and the right hand sides for the 18 systems passed to hypre in a
> file called binaryoutput. Email the file to petsc-ma...@mcs.anl.gov
>
> Barry
>
>> On Jul 4, 2017, at 11:59 AM, Robert Annewandter
>> <robert.annewand...@opengosim.com> wrote:
>>
>> Hi all,
>>
>> I'm working on a CPR-AMG two-stage preconditioner implemented as
>> multiplicative PCComposite with outer FGMRES, where the first PC is Hypre
>> AMG (PCGalerkin + KSPRichardson + PCHYPRE) and the second stage is Block
>> Jacobi with LU. The PDEs describe two-phase subsurface flow, and I kept
>> the problem small at 8000 x 8000 dofs.
>>
>> The first stage is hard-wired because of the PCGalerkin part, and the
>> second stage Block Jacobi is configured via command line (with pflotran
>> prefix flow_):
>>
>> -flow_sub_1_pc_type bjacobi \
>> -flow_sub_1_sub_pc_type lu \
>>
>> With this configuration I occasionally see Hypre struggle to converge
>> quickly:
>>
>> Step 16
>>
>> 0 2r: 3.95E-03 2x: 0.00E+00 2u: 0.00E+00 ir: 2.53E-03 iu: 0.00E+00 rsn: 0
>> Residual norms for flow_ solve.
>> 0 KSP unpreconditioned resid norm 3.945216988332e-03 true resid norm 3.945216988332e-03 ||r(i)||/||b|| 1.e+00
>> Residual norms for flow_sub_0_galerkin_ solve.
>> 0 KSP preconditioned resid norm 2.495457360562e+08 true resid norm 9.213492769259e-01 ||r(i)||/||b|| 1.e+00
>> 1 KSP preconditioned resid norm 3.900401635809e+07 true resid norm 1.211813734614e-01 ||r(i)||/||b|| 1.315259874797e-01
>> 2 KSP preconditioned resid norm 7.264015944695e+06 true resid norm 2.127154159346e-02 ||r(i)||/||b|| 2.308738078618e-02
>> 3 KSP preconditioned resid norm 1.523934370189e+06 true resid norm 4.50720434e-03 ||r(i)||/||b|| 4.891961172285e-03
>> 4 KSP preconditioned resid norm 3.456355485206e+05 true resid norm 1.017486337883e-03 ||r(i)||/||b|| 1.104343774250e-03
>> 5 KSP preconditioned resid norm 8.215494701640e+04 true resid norm 2.386758602821e-04 ||r(i)||/||b|| 2.590503582729e-04
>> 6 KSP preconditioned resid norm 2.006221595869e+04 true resid norm 5.806707975375e-05 ||r(i)||/||b|| 6.302395975986e-05
>> 7 KSP preconditioned resid norm 4.975749682114e+03 true resid norm 1.457831681999e-05 ||r(i)||/||b|| 1.582279075383e-05
>> 8 KSP preconditioned resid norm 1.245359749620e+03 true resid norm 3.746721600730e-06 ||r(i)||/||b|| 4.066559441204e-06
>> 9 KSP preconditioned resid norm 3.134373137075e+02 true resid norm 9.784665277082e-07 ||r(i)||/||b|| 1.061993048904e-06
>> 10 KSP preconditioned resid norm 7.917076489741e+01 true resid norm 2.582765351245e-07 ||r(i)||/||b|| 2.803242392356e-07
>> 11 KSP preconditioned resid norm 2.004702594193e+01 true resid norm 6.867609287185e-08 ||r(i)||/||b|| 7.453860831257e-08
>> 1 KSP unpreconditioned resid norm 3.022346103074e-11 true resid norm 3.022346103592e-11 ||r(i)||/||b|| 7.660785484121e-09
>> 1 2r: 2.87E-04 2x: 3.70E+09 2u: 3.36E+02 ir: 1.67E-04 iu: 2.19E+01 rsn: stol
>> Nonlinear flow_ solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
>>
>> Step 17
>>
>> 0 2r: 3.85E-03 2x: 0.00E+00 2u: 0.00E+00 ir: 2.69E-03 iu: 0.00E+00 rsn: 0
>> Residual norms for flow_ solve.
>> 0 KSP unpreconditioned resid norm 3.846677237838e-03 true resid norm 3.846677237838e-03 ||r(i)||/||b|| 1.e+00
>> Residual norms for flow_sub_0_galerkin_ solve.
>> 0 KSP preconditioned resid norm 8.359592959751e+07 true resid norm 8.919381920269e-01 ||r(i)||/||b|| 1.e+00
>> 1 KSP preconditioned resid norm 2.04647
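The first stage written out above, M1^(-1) = C [ W^T J C ]^(-1) W^T, satisfies W^T J M1^(-1) = W^T, i.e. it eliminates the restricted residual exactly and leaves the rest to the second stage. A dense numpy sketch of that property follows; J, C and W here are made up for illustration (real CPR restriction/prolongation weights come from the decoupling), and the solve on W^T J C is exactly the operation that breaks down if that Galerkin product is singular, as discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20                    # number of cells (illustrative)
n = 3 * N                 # 3 primary variables per cell, so J is 3N x 3N

J = rng.standard_normal((n, n)) + 10.0 * np.eye(n)  # stand-in Jacobian
C = np.zeros((n, N))      # prolongation, 3N x N
W = np.zeros((n, N))      # W^T is the N x 3N restriction
for i in range(N):
    C[3 * i, i] = 1.0     # pick out one (say, pressure) dof per cell
    W[3 * i, i] = 1.0     # hypothetical weights; real CPR uses IMPES-like ones

Ap = W.T @ J @ C          # the N x N first-stage system handed to BoomerAMG

def M1inv(r):
    # M1^(-1) r = C (W^T J C)^(-1) W^T r
    return C @ np.linalg.solve(Ap, W.T @ r)

r = rng.standard_normal(n)
restricted = W.T @ (r - J @ M1inv(r))   # vanishes up to roundoff
```

Because W^T J M1^(-1) r = (W^T J C)(W^T J C)^(-1) W^T r = W^T r, the restricted residual after the first stage is identically zero in exact arithmetic, regardless of how C and W are chosen, as long as W^T J C is nonsingular.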
[petsc-users] Hypre BoomerAMG has slow convergence
Hi all,

I'm working on a CPR-AMG two-stage preconditioner implemented as multiplicative PCComposite with outer FGMRES, where the first PC is Hypre AMG (PCGalerkin + KSPRichardson + PCHYPRE) and the second stage is Block Jacobi with LU. The PDEs describe two-phase subsurface flow, and I kept the problem small at 8000 x 8000 dofs.

The first stage is hard-wired because of the PCGalerkin part, and the second stage Block Jacobi is configured via command line (with pflotran prefix flow_):

-flow_sub_1_pc_type bjacobi \
-flow_sub_1_sub_pc_type lu \

With this configuration I occasionally see Hypre struggle to converge quickly:

Step 16

 0 2r: 3.95E-03 2x: 0.00E+00 2u: 0.00E+00 ir: 2.53E-03 iu: 0.00E+00 rsn: 0
  Residual norms for flow_ solve.
  0 KSP unpreconditioned resid norm 3.945216988332e-03 true resid norm 3.945216988332e-03 ||r(i)||/||b|| 1.e+00
    Residual norms for flow_sub_0_galerkin_ solve.
    0 KSP preconditioned resid norm 2.495457360562e+08 true resid norm 9.213492769259e-01 ||r(i)||/||b|| 1.e+00
    1 KSP preconditioned resid norm 3.900401635809e+07 true resid norm 1.211813734614e-01 ||r(i)||/||b|| 1.315259874797e-01
    2 KSP preconditioned resid norm 7.264015944695e+06 true resid norm 2.127154159346e-02 ||r(i)||/||b|| 2.308738078618e-02
    3 KSP preconditioned resid norm 1.523934370189e+06 true resid norm 4.50720434e-03 ||r(i)||/||b|| 4.891961172285e-03
    4 KSP preconditioned resid norm 3.456355485206e+05 true resid norm 1.017486337883e-03 ||r(i)||/||b|| 1.104343774250e-03
    5 KSP preconditioned resid norm 8.215494701640e+04 true resid norm 2.386758602821e-04 ||r(i)||/||b|| 2.590503582729e-04
    6 KSP preconditioned resid norm 2.006221595869e+04 true resid norm 5.806707975375e-05 ||r(i)||/||b|| 6.302395975986e-05
    7 KSP preconditioned resid norm 4.975749682114e+03 true resid norm 1.457831681999e-05 ||r(i)||/||b|| 1.582279075383e-05
    8 KSP preconditioned resid norm 1.245359749620e+03 true resid norm 3.746721600730e-06 ||r(i)||/||b|| 4.066559441204e-06
    9 KSP preconditioned resid norm 3.134373137075e+02 true resid norm 9.784665277082e-07 ||r(i)||/||b|| 1.061993048904e-06
    10 KSP preconditioned resid norm 7.917076489741e+01 true resid norm 2.582765351245e-07 ||r(i)||/||b|| 2.803242392356e-07
    11 KSP preconditioned resid norm 2.004702594193e+01 true resid norm 6.867609287185e-08 ||r(i)||/||b|| 7.453860831257e-08
  1 KSP unpreconditioned resid norm 3.022346103074e-11 true resid norm 3.022346103592e-11 ||r(i)||/||b|| 7.660785484121e-09
 1 2r: 2.87E-04 2x: 3.70E+09 2u: 3.36E+02 ir: 1.67E-04 iu: 2.19E+01 rsn: stol
Nonlinear flow_ solve converged due to CONVERGED_SNORM_RELATIVE iterations 1

Step 17

 0 2r: 3.85E-03 2x: 0.00E+00 2u: 0.00E+00 ir: 2.69E-03 iu: 0.00E+00 rsn: 0
  Residual norms for flow_ solve.
  0 KSP unpreconditioned resid norm 3.846677237838e-03 true resid norm 3.846677237838e-03 ||r(i)||/||b|| 1.e+00
    Residual norms for flow_sub_0_galerkin_ solve.
    0 KSP preconditioned resid norm 8.359592959751e+07 true resid norm 8.919381920269e-01 ||r(i)||/||b|| 1.e+00
    1 KSP preconditioned resid norm 2.046474217608e+07 true resid norm 1.356172589724e+00 ||r(i)||/||b|| 1.520478214574e+00
    2 KSP preconditioned resid norm 5.534610937223e+06 true resid norm 1.361527715124e+00 ||r(i)||/||b|| 1.526482134406e+00
    3 KSP preconditioned resid norm 1.642592089665e+06 true resid norm 1.359990274368e+00 ||r(i)||/||b|| 1.524758426677e+00
    4 KSP preconditioned resid norm 6.869446528993e+05 true resid norm 1.357740694885e+00 ||r(i)||/||b|| 1.522236301823e+00
    5 KSP preconditioned resid norm 5.245968674991e+05 true resid norm 1.355364470917e+00 ||r(i)||/||b|| 1.519572189007e+00
    6 KSP preconditioned resid norm 5.042030663187e+05 true resid norm 1.352962944308e+00 ||r(i)||/||b|| 1.516879708036e+00
    7 KSP preconditioned resid norm 5.007302249221e+05 true resid norm 1.350558656878e+00 ||r(i)||/||b|| 1.514184131760e+00
    8 KSP preconditioned resid norm 4.994105316949e+05 true resid norm 1.348156961110e+00 ||r(i)||/||b|| 1.511491461137e+00
    9 KSP preconditioned resid norm 4.984373051647e+05 true resid norm 1.345759135434e+00 ||r(i)||/||b|| 1.508803129481e+00
    10 KSP preconditioned resid norm 4.975323739321e+05 true resid norm 1.343365479502e+00 ||r(i)||/||b|| 1.506119472750e+00
    11 KSP preconditioned resid norm 4.966432959339e+05 true resid norm 1.340976058673e+00 ||r(i)||/||b|| 1.503440564224e+00
    [...]
    193 KSP preconditioned resid norm 3.591931201817e+05 true resid norm 9.698521332569e-01 ||r(i)||/||b|| 1.087353520599e+00
    194 KSP preconditioned resid norm 3.585542278288e+05 true resid norm 9.681270691497e-01 ||r(i)||/||b|| 1.085419458213e+00
    195 KSP preconditioned resid norm 3.579164717745e+05 true resid norm 9.664050733935e-01 ||r(i)||/||b|| 1.083488835922e+00
    196 KSP preconditioned resid norm 3.572798501551e+05 true resid norm 9.646861405301e-01 ||r(i)||/||b||
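The symptom Barry points at in his reply — a preconditioned residual norm around 1e+08 against a true residual norm around 1 — is what a (nearly) singular preconditioner produces: M^(-1) inflates some direction of the residual enormously while the true residual says nothing of the sort. A toy numpy sketch with entirely made-up operators:

```python
import numpy as np

n = 25
r = np.ones(n)                     # stand-in true residual

# Healthy preconditioner: M^{-1} ~ I, so the preconditioned and true
# residual norms are of comparable size.
Minv_ok = np.eye(n)

# Nearly singular M: one tiny eigenvalue, so M^{-1} has one huge one and
# blows up whatever component of r lies in that direction.
Minv_bad = np.diag(np.r_[1e12, np.ones(n - 1)])

ratio_ok = np.linalg.norm(Minv_ok @ r) / np.linalg.norm(r)
ratio_bad = np.linalg.norm(Minv_bad @ r) / np.linalg.norm(r)
```

Here ratio_ok is 1 while ratio_bad is around 2e11: a huge preconditioned norm does not mean the solve is far from the answer, only that the preconditioner is (nearly) singular in some direction, which is why Barry asks for the subproblem matrices.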
Re: [petsc-users] PCCOMPOSITE with PCBJACOBI
Interesting! And it would fit into configuring PFLOTRAN via its input decks (i.e. we could also provide ASM instead of Block Jacobi).

Thanks a lot!

On 28/06/17 17:31, Barry Smith wrote:
>> On Jun 28, 2017, at 2:07 AM, Robert Annewandter
>> <robert.annewand...@opengosim.com> wrote:
>>
>> Thank you Barry!
>>
>> We like to hard-wire it into PFLOTRAN, with CPR-AMG Block Jacobi Two-Stage
>> Preconditioning potentially becoming the standard solver strategy.
>
> Understood. Note that you can embed the options into the program with
> PetscOptionsSetValue() so they don't need to be on the command line.
>
> Barry
>
>> Using the options database is a great start to reverse engineer the issue!
>>
>> Thanks!
>> Robert
>>
>> On 27/06/17 23:45, Barry Smith wrote:
>>> It is difficult, if not impossible at times, to get all the options where
>>> you want them to be using the function call interface. On the other hand
>>> it is generally easy (if there are no inner PCSHELLs) to do this via the
>>> options database:
>>>
>>> -pc_type composite
>>> -pc_composite_type multiplicative
>>> -pc_composite_pcs galerkin,bjacobi
>>>
>>> -sub_0_galerkin_ksp_type preonly
>>> -sub_0_galerkin_pc_type none
>>>
>>> -sub_1_sub_pc_factor_shift_type inblocks
>>> -sub_1_sub_pc_factor_zero_pivot zpiv
>>>
>>>> On Jun 27, 2017, at 11:24 AM, Robert Annewandter
>>>> <robert.annewand...@opengosim.com> wrote:
>>>>
>>>> Dear PETSc folks,
>>>>
>>>> I want a Block Jacobi PC to be the second PC in a two-stage
>>>> preconditioning scheme implemented via multiplicative PCCOMPOSITE, with
>>>> the outermost KSP an FGMRES.
>>>>
>>>> However, PCBJacobiGetSubKSP
>>>> (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCBJacobiGetSubKSP.html#PCBJacobiGetSubKSP)
>>>> requires KSPSetUp (or PCSetUp) to be called first on its parent KSP,
>>>> which I am struggling to do. I wonder which KSP (or, if so, PC) that is.
>>>>
>>>> This is how I attempt to do it (using PCKSP to provide a parent KSP for
>>>> PCBJacobiGetSubKSP):
>>>>
>>>> call KSPGetPC(solver%ksp, solver%pc, ierr); CHKERRQ(ierr)
>>>> call PCSetType(solver%pc, PCCOMPOSITE, ierr); CHKERRQ(ierr)
>>>> call PCCompositeSetType(solver%pc, PC_COMPOSITE_MULTIPLICATIVE, ierr); CHKERRQ(ierr)
>>>>
>>>> ! 1st Stage
>>>> call PCCompositeAddPC(solver%pc, PCGALERKIN, ierr); CHKERRQ(ierr)
>>>> call PCCompositeGetPC(solver%pc, 0, T1, ierr); CHKERRQ(ierr)
>>>>
>>>> ! KSPPREONLY-PCNONE for testing
>>>> call PCGalerkinGetKSP(T1, Ap_ksp, ierr); CHKERRQ(ierr)
>>>> call KSPSetType(Ap_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
>>>> call KSPGetPC(Ap_ksp, Ap_pc, ierr); CHKERRQ(ierr)
>>>> call PCSetType(Ap_pc, PCNONE, ierr); CHKERRQ(ierr)
>>>>
>>>> ! 2nd Stage
>>>> call PCCompositeAddPC(solver%pc, PCKSP, ierr); CHKERRQ(ierr)
>>>> call PCCompositeGetPC(solver%pc, 1, T2, ierr); CHKERRQ(ierr)
>>>> call PCKSPGetKSP(T2, BJac_ksp, ierr); CHKERRQ(ierr)
>>>> call KSPSetType(BJac_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
>>>> call KSPGetPC(BJac_ksp, BJac_pc, ierr); CHKERRQ(ierr)
>>>> call PCSetType(BJac_pc, PCBJACOBI, ierr); CHKERRQ(ierr)
>>>>
>>>> call KSPSetUp(solver%ksp, ierr); CHKERRQ(ierr)
>>>> ! call KSPSetUp(BJac_ksp, ierr); CHKERRQ(ierr)
>>>> ! call PCSetUp(T2, ierr); CHKERRQ(ierr)
>>>> ! call PCSetUp(BJac_pc, ierr); CHKERRQ(ierr)
>>>>
>>>> call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, PETSC_NULL_KSP, ierr); CHKERRQ(ierr)
>>>> allocate(sub_ksps(nsub_ksp))
>>>> call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, sub_ksps, ierr); CHKERRQ(ierr)
>>>> do i = 1, nsub_ksp
>>>>   call KSPGetPC(sub_ksps(i), BJac_pc_sub, ierr); CHKERRQ(ierr)
>>>>   call PCFactorSetShiftType(BJac_pc_sub, MAT_SHIFT_INBLOCKS, ierr); CHKERRQ(ierr)
>>>>   call PCFactorSetZeroPivot(BJac_pc_sub, solver%linear_zero_pivot_tol, ierr); CHKERRQ(ierr)
>>>
Re: [petsc-users] PCCOMPOSITE with PCBJACOBI
Thank you Barry!

We like to hard-wire it into PFLOTRAN, with CPR-AMG Block Jacobi Two-Stage Preconditioning potentially becoming the standard solver strategy. Using the options database is a great start to reverse engineer the issue!

Thanks!
Robert

On 27/06/17 23:45, Barry Smith wrote:
> It is difficult, if not impossible at times, to get all the options where
> you want them to be using the function call interface. On the other hand it
> is generally easy (if there are no inner PCSHELLs) to do this via the
> options database:
>
> -pc_type composite
> -pc_composite_type multiplicative
> -pc_composite_pcs galerkin,bjacobi
>
> -sub_0_galerkin_ksp_type preonly
> -sub_0_galerkin_pc_type none
>
> -sub_1_sub_pc_factor_shift_type inblocks
> -sub_1_sub_pc_factor_zero_pivot zpiv
>
>> On Jun 27, 2017, at 11:24 AM, Robert Annewandter
>> <robert.annewand...@opengosim.com> wrote:
>>
>> Dear PETSc folks,
>>
>> I want a Block Jacobi PC to be the second PC in a two-stage preconditioning
>> scheme implemented via multiplicative PCCOMPOSITE, with the outermost KSP
>> an FGMRES.
>>
>> However, PCBJacobiGetSubKSP
>> (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCBJacobiGetSubKSP.html#PCBJacobiGetSubKSP)
>> requires KSPSetUp (or PCSetUp) to be called first on its parent KSP, which
>> I am struggling to do. I wonder which KSP (or, if so, PC) that is.
>>
>> This is how I attempt to do it (using PCKSP to provide a parent KSP for
>> PCBJacobiGetSubKSP):
>>
>> call KSPGetPC(solver%ksp, solver%pc, ierr); CHKERRQ(ierr)
>> call PCSetType(solver%pc, PCCOMPOSITE, ierr); CHKERRQ(ierr)
>> call PCCompositeSetType(solver%pc, PC_COMPOSITE_MULTIPLICATIVE, ierr); CHKERRQ(ierr)
>>
>> ! 1st Stage
>> call PCCompositeAddPC(solver%pc, PCGALERKIN, ierr); CHKERRQ(ierr)
>> call PCCompositeGetPC(solver%pc, 0, T1, ierr); CHKERRQ(ierr)
>>
>> ! KSPPREONLY-PCNONE for testing
>> call PCGalerkinGetKSP(T1, Ap_ksp, ierr); CHKERRQ(ierr)
>> call KSPSetType(Ap_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
>> call KSPGetPC(Ap_ksp, Ap_pc, ierr); CHKERRQ(ierr)
>> call PCSetType(Ap_pc, PCNONE, ierr); CHKERRQ(ierr)
>>
>> ! 2nd Stage
>> call PCCompositeAddPC(solver%pc, PCKSP, ierr); CHKERRQ(ierr)
>> call PCCompositeGetPC(solver%pc, 1, T2, ierr); CHKERRQ(ierr)
>> call PCKSPGetKSP(T2, BJac_ksp, ierr); CHKERRQ(ierr)
>> call KSPSetType(BJac_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
>> call KSPGetPC(BJac_ksp, BJac_pc, ierr); CHKERRQ(ierr)
>> call PCSetType(BJac_pc, PCBJACOBI, ierr); CHKERRQ(ierr)
>>
>> call KSPSetUp(solver%ksp, ierr); CHKERRQ(ierr)
>> ! call KSPSetUp(BJac_ksp, ierr); CHKERRQ(ierr)
>> ! call PCSetUp(T2, ierr); CHKERRQ(ierr)
>> ! call PCSetUp(BJac_pc, ierr); CHKERRQ(ierr)
>>
>> call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, PETSC_NULL_KSP, ierr); CHKERRQ(ierr)
>> allocate(sub_ksps(nsub_ksp))
>> call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, sub_ksps, ierr); CHKERRQ(ierr)
>> do i = 1, nsub_ksp
>>   call KSPGetPC(sub_ksps(i), BJac_pc_sub, ierr); CHKERRQ(ierr)
>>   call PCFactorSetShiftType(BJac_pc_sub, MAT_SHIFT_INBLOCKS, ierr); CHKERRQ(ierr)
>>   call PCFactorSetZeroPivot(BJac_pc_sub, solver%linear_zero_pivot_tol, ierr); CHKERRQ(ierr)
>> end do
>> deallocate(sub_ksps)
>> nullify(sub_ksps)
>>
>> Is using PCKSP a good idea at all?
>>
>> With KSPSetUp(solver%ksp) -> FGMRES
>>
>> [0]PETSC ERROR: - Error Message --
>> [0]PETSC ERROR: Object is in wrong state
>> [0]PETSC ERROR: You requested a vector from a KSP that cannot provide one
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.5-3167-g03c0fad GIT Date: 2017-03-30 14:27:53 -0500
>> [0]PETSC ERROR: pflotran on a debug_g-6.2 named mother by pujjad Tue Jun 27 16:55:14 2017
>> [0]PETSC ERROR: Configure options --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes --download-eigen=yes --download-hypre=yes --download-superlu_dist=yes --download-superlu=yes --with-cc=gcc-6 --with-cxx=g++-6 --with-fc=gfortran-6 PETSC_ARCH=debug_g-6.2 PETSC_DIR=/home/pujjad/Repositories/petsc
>> [0]PETSC ERROR: #1 KSPCreateVecs() line 939 in
[petsc-users] PCCOMPOSITE with PCBJACOBI
Dear PETSc folks,

I want a Block Jacobi PC to be the second PC in a two-stage preconditioning scheme implemented via multiplicative PCCOMPOSITE, with the outermost KSP an FGMRES.

However, PCBJacobiGetSubKSP (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCBJacobiGetSubKSP.html#PCBJacobiGetSubKSP) requires KSPSetUp (or PCSetUp) to be called first on its parent KSP, which I am struggling to do. I wonder which KSP (or, if so, PC) that is.

This is how I attempt to do it (using PCKSP to provide a parent KSP for PCBJacobiGetSubKSP):

call KSPGetPC(solver%ksp, solver%pc, ierr); CHKERRQ(ierr)
call PCSetType(solver%pc, PCCOMPOSITE, ierr); CHKERRQ(ierr)
call PCCompositeSetType(solver%pc, PC_COMPOSITE_MULTIPLICATIVE, ierr); CHKERRQ(ierr)

! 1st Stage
call PCCompositeAddPC(solver%pc, PCGALERKIN, ierr); CHKERRQ(ierr)
call PCCompositeGetPC(solver%pc, 0, T1, ierr); CHKERRQ(ierr)

! KSPPREONLY-PCNONE for testing
call PCGalerkinGetKSP(T1, Ap_ksp, ierr); CHKERRQ(ierr)
call KSPSetType(Ap_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
call KSPGetPC(Ap_ksp, Ap_pc, ierr); CHKERRQ(ierr)
call PCSetType(Ap_pc, PCNONE, ierr); CHKERRQ(ierr)

! 2nd Stage
call PCCompositeAddPC(solver%pc, PCKSP, ierr); CHKERRQ(ierr)
call PCCompositeGetPC(solver%pc, 1, T2, ierr); CHKERRQ(ierr)
call PCKSPGetKSP(T2, BJac_ksp, ierr); CHKERRQ(ierr)
call KSPSetType(BJac_ksp, KSPPREONLY, ierr); CHKERRQ(ierr)
call KSPGetPC(BJac_ksp, BJac_pc, ierr); CHKERRQ(ierr)
call PCSetType(BJac_pc, PCBJACOBI, ierr); CHKERRQ(ierr)

call KSPSetUp(solver%ksp, ierr); CHKERRQ(ierr)
! call KSPSetUp(BJac_ksp, ierr); CHKERRQ(ierr)
! call PCSetUp(T2, ierr); CHKERRQ(ierr)
! call PCSetUp(BJac_pc, ierr); CHKERRQ(ierr)

call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, PETSC_NULL_KSP, ierr); CHKERRQ(ierr)
allocate(sub_ksps(nsub_ksp))
call PCBJacobiGetSubKSP(BJac_pc, nsub_ksp, first_sub_ksp, sub_ksps, ierr); CHKERRQ(ierr)
do i = 1, nsub_ksp
  call KSPGetPC(sub_ksps(i), BJac_pc_sub, ierr); CHKERRQ(ierr)
  call PCFactorSetShiftType(BJac_pc_sub, MAT_SHIFT_INBLOCKS, ierr); CHKERRQ(ierr)
  call PCFactorSetZeroPivot(BJac_pc_sub, solver%linear_zero_pivot_tol, ierr); CHKERRQ(ierr)
end do
deallocate(sub_ksps)
nullify(sub_ksps)

Is using PCKSP a good idea at all?

With KSPSetUp(solver%ksp) -> FGMRES

[0]PETSC ERROR: - Error Message --
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: You requested a vector from a KSP that cannot provide one
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.5-3167-g03c0fad GIT Date: 2017-03-30 14:27:53 -0500
[0]PETSC ERROR: pflotran on a debug_g-6.2 named mother by pujjad Tue Jun 27 16:55:14 2017
[0]PETSC ERROR: Configure options --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes --download-eigen=yes --download-hypre=yes --download-superlu_dist=yes --download-superlu=yes --with-cc=gcc-6 --with-cxx=g++-6 --with-fc=gfortran-6 PETSC_ARCH=debug_g-6.2 PETSC_DIR=/home/pujjad/Repositories/petsc
[0]PETSC ERROR: #1 KSPCreateVecs() line 939 in /home/pujjad/Repositories/petsc/src/ksp/ksp/interface/iterativ.c
[0]PETSC ERROR: #2 KSPSetUp_GMRES() line 85 in /home/pujjad/Repositories/petsc/src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: #3 KSPSetUp_FGMRES() line 41 in /home/pujjad/Repositories/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
[0]PETSC ERROR: #4 KSPSetUp() line 338 in /home/pujjad/Repositories/petsc/src/ksp/ksp/interface/itfunc.c
application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0
[mpiexec@mother] handle_pmi_cmd (./pm/pmiserv/pmiserv_cb.c:52): Unrecognized PMI command: abort | cleaning up processes
[mpiexec@mother] control_cb (./pm/pmiserv/pmiserv_cb.c:289): unable to process PMI command
[mpiexec@mother] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@mother] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@mother] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion

With KSPSetUp(BJac_ksp) -> KSPPREONLY

[0]PETSC ERROR: - Error Message --
[0]PETSC ERROR: Arguments are incompatible
[0]PETSC ERROR: Both n and N cannot be PETSC_DECIDE; likely a call to VecSetSizes() or MatSetSizes() is wrong. See http://www.mcs.anl.gov/petsc/documentation/faq.html#split
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.5-3167-g03c0fad GIT Date: 2017-03-30 14:27:53 -0500
[0]PETSC ERROR: pflotran on a debug_g-6.2 named mother by pujjad Tue Jun 27 16:52:57 2017
[0]PETSC ERROR: Configure options --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes
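For reference, a multiplicative two-stage PCCOMPOSITE like the one being assembled above applies y = M1^(-1) r followed by y <- y + M2^(-1) (r - A y), which matches the closed form Mcpr^(-1) = M2^(-1) [ I - A M1^(-1) ] + M1^(-1) quoted elsewhere in these threads. A numpy sketch with arbitrary stand-in stages (not PCGalerkin/PCBJacobi; everything here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
A = rng.standard_normal((n, n)) + 8.0 * np.eye(n)
M1inv = np.linalg.inv(np.diag(np.diag(A)))   # stage 1 stand-in (Jacobi)
M2inv = np.linalg.inv(np.tril(A))            # stage 2 stand-in (GS sweep)

def pc_composite_multiplicative(r):
    y = M1inv @ r                  # first stage
    y = y + M2inv @ (r - A @ y)    # second stage acts on the updated residual
    return y

r = rng.standard_normal(n)
y = pc_composite_multiplicative(r)

# The same application written as a single operator:
# M^{-1} = M2^{-1} (I - A M1^{-1}) + M1^{-1}
y_closed = (M2inv @ (np.eye(n) - A @ M1inv) + M1inv) @ r
```

The two forms agree, which is why the options-database route Barry shows (-pc_composite_type multiplicative -pc_composite_pcs galerkin,bjacobi) produces the same two-stage action as the hand-wired PCCompositeAddPC calls.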
[petsc-users] Access to 'step' in 'toil_ims.f90' and 'pm_toil_ims.f90'
Hi,

Is there a way to access 'step', as used in the output log file:

Step 35 Time= 1.0E-03 Dt= 2.73903E-05 [d] snes_conv_reason: 10,

from within the subroutines 'TOilImsResidual(snes,xx,r,realization,ierr)' and 'TOilImsJacobian(snes,xx,A,B,realization,ierr)' in 'toil_ims.f90', as well as from within 'PMTOilImsCheckUpdatePre(this,line_search,X,dX,changed,ierr)' in 'pm_toil_ims.f90'?

I'd like to make sure that J, b and dx are taken from the same step. I can do that if the filenames of the exports reflect the step and the NR iteration. The problem is that, e.g., 'TOilImsResidual()' is called more often than 'TOilImsJacobian()', so a simple counter that is increased every time the subroutines are called unfortunately doesn't help.

Thank you for any pointers!
Robert
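A hedged suggestion (not tested against PFLOTRAN): rather than counting calls, tag every export with the pair (time step, Newton iteration). PETSc exposes the iteration count via SNESGetIterationNumber() on the snes argument both callbacks already receive, and the time step index would have to come from the surrounding timestepper. Residual and Jacobian dumps then pair up by filename even though the residual is evaluated more often (line search, etc.). The naming scheme, sketched in Python with hypothetical names:

```python
def export_filename(kind, step, newton_it):
    # kind is e.g. "jac", "res" or "dx"; step and newton_it come from the
    # solver state, not from a call counter.
    return f"{kind}_step{step:04d}_newton{newton_it:02d}.bin"

# Several residual evaluations inside one Newton iteration all map to the
# same tag, so J, b and dx from the same iteration share a suffix:
jac_file = export_filename("jac", 35, 3)
res_file = export_filename("res", 35, 3)
```

With this scheme the mismatch in call counts becomes harmless: extra residual evaluations simply overwrite (or can skip) a file that already carries the right tag.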
Re: [petsc-users] Control Output of '-ksp_view_mat binary'
Thank you Matt and Barry!

On 17/11/16 16:05, Barry Smith wrote:
> You can also call MatView(jac,PETSC_VIEWER_BINARY_WORLD); at the location
> where you compute the Jacobian, based on the iteration number etc.
>
>> On Nov 17, 2016, at 9:35 AM, Matthew Knepley <knep...@gmail.com> wrote:
>>
>> On Thu, Nov 17, 2016 at 8:32 AM, Robert Annewandter
>> <robert.annewand...@opengosim.com> wrote:
>>
>>> Hi,
>>>
>>> Is there a way to fine-tune '-ksp_view_mat binary' so it only gives me
>>> output at a defined time?
>>>
>>> I'm implementing a two-stage preconditioner for PFLOTRAN (a subsurface
>>> flow and transport simulator which uses PETSc). Currently I'd like to
>>> compare convergence of the preconditioning in PFLOTRAN against a
>>> prototyped version in MATLAB. In PFLOTRAN I export the Jacobian and
>>> residual before they get handed over to PETSc, and import them into
>>> MATLAB to do the two-stage preconditioning.
>>>
>>> However, I know I can do exports closer to PETSc with the options
>>> '-ksp_view_mat binary' and '-ksp_view_rhs binary'. Unfortunately I get
>>> output for the whole simulation in one file, instead of only at a
>>> well-defined Newton-Raphson iteration (as done in PFLOTRAN). Is there a
>>> way to tell PETSc with the above options when to export the Jacobian and
>>> residual?
>>>
>>> Any pointers much appreciated!
>>
>> It sounds like you could give the particular solve you want to look at a
>> prefix:
>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetOptionsPrefix.html
>> and then give -myprefix_ksp_view_mat binary
>>
>> Thanks,
>>
>> Matt
>>
>>> Thank you!
>>> Robert
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
[petsc-users] Control Output of '-ksp_view_mat binary'
Hi,

Is there a way to fine-tune '-ksp_view_mat binary' so it only gives me output at a defined time?

I'm implementing a two-stage preconditioner for PFLOTRAN (a subsurface flow and transport simulator which uses PETSc). Currently I'd like to compare convergence of the preconditioning in PFLOTRAN against a prototyped version in MATLAB. In PFLOTRAN I export the Jacobian and residual before they get handed over to PETSc, and import them into MATLAB to do the two-stage preconditioning.

However, I know I can do exports closer to PETSc with the options '-ksp_view_mat binary' and '-ksp_view_rhs binary'. Unfortunately I get output for the whole simulation in one file, instead of only at a well-defined Newton-Raphson iteration (as done in PFLOTRAN). Is there a way to tell PETSc with the above options when to export the Jacobian and residual?

Any pointers much appreciated!

Thank you!
Robert