Dear all, I am solving a Stokes problem in 3D aorta geometries, using a P2/P1 finite elements discretization on tetrahedral meshes resulting in ~1-1.5M DOFs. Viscosity is uniform (can be adjusted arbitrarily), and the right hand side is a function of noisy measurement data.
In other settings of "standard" Stokes flow problems I have obtained good convergence with an "upper" Schur complement preconditioner, using AMG (ML or Hypre) on the velocity block and approximating the Schur complement matrix by the diagonal of the pressure mass matrix: -ksp_converged_reason -ksp_monitor_true_residual -ksp_initial_guess_nonzero -ksp_diagonal_scale -ksp_diagonal_scale_fix -ksp_type fgmres -ksp_rtol 1.0e-8 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_schur_fact_type upper -pc_fieldsplit_schur_precondition user # <-- pressure mass matrix -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ml -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type jacobi In my present case this setup gives rather slow convergence (varies for different geometries between 200-500 or several thousands!). I obtain better convergence with "-pc_fieldsplit_schur_precondition selfp"and using multigrid on S, with "-fieldsplit_1_pc_type ml" (I don't think this is optimal, though). I don't understand why the pressure mass matrix approach performs so poorly and wonder what I could try to improve the convergence. Until now I have been using ML and Hypre BoomerAMG mostly with default parameters. Surely they can be improved by tuning some parameters. Which could be a good starting point? Are there other options I should consider? With the above setup (jacobi) for a case that works better than others, the KSP terminates with 467 KSP unpreconditioned resid norm 2.072014323515e-09 true resid norm 2.072014322600e-09 ||r(i)||/||b|| 9.939098100674e-09 You can find the output of -ksp_view below. Let me know if you need more details. Thanks in advance for your advice! Best wishes David KSP Object: 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-08, absolute=1e-50, divergence=10000. right preconditioning diagonally scaled system using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization UPPER Preconditioner for the Schur complement formed from user provided matrix Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ml MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3, cols=3 package used to perform factorization: petsc total: nonzeros=3, allocated nonzeros=3 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=3, cols=3 total: nonzeros=3, allocated nonzeros=3 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=15, cols=15 total: nonzeros=69, allocated nonzeros=69 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=304, cols=304 total: nonzeros=7354, allocated nonzeros=7354 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=30236, cols=30236 total: nonzeros=2730644, allocated nonzeros=2730644 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=894132, cols=894132 total: nonzeros=70684164, allocated nonzeros=70684164 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=894132, cols=894132 total: nonzeros=70684164, allocated nonzeros=70684164 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=42025, cols=42025 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=42025, cols=42025 total: nonzeros=554063, allocated nonzeros=554063 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=42025, cols=894132 total: nonzeros=6850107, allocated nonzeros=6850107 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ml MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_coarse_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3, cols=3 package used to perform factorization: petsc total: nonzeros=3, allocated nonzeros=3 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=3, cols=3 total: nonzeros=3, allocated nonzeros=3 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=15, cols=15 total: nonzeros=69, allocated nonzeros=69 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=304, cols=304 total: nonzeros=7354, allocated nonzeros=7354 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=30236, cols=30236 total: nonzeros=2730644, allocated nonzeros=2730644 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (fieldsplit_0_mg_levels_4_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=894132, cols=894132 total: nonzeros=70684164, allocated nonzeros=70684164 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=894132, cols=894132 total: nonzeros=70684164, allocated nonzeros=70684164 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: 1 MPI processes type: seqaij rows=894132, cols=42025 total: nonzeros=6850107, allocated nonzeros=6850107 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: 1 MPI processes type: seqaij rows=42025, cols=42025 total: nonzeros=554063, allocated nonzeros=554063 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=936157, cols=936157 total: nonzeros=84938441, allocated nonzeros=84938441 total number of mallocs used during MatSetValues calls =0 not using I-node routines