[petsc-users] Slow convergence while parallel computations.

Наздрачёв Виктор Wed, 01 Sep 2021 01:43:14 -0700

Dear all,

I have a 3D elasticity problem with heterogeneous properties. The grid is
unstructured, with cell aspect ratios varying from 4 to 25. Zero Dirichlet
BCs are imposed on the bottom face of the mesh, and Neumann (traction) BCs
are imposed on the side faces. Gravity load is also accounted for. The grid
consists of 500k cells (approximately 1.6M DOFs).


The best performance and memory usage for a single MPI process was obtained
with the HPDDM (BFBCG) solver and bjacobi + ICC(1) in subdomains as the
preconditioner: it took 1 m 45 s and 5.0 GB of RAM. A parallel computation
with 4 MPI processes took 2 m 46 s and 5.6 GB of RAM. This is because the
number of iterations required to achieve the same tolerance increases
significantly.

I've also tried the PCGAMG (agg) preconditioner with an ICC(1)
sub-preconditioner. For a single MPI process, the calculation took 10 min
and 3.4 GB of RAM. To improve the convergence rate, the rigid-body modes
were attached as the near nullspace using the MatNullSpaceCreateRigidBody
and MatSetNearNullSpace subroutines. This reduced the calculation time to
3 m 58 s, using 4.3 GB of RAM. However, there is a peak in memory usage of
14.1 GB, which appears just before the start of the iterations. A parallel
computation with 4 MPI processes took 2 m 53 s and 8.4 GB of RAM. In that
case the peak memory usage is about 22 GB.
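
To illustrate what is attached as the near nullspace here: for 3D elasticity
the rigid-body modes are three translations and three infinitesimal
rotations, evaluated at the node coordinates. The following is a minimal,
PETSc-independent sketch of those six vectors. The function name and the
array layout are hypothetical, chosen only for illustration;
MatNullSpaceCreateRigidBody builds the modes from a coordinate Vec and also
orthonormalizes them, which this sketch does not do.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a PETSc function): fill the six rigid-body
 * modes of a 3D displacement field.
 *
 * coords: 3*n interleaved node coordinates (x0, y0, z0, x1, y1, z1, ...)
 * modes:  6 vectors of length 3*n, stored one after another, so that
 *         component c of node i in mode m is modes[m*3*n + 3*i + c]
 *
 * Modes 0..2 are unit translations along x, y, z.
 * Modes 3..5 are rotations about x, y, z, i.e. u = e_axis x r.
 * The modes are NOT orthonormalized here.
 */
void rigid_body_modes_3d(size_t n, const double *coords, double *modes)
{
    const size_t ndof = 3 * n;

    for (size_t i = 0; i < n; ++i) {
        const double x = coords[3 * i + 0];
        const double y = coords[3 * i + 1];
        const double z = coords[3 * i + 2];

        /* Translations: mode m moves every node by the unit vector e_m. */
        for (size_t m = 0; m < 3; ++m) {
            for (size_t c = 0; c < 3; ++c) {
                modes[m * ndof + 3 * i + c] = (m == c) ? 1.0 : 0.0;
            }
        }

        /* Rotation about x: e_x cross r = (0, -z, y). */
        modes[3 * ndof + 3 * i + 0] = 0.0;
        modes[3 * ndof + 3 * i + 1] = -z;
        modes[3 * ndof + 3 * i + 2] = y;

        /* Rotation about y: e_y cross r = (z, 0, -x). */
        modes[4 * ndof + 3 * i + 0] = z;
        modes[4 * ndof + 3 * i + 1] = 0.0;
        modes[4 * ndof + 3 * i + 2] = -x;

        /* Rotation about z: e_z cross r = (-y, x, 0). */
        modes[5 * ndof + 3 * i + 0] = -y;
        modes[5 * ndof + 3 * i + 1] = x;
        modes[5 * ndof + 3 * i + 2] = 0.0;
    }
}
```

A displacement field of this form produces no strain, so the unconstrained
elasticity operator maps it to zero. This is why GAMG benefits from being
told about these vectors when building its coarse spaces.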



Are there ways to avoid the loss of convergence rate for the bjacobi
preconditioner in parallel mode? Does it make sense to use hierarchical or
nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and
some sub-preconditioner (for example, sub_pc_type bjacobi)?
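
For concreteness, a nested configuration of the kind asked about could look
like the options file below. This is only a sketch under stated assumptions:
GMRES is a KSP type, so it is selected with -sub_ksp_type rather than
-sub_pc_type; the inner iteration limit of 5 and the ICC(1) inner
preconditioner are arbitrary illustrative choices; and the outer solver is
switched to FGMRES because an inner Krylov solve makes the preconditioner
vary between iterations, which requires a flexible outer method.

```
# Assumed nested setup: flexible outer Krylov, inner GMRES on each block
-ksp_type fgmres
-pc_type bjacobi
-sub_ksp_type gmres
-sub_ksp_max_it 5
-sub_pc_type icc
-sub_pc_factor_levels 1
```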



Is this peak memory usage expected for the gamg preconditioner? Is there any
way to reduce it?



What advice would you give to improve the convergence rate with multiple
MPI processes while keeping memory consumption reasonable?



Kind regards,

Viktor Nazdrachev

R&D senior researcher

Geosteering Technologies LLC
