Here are some quick numbers for anyone interested in the impact of dof
re-ordering, and why DOLFIN renumbers the UFC dofmap by default. For a
vector-valued reaction-diffusion problem using P2 Lagrange elements in
3D (6.44M dofs) and PETSc as the backend:

Ordering method             | None  | Random* |  A*   | A     | B*    | B
----------------------------|-------|---------|-------|-------|-------|-------
Call graph ordering         | 0.0s  | -       | 1.2s  | 9.2s  | 2.4s  | 17.87s
Assembly                    | 27.9s | 34.4s   | 29.3s | 29.4s | 30.2s | 32s
100 CG + Jacobi iterations  | 221s  | 1177s   | 142s  | 176s  | 140s  | 160s


A: New SCOTCH reordering (Gibbs-Poole-Stockmeyer)
B: Boost reordering (reverse Cuthill-McKee)
*: With dof blocking
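
For readers unfamiliar with bandwidth-reducing orderings, here is a
minimal pure-Python sketch of the reverse Cuthill-McKee idea on a small
grid graph. It is not the Boost or SCOTCH implementation used for the
timings above; the helper names (grid_graph, relabel, bandwidth,
reverse_cuthill_mckee) are made up for illustration, and the start-node
choice (minimum degree) is a simplifying assumption.

```python
import random
from collections import deque


def grid_graph(nx, ny):
    """Adjacency sets for an nx-by-ny grid, nodes numbered row by row."""
    adj = {i: set() for i in range(nx * ny)}
    for y in range(ny):
        for x in range(nx):
            i = y * nx + x
            if x + 1 < nx:
                adj[i].add(i + 1)
                adj[i + 1].add(i)
            if y + 1 < ny:
                adj[i].add(i + nx)
                adj[i + nx].add(i)
    return adj


def relabel(adj, new_index):
    """Apply the map old index -> new index to an adjacency structure."""
    return {new_index[i]: {new_index[j] for j in nbrs}
            for i, nbrs in adj.items()}


def bandwidth(adj):
    """Largest |i - j| over all edges, i.e. the matrix bandwidth."""
    return max((abs(i - j) for i, nbrs in adj.items() for j in nbrs),
               default=0)


def reverse_cuthill_mckee(adj):
    """Return a map old index -> new index from a basic RCM sweep."""
    order, seen = [], set()
    # Start each connected component from a node of minimum degree.
    for start in sorted(adj, key=lambda i: (len(adj[i]), i)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            i = queue.popleft()
            order.append(i)
            # Visit unseen neighbours in order of increasing degree.
            for j in sorted(adj[i] - seen, key=lambda k: (len(adj[k]), k)):
                seen.add(j)
                queue.append(j)
    order.reverse()  # the "reverse" in reverse Cuthill-McKee
    return {old: new for new, old in enumerate(order)}


# Scramble a naturally ordered grid (mimicking the "Random" column),
# then re-order it and compare bandwidths.
rng = random.Random(0)
natural = grid_graph(20, 20)
shuffled_ids = list(range(400))
rng.shuffle(shuffled_ids)
scrambled = relabel(natural, dict(enumerate(shuffled_ids)))
perm = reverse_cuthill_mckee(scrambled)
reordered = relabel(scrambled, perm)
print(bandwidth(natural), bandwidth(scrambled), bandwidth(reordered))
```

A smaller bandwidth keeps the column indices in each matrix row close
together, which is one reason the matrix-vector products in a Krylov
solver become more cache-friendly after re-ordering.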

The Krylov solver speed-up with re-ordering and dof blocking is
dramatic: 100 CG + Jacobi iterations take 221s with no re-ordering and
about 140s with either re-ordering plus dof blocking. Speed-ups are
also observed when using direct solvers, but these are less
pronounced. The small increase in assembly time when the dofs are
re-ordered is due to reduced memory locality in the linear algebra
objects with respect to the order in which the mesh cells are iterated
over. This can be (and soon will be) fixed by re-ordering the mesh
indices. It has been observed that some solvers (e.g. BoomerAMG) fail
to converge for this problem without re-ordering, and converge in <10
iterations with re-ordering and dof blocking.
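
To make "dof blocking" concrete, here is a small sketch under the
assumption that it means numbering all vector components at a node
contiguously, rather than numbering one whole component after another.
The function names are hypothetical and this is not DOLFIN code.

```python
def component_major(node, comp, n_nodes, n_comp):
    """All dofs of component 0 first, then component 1, and so on."""
    return comp * n_nodes + node


def blocked(node, comp, n_nodes, n_comp):
    """Dofs of the same node stored contiguously (interleaved components)."""
    return node * n_comp + comp


def node_span(index_of, node, n_nodes, n_comp):
    """Distance between the first and last dof index belonging to one node."""
    idx = [index_of(node, c, n_nodes, n_comp) for c in range(n_comp)]
    return max(idx) - min(idx)


n_nodes, n_comp = 1000, 3
print(node_span(component_major, 7, n_nodes, n_comp))  # 2000
print(node_span(blocked, 7, n_nodes, n_comp))          # 2
```

With blocking, the coupled components at a node sit next to each other
in memory, and a backend can store small dense blocks instead of
individual scalar entries.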

The DOLFIN UnitFooMesh ordering is quite cache-friendly. Meshes
created by 3rd-party libraries are unlikely to be as cache-friendly
and could tend towards the random-ordering timings.

Garth
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
