Here are some quick numbers for anyone interested in the impact of dof re-ordering, and why DOLFIN renumbers the UFC dofmap by default. For a vector-valued reaction-diffusion problem using P2 Lagrange elements in 3D (6.44M dofs) and PETSc as the backend:
Ordering method             | None  | Random* | A*    | A     | B*    | B
----------------------------|-------|---------|-------|-------|-------|-------
Call graph ordering         | 0.0s  | -       | 1.2s  | 9.2s  | 2.4s  | 17.87s
Assembly                    | 27.9s | 34.4s   | 29.3s | 29.4s | 30.2s | 32s
100 CG + Jacobi iterations  | 221s  | 1177s   | 142s  | 176s  | 140s  | 160s

A: New SCOTCH reordering (Gibbs-Poole-Stockmeyer)
B: Boost reordering (reverse Cuthill-McKee)
*: With dof blocking

The Krylov solver speed-up with re-ordering and dof blocking is dramatic: 100 CG + Jacobi iterations take 221s with no re-ordering and 140s to 142s with re-ordering and dof blocking, while the random ordering takes 1177s. Speed-ups are also observed when using direct solvers, but these are less dramatic.

The small drop in assembly performance when the dofs are re-ordered is due to reduced memory locality in the linear algebra objects with respect to the order in which the mesh cells are iterated over. This can be (and soon will be) fixed by re-ordering the mesh indices.

It has been observed that some solvers (e.g. BoomerAMG) fail to converge for this problem without re-ordering, and converge in <10 iterations with re-ordering and dof blocking.

The DOLFIN UnitFooMesh ordering is quite cache-friendly. Meshes created by 3rd-party libraries are unlikely to be as cache-friendly and could tend towards the random ordering timing.

Garth
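For readers who want to see what a re-ordering does to the sparsity pattern, here is a minimal pure-Python sketch of reverse Cuthill-McKee, the algorithm named under option B. It is not the Boost, SCOTCH or DOLFIN implementation, and all function names are made up for illustration. A randomly numbered 20x20 grid graph stands in for the "Random" column, and the matrix bandwidth (largest index distance between coupled dofs) is used as a rough proxy for memory locality; it is an assumption of this sketch that bandwidth is a reasonable stand-in for the cache behaviour behind the timings above.

```python
import random
from collections import deque


def grid_graph(nx, ny):
    """Adjacency sets of an nx-by-ny grid (5-point stencil sparsity)."""
    adj = {v: set() for v in range(nx * ny)}
    for j in range(ny):
        for i in range(nx):
            v = j * nx + i
            if i + 1 < nx:
                adj[v].add(v + 1)
                adj[v + 1].add(v)
            if j + 1 < ny:
                adj[v].add(v + nx)
                adj[v + nx].add(v)
    return adj


def relabel(adj, new_index):
    """Return the same graph with vertex v renamed to new_index[v]."""
    return {new_index[v]: {new_index[w] for w in nbrs}
            for v, nbrs in adj.items()}


def bandwidth(adj):
    """Largest |i - j| over all edges, i.e. the matrix bandwidth."""
    return max(abs(v - w) for v, nbrs in adj.items() for w in nbrs)


def reverse_cuthill_mckee(adj):
    """Map each old vertex index to its position in an RCM ordering."""
    order = []
    visited = set()
    # Start every connected component from a minimum-degree vertex.
    for start in sorted(adj, key=lambda v: (len(adj[v]), v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Breadth-first search, neighbours by increasing degree.
            for w in sorted(adj[v] - visited, key=lambda u: (len(adj[u]), u)):
                visited.add(w)
                queue.append(w)
    order.reverse()  # the "reverse" in reverse Cuthill-McKee
    return {old: new for new, old in enumerate(order)}


# Scramble the natural grid numbering with a fixed seed, then re-order it.
grid = grid_graph(20, 20)
labels = list(grid)
random.Random(0).shuffle(labels)
scrambled = relabel(grid, dict(zip(grid, labels)))
reordered = relabel(scrambled, reverse_cuthill_mckee(scrambled))

print("bandwidth, random numbering:", bandwidth(scrambled))
print("bandwidth, after RCM:       ", bandwidth(reordered))
```

The re-ordered numbering keeps coupled dofs close together in index space, which is the property the sparse matrix-vector products in the CG iterations benefit from.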
