On Mon, Sep 16, 2013 at 4:00 PM, Garth N. Wells <gn...@cam.ac.uk> wrote:
> On 16 September 2013 13:59, Johan Hake <hake....@gmail.com> wrote:
> > Interesting!
> >
> > What do you mean by dof blocking?
> >
>
> Making sure that dofs at a node are grouped together, e.g. (ux, uy and
> uz) in elasticity. The block size (which must be constant) is passed
> through to the PETSc matrix. This allows PETSc to make some
> optimisations for sparse matrix-vector products, and it dramatically
> improves some preconditioners (e.g. AMG) because they work with a much
> smaller sparsity graph and can incorporate the extra information in
> the preconditioner (e.g., in constructing aggregates).
>

Ok! I was confused by the naming ;) It appeared that the dofs were blocked
in some way, as in hindered :P

Johan

> > So it looks like the SCOTCH reordering is faster to build but performs
> > similarly to the old Boost reordering?
> >
>
> Yes. Importantly, both re-ordering schemes appear to scale linearly
> with the number of dofs.
>
> Garth
>
> > Johan
> >
> >
> > On Mon, Sep 16, 2013 at 2:32 PM, Garth N. Wells <gn...@cam.ac.uk> wrote:
> >>
> >> Here are some quick numbers for anyone interested in the impact of dof
> >> re-ordering, and why DOLFIN renumbers the UFC dofmap by default. For a
> >> vector-valued reaction-diffusion problem using P2 Lagrange elements in
> >> 3D (6.44M dofs) and PETSc as the backend:
> >>
> >> Ordering method             | None  | Random* | A*    | A     | B*    | B
> >> ----------------------------|-------|---------|-------|-------|-------|-------
> >> Call graph ordering         | 0.0s  | -       | 1.2s  | 9.2s  | 2.4s  | 17.87s
> >> Assembly                    | 27.9s | 34.4s   | 29.3s | 29.4s | 30.2s | 32s
> >> 100 CG + Jacobi iterations  | 221s  | 1177s   | 142s  | 176s  | 140s  | 160s
> >>
> >> A: New SCOTCH reordering (Gibbs-Poole-Stockmeyer)
> >> B: Boost reordering (reverse Cuthill-McKee)
> >> *: With dof blocking
> >>
> >> The Krylov solver speed-up with re-ordering and dof blocking is
> >> dramatic. Speed-ups are also observed when using direct solvers, but
> >> these are less dramatic. The small drop in assembly performance when
> >> the dofs are re-ordered is due to reduced memory locality in the linear
> >> algebra objects with respect to the order in which the mesh cells are
> >> iterated over. This can be (and soon will be) fixed by re-ordering the
> >> mesh indices. It has been observed that some solvers (e.g. BoomerAMG)
> >> fail to converge for this problem without re-ordering, and converge in
> >> <10 iterations with re-ordering and dof blocking.
> >>
> >> The DOLFIN UnitFooMesh ordering is quite cache-friendly. Meshes
> >> created by 3rd-party libraries are unlikely to be as cache-friendly
> >> and could tend towards the random ordering timing.
> >>
> >> Garth
> >> _______________________________________________
> >> fenics mailing list
> >> fenics@fenicsproject.org
> >> http://fenicsproject.org/mailman/listinfo/fenics
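
For anyone else who tripped over the term: the dof blocking described above (all components at a node numbered contiguously, with a constant block size) can be sketched in a few lines of plain Python. This is only a toy illustration, not DOLFIN or PETSc code. The 1D chain of nodes, the helper names (blocked_dof, unblocked_dof, node_graph, scalar_pattern, bandwidth) and the assumption that every component at a node couples to every component at neighbouring nodes are all made up for the example.

```python
# Toy sketch of "dof blocking": NOT DOLFIN/PETSc code, all names are hypothetical.
# Assumed setup: a 1D chain of nodes with 3 dofs per node (ux, uy, uz).


def blocked_dof(node, comp, block_size):
    """Node-wise (blocked) numbering: components of one node are contiguous."""
    return node * block_size + comp


def unblocked_dof(node, comp, num_nodes):
    """Component-wise numbering: all ux first, then all uy, then all uz."""
    return comp * num_nodes + node


def node_graph(num_nodes):
    """Node-to-node couplings of the chain, including each node with itself."""
    edges = set()
    for i in range(num_nodes):
        for j in (i - 1, i, i + 1):
            if 0 <= j < num_nodes:
                edges.add((i, j))
    return edges


def scalar_pattern(edges, block_size, numbering):
    """Expand the node graph into a scalar sparsity pattern, assuming every
    component at node i couples to every component at node j."""
    return {
        (numbering(i, a), numbering(j, b))
        for (i, j) in edges
        for a in range(block_size)
        for b in range(block_size)
    }


def bandwidth(pattern):
    """Largest distance of a nonzero entry from the diagonal."""
    return max(abs(row - col) for row, col in pattern)


num_nodes, bs = 6, 3
edges = node_graph(num_nodes)
blocked = scalar_pattern(edges, bs, lambda n, c: blocked_dof(n, c, bs))
unblocked = scalar_pattern(edges, bs, lambda n, c: unblocked_dof(n, c, num_nodes))

print("block graph entries :", len(edges))
print("scalar entries      :", len(blocked))
print("bandwidth, blocked  :", bandwidth(blocked))
print("bandwidth, unblocked:", bandwidth(unblocked))
```

For this toy chain the block graph has 16 entries against 144 scalar entries (a factor of bs*bs = 9), which is the "much smaller sparsity graph" a block-aware preconditioner gets to work with. The bandwidth is 5 with the node-wise numbering and 13 with the component-wise one, a rough hint at why grouping the dofs of a node also helps memory locality.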