On Mon, Sep 16, 2013 at 4:00 PM, Garth N. Wells <gn...@cam.ac.uk> wrote:

> On 16 September 2013 13:59, Johan Hake <hake....@gmail.com> wrote:
> > Interesting!
> >
> > What do you mean with dof blocking?
> >
>
> Making sure that dofs at a node are grouped together, e.g. (ux, uy and
> uz) in elasticity. The block size (which must be constant) is passed
> through to the PETSc matrix. This allows PETSc to make some
> optimisations for sparse matrix-vector products, and it dramatically
> improves some preconditioners (e.g. AMG) because they work with a much
> smaller sparsity graph and can incorporate the extra information in
> the preconditioner (e.g., in constructing aggregates).
>
>
Ok! I was confused by the naming ;) It appeared that the dofs were
blocked in some way, as in hindered :P

Johan
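
The dof blocking described in the quoted reply above can be illustrated
with a small sketch. This is only an illustration under stated
assumptions, not DOLFIN's or PETSc's implementation: the function names
and the four-node chain are hypothetical. It shows that blocked
numbering keeps the dofs of a node contiguous, and that the node-level
(block) sparsity graph has block_size**2 times fewer entries than the
scalar dof graph.

```python
def blocked_dofs(node, block_size):
    """Dofs of one node under blocked (interleaved) numbering: contiguous."""
    return [block_size * node + c for c in range(block_size)]


def unblocked_dofs(node, block_size, num_nodes):
    """Dofs of one node under component-major numbering: strided by num_nodes."""
    return [c * num_nodes + node for c in range(block_size)]


def scalar_graph(node_graph, block_size):
    """Expand a node-to-node sparsity graph into the dof-to-dof graph."""
    entries = set()
    for i, cols in node_graph.items():
        for j in cols:
            for a in blocked_dofs(i, block_size):
                for b in blocked_dofs(j, block_size):
                    entries.add((a, b))
    return entries


# Hypothetical 1D chain of 4 nodes; each node couples to itself and its
# immediate neighbours. Block size 3 stands in for (ux, uy, uz).
num_nodes, block_size = 4, 3
node_graph = {
    i: [j for j in (i - 1, i, i + 1) if 0 <= j < num_nodes]
    for i in range(num_nodes)
}

num_block_entries = sum(len(cols) for cols in node_graph.values())
num_scalar_entries = len(scalar_graph(node_graph, block_size))

print("dofs at node 1, blocked:  ", blocked_dofs(1, block_size))
print("dofs at node 1, unblocked:", unblocked_dofs(1, block_size, num_nodes))
print("block graph entries: ", num_block_entries)
print("scalar graph entries:", num_scalar_entries)
```

A preconditioner that works on the block graph sees 10 entries here
instead of 90, which is the "much smaller sparsity graph" mentioned in
the quoted text.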


> > So it looks like the SCOTCH reordering is faster to build but performs
> > similarly to the old Boost reordering?
> >
>
> Yes. Importantly, both re-ordering schemes appear to scale linearly
> with the number of dofs.
>
> Garth
>
> > Johan
> >
> >
> > On Mon, Sep 16, 2013 at 2:32 PM, Garth N. Wells <gn...@cam.ac.uk> wrote:
> >>
> >> Here are some quick numbers for anyone interested in the impact of dof
> >> re-ordering, and why DOLFIN renumbers the UFC dofmap by default. For a
> >> vector-valued reaction-diffusion problem using P2 Lagrange elements in
> >> 3D (6.44M dofs) and PETSc as the backend:
> >>
> >> Ordering method             | None  | Random* | A*    | A     | B*    | B
> >> ----------------------------|-------|---------|-------|-------|-------|-------
> >> Call graph ordering         | 0.0s  | -       | 1.2s  | 9.2s  | 2.4s  | 17.87s
> >> Assembly                    | 27.9s | 34.4s   | 29.3s | 29.4s | 30.2s | 32s
> >> 100 CG + Jacobi iterations  | 221s  | 1177s   | 142s  | 176s  | 140s  | 160s
> >>
> >>
> >> A: New SCOTCH reordering (Gibbs-Poole-Stockmeyer)
> >> B: Boost reordering (reverse Cuthill-McKee)
> >> *: With dof blocking
> >>
> >> The Krylov solver speed-up with re-ordering and dof blocking is
> >> dramatic. Speed-ups are also observed when using direct solvers, but
> >> these are less dramatic. The small drop in assembly performance when
> >> the dofs are re-ordered is due to reduced memory locality in the
> >> linear algebra objects with respect to the order in which the mesh
> >> cells are iterated over. This can be (and soon will be) fixed by
> >> re-ordering the mesh indices. It has been observed that some solvers
> >> (e.g. BoomerAMG) fail to converge for this problem without
> >> re-ordering, and converge in <10 iterations with re-ordering and dof
> >> blocking.
> >>
> >> The DOLFIN UnitFooMesh ordering is quite cache-friendly. Meshes
> >> created by 3rd-party libraries are unlikely to be as cache-friendly
> >> and could tend towards the random ordering timing.
> >>
> >> Garth
> >> _______________________________________________
> >> fenics mailing list
> >> fenics@fenicsproject.org
> >> http://fenicsproject.org/mailman/listinfo/fenics
> >
> >
>
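
To make the effect of graph re-ordering in the original message more
concrete, here is a rough sketch of a plain reverse Cuthill-McKee
ordering applied to a small grid graph. It is a textbook-style version
written for illustration, not the Boost or SCOTCH code that DOLFIN
calls. The scrambled labelling (multiplying indices by 37 modulo 64) is
an assumption that stands in for a cache-unfriendly ordering such as one
produced by a third-party mesh generator.

```python
from collections import deque


def bandwidth(graph, label):
    """Largest |label[i] - label[j]| over all edges (i, j) of the graph."""
    return max(
        (abs(label[i] - label[j]) for i in graph for j in graph[i]),
        default=0,
    )


def reverse_cuthill_mckee(graph):
    """Return a list where entry k is the old node placed at new index k."""
    degree = {v: len(graph[v]) for v in graph}
    visited = set()
    order = []
    # Start each connected component from its lowest-degree unvisited node.
    for start in sorted(graph, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Visit unvisited neighbours in order of increasing degree.
            for w in sorted(graph[v], key=lambda u: (degree[u], u)):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    order.reverse()
    return order


n = 8  # 8 x 8 grid of nodes


def scramble(i):
    """Hypothetical poor labelling: a fixed permutation of 0..n*n-1."""
    return (37 * i) % (n * n)


# Build the grid graph using the scrambled labels as node ids.
graph = {scramble(r * n + c): [] for r in range(n) for c in range(n)}
for r in range(n):
    for c in range(n):
        for dr, dc in ((1, 0), (0, 1)):
            rr, cc = r + dr, c + dc
            if rr < n and cc < n:
                a, b = scramble(r * n + c), scramble(rr * n + cc)
                graph[a].append(b)
                graph[b].append(a)

order = reverse_cuthill_mckee(graph)
new_label = {old: new for new, old in enumerate(order)}

bw_before = bandwidth(graph, {v: v for v in graph})
bw_after = bandwidth(graph, new_label)
print("bandwidth before re-ordering:", bw_before)
print("bandwidth after re-ordering: ", bw_after)
```

A smaller bandwidth means that the entries of a matrix row refer to
vector entries that lie close together in memory, which is one reason a
re-ordered system can give faster sparse matrix-vector products.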
