> If I create my grid using GridGenerator::hyper_rectangle(...), and then
> proceed to uniformly refine the mesh 4 times, I get a more-or-less optimal
> partitioning.  That is, the grid is partitioned into nice rectilinear
> chunks that minimize the surface-area-to-volume ratio of each processor
> block.  If, instead, I use GridGenerator::subdivided_hyper_rectangle(...)
> to generate the mesh, assigning 16 subdivisions in each direction, I end
> up with an identical mesh but a very different partitioning (see attached
> images).  The partitioning seems to use a strip method that is far from
> optimal.  I have noticed similar behavior if I read in an external mesh
> file.  The distinction seems to be whether the coarsest mesh starts with 1
> element (the hyper_rectangle strategy) or many elements (the
> subdivided_hyper_rectange or external mesh strategy).
> 
> Is this behavior typical, and is there anyway to improve the coarsest level
> partitioning of the mesh?

Good question.

Here's what's happening: the partitioning is done using a space filling curve 
-- all cells are arranged along the one-dimensional curve and then you 
partition the mesh by cutting the space filling curve into segments with 
equally many elements.

If there is only one coarse mesh cell, this leads to fairly good -- though by 
no means optimal -- partitions. They are very cheap to compute, however, which 
makes this approach attractive when you have a large number of processors. The 
p4est paper goes into more detail as to the reasons.

If there are more than one coarse mesh cell, the question becomes a bit more 
interesting. Within each coarse cell, the path the space filling curve takes 
is clear, but we have to somehow order the coarse mesh cells along this line 
as well. p4est itself just uses the order it is given, which may mean that 
some processors have cells that are children of coarse mesh cells that are far 
apart. To make this less likely, deal.II sorts the coarse mesh cells by a 
Cuthill-McKee algorithm, which is why you see the concentric shells: Cuthill-
Mckee orders the cells one shell at a time.

I think this algorithm makes things a bit better, but it's certainly still not 
good. I think it's also not bad once you have many more processors than coarse 
mesh cells -- or, alternatively, if you have many more coarse mesh cells than 
processors. The only case that's problematic is when the number of processors 
is roughly equal or slightly less than the number of coarse mesh cells, which 
I think is the case you run into.

If you have ideas how to improve this all, I'd be interested in working on 
this with you!

Best
 W.

-------------------------------------------------------------------------
Wolfgang Bangerth                email:            [email protected]
                                 www: http://www.math.tamu.edu/~bangerth/
_______________________________________________
dealii mailing list http://poisson.dealii.org/mailman/listinfo/dealii

Reply via email to