I just got the distributed mesh down to 4 times larger than the serial mesh! There is one more std::vector in MeshConnectivity to clear (_num_global_connections).
I know there are lots of additional requirements for the parallel mesh, but yesterday the numbers were crazy. I'll dig some more.

M

On Jan 13, 2014, at 3:31 PM, Garth N. Wells wrote:

> On 2014-01-13 14:25, Mikael Mortensen wrote:
>> Great!
>>
>> But I think there are still some related issues with a parallel mesh.
>> Using 2 CPUs and a smaller UnitCubeMesh(40, 40, 40), I find that the
>> distributed mesh is now 13 times larger than the serial mesh
>> (yesterday it was 20). And mesh.clean() only reduces the size to a
>> factor of 9.
>
> Dig around to see where the memory is going. Compare both total and
> relative memory usage. Relative usage is misleading because if my
> change improved some local aspects, then the relative usage will look
> worse even if total memory usage is reduced.
>
> In parallel, facets will be built automatically (they are used in
> figuring out dof ownership). There is also extra storage for global
> indices.
>
> Garth
>
>> What should be a reasonable size of the distributed mesh anyway,
>> compared with the serial mesh?
>>
>> Mikael
>>
>> On Jan 13, 2014, at 1:45 PM, Garth N. Wells wrote:
>>
>>> I've just pushed some changes to master, which for
>>>
>>>     from dolfin import *
>>>     mesh = UnitCubeMesh(128, 128, 128)
>>>     mesh.init(2)
>>>     mesh.init(2, 3)
>>>
>>> give a factor 2 reduction in memory usage and a factor 2 speedup.
>>> The change is at
>>> https://bitbucket.org/fenics-project/dolfin/commits/8265008.
>>>
>>> The improvement is primarily due to d--d connectivity not being
>>> computed. I'll submit a pull request to throw an error when d--d
>>> connectivity is requested. The only remaining place where d--d is
>>> (implicitly) computed is in the code for finding constrained mesh
>>> entities (e.g., for periodic bcs). The code in question is
>>>
>>>     for (MeshEntityIterator e(facet, dim); !e.end(); ++e)
>>>
>>> when dim is the same as the topological dimension of the facet.
>>> As in other places, it would be consistent (and the original
>>> intention of the programmer) if this just iterated over the facet
>>> itself rather than over all facets attached to it via a vertex.
>>>
>>> Garth
>>>
>>> On 2014-01-08 17:00, Garth N. Wells wrote:
>>>> On 2014-01-08 16:48, Anders Logg wrote:
>>>>> On Wed, Jan 08, 2014 at 04:41:17PM +0000, Garth N. Wells wrote:
>>>>>> On 2014-01-08 16:17, Anders Logg wrote:
>>>>>>> On Wed, Jan 08, 2014 at 04:07:35PM +0000, Garth N. Wells wrote:
>>>>>>>> On 2014-01-08 15:21, Anders Logg wrote:
>>>>>>>>> I would claim that the connectivity is stored very efficiently.
>>>>>>>>> The simplest example is D--0 connectivity for a mesh, which is,
>>>>>>>>> for each cell, which vertices belong to it. That data is stored
>>>>>>>>> as one big array of integers, something like
>>>>>>>>>
>>>>>>>>>     0 1 2 3 0 1 2 4 ...
>>>>>>>>>
>>>>>>>>> which means that tetrahedron 0 has vertices 0 1 2 3,
>>>>>>>>> tetrahedron 1 has vertices 0 1 2 4, etc.
>>>>>>>>>
>>>>>>>>> There are D + 1 different kinds of entities in a mesh. For a
>>>>>>>>> tetrahedron we have vertex, edge, face, cell, and so the total
>>>>>>>>> number of connectivities that may be computed is
>>>>>>>>> (D + 1)*(D + 1). Not all of these are needed, but they can all
>>>>>>>>> be computed.
>>>>>>>>>
>>>>>>>>> For solving Poisson's equation, we first need D--0. To see
>>>>>>>>> which connectivities are computed, one can do info(mesh, True)
>>>>>>>>> to print the following little table:
>>>>>>>>>
>>>>>>>>>       0 1 2 3
>>>>>>>>>     0 - - - -
>>>>>>>>>     1 - - - -
>>>>>>>>>     2 - - - -
>>>>>>>>>     3 x - - -
>>>>>>>>>
>>>>>>>>> To set boundary conditions, one needs to create the faces of
>>>>>>>>> the mesh (which are not stored a priori).
>>>>>>>>> One will then get the following connectivities:
>>>>>>>>>
>>>>>>>>>       0 1 2 3
>>>>>>>>>     0 - - - x
>>>>>>>>>     1 - - - -
>>>>>>>>>     2 x - - -
>>>>>>>>>     3 x - x x
>>>>>>>>>
>>>>>>>>> Here we see the following connectivities:
>>>>>>>>>
>>>>>>>>>     3--0 = the vertices of all cells = the cells
>>>>>>>>>     0--3 = the cells connected to all vertices
>>>>>>>>>     3--3 = the cells connected to all cells (via vertices)
>>>>>>>>>     2--0 = the vertices of all faces = the faces
>>>>>>>>>     3--2 = the faces of all cells
>>>>>>>>>
>>>>>>>>> These get computed in order:
>>>>>>>>>
>>>>>>>>>     0--3 is computed from 3--0 (by "inversion")
>>>>>>>>>     3--3 is computed from 3--0 and 0--3
>>>>>>>>>     2--0 and 3--2 are computed from 3--3 by a local search
>>>>>>>>>
>>>>>>>>> 3--3 is needed in order for a cell to communicate with its
>>>>>>>>> neighboring cells to decide on how to number the common face
>>>>>>>>> (if any).
>>>>>>>>
>>>>>>>> 3--3 connectivity is *undefined*.
>>>>>>>
>>>>>>> Yes, in general, but we have *chosen* to define it as 3--0--3.
>>>>>>
>>>>>> Which I think we should attempt to remove.
>>>>>>
>>>>>>> The reason for this choice is that we start with 3--0, from
>>>>>>> which we can easily compute 3--0--3.
>>>>>>>
>>>>>>> In other words, we start with cell-vertex data, and hence the
>>>>>>> only way for two cells to figure out whether they share common
>>>>>>> entities (for example, faces) is to go via the vertices. Unless
>>>>>>> you have some other brilliant idea.
>>>>>>
>>>>>> The function GraphBuilder::compute_local_dual_graph does this. It
>>>>>> loops over cells and builds a hashed map (boost::unordered_map)
>>>>>> from a (sorted) vector of the facet vertices to the cell index.
>>>>>> If the key (the facet vertices) is already in the map, then we
>>>>>> know a cell that the current cell shares a facet with. If the key
>>>>>> is not already in the map, then we add it. With appropriate
>>>>>> hashing, the unordered_map look-ups are O(1).
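[Editor's note: the facet-hashing idea described above can be sketched in a few lines. The following is an illustrative pure-Python translation, not DOLFIN's actual GraphBuilder::compute_local_dual_graph implementation; a plain dict stands in for boost::unordered_map, and the function name and signature are assumptions for the sketch. Each facet of a tetrahedron is the sorted tuple of the cell's vertices with one vertex removed; the second cell to produce a given facet key is a neighbour of the first.]

```python
# Sketch of dual-graph construction via facet hashing (illustrative,
# not DOLFIN code). A dict plays the role of boost::unordered_map,
# mapping sorted facet vertices -> the first cell seen with that facet.

def compute_local_dual_graph(cells):
    """cells: list of vertex tuples, one per cell.
    Returns the dual graph as a list of neighbour sets."""
    facet_to_cell = {}
    graph = [set() for _ in cells]
    for c, vertices in enumerate(cells):
        for i in range(len(vertices)):
            # Facet i of the cell: all vertices except vertex i, sorted
            facet = tuple(sorted(vertices[:i] + vertices[i + 1:]))
            other = facet_to_cell.pop(facet, None)
            if other is None:
                facet_to_cell[facet] = c        # first time this facet is seen
            else:
                graph[c].add(other)             # shared facet => dual-graph edge
                graph[other].add(c)
    return graph

# Two tetrahedra sharing the facet (0, 1, 2)
print(compute_local_dual_graph([(0, 1, 2, 3), (0, 1, 2, 4)]))  # -> [{1}, {0}]
```

With hashing, each facet is looked up in average O(1) time, so the whole construction is linear in the number of cell facets, exactly the property argued for above.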
>>>>> Now that's a brilliant idea. Why is this not already the default?
>>>>
>>>> Reluctance to touch the beautifully crafted mesh library? :) I'll
>>>> add it alongside the current code so that we can test it for speed.
>>>>
>>>> It's used in GraphBuilder to construct a dual graph (cell-facet-cell
>>>> connections) to feed to a graph partitioner, because in GraphBuilder
>>>> we don't yet have a Mesh object - just cell connectivity data.
>>>>
>>>>> Application of boundary conditions can bypass
>>>>> TopologyComputation.cpp and build the facets using the dual graph,
>>>>> and it would not be in conflict with TopologyComputation.cpp
>>>>> (which may still be useful for computing other connectivities).
>>>>
>>>> The approach in GraphBuilder::compute_local_dual_graph could be
>>>> used for other connection types.
>>>>
>>>> Garth
>>>>
>>>>> --
>>>>> Anders
>>>>
>>>> _______________________________________________
>>>> fenics mailing list
>>>> [email protected]
>>>> http://fenicsproject.org/mailman/listinfo/fenics
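[Editor's note: to make the connectivity discussion in the thread concrete, here is a minimal, self-contained sketch in pure Python. The names and layout are illustrative assumptions, not DOLFIN's MeshConnectivity or TopologyComputation API. It stores 3--0 connectivity as one big flat integer array as Anders describes, computes 0--3 by "inversion", and enumerates the distinct facets (2--0) directly from the cell-vertex data.]

```python
# Illustrative sketch only - not DOLFIN code.

# 3--0 connectivity for two tetrahedra, stored as one big array of
# integers: tet 0 has vertices 0 1 2 3, tet 1 has vertices 0 1 2 4.
flat_3_0 = [0, 1, 2, 3, 0, 1, 2, 4]
VERTICES_PER_CELL = 4
num_cells = len(flat_3_0) // VERTICES_PER_CELL
num_vertices = max(flat_3_0) + 1

def cell_vertices(c):
    """Vertices of cell c, read straight out of the flat 3--0 array."""
    return flat_3_0[VERTICES_PER_CELL * c : VERTICES_PER_CELL * (c + 1)]

# 0--3 is computed from 3--0 by "inversion": scatter each cell index to
# every vertex it contains.
conn_0_3 = [[] for _ in range(num_vertices)]
for c in range(num_cells):
    for v in cell_vertices(c):
        conn_0_3[v].append(c)

# 2--0: each facet of a tetrahedron is the cell's vertex set with one
# vertex removed; collect each distinct (sorted) facet exactly once.
facets = sorted({tuple(sorted(set(cell_vertices(c)) - {v}))
                 for c in range(num_cells) for v in cell_vertices(c)})

print(conn_0_3)     # -> [[0, 1], [0, 1], [0, 1], [0], [1]]
print(len(facets))  # -> 7 (two tets with 4 facets each, sharing one)
```

Note that the facet enumeration here bypasses the 3--3 step entirely, in the spirit of the suggestion above to build facets without going through TopologyComputation.cpp.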
