I have had a few test runs and compared my memory usage to a Navier-Stokes 
solver from OpenFOAM. Some results thus far for a box of size 40**3 are (using 
/proc/pid/statm to measure memory usage, see 
https://github.com/mikaem/fenicstools/blob/master/fem/getMemoryUsage.cpp):

1 CPU
OpenFOAM:  324 MB RSS (443 MB VM)
FEniCS:    442 MB RSS (mesh 16 MB) + 60 MB for importing dolfin (1.0 GB VM)

2 CPUs
OpenFOAM:  447 MB RSS (1.6 GB VM)
FEniCS:    520 MB RSS (mesh 200 MB) + 120 MB for importing dolfin (1.7 GB VM)

4 CPUs
OpenFOAM:  540 MB RSS (2.0 GB VM)
FEniCS:    624 MB RSS (mesh 215 MB) + 250 MB for importing dolfin (3.0 GB VM)

This is after Garth's removal of the d--0--d connectivity, and I'm not 
attempting to clean out topology or anything after the mesh has been created. 
I'm actually very happy with these results :-) More CPUs require more memory, 
but memory use increases more for OpenFOAM than for FEniCS, and the difference 
is not all that much. Importing dolfin is a considerable part of the total 
cost, but that cost does not increase with mesh size, so it's not of much 
concern (also, it depends on how many dependencies you carry around). I added 
virtual memory in parentheses, but I'm not sure what to make of it.

Some other things I have learned (very) recently:

1) Clearing out memory at runtime is apparently not straightforward because, 
in an effort to optimize speed (allocation is slow), the allocator may choose 
to keep your previously allocated memory close at hand (in RAM). Even with the 
swap trick the operating system may actually choose to keep the allocated 
memory available and not free it (see, e.g., 
http://stackoverflow.com/questions/6526983/free-memory-from-a-long-vector). 
Apparently there is an environment variable, GLIBCXX_FORCE_NEW (read by 
libstdc++ at runtime, so not a compiler option), that is intended to disable 
this caching of memory, but I have not been able to make it work. With the 
swap trick memory is definitely freed quite efficiently, but it is frustrating 
trying to figure out where memory goes when we are not in total control of 
allocation and deallocation. One example: I use project to initialize a 
Function. During project, RSS memory increases substantially due to, e.g., a 
matrix being created behind the scenes. One would expect this memory to be 
freed after finishing, but that does not always happen. The additional memory 
may be kept in RAM and reused later. If, for example, I create a matrix just 
after project, then memory use may actually not increase at all, because the 
allocator can simply reuse the memory it did not return to the OS. I guess 
this is something one has to live with when using dynamic memory allocation? I 
really know very little about how these things work, so it would be great if 
someone who does could comment.


2) Both MeshGeometry.clear() and MeshTopology.clear() should probably make use 
of the swap trick instead of std::vector<>.clear(). At least then the memory 
*may* actually be freed when clearing the mesh.

3) Memory use for creating a parallel UnitCubeMesh(40, 40, 40) goes like this:

  16  MB in creating the serial mesh
  56  MB in creating LocalMeshData class 
  41  MB in creating cell partition in MeshPartitioning::build_distributed_mesh
  101 MB in creating mesh from LocalMeshData and cell partitioning in 
MeshPartitioning::build

No memory is freed after the mesh has been created, even though it looks to me 
like LocalMeshData goes out of scope?


Mikael

PS Not really relevant to this test, but my NS solver is 20% faster than 
OpenFOAM for this channel-flow test case :-)


Den Jan 15, 2014 kl. 10:48 AM skrev Garth N. Wells:

> On 2014-01-15 07:13, Anders Logg wrote:
>> On Tue, Jan 14, 2014 at 08:08:50PM +0000, Garth N. Wells wrote:
>>> On 2014-01-14 19:28, Anders Logg wrote:
>>> >On Tue, Jan 14, 2014 at 05:19:55PM +0000, Garth N. Wells wrote:
>>> >>On 2014-01-14 16:24, Anders Logg wrote:
>>> >>>On Mon, Jan 13, 2014 at 07:16:01PM +0000, Garth N. Wells wrote:
>>> >>>>On 2014-01-13 18:42, Anders Logg wrote:
>>> >>>>>On Mon, Jan 13, 2014 at 12:45:11PM +0000, Garth N. Wells wrote:
>>> >>>>>>I've just pushed some changes to master, which for
>>> >>>>>>
>>> >>>>>>    from dolfin import *
>>> >>>>>>    mesh = UnitCubeMesh(128, 128, 128)
>>> >>>>>>    mesh.init(2)
>>> >>>>>>    mesh.init(2, 3)
>>> >>>>>>
>>> >>>>>>give a factor 2 reduction in memory usage and a factor 2 speedup.
>>> >>>>>>Change is at
>>> >>>>>>https://bitbucket.org/fenics-project/dolfin/commits/8265008.
>>> >>>>>>
>>> >>>>>>The improvement is primarily due to d--d connectivity not being
>>> >>>>>>computed. I'll submit a pull request to throw an error when d--d
>>> >>>>>>connectivity is requested. The only remaining place where d--d is
>>> >>>>>>(implicitly) computed is in the code for finding constrained mesh
>>> >>>>>>entities (e.g., for periodic bcs). The code in question is
>>> >>>>>>
>>> >>>>>>    for (MeshEntityIterator e(facet, dim); . . . .)
>>> >>>>>>
>>> >>>>>>when dim is the same as the topological dimension of the facet. As
>>> >>>>>>in other places, it would be consistent (and the original intention
>>> >>>>>>of the programmer) if this just iterated over the facet itself
>>> >>>>>>rather than all facets attached to it via a vertex.
>>> >>>>>
>>> >>>>>I don't see why an error message is needed. Could we not just add the
>>> >>>>>possibility to specify what d -- d means? It might be useful for other
>>> >>>>>algorithms to be able to compute that data.
>>> >>>>>
>>> >>>>
>>> >>>>I think it should be removed because it's (i) ad-hoc, (ii) is not
>>> >>>>used/required in the library and (iii) is a memory monster.
>>> >>>>Moreover, we have dimension-independent algorithms that work when
>>> >>>>d--d connectivity is a connection from an entity to itself (for
>>> >>>>which we don't need computation and memory eating).  We shouldn't
>>> >>>>have an unnecessary, memory monster d-0-d data structure being
>>> >>>>created opaquely for no purpose, which is what
>>> >>>>
>>> >>>>    for (MeshEntityIterator e(entity, dim); . . . .){....}
>>> >>>>
>>> >>>>does at present when (dimension of entity) = dim. The excessive
>>> >>>>memory usage is an issue for big problems.
>>> >>>>
>>> >>>>If a user wants d-0-d it can be built explicitly, which makes both
>>> >>>>the purpose and the intention clear.
>>> >>>
>>> >>>How can it be built explicitly without calling mesh.init(d, d)?
>>> >>>
>>> >>
>>> >>Build d--0, then 0--d:
>>> >>
>>> >>  for (MeshEntityIterator e0(mesh, d); .....)
>>> >>    for (VertexIterator v(*e0); ....)
>>> >>      for (MeshEntityIterator e2(*v, d); .....)
>>> >
>>> >Yes, that's even more explicit, but since we already have a function
>>> >that computes exactly that, I don't see the urge to remove it.
>>> >
>>> >>Note that d--0--d is no longer used in DOLFIN (except by accident in
>>> >>finding periodic bcs because of inconsistent behind-the-scenes
>>> >>behaviour, and it should be changed because it's expensive and not
>>> >>required for periodic bcs).
>>> >
>>> >Not true - see line 67 of Extrapolation.cpp.
>>> That code is not covered by any test or demo.
>> Yes it is, by a unit test under test/unit/adaptivity/ and all the
>> auto-adaptive demos, including the documented demo auto-adaptive-poisson.
> 
> OK. I don't know why it didn't trigger a failure before when I added an error 
> when d0==d1 and ran the tests.
> 
>>> >Having easy access to
>>> >neighboring cells (however one chooses to define what it means to be a
>>> >cell neighbor of a cell) can be useful to many algorithms that perform
>>> >local search algorithms or solve local problems.
>>> >
>>> Sure, but again the user should decide how a connection is defined.
>> Yes, I agree.
>>> >Again, I don't see the urge to remove this function just because it
>>> >consumes a lot of memory. The important point is to avoid using it for
>>> >standard algorithms in DOLFIN, like boundary conditions.
>>> That's not a good argument. If poor/slow/memory hogging code is left
>>> in the library, it eventually gets used in the library despite any
>>> warnings. It's happened time and time again.
>> I have never understood this urge to remove all functionality that can
>> potentially be misused. We are adults so it should be enough to write
>> a warning in the documentation.
> 
> This has proven not to be enough in the past. Moreover, it is good to make it 
> easy for users to write good code by not providing functions that are easily
> misused, and it's good to make it easy to write code that doesn't only work 
> well for small serial cases.
> 
> Garth
> 
>>> >The proper
>>> >way to fix functions that should not use this particular function (in
>>> >for example periodic bcs) would be to create an issue for its misuse.
>>> >
>>> The periodic bc code does not mis-use the function. The flaw is that
>>> the mesh library magically and unnecessarily creates a memory
>>> monster behind the scenes.
>> I agree with that.
>> --
>> Anders
> 
> _______________________________________________
> fenics mailing list
> [email protected]
> http://fenicsproject.org/mailman/listinfo/fenics
