On Dec 4, 2013, at 6:15 PM, Anders Logg wrote:

> On Tue, Dec 03, 2013 at 10:12:22AM +0100, Mikael Mortensen wrote:
>> 
>> On Dec 2, 2013, at 9:17 PM, Anders Logg wrote:
>> 
>>> I don't see a simple way around this (other than inventing a new
>>> algorithm - the only way I can think of is to create a third data
>>> structure with a regular grid predictively distributed onto the
>>> different processors and then communicate via that grid) so I don't
>>> mind this being added.
>> 
>> Great
>> 
>> I could probably create a very inefficient parallel interpolation
>> routine by looping over all the partitions of the distributed mesh1,
>> one by one, collectively (i.e., broadcast mesh1's partition on process
>> 0 to all, run collectively over it so that u0.eval or evaluate_dofs
>> can be called collectively, and when finished move on to the partition
>> on process 1). That could work for any space, I guess?
> 
> I meant communicating via a third independent grid constructed in a
> way such that each cell only needs to communicate with a subset of
> that third grid.
> 

OK, that sounds interesting, but I don't quite see how it would work.

Sorry about the confusion; I meant to describe a third, alternative
(and inefficient) approach, not a reference to your suggestion.
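
Just to be concrete, the inefficient approach I had in mind could look roughly
like the sketch below. It is untested, assumes a scalar Lagrange space and that
tabulate_all_coordinates gives the coordinates of the dofs owned by this
process, uses mpi4py directly, and broadcasts dof coordinates instead of whole
mesh partitions to keep it short (so it sidesteps evaluate_dofs); the function
name and the reduction at the end are just for illustration.

from dolfin import Function
import numpy as np
from mpi4py import MPI as pyMPI

comm = pyMPI.COMM_WORLD

def interpolate_collectively(u0, V1):
    mesh1 = V1.mesh()
    gdim = mesh1.geometry().dim()
    # Coordinates of the dofs this process owns in V1 (flat array -> (n, gdim))
    x = V1.dofmap().tabulate_all_coordinates(mesh1).reshape((-1, gdim))
    u = Function(V1)
    owned_values = np.zeros(len(x))
    val = np.zeros(1)
    for p in range(comm.size):
        # Broadcast rank p's dof coordinates to everybody
        xp = comm.bcast(x if comm.rank == p else None, root=p)
        vals, hits = np.zeros(len(xp)), np.zeros(len(xp))
        for i, xi in enumerate(xp):
            try:
                u0.eval(val, xi)  # succeeds only if xi is on our part of u0's mesh
                vals[i], hits[i] = val[0], 1.0
            except RuntimeError:
                pass
        # Sum the results back to rank p; a point on a partition boundary may
        # be found on more than one process, hence the division by hit count
        vals_sum, hits_sum = np.zeros(len(xp)), np.zeros(len(xp))
        comm.Reduce(vals, vals_sum, op=pyMPI.SUM, root=p)
        comm.Reduce(hits, hits_sum, op=pyMPI.SUM, root=p)
        if comm.rank == p:
            owned_values[:] = vals_sum / np.maximum(hits_sum, 1)
    u.vector().set_local(owned_values)
    u.vector().apply("insert")
    return u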

Mikael

> --
> Anders
> 
> 
>>> Since not all elements are supported, I think this function should
>>> have a separate name for now, perhaps
>>> 
>>>  interpolate_lagrange()
>>> 
>>> My suggestion would be that the documentation explains that this
>>> function works in parallel for non-matching meshes (in contrast to the
>>> regular interpolate) but only supports Lagrange spaces.
>> 
>> Sounds good to me.
>> 
>> Mikael
>> 
>>> 
>>>> Hi,
>>>> 
>>>> This is in reference to Issue 162:
>>>> https://bitbucket.org/fenics-project/dolfin/issue/162/interpolation-nonmatching-mesh-in-parallel
>>>> 
>>>> Currently it is not possible in parallel to interpolate a dolfin
>>>> Function u0 defined on one mesh to a Function in the same kind of
>>>> FunctionSpace defined on a different mesh. I would like to be able to
>>>> do this:
>>>> 
>>>> mesh0 = UnitSquareMesh(100, 100)
>>>> mesh1 = UnitSquareMesh(20, 20)
>>>> V0 = FunctionSpace(mesh0, "CG", 1)
>>>> V1 = FunctionSpace(mesh1, "CG", 1)
>>>> u0 = interpolate(Expression("x[0]*x[1]"), V0)
>>>> u0.update()
>>>> u = interpolate_nonmatching_mesh(u0, V1)
>>>> 
>>>> or probably overloaded like this
>>>> 
>>>> u = interpolate(u0, V1)
>>>> 
>>>> Now there are probably many different ways this can be implemented.
>>>> The most obvious approach that comes to mind is simply to overload
>>>> `Function::eval` with a parallel version. There are several reasons why
>>>> this does not quite work. Consider the current `interpolate`
>>>> implementation:
>>>> 
>>>> FunctionSpace::interpolate(GenericVector& expansion_coefficients,
>>>>                               const GenericFunction& v)
>>>> …
>>>> // Iterate over mesh and interpolate on each cell
>>>> UFCCell ufc_cell(*_mesh);
>>>> for (CellIterator cell(*_mesh); !cell.end(); ++cell)
>>>> {
>>>>   // Update to current cell
>>>>   ufc_cell.update(*cell);
>>>> 
>>>>   // Restrict function to cell
>>>>   v.restrict(&cell_coefficients[0], *_element, *cell, ufc_cell);
>>>> 
>>>>   // Tabulate dofs
>>>>   const std::vector<dolfin::la_index>& cell_dofs =
>>>>     _dofmap->cell_dofs(cell->index());
>>>> 
>>>>   // Copy dofs to vector
>>>>   expansion_coefficients.set(&cell_coefficients[0],
>>>>                              _dofmap->cell_dimension(cell->index()),
>>>>                              &cell_dofs[0]);
>>>> }
>>>> 
>>>> The cell loop runs on each processor over the part of the V1 mesh that
>>>> lives on that process. Since V0 and V1 have different meshes, `v.restrict`
>>>> calls `restrict_as_ufc_function`, which calls `element.evaluate_dofs`,
>>>> which in turn calls `u0.eval` for each dof on the cell. The major problem
>>>> that makes it difficult to extend this to parallel is that the cell
>>>> iteration is not collective and `eval` is not called collectively. Since
>>>> the coordinates of the dofs on a cell in the V1 mesh could live on any
>>>> other CPU for the V0 mesh, `eval` would need to be called collectively,
>>>> with some try/catch clause and a lot of broadcasting (of coordinates and
>>>> results) to obtain what is required. Alternatively, the non-collective
>>>> `eval` calls that fail must be captured and handled later using MPI.
>>>> 
>>>> Faced with these problems, my solution has instead been to create a
>>>> stand-alone routine for parallel interpolation on non-matching meshes,
>>>> rather than trying to overload `eval` or fit it into the current
>>>> `interpolate` framework. The code that performs the interpolation is
>>>> attached to issue 162; it is fairly well documented and should,
>>>> hopefully, be easy to follow. The algorithm is basically as follows:
>>>> 
>>>> 1) On each process, tabulate the coordinates of all dofs in the function
>>>> space we're interpolating to, i.e., V1 in the code above:
>>>> 
>>>> x = V1.dofmap()->tabulate_all_coordinates(mesh1)
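>>>>
>>>> In the Python interface this is something like the following (assuming
>>>> tabulate_all_coordinates returns the owned dof coordinates as one flat
>>>> numpy array, hence the reshape):
>>>>
>>>> gdim = mesh1.geometry().dim()
>>>> x = V1.dofmap().tabulate_all_coordinates(mesh1).reshape((-1, gdim))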
>>>> 
>>>> 2) Create a map from dof number in V1 to component number in the mixed
>>>> space.
>>>>
>>>> This is needed to handle mixed spaces. I do not think I have found the
>>>> best solution to this problem, so any help here would be much appreciated.
>>>> I also just realized that the current solution does not work for special
>>>> spaces like BDM or RT.
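>>>>
>>>> Roughly, the idea is something like the Python sketch below (untested,
>>>> only one level of nesting handled explicitly, and it assumes that
>>>> num_sub_spaces(), sub(), dofs() and ownership_range() behave as in the
>>>> current Python interface; the helper name is made up):
>>>>
>>>> import numpy as np
>>>>
>>>> def dof_component_map(V):
>>>>     # Map local (owned) dof index -> component number. Scalar spaces map
>>>>     # everything to component 0. Does not handle BDM/RT, as noted above.
>>>>     start, end = V.dofmap().ownership_range()
>>>>     comp = np.zeros(end - start, dtype='int')
>>>>     if V.num_sub_spaces() == 0:
>>>>         return comp
>>>>     k = 0
>>>>     for i in range(V.num_sub_spaces()):
>>>>         Vi = V.sub(i)
>>>>         # A sub-space may itself be vector valued (e.g. the velocity in a
>>>>         # mixed Taylor-Hood space), so expand one more level
>>>>         subs = ([Vi] if Vi.num_sub_spaces() == 0 else
>>>>                 [Vi.sub(j) for j in range(Vi.num_sub_spaces())])
>>>>         for Vs in subs:
>>>>             # The dofs of a sub-space are global indices into the parent
>>>>             # space; keep only the ones owned by this process
>>>>             owned = np.array([d for d in Vs.dofmap().dofs()
>>>>                               if start <= d < end], dtype='int')
>>>>             comp[owned - start] = k
>>>>             k += 1
>>>>     return comp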
>>>> 
>>>> 3) Compute `u0.eval` for all points x computed in 1).
>>>> 
>>>> The problem here is that u0 and u have different meshes, so a point in u
>>>> will not necessarily, or even likely, be found on the same processor for
>>>> u0. Hence the dof coordinates must be passed around and searched on all
>>>> ranks until found, and when found the result is returned to the process
>>>> that owns the dof. (Alternatively, but more slowly, `u0.eval` could simply
>>>> be called collectively.) Important for performance is that all the points
>>>> not found on one process are collected in a vector and passed on to the
>>>> next process through just one single MPI::send_recv: the points we could
>>>> not find are sent to the next rank, and the points that the process with
>>>> one lower rank could not find are received.
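>>>>
>>>> For illustration only, a stripped-down version of this search could look
>>>> like the sketch below (untested, scalar space, mpi4py used directly, and
>>>> all values found for other processes returned at the end with a single
>>>> alltoall instead of round by round; the names are made up and this is not
>>>> the fenicstools code):
>>>>
>>>> import numpy as np
>>>> from mpi4py import MPI as pyMPI
>>>>
>>>> comm = pyMPI.COMM_WORLD
>>>>
>>>> def evaluate_everywhere(u0, points):
>>>>     # points: (n, gdim) array of the dof coordinates owned by this process.
>>>>     # Returns a dict: local dof index -> u0 evaluated at that coordinate.
>>>>     rank, nprocs = comm.rank, comm.size
>>>>     nxt, prv = (rank + 1) % nprocs, (rank - 1) % nprocs
>>>>     results, found_for_others = {}, []
>>>>     # Each point travels with (owner rank, local dof index on the owner)
>>>>     work = [(rank, i, xi) for i, xi in enumerate(points)]
>>>>     val = np.zeros(1)
>>>>     for sweep in range(nprocs):
>>>>         missing = []
>>>>         for owner, i, xi in work:
>>>>             try:
>>>>                 u0.eval(val, xi)  # found on our part of u0's mesh
>>>>                 if owner == rank:
>>>>                     results[i] = val[0]
>>>>                 else:
>>>>                     found_for_others.append((owner, i, val[0]))
>>>>             except RuntimeError:
>>>>                 missing.append((owner, i, xi))
>>>>         if sweep < nprocs - 1:
>>>>             # One send/receive per sweep: pass our unfound points to the
>>>>             # next rank and take over the previous rank's unfound points
>>>>             work = comm.sendrecv(missing, dest=nxt, source=prv)
>>>>     # Return the values we found for dofs owned by other processes
>>>>     outgoing = [[] for _ in range(nprocs)]
>>>>     for owner, i, v in found_for_others:
>>>>         outgoing[owner].append((i, v))
>>>>     for chunk in comm.alltoall(outgoing):
>>>>         for i, v in chunk:
>>>>             results[i] = v
>>>>     return results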
>>>> 
>>>> 4) Finally, set all the values in u using the dof-to-component map from 2).
>>>> 
>>>> 
>>>> The problem is inherently unscalable, since the efficiency depends on how
>>>> the two different meshes are distributed. If they happen to overlap
>>>> substantially, the routine may be faster with 2 CPUs than with one, but in
>>>> the end that depends on the mesh partitioning, which you cannot really
>>>> control, at least not with the default mesh partitioner. Consider running
>>>> a problem on 2 CPUs. Each dof must then be searched on average 1.5 times
>>>> (half are found on this CPU and the rest on the other). Since each
>>>> process's mesh is half the size of the original, the total cost is
>>>> approximately 1.5*0.5 = 0.75. So, not counting any cost of MPI
>>>> communication, you should expect to reduce the computational time to 75%
>>>> of the original. However, the MPI communication is extensive (half the
>>>> mesh is sent back and forth), so you should honestly not expect any
>>>> speedup at all. The estimate for N CPUs is similarly (N+1) / N / 2, plus
>>>> MPI communication.
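>>>>
>>>> (Spelled out: with N CPUs a dof is equally likely to live on any of the N
>>>> parts of the u0 mesh, so it is searched on (1 + 2 + ... + N)/N = (N+1)/2
>>>> processes on average, and each search runs over a mesh of size 1/N of the
>>>> original, giving a relative cost of (N+1)/2 * 1/N = (N+1)/(2*N); for N=2
>>>> that is 3/4 = 0.75 as above.)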
>>>> 
>>>> If you have any comments on the implementation or, even better,
>>>> suggestions on how to improve it, then please let me know by following up
>>>> on this mail. If you like the current solution and would like to see it
>>>> implemented in dolfin soon, then please vote for issue 162 on bitbucket :-)
>>>>
>>>> Meanwhile, the `interpolate_nonmatching_mesh` function is available at
>>>> https://github.com/mikaem/fenicstools
>>>> 
>>>> Best regards
>>>> 
>>>> Mikael
>>>> 
>>>> 
>>> 
>> 

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
