I don't see a simple way around this (other than inventing a new
algorithm - the only way I can think of is to create a third data
structure with a regular grid predictively distributed onto the
different processors and then communicate via that grid) so I don't
mind this being added.

Since not all elements are supported, I think this function should
have a separate name for now, perhaps

   interpolate_lagrange()

My suggestion would be that the documentation explains that this
function works in parallel for non-matching meshes (in contrast to the
regular interpolate) but does only support Lagrange spaces.

--
Anders


On Mon, Dec 02, 2013 at 11:38:23AM +0100, Mikael Mortensen wrote:
> Hi,
>
> This is in reference to Issue 162 https://bitbucket.org/fenics-project/dolfin/
> issue/162/interpolation-nonmatching-mesh-in-parallel
>
> Currently it is not possible in parallel to interpolate a
> dolfin Function u0 defined on one mesh to a Function with the
> same FunctionSpace defined on a different mesh. I would like to be able to do
> this
>
> mesh0 = UnitSquareMesh(100, 100)
> mesh1 = UnitSquareMesh(20, 20)
> V0 = FunctionSpace(mesh0, "CG", 1)
> V1 = FunctionSpace(mesh1, "CG", 1)
> u0 = interpolate(Expression("x[0]*x[1]"), V0)
> u0.update()
> u = interpolate_nonmatching_mesh(u0, V1)
>
> or probably overloaded like this
>
> u = interpolate(u0, V1)
>
> Now there are probably many different ways this can be implemented. The most
> obvious approach that comes to mind is simply to overload `Function::eval` 
> with
> a parallel version. There are many reasons why this doesn’t quite seem to 
> work.
> Consider the current `interpolate` implementation
>
> FunctionSpace::interpolate(GenericVector& expansion_coefficients,
>                                 const GenericFunction& v)
> …
> // Iterate over mesh and interpolate on each cell
>   UFCCell ufc_cell(*_mesh);
>   for (CellIterator cell(*_mesh); !cell.end(); ++cell)
>   {
>     // Update to current cell
>     ufc_cell.update(*cell);
>
>     // Restrict function to cell
>     v.restrict(&cell_coefficients[0], *_element, *cell, ufc_cell);
>
>     // Tabulate dofs
>     const std::vector<dolfin::la_index>& cell_dofs = _dofmap->cell_dofs(cell->
> index());
>
>     // Copy dofs to vector
>     expansion_coefficients.set(&cell_coefficients[0],
>                                _dofmap->cell_dimension(cell->index()),
>                                &cell_dofs[0]);
>   }
>
> The cell loop runs on all processors for the part of the mesh of V1 that lives
> on that process. `v.restrict` calls `restrict_as_ufc_function` since V0 and V1
> have different meshes, `restrict_as_ufc_function` calls 
> `element.evaluate_dofs`
> that again calls `u0.eval` for each dof on the cell. Now the major problem 
> here
> that makes it difficult to extend this to parallel is that the cell iteration
> is not collective and `eval` is not called collectively. Since the coordinates
> of dofs on a cell in the V1-mesh could live on any other cpu for the V0-mesh,
> the `eval` needs to be called collectively with some try/catch clause and with
> a lot of broadcasting (of coordinates and result) to obtain what is
> required.  Alternatively, the non-collective `eval` that fails must be 
> captured
> and handled using MPI++ later.
>
> Faced with these problems my solution has instead been to create a stand-alone
> version for parallel interpolation on nonmatching meshes. Not trying to
> overload `eval` or fit it into the current `interpolate` framework. The code
> that performs the interpolation is attached to issue 162. The code is fairly
> well documented and should, hopefully, be possible to follow. The algorithm is
> basically like this
>
> 1) On each process tabulate all coordinates for dofs in the functionspace
> we’re interpolating to, i.e., V1 in the above code.
>
> x = V1.dofmap()->tabulate_all_coordinates(mesh1)
>
> 2) Create a map from dof number in V1 to component number in Mixed Space.
>
> This is necessary to be able to handle mixed spaces. I do not think I have
> found the best solution to this problem, so any help here would be much
> appreciated. I just realized that the current solution does not work
> for special spaces like BDM or RT.
>
> 3) Compute `u0.eval` for all points x computed in 1).
>
> Problem here is that u0 and u have different meshes and as such a point in u
> will not necessarily, or not even likely, be found on the same processor for
> u0. Hence the dofs coordinates must be passed around and searched on all ranks
> until found. When found the result is returned to the process that owns the
> dof. (Alternatively, but slower, `u0.eval` could simply be called
> collectively.) A major issue here for performance is that all points not found
> on one process are collected in a vector and passed on to the
> next process through just one single MPI::send_recv. All points not found are
> sent and points not found by process with one lower rank are received.
>
> 4) Finally, set all values in u using the dof to component map.
>
>
> The problem is inherently unscalable by nature since the efficiency depends on
> the distribution of two different meshes. If they happen to overlap much, then
> the routine may be faster with 2 than one cpu, but in the end that depends on
> the mesh partitioning that you cannot really control, at least not just using
> the default mesh-partitioner. Consider running a problem with 2 CPUs. Here it
> should be quite obvious that each dof must be searched on average 1.5 times
> (half are found on this CPU and the rest on the other). Since each process'
> mesh is half the size of the original, the total cost is then approximately
> 1.5*0.5 = 0.75. So, not counting any cost of MPI communication, you should
> expect to reduce the computational time to 75% of the original. However, the
> MPI communications are extensive (half the mesh is sent back and forth) and 
> you
> should honestly not expect any speedup at all. The estimate for N cpus will
> similarly be (N+1) / N / 2 + MPI communications.
>
> If you have any comments to the implementation or, even better, suggestions on
> how to improve it, then please let me know by following up on this mail. If 
> you
> like the current solution and you’d like to see this implemented soon
> in dolfin, then please vote for issue 162 on bitbucket:-)
>
> Meanwhile, the `interpolate_nonmatching_mesh` function is available on 
> https://
> github.com/mikaem/fenicstools
>
> Best regards
>
> Mikael
>
>

> _______________________________________________
> fenics mailing list
> [email protected]
> http://fenicsproject.org/mailman/listinfo/fenics

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics

Reply via email to