On 2013-12-04 17:15, Anders Logg wrote:
On Tue, Dec 03, 2013 at 10:12:22AM +0100, Mikael Mortensen wrote:

Den Dec 2, 2013 kl. 9:17 PM skrev Anders Logg:

> I don't see a simple way around this (other than inventing a new
> algorithm - the only way I can think of is to create a third data
> structure with a regular grid predictively distributed onto the
> different processors and then communicate via that grid) so I don't
> mind this being added.

Great

I could probably create a very inefficient parallel interpolation
routine by looping over all partitions of the distributed mesh1, one
by one, collectively: broadcast mesh1's partition on process 0 to all
ranks, run collectively over it so that u0.eval or evaluate_dofs can
be called collectively, and when finished move on to the partition on
process 1, and so on. That could work for any space, I guess?
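
To make the idea concrete, here is a rough sketch of that broadcast-and-evaluate
loop in Python with mpi4py and numpy. It broadcasts the evaluation points (the
dof coordinates) rather than the mesh partition itself, which is the same idea
in a simpler form. broadcast_and_eval is a hypothetical helper, u0 is assumed
to be a scalar dolfin Function, and eval is assumed to raise RuntimeError for
points outside the local mesh:

    from mpi4py import MPI
    import numpy as np

    def broadcast_and_eval(u0, points, comm=MPI.COMM_WORLD):
        """points: dof coordinates owned by this rank (list of arrays)."""
        results = None
        for p in range(comm.Get_size()):
            # Broadcast rank p's points to all ranks
            pts = comm.bcast(points, root=p)
            vals = np.full(len(pts), np.nan)
            for i, x in enumerate(pts):
                val = np.zeros(1)
                try:
                    u0.eval(val, np.asarray(x))  # only works if x lies
                    vals[i] = val[0]             # inside the local mesh
                except RuntimeError:
                    pass
            # Rank p collects everyone's attempts and keeps the hits
            attempts = comm.gather(vals, root=p)
            if comm.Get_rank() == p:
                results = np.full(len(pts), np.nan)
                for v in attempts:
                    hit = np.isnan(results) & ~np.isnan(v)
                    results[hit] = v[hit]
        return results

Every rank loops over all partitions, so this is collective but the work and
communication per rank grow with the number of processes, hence "very
inefficient".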

I meant communicating via a third independent grid constructed in a
way such that each cell only needs to communicate with a subset of
that third grid.
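
One way to read that: a regular background grid over the global bounding box,
with its cells assigned to ranks by a fixed formula, so that any process can
compute which rank to ask about a given point without searching. A minimal
sketch in Python; grid_cell and owner_rank are illustrative names only, and
the round-robin cell-to-rank map is just one possible choice:

    import numpy as np

    def grid_cell(x, xmin, xmax, ncells):
        """Index of the background-grid cell containing point x.
        xmin, xmax, ncells are arrays of length = geometric dimension."""
        xmin, xmax, ncells = map(np.asarray, (xmin, xmax, ncells))
        ix = ((np.asarray(x) - xmin) / (xmax - xmin) * ncells).astype(int)
        return tuple(np.minimum(ix, ncells - 1))

    def owner_rank(cell_index, ncells, nranks):
        """Deterministic cell -> rank map (here simply round-robin)."""
        flat = np.ravel_multi_index(cell_index, tuple(ncells))
        return int(flat) % nranks

Because the cell-to-rank map is a formula, each mesh cell only needs to talk
to the ranks owning the background cells it overlaps.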


I would create a bounding box (or boxes) of the mesh on each process, and then build a search tree over the boxes. Then do

  for (CellIterator cell(*_mesh); !cell.end(); ++cell)
  {
    // If interpolation point is on this process, do as we do now

    // If interpolation point is *not* on this process, insert
    // into cache to be fetched
  }

  // Call function to retrieve non-local data. This function should use
  // the search tree.
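
For the box part, a rough sketch in Python with mpi4py and numpy rather than
C++, just to keep it short (not dolfin API; gather_bounding_boxes and
candidate_owners are illustrative names). Each rank computes the bounding box
of its local mesh, the boxes are allgathered, and candidate owner ranks for an
off-process point come from a simple box test; a search tree over the boxes
would replace the linear scan:

    from mpi4py import MPI
    import numpy as np

    def gather_bounding_boxes(mesh, comm=MPI.COMM_WORLD):
        coords = mesh.coordinates()        # local vertex coordinates
        box = (coords.min(axis=0), coords.max(axis=0))
        return comm.allgather(box)         # one (min, max) pair per rank

    def candidate_owners(x, boxes):
        x = np.asarray(x)
        return [rank for rank, (lo, hi) in enumerate(boxes)
                if np.all(x >= lo) and np.all(x <= hi)]

The boxes can overlap, so a point may have several candidate owners and each
of them has to be asked.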

The constrained/periodic bc code uses bounding boxes to locate constrained vertices/edges/facets that are off-process. It doesn't use a tree for searching, but it should. Some abstract function for finding the off-process owner of a point or mesh entity would be very helpful.

Garth



--
Anders


> Since not all elements are supported, I think this function should
> have a separate name for now, perhaps
>
>   interpolate_lagrange()
>
> My suggestion would be that the documentation explains that this
> function works in parallel for non-matching meshes (in contrast to the
> regular interpolate) but only supports Lagrange spaces.

Sounds good to me.

Mikael

>
>> Hi,
>>
>> This is in reference to Issue 162:
>> https://bitbucket.org/fenics-project/dolfin/issue/162/interpolation-nonmatching-mesh-in-parallel
>>
>> Currently it is not possible in parallel to interpolate a
>> dolfin Function u0 defined on one mesh to a Function with the
>> same FunctionSpace defined on a different mesh. I would like to be able to do
>> this
>>
>> mesh0 = UnitSquareMesh(100, 100)
>> mesh1 = UnitSquareMesh(20, 20)
>> V0 = FunctionSpace(mesh0, "CG", 1)
>> V1 = FunctionSpace(mesh1, "CG", 1)
>> u0 = interpolate(Expression("x[0]*x[1]"), V0)
>> u0.update()
>> u = interpolate_nonmatching_mesh(u0, V1)
>>
>> or probably overloaded like this
>>
>> u = interpolate(u0, V1)
>>
>> Now there are probably many different ways this can be implemented. The most
>> obvious approach that comes to mind is simply to overload `Function::eval`
>> with a parallel version. There are many reasons why this doesn't quite seem
>> to work. Consider the current `interpolate` implementation
>>
>> FunctionSpace::interpolate(GenericVector& expansion_coefficients,
>>                                const GenericFunction& v)
>> …
>> // Iterate over mesh and interpolate on each cell
>>  UFCCell ufc_cell(*_mesh);
>>  for (CellIterator cell(*_mesh); !cell.end(); ++cell)
>>  {
>>    // Update to current cell
>>    ufc_cell.update(*cell);
>>
>>    // Restrict function to cell
>>    v.restrict(&cell_coefficients[0], *_element, *cell, ufc_cell);
>>
>>    // Tabulate dofs
>>    const std::vector<dolfin::la_index>& cell_dofs
>>      = _dofmap->cell_dofs(cell->index());
>>
>>    // Copy dofs to vector
>>    expansion_coefficients.set(&cell_coefficients[0],
>>                               _dofmap->cell_dimension(cell->index()),
>>                               &cell_dofs[0]);
>>  }
>>
>> The cell loop runs on all processors for the part of the mesh of V1 that
>> lives on that process. `v.restrict` calls `restrict_as_ufc_function` since
>> V0 and V1 have different meshes, `restrict_as_ufc_function` calls
>> `element.evaluate_dofs`, which in turn calls `u0.eval` for each dof on the
>> cell. Now the major problem here, which makes it difficult to extend this to
>> parallel, is that the cell iteration is not collective and `eval` is not
>> called collectively. Since the coordinates of dofs on a cell in the V1-mesh
>> could live on any other CPU for the V0-mesh, `eval` would need to be called
>> collectively with some try/catch clause and a lot of broadcasting (of
>> coordinates and results) to obtain what is required. Alternatively, the
>> non-collective `eval` calls that fail must be captured and handled using MPI
>> afterwards.
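
The second alternative (capture the failing evals and fix them up afterwards)
could look roughly like the Python sketch below. It assumes a scalar u0, that
dof_coordinates holds the coordinates of the locally owned V1 dofs, and that
eval raises RuntimeError for points outside the local mesh; resolve_remote is
a hypothetical helper:

    import numpy as np

    pending = []                          # (dof, coordinate) pairs to fetch
    values = np.zeros(len(dof_coordinates))
    for dof, x in enumerate(dof_coordinates):
        val = np.zeros(1)
        try:
            u0.eval(val, x)               # succeeds if x is on this process
            values[dof] = val[0]
        except RuntimeError:
            pending.append((dof, x))      # resolve via MPI afterwards
    # resolve_remote(pending) would then exchange the cached points, e.g.
    # with the ring of send_recvs described in step 3) below.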
>>
>> Faced with these problems, my solution has instead been to create a
>> stand-alone version for parallel interpolation on nonmatching meshes, rather
>> than trying to overload `eval` or fit it into the current `interpolate`
>> framework. The code that performs the interpolation is attached to issue
>> 162. The code is fairly well documented and should, hopefully, be possible
>> to follow. The algorithm is basically like this:
>>
>> 1) On each process, tabulate the coordinates of all dofs in the function
>> space we're interpolating to, i.e., V1 in the above code.
>>
>> x = V1.dofmap()->tabulate_all_coordinates(mesh1)
>>
>> 2) Create a map from dof number in V1 to component number in the mixed space.
>>
>> This is necessary to be able to handle mixed spaces. I do not think I have
>> found the best solution to this problem, so any help here would be much
>> appreciated. I just realized that the current solution does not work
>> for special spaces like BDM or RT.
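
For what it's worth, a rough sketch of how such a dof-to-component map could be
built; num_sub_spaces() and sub(i).dofmap().dofs() are assumptions about the
dolfin Python API and may need adapting, and as noted above this says nothing
about spaces like BDM or RT:

    def dof_to_component_map(V1):
        """Map local dof number -> component number of the (mixed) space."""
        n = V1.num_sub_spaces()           # assumed API
        if n == 0:                        # not a mixed space
            return {dof: 0 for dof in V1.dofmap().dofs()}
        d2c = {}
        for i in range(n):
            for dof in V1.sub(i).dofmap().dofs():   # assumed API
                d2c[dof] = i
        return d2c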
>>
>> 3) Compute `u0.eval` for all points x computed in 1).
>>
>> The problem here is that u0 and u have different meshes, and as such a point
>> in u will not necessarily, or even likely, be found on the same processor
>> for u0. Hence the dof coordinates must be passed around and searched on all
>> ranks until found. When found, the result is returned to the process that
>> owns the dof. (Alternatively, but slower, `u0.eval` could simply be called
>> collectively.) A major point here for performance is that all points not
>> found on one process are collected in a vector and passed on to the next
>> process through just one single MPI::send_recv: all points not found locally
>> are sent, and the points not found by the process with one lower rank are
>> received.
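
A rough sketch of that ring exchange with mpi4py (the actual code attached to
issue 162 differs in the details; try_eval is a hypothetical helper returning
a value, or None if the point is not on this rank). For simplicity every item
travels the whole ring, whereas the scheme described above only forwards the
points that have not been found yet:

    from mpi4py import MPI

    def ring_search(my_points, try_eval, comm=MPI.COMM_WORLD):
        """my_points: list of (dof, x) pairs for dofs owned by this rank."""
        rank, size = comm.Get_rank(), comm.Get_size()
        dest, src = (rank + 1) % size, (rank - 1) % size
        # Try locally first; each item is [dof, x, value or None]
        items = [[dof, x, try_eval(x)] for dof, x in my_points]
        for step in range(size - 1):
            # Send our batch up the ring, receive the lower rank's batch
            items = comm.sendrecv(items, dest=dest, source=src)
            for it in items:
                if it[2] is None:         # not found yet, try on this rank
                    it[2] = try_eval(it[1])
        if size > 1:
            # One last hop closes the ring, returning items to their owner
            items = comm.sendrecv(items, dest=dest, source=src)
        return {dof: val for dof, x, val in items}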
>>
>> 4) Finally, set all values in u using the dof to component map.
>>
>>
>> The problem is inherently unscalable, since the efficiency depends on the
>> distribution of two different meshes. If they happen to overlap a lot, then
>> the routine may be faster with 2 CPUs than with one, but in the end that
>> depends on the mesh partitioning, which you cannot really control, at least
>> not with the default mesh partitioner. Consider running a problem with 2
>> CPUs. Here it should be quite obvious that each dof must be searched for on
>> average 1.5 times (half are found on this CPU and the rest on the other).
>> Since each process' mesh is half the size of the original, the total cost is
>> then approximately 1.5*0.5 = 0.75. So, not counting any cost of MPI
>> communication, you should expect to reduce the computational time to 75% of
>> the original. However, the MPI communication is extensive (roughly half of
>> the dof coordinates are sent back and forth), so you should honestly not
>> expect any speedup at all. The corresponding estimate for N CPUs is
>> (N+1)/(2N) plus MPI communication.
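
For reference, the ideal estimate (N+1)/(2N) for a few process counts, ignoring
all communication; it approaches 0.5 for large N, i.e., at best a factor-of-two
reduction even before communication costs are counted:

    # (N + 1) / 2 searches per dof on average, each over 1/N of the mesh
    for N in (1, 2, 4, 8, 16):
        print(N, (N + 1) / (2.0 * N))    # 1.0, 0.75, 0.625, 0.5625, 0.53125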
>>
>> If you have any comments on the implementation or, even better, suggestions
>> on how to improve it, then please let me know by following up on this mail.
>> If you like the current solution and you'd like to see it implemented soon
>> in dolfin, then please vote for issue 162 on bitbucket :-)
>>
>> Meanwhile, the `interpolate_nonmatching_mesh` function is available at
>> https://github.com/mikaem/fenicstools
>>
>> Best regards
>>
>> Mikael
>>
>>

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics