On Tue, Dec 03, 2013 at 10:12:22AM +0100, Mikael Mortensen wrote:
>
> On Dec 2, 2013, at 9:17 PM, Anders Logg wrote:
>
> > I don't see a simple way around this (other than inventing a new
> > algorithm - the only way I can think of is to create a third data
> > structure with a regular grid predictably distributed onto the
> > different processors and then communicate via that grid), so I don't
> > mind this being added.
>
> Great
>
> I could probably create a very inefficient parallel interpolation
> routine by looping over all partitions of the distributed mesh1, one
> by one, collectively (i.e., broadcast mesh1's partition on process 0
> to all, then run collectively over it, which allows us to call u0.eval
> or evaluate_dofs collectively; when finished, move on to the partition
> on process 1). That could work for any space, I guess?
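>
> A very rough sketch of that pattern, using mpi4py directly (the
> broadcast of dof coordinates below just stands in for broadcasting the
> actual mesh partition; local_dof_coordinates and u0 are placeholders,
> and catching a RuntimeError from u0.eval for points outside the local
> partition is an assumption, not how any existing code works):
>
> from mpi4py import MPI
> import numpy
>
> comm = MPI.COMM_WORLD
>
> for owner in range(comm.size):
>     # Broadcast the dof coordinates owned by rank 'owner' to everyone
>     coords = comm.bcast(local_dof_coordinates, root=owner)
>
>     # Every process tries to evaluate u0 at every broadcast point
>     values = numpy.zeros(len(coords))
>     for i, x in enumerate(coords):
>         val = numpy.zeros(1)
>         try:
>             u0.eval(val, numpy.asarray(x))  # succeeds only if x lies
>             values[i] = val[0]              # on the local part of mesh0
>         except RuntimeError:
>             pass
>
>     # Collect the results on the owning rank (points on partition
>     # boundaries may be found on several ranks; a real implementation
>     # would have to break such ties instead of just summing)
>     result = numpy.zeros(len(coords)) if comm.rank == owner else None
>     comm.Reduce(values, result, op=MPI.SUM, root=owner)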

I meant communicating via a third independent grid constructed in a
way such that each cell only needs to communicate with a subset of
that third grid.
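
One way to read that (just a sketch, not anything implemented): pick a
regular background grid with an agreed-upon bounding box and resolution,
so that every process can compute locally, from coordinates alone, which
rank is responsible for a given point, instead of searching all ranks.
For a 2D mesh, something like the following (the names and the simple
block distribution are made up for illustration):

import numpy

def background_grid_owner(x, bbox_min, bbox_max, ncells, nprocs):
    """Return the rank owning the background-grid cell containing x.

    bbox_min/bbox_max bound both meshes and ncells is the grid
    resolution per direction; all of this is agreed on by all ranks."""
    ijk = numpy.floor((numpy.asarray(x) - bbox_min)
                      / (bbox_max - bbox_min) * ncells).astype(int)
    ijk = numpy.clip(ijk, 0, ncells - 1)
    flat = ijk[0] + ncells * ijk[1]          # flatten the 2D cell index
    return int(flat) * nprocs // (ncells * ncells)

Each cell would then only need to exchange points and values with the
ranks owning the background-grid cells it overlaps, rather than with
all ranks.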

--
Anders


> > Since not all elements are supported, I think this function should
> > have a separate name for now, perhaps
> >
> >   interpolate_lagrange()
> >
> > My suggestion would be that the documentation explains that this
> > function works in parallel for non-matching meshes (in contrast to the
> > regular interpolate) but only supports Lagrange spaces.
>
> Sounds good to me.
>
> Mikael
>
> >
> >> Hi,
> >>
> >> This is in reference to Issue 162:
> >> https://bitbucket.org/fenics-project/dolfin/issue/162/interpolation-nonmatching-mesh-in-parallel
> >>
> >> Currently it is not possible in parallel to interpolate a dolfin
> >> Function u0 defined on one mesh to a Function in a FunctionSpace of
> >> the same type (same family and degree) defined on a different mesh.
> >> I would like to be able to do this:
> >>
> >> mesh0 = UnitSquareMesh(100, 100)
> >> mesh1 = UnitSquareMesh(20, 20)
> >> V0 = FunctionSpace(mesh0, "CG", 1)
> >> V1 = FunctionSpace(mesh1, "CG", 1)
> >> u0 = interpolate(Expression("x[0]*x[1]"), V0)
> >> u0.update()
> >> u = interpolate_nonmatching_mesh(u0, V1)
> >>
> >> or, preferably, overloaded like this:
> >>
> >> u = interpolate(u0, V1)
> >>
> >> Now there are probably many different ways this can be implemented.
> >> The most obvious approach that comes to mind is simply to overload
> >> `Function::eval` with a parallel version. There are many reasons why
> >> this doesn’t quite seem to work. Consider the current `interpolate`
> >> implementation:
> >>
> >> void FunctionSpace::interpolate(GenericVector& expansion_coefficients,
> >>                                 const GenericFunction& v)
> >> …
> >> // Iterate over mesh and interpolate on each cell
> >>  UFCCell ufc_cell(*_mesh);
> >>  for (CellIterator cell(*_mesh); !cell.end(); ++cell)
> >>  {
> >>    // Update to current cell
> >>    ufc_cell.update(*cell);
> >>
> >>    // Restrict function to cell
> >>    v.restrict(&cell_coefficients[0], *_element, *cell, ufc_cell);
> >>
> >>    // Tabulate dofs
> >>    const std::vector<dolfin::la_index>& cell_dofs =
> >>      _dofmap->cell_dofs(cell->index());
> >>
> >>    // Copy dofs to vector
> >>    expansion_coefficients.set(&cell_coefficients[0],
> >>                               _dofmap->cell_dimension(cell->index()),
> >>                               &cell_dofs[0]);
> >>  }
> >>
> >> The cell loop runs on each process over the part of the V1 mesh that
> >> lives on that process. Since V0 and V1 have different meshes,
> >> `v.restrict` calls `restrict_as_ufc_function`, which calls
> >> `element.evaluate_dofs`, which in turn calls `u0.eval` for each dof
> >> on the cell. The major problem that makes it difficult to extend
> >> this to parallel is that neither the cell iteration nor `eval` is
> >> collective. Since the coordinates of the dofs on a cell in the V1
> >> mesh may be located on any other cpu for the V0 mesh, `eval` would
> >> need to be called collectively, with some try/catch clause and a lot
> >> of broadcasting (of coordinates and results) to obtain what is
> >> required. Alternatively, the non-collective `eval` calls that fail
> >> must be caught and handled later using MPI.
> >>
> >> Faced with these problems, my solution has instead been to create a
> >> stand-alone routine for parallel interpolation on non-matching
> >> meshes, rather than trying to overload `eval` or fit it into the
> >> current `interpolate` framework. The code that performs the
> >> interpolation is attached to issue 162. It is fairly well documented
> >> and should, hopefully, be easy to follow. The algorithm is basically
> >> as follows:
> >>
> >> 1) On each process, tabulate the coordinates of all dofs in the
> >> function space we’re interpolating to, i.e., V1 in the code above:
> >>
> >> x = V1.dofmap()->tabulate_all_coordinates(mesh1)
> >>
> >> 2) Create a map from dof number in V1 to component number in a mixed space.
> >>
> >> This is necessary to be able to handle mixed spaces. I do not think I have
> >> found the best solution to this problem, so any help here would be much
> >> appreciated. I just realized that the current solution does not work
> >> for special spaces like BDM or RT.
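> >>
> >> For reference, a minimal sketch of one way such a map could be built
> >> through the Python interface (this is not the attached code, and the
> >> sub-space calls are assumed to behave as in current dolfin):
> >>
> >> from dolfin import cells
> >>
> >> def dof_to_component_map(V):
> >>     "Map every dof of the (possibly mixed) space V to its component."
> >>     dof_to_comp = {}
> >>     num_comp = max(1, V.num_sub_spaces())
> >>     for comp in range(num_comp):
> >>         dofmap = (V.sub(comp).dofmap() if V.num_sub_spaces() > 0
> >>                   else V.dofmap())
> >>         for cell in cells(V.mesh()):
> >>             for dof in dofmap.cell_dofs(cell.index()):
> >>                 dof_to_comp[dof] = comp
> >>     return dof_to_comp
> >>
> >> (Only flat mixed spaces are handled here and, as noted above, nothing
> >> special is done for spaces like BDM or RT.)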
> >>
> >> 3) Compute `u0.eval` for all points x computed in 1).
> >>
> >> The problem here is that u0 and u have different meshes, so a point
> >> belonging to u will not necessarily, or even likely, be found on the
> >> same process for u0. Hence the dof coordinates must be passed around
> >> and searched on all ranks until found, and when found the result is
> >> returned to the process that owns the dof. (Alternatively, but more
> >> slowly, `u0.eval` could simply be called collectively.) Important
> >> for performance is that all points not found on one process are
> >> collected in a vector and passed on to the next process through one
> >> single MPI::send_recv: each process sends its own unfound points and
> >> receives the unfound points of the process with rank one lower.
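> >>
> >> A stripped-down sketch of just that ring pattern, using mpi4py
> >> (points_not_found and evaluate_and_filter are placeholders, not real
> >> variables or functions; the local point search is left out):
> >>
> >> from mpi4py import MPI
> >>
> >> comm = MPI.COMM_WORLD
> >> dest = (comm.rank + 1) % comm.size
> >> source = (comm.rank - 1) % comm.size
> >>
> >> # Points this process could not evaluate on its own part of mesh0,
> >> # tagged with the rank that owns the corresponding dof
> >> travelling = [(comm.rank, p) for p in points_not_found]
> >> for step in range(comm.size - 1):
> >>     # One single send/recv per step: pass our unfound points on to
> >>     # the next rank and receive those of the previous rank
> >>     travelling = comm.sendrecv(travelling, dest=dest, source=source)
> >>     # Evaluate the received points that lie on the local part of
> >>     # mesh0, store (owner, point, value) for later return, and keep
> >>     # only the points that are still not found
> >>     travelling = evaluate_and_filter(travelling)
> >> # Finally, send the stored values back to the ranks owning the dofs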
> >>
> >> 4) Finally, set all values in u using the dof-to-component map.
> >>
> >>
> >> The problem is inherently unscalable, since the efficiency depends
> >> on how the two different meshes are distributed. If they happen to
> >> overlap well, the routine may be faster with 2 CPUs than with one,
> >> but in the end that depends on the mesh partitioning, which you
> >> cannot really control, at least not with the default mesh
> >> partitioner. Consider running a problem on 2 CPUs. Each dof must
> >> then be searched on average 1.5 times: half are found on the local
> >> CPU after one search, while the other half require a second search
> >> on the other CPU. Since each process' mesh is half the size of the
> >> original, each search costs half as much, so the total cost is
> >> approximately 1.5*0.5 = 0.75. Not counting any cost of MPI
> >> communication, you should therefore expect the computational time to
> >> drop to 75% of the serial time. However, the MPI communication is
> >> extensive (on average half of the dof coordinates are sent back and
> >> forth), so you should honestly not expect any speedup at all. The
> >> corresponding estimate for N CPUs is (N+1)/(2N) of the serial cost,
> >> plus MPI communication.
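> >>
> >> As a quick sanity check of that estimate (plain Python, ignoring all
> >> MPI costs):
> >>
> >> estimates = [(N, (N + 1) / (2.0 * N)) for N in (1, 2, 4, 8, 16)]
> >> # [(1, 1.0), (2, 0.75), (4, 0.625), (8, 0.5625), (16, 0.53125)]
> >> # i.e., never below half the serial cost, however many processes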
> >>
> >> If you have any comments on the implementation or, even better,
> >> suggestions on how to improve it, then please let me know by
> >> following up on this mail. If you like the current solution and
> >> would like to see it implemented in dolfin soon, then please vote
> >> for issue 162 on Bitbucket :-)
> >>
> >> Meanwhile, the `interpolate_nonmatching_mesh` function is available on
> >> https://github.com/mikaem/fenicstools
> >>
> >> Best regards
> >>
> >> Mikael
> >>
> >>
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
