What if the ufc finite element could return a quadrature rule for the evaluation of dofs, instead of evaluate_dofs taking a callback? Then dolfin could handle the evaluation and the weighted sum without involving ufc::function at all.

On 7 Sep 2015 09:53, "Mikael Mortensen" <mikael.morten...@gmail.com> wrote:
> Hi
>
> As you all remember, the LagrangeInterpolator class can already do
> interpolation in parallel on non-matching meshes for Lagrange elements. I
> had another look at what it would take to get interpolation working in
> parallel for any element, and it turns out I do not need to change all
> that much. I have implemented a first solution in the branch
> https://bitbucket.org/fenics-project/dolfin/branch/mikael/interpolation-parallel
> just to test, in a function called "LagrangeInterpolator::interpolateall".
> This is what the algorithm looks like to get interpolation in parallel
> working for any element.
>
> Parallel interpolation from Function u0 to Function u living on different
> meshes with different partitioning:
>
> 1) Create a set of all the distinct points of u that will require an eval
> (like tabulate_all_coordinates, but with a set of coordinates).
>
> 2) Create global bounding boxes for the partitioned mesh of u0 and
> distribute them to all processes.
>
> 3) Using the bounding boxes, compute the processes that *may* own the
> points in need of eval.
>
> 4) Distribute the interpolation points to their potential owners, who
> subsequently try to evaluate u0. If successful, return the values (the
> result of eval) of u0 to the owner.
>
> 5) Now all the results of u0.eval are on the processes that need them.
> It remains to run over all cells locally (for u) and call evaluate_dofs
> using these values. This is a bit tricky, since evaluate_dofs needs to
> take a ufc::function as argument. To this end I have a solution where I
> wrap all results of u0.eval in an Expression, calling a map from point x
> to result:
>
> // Create map from coordinate to global result of eval
> static std::map<std::vector<double>, std::vector<double>, lt_coordinate>
>   coords_to_values(lt_coordinate(1.0e-12));
>
> ....
> Fill map coords_to_values using MPI.
>
> // Wrap coords_to_values in an Expression
> class WrapperFunction : public Expression
> {
> public:
>
>   WrapperFunction(int value_shape) : Expression(value_shape) {}
>
>   mutable std::vector<double> _xx;
>
>   void eval(Array<double>& values, const Array<double>& x) const
>   {
>     // Copy x into a std::vector so it can be used as a map key
>     _xx.resize(x.size());
>     for (std::size_t j = 0; j < x.size(); j++)
>       _xx[j] = x[j];
>
>     // Map from x to the u0.eval result
>     const std::vector<double>& v = coords_to_values[_xx];
>
>     // Put the u0.eval result in the values of the Expression
>     for (std::size_t j = 0; j < v.size(); j++)
>       values[j] = v[j];
>   }
> };
>
> 6) Run over the local mesh, calling evaluate_dofs with the wrapped
> function as the ufc function.
>
> 7) Update the coefficients of the local vector with the results from
> evaluate_dofs.
>
> Finished :-)
>
> I have tested it for Nedelec elements of order 1 and bubble elements.
> Higher-order Nedelec elements do not have a tabulate_coordinates function
> implemented, which makes it a bit more difficult to create the list of all
> interpolation points. I think this can be solved quite easily, though, by
> running over the local mesh once and collecting all the x's used in the
> evals.
>
> I do not have any strong opinion on ufc::cell vs dolfin::cell or
> collective vs non-collective eval, but I certainly think both of these
> should be available to the user. The collective eval is not used in the
> above algorithm, because it is more efficient to collect all the
> interpolation points in one large structure and distribute that than it is
> to handle each point by itself.
>
> The parallel interpolation routine should probably be put in the Function
> class and not in the specific LagrangeInterpolator class, but for now I've
> just been testing.
>
> M
>
> On 04 Sep 2015, at 11:02, Chris Richardson <ch...@bpi.cam.ac.uk> wrote:
>
> On 03/09/2015 15:54, Martin Sandve Alnæs wrote:
>
> On 3 September 2015 at 16:50, Garth N.
> Wells <gn...@cam.ac.uk> wrote:
>
> On 3 September 2015 at 14:42, Chris Richardson <ch...@bpi.cam.ac.uk>
> wrote:
>
> On 03/09/2015 14:10, Martin Sandve Alnæs wrote:
>
> Part of the complexity in this chain stems from the numerous dolfin
> signatures. As discussed before, these also need cleanup to clarify
> collective/non-collective operations. While at it, we could also
> vectorize on x by allowing a number of coordinates wherever there's an
> x argument in the code below.
>
> For a moment I'd like to ignore the ufc/ffc parts and figure out how
> many eval signatures we really need in dolfin, in particular to
> resolve the collective/non-collective part. Then we can design the
> callback mechanism to match the ideal dolfin design. Here's one take
> (dodging the type of 'cell'):
>
> Is it as simple as just having a "collective" and "non_collective"
> eval? Typically, processes will iterate over some set of "local"
> points, asking for evaluations (OK, they may be "local" on one mesh,
> but not on another). When a point is not "on process", the process
> needs to get it somehow. But all the other processes are unaware -
> they are thinking about different points. It is quite unusual for all
> processes to want to evaluate the same point at the same time, as in
> a simple collective operation. My thought was that we would have to
> store up a list of points, and then do a collective operation to
> resolve all the cross-process evaluations. Or maybe I have missed
> something...
>
> My impression is that Martin is just suggesting that we split evals
> into non-collective and collective 'groups', and have as few versions
> of eval in each group as possible. With fewer evals, we have fewer
> code paths to think about. The interface for collective evals is
> orthogonal to this, and something we need to look at closely.
>
> Garth
>
> Exactly. The collective variants of eval will always end up calling
> non-collective variants of eval, but never the other way around.
> Also, the collective eval variants won't be virtual functions.
>
> The evaluation of a GenericFunction on cells that are known to be on
> the local process differs between Function and Expression, but can be
> handled with a single signature "virtual void eval(v, x, cell) const = 0;"
> in GenericFunction, implemented by Function and by the user's subclass
> of Expression. The only thing we need to clean up there in the
> ufc::function is the ufc::cell vs dolfin::cell issue.
>
> For a collective eval over multiple points, I currently see only two
> variants:
>
> a) each process passes a different set of point(s) and wants the
> corresponding results;
>
> b) each process actually passes the same point(s) and wants all the
> results.
>
> In both cases there is first communication of 'who owns which points',
> then each process makes calls to the non-collective evals for its own
> points, and then the results are communicated back to the right place.
> I believe a) is what Chris was talking about as the most common
> operation.
>
> OK, makes sense. I suppose I am thinking ahead to an implementation of
> 'interpolate', where fitting in the collective calls might be tricky.
>
> Chris
>
> --
> Chris Richardson
> BP Institute
> Madingley Road
> Cambridge CB3 0EZ
> _______________________________________________
> fenics mailing list
> fenics@fenicsproject.org
> http://fenicsproject.org/mailman/listinfo/fenics