On Tue, 8 Sep 2015 08:58:54 +0200 Mikael Mortensen <mikael.morten...@gmail.com> wrote:
> > > On 07 Sep 2015, at 15:35, Martin Sandve Alnæs <marti...@simula.no> > > wrote: > > > > What if the ufc finite element can return a quadrature rule for > > evaluation of dofs instead of evaluate_dofs taking a callback? Then > > dolfin can handle evaluation and taking the weighted sum without > > involving ufc::function at all. > > > > Not quite sure I follow. evaluate_dofs is performing point > evaluations and then some additional work for special elements. I > don’t think quadrature rules could help out with this? I think Martin suggests that instead of execution of DOF code, a recipe/formula for evaluating DOF should be passed to DOLFIN. From some reason Martin calls the DOF formula by "quadrature rule". Jan > > M > > > 7. sep. 2015 09.53 skrev "Mikael Mortensen" > > <mikael.morten...@gmail.com <mailto:mikael.morten...@gmail.com>>: Hi > > > > As you all rememeber the LagrangeInterpolator class can already do > > interpolation in parallel on non-matching meshes for Lagrange > > elements. I had another look at what it would take to get > > interpolation working in parallel for any element and it turns out > > I do not need to change all that much. I have implemented a first > > solution in the branch > > https://bitbucket.org/fenics-project/dolfin/branch/mikael/interpolation-parallel > > <https://bitbucket.org/fenics-project/dolfin/branch/mikael/interpolation-parallel> > > just to test in a function called > > “LagrangeInterpolator::interpolateall". This is what the algorithm > > looks like to get interpolation in parallel working for any element > > > > Parallel interpolation from Function u0 to Function u living on > > different meshes with different partitioning: > > > > 1) Create a set of all different points of u that will require an > > eval (like tabulate_all_coordinates but with a set of coordinates) > > 2) Create global bounding boxes for the partitioned mesh of u0 and > > distribute to all processors. > > > > 3) Using bounding boxes, compute the processes that *may* own the > > points in need of eval. > > > > 4) Distribute interpolation points to potential owners who > > subsequently tries to evaluate u0. If successful, return values > > (result of eval) of u0 to owner. 5) Now all the results of u0.eval > > will be on the processes that needs it. It remains to run over all > > cells locally (for u) and call evaluate_dofs using these values. > > This is a bit tricky since evaluate_dofs needs to take a > > ufc::function as argument. To this end I have a solution where I > > wrap all results of u0.eval in an Expression, calling a map from > > point x to result > > > > > > // Create map from coordinate to global result of eval > > static std::map<std::vector<double>, std::vector<double>, > > lt_coordinate> coords_to_values(lt_coordinate(1.0e-12)); > > > > …. Fill map coords_to_values using MPI ++ > > > > // Wrap coords_to_values in an Expression > > class WrapperFunction : public Expression > > { > > public: > > > > WrapperFunction(int value_shape) : Expression(value_shape) {}; > > > > mutable std::vector<double> _xx; > > > > void eval(Array<double>& values, const Array<double>& x) const > > { > > for (uint j = 0; j < x.size(); j++) > > _xx[j] = x[j]; > > > > const std::vector<double>& v = coords_to_values[_xx]; // <— > > Map from x to u0.eval result for (std::size_t j = 0; j < v.size(); > > j++) values[j] = v[j]; // Put > > u0.eval in values of Expression } > > }; > > > > 6) Run over local mesh calling evaluate_dofs with wrapped function > > as the ufc function. > > > > 7) Update coefficients of local vector with results from > > evaluate_dofs > > > > Finished:-) > > > > I have tested it for Nedelec elements of order 1 and bubble > > elements. Higher order Nedelec elements do not have a > > tabulate_coordinates function implemented, which makes it a bit > > more difficult to create the list of all interpolation points. I > > think this can be solved quite easily, though, with running over > > the local mesh once and collecting all x’s used in the evals. > > > > I do not have any strong opinion on ufc::cell vs dolfin::cell or > > collective vs non-collective eval, but I certainly think both these > > should be available to the user. The collective eval is not used in > > the above algorithm, because it is more efficient to collect all > > interpolation points in one large structure and distribute, than it > > is to do each point for itself. > > > > The parallel interpolation routine should probably be put in the > > Function class and not the specific LagrangeInterpolator class, but > > for now I’ve just been testing. > > > > M > > > > > >> On 04 Sep 2015, at 11:02, Chris Richardson <ch...@bpi.cam.ac.uk > >> <mailto:ch...@bpi.cam.ac.uk>> wrote: > >> > >> On 03/09/2015 15:54, Martin Sandve Alnæs wrote: > >>> On 3 September 2015 at 16:50, Garth N. Wells <gn...@cam.ac.uk > >>> <mailto:gn...@cam.ac.uk>> wrote: > >>>> On 3 September 2015 at 14:42, Chris Richardson > >>>> <ch...@bpi.cam.ac.uk <mailto:ch...@bpi.cam.ac.uk>> wrote: > >>>> On 03/09/2015 14:10, Martin Sandve Alnæs wrote: > >>>> Part of the complexity in this chain steps from the numerous > >>>> dolfin signatures. As discussed before, these also need cleanup > >>>> to clarify collective/non-collective operations. While at it, we > >>>> could also vectorize on x by allowing a number of coordinates > >>>> wherever there's an > >>>> x argument in the code below. > >>>> For a moment I'd like to ignore the ufc/ffc parts and figure out > >>>> how > >>>> many eval signatures we really need in dolfin? In particular to > >>>> resolve the collective/non-collective part. Then we can design > >>>> the callback mechanism to match the ideal dolfin design. > >>>> Here's one take (dodging the type of 'cell'): > >>>> Is it as simple as just having a "collective" and > >>>> "non_collective" eval? > >>>> Typically, processes will iterate over some set of "local" > >>>> points, asking for evaluations ( - OK, they may be "local" on > >>>> one mesh, but not another). > >>>> When the point is not "on process", the process needs to get it, > >>>> somehow. > >>>> But all the other processes are unaware - they are thinking about > >>>> different points. > >>>> It is quite unusual for all processes to want to evaluate the > >>>> same point at the same time, for a simple collective operation. > >>>> My thought was that we would have to store up a list of points, > >>>> and then do a collective operation to resolve all the > >>>> cross-process evaluations. > >>>> Or maybe I have missed something... > >>> My impression is that Martin is just suggesting the we split evals > >>> into non-collective and collective 'groups', and have as few > >>> versions of eval in each group. With fewer evals, we have few > >>> code paths to think about. > >>> The interface for collective evals is orthogonal to this, and > >>> something we need to look closely at. > >>> Garth > >>> Exactly. The collective variants of eval will always end up > >>> calling non-collective variants of eval but never the other way > >>> around. Also the collective eval variants won't be virtual > >>> functions. The evaluation of a GenericFunction on cells that are > >>> known to be on the local process differs between Function and > >>> Expression, but can be handled with a single signature "virtual > >>> void eval(v,x,cell) const = 0;" > >>> in GenericFunction, implemented by Function and by the users > >>> subclass of Expression. The only thing we need to clean up there > >>> in the ufc::function > >>> is the ufc::cell vs dolfin::cell. > >>> For a collective eval over multiple points, I currently see only > >>> two variants: > >>> a) each process pass a different set of point(s) and wants the > >>> corresponding results > >>> b) each process actually pass the same point(s) and wants all > >>> results In both cases there is first communication of 'who owns > >>> which points', then each process makes calls to the > >>> non-collective evals for its own points, > >>> then the results are communicated back to the right place. > >>> I believe a) is what Chris was talking about as the most common > >>> operation. > >> > >> OK, makes sense. I suppose I am thinking ahead to an > >> implementation of 'interpolate', where fitting the collective > >> calls in might be tricky. > >> > >> Chris > >> > >> -- > >> Chris Richardson > >> BP Institute > >> Madingley Road > >> Cambridge CB3 0EZ > >> _______________________________________________ > >> fenics mailing list > >> fenics@fenicsproject.org <mailto:fenics@fenicsproject.org> > >> http://fenicsproject.org/mailman/listinfo/fenics > >> <http://fenicsproject.org/mailman/listinfo/fenics> > _______________________________________________ fenics mailing list fenics@fenicsproject.org http://fenicsproject.org/mailman/listinfo/fenics