On Tue, 8 Sep 2015 08:58:54 +0200
Mikael Mortensen <mikael.morten...@gmail.com> wrote:

> 
> > On 07 Sep 2015, at 15:35, Martin Sandve Alnæs <marti...@simula.no>
> > wrote:
> > 
> > What if the ufc finite element can return a quadrature rule for
> > evaluation of dofs instead of evaluate_dofs taking a callback? Then
> > dolfin can handle evaluation and taking the weighted sum without
> > involving ufc::function at all.
> > 
> 
> Not quite sure I follow. evaluate_dofs is performing point
> evaluations and then some additional work for special elements. I
> don’t think quadrature rules could help out with this? 

I think Martin suggests that instead of executing the DOF code, a
recipe/formula for evaluating the DOFs should be passed to DOLFIN. For
some reason Martin calls this DOF formula a "quadrature rule".

Jan
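
For concreteness, here is a minimal sketch of what such a DOF recipe could
look like if the element handed back points and weights instead of executing
the dof evaluation itself. The DofRule struct and apply_dof_rules function
below are made-up names, not part of UFC or DOLFIN; for a point-evaluation
dof the rule collapses to a single point with a single unit weight, while
integral-moment dofs (e.g. Nedelec edge moments) would carry an actual
quadrature rule, which is presumably where Martin's term comes from.

#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical data a ufc::finite_element could return for each dof j:
// evaluation points x_i and weights w_{j,i,k} such that
//   dof_j(f) = sum_i sum_k w_{j,i,k} * f_k(x_i)
// (k runs over the value components of f). None of these names exist in
// UFC or DOLFIN; this only illustrates the idea.
struct DofRule
{
  std::vector<std::vector<double>> points;   // one evaluation point per entry
  std::vector<std::vector<double>> weights;  // per point: one weight per value component
};

// DOLFIN-side evaluation: call f at the points and take the weighted sum,
// without ever handing a ufc::function callback to the element.
std::vector<double>
apply_dof_rules(const std::vector<DofRule>& rules, std::size_t value_size,
                const std::function<void(std::vector<double>&,
                                         const std::vector<double>&)>& f)
{
  std::vector<double> dof_values(rules.size(), 0.0);
  std::vector<double> values(value_size);
  for (std::size_t j = 0; j < rules.size(); ++j)
    for (std::size_t i = 0; i < rules[j].points.size(); ++i)
    {
      f(values, rules[j].points[i]);  // point evaluation of the function
      for (std::size_t k = 0; k < value_size; ++k)
        dof_values[j] += rules[j].weights[i][k]*values[k];  // weighted sum
    }
  return dof_values;
}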

> 
> M
> 
> > On 7 Sep 2015 at 09:53, "Mikael Mortensen"
> > <mikael.morten...@gmail.com> wrote:
> > 
> > Hi
> > 
> > As you all remember, the LagrangeInterpolator class can already do
> > interpolation in parallel on non-matching meshes for Lagrange
> > elements. I had another look at what it would take to get
> > interpolation working in parallel for any element, and it turns out
> > I do not need to change all that much. Just to test, I have
> > implemented a first solution in the branch
> > https://bitbucket.org/fenics-project/dolfin/branch/mikael/interpolation-parallel
> > in a function called "LagrangeInterpolator::interpolateall". This is
> > what the algorithm looks like for getting interpolation working in
> > parallel for any element:
> > Parallel interpolation from Function u0 to Function u living on
> > different meshes with different partitioning:
> > 
> > 1) Create a set of all the distinct points of u that will require an
> > eval (like tabulate_all_coordinates, but with a set of coordinates).
> > 2) Create global bounding boxes for the partitioned mesh of u0 and
> > distribute to all processors.
> > 
> > 3) Using bounding boxes, compute the processes that *may* own the
> > points in need of eval.
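
A rough sketch of how steps 2 and 3 might be realised with plain MPI; the
candidate_owners helper below is a made-up name and the real branch may
compute the boxes differently (for instance padding them with a tolerance).

#include <mpi.h>
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Sketch: bounding box of the local u0 partition from its flattened vertex
// coordinates (x0,y0,[z0], x1,y1,[z1], ...), gathered on every rank, then a
// point-in-box test to find the ranks that *may* own each point.
std::vector<std::vector<int>>
candidate_owners(const std::vector<double>& local_coords, int gdim,
                 const std::vector<std::vector<double>>& points, MPI_Comm comm)
{
  int size;
  MPI_Comm_size(comm, &size);

  // Local bounding box: gdim minima followed by gdim maxima
  std::vector<double> box(2*gdim);
  for (int d = 0; d < gdim; ++d)
  {
    box[d] = std::numeric_limits<double>::max();
    box[gdim + d] = -std::numeric_limits<double>::max();
  }
  for (std::size_t i = 0; i < local_coords.size(); i += gdim)
    for (int d = 0; d < gdim; ++d)
    {
      box[d] = std::min(box[d], local_coords[i + d]);
      box[gdim + d] = std::max(box[gdim + d], local_coords[i + d]);
    }

  // Step 2: distribute all bounding boxes to all processes
  std::vector<double> all_boxes(2*gdim*size);
  MPI_Allgather(box.data(), 2*gdim, MPI_DOUBLE,
                all_boxes.data(), 2*gdim, MPI_DOUBLE, comm);

  // Step 3: for each point, collect the ranks whose box contains it
  // (a small tolerance could be added to the comparisons here)
  std::vector<std::vector<int>> owners(points.size());
  for (std::size_t p = 0; p < points.size(); ++p)
    for (int r = 0; r < size; ++r)
    {
      bool inside = true;
      for (int d = 0; d < gdim; ++d)
        inside = inside && points[p][d] >= all_boxes[2*gdim*r + d]
                        && points[p][d] <= all_boxes[2*gdim*r + gdim + d];
      if (inside)
        owners[p].push_back(r);
    }
  return owners;
}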
> > 
> > 4) Distribute the interpolation points to the potential owners, who
> > subsequently try to evaluate u0. If successful, return the values
> > (the result of eval) of u0 to the owner.
> > 
> > 5) Now all the results of u0.eval will be on the processes that need
> > them. It remains to run over all cells locally (for u) and call
> > evaluate_dofs using these values. This is a bit tricky, since
> > evaluate_dofs needs to take a ufc::function as argument. To this end
> > I have a solution where I wrap all the results of u0.eval in an
> > Expression that calls into a map from the point x to the result:
> > 
> > 
> > // Create map from coordinate to global result of eval
> > static std::map<std::vector<double>, std::vector<double>,
> >   lt_coordinate> coords_to_values(lt_coordinate(1.0e-12));
> > 
> > // …. Fill map coords_to_values using MPI ++
> > 
> > // Wrap coords_to_values in an Expression
> > class WrapperFunction : public Expression
> > {
> > public:
> > 
> >     explicit WrapperFunction(std::size_t value_shape)
> >       : Expression(value_shape) {}
> > 
> >     // Scratch vector holding the current evaluation point
> >     mutable std::vector<double> _xx;
> > 
> >     void eval(Array<double>& values, const Array<double>& x) const
> >     {
> >       // Copy the point into the key type used by the map
> >       _xx.resize(x.size());
> >       for (std::size_t j = 0; j < x.size(); j++)
> >         _xx[j] = x[j];
> > 
> >       // Map from x to the u0.eval result received over MPI
> >       const std::vector<double>& v = coords_to_values[_xx];
> > 
> >       // Put the u0.eval result into the values of the Expression
> >       for (std::size_t j = 0; j < v.size(); j++)
> >         values[j] = v[j];
> >     }
> > };
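
The lt_coordinate comparator used above is not shown in the snippet; a
self-contained version along the lines of the one already in
LagrangeInterpolator could look like the following (a sketch, not
necessarily identical to the DOLFIN implementation). The same comparator
also serves for the set of distinct interpolation points in step 1.

#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Lexicographic "less than" for coordinates with a tolerance, so that
// points differing only by round-off compare equal as map/set keys.
struct lt_coordinate
{
  explicit lt_coordinate(double tolerance) : TOL(tolerance) {}

  bool operator()(const std::vector<double>& x,
                  const std::vector<double>& y) const
  {
    const std::size_t n = std::max(x.size(), y.size());
    for (std::size_t i = 0; i < n; ++i)
    {
      const double xi = (i < x.size()) ? x[i] : 0.0;
      const double yi = (i < y.size()) ? y[i] : 0.0;
      if (xi < yi - TOL)
        return true;
      if (xi > yi + TOL)
        return false;
    }
    return false;  // equal within tolerance
  }

  const double TOL;
};

// Usage: the set of distinct interpolation points from step 1 can use the
// same comparator as the coords_to_values map above
std::set<std::vector<double>, lt_coordinate>
  interpolation_points(lt_coordinate(1.0e-12));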
> > 
> > 6) Run over the local mesh, calling evaluate_dofs with the wrapper
> > function as the ufc::function.
> > 
> > 7) Update the coefficients of the local vector with the results from
> > evaluate_dofs.
> > 
> > Finished:-)
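
A hedged outline of what steps 6 and 7 could look like against the DOLFIN
1.6-era C++ API; the helper name finalize_interpolation is made up, and
signatures such as FiniteElement::evaluate_dofs and GenericDofMap::cell_dofs
differ slightly between versions, so treat this as a sketch rather than the
code in the branch.

#include <dolfin.h>
#include <vector>
using namespace dolfin;

// Outline of steps 6 and 7: loop over the local cells of u's mesh, let
// evaluate_dofs call back into the wrapper (which looks up the
// pre-communicated u0 values), and insert the results into u's vector.
void finalize_interpolation(Function& u, const WrapperFunction& wrapper)
{
  const Mesh& mesh = *u.function_space()->mesh();
  const FiniteElement& element = *u.function_space()->element();
  const GenericDofMap& dofmap = *u.function_space()->dofmap();

  std::vector<double> vertex_coordinates;
  std::vector<double> cell_values(element.space_dimension());
  ufc::cell ufc_cell;

  for (CellIterator cell(mesh); !cell.end(); ++cell)
  {
    // Geometry and UFC cell data for this cell
    cell->get_vertex_coordinates(vertex_coordinates);
    cell->get_cell_data(ufc_cell);

    // Step 6: evaluate the dofs of u on this cell using the wrapped u0 values
    element.evaluate_dofs(cell_values.data(), wrapper,
                          vertex_coordinates.data(),
                          ufc_cell.orientation, ufc_cell);

    // Step 7: update the local coefficients of u
    const std::vector<dolfin::la_index>& dofs = dofmap.cell_dofs(cell->index());
    u.vector()->set_local(cell_values.data(), dofs.size(), dofs.data());
  }
  u.vector()->apply("insert");
}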
> > 
> > I have tested it for Nedelec elements of order 1 and for bubble
> > elements. Higher-order Nedelec elements do not have a
> > tabulate_coordinates function implemented, which makes it a bit
> > more difficult to create the list of all interpolation points. I
> > think this can be solved quite easily, though, by running over
> > the local mesh once and collecting all the x's used in the evals.
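
A sketch of that workaround: an Expression whose eval merely records the
points it is asked for, so that a single pass of evaluate_dofs over the
local mesh collects all interpolation points even without
tabulate_coordinates. The class name PointCollector is made up, and the
lt_coordinate comparator from the earlier sketch is assumed.

#include <dolfin.h>
#include <set>
#include <vector>
using namespace dolfin;

// An Expression whose eval only records the points it is asked to
// evaluate. One pass of evaluate_dofs over the local mesh then yields all
// interpolation points, even for elements without tabulate_coordinates.
class PointCollector : public Expression
{
public:
  explicit PointCollector(std::size_t value_size)
    : Expression(value_size), points(lt_coordinate(1.0e-12)) {}

  // All points passed to eval, deduplicated with a tolerance
  mutable std::set<std::vector<double>, lt_coordinate> points;

  void eval(Array<double>& values, const Array<double>& x) const
  {
    points.insert(std::vector<double>(x.data(), x.data() + x.size()));
    for (std::size_t j = 0; j < values.size(); ++j)
      values[j] = 0.0;  // dummy values; only the recorded points matter
  }
};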
> > 
> > I do not have any strong opinion on ufc::cell vs dolfin::cell or
> > collective vs non-collective eval, but I certainly think both of
> > these should be available to the user. The collective eval is not
> > used in the above algorithm, because it is more efficient to collect
> > all interpolation points in one large structure and distribute them
> > than it is to handle each point individually.
> > 
> > The parallel interpolation routine should probably be put in the
> > Function class and not the specific LagrangeInterpolator class, but
> > for now I’ve just been testing.
> > 
> > M
> > 
> > 
> >> On 04 Sep 2015, at 11:02, Chris Richardson <ch...@bpi.cam.ac.uk>
> >> wrote:
> >> 
> >> On 03/09/2015 15:54, Martin Sandve Alnæs wrote:
> >>> On 3 September 2015 at 16:50, Garth N. Wells <gn...@cam.ac.uk>
> >>> wrote:
> >>>> On 3 September 2015 at 14:42, Chris Richardson
> >>>> <ch...@bpi.cam.ac.uk> wrote:
> >>>> On 03/09/2015 14:10, Martin Sandve Alnæs wrote:
> >>>> Part of the complexity in this chain stems from the numerous
> >>>> dolfin signatures. As discussed before, these also need cleanup
> >>>> to clarify collective/non-collective operations. While at it, we
> >>>> could also vectorize on x by allowing a number of coordinates
> >>>> wherever there's an
> >>>> x argument in the code below.
> >>>> For a moment I'd like to ignore the ufc/ffc parts and figure out
> >>>> how
> >>>> many eval signatures we really need in dolfin? In particular to
> >>>> resolve the collective/non-collective part. Then we can design
> >>>> the callback mechanism to match the ideal dolfin design.
> >>>> Here's one take (dodging the type of 'cell'):
> >>>> Is it as simple as just having a "collective" and
> >>>> "non_collective" eval?
> >>>> Typically, processes will iterate over some set of "local"
> >>>> points, asking for evaluations ( - OK, they may be "local" on
> >>>> one mesh, but not another).
> >>>> When the point is not "on process", the process needs to get it,
> >>>> somehow.
> >>>> But all the other processes are unaware - they are thinking about
> >>>> different points.
> >>>> It is quite unusual for all processes to want to evaluate the
> >>>> same point at the same time, for a simple collective operation.
> >>>> My thought was that we would have to store up a list of points,
> >>>> and then do a collective operation to resolve all the
> >>>> cross-process evaluations.
> >>>> Or maybe I have missed something...
> >>> My impression is that Martin is just suggesting that we split
> >>> evals into non-collective and collective 'groups', and have as few
> >>> versions of eval as possible in each group. With fewer evals, we
> >>> have fewer code paths to think about.
> >>> The interface for collective evals is orthogonal to this, and
> >>> something we need to look closely at.
> >>> Garth
> >>> Exactly. The collective variants of eval will always end up
> >>> calling non-collective variants of eval but never the other way
> >>> around. Also the collective eval variants won't be virtual
> >>> functions. The evaluation of a GenericFunction on cells that are
> >>> known to be on the local process differs between Function and
> >>> Expression, but can be handled with a single signature "virtual
> >>> void eval(v,x,cell) const = 0;"
> >>> in GenericFunction, implemented by Function and by the user's
> >>> subclass of Expression. The only thing we need to clean up there
> >>> in the ufc::function
> >>> is the ufc::cell vs dolfin::cell.
> >>> For a collective eval over multiple points, I currently see only
> >>> two variants:
> >>> a) each process passes a different set of point(s) and wants the
> >>> corresponding results
> >>> b) each process actually passes the same point(s) and wants all
> >>> the results
> >>> In both cases there is first communication of 'who owns which
> >>> points', then each process makes calls to the non-collective evals
> >>> for its own points, and then the results are communicated back to
> >>> the right place.
> >>> I believe a) is what Chris was talking about as the most common
> >>> operation.
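
To pin the two variants down, an interface sketch only; the class and
method names below are made up and are not existing DOLFIN signatures. As
noted above, these collective wrappers would be non-virtual and would
internally work out point ownership, call the non-collective eval locally
and communicate the results back.

#include <vector>

// Interface sketch only, distinguishing case a) from case b).
class CollectiveEvalSketch
{
public:
  // a) each process passes its own points (flattened, gdim doubles per
  //    point) and receives the values for exactly those points
  void eval_my_points(std::vector<double>& values,
                      const std::vector<double>& my_points) const;

  // b) every process passes the same points and every process receives
  //    the values for all of them
  void eval_shared_points(std::vector<double>& values,
                          const std::vector<double>& points) const;
};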
> >> 
> >> OK, makes sense. I suppose I am thinking ahead to an
> >> implementation of 'interpolate', where fitting the collective
> >> calls in might be tricky.
> >> 
> >> Chris
> >> 
> >> -- 
> >> Chris Richardson
> >> BP Institute
> >> Madingley Road
> >> Cambridge CB3 0EZ
> 

_______________________________________________
fenics mailing list
fenics@fenicsproject.org
http://fenicsproject.org/mailman/listinfo/fenics
