On Fri, May 09, 2014 at 03:27:20PM +0200, Martin Sandve Alnæs wrote:
> Hi all,
> I've implemented selective local evaluation of coefficient functions in the
> assembler depending on which functions each integral depends on. It's
> currently in branches called
> martinal/topic-add-enabled-coefficients-per-integral
> in ufl, ffc and dolfin (must be used together).
> Note that this changes the ufc interface so everything must be recompiled.
>
> To show the performance improvement, here's a simple benchmark script,
> assembling two forms (called a and b) that depend on one and two
> coefficients (f and (f and g) respectively) but yield the exact same
> integral and assembly result when assembled without any subdomains (the
> dx(1) term in form b is never executed). Each form is assembled twice for
> semi-robust timing and I first ran the script to keep the jit out of the
> picture. (Performance numbers below the code).
>
>
> from dolfin import *
> import time
>
> n = 60
> mesh = UnitCubeMesh(n, n, n)
> V = FunctionSpace(mesh, "Lagrange", 1)
> f = Function(V)
> g = Function(V)
>
> a = f*dx()
> b = f*dx() + g*dx(1)
>
> t1 = time.time()
> A1 = assemble(a)
> t2 = time.time()
> A2 = assemble(a)
> t3 = time.time()
>
> print "A1:", (t2-t1)
> print "A2:", (t3-t2)
>
> t1 = time.time()
> B1 = assemble(b)
> t2 = time.time()
> B2 = assemble(b)
> t3 = time.time()
>
> print "B1:", (t2-t1)
> print "B2:", (t3-t2)
>
>
> Resulting time to assemble with current master:
>
> A1: 0.467525005341
> A2: 0.465034008026
> B1: 0.882906198502
> B2: 0.830652952194
>
> Note how the additional coefficient in form b gives very significant
> overhead for this simple functional even though it's never used in the
> computations.
>
> The time to assemble with the new branches:
>
> A1: 0.531542062759
> A2: 0.530611991882
> B1: 0.540424108505
> B2: 0.535769939423
>
> Note two things:
> - The performance is a bit lower for the simple case. It might be possible
>   to optimize this.
> - The performance is the same for both cases, significantly faster for
>   form b because the function g is never restricted.
>
>
> The cases that will benefit from this feature performance-wise are forms
> with two or more integrals involving different coefficients.
>
> The cases that will have a small regression performance-wise are forms with
> only one integral, with no coefficients, or where all integrals use the
> same coefficients. The relative performance regression is most noticeable
> for simple forms such as mass and stiffness matrices.
>
> There are multiple future features that depend on this functionality:
> - it allows for functions that cannot be evaluated everywhere to be called
>   only in their valid domain (examples are functions only living on
>   subdomains, a partially overlapping mesh, or the boundary).
> - possible refactoring of preprocessing in ufl to reduce the amount of
>   symbolic processing done for forms that are already in the jit cache.
>
> The functionality is obviously highly beneficial, so is it ok if I push it
> now even with the performance regression for simple forms?

Could you first check what the performance regression is (if any) for
assembling a standard right-hand side vector f*dx and Poisson stiffness
matrix? Perhaps this is only noticeable for functionals.

--
Anders
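
For reference, a rough sketch of how the check requested above might be run on master and on the new branches. It reuses the mesh size and the assemble-twice approach from the benchmark script in the quoted message. The helper names time_repeated and run_benchmarks are made up for illustration, and the right-hand side is written as f*v*dx with a test function v, which is an assumption about what "right-hand side vector f*dx" means here.

```python
import time


def time_repeated(assemble_form, repeats=2):
    """Call assemble_form() `repeats` times; return each wall-clock duration."""
    timings = []
    for _ in range(repeats):
        start = time.time()
        assemble_form()
        timings.append(time.time() - start)
    return timings


def run_benchmarks(n=60):
    """Time assembly of a load vector and a Poisson stiffness matrix."""
    # Imported here so time_repeated stays usable without DOLFIN installed.
    from dolfin import (UnitCubeMesh, FunctionSpace, Function, TrialFunction,
                        TestFunction, assemble, dx, grad, inner)

    mesh = UnitCubeMesh(n, n, n)
    V = FunctionSpace(mesh, "Lagrange", 1)
    f = Function(V)
    u = TrialFunction(V)
    v = TestFunction(V)

    forms = {
        # Assumed meaning of "right-hand side vector f*dx": one coefficient.
        "rhs vector": f*v*dx,
        # Poisson stiffness matrix: no coefficients at all.
        "stiffness matrix": inner(grad(u), grad(v))*dx,
    }

    results = {}
    for name, form in forms.items():
        # Bind the form as a default argument so each lambda keeps its own form.
        results[name] = time_repeated(lambda form=form: assemble(form))
    return results
```

As in the original script, running it once beforehand keeps the jit out of the picture; the dictionary returned by run_benchmarks() on each branch can then be compared directly.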
