On Sat 2008-05-17 22:11, Murtazo Nazarov wrote:
> > On Sat 2008-05-17 21:19, Murtazo Nazarov wrote:
> >> This is from 3D Navier-Stokes equation with linear elements. The first
> >> time assembly takes longer because it does initialization of the matrix,
> >> it calculates the sparsity pattern. But, then the sparsity pattern and
> >> A.init() is not done. Maybe I cannot gain so much, but as I posted in
> >> previous posts, I did exactly the same test with another package, with
> >> the same mesh, the same elements and the same equations, and that was
> >> at least 3 times faster than assembly we have in dolfin. So, instead of
> >> 7 seconds it is spent 7/3 seconds which I can gain 10 days from my
> >> simulation which takes now 14 days.
> >
> > I don't understand. Don't you solve a system with these matrices? Your
> > numbers indicate that the solve takes negative time.
>
> What do you mean? In the test I sent you I do not solve the system, it was
> just testing the assembly process.

Presumably your 14-day run isn't just assembling matrices and throwing them
away. What fraction of the real runtime is spent in assembly?

> > If you want to speed up the insertion of element matrices, I think the
> > correct way is to do clever caching so that, for instance, an entire row
> > contribution can be inserted at once. Anders mentioned this earlier in
> > this thread. My understanding is that computing the element matrices is
> > supposed to be very fast in Dolfin since it uses FFC-optimized code. Can
> > you profile and see exactly where the time is being spent? Is it in
> > FFC-generated code or in insertion?
> >
>
> I think that the FFC is pretty fast and there is no problem with that so
> far. The element matrices is calculated fast. I did profiling and posted
> previously, here it is again:
>
> Dolfin::Assembler::assembleCells:
>
> 1. tabulate_tensor for bilinearform of Momentum in NSE: 6.04%
>    tabulate_tensor for linearform of Momentum in NSE: 11.98%
>
> 2. Dolfin::GeneriMatrix::add: 68.98%
>
> 3. Dolfin::Function::Interpolate: 9.05%
>
> The most time is spent to add, which calls MatSetValues.

Right, it is in insertion. I guess we are back to better caching. One
enhancement would be to order the element matrices so that the column indices
are increasing. MatSetValues_SeqAIJ just inserts entries directly into the
system matrix. If the columns are out of order, it resets the lower bound in a
binary search. To get the fastest insertion, Dolfin could cache several nearby
elements so as to insert entire rows at once with the columns sorted. A first
step would be to just sort the columns in the element matrices.

Jed
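
A minimal sketch of that first step (sorting the columns of one element matrix
before it is inserted) might look like the following. It is not DOLFIN or PETSc
code: the helper name sort_element_columns is hypothetical, the element matrix
is assumed to be dense and stored row-major, and plain int is assumed for the
index type (PETSc's PetscInt may differ depending on the build).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of DOLFIN or PETSc): sort the global column
// indices of a dense, row-major element matrix into increasing order and
// permute the entries of every row to match.
void sort_element_columns(std::vector<int>& cols,
                          std::vector<double>& values,
                          std::size_t nrows)
{
  const std::size_t ncols = cols.size();
  assert(values.size() == nrows * ncols);

  // perm[k] is the old position of the k-th smallest column index.
  std::vector<std::size_t> perm(ncols);
  std::iota(perm.begin(), perm.end(), std::size_t(0));
  std::stable_sort(perm.begin(), perm.end(),
                   [&cols](std::size_t a, std::size_t b)
                   { return cols[a] < cols[b]; });

  // Apply the same permutation to the indices and to each row of values.
  std::vector<int> sorted_cols(ncols);
  std::vector<double> sorted_values(values.size());
  for (std::size_t k = 0; k < ncols; ++k)
  {
    sorted_cols[k] = cols[perm[k]];
    for (std::size_t i = 0; i < nrows; ++i)
      sorted_values[i * ncols + k] = values[i * ncols + perm[k]];
  }

  cols.swap(sorted_cols);
  values.swap(sorted_values);
}
```

The sorted cols and values arrays would then be handed to MatSetValues in place
of the unsorted ones. Whether the sort pays for itself on small element
matrices is something only profiling can tell; the larger gain suggested above
would come from caching several nearby elements and inserting whole rows.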
_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev
