Hi,

I've written here before and hope I'm not misusing the mailing list. I've searched the documentation and this list for a while but haven't found a conclusive answer.

I am aiming to solve a nonlinear hyperbolic transport equation. For the sake of argument, let's say it reads

    \mu \cdot \nabla f(x) = - f(x)^2 - 2 b(x) f(x) - a(x)

which is, of course, a Riccati equation (up to signs, possibly). In my case f is a complex function, but that is of little relevance here.

Since the problem is nonlinear, I need to construct both the Jacobian and the residual. For starters, I do that in every Newton step. I've managed to implement this and even got a PETSc-parallelised version to work, and I am very happy (I love deal.II, by the way - very impressive). It does not scale "optimally" on my small laptop, but MPI still gives a fine speedup. So far so good.

However, I want to solve my problem for many different directions vF, then extract all the solutions and do something with them. So my problem is not that I need a very large number of DOFs or huge meshes: my typical mesh will have on the order of 10000 unknowns, maybe 100k, but not millions. Rather, I want each individual solve to be as fast as possible, since I need on the order of 100-10000 of them, depending on the problem at hand.

I've done some layman's benchmarking of the individual steps (setup, assembly, solve, ...) in my current version of the code. It looks as if assembly takes orders of magnitude longer than the solve (a factor of ~100 at least).

My question is now: what is the best strategy to speed up assembly, and is there any experience with this? I've read about different approaches and am confused about which are promising for small-scale problems. So far I'm considering:

1) Using a matrix-free approach rather than PETSc. This seems to be a win in most cases, but it would require rewriting large parts of the code, and I am not sure I would gain much given my small system size.

2) Assembling the Jacobian only every few steps, but the residual in every step. This is probably easier to implement.
I know from experience with my problem that I quickly reach a situation where only one or two Newton steps are needed to solve the nonlinear equation, so the savings from option 2 will be small at best.

Is there anything else one can do? So far I've been using MeshWorker, which is fine and understandable to me. However, the boundary term as used in Example 12, for instance, evaluates the scalar product of \mu and the edge normal on each boundary element, which seems like a possible slowdown, in addition to computing jumps and averages on inner cell edges.

Any help is much appreciated. Sorry for the long text!

/Kev
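
For illustration, option 2 (reusing the Jacobian for several Newton steps while recomputing the residual in every step) can be sketched on a toy problem. This is not deal.II code and not the poster's implementation. It assumes a real-valued 1D version of the equation with \mu = 1, constant a and b, and an implicit upwind finite-difference discretisation, for which the Jacobian is lower bidiagonal and can be solved by forward substitution. All names (Problem, assemble_residual, assemble_jacobian, solve, refresh_every) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy stand-in for the PDE (assumption: 1D, mu = 1, real-valued f):
//   f'(x) = -f^2 - 2 b f - a  on [0,1],  f(0) = f0,
// discretised with implicit upwind differences on n points:
//   R_i(f) = (f_i - f_{i-1})/h + f_i^2 + 2 b f_i + a = 0.
struct Problem {
  double a, b, f0;
  std::size_t n;
  double h() const { return 1.0 / static_cast<double>(n); }
};

// Residual, recomputed in every Newton step. Returns max_i |R_i|.
double assemble_residual(const Problem &p, const std::vector<double> &f,
                         std::vector<double> &r) {
  double norm = 0.0;
  for (std::size_t i = 0; i < p.n; ++i) {
    const double upwind = (i == 0) ? p.f0 : f[i - 1];
    r[i] = (f[i] - upwind) / p.h() + f[i] * f[i] + 2.0 * p.b * f[i] + p.a;
    norm = std::fmax(norm, std::fabs(r[i]));
  }
  return norm;
}

// Jacobian diagonal dR_i/df_i = 1/h + 2 (f_i + b). The sub-diagonal
// dR_i/df_{i-1} = -1/h does not depend on f and is never reassembled.
void assemble_jacobian(const Problem &p, const std::vector<double> &f,
                       std::vector<double> &diag) {
  for (std::size_t i = 0; i < p.n; ++i)
    diag[i] = 1.0 / p.h() + 2.0 * (f[i] + p.b);
}

struct Result {
  std::vector<double> f;
  int iterations;
  int jacobian_assemblies;
  double residual;
};

// Newton iteration that reassembles the Jacobian only every
// `refresh_every` steps; refresh_every == 1 is the standard Newton method.
Result solve(const Problem &p, int refresh_every, double tol = 1e-10,
             int max_it = 100) {
  assert(refresh_every >= 1);
  Result res{std::vector<double>(p.n, p.f0), 0, 0, 0.0};
  std::vector<double> r(p.n), diag(p.n), delta(p.n);
  res.residual = assemble_residual(p, res.f, r);
  while (res.residual >= tol && res.iterations < max_it) {
    if (res.iterations % refresh_every == 0) {
      assemble_jacobian(p, res.f, diag);
      ++res.jacobian_assemblies;
    }
    // Forward substitution for the lower-bidiagonal system J delta = -R.
    for (std::size_t i = 0; i < p.n; ++i) {
      const double prev = (i == 0) ? 0.0 : delta[i - 1];
      delta[i] = (-r[i] + prev / p.h()) / diag[i];
    }
    for (std::size_t i = 0; i < p.n; ++i)
      res.f[i] += delta[i];
    ++res.iterations;
    res.residual = assemble_residual(p, res.f, r);
  }
  return res;
}
```

Calling solve(p, 1) and solve(p, 3) on the same Problem should give the same solution up to the tolerance, the second with fewer Jacobian assemblies but possibly more iterations. Whether that trade pays off in a real code depends on how expensive the Jacobian assembly is relative to the residual, and, as noted above, on how many Newton steps are needed in the first place.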