Hej, 

I've written here before and hope I'm not misusing the mailing list; however, 
I've looked a bit through the documentation and this forum and haven't really 
found a conclusive answer.

I am aiming to solve a nonlinear hyperbolic transport equation. For the 
sake of argument, let's say it reads

\mu \cdot \nabla f(x) = -f(x)^2 - 2 b(x) f(x) - a(x)

This is, of course, a Riccati equation (up to signs, possibly). In my case, 
f is a complex-valued function, but that is of little relevance here. Since 
the problem is nonlinear, I need to construct both the Jacobian and the 
residual; for now, I do that in every Newton step.
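
Concretely (and up to the signs above), the residual and its linearization read

R(f) = \mu \cdot \nabla f + f^2 + 2 b f + a
J(f) \delta f = \mu \cdot \nabla (\delta f) + 2 (f + b) \delta f

and each Newton step solves J(f_k) \delta f = -R(f_k) and then updates f_{k+1} = f_k + \delta f.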

I've managed to implement this and even got a PETSc-parallelised version to 
work, and am very happy. (I love deal.II, by the way - very impressive.) It 
doesn't scale "optimally" on my small laptop, but it's still a fine speedup 
when using MPI. So far so good.

However, I want to solve my problem for many different directions vF, and 
then extract all the solutions and do something with them. As such, my 
problem is not that I need a very large number of DOFs or huge meshes - my 
typical mesh will be on the order of 10,000 unknowns, maybe 100k, but not 
millions. Rather, I want the individual solves to be as fast as possible, 
since I need to do on the order of 100 to 10,000 of them, depending on the 
problem at hand. 

I've done some layman's benchmarking of the individual "steps" (setup, 
assembly, solve, ...) in my current version of the code. It looks as if the 
assembly takes at least a factor of ~100 (two orders of magnitude or more) 
longer than the solve.
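
The timing is nothing fancy - essentially deal.II's TimerOutput wrapped around each step, roughly like this (assemble_system() and solve() stand in for my actual routines):

#include <deal.II/base/timer.h>
#include <iostream>

void run_one_direction()
{
  dealii::TimerOutput timer(std::cout,
                            dealii::TimerOutput::summary,
                            dealii::TimerOutput::wall_times);
  {
    dealii::TimerOutput::Scope t(timer, "assembly");
    // assemble_system();  // Jacobian + residual
  }
  {
    dealii::TimerOutput::Scope t(timer, "solve");
    // solve();            // linear solve for the Newton update
  }
}  // the summary table is printed when 'timer' goes out of scope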

My question is now: what is the best strategy to speed up assembly - is 
there any experience with this? I've read about different approaches and am 
unsure which of them are promising for small-scale problems. So far I'm 
considering:

1) Using a matrix-free approach rather than PETSc - this seems to be a win 
in most cases, but it would mean rewriting large parts of the code, and I am 
not sure how much I would gain given my small system size.

2) Only assembling the Jacobian every few steps, but the residual in every 
step (a rough sketch of what I mean follows below this list). This is 
probably easier to implement. However, I know from experience with my 
problem that I pretty quickly land in a situation where I need only one or 
two Newton steps to find the solution of my nonlinear equation, so the 
saving will be small at best. 
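
To make option 2 concrete, this is roughly the loop structure I have in mind - a plain sketch with the deal.II/PETSc objects hidden behind placeholder callbacks (the names are made up):

#include <functional>

// Modified Newton: the residual is re-assembled in every step, but the
// (expensive) Jacobian is only re-assembled every 'jacobian_lag' steps
// and reused in between.
struct NewtonCallbacks
{
  std::function<double()> assemble_residual;  // assemble R, return its norm
  std::function<void()>   assemble_jacobian;  // assemble (and factorize) J
  std::function<void()>   solve_and_update;   // solve J df = -R, set f += df
};

void modified_newton(const NewtonCallbacks &cb,
                     const unsigned int     jacobian_lag = 3,
                     const unsigned int     max_steps    = 50,
                     const double           tolerance    = 1e-10)
{
  for (unsigned int step = 0; step < max_steps; ++step)
    {
      if (cb.assemble_residual() < tolerance)
        break;

      if (step % jacobian_lag == 0)  // lag the Jacobian
        cb.assemble_jacobian();

      cb.solve_and_update();
    }
}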

Is there anything else one can do? 
So far I've been using MeshWorker, which is fine and understandable to me, 
but e.g. the boundary term as used in step-12 (Example 12) computes the 
scalar product of \mu and the face normal at every quadrature point of every 
boundary face, which seems like a possible slowdown - in addition to 
generating jumps and averages on interior cell faces.
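
For instance, this is the kind of change I was wondering about for the boundary term - a rough, untested sketch following the step-12 workers (ScratchData, CopyData and the Iterator alias as defined there), assuming a position-independent direction mu (a Tensor<1,dim> captured by the lambda) and planar faces, so that mu . n is constant on each face:

const auto boundary_worker = [&](const Iterator &    cell,
                                 const unsigned int &face_no,
                                 ScratchData<dim> &  scratch_data,
                                 CopyData &          copy_data) {
  scratch_data.fe_interface_values.reinit(cell, face_no);
  const FEFaceValuesBase<dim> &fe_face =
    scratch_data.fe_interface_values.get_fe_face_values(0);

  const unsigned int n_facet_dofs = fe_face.get_fe().n_dofs_per_cell();

  // Hoisted out of the quadrature loop: mu is constant and the face is
  // planar, so the normal (and hence mu . n) is the same at every point.
  const double mu_dot_n = mu * fe_face.normal_vector(0);

  if (mu_dot_n > 0)  // outflow part of the boundary
    for (unsigned int q = 0; q < fe_face.n_quadrature_points; ++q)
      for (unsigned int i = 0; i < n_facet_dofs; ++i)
        for (unsigned int j = 0; j < n_facet_dofs; ++j)
          copy_data.cell_matrix(i, j) += mu_dot_n *
                                         fe_face.shape_value(i, q) *
                                         fe_face.shape_value(j, q) *
                                         fe_face.JxW(q);
};

I don't know whether that per-point dot product actually matters compared to the shape-value loops, though.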

 Any help is much appreciated. Sorry for the long text!
/Kev
