Dear Corbin,
I mostly used callgrind to identify bottlenecks.
Currently, our matrix assembly and RHS assembly codes use virtual function
calls at the cell level. I think this is a good compromise. We keep all
the flexibility, but we have not seen a drop in performance. We use the
WorkStream pa
Hello everyone,
I'm encountering a similar question for a large 3D incompressible fluid
solver with many assembly options (which I implement via inheritance). I'd
like to investigate how much class/templated inheritance plays a role in
performance---Bruno, how do you go about profiling something
That's interesting. Seems like it was more about inlining than branch
prediction. Surprising how much difference it made.
On Tuesday, January 5, 2021 at 6:18:38 PM UTC-5 Timo Heister wrote:
> What I forgot to say:
> We used to have something like
>
> if (use_anisotropic_viscosity==true)
> cell(i
Hello Prof. Blais,
Optimizing code is always fun, so I've had this discussion multiple times
with a colleague. Dr. Turcksin's comment was also our conclusion. Templating
seems to be the way to go, but only on options/variations where
mispredicted branches actually slow down your performance since
What I forgot to say:
We used to have something like
  if (use_anisotropic_viscosity==true)
    cell(i,j) += viscosity_tensor *
  else
    cell(i,j) += viscosity_constant *
and improved the performance by making two separate assemblers (note
that there is no function call/vtable lookup here, ju
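
To illustrate the idea of moving the if out of the inner loop, here is a minimal sketch. It is not the ASPECT code: LocalMatrix, ViscosityData, the weight argument and all function names are hypothetical stand-ins, and the viscosity "tensor" is reduced to one factor per matrix entry. The flag becomes a template parameter, so each instantiation contains only one of the two updates, and the run-time decision is taken once per call instead of once per (i,j).

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for a local (cell) matrix and the viscosity data.
constexpr std::size_t n_dofs = 2;
using LocalMatrix = std::array<std::array<double, n_dofs>, n_dofs>;

struct ViscosityData
{
  double      constant; // isotropic case: one scalar
  LocalMatrix tensor;   // anisotropic case: toy model, one factor per entry
};

// Variant A: the run-time flag is tested inside the innermost loop.
void assemble_branching(LocalMatrix         &cell,
                        const ViscosityData &v,
                        const double         weight,
                        const bool           use_anisotropic_viscosity)
{
  for (std::size_t i = 0; i < n_dofs; ++i)
    for (std::size_t j = 0; j < n_dofs; ++j)
      {
        if (use_anisotropic_viscosity == true)
          cell[i][j] += v.tensor[i][j] * weight;
        else
          cell[i][j] += v.constant * weight;
      }
}

// Variant B: the flag is a template parameter. Inside each instantiation
// the condition is a compile-time constant, so the compiler can drop the
// branch that is never taken.
template <bool anisotropic>
void assemble(LocalMatrix &cell, const ViscosityData &v, const double weight)
{
  for (std::size_t i = 0; i < n_dofs; ++i)
    for (std::size_t j = 0; j < n_dofs; ++j)
      {
        if (anisotropic)
          cell[i][j] += v.tensor[i][j] * weight;
        else
          cell[i][j] += v.constant * weight;
      }
}

// The run-time decision is made once, outside the loops.
void assemble_dispatch(LocalMatrix         &cell,
                       const ViscosityData &v,
                       const double         weight,
                       const bool           use_anisotropic_viscosity)
{
  if (use_anisotropic_viscosity)
    assemble<true>(cell, v, weight);
  else
    assemble<false>(cell, v, weight);
}
```

Both variants compute the same matrix. Whether variant B is actually faster depends on the compiler and on the real kernel, so it should be checked with a profiler.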
Hi Timo,
I understand. It makes a lot of sense.
Thanks!
Bruno
On Tuesday, January 5, 2021 at 4:34:19 p.m. UTC-5 Timo Heister wrote:
> Hi Bruno,
>
> We mitigate the performance problem by making the decision per cell in
> ASPECT:
> We have a set of "assemblers" that are executed one after each o
Hi Bruno,
We mitigate the performance problem by making the decision per cell in ASPECT:
We have a set of "assemblers" that are executed one after another
per cell. This means the vtable access cost is small compared to the
actual work.
See
https://github.com/geodynamics/aspect/blob/b9add5f5317
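
The per-cell approach described above could look roughly like the following sketch. This is not the ASPECT interface: CellData, AssemblerBase, SourceTerm and assemble_cell are hypothetical names, and the local right-hand side is simplified to one entry per quadrature point. The point is only that there is one virtual call per assembler per cell, with the quadrature loop inside the call.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-cell scratch data.
struct CellData
{
  std::vector<double> JxW;       // quadrature weight times Jacobian
  std::vector<double> local_rhs; // simplified: one entry per quadrature point
};

// Base class: the virtual call happens once per cell, not once per
// quadrature point or per matrix entry.
class AssemblerBase
{
public:
  virtual ~AssemblerBase() = default;
  virtual void execute(CellData &cell) const = 0;
};

// One example term. Further terms would be additional derived classes.
class SourceTerm : public AssemblerBase
{
public:
  explicit SourceTerm(const double value)
    : value(value)
  {}

  void execute(CellData &cell) const override
  {
    // The actual work: a plain loop with no virtual calls inside.
    for (std::size_t q = 0; q < cell.JxW.size(); ++q)
      cell.local_rhs[q] += value * cell.JxW[q];
  }

private:
  double value;
};

// Run every active assembler on the current cell, one after another.
void assemble_cell(
  const std::vector<std::unique_ptr<AssemblerBase>> &assemblers,
  CellData                                          &cell)
{
  for (const auto &assembler : assemblers)
    assembler->execute(cell);
}
```

Because each execute() loops over all quadrature points, the cost of the vtable lookup is paid once per cell and per assembler, which is what keeps it small relative to the work done.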
Bruno,
Thanks, you are right. As always, measure first and then optimize. No
point in optimizing something that costs nothing...
On Tuesday, January 5, 2021 at 3:15:06 p.m. UTC-5 bruno.t...@gmail.com
wrote:
> Bruno,
>
> If you are worried about the cost of looking up through the vtable, I thin
Bruno,
If you are worried about the cost of looking up through the vtable, I think
that you are stuck using templates. So either use 2 or 3 and CRTP. But first
of all, I think that you should profile your code and verify that this is a
problem. There is no point in spending time refactoring your co
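
For reference, a minimal CRTP sketch of the kind mentioned above. All class and function names here are hypothetical and the "assembly" is reduced to a weighted sum. The base class calls into the derived class through a static_cast, so the call is resolved at compile time and no vtable is involved.

```cpp
#include <cstddef>
#include <vector>

// CRTP base: Derived must provide
//   double coefficient(std::size_t q) const;
template <typename Derived>
class AssemblerCRTP
{
public:
  double assemble(const std::vector<double> &JxW) const
  {
    // Static dispatch: the cast is resolved at compile time, so the
    // call to coefficient() can be inlined into the loop.
    const Derived &self = static_cast<const Derived &>(*this);

    double sum = 0.0;
    for (std::size_t q = 0; q < JxW.size(); ++q)
      sum += self.coefficient(q) * JxW[q];
    return sum;
  }
};

// Variation 1: constant coefficient.
class ConstantCoefficient : public AssemblerCRTP<ConstantCoefficient>
{
public:
  explicit ConstantCoefficient(const double c)
    : c(c)
  {}

  double coefficient(const std::size_t /*q*/) const
  {
    return c;
  }

private:
  double c;
};

// Variation 2: coefficient that depends on the quadrature point index.
class IndexCoefficient : public AssemblerCRTP<IndexCoefficient>
{
public:
  double coefficient(const std::size_t q) const
  {
    return static_cast<double>(q);
  }
};
```

The trade-off is that the variation must be known at compile time, so a run-time option still needs one branch somewhere to pick the instantiation.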