Re: [deal.II] Re: Tips on writing "versatile" assembly function

2023-06-11 Thread blais...@gmail.com
Dear Corbin, I mostly used callgrind to identify bottlenecks. Currently, our matrix assembly and RHS assembly codes use virtual function calls at the cell level. I think this strikes a good balance: we keep all the flexibility, but we have not seen a drop in performance. We use the WorkStream pa

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2023-06-11 Thread Corbin Foucart
Hello everyone, I'm encountering a similar question for a large 3D incompressible fluid solver with many assembly options (which I implement via inheritance). I'd like to investigate how much class/templated inheritance plays a role in performance---Bruno, how do you go about profiling something

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread Doug Shi-Dong
That's interesting. Seems like it was more about inlining than branch prediction. Surprising how much difference it made. On Tuesday, January 5, 2021 at 6:18:38 PM UTC-5 Timo Heister wrote: > What I forgot to say: > We used to have something like > > if (use_anisotropic_viscosity==true) > cell(i

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread Doug Shi-Dong
Hello Prof. Blais, Optimizing code is always fun, so I've had this discussion multiple times with a colleague. Dr. Turcksin's comment was also our conclusion. Templating seems to be the way to go, but only on options/variations where mispredicted branches actually slow down your performance since

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread Timo Heister
What I forgot to say: We used to have something like if (use_anisotropic_viscosity==true) cell(i,j) += viscosity_tensor * else cell(i,j) += viscosity_constant * and improved the performance by making two separate assemblers (note that there is no function call/vtable lookup here, ju
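
The change Timo describes, moving a flag test out of the innermost loop by writing two separate assemblers, might look roughly like the sketch below. This is not ASPECT code: `CellMatrix`, the scalar stand-ins for the viscosity, the `phi[i] * phi[j]` integrand, and all function names are hypothetical, chosen only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a local (cell) matrix, stored row-major.
struct CellMatrix
{
  explicit CellMatrix(const std::size_t n_) : n(n_), data(n_ * n_, 0.0) {}
  double &operator()(const std::size_t i, const std::size_t j)
  {
    return data[i * n + j];
  }
  std::size_t         n;
  std::vector<double> data;
};

// Before: the option is tested inside the innermost loop.
void assemble_with_branch(CellMatrix &cell, const std::vector<double> &phi,
                          const bool   use_anisotropic_viscosity,
                          const double viscosity_constant,
                          const double viscosity_tensor)
{
  assert(phi.size() == cell.n);
  for (std::size_t i = 0; i < cell.n; ++i)
    for (std::size_t j = 0; j < cell.n; ++j)
      {
        if (use_anisotropic_viscosity)
          cell(i, j) += viscosity_tensor * phi[i] * phi[j];
        else
          cell(i, j) += viscosity_constant * phi[i] * phi[j];
      }
}

// After: one assembler per variant, each with a branch-free loop body.
void assemble_isotropic(CellMatrix &cell, const std::vector<double> &phi,
                        const double viscosity_constant)
{
  assert(phi.size() == cell.n);
  for (std::size_t i = 0; i < cell.n; ++i)
    for (std::size_t j = 0; j < cell.n; ++j)
      cell(i, j) += viscosity_constant * phi[i] * phi[j];
}

void assemble_anisotropic(CellMatrix &cell, const std::vector<double> &phi,
                          const double viscosity_tensor)
{
  assert(phi.size() == cell.n);
  for (std::size_t i = 0; i < cell.n; ++i)
    for (std::size_t j = 0; j < cell.n; ++j)
      cell(i, j) += viscosity_tensor * phi[i] * phi[j];
}

// The decision is now made once per cell instead of once per (i, j) pair.
void assemble_split(CellMatrix &cell, const std::vector<double> &phi,
                    const bool   use_anisotropic_viscosity,
                    const double viscosity_constant,
                    const double viscosity_tensor)
{
  if (use_anisotropic_viscosity)
    assemble_anisotropic(cell, phi, viscosity_tensor);
  else
    assemble_isotropic(cell, phi, viscosity_constant);
}
```

Both variants compute the same local matrix. Whether the split version is measurably faster depends on the compiler and the real loop body, so it is worth profiling before and after.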

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread blais...@gmail.com
Hi Timo, I understand. It makes a lot of sense. Thanks! Bruno On Tuesday, January 5, 2021 at 4:34:19 p.m. UTC-5 Timo Heister wrote: > Hi Bruno, > > We mitigate the performance problem by making the decision per cell in > ASPECT: > We have a set of "assemblers" that are executed one after each o

Re: [deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread Timo Heister
Hi Bruno, We mitigate the performance problem by making the decision per cell in ASPECT: We have a set of "assemblers" that are executed one after each other per cell. This means the vtable access cost is small compared to the actual work. See https://github.com/geodynamics/aspect/blob/b9add5f5317
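
A minimal sketch of the pattern described here, a list of assemblers run one after another on each cell with one virtual call per assembler per cell, could look as follows. It is not the ASPECT interface: `CellData`, `AssemblerBase`, `ScaledWeightTerm`, `ConstantShiftTerm`, and `assemble_cell` are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-cell data: one quadrature weight (JxW) per point.
struct CellData
{
  std::vector<double> JxW;
};

// Interface: one virtual call per assembler per cell.
class AssemblerBase
{
public:
  virtual ~AssemblerBase() = default;
  virtual void execute(const CellData &cell,
                       std::vector<double> &local) const = 0;
};

// Hypothetical term: adds coefficient * JxW[q] at every quadrature point.
class ScaledWeightTerm : public AssemblerBase
{
public:
  explicit ScaledWeightTerm(const double c) : coefficient(c) {}
  void execute(const CellData &cell,
               std::vector<double> &local) const override
  {
    assert(local.size() == cell.JxW.size());
    // The loop over quadrature points runs inside the virtual function,
    // so the dispatch cost is paid once per cell, not once per point.
    for (std::size_t q = 0; q < cell.JxW.size(); ++q)
      local[q] += coefficient * cell.JxW[q];
  }

private:
  double coefficient;
};

// Hypothetical term: adds a constant at every quadrature point.
class ConstantShiftTerm : public AssemblerBase
{
public:
  explicit ConstantShiftTerm(const double v) : value(v) {}
  void execute(const CellData &cell,
               std::vector<double> &local) const override
  {
    assert(local.size() == cell.JxW.size());
    for (std::size_t q = 0; q < local.size(); ++q)
      local[q] += value;
  }

private:
  double value;
};

// Run every registered assembler, in order, on a single cell.
std::vector<double>
assemble_cell(const std::vector<std::unique_ptr<AssemblerBase>> &assemblers,
              const CellData &cell)
{
  std::vector<double> local(cell.JxW.size(), 0.0);
  for (const auto &assembler : assemblers)
    assembler->execute(cell, local);
  return local;
}
```

Options are then selected at run time by choosing which assemblers go into the list, while the inner loops stay free of virtual calls.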

[deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread blais...@gmail.com
Bruno, Thanks, you are right. As always, measure first and then optimize after. No point in optimising stuff that costs nothing... On Tuesday, January 5, 2021 at 3:15:06 p.m. UTC-5 bruno.t...@gmail.com wrote: > Bruno, > > If you are worried about the cost of looking up through the vtable, I thin

[deal.II] Re: Tips on writing "versatile" assembly function

2021-01-05 Thread Bruno Turcksin
Bruno, If you are worried about the cost of looking up through the vtable, I think that you are stuck using templates. So either use 2 or 3 and CRTP. But first of all, I think that you should profile your code and verify that this is a problem. There is no point in spending time refactoring your co
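
For readers unfamiliar with CRTP (the curiously recurring template pattern) mentioned above, here is a minimal sketch of how it replaces a vtable lookup with a call resolved at compile time. The class names, the `coefficient(q)` hook, and the `integrate` function are hypothetical and are not taken from the original post or from deal.II.

```cpp
#include <cstddef>
#include <vector>

// CRTP base: the derived type is a template parameter, so the call to
// coefficient() is resolved at compile time and can be inlined.
template <typename Derived>
class AssemblerCRTP
{
public:
  // Sum coefficient(q) * JxW[q] over all quadrature points of one cell.
  double integrate(const std::vector<double> &JxW) const
  {
    const Derived &self = static_cast<const Derived &>(*this);
    double         sum  = 0.0;
    for (std::size_t q = 0; q < JxW.size(); ++q)
      sum += self.coefficient(q) * JxW[q];
    return sum;
  }
};

// Hypothetical variant 1: the same coefficient at every point.
class ConstantCoefficient : public AssemblerCRTP<ConstantCoefficient>
{
public:
  explicit ConstantCoefficient(const double v) : value(v) {}
  double coefficient(const std::size_t /*q*/) const { return value; }

private:
  double value;
};

// Hypothetical variant 2: a coefficient that depends on the point index.
class LinearCoefficient : public AssemblerCRTP<LinearCoefficient>
{
public:
  double coefficient(const std::size_t q) const
  {
    return 1.0 + static_cast<double>(q);
  }
};

// Generic code accepts any CRTP assembler without virtual functions.
template <typename Derived>
double integrate_cell(const AssemblerCRTP<Derived> &assembler,
                      const std::vector<double>    &JxW)
{
  return assembler.integrate(JxW);
}
```

The trade-off is that the concrete type must be known at compile time, so a run-time option still needs a dispatch somewhere above this level. As the message says, profiling first is the way to find out whether that cost matters at all.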