You can test a benchmark problem with both -O2 and -O3. It probably won't make much difference with the solver configuration you've selected, because most of those operations are limited by memory bandwidth.
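
To see what "memory bandwidth limited" means for the optimization level, here is a minimal sketch of a STREAM-style triad timing. It is only a stand-in for the vector and sparse-matrix kernels inside the solver, not the GMRES+BoomerAMG solve itself. The file name triad.c, the functions triad and time_triad, and the TRIAD_MAIN macro are all hypothetical names made up for this illustration.

```c
/* triad.c: hypothetical micro-benchmark, not part of PETSc or hypre. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* STREAM-style triad: a[i] = b[i] + s * c[i].
   Two loads and one store for two flops, so on large arrays the loop
   is normally bounded by memory bandwidth rather than arithmetic. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + s * c[i];
}

/* Run `reps` sweeps over arrays of length n and return the CPU time in
   seconds (from clock()), or -1.0 if allocation fails. The output of each
   sweep feeds the next one, so no sweep is redundant. The first entry of
   the final result is stored in *check. */
double time_triad(size_t n, int reps, double *check)
{
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) { free(a); free(b); free(c); return -1.0; }
    for (size_t i = 0; i < n; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        triad(a, b, c, 0.5, n);
        /* Swap so the result just written becomes the next input. */
        double *tmp = a; a = b; b = tmp;
    }
    clock_t t1 = clock();

    *check = b[0]; /* after the last swap, b holds the latest result */
    free(a); free(b); free(c);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

#ifdef TRIAD_MAIN
int main(void)
{
    double check;
    /* 20 million doubles per array (about 160 MB each), far larger than cache. */
    double secs = time_triad(20000000, 20, &check);
    if (secs < 0.0) { fprintf(stderr, "allocation failed\n"); return 1; }
    printf("check = %g, time = %.3f s\n", check, secs);
    return 0;
}
#endif
```

Assuming GCC or Clang, you could build it twice, for example with "cc -O2 -DTRIAD_MAIN triad.c -o triad_O2" and "cc -O3 -march=native -DTRIAD_MAIN triad.c -o triad_O3", and compare the timings. For the real application, running your own benchmark problem under each build with PETSc's -log_view option is the more meaningful comparison.
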
If your residual and Jacobian assembly code is written to vectorize, you may get significant benefit from architecture-specific optimizations like -march=skylake.

Alfredo Jaramillo <ajaramillopa...@gmail.com> writes:

> Dear community,
>
> We are in the middle of testing a simulator where the main computational
> bottleneck is solving a linear problem. We do this by calling
> GMRES+BoomerAMG through PETSc.
>
> This is a commercial code, intended to serve clients with workstations or
> with access to clusters.
>
> Would you recommend O3 versus O2 optimizations? Maybe just to compile the
> linear algebra libraries?
>
> Some years ago, I worked on another project where going back to O2 solved a
> weird runtime error that I was never able to solve. This triggers my
> distrust.
>
> Thank you for your time!
> Alfredo