On Tue, May 7, 2024 at 11:31 AM Toon Moene <t...@moene.org> wrote:
>
> On 5/7/24 00:02, Toon Moene wrote:
>
> > OK, perhaps on the aarch64 I need the following option to make the
> > comparison fair:
> >
> > ‘rdma’
> >
> > Enable Round Double Multiply Accumulate instructions. This is on by
> > default for -march=armv8.1-a.
> >
> > I.e., -mno-rdma
> >
> > (I hope that's correct - I will try that when the Sun rises again and
> > I have some power to run the AArch64 machine ...).
>
> Well, I did two independent runs with gfortran-13.2 and the following
> options:
>
> -O3 -march=armv8.1-a+rdma
>
> and
>
> -O3 -march=armv8.1-a+nordma
>
> No difference in the number of error runs exceeding the prescribed
> thresholds.
>
> So, unless I made a mistake in the option specification (or the compiler
> silently ignored them because they were not applicable to my machine -
> ugh), the cause of the problem lies elsewhere.

AArch64 armv8-a has FMA as part of its base ISA, so you want to try
`-ffp-contract=off` instead. The rdma option only turns on/off
instructions that the auto-vectorizer does not use (yet); they are only
used by their intrinsics (if I read the code correctly).

Thanks,
Andrew Pinski

>
> Kind regards,
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
>