[Bug middle-end/110832] [14 Regression] 14% capacita -O2 regression between g:9fdbd7d6fa5e0a76 (2023-07-26 01:45) and g:ca912a39cccdd990 (2023-07-27 03:44) on zen3 and core

ubizjak at gmail dot com via Gcc-bugs Fri, 28 Jul 2023 08:46:41 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832


--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #6)
> Do we know whether we could in theory improve the sanitizing by optimization
> without -funsafe-math-optimizations (I think -fno-trapping-math,
> -ffinite-math-only -fno-signalling-nans should be a better guard?)?

Regarding the sanitizing, we can remove all sanitizing MOVQ instructions
between trapping instructions (IOW, the result of ADDPS is guaranteed to have
zeros in the high part outside V2SF, so MOVQ is unnecessary in front of a
follow-up MULPS).

I think that some instruction back-walking pass on the RTL insn stream would be
able to identify these unnecessary instructions and remove them.
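
To make the idea concrete, here is a minimal sketch of such a back-walk over a
toy instruction stream. It is not GCC's RTL API: the toy_* names, the opcode
set, and the restriction to a single basic block and to in-place sanitizers
(dst == src) are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy opcodes; this is NOT GCC's RTL representation. */
enum toy_op {
    TOY_MOVQ_LOAD, /* MOVQ from memory: high part of dst is zeroed */
    TOY_MOVQ_SAN,  /* sanitizing MOVQ reg,reg: high part of dst is zeroed */
    TOY_ADDPS,     /* dst = src1 + src2, lane-wise */
    TOY_MULPS,     /* dst = src1 * src2, lane-wise */
    TOY_OTHER      /* anything else: high part of dst is unknown */
};

struct toy_insn {
    enum toy_op op;
    int dst, src1, src2; /* register numbers; unused sources are -1 */
    bool deleted;
};

/* Walk backwards from insn IDX to the nearest definition of REG and
   report whether that definition guarantees a zero high part.
   Assumes straight-line code (one basic block).  */
static bool toy_high_zero_before(const struct toy_insn *insns, size_t idx,
                                 int reg)
{
    for (size_t j = idx; j-- > 0; ) {
        if (insns[j].deleted || insns[j].dst != reg)
            continue;
        switch (insns[j].op) {
        case TOY_MOVQ_LOAD:
        case TOY_MOVQ_SAN:
            return true;
        case TOY_ADDPS:
        case TOY_MULPS:
            /* 0 + 0 and 0 * 0 are 0, so zero high parts in both
               inputs give a zero high part in the result.  */
            return toy_high_zero_before(insns, j, insns[j].src1)
                   && toy_high_zero_before(insns, j, insns[j].src2);
        default:
            return false;
        }
    }
    /* No definition found in this block: the high part is unknown.  */
    return false;
}

/* Mark in-place sanitizing MOVQs as deleted when the high part of their
   source is already known to be zero.  Returns the number removed.  */
size_t toy_remove_redundant_movq(struct toy_insn *insns, size_t n)
{
    assert(insns != NULL || n == 0);
    size_t removed = 0;
    for (size_t i = 0; i < n; i++) {
        if (insns[i].op == TOY_MOVQ_SAN
            && insns[i].dst == insns[i].src1
            && toy_high_zero_before(insns, i, insns[i].src1)) {
            insns[i].deleted = true;
            removed++;
        }
    }
    return removed;
}
```

In this model, a sanitizer that sits between an ADDPS and a MULPS whose inputs
all came from MOVQ loads is dropped, while one that follows an instruction with
an unknown high part is kept. A real pass would also have to handle control
flow and sanitizers that copy into a different register.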

Also, as mentioned elsewhere, it is really hard to get a non-zero value into
the high part of an XMM register. The compiler takes great care to always load
values via MOVQ, so one has to craft special code that works around all these
fences. OTOH, in the two years since gcc-11 was released with V2SF support, not
a single PR involving spurious exceptions was reported. Even the capacita
benchmark enables:

Note: The following floating-point exceptions are signalling:
IEEE_UNDERFLOW_FLAG IEEE_DENORMAL

without problems.

As an example here, it looks like polyhedron capacita greatly benefits from
V2SF vectors, and I was surprised that the sanitizing MOVQ has such an effect
here.
