https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279

--- Comment #22 from Michael_S <already5chosen at yahoo dot com> ---
(In reply to Michael_S from comment #8)
> (In reply to Thomas Koenig from comment #6)
> > And there will have to be a decision about 32-bit targets.
> >
> 
> IMHO, 32-bit targets should be left in their current state.
> People that use them probably do not care deeply about performance.
> Technically, I can implement 32-bit targets in the same sources, by means of
> few ifdefs and macros, but resulting source code will look much uglier than
> how it looks today. Still, not to the same level of horror that you have in
> matmul_r16.c, but certainly uglier than how I like it to look.
> And I am not sure at all that my implementation of 32-bit targets would be
> significantly faster than current soft float.

I explored this path (implementing 32-bit and 64-bit targets from the same
source with a few ifdefs) a little more.
Now I am even more sure that it is not the way to go. gcc does not generate
good 32-bit code for this style of source. This applies especially to i386;
the other supported 32-bit targets (RV32, SPARC32) are affected less.

In the process I encountered a funny, illogical pessimization by the i386 code
generator:
https://godbolt.org/z/En6Tredox
Routines foo32() and foo64() are semantically identical, but foo32() is written
with 32-bit targets in mind, while foo64() is in the style of code that will
likely be written if one wants to support 32 and 64 bits from the same source
with #ifdef.
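
To make the difference between the two styles concrete, here is a minimal
sketch. It is NOT the code behind the godbolt link above; the function names
(mul64_limbs32, mul64_words64, styles_agree, self_check) and the choice of a
64x64->128 multiplication are assumptions made purely for illustration. The
first routine works on explicit 32-bit limbs, the second keeps everything in
uint64_t and relies on masks and shifts. No claim is made that gcc shows the
same pessimization on this particular sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only; not the routines from the godbolt link. */

/* Style A: written with a 32-bit target in mind. Operands are explicit
   32-bit limbs and every product is a 32x32->64 widening multiply.
   r[0] is the least significant limb of the 128-bit result. */
void mul64_limbs32(uint32_t a_lo, uint32_t a_hi,
                   uint32_t b_lo, uint32_t b_hi, uint32_t r[4])
{
    uint64_t p00 = (uint64_t)a_lo * b_lo;
    uint64_t p01 = (uint64_t)a_lo * b_hi;
    uint64_t p10 = (uint64_t)a_hi * b_lo;
    uint64_t p11 = (uint64_t)a_hi * b_hi;

    /* Sum of three values below 2^32, so it cannot overflow 64 bits. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;
    uint64_t hi  = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);

    r[0] = (uint32_t)p00;
    r[1] = (uint32_t)mid;
    r[2] = (uint32_t)hi;
    r[3] = (uint32_t)(hi >> 32);
}

/* Style B: the same arithmetic kept in 64-bit variables. Each '*' is a
   64x64->64 multiply in the source, although the upper halves of both
   operands are known to be zero. */
void mul64_words64(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    const uint64_t mask = UINT64_C(0xFFFFFFFF);
    uint64_t a_lo = a & mask, a_hi = a >> 32;
    uint64_t b_lo = b & mask, b_hi = b >> 32;

    uint64_t p00 = a_lo * b_lo;
    uint64_t p01 = a_lo * b_hi;
    uint64_t p10 = a_hi * b_lo;
    uint64_t p11 = a_hi * b_hi;

    uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);
    *lo = (mid << 32) | (p00 & mask);
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Returns 1 when both styles produce the same 128-bit product. */
int styles_agree(uint64_t a, uint64_t b)
{
    uint32_t r[4];
    uint64_t lo, hi;
    mul64_limbs32((uint32_t)a, (uint32_t)(a >> 32),
                  (uint32_t)b, (uint32_t)(b >> 32), r);
    mul64_words64(a, b, &lo, &hi);
    return lo == (((uint64_t)r[1] << 32) | r[0])
        && hi == (((uint64_t)r[3] << 32) | r[2]);
}

/* A few spot checks that the two styles are semantically identical. */
void self_check(void)
{
    assert(styles_agree(0, 0));
    assert(styles_agree(UINT64_MAX, UINT64_MAX));
    assert(styles_agree(UINT64_C(0x0123456789ABCDEF),
                        UINT64_C(0xFEDCBA9876543210)));
}
```

In a sketch like this, style A tells the compiler directly that every
multiplication is 32x32->64, while style B leaves it to the compiler to prove
that the upper halves are zero. Compiling both with -m32 -O2 and counting the
mul instructions is a quick way to see whether that proof succeeds.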

The code generated by gcc for foo32() is reasonable. As in the source, we
can see 8 multiplications.
The code generated by gcc for foo64() is anything but reasonable. Somehow, the
compiler invented 5 more multiplications, for a total of 13.

Maybe it deserves a separate bug report.
