https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thopre01 at gcc dot gnu.org

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Arnd Bergmann from comment #4)
> I've tried out a few more things as well, to see if the alignment of the
> struct lpfc_name type or the builtin memcpy makes a difference. Replacing the
> array of eight bytes with a single uint64_t and scalar operations instead of
> string functions makes very little difference, so it seems to be neither of
> them.
> 
> However, I think the wwn_to_uint64_t() function is what causes the problem.
> This is supposed to be turned into a direct load or a byte reversing load
> depending on endianness, but this apparently does not happen.

I'm not aware of such a transform - we have the 'bswap' pass on GIMPLE
but that only looks for the direct load case (-mbig-endian):

64 bit load in target endianness found at: _95 = MEM[(uint8_t *)vport_wwpn_14(D)];
64 bit load in target endianness found at: _125 = MEM[(uint8_t *)target_wwpn_16(D)];
64 bit load in target endianness found at: _167 = MEM[(uint8_t *)vport_wwpn_14(D)];
64 bit load in target endianness found at: _197 = MEM[(uint8_t *)target_wwpn_16(D)];
64 bit load in target endianness found at: _236 = MEM[(uint8_t *)vport_wwpn_14(D)];
64 bit load in target endianness found at: _266 = MEM[(uint8_t *)target_wwpn_16(D)];

as there is no "standard" target-independent way to express a byte
reversed load (optab or so).  The closest we'd have is to "vectorize"
this as

  tem = load v8qi;
  tem = vec_perm <tem, tem, { 7, 6, 5, 4, 3, 2, 1, 0 }>;
  ... = VIEW_CONVERT <uint64_t, tem>;

if both the vector mode exists and the constant permute is handled
by the target.  The target would then need to combine the load
and the permute into a reversing load (if the target indeed has such
an instruction).
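
For illustration only, the two shapes discussed above can be modelled in
plain C.  This is a sketch under assumptions, not what the compiler emits:
the body of wwn_to_uint64_t() is not shown in this report, so
load_be64_bytewise() is a hypothetical stand-in for a byte-wise big-endian
read, and load_reversed_via_permute() and load_native() are names made up
here to mirror the load / vec_perm / VIEW_CONVERT sequence with ordinary
byte arrays instead of a real v8qi vector.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a byte-wise big-endian read such as
   wwn_to_uint64_t(); the actual helper is not shown in this report.  */
static uint64_t load_be64_bytewise(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];        /* p[0] ends up most significant */
    return v;
}

/* Plain 64-bit load in the byte order of the machine running the code.  */
static uint64_t load_native(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Scalar model of: load v8qi, constant permute, view-convert.
   Byte arrays stand in for the vector registers.  */
static uint64_t load_reversed_via_permute(const uint8_t *p)
{
    static const uint8_t sel[8] = { 7, 6, 5, 4, 3, 2, 1, 0 };
    uint8_t tem[8], rev[8];
    uint64_t out;

    memcpy(tem, p, sizeof tem);     /* tem = load v8qi */
    for (int i = 0; i < 8; i++)
        rev[i] = tem[sel[i]];       /* vec_perm <tem, tem, sel> */
    memcpy(&out, rev, sizeof out);  /* VIEW_CONVERT <uint64_t, tem> */
    return out;
}
```

On a big-endian machine load_be64_bytewise() should give the same value as
load_native(), which is the "direct load" case; on a little-endian machine
it should instead match load_reversed_via_permute(), which is the case that
would need a byte-reversing load.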

> Adding -mbig-endian to the compiler flags brings the stack usage down, so
> presumably the optimization step that identifies byteswap patterns is what
> causes the stack growth.
