https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308

--- Comment #48 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
(In reply to wilco from comment #22)
> 
> Anyway, there is another bug: on AArch64 we correctly recognize there are 8
> 1-byte loads, shifts and orrs which can be replaced by a single 8-byte load
> and a byte reverse. Although it is recognized on ARM and works correctly if
> it is a little endian load, it doesn't perform the optimization if a byte
> reverse is needed. As a result there are lots of 64-bit shifts and orrs
> which create huge register pressure if not expanded early.

Hmm...

I think the test case does something invalid here:

const SHA_LONG64 *W = in;

T1 = X[0] = PULL64(W[0]);


in is not aligned, but it is cast to an 8-byte aligned type.

If, with your proposed patch, the bswap pass assumes
it is OK to merge the 1-byte accesses into an aligned word access,
it will likely break openssl on -mno-unaligned targets.
Even on our cortex-a9 the O/S will trap on unaligned accesses.
I have checked that openssl still works on arm-none-eabi
with my patch, but I am not sure about yours.
