https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

--- Comment #17 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #16)
> The issue here (of course) is that aarch64 has a different set of defaults
> for when to open-code vs loop vs function call.   My attempts to pick a
> better size for the objects results in failures on other targets.
> 
> Do we have a method on aarch64 to tune this stuff via flags?  Otherwise I'm
> likely to just xfail aarch64 and move on since DSE is doing what we want at
> this point if given sane input.

I don't know, this issue doesn't seem related to any backend setting - this is
a typical inline memset expansion.

Handling of structures whose size is not a multiple of 4 or 8 is generally
inefficient in GCC, because the mid-end can't deal with overlapping accesses of
different sizes. It's efficient if I change the size of the array to 8 rather
than 7.
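
To illustrate what "overlapping accesses" means for the 7-byte case, here is a
rough sketch in plain C. The helper names clear7_split and clear7_overlap are
made up for this example, and neither function is meant to show what GCC
actually emits; they only contrast three non-overlapping stores (4 + 2 + 1)
with two 4-byte stores that overlap by one byte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: clear 7 bytes with three non-overlapping stores
   of decreasing size (4 + 2 + 1 bytes). */
static void clear7_split(unsigned char *p)
{
    uint32_t z4 = 0;
    uint16_t z2 = 0;
    memcpy(p, &z4, 4);      /* bytes 0..3 */
    memcpy(p + 4, &z2, 2);  /* bytes 4..5 */
    p[6] = 0;               /* byte 6 */
}

/* Hypothetical helper: clear the same 7 bytes with two 4-byte stores
   that overlap at byte 3. */
static void clear7_overlap(unsigned char *p)
{
    uint32_t z4 = 0;
    memcpy(p, &z4, 4);      /* bytes 0..3 */
    memcpy(p + 3, &z4, 4);  /* bytes 3..6 */
}
```

Both helpers leave the same 7 bytes zeroed and touch nothing outside them; the
second one just needs one store fewer, at the cost of writing byte 3 twice.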

So there is a real issue here, but maybe you'd prefer a new bug report for that?
