https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91975

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in addition to the unrolling heuristics, we can see that at -O2 PRE defeats
memcpy detection for g0 and f0 (PRE does nothing to g1 and f1).  At -O3
PRE avoids this transform because of heuristics that enable vectorization, so
loop distribution is happy.  Arguably what PRE does here is borderline
never useful (and the case is also artificial because of the visible constant
initializer of T).  Loop distribution has no chance to re-discover the
memcpy here: loop splitting would need to peel off the first iteration,
which now reads d[0] = 0 instead of d[0] = s[0]...
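
To make the shape of the problem concrete, here is a rough hand-written
sketch in C.  It is not the testcase from the bug and not actual GCC output:
the table contents, the size N and the function names copy_plain,
copy_after_pre and copy_memcpy are all assumptions chosen for illustration.
copy_after_pre only approximates what the loop could look like once the
first load has been replaced by the known constant and the remaining loads
have moved to the latch.

```c
#include <assert.h>
#include <string.h>

enum { N = 8 };   /* assumed size, for illustration only */

/* Hypothetical stand-in for T: a constant table whose initializer is
   visible to the compiler.  The contents are an assumption.  */
static const int T[N] = { 0, 1, 2, 3, 4, 5, 6, 7 };

/* Plain element-wise copy.  Every store is d[i] = T[i], so the whole
   loop has the form of a memcpy.  */
void copy_plain (int *d)
{
  for (int i = 0; i < N; ++i)
    d[i] = T[i];
}

/* Approximation of the loop after a PRE-like transform: the load for the
   first iteration is replaced by the known constant T[0] == 0, and the
   load for each later iteration is done at the end of the previous one.
   The first store is now d[0] = 0 rather than d[0] = T[0], so the loop no
   longer looks like a uniform copy unless the first iteration is peeled.  */
void copy_after_pre (int *d)
{
  int v = 0;                 /* folded from the visible initializer */
  int i = 0;
  for (;;)
    {
      d[i] = v;              /* store the value carried into this iteration */
      if (++i == N)
        break;
      v = T[i];              /* load for the next iteration */
    }
}

/* What the copy could be turned into if it were recognized.  */
void copy_memcpy (int *d)
{
  memcpy (d, T, sizeof T);
}
```

All three functions store the same values; only the first and third have a
form in which every iteration is visibly a copy from T.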
