https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618
--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> It might be good to recognize this pattern in strlenopt or a related pass.
>
> A purely local transform would turn it into
>
>   memcpy (temp, a, 64);
>   memmove (b, a, 64);
>
> relying on DSE to eliminate the copy to temp if possible.  Not sure if
> that possibly would be a bad transform if copying to temp is required.

This would only be beneficial if you know memmove is inlined if memcpy is - on
almost all targets memmove becomes a library call, so the transformation would
be worse if memcpy can be inlined.

>         stp     q30, q31, [sp]
>         ldp     q30, q31, [sp]
>
> why is CSE not able to catch this?

The new RTL now has UNSPECs in them, so CSE doesn't know it is a plain
load/store:

STP:

(insn 12 11 13 2 (set (mem/c:V2x16QI (reg:DI 102) [0 +0 S32 A128])
        (unspec:V2x16QI [
                (reg:V4SI 104)
                (reg:V4SI 105)
            ] UNSPEC_STP)) "/app/example.c":5:5 -1
     (nil))

LDP:

(insn 16 15 17 2 (parallel [
            (set (reg:V4SI 108)
                (unspec:V4SI [
                        (mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
                    ] UNSPEC_LDP_FST))
            (set (reg:V4SI 109)
                (unspec:V4SI [
                        (mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
                    ] UNSPEC_LDP_SND))
        ]) "/app/example.c":6:5 -1
     (nil))
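
For reference, the source-level shape of the transform quoted from comment #2 can be sketched roughly as below. This is only an illustration: the function names copy_via_temp and copy_transformed are hypothetical, and the original form (two memcpy calls through a 64-byte local temp) is inferred from the quoted text, not copied from the bug's testcase. Both forms give the same result even when a and b overlap, which is why the replacement has to be memmove and not a direct memcpy from a to b.

```c
#include <string.h>

/* Assumed original pattern: copy 64 bytes through a local temporary.
   Each memcpy has non-overlapping operands, so this is well defined
   even if the ranges at a and b overlap each other. */
void copy_via_temp(char *b, const char *a)
{
    char temp[64];
    memcpy(temp, a, 64);
    memcpy(b, temp, 64);
}

/* Sketch of the purely local transform: the second copy reads from a
   directly.  Since a and b may overlap, it must become memmove.  The
   store to temp is now dead and is left for DSE to remove. */
void copy_transformed(char *b, const char *a)
{
    char temp[64];
    memcpy(temp, a, 64);
    memmove(b, a, 64);
}
```

Whether the second form is an improvement depends on the point made above: it only helps if memmove is expanded inline wherever memcpy would have been.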