https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618

--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> It might be good to recognize this pattern in strlenopt or a related pass.
> 
> A purely local transform would turn it into
> 
>     memcpy (temp, a, 64);
>     memmove (b, a, 64);
> 
> relying on DSE to eliminate the copy to temp if possible.  Not sure if
> that possibly would be a bad transform if copying to temp is required.

This would only be beneficial if memmove is known to be inlined whenever
memcpy is. On almost all targets memmove becomes a library call, so the
transformation would make things worse when memcpy can be inlined.
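
For illustration, a minimal sketch of the two forms being compared. It assumes
the idiom is a 64-byte copy through a local temporary, as the quoted transform
suggests; the function names are hypothetical and do not come from the bug
report.

```c
#include <string.h>

/* Hypothetical name: the copy-through-temporary idiom.  Because the data
   goes through temp, the result is well defined even if a and b overlap. */
void copy_via_temp(char *b, const char *a)
{
    char temp[64];
    memcpy(temp, a, 64);
    memcpy(b, temp, 64);
}

/* Hypothetical name: what the proposed local transform would leave behind
   once DSE has removed the dead copy to temp. */
void copy_via_memmove(char *b, const char *a)
{
    memmove(b, a, 64);
}
```

Both functions produce the same bytes in b, including for overlapping
buffers. Whether either one is expanded inline or becomes a library call
depends on the target and options, which is the concern raised above.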

>         stp     q30, q31, [sp]
>         ldp     q30, q31, [sp]
> 
> why is CSE not able to catch this?

The new RTL now contains UNSPECs, so CSE doesn't know these are plain
loads/stores:

STP:

(insn 12 11 13 2 (set (mem/c:V2x16QI (reg:DI 102) [0 +0 S32 A128])
        (unspec:V2x16QI [
                (reg:V4SI 104)
                (reg:V4SI 105)
            ] UNSPEC_STP)) "/app/example.c":5:5 -1
     (nil))

LDP:

(insn 16 15 17 2 (parallel [
            (set (reg:V4SI 108)
                (unspec:V4SI [
                        (mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
                    ] UNSPEC_LDP_FST))
            (set (reg:V4SI 109)
                (unspec:V4SI [
                        (mem/c:V2x16QI (reg:DI 107) [0 +0 S32 A128])
                    ] UNSPEC_LDP_SND))
        ]) "/app/example.c":6:5 -1
     (nil))
