pr98914.c for 32 bits

luoxhu at gcc dot gnu.org via Gcc-bugs Thu, 25 Mar 2021 22:37:26 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718


--- Comment #11 from luoxhu at gcc dot gnu.org ---
Created attachment 50474
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50474&action=edit
32bit variable vec_insert

LLVM also generates a store-hit-load instruction sequence:

        addi 3, 1, -16
        rlwinm 4, 5, 2, 28, 29
        stvx 2, 0, 3
        stwx 6, 3, 4
        lvx 2, 0, 3
        blr
        .long   0
        .quad   0

I didn't use "can't" in my reply; sorry that it caused confusion. We thought it
was inefficient to move SF to SI in 32-bit mode, but it turns out to give a huge
performance gain as well (46.704s -> 4.369s).

Attached is a patch that also supports variable vec_insert for 32-bit; I am
testing it on P8BE/P8LE/P9LE. Could you please verify it on AIX? I will refine
it and send it to the mailing list to fix this P1 issue fundamentally.

[Bug target/99718] [11 regression] ICE in new test case gcc.target/powerpc/pr98914.c for 32 bits
