https://gcc.gnu.org/g:9d8867fa9f44d03fb188aa470560983794d137b3
commit 9d8867fa9f44d03fb188aa470560983794d137b3
Author: Michael Meissner <meiss...@linux.ibm.com>
Date:   Tue Jul 22 00:52:20 2025 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 468610cdefc6..0af246567fd5 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,79 @@
+==================== Branch work215-bugs, patch #101 ====================
+
+PR target/120528 -- Simplify zero extend from memory to VSX register on power10
+
+Previously GCC would zero extend a DImode value in memory to a TImode
+target in a vector register by first zero extending the DImode value
+into a GPR TImode register pair, and then doing a MTVSRDD to move this
+value to a VSX register.
+
+For example, consider the following code:
+
+	#ifndef TYPE
+	#define TYPE unsigned long long
+	#endif
+
+	void
+	mem_to_vsx (TYPE *p, __uint128_t *q)
+	{
+	  /* lxvrdx 0,0,3
+	     stxv 0,0(4)  */
+
+	  __uint128_t x = *p;
+	  __asm__ (" # %x0" : "+wa" (x));
+	  *q = x;
+	}
+
+It currently generates the following code on power10:
+
+	mem_to_vsx:
+		ld 10,0(3)
+		li 11,0
+		mtvsrdd 0,11,10
+	#APP
+		 # 0
+	#NO_APP
+		stxv 0,0(4)
+		blr
+
+Instead it could generate:
+
+	mem_to_vsx:
+		lxvrdx 0,0,3
+	#APP
+		 # 0
+	#NO_APP
+		stxv 0,0(4)
+		blr
+
+The lxvr{b,h,w,d}x instructions were added in power10, and they load up
+a vector register with a byte, half-word, word, or double-word value in
+the rightmost bits, and fill the remaining bits with 0.  I noticed this
+code when working on PR target/108958 (for which I just posted a patch).
+
+This patch creates a peephole2 to catch this case, and it eliminates
+creating the TImode variable.  Instead it just does the LXVR{B,H,W,D}X
+instruction directly.
+
+I have built GCC with the patches in this patch set applied on both
+little and big endian PowerPC systems and there were no regressions.
+Can I apply this patch to GCC 16?
+
+2025-07-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+	PR target/120528
+	* config/rs6000/rs6000.md (zero_extend??ti2 peephole2): Add a
+	peephole2 to simplify zero extending a QI/HI/SI/DImode value in
+	memory to a TImode target in a vector register to use the
+	LXVR{B,H,W,D}X instructions.
+
+gcc/testsuite/
+
+	PR target/120528
+	* gcc.target/powerpc/pr120528.c: New test.
+
 ==================== Branch work215-bugs, patch #100 ====================
 
 PR 99293: Optimize splat of a V2DF/V2DI extract with constant element