https://gcc.gnu.org/g:9d8867fa9f44d03fb188aa470560983794d137b3

commit 9d8867fa9f44d03fb188aa470560983794d137b3
Author: Michael Meissner <meiss...@linux.ibm.com>
Date:   Tue Jul 22 00:52:20 2025 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 468610cdefc6..0af246567fd5 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,79 @@
+==================== Branch work215-bugs, patch #101 ====================
+
+PR target/120528 -- Simplify zero extend from memory to VSX register on power10
+
+Previously GCC would zero extend a DImode value in memory to a TImode
+target in a vector register by firt zero extending the DImode value
+into a GPR TImode register pair, and then do a MTVSRDD to move this
+value to a VSX register.
+
+For example, consider the following code:
+
+       #ifndef TYPE
+       #define TYPE unsigned long long
+       #endif
+
+       void
+       mem_to_vsx (TYPE *p, __uint128_t *q)
+       {
+         /* lxvrdx 0,0,3
+            stxv 0,0(4)  */
+
+         __uint128_t x = *p;
+         __asm__ (" # %x0" : "+wa" (x));
+         *q = x;
+}
+
+It currently generates the following code on power10:
+
+       mem_to_vsx:
+               ld 10,0(3)
+               li 11,0
+               mtvsrdd 0,11,10
+       #APP
+                # 0
+       #NO_APP
+               stxv 0,0(4)
+               blr
+
+Instead it could generate:
+
+       mem_to_vsx:
+               lxvrdx 0,0,3
+       #APP
+                # 0
+       #NO_APP
+               stxv 0,0(4)
+               blr
+
+The lxvr{b,h,w,d}x instructions were added in power10, and they load up
+a vector register with a byte, half-word, word, or double-word value in
+the right most bits, and fill the remaining bits to 0.  I noticed this
+code when working on PR target/108958 (which I just posted the patch).
+
+This patch creates a peephole2 to catch this case, and it eliminates
+creating the TImode variable.  Instead it just does the LXVR{B,H,W,D}x
+instruction directly.
+
+I have built GCC with the patches in this patch set applied on both
+little and big endian PowerPC systems and there were no regressions.
+Can I apply this patch to GCC 16?
+
+2025-07-22  Michael Meissner  <meiss...@linux.ibm.com>
+
+gcc/
+
+       PR target/120528
+       * config/rs6000/rs6000.md (zero_extend??ti2 peephole2): Add a
+       peephole2 to simplify zero extending a QI/HI/SI/DImode value in
+       memory to a TImode target in a vector register to use the
+       LXVR{B,H,W,D}X instructins.
+
+gcc/testsuite/
+
+       PR target/120528
+       * gcc.target/powerpc/pr120528.c: New test.
+
 ==================== Branch work215-bugs, patch #100 ====================
 
 PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

Reply via email to