[RFA][PR target/113666] Simplify VEC_EXTRACT from a uniform vector

Jeffrey Law Mon, 12 Jan 2026 11:57:19 -0800

This fixes a P3 regression relative to gcc-13 on the RISC-V platform for this code:

unsigned char a;


int main() {
   short b = a = 0;
   for (; a != 19; a++)
     if (a)
       b = 32872 >> a;

   if (b == 0)
     return 0;
   else
     return 1;
}

-march=rv64gcv_zvl256b -mabi=lp64d -O3 -ftree-vectorize


This doesn't need vector code at all.  Good code generation here looks like:

         lui     a5,%hi(a)
         li      a4,19
         sb      a4,%lo(a)(a5)
         li      a0,0
         ret

gcc-14 and gcc-15 produce horrific code here, roughly 20 instructions, over half of which are vector. It's not even worth posting, it's atrocious.


The trunk improves things, but not quite to the quality of gcc-13:

         vsetivli        zero,8,e16,mf2,ta,ma
         vmv.v.i v1,0
         lui     a5,%hi(a)
         li      a4,19
         vslidedown.vi   v1,v1,1
         sb      a4,%lo(a)(a5)
         vmv.x.s a0,v1
         snez    a0,a0
         ret


If we look at the .optimized dump we have this nugget:

  _26 = .VEC_EXTRACT ({ 0, 0, 0, 0, 0, 0, 0, 0 }, 1);

If we're extracting an element out of a uniform vector, then any element will do and it's conveniently returned by uniform_vector_p. So with a simple match.pd pattern that simplifies to _26 = 0. That in turn allows elimination of all the vector code and simplifies the return value to a constant as well, resulting in the desired code shown earlier.
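
To illustrate the idea outside of GCC, here is a rough standalone C model of the fold. It is only a sketch, not the match.pd pattern or any GCC internal API: the vector is modelled as a plain array of short, and the names uniform_value and fold_extract_uniform are made up for this example. The point is just that when every lane holds the same value and the index is in range, the extraction can be replaced by that value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: if every lane of VEC holds the same value, store
   that value in *ELT and return true.  Loosely mirrors the role that
   uniform_vector_p plays in the patch, but on a plain array.  */
bool
uniform_value (const short *vec, size_t nelts, short *elt)
{
  if (nelts == 0)
    return false;

  for (size_t i = 1; i < nelts; i++)
    if (vec[i] != vec[0])
      return false;

  *elt = vec[0];
  return true;
}

/* Hypothetical model of the fold: extracting lane IDX from a uniform
   vector yields the representative element, provided IDX is in range.
   Returns false (no simplification) otherwise.  */
bool
fold_extract_uniform (const short *vec, size_t nelts, size_t idx,
                      short *result)
{
  short elt;

  if (idx < nelts && uniform_value (vec, nelts, &elt))
    {
      *result = elt;
      return true;
    }

  return false;
}
```

With an 8-lane all-zero array and index 1, as in the dump above, this model folds to 0. An out-of-range index or a non-uniform array is left alone.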

One could easily argue that this need not be restricted to a uniform vector and I would totally agree. But given we're in stage4, the minimal fix for the regression seems more appropriate. But I could certainly be convinced to handle the more general case here.

Bootstrapped and regression tested on x86 & riscv64. Tested across the cross configurations as well with no regressions.



OK for the trunk?

Jeff

        PR target/113666
gcc/
        * match.pd (VEC_EXTRACT): Simplify VEC_EXTRACT when asked for a known
        element from a uniform vector.

gcc/testsuite/
        * gcc.target/riscv/rvv/base/pr113666.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 492d88514fce..274d64c98ac1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -12196,3 +12196,10 @@ and,
         && TYPE_UNSIGNED (type)
         && @0 == @3)
     (bit_xor (rrotate @0 @4) @2)))
+
+/* Optimize extraction from a uniform vector to a representative element as
+   long as the requested element is within range.  */
+(simplify (IFN_VEC_EXTRACT @0 INTEGER_CST@1)
+ (if (uniform_vector_p (@0)
+      && known_lt (tree_to_uhwi (@1), TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))))
+  { uniform_vector_p (@0); }))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr113666.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113666.c
new file mode 100644
index 000000000000..b1034d7676d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr113666.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" { target rv64} } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -O3" { target rv32} } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* If we vectorized, we should still be able to collapse away the VEC_EXTRACT,
+   leaving zero vector code in the final assembly.  So there should be no 
+   vsetvl instructions.  */
+/* { dg-final { scan-assembler-not {vsetivli} } } */
+
+
