From: Pan Li <pan2...@intel.com>

This patch would like to fix one bug of const vector for interleave.
Assume we need to generate interleave const vector like below.

 V = {{4, -4, 3, -3, 2, -2, 1, -1,}

Before this patch:
vsetvl a3, zero, e64, m8, ta, ma
vid.v       v8            v8 =  {0, 1, 2, 3, 4}
li          a6, -1
vmul.vx     v8, v8, a6    v8 =  {-0, -1, -2, -3, -4}
vadd.vi     v24, v8, 4    v24 = { 4,  3,  2,  1,  0}
vadd.vi     v8, v8, -4    v8 =  {-4, -5, -6, -7, -8}
li          a6, 32
vsll.vx     v8, v8, a6    v8 =  {0, -4, 0, -5, 0, -6, 0, -7,} for e32
vor         v24, v24, v8  v24 = {4, -4, 3, -5, 2, -6, 1, -7,} for e32

After this patch:
vsetvli a6,zero,e64,m8,ta,ma
vid.v  v8                  v8 =  {0, 1, 2, 3, 4}
li a7,-1
vmul.vx v16,v8,a7         v16 = {-0, -1, -2, -3, -4}
vaddvi v16,v16,4          v16 = { 4,  3,  2,  1, 0}
vaddvi v8,v8,-4           v8 =  {-4, -3, -2, -1, 0}
li a7,32
vsll.vx v8,v8,a7          v8 =  {0, -4, 0, -3, 0, -2,} for e32
vor.vv v16,v16,v8         v8 =  {4, -4, 3, -3, 2, -2,} for e32

It is not easy to add asm check stable enough for this case, as we need
to check the vadd -4 target comes from the vid output, which crosses 4
instructions up to point. Thus there is no test here and will be covered
by gcc.dg/vect/pr92420.c in the underlying patches.

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (expand_const_vector): Take step2
        instead of step1 for second series.

Signed-off-by: Pan Li <pan2...@intel.com>
---
 gcc/config/riscv/riscv-v.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
                  rtx tmp2 = gen_reg_rtx (new_mode);
                  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
                  expand_vec_series (tmp2, base2,
-                                    gen_int_mode (step1, new_smode));
+                                    gen_int_mode (step2, new_smode));
                  rtx shifted_tmp2 = expand_simple_binop (
                    new_mode, ASHIFT, tmp2,
                    gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
-- 
2.34.1

Reply via email to