[Bug target/112327] RVV: Redundant vmv1r for widen reduction

2023-11-01 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112327

JuzheZhong  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from JuzheZhong  ---
Fixed

[Bug target/112327] RVV: Redundant vmv1r for widen reduction

2023-11-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112327

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Pan Li:

https://gcc.gnu.org/g:1a0af6e5a99cd895a663f0221c25321ae802413f

commit r14-5067-g1a0af6e5a99cd895a663f0221c25321ae802413f
Author: Juzhe-Zhong 
Date:   Wed Nov 1 14:56:39 2023 +0800

RISC-V: Allow dest operand and accumulator operand overlap of widen reduction instruction [PR112327]

Consider the following intrinsic code:

#include <riscv_vector.h>

void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
{
  size_t vl;
  vint16m4_t vSrcA, vSrcB;
  vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1);
  while (n > 0) {
    vl = __riscv_vsetvl_e16m4(n);
    vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl);
    vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl);
    vSum = __riscv_vwredsum_vs_i32m8_i64m1(
        __riscv_vwmul_vv_i32m8(vSrcA, vSrcB, vl), vSum, vl);
    pSrcA += vl;
    pSrcB += vl;
    n -= vl;
  }
  *result = __riscv_vmv_x_s_i64m1_i64(vSum);
}

https://godbolt.org/z/vWd35W7G6
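
As a cross-check on what the intrinsic loop computes, a scalar reference
can be sketched as below. This is an illustration only and is not part of
the patch or its tests; the name scalar_dot_prod is hypothetical, and the
sketch assumes the running sum does not overflow 64 bits.

```c
#include <stdint.h>

/* Hypothetical scalar reference for rvv_dot_prod (not from the patch).
   Each 16-bit pair is multiplied into a 32-bit product (like vwmul.vv),
   then the products are accumulated in 64 bits (like vwredsum.vs).  */
static int64_t scalar_dot_prod(const int16_t *a, const int16_t *b, uint32_t n)
{
    int64_t sum = 0;
    for (uint32_t i = 0; i < n; i++) {
        /* int16 * int16 always fits in int32 (at most 2^30 in magnitude). */
        int32_t prod = (int32_t)a[i] * (int32_t)b[i];
        sum += (int64_t)prod;
    }
    return sum;
}
```

Comparing *result from rvv_dot_prod against this function on the same
inputs is one way to confirm that the register change does not alter the
computed value.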

Before this patch:

...
Loop:
...
vmv1r.v v2,v1
...
vwredsum.vs v1,v8,v2
...

After this patch:

...
Loop:
...
vwredsum.vs v1,v8,v1
...
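
One way to see why letting the destination share a register with the
accumulator is plausible is a scalar model of the instruction. The sketch
below assumes the unmasked vwredsum.vs vd, vs2, vs1 semantics from the RVV
specification: element 0 of vs1 plus the sum of the sign-extended elements
of vs2 is written to element 0 of vd. The name model_vwredsum is
hypothetical, tail elements are ignored, and no 64-bit overflow is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of "vwredsum.vs vd, vs2, vs1" (unmasked).
   Only vs1[0] is read as the accumulator and only vd[0] is written.  */
static void model_vwredsum(int64_t *vd, const int32_t *vs2,
                           const int64_t *vs1, size_t vl)
{
    if (vl == 0)
        return;                  /* vl == 0: destination is left untouched */
    int64_t acc = vs1[0];        /* accumulator is read before any write */
    for (size_t i = 0; i < vl; i++)
        acc += (int64_t)vs2[i];  /* widen each 32-bit element to 64 bits */
    vd[0] = acc;                 /* single write of the scalar result */
}
```

Calling the model with the same pointer for vd and vs1 gives the same
result as calling it with separate storage, which mirrors
vwredsum.vs v1,v8,v1 in the output after the patch.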

PR target/112327

gcc/ChangeLog:

* config/riscv/vector.md: Add '0'.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112327-1.c: New test.
* gcc.target/riscv/rvv/base/pr112327-2.c: New test.
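
The '0' in the ChangeLog entry reads like a matching constraint, which
ties an input operand to the same register as operand 0. The exact change
to vector.md is not shown in this message, so the sketch below only
illustrates the same constraint letter in GNU C extended inline asm. The
helper tie_to_output is hypothetical, and the code assumes a
GCC-compatible compiler.

```c
/* Hypothetical illustration of the "0" matching constraint in GNU C
   inline asm; this is not the vector.md change itself.  */
static long tie_to_output(long in)
{
    long out;
    /* The template is empty.  "0" places `in` in the register chosen
       for operand 0 (`out`), so the output already holds the input
       value and no separate copy instruction is required.  */
    __asm__ ("" : "=r"(out) : "0"(in));
    return out;
}
```

With the input tied to the output this way, the register allocator has no
reason to emit a separate move, which is the same effect the patch shows
with the removed vmv1r.v.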