This patch applies a VLA vs. VLS mode heuristic, which fixes the following FAILs:
FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize
scan-assembler-not vset
FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize
scan-assembler-times li\\s+[a-x0-9]+,0\\s+ret 2
The root cause of these FAILs is that we failed to pick a VLS mode for the vectorization.
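For context, a loop of roughly the following shape would lead to the code shown below. This is only a sketch inferred from the assembly (a 16-iteration 32-bit add followed by 16 element checks that call abort); it is not a copy of pr111751.c, and the initializer values are illustrative assumptions.

```c
#include <stdlib.h>

#define N 16 /* trip count inferred from "li a3,16" in the assembly */

int foo2(void)
{
  int ia[N];
  /* Illustrative values; the real initializers are not shown in this message.
     Both inputs are loaded from the same constant in the assembly.  */
  int ib[N] = {0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45};
  int ic[N] = {0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45};

  /* Known trip count: small enough to be handled by a fixed-length mode.  */
  for (int i = 0; i < N; i++)
    ia[i] = ib[i] + ic[i];

  /* Check results; with a VLS mode the whole function can fold to "return 0".  */
  for (int i = 0; i < N; i++)
    if (ia[i] != ib[i] + ic[i])
      abort();

  return 0;
}
```

With a VLA mode the add loop stays a run-time-length loop, so the checks cannot be folded away; with a VLS mode the compiler sees the full computation at compile time.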
Before this patch:
foo2:
addi sp,sp,-208
addi a2,sp,64
addi a5,sp,128
lui a6,%hi(.LANCHOR0)
sd ra,200(sp)
addi a6,a6,%lo(.LANCHOR0)
mv a0,a2
mv a1,a5
li a3,16
mv a4,sp
vsetivli zero,8,e64,m8,ta,ma
vle64.v v8,0(a6)
vse64.v v8,0(a2)
vse64.v v8,0(a5)
.L4:
vsetvli a5,a3,e32,m1,ta,ma
slli a2,a5,2
vle32.v v2,0(a1)
vle32.v v1,0(a0)
sub a3,a3,a5
vadd.vv v1,v1,v2
vse32.v v1,0(a4)
add a1,a1,a2
add a0,a0,a2
add a4,a4,a2
bne a3,zero,.L4
lw a4,128(sp)
lw a5,64(sp)
addw a5,a5,a4
lw a4,0(sp)
bne a4,a5,.L5
lw a4,132(sp)
lw a5,68(sp)
addw a5,a5,a4
lw a4,4(sp)
bne a4,a5,.L5
lw a4,136(sp)
lw a5,72(sp)
addw a5,a5,a4
lw a4,8(sp)
bne a4,a5,.L5
lw a4,140(sp)
lw a5,76(sp)
addw a5,a5,a4
lw a4,12(sp)
bne a4,a5,.L5
lw a4,144(sp)
lw a5,80(sp)
addw a5,a5,a4
lw a4,16(sp)
bne a4,a5,.L5
lw a4,148(sp)
lw a5,84(sp)
addw a5,a5,a4
lw a4,20(sp)
bne a4,a5,.L5
lw a4,152(sp)
lw a5,88(sp)
addw a5,a5,a4
lw a4,24(sp)
bne a4,a5,.L5
lw a4,156(sp)
lw a5,92(sp)
addw a5,a5,a4
lw a4,28(sp)
bne a4,a5,.L5
lw a4,160(sp)
lw a5,96(sp)
addw a5,a5,a4
lw a4,32(sp)
bne a4,a5,.L5
lw a4,164(sp)
lw a5,100(sp)
addw a5,a5,a4
lw a4,36(sp)
bne a4,a5,.L5
lw a4,168(sp)
lw a5,104(sp)
addw a5,a5,a4
lw a4,40(sp)
bne a4,a5,.L5
lw a4,172(sp)
lw a5,108(sp)
addw a5,a5,a4
lw a4,44(sp)
bne a4,a5,.L5
lw a4,176(sp)
lw a5,112(sp)
addw a5,a5,a4
lw a4,48(sp)
bne a4,a5,.L5
lw a4,180(sp)
lw a5,116(sp)
addw a5,a5,a4
lw a4,52(sp)
bne a4,a5,.L5
lw a4,184(sp)
lw a5,120(sp)
addw a5,a5,a4
lw a4,56(sp)
bne a4,a5,.L5
lw a4,188(sp)
lw a5,124(sp)
addw a5,a5,a4
lw a4,60(sp)
bne a4,a5,.L5
ld ra,200(sp)
li a0,0
addi sp,sp,208
jr ra
.L5:
call abort
After this patch:
li a0,0
ret
The heuristic follows the one used for ARM SVE. It has been fully tested, and
we confirmed that the behavior matches ARM SVE GCC and RVV Clang.
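The decision can be pictured with the simplified sketch below. It is an assumption-laden illustration, not the code in riscv-vector-costs.cc: the struct, its fields, the cost fallback and the limit MAX_UNROLLED_STMTS are hypothetical, and only the function names prefer_unrolled_loop and better_main_loop_than_p are borrowed from the ChangeLog. The idea is that a VLA candidate records how large the loop would be if it were fully unrolled with a VLS mode, and the VLS candidate wins when that size is small enough.

```c
#include <stdbool.h>

/* Hypothetical summary of one candidate loop vectorization.  */
struct candidate {
  bool is_vls;                 /* true: fixed-length (VLS) mode; false: VLA */
  unsigned unrolled_vls_stmts; /* for a VLA candidate: estimated statement
                                  count if the loop were fully unrolled with
                                  a VLS mode, or 0 if that is not possible */
  unsigned cost;               /* generic cost, used only as a fallback */
};

/* Hypothetical limit on the size of a fully unrolled loop.  */
enum { MAX_UNROLLED_STMTS = 200 };

/* True if full unrolling with a VLS mode is possible and small enough.  */
bool prefer_unrolled_loop(const struct candidate *c)
{
  return c->unrolled_vls_stmts != 0
         && c->unrolled_vls_stmts <= MAX_UNROLLED_STMTS;
}

/* Return true if candidate A should replace candidate B as the main loop.  */
bool better_main_loop_than_p(const struct candidate *a,
                             const struct candidate *b)
{
  /* A is VLS, B is VLA, and B says unrolling with VLS is preferable.  */
  if (a->is_vls && !b->is_vls && prefer_unrolled_loop(b))
    return true;
  /* The mirrored case: keep the VLS candidate B.  */
  if (!a->is_vls && b->is_vls && prefer_unrolled_loop(a))
    return false;
  /* Otherwise fall back to comparing costs.  */
  return a->cost < b->cost;
}
```

In this sketch a small, known-trip-count loop such as the one in foo2 would select the VLS candidate even when the VLA candidate has a lower nominal cost.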
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (costs::analyze_loop_vinfo): New
function.
(costs::record_potential_vls_unrolling): Ditto.
(costs::prefer_unrolled_loop): Ditto.
(costs::better_main_loop_than_p): Ditto.
(costs::add_stmt_cost): Ditto.
* config/riscv/riscv-vector-costs.h (enum cost_type_enum): New enum.
* config/riscv/t-riscv: Add new include files.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr111313.c: Adapt test.
* gcc.target/riscv/rvv/autovec/vls/shift-3.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-5.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-6.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-7.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-8.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-9.c: New test.
---
gcc/config/riscv/riscv-vector-costs.cc | 134 +-
gcc/config/riscv/riscv-vector-costs.h | 43 ++
gcc/config/riscv/t-ri