https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092
--- Comment #5 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Yes. I agree that some architectures prefer agnostic over undisturbed, even at
the cost of more vsetvls. That's why I have posted a PR asking whether we can
have an option like -mprefer-agnostic:

https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/37

But I think Maciej is worried about why GCC fuses the vsetvls and changes the
e16mf2 vsetvl into e32m1.

For example:

https://godbolt.org/z/6G9G7Pbe9

No 'TU' is involved there.

I think the LLVM codegen looks more reasonable:

        beqz    a5, .LBB0_4
        vsetvli a1, a6, e32, m1, ta, ma
        beqz    a4, .LBB0_3
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        vsetvli zero, a1, e32, m1, ta, ma
        vle32.v v8, (a0)
        vadd.vv v8, v8, v8
        addi    a4, a4, -1
        vse32.v v8, (a3)
        bnez    a4, .LBB0_2
.LBB0_3:
        ret
.LBB0_4:
        srai    a1, a6, 2
        vsetvli a1, a1, e16, mf2, ta, ma
        bnez    a4, .LBB0_2
        j       .LBB0_3

But the GCC codegen is correct as well, just with more optimization applied:

foo(int*, int*, int*, int*, unsigned long, int, int):
        beq     a5,zero,.L2
        vsetvli a5,a6,e32,m1,ta,ma
.L3:
        beq     a4,zero,.L10
        li      a2,0
.L5:
        vle32.v v1,0(a0)
        addi    a2,a2,1
        vadd.vv v1,v1,v1
        vse32.v v1,0(a3)
        bne     a4,a2,.L5
.L10:
        ret
.L2:
        sraiw   a5,a6,2
        vsetvli zero,a5,e32,m1,ta,ma
        j       .L3
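
To spell out why rewriting the e16mf2 vsetvl as e32m1 does not change the
resulting vl: as far as I understand it, the two settings have the same
SEW/LMUL ratio, so they have the same VLMAX on any given VLEN. The snippet
below is only a small model of the VLMAX formula from the RVV spec
(VLMAX = VLEN * LMUL / SEW). It is not how GCC implements the fusion, and the
helper names and the sample VLEN values are made up for illustration.

```python
from fractions import Fraction


def vlmax(vlen_bits, sew_bits, lmul):
    """VLMAX = VLEN * LMUL / SEW (formula from the RVV spec)."""
    return int(Fraction(vlen_bits) * Fraction(lmul) / sew_bits)


def sew_lmul_ratio(sew_bits, lmul):
    """SEW/LMUL ratio; equal ratios imply equal VLMAX for the same VLEN."""
    return Fraction(sew_bits) / Fraction(lmul)


# (SEW in bits, LMUL) for the two vtype settings discussed in this comment.
E16_MF2 = (16, Fraction(1, 2))
E32_M1 = (32, Fraction(1))

if __name__ == "__main__":
    print("ratio e16mf2:", sew_lmul_ratio(*E16_MF2))
    print("ratio e32m1 :", sew_lmul_ratio(*E32_M1))
    # Sample VLEN values, chosen only for illustration.
    for vlen in (128, 256, 512):
        print(vlen, vlmax(vlen, *E16_MF2), vlmax(vlen, *E32_M1))
```

Since VLMAX is the same for both settings, a vsetvli with the same AVL should
produce the same vl either way, which is presumably what allows GCC to use
e32m1 in the .L2 block.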