I tried enabling all modes but disabling all patterns except the
move-related ones (without those it ICEs at expand time), and the
result is terrible code generation. Here is a practical example:
The expand ICE is due to us just checking for mode availability and not
vls_mode_valid_p as well. With that we don't try to create a vec_extract that
doesn't exist for the VLS modes. There might be more cases like this but I
didn't audit everything.
```c
#include <stdint.h>

typedef int32_t int32x8_t __attribute__((vector_size(32)));

int32x8_t __attribute__((riscv_vls_cc(128)))
test_256bit_vector(int32x8_t vec1, int32x8_t vec2) {
  int32x8_t result;
  result = vec1 + vec2;
  return result;
}
```
This produces:
```asm
addi sp,sp,-96
.cfi_def_cfa_offset 96
vsetivli zero,8,e32,m2,ta,ma
addi a5,sp,32
vse32.v v8,0(a5)
addi a5,sp,64
vse32.v v10,0(a5)
vmv.v.i v8,0
addi a5,sp,32
vse32.v v8,0(sp)
vsetivli zero,4,e32,m1,ta,ma
vle32.v v8,0(a5)
addi a5,sp,64
vle32.v v9,0(a5)
addi a5,sp,48
vadd.vv v8,v8,v9
vse32.v v8,0(sp)
vle32.v v8,0(a5)
addi a5,sp,80
vle32.v v9,0(a5)
addi a5,sp,16
vadd.vv v8,v8,v9
vse32.v v8,0(a5)
vsetivli zero,8,e32,m2,ta,ma
vle32.v v8,0(sp)
addi sp,sp,96
.cfi_def_cfa_offset 0
jr ra
```
That is because we end up with lots of subregs like:
```rtl
(insn 8 7 9 2 (set (reg:V4SI 145 [ _7 ])
        (plus:V4SI (subreg:V4SI (reg/v:V8SI 141 [ vec1 ]) 0)
            (subreg:V4SI (reg/v:V8SI 142 [ vec2 ]) 0)))
```
A subreg can't simply take the lower or the higher register of the
group, since that breaks when VLEN is larger than MIN_VLEN,
so the only safe way is to go through memory.
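To make the VLEN dependence concrete, here is a minimal sketch (not GCC code) of where each element of a register group lives. It assumes elements are packed contiguously starting from the first register of the group; the helper name reg_in_group is made up for illustration.

```c
/* Illustration only: reg_in_group is a hypothetical helper, not part of GCC.
   It returns the index, within an LMUL register group, of the vector
   register holding element `idx`, for elements of `sew` bits on a machine
   whose vector registers are `vlen` bits wide.  It assumes elements are
   packed contiguously from the first register of the group.  */
static unsigned
reg_in_group (unsigned idx, unsigned sew, unsigned vlen)
{
  return (idx * sew) / vlen;
}

/* Returns nonzero if the upper V4SI half of a V8SI value (elements 4..7)
   starts exactly at the second register of the group, i.e. if a "take the
   higher register" subreg would be correct for this VLEN.  */
static int
high_half_is_second_reg (unsigned vlen)
{
  return reg_in_group (4, 32, vlen) == 1 && reg_in_group (7, 32, vlen) == 1;
}
```

With VLEN=128, elements 4..7 sit in the second register (v9 when the group starts at v8), so picking the higher register would work. With VLEN=256, all eight elements sit in the first register and high_half_is_second_reg returns 0, which is why the register-based subreg is not safe in general.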
Yeah, I think that's too late. Once we have established that we can hold
a mode in a register (hard_regno_mode_ok) we're expected to have
moves/subregs etc. and, as you say, all we have left is to go via memory.
However if we do
+ if (riscv_v_ext_vls_mode_p (mode) && !vls_mode_valid_p (mode))
+ return false;
in riscv_hard_regno_mode_ok we don't even get that far and rather use
other available modes.
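As a rough illustration of the effect, here is a toy model, not the actual GCC code. All toy_* names are hypothetical, and the validity rule used (the mode must fit in min_vlen * max_lmul bits) is an assumed simplification of what vls_mode_valid_p checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VLS machine mode; this is not GCC's machine_mode.  */
struct toy_mode
{
  const char *name;
  unsigned bits;	/* Total size of the fixed-length vector.  */
};

/* Assumed simplification: a VLS mode is valid if it fits into one register
   group of min_vlen * max_lmul bits.  The real vls_mode_valid_p may apply
   different or additional conditions.  */
static bool
toy_vls_mode_valid_p (const struct toy_mode *m, unsigned min_vlen,
		      unsigned max_lmul)
{
  return m->bits <= min_vlen * max_lmul;
}

/* Walk the candidates (ordered widest first) and return the first one that
   passes the check, or NULL if none does.  Rejecting a mode here means the
   caller never tries to use it and settles on a narrower one instead.  */
static const struct toy_mode *
toy_pick_mode (const struct toy_mode *cands, size_t n, unsigned min_vlen,
	       unsigned max_lmul)
{
  for (size_t i = 0; i < n; i++)
    if (toy_vls_mode_valid_p (&cands[i], min_vlen, max_lmul))
      return &cands[i];
  return NULL;
}
```

Under these assumptions, with candidates V8SI (256 bits) and V4SI (128 bits), min_vlen 128 and max_lmul 1 pick V4SI, while max_lmul 2 picks V8SI.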
So with the attached patch on top of your VLS fix things look pretty
reasonable and the "GCC vector" examples still depend on -mrvv-max-lmul
as before.
There are still a few (5) testsuite failures, though. It looks like most of
them are similar, latent and due to us not handling small VLS BImodes properly.
Maybe we still need
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index d2edffb36a2..d2d99b828ac 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -2220,7 +2220,7 @@ get_vector_mode (scalar_mode inner_mode, poly_uint64 nunits)
if (inner_mode == GET_MODE_INNER (mode)
&& known_eq (nunits, GET_MODE_NUNITS (mode))
&& (riscv_v_ext_vector_mode_p (mode)
- || riscv_v_ext_vls_mode_p (mode)))
+ || (riscv_v_ext_vls_mode_p (mode) && vls_mode_valid_p (mode))))
return mode;
return opt_machine_mode ();
}
Besides, I tried your VLS-CC tests and they work nicely!
--
Regards
Robin