Hi Robin:
Thanks for trying it. Before I move on to debugging that, I want
to check with you: I get m2 for the following testcase with the following
command:
$ riscv64-unknown-linux-gnu-gcc test.c -march=rv64gcv -O2 -S -mrvv-max-lmul=m1 -o -
```c
#include <stdint.h>
typedef int32_t int32x8_t __attribute__((vector_size(32)));
int32x8_t __attribute__((riscv_vls_cc(128)))
test_256bit_vector(int32x8_t vec1, int32x8_t vec2) {
  int32x8_t result;
  result = vec1 + vec2;
  return result;
}
```
```asm
test_256bit_vector:
vsetivli zero,8,e32,m2,ta,ma
vadd.vv v8,v8,v10
ret
```
I suspect I may have stacked those patches in the wrong order.
Would you mind sharing your code on GitHub or somewhere else, so we can
make sure we are in the same state?
Or you can give me your GitHub account so I can grant you write
permission on my repo.
I think it's your patch
"RISC-V: Allow VLS types using up to LMUL 8"
that makes the difference. I don't have that one in my tree.
Your example above gives me
test_256bit_vector:
.LFB0:
.cfi_startproc
vsetivli zero,4,e32,m1,ta,ma
addi a5,a1,16
vle32.v v9,0(a5)
addi a5,a2,16
vle32.v v11,0(a5)
vle32.v v8,0(a1)
vle32.v v10,0(a2)
addi a5,a0,16
vadd.vv v9,v9,v11
vadd.vv v8,v8,v10
vse32.v v9,0(a5)
vse32.v v8,0(a0)
which is of course not great and kind of defeats the purpose of a vector CC if
we need to pass via the stack just because of an LMUL mismatch.
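
To spell out the mismatch, here is a rough sketch of the arithmetic only.
It is not GCC's actual logic, and required_lmul and fits_max_lmul are
hypothetical helper names. The assumption is that a VLS type under
riscv_vls_cc(ABI_VLEN) needs a register group of ceil(bits / ABI_VLEN)
registers, so the 256-bit type with ABI_VLEN 128 needs LMUL 2, which is
above the m1 limit given on the command line:

```c
#include <assert.h>

/* Hypothetical helper: number of vector registers (integer LMUL) needed
   to hold a fixed-size vector of vector_bits bits when each register is
   assumed to be abi_vlen bits wide.  Fractional LMUL is ignored.  */
unsigned required_lmul(unsigned vector_bits, unsigned abi_vlen)
{
  assert(abi_vlen > 0);
  /* Ceiling division: round up to whole registers.  */
  return (vector_bits + abi_vlen - 1) / abi_vlen;
}

/* Hypothetical helper: nonzero if the type fits within max_lmul, where
   max_lmul stands in for the -mrvv-max-lmul setting (1 for m1, 2 for m2,
   and so on).  */
int fits_max_lmul(unsigned vector_bits, unsigned abi_vlen, unsigned max_lmul)
{
  return required_lmul(vector_bits, abi_vlen) <= max_lmul;
}
```

With these assumptions, required_lmul(256, 128) is 2, so
fits_max_lmul(256, 128, 1) is false, while the same type would fit
with a limit of m2 or higher.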
--
Regards
Robin