On Tue, Sep 2, 2025 at 5:47 PM Robin Dapp <[email protected]> wrote:
>
> > Yeah, I am not insisting it must honor the type, but I am not sure we
> > should move
> > this in that way now, it seems possible to make a vector mode legal
> > but still split by
> > controlling optab_supported_p I think.
>
> The issue with the ABI is that the modes are not available at all right now I
> suppose?  And that's due to riscv-vector-switch using vls_mode_valid_p?
> Or is there something else that complicates things (just from an ABI
> perspective)?
>
> I guess we could ditch vls_mode_valid_p in riscv-vector-switch altogether to
> have the modes available but no operations with them.  We'd probably need to
> ensure that we at least have a TARGET_MIN_VLEN in the mode requirements,
> though.
>
> For "GCC vector" deconstruction IIRC forwprop (and maybe other places) just
> check for pure mode availability and not a specific optab when doing
> constructor optimizations.  So for those cases we would need to provide custom
> fallbacks in the backend.  But if I'm not missing something I don't think
> there's a lot more to it?
>
> > However I think this should be related straightforwardly since it
> > kinda much fits what
> > we claim in the document, also as you say, it has more room to play with it,
> > so I incline fix/change that in this way and then introduce other
> > options to address that later?
> > (I assume at least defer until we introduce prefered LMUL in tune_param)
> >
> > What do you think?

I tried enabling all modes while disabling all patterns except the
move-related ones (without those it ICEs at expand time), and the
result is terrible code generation. Here is a practical example:

```c
#include <stdint.h>

typedef int32_t int32x8_t __attribute__((vector_size(32)));
int32x8_t __attribute__((riscv_vls_cc(128)))
test_256bit_vector(int32x8_t vec1, int32x8_t vec2) {
   int32x8_t result;
   result = vec1 + vec2;
   return result;
}
```

We get:
```asm
       addi    sp,sp,-96
       .cfi_def_cfa_offset 96
       vsetivli        zero,8,e32,m2,ta,ma
       addi    a5,sp,32
       vse32.v v8,0(a5)
       addi    a5,sp,64
       vse32.v v10,0(a5)
       vmv.v.i v8,0
       addi    a5,sp,32
       vse32.v v8,0(sp)
       vsetivli        zero,4,e32,m1,ta,ma
       vle32.v v8,0(a5)
       addi    a5,sp,64
       vle32.v v9,0(a5)
       addi    a5,sp,48
       vadd.vv v8,v8,v9
       vse32.v v8,0(sp)
       vle32.v v8,0(a5)
       addi    a5,sp,80
       vle32.v v9,0(a5)
       addi    a5,sp,16
       vadd.vv v8,v8,v9
       vse32.v v8,0(a5)
       vsetivli        zero,8,e32,m2,ta,ma
       vle32.v v8,0(sp)
       addi    sp,sp,96
       .cfi_def_cfa_offset 0
       jr      ra
```

That is because we get lots of subregs like:

```rtl
(insn 8 7 9 2 (set (reg:V4SI 145 [ _7 ])
       (plus:V4SI (subreg:V4SI (reg/v:V8SI 141 [ vec1 ]) 0)
           (subreg:V4SI (reg/v:V8SI 142 [ vec2 ]) 0)))
```


A subreg can't simply be mapped to the lower or higher register of the
group, since that breaks when VLEN is larger than MIN_VLEN,
so the only safe way is to go through memory.

I also tried TARGET_OPTAB_SUPPORTED_P, but it did not work as expected ... :(

And I am not really insistent on making VLS type work without
-mrvv-max-lmul, so I am happy if we can find a way to implement VLS CC
without this change.




Here is the branch in case you are interested in playing with that:
https://github.com/kito-cheng/gcc/tree/kitoc/vls-cc-testing


>
> Is the suggestion to have another param at some point that controls LMUL for
> "GCC vectors"?  And then have both in a tune struct rather than just
> rvv-max-lmul?

Yes, but forget about this part for now; I don't want to expand the topic
too much here, since it is not high priority for me :P

>
> --
> Regards
>  Robin
>
