Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Mon, Mar 27, 2023 at 6:02 PM Kevin Lee <kev...@rivosinc.com> wrote:
>>
>> This patch is a proper fix to the previous patch
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614463.html
>> vect_grouped_store_supported checks if the count is a power of 2, but
>> doesn't check the size of the GET_MODE_NUNITS.
>> This should handle the riscv case where the mode is VNx1DI since the
>> nelt would be {1, 1}.
>> It was tested on RISCV and x86_64-linux-gnu. Would this be correct
>> for the vectors with size smaller than 2?
>>
>> ---
>>  gcc/tree-vect-data-refs.cc | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
>> index 8daf7bd7dd3..04ad12f7d04 100644
>> --- a/gcc/tree-vect-data-refs.cc
>> +++ b/gcc/tree-vect-data-refs.cc
>> @@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
>>           poly_uint64 nelt = GET_MODE_NUNITS (mode);
>>
>>           /* The encoding has 2 interleaved stepped patterns.  */
>> +    if(!nelt.is_constant() && maybe_lt(nelt, (unsigned int) 2))
>> +      return false;
>
> Indentation is off (or your MUA is broken).  I think the nelt.is_constant ()
> check is superfluous but with constant nelt we'd never end up with a
> grouped store.
>
> Note the calls are guarded with
>
>          && ! known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U)
>
> maybe the better fix is to change those to ! maybe_eq?

I think the point of those checks is that a grouped store of N 1-element
vectors is equivalent to a store of N scalars.  Nothing needs to happen
internally within the vectors.

For a grouped store of VNx1 vectors, some permutation would be needed.
But it's difficult to generate code for that case, because the minimum
size reduces to two scalars while larger sizes need normal interleaves.

But I think the better check for the location above is:

   if (!multiple_p (nelt, 2))
     return false;

which then guards the assert in the later exact_div (nelt, 2).
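
To illustrate why multiple_p is the tighter guard, here is a minimal
standalone sketch.  It does not use GCC's poly_int.h: toy_poly and the
toy_* helpers are hypothetical stand-ins that model a two-coefficient
poly_uint64 as c0 + c1 * x for a runtime x >= 0, and the {2, 2} and
{3, 3} values are assumed purely for illustration.  Only the {1, 1}
value for VNx1DI comes from the patch description.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient poly_uint64 (NOT GCC's real type):
// the represented value is c0 + c1 * x for some runtime x >= 0.
struct toy_poly {
  std::uint64_t c0;
  std::uint64_t c1;
};

// True iff the value is a multiple of B for every x; this holds exactly
// when both coefficients are multiples of B.
constexpr bool toy_multiple_p (toy_poly a, std::uint64_t b)
{ return a.c0 % b == 0 && a.c1 % b == 0; }

// True iff some x makes the value less than B.  With unsigned
// coefficients the minimum is at x == 0, so only c0 matters.
constexpr bool toy_maybe_lt (toy_poly a, std::uint64_t b)
{ return a.c0 < b; }

// True iff the value equals B for every x.
constexpr bool toy_known_eq (toy_poly a, std::uint64_t b)
{ return a.c0 == b && a.c1 == 0; }

// Division that is only valid when toy_multiple_p holds, mirroring the
// kind of assert that a preceding multiple_p check is meant to guard.
inline toy_poly toy_exact_div (toy_poly a, std::uint64_t b)
{
  assert (toy_multiple_p (a, b));
  return toy_poly{a.c0 / b, a.c1 / b};
}

// VNx1DI-like nelt {1, 1}: not known to be 1, so a
// "! known_eq (..., 1U)" guard lets it through, yet it cannot be halved.
static_assert (!toy_known_eq (toy_poly{1, 1}, 1), "1 + x is not always 1");
static_assert (!toy_multiple_p (toy_poly{1, 1}, 2), "1 + x is not always even");

// Assumed {2, 2} (VNx2DI-like) and constant 4: both can be halved.
static_assert (toy_multiple_p (toy_poly{2, 2}, 2), "2 + 2x is always even");
static_assert (toy_multiple_p (toy_poly{4, 0}, 2), "4 is even");

// Hypothetical {3, 3}: never less than 2, so a maybe_lt-based check
// would accept it, but multiple_p rejects it before the division.
static_assert (!toy_maybe_lt (toy_poly{3, 3}, 2), "3 + 3x is never < 2");
static_assert (!toy_multiple_p (toy_poly{3, 3}, 2), "3 + 3x can be odd");
```

In this model the maybe_lt form only rejects values whose minimum is
below 2, whereas the multiple_p form rejects every value for which the
later division by 2 would not be exact.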

Thanks,
Richard
