Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Robin Dapp via Gcc
> the dump-scans.  Can we do sth like
> "vect_recog_dot_prod_pattern: detected\n(!FAILED)*SUCCEEDED", thus
> after the dot-prod pattern dumping allow arbitrary stuff but _not_
> a "failed" and then require a "succeeded"?

It took some fighting with tcl syntax until I arrived at the regex
pattern below but it looks like it is possible.

 "vect_recog_dot_prod_pattern: detected(?:(?!failed).)*succeeded".

This seems to work for the failing cases and I'm going to send a patch
tomorrow if an x86 testsuite run is unchanged.

Regards
 Robin


Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Robin Dapp via Gcc


>> I am wondering whether we do have some situations that
>> vec_pack/vec_unpack/vec_widen_xxx/dot_prod pattern can be
>> beneficial for RVV ? I have ever met some situation that vec_unpack
>> can be beneficial when working on SELECT_VL but I don't which
>> case
> 
> With fixed size vectors you'll face the situation that the vectorizer
> chooses the "wrong" vector type so yes, I think implementing
> vec_unpack[s]_{lo,hi} might be useful.  But I wouldn't prioritize this
> until you have a more clear picture of how useful it would be.

Another thing that comes to mind is that we currently don't do
vectorizable calls with mismatched vector sizes.  So even if we detected
e.g. vec_widen_plus early it wouldn't get us much further.
On the other hand, I don't think we perform many optimizations on such
patterns between vect and combine (where we finally generate those).

Regards
 Robin



Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Robin Dapp via Gcc
> it's target dependent what we choose first so it's going to be
> a bit difficult to adjust testcases like this (and it looks like
> a testsuite issue).  I think for this specific testcase changing
> scan-tree-dump-times to scan-tree-dump is reasonable.  Note we
> really want to check that for the case we choose finally
> we use the sdot pattern, but I don't see how we can easily constrain
> the dump-scans.  Can we do sth like
> "vect_recog_dot_prod_pattern: detected\n(!FAILED)*SUCCEEDED", thus
> after the dot-prod pattern dumping allow arbitrary stuff but _not_
> a "failed" and then require a "succeeded"?
> 
> The other way would be to somehow add a dump flag that produces
> dumps only for the succeeded part.  Of course we have targets that
> evaluate multiple succeeded parts for costing (but luckily costing
> is disabled for most tests).

I'm going to have a try at fixing the test expectations but not before
tomorrow.  Right now I can see
 "vect_recog_widen_mult_pattern: detected"
even four times in the dump, 2x for each try, so I first need to
understand what's going on.

Regards
 Robin


Re: Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Richard Biener via Gcc
On Wed, 30 Aug 2023, juzhe.zh...@rivai.ai wrote:

> I am wondering whether we do have some situations that 
> vec_pack/vec_unpack/vec_widen_xxx/dot_prod pattern can be beneficial for RVV ?
> I have ever met some situation that vec_unpack can be beneficial when working 
> on SELECT_VL but I don't which case

With fixed size vectors you'll face the situation that the vectorizer
chooses the "wrong" vector type so yes, I think implementing
vec_unpack[s]_{lo,hi} might be useful.  But I wouldn't prioritize this
until you have a more clear picture of how useful it would be.

Richard.

> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Robin Dapp
> Date: 2023-08-30 16:06
> To: Richard Biener; juzhe.zh...@rivai.ai
> CC: rdapp.gcc; gcc
> Subject: Re: Question about wrapv-vect-reduc-dot-s8b.c
> >> To fix it, is it necessary to support 'vec_unpack' ?
> > 
> > both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
> > ties its hands by choosing vector types early and based on the number
> > of incoming/outgoing vectors it chooses one or the other method.
> > 
> > More precise dumping would probably help here but somewhere earlier you
> > should be able to see the vector type used for _2
> We usually try with a "normal" mode like VNx4SI (RVVM1SI or so) and
> then switch to VNx4QI (i.e. a mode that only determines the number of
> units/elements) and have vectorize_related_mode return modes with the
> same number of units.  This will then result in the sext/zext patterns
> matching.  The first round where we try the normal mode will not match
> those because the related mode has a different number of units.
>  
> So it's somewhat expected that the first try fails.
>  
> My dump shows that we vectorize, so IMHO no problem.  I can take a look
> at this but it doesn't look like a case for pack/unpack.  
>  
> Regards
> Robin
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Richard Biener via Gcc
On Wed, 30 Aug 2023, Robin Dapp wrote:

> >> To fix it, is it necessary to support 'vec_unpack' ?
> > 
> > both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
> > ties its hands by choosing vector types early and based on the number
> > of incoming/outgoing vectors it chooses one or the other method.
> > 
> > More precise dumping would probably help here but somewhere earlier you
> > should be able to see the vector type used for _2
> We usually try with a "normal" mode like VNx4SI (RVVM1SI or so) and
> then switch to VNx4QI (i.e. a mode that only determines the number of
> units/elements) and have vectorize_related_mode return modes with the
> same number of units.  This will then result in the sext/zext patterns
> matching.  The first round where we try the normal mode will not match
> those because the related mode has a different number of units.
> 
> So it's somewhat expected that the first try fails.
> 
> My dump shows that we vectorize, so IMHO no problem.  I can take a look
> at this but it doesn't look like a case for pack/unpack.

it's target dependent what we choose first so it's going to be
a bit difficult to adjust testcases like this (and it looks like
a testsuite issue).  I think for this specific testcase changing
scan-tree-dump-times to scan-tree-dump is reasonable.  Note we
really want to check that for the case we choose finally
we use the sdot pattern, but I don't see how we can easily constrain
the dump-scans.  Can we do sth like
"vect_recog_dot_prod_pattern: detected\n(!FAILED)*SUCCEEDED", thus
after the dot-prod pattern dumping allow arbitrary stuff but _not_
a "failed" and then require a "succeeded"?

The other way would be to somehow add a dump flag that produces
dumps only for the succeeded part.  Of course we have targets that
evaluate multiple succeeded parts for costing (but luckily costing
is disabled for most tests).

Richard.


Re: Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread juzhe.zh...@rivai.ai
I am wondering whether we do have some situations that 
vec_pack/vec_unpack/vec_widen_xxx/dot_prod pattern can be beneficial for RVV ?
I have ever met some situation that vec_unpack can be beneficial when working 
on SELECT_VL but I don't which case



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-08-30 16:06
To: Richard Biener; juzhe.zh...@rivai.ai
CC: rdapp.gcc; gcc
Subject: Re: Question about wrapv-vect-reduc-dot-s8b.c
>> To fix it, is it necessary to support 'vec_unpack' ?
> 
> both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
> ties its hands by choosing vector types early and based on the number
> of incoming/outgoing vectors it chooses one or the other method.
> 
> More precise dumping would probably help here but somewhere earlier you
> should be able to see the vector type used for _2
We usually try with a "normal" mode like VNx4SI (RVVM1SI or so) and
then switch to VNx4QI (i.e. a mode that only determines the number of
units/elements) and have vectorize_related_mode return modes with the
same number of units.  This will then result in the sext/zext patterns
matching.  The first round where we try the normal mode will not match
those because the related mode has a different number of units.
 
So it's somewhat expected that the first try fails.
 
My dump shows that we vectorize, so IMHO no problem.  I can take a look
at this but it doesn't look like a case for pack/unpack.  
 
Regards
Robin
 


Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Robin Dapp via Gcc
>> To fix it, is it necessary to support 'vec_unpack' ?
> 
> both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
> ties its hands by choosing vector types early and based on the number
> of incoming/outgoing vectors it chooses one or the other method.
> 
> More precise dumping would probably help here but somewhere earlier you
> should be able to see the vector type used for _2
We usually try with a "normal" mode like VNx4SI (RVVM1SI or so) and
then switch to VNx4QI (i.e. a mode that only determines the number of
units/elements) and have vectorize_related_mode return modes with the
same number of units.  This will then result in the sext/zext patterns
matching.  The first round where we try the normal mode will not match
those because the related mode has a different number of units.

So it's somewhat expected that the first try fails.

My dump shows that we vectorize, so IMHO no problem.  I can take a look
at this but it doesn't look like a case for pack/unpack.  

Regards
 Robin


Re: Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread juzhe.zh...@rivai.ai
Thanks Richi.

>> both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
Sorry I made a mistake here. They are not the same nunits.

wrapv-vect-reduc-dot-s8b.c:29:14: note:   get vectype for scalar type:  short 
int
wrapv-vect-reduc-dot-s8b.c:29:14: note:   vectype: vector([8,8]) short int
wrapv-vect-reduc-dot-s8b.c:29:14: note:   nunits = [8,8]
wrapv-vect-reduc-dot-s8b.c:29:14: note:   ==> examining statement: _1 = X[i_14];
wrapv-vect-reduc-dot-s8b.c:29:14: note:   precomputed vectype: vector([16,16]) 
signed char
wrapv-vect-reduc-dot-s8b.c:29:14: note:   nunits = [16,16]
wrapv-vect-reduc-dot-s8b.c:29:14: note:   ==> examining statement: _2 = (short 
int) _1;
wrapv-vect-reduc-dot-s8b.c:29:14: note:   get vectype for scalar type: short int
wrapv-vect-reduc-dot-s8b.c:29:14: note:   vectype: vector([8,8]) short int
wrapv-vect-reduc-dot-s8b.c:29:14: note:   get vectype for smallest scalar type: 
signed char
wrapv-vect-reduc-dot-s8b.c:29:14: note:   nunits vectype: vector([16,16]) 
signed char
wrapv-vect-reduc-dot-s8b.c:29:14: note:   nunits = [16,16]

Turns out for _2, it picks vector([8,8]) short int and _1, it picks 
vector([16,16]) signed char
at the first time analysis.

It seems that because we don't support vec_unpacks so that the first time 
analysis failed ? 
Then we end up with "2" times these 2 checks:

> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_dot_prod_pattern: detected" 1
> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_widen_mult_pattern: detected" 1



juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-08-30 15:45
To: juzhe.zh...@rivai.ai
CC: gcc; Robin Dapp
Subject: Re: Question about wrapv-vect-reduc-dot-s8b.c
On Wed, 30 Aug 2023, juzhe.zh...@rivai.ai wrote:
 
> Hi, I start to enable "vect" testsuite for RISC-V.
> 
> I have a question when analyzing the 'wrapv-vect-reduc-dot-s8b.c'
> It failed at:
> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_dot_prod_pattern: detected" 1
> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_widen_mult_pattern: detected" 1
> 
> They are found "2" times.
> 
> Since at the first time, it failed at the vectorization of conversion:
> 
> wrapv-vect-reduc-dot-s8b.c:29:14: missed:   conversion not supported by 
> target.
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: operand 
> X[i_14], type of def: internal
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: vectype 
> vector([16,16]) signed char
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: operand 
> X[i_14], type of def: internal
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: vectype 
> vector([16,16]) signed char
> wrapv-vect-reduc-dot-s8b.c:30:17: missed:   not vectorized: relevant stmt not 
> supported: _2 = (short int) _1;
> wrapv-vect-reduc-dot-s8b.c:29:14: missed:  bad operation or unsupported loop 
> bound.
> 
> Here loop vectorizer is trying to do the conversion from char -> short with 
> both same nunits.
> But we don't support 'vec_unpack' stuff in RISC-V backend since I don't see 
> the case that vec_unpack can optimize the codegen of autovectorizatio for RVV.
> 
> To fix it, is it necessary to support 'vec_unpack' ?
 
both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
ties its hands by choosing vector types early and based on the number
of incoming/outgoing vectors it chooses one or the other method.
 
More precise dumping would probably help here but somewhere earlier you
should be able to see the vector type used for _2
 
Richard.
 


Re: Question about wrapv-vect-reduc-dot-s8b.c

2023-08-30 Thread Richard Biener via Gcc
On Wed, 30 Aug 2023, juzhe.zh...@rivai.ai wrote:

> Hi, I start to enable "vect" testsuite for RISC-V.
> 
> I have a question when analyzing the 'wrapv-vect-reduc-dot-s8b.c'
> It failed at:
> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_dot_prod_pattern: detected" 1
> FAIL: gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_widen_mult_pattern: detected" 1
> 
> They are found "2" times.
> 
> Since at the first time, it failed at the vectorization of conversion:
> 
> wrapv-vect-reduc-dot-s8b.c:29:14: missed:   conversion not supported by 
> target.
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: operand 
> X[i_14], type of def: internal
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: vectype 
> vector([16,16]) signed char
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: operand 
> X[i_14], type of def: internal
> wrapv-vect-reduc-dot-s8b.c:29:14: note:   vect_is_simple_use: vectype 
> vector([16,16]) signed char
> wrapv-vect-reduc-dot-s8b.c:30:17: missed:   not vectorized: relevant stmt not 
> supported: _2 = (short int) _1;
> wrapv-vect-reduc-dot-s8b.c:29:14: missed:  bad operation or unsupported loop 
> bound.
> 
> Here loop vectorizer is trying to do the conversion from char -> short with 
> both same nunits.
> But we don't support 'vec_unpack' stuff in RISC-V backend since I don't see 
> the case that vec_unpack can optimize the codegen of autovectorizatio for RVV.
> 
> To fix it, is it necessary to support 'vec_unpack' ?

both same units would be sext, not vec_unpacks_{lo,hi} - the vectorizer
ties its hands by choosing vector types early and based on the number
of incoming/outgoing vectors it chooses one or the other method.

More precise dumping would probably help here but somewhere earlier you
should be able to see the vector type used for _2

Richard.