Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread 钟居哲
No, I just want to know whether there are any CALL vectorizations that need
length or mask predication.

For example, in current GCC vectorization only the FMAX/FMIN/FMA/FNMA/FMS/FNMS
calls need length or mask predication. I don't care about sin/cos/popcount
etc. For those, plain full-vector auto-vectorization is fine, and there is no
need to add RVV-specific support to the middle-end.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-07-13 22:32
To: Palmer Dabbelt; gcc-patches
CC: rdapp.gcc; richard.guenther; juzhe.zhong; Kito Cheng; kito.cheng
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
auto-vectorization
>>>> From my understanding, we don't have an RVV instruction for fmax/fmin?
> 
> Unless I'm misunderstanding, we do.  The ISA manual says
> 
> === Vector Floating-Point MIN/MAX Instructions
> 
> The vector floating-point `vfmin` and `vfmax` instructions have the
> same behavior as the corresponding scalar floating-point instructions
> in version 2.2 of the RISC-V F/D/Q extension: they perform the
> `minimumNumber` or `maximumNumber` operation on active elements.
> 
> 
> # Floating-point minimum
> vfmin.vv vd, vs2, vs1, vm   # Vector-vector
> vfmin.vf vd, vs2, rs1, vm   # vector-scalar
> 
> # Floating-point maximum
> vfmax.vv vd, vs2, vs1, vm   # Vector-vector
> vfmax.vf vd, vs2, rs1, vm   # vector-scalar
> 
> 
> so we should be able to match at least some loops.
 
We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i])
but for fmin/fmax they are not wired up yet (as opposed to the scalar variants).
Juzhe, are you referring to something else?  I mean, it's always a bit tricky
for backends to verify whether the fmin/fmax behavior exactly matches the
instruction with regard to signaling NaNs, rounding, etc., but if the scalar
variant is fine I don't see why the vector variant would be worse.
 
Regards
Robin
 
 


Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
>>>> From my understanding, we don't have an RVV instruction for fmax/fmin?
> 
> Unless I'm misunderstanding, we do.  The ISA manual says
> 
> === Vector Floating-Point MIN/MAX Instructions
> 
> The vector floating-point `vfmin` and `vfmax` instructions have the
> same behavior as the corresponding scalar floating-point instructions
> in version 2.2 of the RISC-V F/D/Q extension: they perform the
> `minimumNumber` or `maximumNumber` operation on active elements.
> 
> 
> # Floating-point minimum
> vfmin.vv vd, vs2, vs1, vm   # Vector-vector
> vfmin.vf vd, vs2, rs1, vm   # vector-scalar
> 
> # Floating-point maximum
> vfmax.vv vd, vs2, vs1, vm   # Vector-vector
> vfmax.vf vd, vs2, rs1, vm   # vector-scalar
> 
> 
> so we should be able to match at least some loops.

We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i])
but for fmin/fmax they are not wired up yet (as opposed to the scalar variants).
Juzhe, are you referring to something else?  I mean, it's always a bit tricky
for backends to verify whether the fmin/fmax behavior exactly matches the
instruction with regard to signaling NaNs, rounding, etc., but if the scalar
variant is fine I don't see why the vector variant would be worse.

Regards
 Robin



Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Palmer Dabbelt
On Thu, 13 Jul 2023 07:01:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
>
> On 7/13/23 01:47, Richard Biener wrote:
>> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲  wrote:
>>>
>>> I noticed vectorizable_call in the loop vectorizer.
>>> It vectorizes function calls such as fmax/fmin.
>>> From my understanding, we don't have an RVV instruction for fmax/fmin?

Unless I'm misunderstanding, we do.  The ISA manual says

=== Vector Floating-Point MIN/MAX Instructions

The vector floating-point `vfmin` and `vfmax` instructions have the
same behavior as the corresponding scalar floating-point instructions
in version 2.2 of the RISC-V F/D/Q extension: they perform the
`minimumNumber` or `maximumNumber` operation on active elements.


# Floating-point minimum
vfmin.vv vd, vs2, vs1, vm   # Vector-vector
vfmin.vf vd, vs2, rs1, vm   # vector-scalar

# Floating-point maximum
vfmax.vv vd, vs2, vs1, vm   # Vector-vector
vfmax.vf vd, vs2, rs1, vm   # vector-scalar


so we should be able to match at least some loops.

>>
>> There's things like .POPCOUNT which we can vectorize, but sure, it
>> depends on the ISA if there's anything.
> Right.  And RV has some of these -- vcpop, vfirst...  Supporting them
> obviously isn't a requirement for a vector implementation, but they're
> nice to have :-)
>
> Jeff


Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Jeff Law via Gcc-patches




On 7/13/23 01:47, Richard Biener wrote:
> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 wrote:
>>
>> I noticed vectorizable_call in the loop vectorizer.
>> It vectorizes function calls such as fmax/fmin.
>> From my understanding, we don't have an RVV instruction for fmax/fmin?
>
> There's things like .POPCOUNT which we can vectorize, but sure, it
> depends on the ISA if there's anything.

Right.  And RV has some of these -- vcpop, vfirst...  Supporting them 
obviously isn't a requirement for a vector implementation, but they're 
nice to have :-)


Jeff


Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Jeff Law via Gcc-patches




On 7/12/23 17:30, 钟居哲 wrote:
> I noticed vectorizable_call in the loop vectorizer.
> It vectorizes function calls such as fmax/fmin.
> From my understanding, we don't have an RVV instruction for fmax/fmin?
>
> So for now, I don't need to support builtin function call vectorization 
> for RVV.
>
> Am I right?

Yes, you are correct.

> I am wondering whether we do have some kind of builtin function call 
> vectorization by using RVV instructions.

It can be advantageous, even if the call doesn't collapse down to a 
single vector instruction.  Consider libmvec which is an API to provide 
things like sin, cos, exp, log, etc in vector form.

Once the library routines are written, those can then be exposed to the 
compiler in turn allowing vectorization of loops with a subset of calls 
such as sin, cos, pow, log, etc.


jeff


Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Richard Biener via Gcc-patches
On Thu, Jul 13, 2023 at 1:30 AM 钟居哲  wrote:
>
> I noticed vectorizable_call in the loop vectorizer.
> It vectorizes function calls such as fmax/fmin.
> From my understanding, we don't have an RVV instruction for fmax/fmin?

There's things like .POPCOUNT which we can vectorize, but sure, it
depends on the ISA if there's anything.
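
A minimal example of such a loop, with the hypothetical name `popcount_array`. Whether it is actually vectorized depends on the target providing a suitable vector popcount pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-element population count; __builtin_popcount is a GCC builtin.  */
void
popcount_array (uint32_t *dst, const uint32_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint32_t) __builtin_popcount (src[i]);
}
```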

> So for now, I don't need to support builtin function call vectorization for
> RVV.
> Am I right?
>
> I am wondering whether we do have some kind of builtin function call 
> vectorization by using RVV instructions.
>
>
> Thanks.
>
>
> juzhe.zh...@rivai.ai
>
> From: Jeff Law
> Date: 2023-07-13 06:25
> To: 钟居哲; gcc-patches
> CC: kito.cheng; kito.cheng; rdapp.gcc
> Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
> auto-vectorization
>
>
> On 7/12/23 16:17, 钟居哲 wrote:
> > Thanks Jeff.
> > Will commit after formatting the code.
> >
> > I am going to support COND_FMA and reduction first (which I think
> > is higher priority).
> > Then I will come back to support strided_load/store.
> Sure.  One thing to note with strided loads, they can significantly
> help x264's sad/satd loops.  So hopefully you're testing with those :-)
>
>
>
> jeff
>


RE: Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches
Sent: Thursday, July 13, 2023 5:49 AM
To: 钟居哲; gcc-patches
Cc: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV
auto-vectorization



On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.



Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indentation so badly that the result is unreadable.

Jeff


Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread 钟居哲
I noticed vectorizable_call in the loop vectorizer.
It vectorizes function calls such as fmax/fmin.
From my understanding, we don't have an RVV instruction for fmax/fmin?

So for now, I don't need to support builtin function call vectorization for RVV.
Am I right?

I am wondering whether we do have some kind of builtin function call 
vectorization by using RVV instructions.


Thanks.


juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-07-13 06:25
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
auto-vectorization
 
 
On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit after formatting the code.
> 
> I am going to support COND_FMA and reduction first (which I think 
> is higher priority).
> Then I will come back to support strided_load/store.
Sure.  One thing to note with strided loads, they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)
 
 
 
jeff
 


Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit after formatting the code.
>
> I am going to support COND_FMA and reduction first (which I think 
> is higher priority).
> Then I will come back to support strided_load/store.

Sure.  One thing to note with strided loads, they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)
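
For context, a SAD (sum of absolute differences) kernel over a 2-D block looks roughly like the sketch below. This is a simplified illustration, not x264's actual code; the name `sad_block` and its parameters are assumptions. Consecutive rows are `stride1` and `stride2` bytes apart, so accesses that walk down the rows are separated by a runtime stride.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two w x h pixel blocks.  */
int
sad_block (const uint8_t *p1, int stride1,
           const uint8_t *p2, int stride2, int w, int h)
{
  int sum = 0;
  for (int y = 0; y < h; y++)
    {
      for (int x = 0; x < w; x++)
        sum += abs (p1[x] - p2[x]);
      p1 += stride1;
      p2 += stride2;
    }
  return sum;
}
```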




jeff


Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread 钟居哲
Thanks Jeff.
Will commit after formatting the code.

I am going to support COND_FMA and reduction first (which I think is 
higher priority).
Then I will come back to support strided_load/store.

Thanks.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-07-13 05:48
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
auto-vectorization
 
 
On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.
 
 
 
Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indentation so badly that the result is unreadable.
 
Jeff
 


Re: Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
>
> Ok for trunk?

Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.

Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indentation so badly that the result is unreadable.


Jeff