Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
No, I just want to know whether any CALL vectorization needs length or
mask predication. For example, in GCC's current CALL vectorization, only
FMAX/FMIN/FMA/FNMA/FMS/FNMS need a length or mask predicate.
I don't care about sin/cos/popcount, etc. For those, full-vector
auto-vectorization is fine; there is no need to add RVV support to the
middle-end.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2023-07-13 22:32
To: Palmer Dabbelt; gcc-patches
CC: rdapp.gcc; richard.guenther; juzhe.zhong; Kito Cheng; kito.cheng
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
>>>> From my understanding, we don't have RVV instructions for fmax/fmin?
>
> Unless I'm misunderstanding, we do. The ISA manual says
>
> === Vector Floating-Point MIN/MAX Instructions
>
> The vector floating-point `vfmin` and `vfmax` instructions have the
> same behavior as the corresponding scalar floating-point instructions
> in version 2.2 of the RISC-V F/D/Q extension: they perform the
> `minimumNumber` or `maximumNumber` operation on active elements.
>
>
>     # Floating-point minimum
>     vfmin.vv vd, vs2, vs1, vm   # Vector-vector
>     vfmin.vf vd, vs2, rs1, vm   # vector-scalar
>
>     # Floating-point maximum
>     vfmax.vv vd, vs2, vs1, vm   # Vector-vector
>     vfmax.vf vd, vs2, rs1, vm   # vector-scalar
>
>
> so we should be able to match at least some loops.

We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i]),
but for fmin/fmax they are not wired up yet (as opposed to the scalar
variants). Juzhe, are you referring to something else? I mean, it's
always a bit tricky for backends to verify whether the fmin/fmax
behavior exactly matches the instruction with regard to signaling NaNs,
rounding, etc., but if the scalar variant is fine I don't see why the
vector variant would be worse.

Regards
 Robin
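To make "length or mask predication" concrete, here is a minimal scalar sketch in C of the two flavours applied to a max operation. It is not GCC or RVV code: `VLMAX`, `max_len_predicated`, and `max_mask_predicated` are hypothetical names chosen for illustration, and each inner loop or guarded statement stands in for what a single predicated vector instruction would do.

```c
#include <stddef.h>

/* Hypothetical number of elements one "vector instruction" handles. */
#define VLMAX 4

/* Length predication: each step processes vl = min(remaining, VLMAX)
   elements, so the loop tail needs no separate scalar epilogue. */
void max_len_predicated(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ) {
        size_t vl = (n - i < VLMAX) ? n - i : VLMAX;
        /* Stand-in for one length-controlled vector max. */
        for (size_t j = 0; j < vl; j++)
            dst[i + j] = (a[i + j] > b[i + j]) ? a[i + j] : b[i + j];
        i += vl;
    }
}

/* Mask predication: every lane is visited, but only lanes whose mask
   element is nonzero are written; inactive lanes keep their old value. */
void max_mask_predicated(float *dst, const float *a, const float *b,
                         const unsigned char *mask, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (mask[i])
            dst[i] = (a[i] > b[i]) ? a[i] : b[i];
}
```

With n = 6 and VLMAX = 4, the length-predicated version runs one step with vl = 4 and one with vl = 2.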
Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
>>>> From my understanding, we don't have RVV instructions for fmax/fmin?
>
> Unless I'm misunderstanding, we do. The ISA manual says
>
> === Vector Floating-Point MIN/MAX Instructions
>
> The vector floating-point `vfmin` and `vfmax` instructions have the
> same behavior as the corresponding scalar floating-point instructions
> in version 2.2 of the RISC-V F/D/Q extension: they perform the
> `minimumNumber` or `maximumNumber` operation on active elements.
>
>
>     # Floating-point minimum
>     vfmin.vv vd, vs2, vs1, vm   # Vector-vector
>     vfmin.vf vd, vs2, rs1, vm   # vector-scalar
>
>     # Floating-point maximum
>     vfmax.vv vd, vs2, vs1, vm   # Vector-vector
>     vfmax.vf vd, vs2, rs1, vm   # vector-scalar
>
>
> so we should be able to match at least some loops.

We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i]),
but for fmin/fmax they are not wired up yet (as opposed to the scalar
variants). Juzhe, are you referring to something else? I mean, it's
always a bit tricky for backends to verify whether the fmin/fmax
behavior exactly matches the instruction with regard to signaling NaNs,
rounding, etc., but if the scalar variant is fine I don't see why the
vector variant would be worse.

Regards
 Robin
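The two loop shapes Robin contrasts can be sketched in C as follows. This is an illustration, not code from the patch: `max_by_select`, `max_by_call`, and `max_number` are hypothetical names, and `max_number` is only a stand-in for `fmaxf` that models the "return the non-NaN operand" part of `maximumNumber`. It does not model signaling NaNs or signed zeros, which are exactly the details a backend would have to check.

```c
#include <stddef.h>

/* The conditional form from the message. If either operand is NaN the
   comparison is false, so b[i] is selected. */
void max_by_select(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (a[i] > b[i]) ? a[i] : b[i];
}

/* Hypothetical stand-in for fmaxf: if exactly one operand is NaN,
   return the other one. (x != x) is true only when x is NaN. */
static float max_number(float x, float y)
{
    if (x != x) return y;
    if (y != y) return x;
    return (x > y) ? x : y;
}

/* The call form; real code would call fmaxf from <math.h> here. */
void max_by_call(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = max_number(a[i], b[i]);
}
```

The two loops agree on ordinary numbers and differ when an element of `b` is NaN: the select form stores the NaN, while the call form stores the numeric operand. That difference is one reason the two shapes cannot simply be treated as the same operation.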
Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On Thu, 13 Jul 2023 07:01:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
>
> On 7/13/23 01:47, Richard Biener wrote:
>> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 wrote:
>>>
>>> I noticed vectorizable_call in the loop vectorizer.
>>> It vectorizes calls to functions such as fmax/fmin.
>>> From my understanding, we don't have RVV instructions for fmax/fmin?

Unless I'm misunderstanding, we do. The ISA manual says

    === Vector Floating-Point MIN/MAX Instructions

    The vector floating-point `vfmin` and `vfmax` instructions have the
    same behavior as the corresponding scalar floating-point instructions
    in version 2.2 of the RISC-V F/D/Q extension: they perform the
    `minimumNumber` or `maximumNumber` operation on active elements.

        # Floating-point minimum
        vfmin.vv vd, vs2, vs1, vm   # Vector-vector
        vfmin.vf vd, vs2, rs1, vm   # vector-scalar

        # Floating-point maximum
        vfmax.vv vd, vs2, vs1, vm   # Vector-vector
        vfmax.vf vd, vs2, rs1, vm   # vector-scalar

so we should be able to match at least some loops.

>>
>> There are things like .POPCOUNT which we can vectorize, but sure, it
>> depends on the ISA if there's anything.
> Right. And RV has some of these -- vcpop, vfirst... Supporting them
> obviously isn't a requirement for a vector implementation, but they're
> nice to have :-)
>
> Jeff
Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On 7/13/23 01:47, Richard Biener wrote:
> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 wrote:
>>
>> I noticed vectorizable_call in the loop vectorizer.
>> It vectorizes calls to functions such as fmax/fmin.
>> From my understanding, we don't have RVV instructions for fmax/fmin?
>
> There are things like .POPCOUNT which we can vectorize, but sure, it
> depends on the ISA if there's anything.

Right. And RV has some of these -- vcpop, vfirst... Supporting them
obviously isn't a requirement for a vector implementation, but they're
nice to have :-)

Jeff
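For reference, a loop of the kind Richard alludes to might look like the sketch below. `popcount_each` is a hypothetical name; `__builtin_popcount` is the GCC/Clang builtin. Whether such a loop is actually vectorized depends on the target and on which instructions it provides, so treat this only as the source-level shape, not as a statement about what any RVV configuration does today.

```c
#include <stddef.h>

/* Per-element population count. A loop of this shape is a candidate
   for call vectorization when the target has a matching vector
   operation; otherwise it stays scalar. */
void popcount_each(unsigned *dst, const unsigned *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned)__builtin_popcount(src[i]);
}
```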
Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On 7/12/23 17:30, 钟居哲 wrote:
> I noticed vectorizable_call in the loop vectorizer.
> It vectorizes calls to functions such as fmax/fmin.
> From my understanding, we don't have RVV instructions for fmax/fmin?
>
> So for now, I don't need to support builtin function call vectorization
> for RVV.
> Am I right?

Yes, you are correct.

> I am wondering whether we have any builtin function call vectorization
> that uses RVV instructions.

It can be advantageous, even if the call doesn't collapse down to a
single vector instruction. Consider libmvec, which is an API that
provides things like sin, cos, exp, log, etc. in vector form. Once the
library routines are written, they can be exposed to the compiler, in
turn allowing vectorization of loops with a subset of calls such as
sin, cos, pow, log, etc.

jeff
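The idea behind a vector math library can be sketched in plain C. Everything here is hypothetical and chosen only for illustration: `toy_sin` is a crude polynomial standing in for `sinf`, `toy_sin_x4` stands in for a 4-wide library routine, and neither is a libmvec entry point. The point is the loop transformation: once the compiler knows a vector variant exists, a loop of scalar calls can become full-width chunks plus a scalar tail.

```c
#include <stddef.h>

/* Toy scalar routine standing in for sinf (hypothetical, low accuracy):
   the Taylor polynomial x - x^3/6. */
static float toy_sin(float x)
{
    return x - (x * x * x) / 6.0f;
}

/* Hypothetical 4-wide "vector form" of the same routine. A real vector
   math library would implement this with vector instructions. */
static void toy_sin_x4(float out[4], const float in[4])
{
    for (int lane = 0; lane < 4; lane++)
        out[lane] = toy_sin(in[lane]);
}

/* What the programmer writes. */
void apply_scalar(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = toy_sin(src[i]);
}

/* Roughly what a vectorizer can produce once it knows a vector variant
   exists: full 4-wide chunks, then a scalar tail. */
void apply_chunked(float *dst, const float *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        toy_sin_x4(&dst[i], &src[i]);
    for (; i < n; i++)
        dst[i] = toy_sin(src[i]);
}
```

Both functions compute the same values; only the call granularity changes.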
Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 wrote:
>
> I noticed vectorizable_call in the loop vectorizer.
> It vectorizes calls to functions such as fmax/fmin.
> From my understanding, we don't have RVV instructions for fmax/fmin?

There are things like .POPCOUNT which we can vectorize, but sure, it
depends on the ISA if there's anything.

> So for now, I don't need to support builtin function call vectorization
> for RVV.
> Am I right?
>
> I am wondering whether we have any builtin function call vectorization
> that uses RVV instructions.
>
>
> Thanks.
>
>
> juzhe.zh...@rivai.ai
>
> From: Jeff Law
> Date: 2023-07-13 06:25
> To: 钟居哲; gcc-patches
> CC: kito.cheng; kito.cheng; rdapp.gcc
> Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV
> auto-vectorization
>
>
> On 7/12/23 16:17, 钟居哲 wrote:
> > Thanks Jeff.
> > Will commit after fixing the code formatting.
> >
> > I am going to support COND_FMA and reduction first (which I think
> > is higher priority),
> > then come back to support strided_load/store.
> Sure. One thing to note with strided loads: they can significantly
> help x264's sad/satd loops. So hopefully you're testing with those :-)
>
>
>
> jeff
>
RE: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
Committed, thanks Jeff.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches
Sent: Thursday, July 13, 2023 5:49 AM
To: 钟居哲; gcc-patches
Cc: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store; instead, I will support strided
> load/store in the vectorizer later.
>
> Ok for trunk?

Assuming this removes the strided loads/stores while we figure out the
best way to support them, OK for the trunk. The formatting is so
messed up that it's nearly impossible to read.

Note for the future: if you hit the message size limit, go ahead and
gzip the patch. That's better than forwarding from a failed message, as
the latter mucks up indentation so badly that the result is unreadable.

Jeff
Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
I noticed vectorizable_call in the loop vectorizer.
It vectorizes calls to functions such as fmax/fmin.
From my understanding, we don't have RVV instructions for fmax/fmin?

So for now, I don't need to support builtin function call vectorization
for RVV.
Am I right?

I am wondering whether we have any builtin function call vectorization
that uses RVV instructions.

Thanks.

juzhe.zh...@rivai.ai

From: Jeff Law
Date: 2023-07-13 06:25
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit after fixing the code formatting.
>
> I am going to support COND_FMA and reduction first (which I think
> is higher priority),
> then come back to support strided_load/store.
Sure. One thing to note with strided loads: they can significantly
help x264's sad/satd loops. So hopefully you're testing with those :-)

jeff
Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit after fixing the code formatting.
>
> I am going to support COND_FMA and reduction first (which I think
> is higher priority),
> then come back to support strided_load/store.

Sure. One thing to note with strided loads: they can significantly
help x264's sad/satd loops. So hopefully you're testing with those :-)

jeff
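For readers unfamiliar with the loops Jeff mentions, here is a simplified sum-of-absolute-differences kernel in C. It is a sketch in the general style of such kernels, not x264's code, and `sad_block` and its parameters are hypothetical names. The relevant feature is that consecutive rows of each block sit `stride1` and `stride2` bytes apart. For narrow blocks a single row holds only a few pixels, so a vectorizer that wants to fill a vector presumably has to pull elements from several rows at once, which is where strided accesses come in.

```c
#include <stddef.h>

/* Sum of absolute differences over a w x h block of 8-bit pixels.
   Rows of each block are separated by a per-image stride in bytes. */
unsigned sad_block(const unsigned char *pix1, ptrdiff_t stride1,
                   const unsigned char *pix2, ptrdiff_t stride2,
                   int w, int h)
{
    unsigned sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int d = pix1[x] - pix2[x];
            sum += (unsigned)(d < 0 ? -d : d);
        }
        /* Step to the next row; this is the strided part. */
        pix1 += stride1;
        pix2 += stride2;
    }
    return sum;
}
```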
Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
Thanks Jeff.
Will commit after fixing the code formatting.

I am going to support COND_FMA and reduction first (which I think is
higher priority), then come back to support strided_load/store.

Thanks.

juzhe.zh...@rivai.ai

From: Jeff Law
Sent: 2023-07-13 05:48
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store; instead, I will support strided
> load/store in the vectorizer later.
>
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the
best way to support them, OK for the trunk. The formatting is so
messed up that it's nearly impossible to read.

Note for the future: if you hit the message size limit, go ahead and
gzip the patch. That's better than forwarding from a failed message, as
the latter mucks up indentation so badly that the result is unreadable.

Jeff
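As a rough guide to the two follow-up items named here, the source-level loop shapes involved look something like the sketch below. `cond_fma` and `sum_reduce` are hypothetical function names used only for illustration, and the sketch makes no claim about how the RVV backend will implement either one.

```c
#include <stddef.h>

/* Conditional multiply-add: only elements whose condition is nonzero
   are updated; the others keep their previous value. */
void cond_fma(float *acc, const float *a, const float *b,
              const int *cond, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (cond[i])
            acc[i] = a[i] * b[i] + acc[i];
}

/* Sum reduction: all elements are folded into one scalar result. */
int sum_reduce(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```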
Re: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store; instead, I will support strided
> load/store in the vectorizer later.
>
> Ok for trunk?

Assuming this removes the strided loads/stores while we figure out the
best way to support them, OK for the trunk. The formatting is so
messed up that it's nearly impossible to read.

Note for the future: if you hit the message size limit, go ahead and
gzip the patch. That's better than forwarding from a failed message, as
the latter mucks up indentation so badly that the result is unreadable.

Jeff
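To clarify the distinction between what this patch keeps (gather load and scatter store) and what was deferred (strided load/store), here are the corresponding source-level access patterns in C. The function names are hypothetical and the code says nothing about the patch's internals; it only shows how the address of each element is formed in the three cases.

```c
#include <stddef.h>

/* Gather: each load address comes from an index array. */
void gather(int *dst, const int *src, const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}

/* Scatter: each store address comes from an index array. */
void scatter(int *dst, const int *src, const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[idx[i]] = src[i];
}

/* Strided load: addresses advance by a constant step, so no index
   array is needed. */
void strided_load(int *dst, const int *src, size_t stride, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i * stride];
}
```

A strided access can be seen as the special case of a gather whose indices are 0, stride, 2*stride, and so on.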