Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations
Since all the pipelines should be tuned to their cost model, they would be different anyway. If it would be simpler for now, I could separate the files out. I think I'm getting a bit confused. Is there a reason why we would want to exchange scheduler descriptions like the example you provided? I'm just thinking why a in-order model would want to use an ooo vector model and vice versa. Please correct me if I got the wrong idea. Yeah, the confusion is understandable as it's all in flow and several things I mentioned are artifacts of us not yet being stabilized (or actually having hard data to base our decisions on). Usually, once a uarch has settled there is no reason to exchange anything, just smaller tweaks might be done. I was more thinking of the near to mid-term future where larger changes like ripping out one thing and using another one altogether might still happen. Regarding out of order vs in order - for in-order pipelines we will always want to get latencies right. For out of order it is a balancing act (proper latencies often mean more spilling and the processor will reorder correctly anyway). So you're mostly right that the argument is not very strong as soon as we really know what to do and not to do. That makes sense to me! I also want to double check, isn't forcing all typed instructions to be part of a dfa pipeline in effect removing a situation where a tune model does not specify a "vector tune model"? At least from my testing with the assert statement, I get ICEs when trying to run the testsuite without the vector tune model even on gc. There are (at least) three parts of the "tune model": - vector cost model, specifying the cost of generic vector operations, not necessarily corresponding to an insn - insn cost, specifying the cost of an individual insn, usually close to latency but sometimes also "complexity" or other things. - insn latency and other hardware scheduler properties. We can leave out any of those which will make us fall back to default values. Even if we forced a scheduler description we could still have the default fallback for the other two and generate unfavorable code as a result. So if I'm understanding things correctly, the costs the Juzhe is working on in riscv.cc would be part of the vector cost model since they don't correspond to individual instructions and only target vector code. These costs would be the default fallback in the event of having no scheduler descriptions for the insn. The vector pipelines I'm working describes the insn latency categorized by the insn type. The scheduler will attempt to generate favorable code by this description but also consider the vector cost model. That is, it's possible for an insn with a latency of 1 and cost of 10 to be replaced by an insn with a latency of 2 and cost of 3. The insn cost is the cost of every insn which can be specified elsewhere. These override the values in the vector cost model for vector insns? Where are these specified? Then, all of these combined form a tune model like generic-ooo or rocket. However, this is of course not desirable and we will soon have a reasonable vector cost model that corresponds to the non-uarch specific properties of the vector spec. Once this is in place we will also want a somewhat generic vector scheduler description that goes hand in hand with that. Despite the name, the vector part of generic-ooo could be used for in-order vector uarchs and we might want to define a different description for out-of-order uarchs. That's a separate discussion but at least for that contingency it would make sense to easily interchange the scheduler description ;) I think I understand everything. I'm currently testing a run with a generic-vector-ooo file and I'm a little unsure how we would create a second generic-vector-in-order file such that each insn maps only to one reservation without using tune attributes but I guess that will be an implementation detail for later :) Edwin
Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations
> Since all the pipelines should be tuned to their cost model, they > would be different anyway. If it would be simpler for now, I could > separate the files out. > I think I'm getting a bit confused. Is there a reason why we would > want to exchange scheduler descriptions like the example you > provided? I'm just thinking why a in-order model would want to use an > ooo vector model and vice versa. Please correct me if I got the wrong > idea. Yeah, the confusion is understandable as it's all in flow and several things I mentioned are artifacts of us not yet being stabilized (or actually having hard data to base our decisions on). Usually, once a uarch has settled there is no reason to exchange anything, just smaller tweaks might be done. I was more thinking of the near to mid-term future where larger changes like ripping out one thing and using another one altogether might still happen. Regarding out of order vs in order - for in-order pipelines we will always want to get latencies right. For out of order it is a balancing act (proper latencies often mean more spilling and the processor will reorder correctly anyway). So you're mostly right that the argument is not very strong as soon as we really know what to do and not to do. > I also want to double check, isn't forcing all typed instructions to > be part of a dfa pipeline in effect removing a situation where a tune > model does not specify a "vector tune model"? At least from my > testing with the assert statement, I get ICEs when trying to run the > testsuite without the vector tune model even on gc. There are (at least) three parts of the "tune model": - vector cost model, specifying the cost of generic vector operations, not necessarily corresponding to an insn - insn cost, specifying the cost of an individual insn, usually close to latency but sometimes also "complexity" or other things. - insn latency and other hardware scheduler properties. We can leave out any of those which will make us fall back to default values. Even if we forced a scheduler description we could still have the default fallback for the other two and generate unfavorable code as a result. However, this is of course not desirable and we will soon have a reasonable vector cost model that corresponds to the non-uarch specific properties of the vector spec. Once this is in place we will also want a somewhat generic vector scheduler description that goes hand in hand with that. Despite the name, the vector part of generic-ooo could be used for in-order vector uarchs and we might want to define a different description for out-of-order uarchs. That's a separate discussion but at least for that contingency it would make sense to easily interchange the scheduler description ;) Regards Robin
Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations
Hi Robin, On 1/10/2024 8:00 AM, Robin Dapp wrote: Hi Edwin, This patch copies the vector reservations from generic-ooo.md and inserts them into generic.md and sifive.md. Creates new vector crypto related insn reservations. In principle, the changes look good to me but I wonder if we could split off the vector parts from generic-ooo into their own md file (generic-vector-ooo or so?) and include this in the others? Or is there a reason why you decided against this? I forgot we could include other md files into another file (I'll double check that there isn't anything fancy for including other pipelines), but I also thought that eventually all the tunes would have their own vector cost pipelines. Since all the pipelines should be tuned to their cost model, they would be different anyway. If it would be simpler for now, I could separate the files out. A recurring question in vector cost model discussions seems to be how to handle the situation when a tune model does not specify a "vector tune model". The problem exists for the scheduler descriptions and the normal vector cost model (and possibly insn_costs as well). Juzhe just implemented a fallback so we always use the "generic rvv" cost model. Your changes would be in the same vein and if we could split them off then we'd be able to easier exchange one scheduler descriptions for another one (say if one tune model wants to use an in-order vector model). I think I'm getting a bit confused. Is there a reason why we would want to exchange scheduler descriptions like the example you provided? I'm just thinking why a in-order model would want to use an ooo vector model and vice versa. Please correct me if I got the wrong idea. I also want to double check, isn't forcing all typed instructions to be part of a dfa pipeline in effect removing a situation where a tune model does not specify a "vector tune model"? At least from my testing with the assert statement, I get ICEs when trying to run the testsuite without the vector tune model even on gc. There is also still the question of whether to set all latencies to 1 for an OOO core but this question should be settled separately as soon as we have proper hardware benchmark results. If so we would probably rename generic-vector-ooo into generic-vector-in-order ;) Regards Robin I agree the latencies can be tweaked after we get those benchmarks :) Edwin
Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations
Hi Edwin, > This patch copies the vector reservations from generic-ooo.md and > inserts them into generic.md and sifive.md. Creates new vector crypto related > insn reservations. In principle, the changes look good to me but I wonder if we could split off the vector parts from generic-ooo into their own md file (generic-vector-ooo or so?) and include this in the others? Or is there a reason why you decided against this? A recurring question in vector cost model discussions seems to be how to handle the situation when a tune model does not specify a "vector tune model". The problem exists for the scheduler descriptions and the normal vector cost model (and possibly insn_costs as well). Juzhe just implemented a fallback so we always use the "generic rvv" cost model. Your changes would be in the same vein and if we could split them off then we'd be able to easier exchange one scheduler descriptions for another one (say if one tune model wants to use an in-order vector model). There is also still the question of whether to set all latencies to 1 for an OOO core but this question should be settled separately as soon as we have proper hardware benchmark results. If so we would probably rename generic-vector-ooo into generic-vector-in-order ;) Regards Robin
[PATCH V2 2/4][RFC] RISC-V: Add vector related reservations
This patch copies the vector reservations from generic-ooo.md and inserts them into generic.md and sifive.md. Creates new vector crypto related insn reservations. gcc/ChangeLog: * config/riscv/generic-ooo.md (generic_ooo_crypto_aes): create reservation (generic_ooo_crypto_sha): ditto (generic_ooo_crypto_sm): ditto (generic_ooo_vec_vesetvl): ditto (generic_ooo_vec_vsetvl): ditto * config/riscv/generic.md (pipe0): ditto (generic_vec_load): ditto (generic_vec_store): ditto (generic_vec_loadstore_seg): ditto (generic_vec_alu): ditto (generic_vec_fcmp): ditto (generic_vec_imul): ditto (generic_vec_fadd): ditto (generic_vec_fmul): ditto (generic_crypto): ditto (generic_crypto_aes): ditto (generic_crypto_sha): ditto (generic_crypto_sm): ditto (generic_perm): ditto (generic_vec_reduction): ditto (generic_vec_ordered_reduction): ditto (generic_vec_idiv): ditto (generic_vec_float_divsqrt): ditto (generic_vec_mask): ditto (generic_vec_vesetvl): ditto (generic_vec_setrm): ditto (generic_vec_readlen): ditto * config/riscv/sifive-7.md (sifive_7): ditto (sifive_7_vec_load): ditto (sifive_7_vec_store): ditto (sifive_7_vec_loadstore_seg): ditto (sifive_7_vec_alu): ditto (sifive_7_vec_fcmp): ditto (sifive_7_vec_imul): ditto (sifive_7_vec_fadd): ditto (sifive_7_vec_fmul): ditto (sifive_7_crypto): ditto (sifive_7_crypto_aes): ditto (sifive_7_crypto_sha): ditto (sifive_7_crypto_sm): ditto (sifive_7_perm): ditto (sifive_7_vec_reduction): ditto (sifive_7_vec_ordered_reduction): ditto (sifive_7_vec_idiv): ditto (sifive_7_vec_float_divsqrt): ditto (sifive_7_vec_mask): ditto (sifive_7_vec_vesetvl): ditto (sifive_7_vec_setrm): ditto (sifive_7_vec_readlen): ditto Signed-off-by: Edwin Lu Co-authored-by: Robin Dapp --- V2: - Remove unnecessary syntax changes in generic-ooo - Add new vector crypto reservations and types to pipelines --- gcc/config/riscv/generic-ooo.md | 27 +- gcc/config/riscv/generic.md | 143 +++ gcc/config/riscv/sifive-7.md| 144 3 files changed, 311 insertions(+), 3 deletions(-) diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md index ef8cb96daf4..fb5f34c0ef2 100644 --- a/gcc/config/riscv/generic-ooo.md +++ b/gcc/config/riscv/generic-ooo.md @@ -195,7 +195,8 @@ (define_insn_reservation "generic_ooo_popcount" 2 (define_insn_reservation "generic_ooo_vec_alu" 3 (and (eq_attr "tune" "generic_ooo") (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\ - vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector")) + vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector,\ + vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll")) "generic_ooo_vxu_issue,generic_ooo_vxu_alu") ;; Vector float comparison, conversion etc. @@ -209,7 +210,8 @@ (define_insn_reservation "generic_ooo_vec_fcmp" 3 ;; Vector integer multiplication. (define_insn_reservation "generic_ooo_vec_imul" 4 (and (eq_attr "tune" "generic_ooo") - (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul")) + (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul,vclmul,vclmulh,\ + vghsh,vgmul")) "generic_ooo_vxu_issue,generic_ooo_vxu_alu") ;; Vector float addition. @@ -230,6 +232,25 @@ (define_insn_reservation "generic_ooo_crypto" 4 (eq_attr "type" "crypto")) "generic_ooo_vxu_issue,generic_ooo_vxu_alu") +;; Vector crypto, AES +(define_insn_reservation "generic_ooo_crypto_aes" 4 + (and (eq_attr "tune" "generic_ooo") + (eq_attr "type" "vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz")) + "generic_ooo_vxu_issue,generic_ooo_vxu_alu") + +;; Vector crypto, sha +(define_insn_reservation "generic_ooo_crypto_sha" 4 + (and (eq_attr "tune" "generic_ooo") + (eq_attr "type" "vsha2ms,vsha2ch,vsha2cl")) + "generic_ooo_vxu_issue,generic_ooo_vxu_alu") + +;; Vector crypto, SM3/4 +(define_insn_reservation "generic_ooo_crypto_sm" 4 + (and (eq_attr "tune" "generic_ooo") + (eq_attr "type" "vsm4k,vsm4r,vsm3me,vsm3c")) + "generic_ooo_vxu_issue,generic_ooo_vxu_alu") + + ;; Vector permute. (define_insn_reservation "generic_ooo_perm" 3 (and (eq_attr "tune" "generic_ooo") @@ -271,7 +292,7 @@ (define_insn_reservation "generic_ooo_vec_mask" 2 "generic_ooo_vxu_issue,generic_ooo_vxu_alu") ;; Vector vsetvl. -(define_insn_reservation "generic_ooo_vec_vesetvl" 1 +(define_insn_reservation "generic_ooo_vec_vsetvl" 1 (and (eq_attr "tune" "generic_ooo") (eq_attr "type" "vsetvl,vsetvl_pre")) "generic_ooo_vxu_issue")