Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Edwin Lu




Since all the pipelines should be tuned to their cost model, they
would be different anyway. If it would be simpler for now, I could
separate the files out.
I think I'm getting a bit confused. Is there a reason why we would
want to exchange scheduler descriptions like the example you
provided? I'm just thinking why a in-order model would want to use an
ooo vector model and vice versa. Please correct me if I got the wrong
idea.


Yeah, the confusion is understandable as it's all in flow and several
things I mentioned are artifacts of us not yet being stabilized (or
actually having hard data to base our decisions on).

Usually, once a uarch has settled there is no reason to exchange
anything, just smaller tweaks might be done.  I was more thinking of
the near to mid-term future where larger changes like ripping out
one thing and using another one altogether might still happen.

Regarding out of order vs in order - for in-order pipelines we will
always want to get latencies right.  For out of order it is a balancing
act (proper latencies often mean more spilling and the processor will
reorder correctly anyway).

So you're mostly right that the argument is not very strong as soon
as we really know what to do and not to do.


That makes sense to me!

I also want to double check, isn't forcing all typed instructions to
be part of a dfa pipeline in effect removing a situation where a tune
model does not specify a "vector tune model"? At least from my
testing with the assert statement, I get ICEs when trying to run the
testsuite without the vector tune model even on gc.


There are (at least) three parts of the "tune model":
  - vector cost model, specifying the cost of generic vector operations,
not necessarily corresponding to an insn
  - insn cost, specifying the cost of an individual insn, usually close
to latency but sometimes also "complexity" or other things.
  - insn latency and other hardware scheduler properties.

We can leave out any of those which will make us fall back to default
values.  Even if we forced a scheduler description we could still have
the default fallback for the other two and generate unfavorable code
as a result.

So if I'm understanding things correctly, the costs the Juzhe is working 
on in riscv.cc would be part of the vector cost model since they don't 
correspond to individual instructions and only target vector code. These 
costs would be the default fallback in the event of having no scheduler 
descriptions for the insn.


The vector pipelines I'm working describes the insn latency categorized 
by the insn type. The scheduler will attempt to generate favorable code 
by this description but also consider the vector cost model. That is, 
it's possible for an insn with a latency of 1 and cost of 10 to be 
replaced by an insn with a latency of 2 and cost of 3.


The insn cost is the cost of every insn which can be specified 
elsewhere. These override the values in the vector cost model for vector 
insns? Where are these specified?


Then, all of these combined form a tune model like generic-ooo or rocket.

However, this is of course not desirable and we will soon have a
reasonable vector cost model that corresponds to the non-uarch
specific properties of the vector spec.  Once this is in place
we will also want a somewhat generic vector scheduler description
that goes hand in hand with that.  Despite the name, the vector
part of generic-ooo could be used for in-order vector uarchs and
we might want to define a different description for out-of-order
uarchs.  That's a separate discussion but at least for that
contingency it would make sense to easily interchange the scheduler
description ;)

I think I understand everything. I'm currently testing a run with a 
generic-vector-ooo file and I'm a little unsure how we would create a 
second generic-vector-in-order file such that each insn maps only to one 
reservation without using tune attributes but I guess that will be an 
implementation detail for later :)


Edwin


Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
> Since all the pipelines should be tuned to their cost model, they
> would be different anyway. If it would be simpler for now, I could
> separate the files out.
> I think I'm getting a bit confused. Is there a reason why we would
> want to exchange scheduler descriptions like the example you
> provided? I'm just thinking why a in-order model would want to use an
> ooo vector model and vice versa. Please correct me if I got the wrong
> idea.

Yeah, the confusion is understandable as it's all in flow and several
things I mentioned are artifacts of us not yet being stabilized (or
actually having hard data to base our decisions on).

Usually, once a uarch has settled there is no reason to exchange
anything, just smaller tweaks might be done.  I was more thinking of
the near to mid-term future where larger changes like ripping out
one thing and using another one altogether might still happen.

Regarding out of order vs in order - for in-order pipelines we will
always want to get latencies right.  For out of order it is a balancing
act (proper latencies often mean more spilling and the processor will
reorder correctly anyway).

So you're mostly right that the argument is not very strong as soon
as we really know what to do and not to do.

> I also want to double check, isn't forcing all typed instructions to
> be part of a dfa pipeline in effect removing a situation where a tune
> model does not specify a "vector tune model"? At least from my
> testing with the assert statement, I get ICEs when trying to run the
> testsuite without the vector tune model even on gc.

There are (at least) three parts of the "tune model":
 - vector cost model, specifying the cost of generic vector operations,
   not necessarily corresponding to an insn
 - insn cost, specifying the cost of an individual insn, usually close
   to latency but sometimes also "complexity" or other things.
 - insn latency and other hardware scheduler properties.

We can leave out any of those which will make us fall back to default
values.  Even if we forced a scheduler description we could still have
the default fallback for the other two and generate unfavorable code
as a result.

However, this is of course not desirable and we will soon have a
reasonable vector cost model that corresponds to the non-uarch
specific properties of the vector spec.  Once this is in place
we will also want a somewhat generic vector scheduler description
that goes hand in hand with that.  Despite the name, the vector
part of generic-ooo could be used for in-order vector uarchs and
we might want to define a different description for out-of-order
uarchs.  That's a separate discussion but at least for that
contingency it would make sense to easily interchange the scheduler
description ;)

Regards
 Robin


Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Edwin Lu

Hi Robin,
On 1/10/2024 8:00 AM, Robin Dapp wrote:

Hi Edwin,


This patch copies the vector reservations from generic-ooo.md and
inserts them into generic.md and sifive.md. Creates new vector crypto related
insn reservations.


In principle, the changes look good to me but I wonder if we could
split off the vector parts from generic-ooo into their own md file
(generic-vector-ooo or so?) and include this in the others?  Or is
there a reason why you decided against this?

I forgot we could include other md files into another file (I'll double 
check that there isn't anything fancy for including other pipelines), 
but I also thought that eventually all the tunes would have their own 
vector cost pipelines. Since all the pipelines should be tuned to their 
cost model, they would be different anyway. If it would be simpler for 
now, I could separate the files out.



A recurring question in vector cost model discussions seems to be how
to handle the situation when a tune model does not specify a "vector tune
model".  The problem exists for the scheduler descriptions and the
normal vector cost model (and possibly insn_costs as well).

Juzhe just implemented a fallback so we always use the "generic rvv" cost
model.  Your changes would be in the same vein and if we could split
them off then we'd be able to easier exchange one scheduler descriptions
for another one (say if one tune model wants to use an in-order vector
model).

I think I'm getting a bit confused. Is there a reason why we would want 
to exchange scheduler descriptions like the example you provided? I'm 
just thinking why a in-order model would want to use an ooo vector model 
and vice versa. Please correct me if I got the wrong idea.


I also want to double check, isn't forcing all typed instructions to be 
part of a dfa pipeline in effect removing a situation where a tune model 
does not specify a "vector tune model"? At least from my testing with 
the assert statement, I get ICEs when trying to run the testsuite 
without the vector tune model even on gc.



There is also still the question of whether to set all latencies
to 1 for an OOO core but this question should be settled separately
as soon as we have proper hardware benchmark results.  If so we
would probably rename generic-vector-ooo into
generic-vector-in-order ;)

Regards
  Robin



I agree the latencies can be tweaked after we get those benchmarks :)

Edwin



Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
Hi Edwin,

> This patch copies the vector reservations from generic-ooo.md and
> inserts them into generic.md and sifive.md. Creates new vector crypto related
> insn reservations.

In principle, the changes look good to me but I wonder if we could
split off the vector parts from generic-ooo into their own md file
(generic-vector-ooo or so?) and include this in the others?  Or is
there a reason why you decided against this?

A recurring question in vector cost model discussions seems to be how
to handle the situation when a tune model does not specify a "vector tune
model".  The problem exists for the scheduler descriptions and the
normal vector cost model (and possibly insn_costs as well).

Juzhe just implemented a fallback so we always use the "generic rvv" cost
model.  Your changes would be in the same vein and if we could split
them off then we'd be able to easier exchange one scheduler descriptions
for another one (say if one tune model wants to use an in-order vector
model).

There is also still the question of whether to set all latencies
to 1 for an OOO core but this question should be settled separately
as soon as we have proper hardware benchmark results.  If so we
would probably rename generic-vector-ooo into
generic-vector-in-order ;)

Regards
 Robin



[PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-09 Thread Edwin Lu
This patch copies the vector reservations from generic-ooo.md and
inserts them into generic.md and sifive.md. Creates new vector crypto related
insn reservations.

gcc/ChangeLog:

* config/riscv/generic-ooo.md (generic_ooo_crypto_aes): create 
reservation
(generic_ooo_crypto_sha): ditto
(generic_ooo_crypto_sm): ditto
(generic_ooo_vec_vesetvl): ditto
(generic_ooo_vec_vsetvl): ditto
* config/riscv/generic.md (pipe0): ditto
(generic_vec_load): ditto
(generic_vec_store): ditto
(generic_vec_loadstore_seg): ditto
(generic_vec_alu): ditto
(generic_vec_fcmp): ditto
(generic_vec_imul): ditto
(generic_vec_fadd): ditto
(generic_vec_fmul): ditto
(generic_crypto): ditto
(generic_crypto_aes): ditto
(generic_crypto_sha): ditto
(generic_crypto_sm): ditto
(generic_perm): ditto
(generic_vec_reduction): ditto
(generic_vec_ordered_reduction): ditto
(generic_vec_idiv): ditto
(generic_vec_float_divsqrt): ditto
(generic_vec_mask): ditto
(generic_vec_vesetvl): ditto
(generic_vec_setrm): ditto
(generic_vec_readlen): ditto
* config/riscv/sifive-7.md (sifive_7): ditto
(sifive_7_vec_load): ditto
(sifive_7_vec_store): ditto
(sifive_7_vec_loadstore_seg): ditto
(sifive_7_vec_alu): ditto
(sifive_7_vec_fcmp): ditto
(sifive_7_vec_imul): ditto
(sifive_7_vec_fadd): ditto
(sifive_7_vec_fmul): ditto
(sifive_7_crypto): ditto
(sifive_7_crypto_aes): ditto
(sifive_7_crypto_sha): ditto
(sifive_7_crypto_sm): ditto
(sifive_7_perm): ditto
(sifive_7_vec_reduction): ditto
(sifive_7_vec_ordered_reduction): ditto
(sifive_7_vec_idiv): ditto
(sifive_7_vec_float_divsqrt): ditto
(sifive_7_vec_mask): ditto
(sifive_7_vec_vesetvl): ditto
(sifive_7_vec_setrm): ditto
(sifive_7_vec_readlen): ditto

Signed-off-by: Edwin Lu 
Co-authored-by: Robin Dapp 
---
V2:
- Remove unnecessary syntax changes in generic-ooo
- Add new vector crypto reservations and types to
  pipelines
---
 gcc/config/riscv/generic-ooo.md |  27 +-
 gcc/config/riscv/generic.md | 143 +++
 gcc/config/riscv/sifive-7.md| 144 
 3 files changed, 311 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index ef8cb96daf4..fb5f34c0ef2 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -195,7 +195,8 @@ (define_insn_reservation "generic_ooo_popcount" 2
 (define_insn_reservation "generic_ooo_vec_alu" 3
   (and (eq_attr "tune" "generic_ooo")
(eq_attr "type" 
"vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
-   vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector"))
+   vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector,\
+   vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float comparison, conversion etc.
@@ -209,7 +210,8 @@ (define_insn_reservation "generic_ooo_vec_fcmp" 3
 ;; Vector integer multiplication.
 (define_insn_reservation "generic_ooo_vec_imul" 4
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul"))
+   (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul,vclmul,vclmulh,\
+   vghsh,vgmul"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float addition.
@@ -230,6 +232,25 @@ (define_insn_reservation "generic_ooo_crypto" 4
(eq_attr "type" "crypto"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
+;; Vector crypto, AES
+(define_insn_reservation "generic_ooo_crypto_aes" 4
+  (and (eq_attr "tune" "generic_ooo")
+   (eq_attr "type" "vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+;; Vector crypto, sha
+(define_insn_reservation "generic_ooo_crypto_sha" 4
+  (and (eq_attr "tune" "generic_ooo")
+   (eq_attr "type" "vsha2ms,vsha2ch,vsha2cl"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+;; Vector crypto, SM3/4
+(define_insn_reservation "generic_ooo_crypto_sm" 4
+  (and (eq_attr "tune" "generic_ooo")
+   (eq_attr "type" "vsm4k,vsm4r,vsm3me,vsm3c"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+
 ;; Vector permute.
 (define_insn_reservation "generic_ooo_perm" 3
   (and (eq_attr "tune" "generic_ooo")
@@ -271,7 +292,7 @@ (define_insn_reservation "generic_ooo_vec_mask" 2
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector vsetvl.
-(define_insn_reservation "generic_ooo_vec_vesetvl" 1
+(define_insn_reservation "generic_ooo_vec_vsetvl" 1
   (and (eq_attr "tune" "generic_ooo")
(eq_attr "type" "vsetvl,vsetvl_pre"))
   "generic_ooo_vxu_issue")