On Fri, Dec 30, 2022 at 3:21 AM Mayshao-oc <mayshao...@zhaoxin.com> wrote:
>
> >Ping. If there are any questions or concerns about the patch, please let me
> >know: I'm interested in continuing this cleanup at least for older AMD 
> >models.
> >
> Hi Alexander:
>         According to the SPEC CPU 2017 benchmark results, the patch looks good
> on lujiazui.

The patch is OK then.

Thanks,
Uros.

> BR
> Mayshao
> >I noticed I had an extra line in my Changelog:
> >
> >>      (lua_sseicvt_si): Ditto.
> >
> >It got there accidentally and I will drop it.
> >
> >Alexander
> >
> >On Fri, 9 Dec 2022, Alexander Monakov wrote:
> >
> >> Model the divider in Lujiazui processors as a separate automaton to
> >> significantly reduce the overall model size. This should also result
> >> in improved accuracy, as pipe 0 should be able to accept new
> >> instructions while the divider is occupied.
> >>
> >> It is unclear why integer divisions are modeled as if pipes 0-3 are
> >> all occupied. I've opted to keep a single-cycle reservation of all
> >> four pipes together, so GCC should continue trying to pack
> >> instructions around a division accordingly.
> >>
> >> Currently the top three symbols in insn-automata.o are:
> >>
> >> 106102 r lujiazui_core_check
> >> 106102 r lujiazui_core_transitions
> >> 196123 r lujiazui_core_min_issue_delay
> >>
> >> This patch shrinks all lujiazui tables to:
> >>
> >> 3 r lujiazui_decoder_min_issue_delay
> >> 20 r lujiazui_decoder_transitions
> >> 32 r lujiazui_agu_min_issue_delay
> >> 126 r lujiazui_agu_transitions
> >> 304 r lujiazui_div_base
> >> 352 r lujiazui_div_check
> >> 352 r lujiazui_div_transitions
> >> 1152 r lujiazui_core_min_issue_delay
> >> 1592 r lujiazui_agu_translate
> >> 1592 r lujiazui_core_translate
> >> 1592 r lujiazui_decoder_translate
> >> 1592 r lujiazui_div_translate
> >> 3952 r lujiazui_div_min_issue_delay
> >> 9216 r lujiazui_core_transitions
> >>
> >> This continues the work on reducing i386 insn-automata.o size started
> >> with similar fixes for division and multiplication instructions in
> >> znver.md [1][2]. I plan to submit corresponding fixes for
> >> b[td]ver[123].md as well.
> >>
> >> [1] https://inbox.sourceware.org/gcc-patches/23c795d6-403c-5927-e610-f0f1215f5...@ispras.ru/T/#m36e069d43d07d768d4842a779e26b4a0915cc543
> >> [2] https://inbox.sourceware.org/gcc-patches/20221101162637.14238-1-amonako...@ispras.ru/
> >>
> >> gcc/ChangeLog:
> >>
> >>      PR target/87832
> >>      * config/i386/lujiazui.md (lujiazui_div): New automaton.
> >>      (lua_div): New unit.
> >>      (lua_idiv_qi): Correct unit in the reservation.
> >>      (lua_idiv_qi_load): Ditto.
> >>      (lua_idiv_hi): Ditto.
> >>      (lua_idiv_hi_load): Ditto.
> >>      (lua_idiv_si): Ditto.
> >>      (lua_idiv_si_load): Ditto.
> >>      (lua_idiv_di): Ditto.
> >>      (lua_idiv_di_load): Ditto.
> >>      (lua_fdiv_SF): Ditto.
> >>      (lua_fdiv_SF_load): Ditto.
> >>      (lua_fdiv_DF): Ditto.
> >>      (lua_fdiv_DF_load): Ditto.
> >>      (lua_fdiv_XF): Ditto.
> >>      (lua_fdiv_XF_load): Ditto.
> >>      (lua_ssediv_SF): Ditto.
> >>      (lua_ssediv_load_SF): Ditto.
> >>      (lua_ssediv_V4SF): Ditto.
> >>      (lua_ssediv_load_V4SF): Ditto.
> >>      (lua_ssediv_V8SF): Ditto.
> >>      (lua_ssediv_load_V8SF): Ditto.
> >>      (lua_ssediv_SD): Ditto.
> >>      (lua_ssediv_load_SD): Ditto.
> >>      (lua_ssediv_V2DF): Ditto.
> >>      (lua_ssediv_load_V2DF): Ditto.
> >>      (lua_ssediv_V4DF): Ditto.
> >>      (lua_ssediv_load_V4DF): Ditto.
> >>      (lua_sseicvt_si): Ditto.
> >> ---
> >>  gcc/config/i386/lujiazui.md | 58 +++++++++++++++++++------------------
> >>  1 file changed, 30 insertions(+), 28 deletions(-)
> >>
> >> diff --git a/gcc/config/i386/lujiazui.md b/gcc/config/i386/lujiazui.md
> >> index 9046c09f2..58a230c70 100644
> >> --- a/gcc/config/i386/lujiazui.md
> >> +++ b/gcc/config/i386/lujiazui.md
> >> @@ -19,8 +19,8 @@
> >>
> >>  ;; Scheduling for ZHAOXIN lujiazui processor.
> >>
> >> -;; Modeling automatons for decoders, execution pipes and AGU pipes.
> >> -(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu")
> >> +;; Modeling automatons for decoders, execution pipes, AGU pipes, and divider.
> >> +(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu,lujiazui_div")
> >>
> >>  ;; The rules for the decoder are simple:
> >>  ;;  - an instruction with 1 uop can be decoded by any of the three
> >> @@ -55,6 +55,8 @@ (define_reservation "lua_decoder01" "lua_decoder0|lua_decoder1")
> >>  (define_cpu_unit "lua_p0,lua_p1,lua_p2,lua_p3" "lujiazui_core")
> >>  (define_cpu_unit "lua_p4,lua_p5" "lujiazui_agu")
> >>
> >> +(define_cpu_unit "lua_div" "lujiazui_div")
> >> +
> >>  (define_reservation "lua_p03" "lua_p0|lua_p3")
> >>  (define_reservation "lua_p12" "lua_p1|lua_p2")
> >>  (define_reservation "lua_p1p2" "lua_p1+lua_p2")
> >> @@ -229,56 +231,56 @@ (define_insn_reservation "lua_idiv_qi" 21
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "QI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p0p1p2p3*21")
> >> +                     "lua_decoder0,lua_p0p1p2p3,lua_div*21")
> >>
> >>  (define_insn_reservation "lua_idiv_qi_load" 25
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "QI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0p1p2p3*21")
> >> +                     "lua_decoder0,lua_p45,lua_p0p1p2p3,lua_div*21")
> >>
> >>  (define_insn_reservation "lua_idiv_hi" 22
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "HI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p0p1p2p3*22")
> >> +                     "lua_decoder0,lua_p0p1p2p3,lua_div*22")
> >>
> >>  (define_insn_reservation "lua_idiv_hi_load" 26
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "HI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0p1p2p3*22")
> >> +                     "lua_decoder0,lua_p45,lua_p0p1p2p3,lua_div*22")
> >>
> >>  (define_insn_reservation "lua_idiv_si" 20
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "SI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p0p1p2p3*20")
> >> +                     "lua_decoder0,lua_p0p1p2p3,lua_div*20")
> >>
> >>  (define_insn_reservation "lua_idiv_si_load" 24
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "SI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0p1p2p3*20")
> >> +                     "lua_decoder0,lua_p45,lua_p0p1p2p3,lua_div*20")
> >>
> >>  (define_insn_reservation "lua_idiv_di" 150
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "DI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p0p1p2p3*150")
> >> +                     "lua_decoder0,lua_p0p1p2p3,lua_div*150")
> >>
> >>  (define_insn_reservation "lua_idiv_di_load" 154
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "DI")
> >>                                      (eq_attr "type" "idiv"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0p1p2p3*150")
> >> +                     "lua_decoder0,lua_p45,lua_p0p1p2p3,lua_div*150")
> >>
> >>  ;; x87 floating point operations.
> >>
> >> @@ -406,42 +408,42 @@ (define_insn_reservation "lua_fdiv_SF" 15
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "SF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decodern,lua_p0*15")
> >> +                     "lua_decodern,lua_p0,lua_div*15")
> >>
> >>  (define_insn_reservation "lua_fdiv_SF_load" 19
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "SF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*15")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*15")
> >>
> >>  (define_insn_reservation "lua_fdiv_DF" 18
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "DF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decodern,lua_p0*18")
> >> +                     "lua_decodern,lua_p0,lua_div*18")
> >>
> >>  (define_insn_reservation "lua_fdiv_DF_load" 22
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "DF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*18")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*18")
> >>
> >>  (define_insn_reservation "lua_fdiv_XF" 22
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "XF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decoder0,lua_p0*22")
> >> +                     "lua_decoder0,lua_p0,lua_div*22")
> >>
> >>  (define_insn_reservation "lua_fdiv_XF_load" 26
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "XF")
> >>                                  (eq_attr "type" "fdiv,fpspc"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0*22")
> >> +                     "lua_decoder0,lua_p45,lua_p0,lua_div*22")
> >>
> >>  ;; MMX instructions.
> >>
> >> @@ -593,84 +595,84 @@ (define_insn_reservation "lua_ssediv_SF" 13
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decodern,lua_p0*13")
> >> +                     "lua_decodern,lua_p0,lua_div*13")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_SF" 17
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*13")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*13")
> >>
> >>  (define_insn_reservation "lua_ssediv_V4SF" 23
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "V4SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decodern,lua_p0*23")
> >> +                     "lua_decodern,lua_p0,lua_div*23")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_V4SF" 27
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "V4SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*23")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*23")
> >>
> >>  (define_insn_reservation "lua_ssediv_V8SF" 47
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "V8SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder0,lua_p0*47")
> >> +                     "lua_decoder0,lua_p0,lua_div*47")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_V8SF" 51
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "V8SF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder0,lua_p45,lua_p0*47")
> >> +                     "lua_decoder0,lua_p45,lua_p0,lua_div*47")
> >>
> >>  (define_insn_reservation "lua_ssediv_SD" 17
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decodern,lua_p0*17")
> >> +                     "lua_decodern,lua_p0,lua_div*17")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_SD" 21
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*17")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*17")
> >>
> >>  (define_insn_reservation "lua_ssediv_V2DF" 30
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "V2DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decodern,lua_p0*30")
> >> +                     "lua_decodern,lua_p0,lua_div*30")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_V2DF" 34
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "V2DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder01,lua_p45,lua_p0*30")
> >> +                     "lua_decoder01,lua_p45,lua_p0,lua_div*30")
> >>
> >>  (define_insn_reservation "lua_ssediv_V4DF" 56
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "none")
> >>                                 (and (eq_attr "mode" "V4DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder0,lua_p0*56")
> >> +                     "lua_decoder0,lua_p0,lua_div*56")
> >>
> >>  (define_insn_reservation "lua_ssediv_load_V4DF" 60
> >>                       (and (eq_attr "cpu" "lujiazui")
> >>                            (and (eq_attr "memory" "load")
> >>                                 (and (eq_attr "mode" "V4DF")
> >>                                      (eq_attr "type" "ssediv"))))
> >> -                     "lua_decoder0,lua_p4p5,lua_p0*56")
> >> +                     "lua_decoder0,lua_p4p5,lua_p0,lua_div*56")
> >>
> >>
> >>
> >
>
>
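
For readers wondering why moving the divider into its own automaton shrinks the tables so much, here is a rough toy model in Python. It is only a sketch under simplifying assumptions and does not reproduce what genautomata actually does: a state is just the number of cycles each unit stays busy, every reservation starts in the issue cycle, and the helper name count_states and the instruction sets are made up for illustration. The 56-cycle figure is borrowed from the lua_ssediv_V4DF reservation in the patch.

```python
from collections import deque


def count_states(n_units, insns):
    """Count reachable states of a toy pipeline automaton.

    A state is a tuple holding, per unit, how many cycles it remains busy.
    Each insn is a dict {unit_index: busy_cycles}; it can issue only when
    all of its units are free. This is a simplification, not GCC's algorithm.
    """
    start = (0,) * n_units
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        # Transition 1: advance the clock by one cycle.
        successors = [tuple(max(0, c - 1) for c in state)]
        # Transition 2: issue any instruction whose units are all free.
        for insn in insns:
            if all(state[u] == 0 for u in insn):
                nxt = list(state)
                for unit, cycles in insn.items():
                    nxt[unit] = cycles
                successors.append(tuple(nxt))
        for s in successors:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return len(seen)


# Hypothetical single-cycle ops, one per execution pipe p0..p3.
ALU = [{0: 1}, {1: 1}, {2: 1}, {3: 1}]
DIV_CYCLES = 56  # borrowed from the lua_ssediv_V4DF reservation

# Old style: the division holds pipe 0 inside the core automaton.
combined = count_states(4, ALU + [{0: DIV_CYCLES}])

# New style: in the core automaton the division looks like a one-cycle
# p0 op (already covered by ALU); the long reservation lives in a
# separate single-unit automaton.
core = count_states(4, ALU)
divider = count_states(1, [{0: DIV_CYCLES}])

print("combined automaton states:", combined)
print("split automata states:", core, "+", divider, "=", core + divider)
```

In this toy model the combined automaton has to track the divider countdown multiplied by every combination of the other pipes, while the split automata only add their state counts. The real table sizes quoted in the patch description depend on the full set of reservations and on how genautomata encodes them, so the numbers printed here are not comparable to those.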
