On Fri, 2020-12-04 at 13:19 -0600, acsawdey--- via Gcc-patches wrote:
> From: Aaron Sawdey <acsaw...@linux.ibm.com>
> 

Assorted comments inline below.
Thanks,
-Will


> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that that power10 can fuse and execute. These particular ones have the

Drop the duplicated 'that' (or perhaps s/that that/that the/).
Also s/ones/fusion pairs/ ?

> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.

Perhaps say what the splitter actually does here, i.e.
"... the splitter will <do something...>" rather than
"... the splitter can get used".

> 
> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.

ok

> 
> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have make

s/make/made/

> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.
> 
> If bootstrap/regtest is clean, ok for trunk?
> 
> Thanks!
> 
>    Aaron
> 
> gcc/ChangeLog:
> 
>       * config/rs6000/genfusion.pl: New file, script to generate
>       define_insn_and_split patterns so combine can arrange fused
>       instructions next to each other.

Suggest: "New script to generate ..."

>       * config/rs6000/fusion.md: New file, generated fused instruction
>       patterns for combine.

>       * config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
>       (non_update_memory_operand): New predicate.
ok
>       * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
>       OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
>       POWERPC_MASKS.
>       * config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
>       prototype.

All usages of address_is_non_pfx_d_or_x() appear to be negated, i.e.
        +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0),
        DImode, NON_PREFIXED_DS))"
Fully understanding that naming is hard, I wonder whether the function
could be renamed and its sense inverted to avoid the double negative,
something like address_load_mode_requires_prefix (...)?
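
To make the naming suggestion concrete, here is a minimal, self-contained C
sketch. It is not the patch's code: the enums are cut-down stand-ins for the
GCC ones, INSN_FORM_OTHER is a hypothetical catch-all for every form not
listed, and address_breaks_ld_cmpi_fusion is a made-up name. The accept list
is copied from the switch in address_is_non_pfx_d_or_x as posted.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-ins for the GCC enums (assumption: the real ones have
   more members; INSN_FORM_OTHER is a hypothetical catch-all).  */
enum insn_form
{
  INSN_FORM_BASE_REG,
  INSN_FORM_D,
  INSN_FORM_DS,
  INSN_FORM_X,
  INSN_FORM_OTHER
};

enum non_prefixed_form { NON_PREFIXED_D, NON_PREFIXED_DS };

/* Same accept list as the posted function, but taking an already
   classified form instead of an address rtx.  */
static bool
form_is_non_pfx_d_or_x (enum insn_form form, enum non_prefixed_form fmt)
{
  assert (fmt == NON_PREFIXED_D || fmt == NON_PREFIXED_DS);
  switch (form)
    {
    case INSN_FORM_X:
    case INSN_FORM_DS:
    case INSN_FORM_BASE_REG:
      return true;
    case INSN_FORM_D:
      /* A plain D-form offset is only acceptable for D-form loads.  */
      return fmt == NON_PREFIXED_D;
    default:
      return false;
    }
}

/* Positively named wrapper (hypothetical name) so that callers do not
   need a leading '!'.  */
static bool
address_breaks_ld_cmpi_fusion (enum insn_form form,
                               enum non_prefixed_form fmt)
{
  return !form_is_non_pfx_d_or_x (form, fmt);
}
```

With something along these lines the split condition could read
"cc_reg_not_cr0_operand (...) || address_breaks_ld_cmpi_fusion (...)",
which keeps both halves of the test in the same sense.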


>       * config/rs6000/rs6000.c (rs6000_option_override_internal):
>       automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>       if target is power10.  (rs600_opt_masks): Allow -mpower10-fusion
>       in function attributes.  (address_is_non_pfx_d_or_x): New function.

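ok, apart from two nits: s/rs600_opt_masks/rs6000_opt_masks/, and the
rs6000_option_override_internal entry should start with a capital
("Automatically set ...").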

>       * config/rs6000/rs6000.h: Add MASK_P10_FUSION.
>       * config/rs6000/rs6000.md: Include fusion.md.
>       * config/rs6000/rs6000.opt: Add -mpower10-fusion
>       and -mpower10-fusion-ld-cmpi.

ok

>       * config/rs6000/t-rs6000: Add dependencies involving fusion.md.

ok


> ---
>  gcc/config/rs6000/fusion.md       | 357 ++++++++++++++++++++++++++++++
>  gcc/config/rs6000/genfusion.pl    | 144 ++++++++++++
>  gcc/config/rs6000/predicates.md   |  14 ++
>  gcc/config/rs6000/rs6000-cpus.def |   6 +-
>  gcc/config/rs6000/rs6000-protos.h |   2 +
>  gcc/config/rs6000/rs6000.c        |  51 +++++
>  gcc/config/rs6000/rs6000.h        |   1 +
>  gcc/config/rs6000/rs6000.md       |   1 +
>  gcc/config/rs6000/rs6000.opt      |   8 +
>  gcc/config/rs6000/t-rs6000        |   6 +-
>  10 files changed, 588 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/rs6000/fusion.md
>  create mode 100755 gcc/config/rs6000/genfusion.pl
> 
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> new file mode 100644
> index 00000000000..a4d3a6ae7f3
> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CCUNS extend is 
> none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CCUNS extend is none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CCUNS extend is 
> none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CCUNS extend is none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI 
> (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI 
> (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CCUNS extend is 
> zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI 
> (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI 
> (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is clobber compare mode is CCUNS extend is 
> zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR 
> (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, 
> NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +                 (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +


Reviewed with a mix of in-depth reading and skimming; nothing jumped
out at me here.


> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> new file mode 100755
> index 00000000000..494537c9439
> --- /dev/null
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -0,0 +1,144 @@
> +#!/usr/bin/perl -w
> +# Generate fusion.md 
> +# Copyright (C) 2020 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +#
> +# GCC is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +my $copyright =  <<'EOF';


> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;


Embedding the copyright date in an autogenerated file catches my eye.
I don't see this in generated files such as
$GCC_BUILD/gcc/insn-recog.c, so I'm not sure it's necessary in this
case (but it probably doesn't hurt).


> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +EOF
> +
> +print $copyright;
> +
> +sub mode_to_ldst_char
> +{
> +    my ($mode) = @_;
> +    if ($mode eq 'DI') { return 'd'; }
> +    if ($mode eq 'SI') { return 'w'; }
> +    if ($mode eq 'HI') { return 'h'; }
> +    if ($mode eq 'QI') { return 'b'; }
> +    return '?';
> +}
> +
> +sub gen_ld_cmpi_p10
> +{
> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {
> +      $ldst = mode_to_ldst_char($lmode);
> +      $clobbermode = $lmode;
> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.
> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
> +     # EXTDI does not exist, and we cannot directly produce HI/QI results.
> +     next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
> +     # Don't allow EXTQI because that would allow HI result which we can't do.
> +     if ( $result eq "EXTQI" ) { $result = "GPR"; }
> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
> +       $np = "NON_PREFIXED_D";
> +       if ( $ccmode eq 'CC' ) {
> +           next CCMODE if $lmode eq 'QI';
> +           if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +               # ld and lwa are both DS-FORM.
> +               $np = "NON_PREFIXED_DS";
> +           }
> +           $cmpl = "";
> +           $echr = "a";
> +           $constpred = "const_m1_to_1_operand";
> +       } else {
> +           if ( $lmode eq 'DI' ) {
> +               # ld is DS-form, but lwz is not.
> +               $np = "NON_PREFIXED_DS";
> +           }
> +           $cmpl = "l";
> +           $echr = "z";
> +           $constpred = "const_0_to_1_operand";
> +       }
> +       if ($lmode eq 'DI') { $echr = ""; }
> +       if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
> +           # We always need extension if result > lmode.
> +           if ( $ccmode eq 'CC' ) {
> +               $extend = "sign";
> +           } else {
> +               $extend = "zero";
> +           }
> +       } else {
> +           # Result of SI/DI does not need sign extension.
> +           $extend = "none";
> +       }
> +       print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
> +       print ";; load mode is $lmode result mode is $result compare mode is 
> $ccmode extend is $extend\n";
> +
> +       print "(define_insn_and_split 
> \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
> +       print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" 
> \"=x\")\n";
> +       print "        (compare:${ccmode} (match_operand:${lmode} 1 
> \"non_update_memory_operand\" \"m\")\n";
> +       print "                 (match_operand:${lmode} 3 \"${constpred}\" 
> \"n\")))\n";
> +       if ($result eq 'clobber') {
> +           print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
> +       } elsif ($result eq $lmode) {
> +           print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" 
> \"=r\") (match_dup 1))]\n";
> +       } else {
> +           print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" 
> \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
> +       }
> +       print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
> +       print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";
> +       print "  \"&& reload_completed\n";
> +       print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
> +       print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), 
> ${lmode}mode, ${np}))\"\n";
> +       if ($extend eq "none") {
> +           print "  [(set (match_dup 0) (match_dup 1))\n";
> +       } else {
> +           $resultmode = $result;
> +           if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
> +           print "  [(set (match_dup 0) (${extend}_extend:${resultmode} 
> (match_dup 1)))\n";
> +       }
> +       print "   (set (match_dup 2)\n";
> +       print "        (compare:${ccmode} (match_dup 0)\n";
> +       print "                   (match_dup 3)))]\n";
> +       print "  \"\"\n";
> +       print "  [(set_attr \"type\" \"load\")\n";
> +       print "   (set_attr \"cost\" \"8\")\n";
> +       print "   (set_attr \"length\" \"8\")])\n";
> +       print "\n";
> +      }
> +    }
> +  }
> +}

Looked this over and it seems OK.  Presumably testing will reveal any
issues. :-)
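
As a cross-check of the loop structure, the sketch below re-implements only
the name-building part of gen_ld_cmpi_p10 in C. It is my own transcription of
the Perl above, not part of the patch; list_ld_cmpi_patterns and the two
size constants are hypothetical names. If the transcription is right, it
should yield 16 names, in the same order as the 16 define_insn_and_split
patterns in the quoted fusion.md.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATTERNS 32
#define NAME_LEN 64

/* Fill NAMES with the pattern names in generation order and return how
   many were produced.  Mirrors the LMODE/RESULT/CCMODE loops.  */
static int
list_ld_cmpi_patterns (char names[MAX_PATTERNS][NAME_LEN])
{
  static const char *const lmodes[] = { "DI", "SI", "HI", "QI" };
  static const char ldst[] = { 'd', 'w', 'h', 'b' };
  int count = 0;

  for (int m = 0; m < 4; m++)
    {
      int is_di = (m == 0), is_qi = (m == 3);
      int small = (m >= 2);	/* HI/QI: the clobber mode is GPR.  */

      for (int r = 0; r < 3; r++)	/* clobber, lmode, EXT<lmode> */
	{
	  char result[16];
	  if (r == 0)
	    strcpy (result, "clobber");
	  else if (r == 1)
	    {
	      if (small)
		continue;	/* No direct HI/QI results.  */
	      strcpy (result, lmodes[m]);
	    }
	  else
	    {
	      if (is_di)
		continue;	/* EXTDI does not exist.  */
	      if (is_qi)
		strcpy (result, "GPR");	/* EXTQI is replaced by GPR.  */
	      else
		snprintf (result, sizeof result, "EXT%s", lmodes[m]);
	    }

	  for (int u = 0; u < 2; u++)	/* 0 = CC, 1 = CCUNS */
	    {
	      if (!u && is_qi)
		continue;	/* No signed byte load.  */
	      const char *echr = is_di ? "" : (u ? "z" : "a");
	      int widened = (r == 2) || small;
	      const char *extend = widened ? (u ? "zero" : "sign") : "none";

	      assert (count < MAX_PATTERNS);
	      snprintf (names[count++], NAME_LEN,
			"*l%c%s_cmp%sdi_cr0_%s_%s_%s_%s",
			ldst[m], echr, u ? "l" : "", lmodes[m], result,
			u ? "CCUNS" : "CC", extend);
	    }
	}
    }
  return count;
}
```

Printing the returned names and diffing them against
"grep define_insn_and_split fusion.md" would be a cheap sanity check whenever
the script grows new cases.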


> +
> +
> +gen_ld_cmpi_p10();
> +
> +exit(0);
> +
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 9ad5ae67302..78de8102f44 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
>    (and (match_code "const_int")
>         (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> 
> +;; Match op = -1, op = 0, or op = 1.
> +(define_predicate "const_m1_to_1_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
> +

What does the _m1 indicate here?  (I can't tell from pre-existing usage
whether it stands for negative, match, mode, or something else.)
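
Going by the comment and the IN_RANGE test in the hunk itself, my best guess
is "minus 1". A minimal C sketch of the two immediate ranges, as I read them
from the predicates and the cover letter, is below. The function names are
hypothetical and the plain comparisons are stand-ins for
IN_RANGE (INTVAL (op), lo, hi).

```c
#include <assert.h>
#include <stdbool.h>

/* Signed compare (cmpdi): immediates -1, 0 and 1, as in
   const_m1_to_1_operand.  */
static bool
imm_ok_for_signed_fusion (long value)
{
  return value >= -1 && value <= 1;
}

/* Unsigned compare (cmpldi): immediates 0 and 1, as in
   const_0_to_1_operand.  */
static bool
imm_ok_for_unsigned_fusion (long value)
{
  return value >= 0 && value <= 1;
}

/* Small usage example: -1 is the only value on which the two differ.  */
static void
imm_range_demo (void)
{
  assert (imm_ok_for_signed_fusion (-1));
  assert (!imm_ok_for_unsigned_fusion (-1));
  assert (!imm_ok_for_signed_fusion (2));
}
```

If that reading is right, a name such as const_minus1_to_1_operand might be
easier on the next reader, but that is only a suggestion.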


>  ;; Match op = 0..3.
>  (define_predicate "const_0_to_3_operand"
>    (and (match_code "const_int")
> @@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem"
>                   || GET_CODE (XEXP (op, 0)) == PRE_DEC
>                   || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
> 
> +;; Anything that matches memory_operand but does not update the address.
> +(define_predicate "non_update_memory_operand"
> +  (match_code "mem")
> +{
> +  if (update_address_mem (op, mode))
> +    return 0;
> +  return memory_operand (op, mode);
> +})
> +
>  ;; Return 1 if the operand is a MEM with an indexed-form address.
>  (define_special_predicate "indexed_address_mem"
>    (match_test "(MEM_P (op)
> diff --git a/gcc/config/rs6000/rs6000-cpus.def 
> b/gcc/config/rs6000/rs6000-cpus.def
> index 8d2c1ffd6cf..3e65289d8df 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -82,7 +82,9 @@
> 
>  #define ISA_3_1_MASKS_SERVER (ISA_3_0_MASKS_SERVER                   \
>                                | OPTION_MASK_POWER10                  \
> -                              | OTHER_POWER10_MASKS)
> +                              | OTHER_POWER10_MASKS                  \
> +                              | OPTION_MASK_P10_FUSION               \
> +                              | OPTION_MASK_P10_FUSION_LD_CMPI)
> 
>  /* Flags that need to be turned off if -mno-power9-vector.  */
>  #define OTHER_P9_VECTOR_MASKS        (OPTION_MASK_FLOAT128_HW                
> \
> @@ -129,6 +131,8 @@
>                                | OPTION_MASK_FLOAT128_KEYWORD         \
>                                | OPTION_MASK_FPRND                    \
>                                | OPTION_MASK_POWER10                  \
> +                              | OPTION_MASK_P10_FUSION               \
> +                              | OPTION_MASK_P10_FUSION_LD_CMPI       \
>                                | OPTION_MASK_HTM                      \
>                                | OPTION_MASK_ISEL                     \
>                                | OPTION_MASK_MFCRF                    \

ok

> diff --git a/gcc/config/rs6000/rs6000-protos.h 
> b/gcc/config/rs6000/rs6000-protos.h
> index 3c4682b0e26..cd644083558 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -191,6 +191,8 @@ enum non_prefixed_form {
> 
>  extern enum insn_form address_to_insn_form (rtx, machine_mode,
>                                           enum non_prefixed_form);
> +extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +                                    enum non_prefixed_form 
> non_prefix_format);
>  extern bool prefixed_load_p (rtx_insn *);
>  extern bool prefixed_store_p (rtx_insn *);
>  extern bool prefixed_paddi_p (rtx_insn *);
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 517467ebc63..759551d07ec 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>    if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>      rs6000_isa_flags |= OPTION_MASK_MMA;
> 
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) 
> == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & 
> OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
> +
>    /* Turn off vector pair/mma options on non-power10 systems.  */
>    else if (!TARGET_POWER10 && TARGET_MMA)
>      {
> @@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>    { "power9-minmax",         OPTION_MASK_P9_MINMAX,          false, true  },
>    { "power9-misc",           OPTION_MASK_P9_MISC,            false, true  },
>    { "power9-vector",         OPTION_MASK_P9_VECTOR,          false, true  },
> +  { "power10-fusion",                OPTION_MASK_P10_FUSION,         false, true  },
>    { "powerpc-gfxopt",                OPTION_MASK_PPC_GFXOPT,         false, true  },
>    { "powerpc-gpopt",         OPTION_MASK_PPC_GPOPT,          false, true  },
>    { "prefixed",                      OPTION_MASK_PREFIXED,           false, true  },
> @@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr,
>    return INSN_FORM_BAD;
>  }
> 


ok
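
For reference, the two new tests in rs6000_option_override_internal follow
the usual "default on for power10 unless the user said otherwise" idiom.
A standalone sketch of that idiom is below. The SKETCH_* masks and the
function name are stand-ins made up for illustration, not the real
OPTION_MASK_* definitions or any GCC interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in bits; the real OPTION_MASK_* values come from
   the generated options headers.  */
#define SKETCH_MASK_P10_FUSION         (UINT64_C (1) << 0)
#define SKETCH_MASK_P10_FUSION_LD_CMPI (UINT64_C (1) << 1)

/* Return FLAGS with each bit of DEFAULTS turned on when the target is
   power10 and the user did not set that bit explicitly.  Bits the user
   did touch (on or off) are left exactly as they are in FLAGS.  */
static uint64_t
sketch_apply_p10_defaults (bool target_power10, uint64_t flags,
                           uint64_t explicit_flags, uint64_t defaults)
{
  if (target_power10)
    flags |= defaults & ~explicit_flags;
  return flags;
}
```

So an explicit -mno-power10-fusion (bit clear in the flags, bit set in the
explicit mask) stays off, while the ld-cmpi bit still defaults on.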

> +/* Given address rtx ADDR for a load of MODE, is this legitimate for a
> +   non-prefixed D-form or X-form instruction?  NON_PREFIXED_FORMAT is
> +   given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want
> +   a D-form or DS-form instruction.  X-form and base_reg are always
> +   allowed.  */
> +bool
> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +                        enum non_prefixed_form non_prefixed_format)
> +{
> +  enum insn_form result_form;
> +
> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
> +
> +  switch (non_prefixed_format)
> +    {
> +    case NON_PREFIXED_D:
> +      switch (result_form)
> +     {
> +     case INSN_FORM_X:
> +     case INSN_FORM_D:
> +     case INSN_FORM_DS:
> +     case INSN_FORM_BASE_REG:
> +       return true;
> +     default:
> +       break;
> +     }
> +      break;
> +    case NON_PREFIXED_DS:
> +      switch (result_form)
> +     {
> +     case INSN_FORM_X:
> +     case INSN_FORM_DS:
> +     case INSN_FORM_BASE_REG:
> +       return true;
> +     default:
> +       break;
> +     }
> +      break;
> +    default:
> +      break;
> +    }
> +  return false;
> +}
> +
>  /* Helper function to see if we're potentially looking at lfs/stfs.
>     - PARALLEL containing a SET and a CLOBBER
>     - stfs:


ok
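
For anyone skimming, the nested switch in address_is_non_pfx_d_or_x
reduces to a small accept/reject table. Here is a standalone sketch of
just that table. The sketch_* enums are simplified stand-ins, not the
real insn_form / non_prefixed_form definitions, and the call to
address_to_insn_form is left out.

```c
#include <stdbool.h>

/* Simplified stand-ins for the forms the patch distinguishes.  */
enum sketch_insn_form
{
  SKETCH_FORM_BAD,
  SKETCH_FORM_BASE_REG,
  SKETCH_FORM_D,
  SKETCH_FORM_DS,
  SKETCH_FORM_X,
  SKETCH_FORM_PREFIXED
};

enum sketch_non_prefixed
{
  SKETCH_NON_PREFIXED_D,
  SKETCH_NON_PREFIXED_DS,
  SKETCH_NON_PREFIXED_X
};

/* Return true if an address classified as GOT is acceptable when the
   caller asked for WANT.  Mirrors the switch in the patch: only D and
   DS requests can succeed; X-form, DS-form and base-register addresses
   pass for both; a D-form address passes only for a D request.  */
static bool
sketch_form_ok (enum sketch_non_prefixed want, enum sketch_insn_form got)
{
  if (want != SKETCH_NON_PREFIXED_D && want != SKETCH_NON_PREFIXED_DS)
    return false;

  switch (got)
    {
    case SKETCH_FORM_X:
    case SKETCH_FORM_DS:
    case SKETCH_FORM_BASE_REG:
      return true;
    case SKETCH_FORM_D:
      return want == SKETCH_NON_PREFIXED_D;
    default:
      return false;
    }
}
```

The only asymmetry is the D-form row: a D-form address is rejected when a
DS-form instruction was requested.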

> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 5bf9c83fc1e..307c0b200bd 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
>  #define MASK_UPDATE                  OPTION_MASK_UPDATE
>  #define MASK_VSX                     OPTION_MASK_VSX
>  #define MASK_POWER10                 OPTION_MASK_POWER10
> +#define MASK_P10_FUSION                      OPTION_MASK_P10_FUSION
> 
>  #ifndef IN_LIBGCC2
>  #define MASK_POWERPC64                       OPTION_MASK_POWERPC64

ok

> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b89990f46bf..c39b7098978 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal"
>  (include "dfp.md")
>  (include "crypto.md")
>  (include "htm.md")
> +(include "fusion.md")

ok

> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 2888172cb27..008a318b98d 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -479,6 +479,14 @@ mpower8-vector
>  Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
>  Use vector and scalar instructions added in ISA 2.07.
> 
> +mpower10-fusion
> +Target Report Mask(P10_FUSION) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
> +mpower10-fusion-ld-cmpi
> +Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
>  mcrypto
>  Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
>  Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.


ok

> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> index 1ddb5729cb2..bcc71a9e21b 100644
> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
>       $(COMPILE) $<
>       $(POSTCOMPILE)
> 
> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +     $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
> +
>  $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
>    $(srcdir)/config/rs6000/rs6000-cpus.def
>       $(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
> @@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \
>       $(srcdir)/config/rs6000/mma.md \
>       $(srcdir)/config/rs6000/crypto.md \
>       $(srcdir)/config/rs6000/htm.md \
> -     $(srcdir)/config/rs6000/dfp.md
> +     $(srcdir)/config/rs6000/dfp.md \
> +     $(srcdir)/config/rs6000/fusion.md


ok.

