Ping. I assume we’re going to want a separate patch for the new instruction type.
Aaron Sawdey, Ph.D. saw...@linux.ibm.com
IBM Linux on POWER Toolchain

> On Dec 4, 2020, at 1:19 PM, acsaw...@linux.ibm.com wrote:
>
> From: Aaron Sawdey <acsaw...@linux.ibm.com>
>
> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that power10 can fuse and execute. These particular ones have the
> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.
>
> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.
>
> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have made
> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.
>
> If bootstrap/regtest is clean, ok for trunk?
>
> Thanks!
>
> Aaron
>
> gcc/ChangeLog:
>
> * config/rs6000/genfusion.pl: New file, script to generate
> define_insn_and_split patterns so combine can arrange fused
> instructions next to each other.
> * config/rs6000/fusion.md: New file, generated fused instruction
> patterns for combine.
> * config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
> (non_update_memory_operand): New predicate.
> * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
> POWERPC_MASKS.
> * config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
> prototype.
> * config/rs6000/rs6000.c (rs6000_option_override_internal):
> Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
> if target is power10. (rs6000_opt_masks): Allow -mpower10-fusion
> in function attributes. (address_is_non_pfx_d_or_x): New function.
> * config/rs6000/rs6000.h: Add MASK_P10_FUSION.
> * config/rs6000/rs6000.md: Include fusion.md.
> * config/rs6000/rs6000.opt: Add -mpower10-fusion
> and -mpower10-fusion-ld-cmpi.
> * config/rs6000/t-rs6000: Add dependencies involving fusion.md.
> ---
> gcc/config/rs6000/fusion.md | 357 ++++++++++++++++++++++++++++++
> gcc/config/rs6000/genfusion.pl | 144 ++++++++++++
> gcc/config/rs6000/predicates.md | 14 ++
> gcc/config/rs6000/rs6000-cpus.def | 6 +-
> gcc/config/rs6000/rs6000-protos.h | 2 +
> gcc/config/rs6000/rs6000.c | 51 +++++
> gcc/config/rs6000/rs6000.h | 1 +
> gcc/config/rs6000/rs6000.md | 1 +
> gcc/config/rs6000/rs6000.opt | 8 +
> gcc/config/rs6000/t-rs6000 | 6 +-
> 10 files changed, 588 insertions(+), 2 deletions(-)
> create mode 100644 gcc/config/rs6000/fusion.md
> create mode 100755 gcc/config/rs6000/genfusion.pl
>
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> new file mode 100644
> index 00000000000..a4d3a6ae7f3
> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> +;; for more details.
> +;; > +;; You should have received a copy of the GNU General Public License > +;; along with GCC; see the file COPYING3. If not see > +;; <http://www.gnu.org/licenses/>. > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is DI result mode is clobber compare mode is CC extend is none > +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m") > + (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > + (clobber (match_scratch:DI 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "ld%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is DI result mode is clobber compare mode is CCUNS extend is > none > +(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m") > + (match_operand:DI 3 "const_0_to_1_operand" "n"))) > + (clobber (match_scratch:DI 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "ld%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by 
gen_ld_cmpi_p10 > +;; load mode is DI result mode is DI compare mode is CC extend is none > +(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m") > + (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > + (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "ld%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is DI result mode is DI compare mode is CCUNS extend is none > +(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m") > + (match_operand:DI 3 "const_0_to_1_operand" "n"))) > + (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "ld%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is clobber compare mode is CC extend is none > +(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" > + [(set (match_operand:CC 2 
"cc_reg_operand" "=x") > + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > + (clobber (match_scratch:SI 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwa%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is clobber compare mode is CCUNS extend is > none > +(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_0_to_1_operand" "n"))) > + (clobber (match_scratch:SI 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is SI compare mode is CC extend is none > +(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > + (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] > + 
"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwa%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is SI compare mode is CCUNS extend is none > +(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_0_to_1_operand" "n"))) > + (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (match_dup 1)) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is EXTSI compare mode is CC extend is sign > +(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > + (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI > (match_dup 1)))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwa%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand 
(operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_DS))" > + [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1))) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero > +(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m") > + (match_operand:SI 3 "const_0_to_1_operand" "n"))) > + (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI > (match_dup 1)))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lwz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1))) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is HI result mode is clobber compare mode is CC extend is sign > +(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m") > + (match_operand:HI 3 "const_m1_to_1_operand" "n"))) > + (clobber (match_scratch:GPR 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lha%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) 
(sign_extend:GPR (match_dup 1))) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is HI result mode is clobber compare mode is CCUNS extend is > zero > +(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m") > + (match_operand:HI 3 "const_0_to_1_operand" "n"))) > + (clobber (match_scratch:GPR 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lhz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (zero_extend:GPR (match_dup 1))) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is HI result mode is EXTHI compare mode is CC extend is sign > +(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign" > + [(set (match_operand:CC 2 "cc_reg_operand" "=x") > + (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m") > + (match_operand:HI 3 "const_m1_to_1_operand" "n"))) > + (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI > (match_dup 1)))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lha%X1 %0,%1\;cmpdi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1))) > + (set (match_dup 2) > + (compare:CC (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr 
"type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero > +(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m") > + (match_operand:HI 3 "const_0_to_1_operand" "n"))) > + (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI > (match_dup 1)))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lhz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1))) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > +;; load mode is QI result mode is clobber compare mode is CCUNS extend is > zero > +(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m") > + (match_operand:QI 3 "const_0_to_1_operand" "n"))) > + (clobber (match_scratch:GPR 0 "=r"))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lbz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (zero_extend:GPR (match_dup 1))) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > +;; load-cmpi fusion pattern generated 
by gen_ld_cmpi_p10 > +;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero > +(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero" > + [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > + (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m") > + (match_operand:QI 3 "const_0_to_1_operand" "n"))) > + (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR > (match_dup 1)))] > + "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)" > + "lbz%X1 %0,%1\;cmpldi 0,%0,%3" > + "&& reload_completed > + && (cc_reg_not_cr0_operand (operands[2], CCmode) > + || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, > NON_PREFIXED_D))" > + [(set (match_dup 0) (zero_extend:GPR (match_dup 1))) > + (set (match_dup 2) > + (compare:CCUNS (match_dup 0) > + (match_dup 3)))] > + "" > + [(set_attr "type" "load") > + (set_attr "cost" "8") > + (set_attr "length" "8")]) > + > diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl > new file mode 100755 > index 00000000000..494537c9439 > --- /dev/null > +++ b/gcc/config/rs6000/genfusion.pl > @@ -0,0 +1,144 @@ > +#!/usr/bin/perl -w > +# Generate fusion.md > +# Copyright (C) 2020 Free Software Foundation, Inc. > +# > +# This file is part of GCC. > +# > +# GCC is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3, or (at your option) > +# any later version. > +# > +# GCC is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with GCC; see the file COPYING3. If not see > +# <http://www.gnu.org/licenses/>. 
> + > +my $copyright = <<'EOF'; > +;; -*- buffer-read-only: t -*- > +;; Generated automatically by genfusion.pl > + > +;; Copyright (C) 2020 Free Software Foundation, Inc. > +;; > +;; This file is part of GCC. > +;; > +;; GCC is free software; you can redistribute it and/or modify it under > +;; the terms of the GNU General Public License as published by the Free > +;; Software Foundation; either version 3, or (at your option) any later > +;; version. > +;; > +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY > +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or > +;; FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > +;; for more details. > +;; > +;; You should have received a copy of the GNU General Public License > +;; along with GCC; see the file COPYING3. If not see > +;; <http://www.gnu.org/licenses/>. > + > +EOF > + > +print $copyright; > + > +sub mode_to_ldst_char > +{ > + my ($mode) = @_; > + if ($mode eq 'DI') { return 'd'; } > + if ($mode eq 'SI') { return 'w'; } > + if ($mode eq 'HI') { return 'h'; } > + if ($mode eq 'QI') { return 'b'; } > + return '?'; > +} > + > +sub gen_ld_cmpi_p10 > +{ > + LMODE: foreach $lmode ('DI','SI','HI','QI') { > + $ldst = mode_to_ldst_char($lmode); > + $clobbermode = $lmode; > + # For clobber, we need a SI/DI reg in case we split because we have to > sign/zero extend. > + if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; } > + RESULT: foreach $result ('clobber', $lmode, "EXT".$lmode) { > + # EXTDI does not exist, and we cannot directly produce HI/QI results. > + next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI"; > + # Don't allow EXTQI because that would allow HI result which we can't > do. 
> + if ( $result eq "EXTQI" ) { $result = "GPR"; } > + CCMODE: foreach $ccmode ('CC','CCUNS') { > + $np = "NON_PREFIXED_D"; > + if ( $ccmode eq 'CC' ) { > + next CCMODE if $lmode eq 'QI'; > + if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > + # ld and lwa are both DS-FORM. > + $np = "NON_PREFIXED_DS"; > + } > + $cmpl = ""; > + $echr = "a"; > + $constpred = "const_m1_to_1_operand"; > + } else { > + if ( $lmode eq 'DI' ) { > + # ld is DS-form, but lwz is not. > + $np = "NON_PREFIXED_DS"; > + } > + $cmpl = "l"; > + $echr = "z"; > + $constpred = "const_0_to_1_operand"; > + } > + if ($lmode eq 'DI') { $echr = ""; } > + if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') { > + # We always need extension if result > lmode. > + if ( $ccmode eq 'CC' ) { > + $extend = "sign"; > + } else { > + $extend = "zero"; > + } > + } else { > + # Result of SI/DI does not need sign extension. > + $extend = "none"; > + } > + print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n"; > + print ";; load mode is $lmode result mode is $result compare mode is > $ccmode extend is $extend\n"; > + > + print "(define_insn_and_split > \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; > + print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" > \"=x\")\n"; > + print " (compare:${ccmode} (match_operand:${lmode} 1 > \"non_update_memory_operand\" \"m\")\n"; > + print " (match_operand:${lmode} 3 \"${constpred}\" > \"n\")))\n"; > + if ($result eq 'clobber') { > + print " (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n"; > + } elsif ($result eq $lmode) { > + print " (set (match_operand:${result} 0 \"gpc_reg_operand\" > \"=r\") (match_dup 1))]\n"; > + } else { > + print " (set (match_operand:${result} 0 \"gpc_reg_operand\" > \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n"; > + } > + print " \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n"; > + print " \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n"; > + print " \"&& 
reload_completed\n"; > + print " && (cc_reg_not_cr0_operand (operands[2], CCmode)\n"; > + print " || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), > ${lmode}mode, ${np}))\"\n"; > + if ($extend eq "none") { > + print " [(set (match_dup 0) (match_dup 1))\n"; > + } else { > + $resultmode = $result; > + if ( $result eq 'clobber' ) { $resultmode = $clobbermode } > + print " [(set (match_dup 0) (${extend}_extend:${resultmode} > (match_dup 1)))\n"; > + } > + print " (set (match_dup 2)\n"; > + print " (compare:${ccmode} (match_dup 0)\n"; > + print " (match_dup 3)))]\n"; > + print " \"\"\n"; > + print " [(set_attr \"type\" \"load\")\n"; > + print " (set_attr \"cost\" \"8\")\n"; > + print " (set_attr \"length\" \"8\")])\n"; > + print "\n"; > + } > + } > + } > +} > + > + > +gen_ld_cmpi_p10(); > + > +exit(0); > + > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md > index 9ad5ae67302..78de8102f44 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand" > (and (match_code "const_int") > (match_test "IN_RANGE (INTVAL (op), 0, 1)"))) > > +;; Match op = -1, op = 0, or op = 1. > +(define_predicate "const_m1_to_1_operand" > + (and (match_code "const_int") > + (match_test "IN_RANGE (INTVAL (op), -1, 1)"))) > + > ;; Match op = 0..3. > (define_predicate "const_0_to_3_operand" > (and (match_code "const_int") > @@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem" > || GET_CODE (XEXP (op, 0)) == PRE_DEC > || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))")) > > +;; Anything that matches memory_operand but does not update the address. > +(define_predicate "non_update_memory_operand" > + (match_code "mem") > +{ > + if (update_address_mem (op, mode)) > + return 0; > + return memory_operand (op, mode); > +}) > + > ;; Return 1 if the operand is a MEM with an indexed-form address. 
> (define_special_predicate "indexed_address_mem" > (match_test "(MEM_P (op) > diff --git a/gcc/config/rs6000/rs6000-cpus.def > b/gcc/config/rs6000/rs6000-cpus.def > index 8d2c1ffd6cf..3e65289d8df 100644 > --- a/gcc/config/rs6000/rs6000-cpus.def > +++ b/gcc/config/rs6000/rs6000-cpus.def > @@ -82,7 +82,9 @@ > > #define ISA_3_1_MASKS_SERVER (ISA_3_0_MASKS_SERVER \ > | OPTION_MASK_POWER10 \ > - | OTHER_POWER10_MASKS) > + | OTHER_POWER10_MASKS \ > + | OPTION_MASK_P10_FUSION \ > + | OPTION_MASK_P10_FUSION_LD_CMPI) > > /* Flags that need to be turned off if -mno-power9-vector. */ > #define OTHER_P9_VECTOR_MASKS (OPTION_MASK_FLOAT128_HW \ > @@ -129,6 +131,8 @@ > | OPTION_MASK_FLOAT128_KEYWORD \ > | OPTION_MASK_FPRND \ > | OPTION_MASK_POWER10 \ > + | OPTION_MASK_P10_FUSION \ > + | OPTION_MASK_P10_FUSION_LD_CMPI \ > | OPTION_MASK_HTM \ > | OPTION_MASK_ISEL \ > | OPTION_MASK_MFCRF \ > diff --git a/gcc/config/rs6000/rs6000-protos.h > b/gcc/config/rs6000/rs6000-protos.h > index 3c4682b0e26..cd644083558 100644 > --- a/gcc/config/rs6000/rs6000-protos.h > +++ b/gcc/config/rs6000/rs6000-protos.h > @@ -191,6 +191,8 @@ enum non_prefixed_form { > > extern enum insn_form address_to_insn_form (rtx, machine_mode, > enum non_prefixed_form); > +extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode, > + enum non_prefixed_form > non_prefix_format); > extern bool prefixed_load_p (rtx_insn *); > extern bool prefixed_store_p (rtx_insn *); > extern bool prefixed_paddi_p (rtx_insn *); > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 517467ebc63..759551d07ec 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p) > if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0) > rs6000_isa_flags |= OPTION_MASK_MMA; > > + if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) > == 0) > + rs6000_isa_flags |= 
OPTION_MASK_P10_FUSION; > + > + if (TARGET_POWER10 && (rs6000_isa_flags_explicit & > OPTION_MASK_P10_FUSION_LD_CMPI) == 0) > + rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI; > + > /* Turn off vector pair/mma options on non-power10 systems. */ > else if (!TARGET_POWER10 && TARGET_MMA) > { > @@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const > rs6000_opt_masks[] = > { "power9-minmax", OPTION_MASK_P9_MINMAX, false, true }, > { "power9-misc", OPTION_MASK_P9_MISC, false, true }, > { "power9-vector", OPTION_MASK_P9_VECTOR, false, true }, > + { "power10-fusion", OPTION_MASK_P10_FUSION, false, > true }, > { "powerpc-gfxopt", OPTION_MASK_PPC_GFXOPT, false, true }, > { "powerpc-gpopt", OPTION_MASK_PPC_GPOPT, false, true }, > { "prefixed", OPTION_MASK_PREFIXED, false, > true }, > @@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr, > return INSN_FORM_BAD; > } > > +/* Given address rtx ADDR for a load of MODE, is this legitimate for a > + non-prefixed D-form or X-form instruction? NON_PREFIXED_FORMAT is > + given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want > + a D-form or DS-form instruction. X-form and base_reg are always > + allowed. */ > +bool > +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode, > + enum non_prefixed_form non_prefixed_format) > +{ > + enum insn_form result_form; > + > + result_form = address_to_insn_form (addr, mode, non_prefixed_format); > + > + switch (non_prefixed_format) > + { > + case NON_PREFIXED_D: > + switch (result_form) > + { > + case INSN_FORM_X: > + case INSN_FORM_D: > + case INSN_FORM_DS: > + case INSN_FORM_BASE_REG: > + return true; > + default: > + break; > + } > + break; > + case NON_PREFIXED_DS: > + switch (result_form) > + { > + case INSN_FORM_X: > + case INSN_FORM_DS: > + case INSN_FORM_BASE_REG: > + return true; > + default: > + break; > + } > + break; > + default: > + break; > + } > + return false; > +} > + > /* Helper function to see if we're potentially looking at lfs/stfs. 
> - PARALLEL containing a SET and a CLOBBER > - stfs: > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index 5bf9c83fc1e..307c0b200bd 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -539,6 +539,7 @@ extern int rs6000_vector_align[]; > #define MASK_UPDATE OPTION_MASK_UPDATE > #define MASK_VSX OPTION_MASK_VSX > #define MASK_POWER10 OPTION_MASK_POWER10 > +#define MASK_P10_FUSION OPTION_MASK_P10_FUSION > > #ifndef IN_LIBGCC2 > #define MASK_POWERPC64 OPTION_MASK_POWERPC64 > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index b89990f46bf..c39b7098978 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal" > (include "dfp.md") > (include "crypto.md") > (include "htm.md") > +(include "fusion.md") > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > index 2888172cb27..008a318b98d 100644 > --- a/gcc/config/rs6000/rs6000.opt > +++ b/gcc/config/rs6000/rs6000.opt > @@ -479,6 +479,14 @@ mpower8-vector > Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags) > Use vector and scalar instructions added in ISA 2.07. > > +mpower10-fusion > +Target Report Mask(P10_FUSION) Var(rs6000_isa_flags) > +Fuse certain integer operations together for better performance on power10. > + > +mpower10-fusion-ld-cmpi > +Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags) > +Fuse certain integer operations together for better performance on power10. > + > mcrypto > Target Report Mask(CRYPTO) Var(rs6000_isa_flags) > Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions. 
> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000 > index 1ddb5729cb2..bcc71a9e21b 100644 > --- a/gcc/config/rs6000/t-rs6000 > +++ b/gcc/config/rs6000/t-rs6000 > @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c > $(COMPILE) $< > $(POSTCOMPILE) > > +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl > + $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md > + > $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \ > $(srcdir)/config/rs6000/rs6000-cpus.def > $(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \ > @@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \ > $(srcdir)/config/rs6000/mma.md \ > $(srcdir)/config/rs6000/crypto.md \ > $(srcdir)/config/rs6000/htm.md \ > - $(srcdir)/config/rs6000/dfp.md > + $(srcdir)/config/rs6000/dfp.md \ > + $(srcdir)/config/rs6000/fusion.md > -- > 2.27.0 >
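
For anyone skimming the rs6000.c hunk quoted above, the acceptance rule in address_is_non_pfx_d_or_x can be restated as a small standalone sketch. It is only an illustration, not the GCC code: the enums and function names below (sketch_insn_form, sketch_non_prefixed, sketch_form_ok, sketch_selftest) are hypothetical stand-ins for the real insn_form and non_prefixed_form types, and the sketch assumes the address has already been classified, which the patch does by calling address_to_insn_form.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's internal enums.  The real ones
   (insn_form, non_prefixed_form) have more members; only the values
   the quoted function looks at are modelled here.  */
enum sketch_insn_form
{
  FORM_BAD,       /* anything the function rejects, e.g. prefixed */
  FORM_BASE_REG,  /* plain register base, no offset */
  FORM_D,         /* 16-bit displacement */
  FORM_DS,        /* 16-bit displacement, low two bits zero */
  FORM_X          /* register + register */
};

enum sketch_non_prefixed
{
  NP_D,   /* the load is a D-form instruction (lwz, lha, lhz, lbz) */
  NP_DS   /* the load is a DS-form instruction (ld, lwa) */
};

/* Same decision table as the quoted function: X-form, DS-form and
   base-register addresses are accepted for either request, while a
   D-form address is accepted only when a D-form load was asked for.  */
static bool
sketch_form_ok (enum sketch_insn_form form, enum sketch_non_prefixed want)
{
  switch (form)
    {
    case FORM_X:
    case FORM_DS:
    case FORM_BASE_REG:
      return true;
    case FORM_D:
      return want == NP_D;
    default:
      return false;
    }
}

/* Spot checks mirroring the two switch arms in the patch.  */
static void
sketch_selftest (void)
{
  assert (sketch_form_ok (FORM_D, NP_D));
  assert (!sketch_form_ok (FORM_D, NP_DS));
  assert (sketch_form_ok (FORM_DS, NP_D));
  assert (sketch_form_ok (FORM_DS, NP_DS));
  assert (sketch_form_ok (FORM_X, NP_DS));
  assert (sketch_form_ok (FORM_BASE_REG, NP_DS));
  assert (!sketch_form_ok (FORM_BAD, NP_D));
}
```

Calling sketch_selftest () from a main function exercises the table. In the generated patterns the split condition negates this check, so a fused insn whose address fails it is split back into a separate load and compare after reload.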