Hi Carl,

on 2023/7/12 02:06, Carl Love wrote:
> GCC maintainers:
> 
> Ver 4, Removed extra space in subject line.  Added comment to commit
> log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define.  Changed
> Added to Add and Renamed to Rename in ChangeLog.  Updated define_expand
> "rs6000_set_fpscr_rn" per Peter's comments to use new temporary
> register for output value.  Also, comments from Kewen about moving rtx
> tmp_di1 close to use.  Renamed tmp_di2 as orig_df_in_di.  Additionally,
> changed the name of tmp_di3 to tmp_di2 so the numbering is
> sequential.  Moved the new rtx tmp_di2 = gen_reg_rtx (DImode); right
> before its use to be consistent with previous move request.  Fixed tabs
> in comment.  Remove -std=c99 from test_fpscr_rn_builtin_1.c. Cleaned up
> comment and removed abort from test_fpscr_rn_builtin_2.c.  
> 
> Fixed a couple of additional issues with the ChangeLog per feedback
> from git gcc-verify.
> 
> Retested updated patch on Power 8, 9 and 10 to verify changes.
> 
> Ver 3, Renamed the patch per comments on ver 2.  Previous subject line
> was " [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value".  
> Fixed spelling mistakes and formatting.  Updated define_expand
> "rs6000_set_fpscr_rn to have the rs6000_get_fpscr_fields and
> rs6000_update_fpscr_rn_field define expands inlined.  Optimized the
> code and fixed use of temporary register values. Updated the test file
> dg-do run arguments and dg-options.  Removed the check for
> __SET_FPSCR_RN_RETURNS_FPSCR__. Removed additional references to the
> overloaded built-in with double argument.  Fixed up the documentation
> file.  Updated patch retested on Power 8 BE/LE, Power 9 BE/LE and Power
> 10 LE.
> 
> Ver 2,  Went back thru the requirements and emails.  Not sure where I
> came up with the requirement for an overloaded version with double
> argument.  Removed the overloaded version with the double argument. 
> Added the macro to announce if the __builtin_set_fpscr_rn returns a
> void or a double with the FPSCR bits.  Updated the documentation file. 
> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
> file.  Per request, the original test file functionality was not
> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
> test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
> new test file, test_fpscr_rn_builtin_2.c.
> 
> The GLibC team requested a builtin to replace the mffscrn and
> mffscrniinline asm instructions in the GLibC code.  Previously there
> was discussion on adding builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.
> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                                Carl 
> 
> -----------------------------------------
> rs6000, Add return value to __builtin_set_fpscr_rn
> 
> Change the return value from void to double for __builtin_set_fpscr_rn.
> The return value consists of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI,
> RN bit positions.  A new test file, test powerpc/test_fpscr_rn_builtin_2.c,
> is added to test the new return value for the built-in.
> 
> The value __SET_FPSCR_RN_RETURNS_FPSCR__ is defined if
> __builtin_set_fpscr_rn returns a double.
> 
> gcc/ChangeLog:
>       * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
>       built-in definition return type.
>       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add check,
>       define __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>       * config/rs6000/rs6000.md (rs6000_set_fpscr_rn): Add return
>       argument to return FPSCR fields.
>       * doc/extend.texi (__builtin_set_fpscr_rn): Update description for
>       the return value.  Add description for
>       __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 
> gcc/testsuite/ChangeLog:
>       * gcc.target/powerpc/test_fpscr_rn_builtin.c: Rename to
>       test_fpscr_rn_builtin_1.c.  Add comment.
>       * gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>       return value of __builtin_set_fpscr_rn builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>  gcc/config/rs6000/rs6000-c.cc                 |   5 +
>  gcc/config/rs6000/rs6000.md                   |  55 ++++---
>  gcc/doc/extend.texi                           |  23 ++-
>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |  12 +-
>  .../powerpc/test_fpscr_rn_builtin_2.c         | 148 ++++++++++++++++++
>  6 files changed, 213 insertions(+), 32 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => 
> test_fpscr_rn_builtin_1.c} (89%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 289a37998b1..28788b69c7d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -237,7 +237,7 @@
>    const __ibm128 __builtin_pack_ibm128 (double, double);
>      PACK_IF packif {ibm128}
>  
> -  void __builtin_set_fpscr_rn (const int[0,3]);
> +  double __builtin_set_fpscr_rn (const int[0,3]);
>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>  
>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index 8555174d36e..07a32badb54 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -604,6 +604,11 @@ rs6000_target_modify_macros (bool define_p, 
> HOST_WIDE_INT flags)
>    /* Tell the user -mrop-protect is in play.  */
>    if (rs6000_rop_protect)
>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
> +  /* Tell the user __builtin_set_fpscr_rn now returns the FPSCR fields
> +     in a double.  Originally the built-in returned void.  */
> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +    rs6000_define_or_undefine_macro (define_p,
> +                                  "__SET_FPSCR_RN_RETURNS_FPSCR__");
>  }
>  
>  void
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b0db8ae508d..641ab2dfcfe 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6441,7 +6441,8 @@ (define_insn "rs6000_mffscdrn"
>    [(set_attr "type" "fp")])
>  
>  (define_expand "rs6000_set_fpscr_rn"
> - [(match_operand:SI 0 "reg_or_cint_operand")]
> + [(use (match_operand:DF 0 "gpc_reg_operand"))
> +  (use (match_operand:SI 1 "reg_or_cint_operand"))]
>    "TARGET_HARD_FLOAT"
>  {
>    rtx tmp_df = gen_reg_rtx (DFmode);
> @@ -6450,49 +6451,63 @@ (define_expand "rs6000_set_fpscr_rn"
>       new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
>    if (TARGET_P9_MISC)
>      {
> -      if (const_0_to_3_operand (operands[0], VOIDmode))
> -     emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
> +      if (const_0_to_3_operand (operands[1], VOIDmode))
> +     emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
>        else
>       {
> -       rtx op0 = convert_to_mode (DImode, operands[0], false);
> -       rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
> +       rtx op1 = convert_to_mode (DImode, operands[1], false);
> +       rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
>         emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
>       }
> -      DONE;
> +       emit_move_insn (operands[0], tmp_df);
> +       DONE;
>      }
>  
> -  if (CONST_INT_P (operands[0]))
> +  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
> +     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
> +  */
> +
> +  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, 
> UE,
> +    ZE, XE, NI, RN) from the FPSCR and return them.  */
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  rtx orig_df_in_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  rtx tmp_di1 = gen_reg_rtx (DImode);
> +  emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di,
> +          GEN_INT (0x00000007000000FFULL)));

Nit: The line with GEN_INT should indent more to tmp_di1, that is:

  emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di,
                         GEN_INT (0x00000007000000FFULL)));

> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di1, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +
> +  if (CONST_INT_P (operands[1]))
>      {
> -      if ((INTVAL (operands[0]) & 0x1) == 0x1)
> +      if ((INTVAL (operands[1]) & 0x1) == 0x1)
>       emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
>        else
>       emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
>  
> -      if ((INTVAL (operands[0]) & 0x2) == 0x2)
> +      if ((INTVAL (operands[1]) & 0x2) == 0x2)
>       emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
>        else
>       emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
>      }
>    else
>      {
> -      rtx tmp_rn = gen_reg_rtx (DImode);
> -      rtx tmp_di = gen_reg_rtx (DImode);
> -
>        /* Extract new RN mode from operand.  */
> -      rtx op0 = convert_to_mode (DImode, operands[0], false);
> -      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
> +      rtx op1 = convert_to_mode (DImode, operands[1], false);
> +      rtx tmp_rn = gen_reg_rtx (DImode);
> +      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
>  
> -      /* Insert new RN mode into FSCPR.  */
> -      emit_insn (gen_rs6000_mffs (tmp_df));
> -      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> +      /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
> +      emit_insn (gen_anddi3 (tmp_di1, orig_df_in_di, GEN_INT (-4)));

Nit: This tmp_di1 has been used as destination pseudo above, as Peter's
comments, it's better to use one new pseudo here.

OK for trunk with these nits fixed and bootstrapped & regress-tested
as before.  Thanks!

Kewen

Reply via email to