Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
On 10/25/23 04:15, Jin Ma wrote: +;; The conversion of DF to BF needs to be done with SF if there is a +;; chance to generate at least one instruction, otherwise just using +;; libfunc __truncdfbf2. +(define_expand "truncdfbf2" + [(set (match_operand:BF 0 "register_operand" "=f") + (float_truncate:BF + (match_operand:DF 1 "register_operand" " f")))] + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" + { +convert_move (operands[0], + convert_modes (SFmode, DFmode, operands[1], 0), 0); +DONE; + }) So for conversions to/from BFmode, doesn't generic code take care of this for us? Search for convert_mode_scalar in expr.cc. That code will utilize SFmode as an intermediate step just like your expander. Is there some reason that generic code is insufficient? Similarly for the the other conversions. As far as I can see, the function 'convert_mode_scalar' doesn't seem to be perfect for dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, DF and SF to BF well, but the rest of the conversion without any processing, directly using the libcall. Maybe I should choose to enhance its functionality? This seems to be a good choice, I'm not sure.My recollection was that BF could be converted to/from SF trivially and if we wanted BF->DF we'd first convert to SF, then to DF. Direct BF<->DF conversions aren't actually important from a performance standpoint. So it's OK if they have an extra step IMHO. Thank you very much for your review and detailed reply. Maybe there are some problems with my expression and I am a little confused about your guidance. My understanding is that you also think that it is reasonable to convert through SF, right? In fact, this is what I did. My point was that I would expect the generic code to handle the conversion and that we didn't need to handle it explicitly in the RISC-V backend. Meaning that I don't think we need a define_expand for truncdfbf2, fix_truncbf2, fixuns_truncbf2, floatbf2, or floatunsbf2. In this patch, my thoughts are as follows: The general principle is to use the real instructions instead of libcall as much as possible for conversions, while minimizing the definition of libcall(only reusing which has been defined by other architectures such as aarch64). If SF can be used as a transit, it is preferred to convert to SF, otherwise libcall is directly used. 1. For the conversions between floating points For BF->DF, as you said, the function 'convert_mode_scalar' in the general code has been well implemented, which will be expressed as BF->SF->DF. And the generated instruction list may be as follows: 'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point support); 'call __extendbfsf2' + 'fcvt.d.s' (when (TARGET_DOUBLE_FLOAT || TARGET_ZDINX) is true); 'fcvt.s.bf16'+ 'fcvt.d.s' (when ((TARGET_DOUBLE_FLOAT || TARGET_ZDINX) && TARGET_ZFBFMIN) is true) For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call __truncdfbf2' is directly generated by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand "truncdfbf2") is used. This makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which cannot be generated by the function 'convert_mode_scala'. But I would have expected convert_mode_scalar to generate DF->BF by first truncating to SF, then to BF. If that is missing for truncation, then we should add it to convert_mode_scalar rather than expressing it as a backend expander. 2. For the conversions between integer and BF, it seems that gcc only uses libcall to implement it, but this is obviously wrong. For example, the conversion BF->SI directly calls the unimplemented libcall __fixunsbfsi. So I added some new pattern to handle these transformations with SF. I would suggest these move into target independent code as well. There's no reason I'm aware of that they should be implemented entirely in a target machine description. We're not really doing anything target specific in here. jeff
Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
> >>> +;; The conversion of DF to BF needs to be done with SF if there is a > >>> +;; chance to generate at least one instruction, otherwise just using > >>> +;; libfunc __truncdfbf2. > >>> +(define_expand "truncdfbf2" > >>> + [(set (match_operand:BF 0 "register_operand" "=f") > >>> + (float_truncate:BF > >>> + (match_operand:DF 1 "register_operand" " f")))] > >>> + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" > >>> + { > >>> +convert_move (operands[0], > >>> + convert_modes (SFmode, DFmode, operands[1], 0), 0); > >>> +DONE; > >>> + }) > >> So for conversions to/from BFmode, doesn't generic code take care of > >> this for us? Search for convert_mode_scalar in expr.cc. That code will > >> utilize SFmode as an intermediate step just like your expander. Is > >> there some reason that generic code is insufficient? > >> > >> Similarly for the the other conversions. > > > > As far as I can see, the function 'convert_mode_scalar' doesn't seem to be > > perfect for > > dealing with the conversions to/from BFmode. It can only handle BF to HF, > > SF, DF and > > SF to BF well, but the rest of the conversion without any processing, > > directly using > > the libcall. > > > > Maybe I should choose to enhance its functionality? This seems to be a > > good choice, I'm not sure.My recollection was that BF could be converted > > to/from SF trivially and > if we wanted BF->DF we'd first convert to SF, then to DF. > > Direct BF<->DF conversions aren't actually important from a performance > standpoint. So it's OK if they have an extra step IMHO. Thank you very much for your review and detailed reply. Maybe there are some problems with my expression and I am a little confused about your guidance. My understanding is that you also think that it is reasonable to convert through SF, right? In fact, this is what I did. In this patch, my thoughts are as follows: The general principle is to use the real instructions instead of libcall as much as possible for conversions, while minimizing the definition of libcall(only reusing which has been defined by other architectures such as aarch64). If SF can be used as a transit, it is preferred to convert to SF, otherwise libcall is directly used. 1. For the conversions between floating points For BF->DF, as you said, the function 'convert_mode_scalar' in the general code has been well implemented, which will be expressed as BF->SF->DF. And the generated instruction list may be as follows: 'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point support); 'call __extendbfsf2' + 'fcvt.d.s' (when (TARGET_DOUBLE_FLOAT || TARGET_ZDINX) is true); 'fcvt.s.bf16'+ 'fcvt.d.s' (when ((TARGET_DOUBLE_FLOAT || TARGET_ZDINX) && TARGET_ZFBFMIN) is true) For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call __truncdfbf2' is directly generated by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand "truncdfbf2") is used. This makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which cannot be generated by the function 'convert_mode_scala'. 2. For the conversions between integer and BF, it seems that gcc only uses libcall to implement it, but this is obviously wrong. For example, the conversion BF->SI directly calls the unimplemented libcall __fixunsbfsi. So I added some new pattern to handle these transformations with SF. Thanks, Jin > > jeff
Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
On 10/9/23 00:18, Jin Ma wrote: +;; The conversion of DF to BF needs to be done with SF if there is a +;; chance to generate at least one instruction, otherwise just using +;; libfunc __truncdfbf2. +(define_expand "truncdfbf2" + [(set (match_operand:BF 0 "register_operand" "=f") + (float_truncate:BF + (match_operand:DF 1 "register_operand" " f")))] + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" + { +convert_move (operands[0], + convert_modes (SFmode, DFmode, operands[1], 0), 0); +DONE; + }) So for conversions to/from BFmode, doesn't generic code take care of this for us? Search for convert_mode_scalar in expr.cc. That code will utilize SFmode as an intermediate step just like your expander. Is there some reason that generic code is insufficient? Similarly for the the other conversions. As far as I can see, the function 'convert_mode_scalar' doesn't seem to be perfect for dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, DF and SF to BF well, but the rest of the conversion without any processing, directly using the libcall. Maybe I should choose to enhance its functionality? This seems to be a good choice, I'm not sure.My recollection was that BF could be converted to/from SF trivially and if we wanted BF->DF we'd first convert to SF, then to DF. Direct BF<->DF conversions aren't actually important from a performance standpoint. So it's OK if they have an extra step IMHO. jeff
Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
> On 9/19/23 02:44, Jin Ma wrote: > > gcc/ChangeLog: > > > > * config/riscv/iterators.md (HFBF): New. > > * config/riscv/riscv-builtins.cc (riscv_init_builtin_types): > > Initialize data type_Bfloat16. > > * config/riscv/riscv-modes.def (FLOAT_MODE): New. > > (ADJUST_FLOAT_FORMAT): New. > > * config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode. > > (riscv_scalar_mode_supported_p): Ditto. > > (riscv_libgcc_floating_mode_supported_p): Ditto. > > (riscv_block_arith_comp_libfuncs_for_mode): New. > > (riscv_init_libfuncs): Opening and closing some libfuncs for BFmode. > > * config/riscv/riscv.md (mode" ): Add BF. > > (truncdfbf2): New. > > (movhf): Support for BFmode. > > (mov): Ditto. > > (*mov_softfloat): Ditto. > > (fix_truncbf2): New. > > (fixuns_truncbf2): New. > > (floatbf2): New. > > (floatunsbf2): New. > > > > libgcc/ChangeLog: > > > > * config/riscv/sfp-machine.h (_FP_NANFRAC_B): New. > > (_FP_NANSIGN_B): New. > > * config/riscv/t-softfp32: Add support for BF libfuncs. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/bf16_arithmetic.c: New test. > > * gcc.target/riscv/bf16_call.c: New test. > > * gcc.target/riscv/bf16_comparisons.c: New test. > > * gcc.target/riscv/bf16_convert-1.c: New test. > > * gcc.target/riscv/bf16_convert-2.c: New test. > > * gcc.target/riscv/bf16_convert_run.c: New test. > So this can't go in the tree until the extension has moved into a frozen > state. Hopefully that'll happen before we close stage1 development in Nov. Ok, this is very reasonable. > > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index e00b8ee3579..5048628c784 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@ -1631,6 +1631,20 @@ (define_insn "truncdfhf2" > > [(set_attr "type" "fcvt") > > (set_attr "mode" "HF")]) > > > > +;; The conversion of DF to BF needs to be done with SF if there is a > > +;; chance to generate at least one instruction, otherwise just using > > +;; libfunc __truncdfbf2. > > +(define_expand "truncdfbf2" > > + [(set (match_operand:BF 0 "register_operand" "=f") > > + (float_truncate:BF > > + (match_operand:DF 1 "register_operand" " f")))] > > + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" > > + { > > +convert_move (operands[0], > > + convert_modes (SFmode, DFmode, operands[1], 0), 0); > > +DONE; > > + }) > So for conversions to/from BFmode, doesn't generic code take care of > this for us? Search for convert_mode_scalar in expr.cc. That code will > utilize SFmode as an intermediate step just like your expander. Is > there some reason that generic code is insufficient? > > Similarly for the the other conversions. As far as I can see, the function 'convert_mode_scalar' doesn't seem to be perfect for dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, DF and SF to BF well, but the rest of the conversion without any processing, directly using the libcall. Maybe I should choose to enhance its functionality? This seems to be a good choice, I'm not sure. Jin > > Otherwise it looks pretty good. > > Jeff
Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
On 9/19/23 02:44, Jin Ma wrote: gcc/ChangeLog: * config/riscv/iterators.md (HFBF): New. * config/riscv/riscv-builtins.cc (riscv_init_builtin_types): Initialize data type_Bfloat16. * config/riscv/riscv-modes.def (FLOAT_MODE): New. (ADJUST_FLOAT_FORMAT): New. * config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode. (riscv_scalar_mode_supported_p): Ditto. (riscv_libgcc_floating_mode_supported_p): Ditto. (riscv_block_arith_comp_libfuncs_for_mode): New. (riscv_init_libfuncs): Opening and closing some libfuncs for BFmode. * config/riscv/riscv.md (mode" ): Add BF. (truncdfbf2): New. (movhf): Support for BFmode. (mov): Ditto. (*mov_softfloat): Ditto. (fix_truncbf2): New. (fixuns_truncbf2): New. (floatbf2): New. (floatunsbf2): New. libgcc/ChangeLog: * config/riscv/sfp-machine.h (_FP_NANFRAC_B): New. (_FP_NANSIGN_B): New. * config/riscv/t-softfp32: Add support for BF libfuncs. gcc/testsuite/ChangeLog: * gcc.target/riscv/bf16_arithmetic.c: New test. * gcc.target/riscv/bf16_call.c: New test. * gcc.target/riscv/bf16_comparisons.c: New test. * gcc.target/riscv/bf16_convert-1.c: New test. * gcc.target/riscv/bf16_convert-2.c: New test. * gcc.target/riscv/bf16_convert_run.c: New test. So this can't go in the tree until the extension has moved into a frozen state. Hopefully that'll happen before we close stage1 development in Nov. diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index e00b8ee3579..5048628c784 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -1631,6 +1631,20 @@ (define_insn "truncdfhf2" [(set_attr "type" "fcvt") (set_attr "mode" "HF")]) +;; The conversion of DF to BF needs to be done with SF if there is a +;; chance to generate at least one instruction, otherwise just using +;; libfunc __truncdfbf2. +(define_expand "truncdfbf2" + [(set (match_operand:BF 0 "register_operand" "=f") + (float_truncate:BF + (match_operand:DF 1 "register_operand" " f")))] + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" + { +convert_move (operands[0], + convert_modes (SFmode, DFmode, operands[1], 0), 0); +DONE; + }) So for conversions to/from BFmode, doesn't generic code take care of this for us? Search for convert_mode_scalar in expr.cc. That code will utilize SFmode as an intermediate step just like your expander. Is there some reason that generic code is insufficient? Similarly for the the other conversions. Otherwise it looks pretty good. Jeff
[RFC 1/2] RISC-V: Add support for _Bfloat16.
gcc/ChangeLog: * config/riscv/iterators.md (HFBF): New. * config/riscv/riscv-builtins.cc (riscv_init_builtin_types): Initialize data type_Bfloat16. * config/riscv/riscv-modes.def (FLOAT_MODE): New. (ADJUST_FLOAT_FORMAT): New. * config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode. (riscv_scalar_mode_supported_p): Ditto. (riscv_libgcc_floating_mode_supported_p): Ditto. (riscv_block_arith_comp_libfuncs_for_mode): New. (riscv_init_libfuncs): Opening and closing some libfuncs for BFmode. * config/riscv/riscv.md (mode" ): Add BF. (truncdfbf2): New. (movhf): Support for BFmode. (mov): Ditto. (*mov_softfloat): Ditto. (fix_truncbf2): New. (fixuns_truncbf2): New. (floatbf2): New. (floatunsbf2): New. libgcc/ChangeLog: * config/riscv/sfp-machine.h (_FP_NANFRAC_B): New. (_FP_NANSIGN_B): New. * config/riscv/t-softfp32: Add support for BF libfuncs. gcc/testsuite/ChangeLog: * gcc.target/riscv/bf16_arithmetic.c: New test. * gcc.target/riscv/bf16_call.c: New test. * gcc.target/riscv/bf16_comparisons.c: New test. * gcc.target/riscv/bf16_convert-1.c: New test. * gcc.target/riscv/bf16_convert-2.c: New test. * gcc.target/riscv/bf16_convert_run.c: New test. --- gcc/config/riscv/iterators.md | 2 + gcc/config/riscv/riscv-builtins.cc| 16 ++ gcc/config/riscv/riscv-modes.def | 4 + gcc/config/riscv/riscv.cc | 93 -- gcc/config/riscv/riscv.md | 94 -- .../gcc.target/riscv/bf16_arithmetic.c| 36 gcc/testsuite/gcc.target/riscv/bf16_call.c| 17 ++ .../gcc.target/riscv/bf16_comparisons.c | 25 +++ .../gcc.target/riscv/bf16_convert-1.c | 39 + .../gcc.target/riscv/bf16_convert-2.c | 38 .../gcc.target/riscv/bf16_convert_run.c | 163 ++ libgcc/config/riscv/sfp-machine.h | 3 + libgcc/config/riscv/t-softfp32| 7 +- 13 files changed, 503 insertions(+), 34 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_arithmetic.c create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_call.c create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_comparisons.c create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert_run.c diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md index ecf033f2fa7..73523b73fdd 100644 --- a/gcc/config/riscv/iterators.md +++ b/gcc/config/riscv/iterators.md @@ -84,6 +84,8 @@ (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")]) ;; instruction. (define_mode_attr size [(QI "b") (HI "h")]) +(define_mode_iterator HFBF [HF BF]) + ;; Mode attributes for loads. (define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (HF "flh") (SF "flw") (DF "fld")]) diff --git a/gcc/config/riscv/riscv-builtins.cc b/gcc/config/riscv/riscv-builtins.cc index 3fe3a89dcc2..b7bb89794f7 100644 --- a/gcc/config/riscv/riscv-builtins.cc +++ b/gcc/config/riscv/riscv-builtins.cc @@ -192,6 +192,7 @@ static GTY(()) int riscv_builtin_decl_index[NUM_INSN_CODES]; riscv_builtin_decls[riscv_builtin_decl_index[(CODE)]] tree riscv_float16_type_node = NULL_TREE; +tree riscv_bfloat16_type_node = NULL_TREE; /* Return the function type associated with function prototype TYPE. */ @@ -235,6 +236,21 @@ riscv_init_builtin_types (void) if (!maybe_get_identifier ("_Float16")) lang_hooks.types.register_builtin_type (riscv_float16_type_node, "_Float16"); + + /* Provide the _Bfloat16 type and bfloat16_type_node if needed. */ + if (!bfloat16_type_node) +{ + riscv_bfloat16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (riscv_bfloat16_type_node) = 16; + SET_TYPE_MODE (riscv_bfloat16_type_node, BFmode); + layout_type (riscv_bfloat16_type_node); +} + else +riscv_bfloat16_type_node = bfloat16_type_node; + + if (!maybe_get_identifier ("_Bfloat16")) +lang_hooks.types.register_builtin_type (riscv_bfloat16_type_node, + "_Bfloat16"); } /* Implement TARGET_INIT_BUILTINS. */ diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def index e3c6ccb2809..723bfaee42d 100644 --- a/gcc/config/riscv/riscv-modes.def +++ b/gcc/config/riscv/riscv-modes.def @@ -22,6 +22,10 @@ along with GCC; see the file COPYING3. If not see FLOAT_MODE (HF, 2, ieee_half_format); FLOAT_MODE (TF, 16, ieee_quad_format); +FLOAT_MODE (BF, 2, 0); +/* Reuse definition from arm. */ +ADJUST_FLOAT_FORMAT (BF, _bfloat_half_format); + /* Vector modes. */ /* Encode the ratio of