Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

Jin Ma Wed, 25 Oct 2023 03:16:56 -0700

> >>> +;; The conversion of DF to BF needs to be done with SF if there is a
> >>> +;; chance to generate at least one instruction, otherwise just using
> >>> +;; libfunc __truncdfbf2.
> >>> +(define_expand "truncdfbf2"
> >>> +  [(set (match_operand:BF     0 "register_operand" "=f")
> >>> +       (float_truncate:BF
> >>> +           (match_operand:DF 1 "register_operand" " f")))]
> >>> +  "TARGET_DOUBLE_FLOAT || TARGET_ZDINX"
> >>> +  {
> >>> +    convert_move (operands[0],
> >>> +           convert_modes (SFmode, DFmode, operands[1], 0), 0);
> >>> +    DONE;
> >>> +  })
> >> So for conversions to/from BFmode, doesn't generic code take care of
> >> this for us?  Search for convert_mode_scalar in expr.cc. That code will
> >> utilize SFmode as an intermediate step just like your expander.   Is
> >> there some reason that generic code is insufficient?
> >>
> >> Similarly for the the other conversions.
> > 
> > As far as I can see, the function 'convert_mode_scalar' doesn't seem to be 
> > perfect for
> > dealing with the conversions to/from BFmode. It can only handle BF to HF, 
> > SF, DF and
> > SF to BF well, but the rest of the conversion without any processing, 
> > directly using
> > the libcall.
> > 
> > Maybe I should choose to enhance its functionality? This seems to be a
> > good choice, I'm not sure.My recollection was that BF could be converted 
> > to/from SF trivially and 
> if we wanted BF->DF we'd first convert to SF, then to DF.
> 
> Direct BF<->DF conversions aren't actually important from a performance 
> standpoint.  So it's OK if they have an extra step IMHO.


Thank you very much for your review and detailed reply. Maybe there are some 
problems with my expression
and I am a little confused about your guidance. My understanding is that you 
also think that it is reasonable to
convert through SF, right? In fact, this is what I did.

In this patch, my thoughts are as follows:

The general principle is to use the real instructions instead of libcall as 
much as possible for conversions,
while minimizing the definition of libcall(only reusing which has been defined 
by other architectures such
as aarch64). If SF can be used as a transit, it is preferred to convert to SF, 
otherwise libcall is directly used.

1. For the conversions between floating points

For BF->DF, as you said, the function 'convert_mode_scalar' in the general code 
has been well implemented,
which will be expressed as BF->SF->DF. And the generated instruction list may 
be as follows:
  'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point 
support);
  'call __extendbfsf2' + 'fcvt.d.s'           (when (TARGET_DOUBLE_FLOAT || 
TARGET_ZDINX) is true);
  'fcvt.s.bf16'        + 'fcvt.d.s'           (when ((TARGET_DOUBLE_FLOAT || 
TARGET_ZDINX) && TARGET_ZFBFMIN) is true)

For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call 
__truncdfbf2' is directly generated
by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand 
"truncdfbf2") is used. This
makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which 
cannot be generated by the function
'convert_mode_scala'.

2. For the conversions between integer and BF, it seems that gcc only uses 
libcall to implement it, but this is
obviously wrong. For example, the conversion BF->SI directly calls the 
unimplemented libcall __fixunsbfsi.
So I added some new pattern to handle these transformations with SF.

Thanks,

Jin

> 
> jeff

Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

Reply via email to