on 2023/5/24 23:20, Carl Love wrote:
> On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
>> on 2023/5/24 06:30, Peter Bergner wrote:
>>> On 5/23/23 12:24 AM, Kewen.Lin wrote:
>>>> on 2023/5/23 01:31, Carl Love wrote:
>>>>> The builtins were requested for use in GLibC.  As of version
>>>>> 2.31 they
>>>>> were added as inline asm.  They requested a builtin so the asm
>>>>> could be
>>>>> removed.
>>>>
>>>> So IMHO we also want the similar support for mffscrn, that is to
>>>> make
>>>> use of mffscrn and mffscrni on Power9 and later, but falls back
>>>> to 
>>>> __builtin_set_fpscr_rn + mffs similar on older platforms.
>>>
>>> So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
>>> uses mffscrn/mffscrni on P9 and later and uses older insns on pre-
>>> P9.
>>> The only problem is we don't return the current FPSCR bits, as the
>>> bif
>>> is defined to return void.
>>
>> Yes.
>>
>>> Crazy idea, but could we extend the built-in
>>> with an overload that returns the FPSCR bits?  
>>
>> So you agree that we should make this proposed new bif handle pre-P9
>> just
>> like some other existing bifs. :)  I think extending it is good and
>> doable,
>> but the only concern here is the bif name "__builtin_set_fpscr_rn",
>> which
>> matches the existing behavior (only set rounding) but doesn't match
>> the
>> proposed extending behavior (set rounding and get some env bits
>> back).
>> Maybe it's not a big deal if the documentation clarify it well.
> 
> Extending the builtin to pre Power 9 is straight forward and I agree
> would make good sense to do.
> 
> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
> new functionality.  Peter suggests overloading the builtin to either
> return void or returns FPSCR bits.  It is my understanding that the
> return value for a given builtin had to be the same, i.e. you can't
> overload the return value. Maybe you can with Bill's new
> infrastructure?  I recall having problems trying to overload the return
> value in the past and Bill said you couldn't do it.  I play with this
> and see if I can overload the return value.

Your understanding on that we fail to overload this for just different
return types is correct.  But previously I interpreted the extending
proposal as to extend
 
  void __builtin_set_fpscr_rn (int);

to 

  void __builtin_set_fpscr_rn (int, double*);

The related address taken and store here can be optimized out normally.

BR,
Kewen

Reply via email to