on 2023/5/24 23:20, Carl Love wrote: > On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote: >> on 2023/5/24 06:30, Peter Bergner wrote: >>> On 5/23/23 12:24 AM, Kewen.Lin wrote: >>>> on 2023/5/23 01:31, Carl Love wrote: >>>>> The builtins were requested for use in GLibC. As of version >>>>> 2.31 they >>>>> were added as inline asm. They requested a builtin so the asm >>>>> could be >>>>> removed. >>>> >>>> So IMHO we also want the similar support for mffscrn, that is to >>>> make >>>> use of mffscrn and mffscrni on Power9 and later, but falls back >>>> to >>>> __builtin_set_fpscr_rn + mffs similar on older platforms. >>> >>> So __builtin_set_fpscr_rn everything we want (sets the RN bits) and >>> uses mffscrn/mffscrni on P9 and later and uses older insns on pre- >>> P9. >>> The only problem is we don't return the current FPSCR bits, as the >>> bif >>> is defined to return void. >> >> Yes. >> >>> Crazy idea, but could we extend the built-in >>> with an overload that returns the FPSCR bits? >> >> So you agree that we should make this proposed new bif handle pre-P9 >> just >> like some other existing bifs. :) I think extending it is good and >> doable, >> but the only concern here is the bif name "__builtin_set_fpscr_rn", >> which >> matches the existing behavior (only set rounding) but doesn't match >> the >> proposed extending behavior (set rounding and get some env bits >> back). >> Maybe it's not a big deal if the documentation clarify it well. > > Extending the builtin to pre Power 9 is straight forward and I agree > would make good sense to do. > > I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the > new functionality. Peter suggests overloading the builtin to either > return void or returns FPSCR bits. It is my understanding that the > return value for a given builtin had to be the same, i.e. you can't > overload the return value. Maybe you can with Bill's new > infrastructure? I recall having problems trying to overload the return > value in the past and Bill said you couldn't do it. I play with this > and see if I can overload the return value.
Your understanding on that we fail to overload this for just different return types is correct. But previously I interpreted the extending proposal as to extend void __builtin_set_fpscr_rn (int); to void __builtin_set_fpscr_rn (int, double*); The related address taken and store here can be optimized out normally. BR, Kewen