Hi Carl, on 2023/6/19 23:57, Carl Love wrote: > GCC maintainers: > > > The GLibC team requested a builtin to replace the mffscrn and mffscrni inline > asm instructions in the GLibC code> Previously there was discussion on adding > builtins for the mffscrn instructions. > > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html > > In the end, it was felt that it would be to extend the existing > __builtin_set_fpscr_rn builtin to return a double instead of a void > type. The desire is that we could have the functionality of the > mffscrn and mffscrni instructions on older ISAs. The two instructions > were initially added in ISA 3.0. The __builtin_set_fpscr_rn has the > needed functionality to set the RN field using the mffscrn and mffscrni > instructions if ISA 3.0 is supported or fall back to using logical > instructions to mask and set the bits for earlier ISAs. The > instructions return the current value of the FPSCR fields DRN, VE, OE, > UE, ZE, XE, NI, RN bit positions then update the RN bit positions with > the new RN value provided. > > The current __builtin_set_fpscr_rn builtin has a return type of void. > So, changing the return type to double and returning the FPSCR fields > DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the > functionally equivalent of the mffscrn and mffscrni instructions. Any > current uses of the builtin would just ignore the return value yet any > new uses could use the return value. So the requirement is for the > change to the __builtin_set_fpscr_rn builtin to be backwardly > compatible and work for all ISAs. > > The following patch changes the return type of the > __builtin_set_fpscr_rn builtin from void to double. The return value > is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE, > XE, NI, RN bit positions when the builtin is called. The builtin then > updated the RN field with the new value provided as an argument to the > builtin. The patch adds new testcases to test_fpscr_rn_builtin.c to > check that the builtin returns the current value of the FPSCR fields > and then updates the RN field.
But this patch also introduces one more overloading instance with argument type double, I had a check on glibc usage of mffscrn/mffscrni, I don't think it's necessary to add this new instance, as it takes the given rounding mode as integral type. For examples: #define __fe_mffscrn(rn) \ ({register fenv_union_t __fr; \ if (__builtin_constant_p (rn)) \ __asm__ __volatile__ ( \ ".machine push; .machine \"power9\"; mffscrni %0,%1; .machine pop" \ : "=f" (__fr.fenv) : "n" (rn)); \ else \ { \ __fr.l = (rn); \ __asm__ __volatile__ ( \ ".machine push; .machine \"power9\"; mffscrn %0,%1; .machine pop" \ : "=f" (__fr.fenv) : "f" (__fr.fenv)); \ } \ __fr.fenv; \ }) /* Same as __fesetround_inline, however without runtime check to use DFP mtfsfi syntax (as relax_fenv_state) or if round value is valid. */ static inline void __fesetround_inline_nocheck (const int round) { #ifdef _ARCH_PWR9 __fe_mffscrn (round); #else if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) __fe_mffscrn (round); else asm volatile ("mtfsfi 7,%0" : : "n" (round)); #endif } So could you just extend return type (from void to double) but without one more overloading instance? Without overloading, we can still use the original bif instance SET_FPSCR_RN and its correpsonding expander rs6000_set_fpscr_rn, just add some more handlings to fetch bits for return value. It would be simpler IMHO. > > The GLibC team has reviewed the patch to make sure it met their needs > as a drop in replacement for the inline asm mffscr and mffscrni > statements in the GLibC code. T > > The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 > LE. > > Please let me know if the patch is acceptable for mainline. Thanks. > > Carl > > -------------------------------- > rs6000, __builtin_set_fpscr_rn add retrun value > > Change the return value from void to double. The return value consists of > the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions. Add an > overloaded version which accepts a double argument. > > The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the > double reterun value and the new double argument. > > gcc/ChangeLog: > * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Delete. > (__builtin_set_fpscr_rn_i): New builtin definition. > (__builtin_set_fpscr_rn_d): New builtin definition. > * config/rs6000/rs6000-overload.def (__builtin_set_fpscr_rn): New > overloaded definition. > * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New > define_expand. > (rs6000_update_fpscr_rn_field): New define_expand. > (rs6000_set_fpscr_rn_d): New define expand. > (rs6000_set_fpscr_rn_i): Renamed from rs6000_set_fpscr_rn, Added > return argument. Updated to use new rs6000_get_fpscr_fields and > rs6000_update_fpscr_rn_field define _expands. > * doc/extend.texi (__builtin_set_fpscr_rn): Update description for > the return value and new double argument. > > gcc/testsuite/ChangeLog: > gcc.target/powerpc/test_fpscr_rn_builtin.c: Add new tests th check > double return value. Add tests for overloaded double argument. Since we have the expectation that the existing usage of __builtin_set_fpscr_rn would still work fine, so could we keep the existing one unchanged to test it? and test the new capability with one new test case? BR, Kewen