[PATCH, version 2], Add support for _Float<N> and _Float<N>X sqrt, fma, fmin, fmax built-in functions

Michael Meissner Thu, 19 Oct 2017 15:09:06 -0700

On Wed, Sep 13, 2017 at 10:49:43PM +0000, Joseph Myers wrote:
> On Wed, 13 Sep 2017, Michael Meissner wrote:
> 
> > This patch adds support on PowerPC ISA 3.0 for the built-in function
> > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction
> > and the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> > XSNMADDQP, and XSNMSUBQP fused multiply-add instructions.
> 
> Is there a reason for these to be architecture-specific rather than 
> generic everywhere _Float128 is supported?  (With the fmaf128 / sqrtf128 
> names available as well as the __builtin_* variants of those.)


The basic reason was I hadn't yet discovered all of the places that need to be
modified to add generic _Float128 math functions.

> Full support for _FloatN/_FloatNx variants of all the existing built-in 
> functions might be complicated, and run into potential issues with startup 
> cost of creating large numbers of extra built-in functions (it's 
> desirable, but possibly hard, which is why I excluded it from the initial 
> _FloatN / _FloatNx support patches).  But adding just these two functions 
> to builtins.def and making them fold / expand appropriately ought to be 
> much simpler.  (I realise sqrt goes through internal-fn.def and 
> DEF_INTERNAL_FLT_FN expects a particular set of functions for standard 
> types, so maybe some duplication would be involved to get the built-in 
> function expanded appropriately, i.e. using an insn pattern or a call to 
> an external sqrtf128 function according to whether such an insn pattern is 
> available.  fma ought not to involve much more than adding an extra case 
> where CASE_FLT_FN (BUILT_IN_FMA) is used.)

I have now gone through and added the proper support for _Float128 sqrt, fma,
fmin, and fmax.  I have added the framework so that other functions as needed
can be added over time.
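
To give a feel for that framework, here is a minimal standalone sketch of the
lookup scheme.  It is not code from the patch: the BI_*, FN_*, and TY_* names
and the lookup_builtin function are invented stand-ins for GCC's
built_in_function codes, combined_fn values, and type nodes.  In this sketch,
giving a function the full set of variants is a matter of switching which
case macro it uses.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's built_in_function codes.  */
enum builtin_code
{
  BI_SIN, BI_SINF, BI_SINL,
  BI_SQRT, BI_SQRTF, BI_SQRTL,
  BI_SQRTF16, BI_SQRTF32, BI_SQRTF64, BI_SQRTF128,
  BI_SQRTF32X, BI_SQRTF64X, BI_SQRTF128X,
  BI_END                        /* Plays the role of END_BUILTINS.  */
};

/* Hypothetical function and type selectors.  */
enum math_fn { FN_SIN, FN_SQRT, FN_OTHER };

enum fp_type
{
  TY_FLOAT, TY_DOUBLE, TY_LONG_DOUBLE,
  TY_FLOAT16, TY_FLOAT32, TY_FLOAT64, TY_FLOAT128,
  TY_FLOAT32X, TY_FLOAT64X, TY_FLOAT128X,
  TY_COUNT
};

/* Fill in only the float, double, and long double variants.  */
#define CASE_FN(NAME) \
  case FN_##NAME: \
    codes[TY_DOUBLE] = BI_##NAME; \
    codes[TY_FLOAT] = BI_##NAME##F; \
    codes[TY_LONG_DOUBLE] = BI_##NAME##L; \
    break;

/* Likewise, but also fill in the _Float<N> and _Float<N>X variants.  */
#define CASE_FN_FLOATN(NAME) \
  case FN_##NAME: \
    codes[TY_DOUBLE] = BI_##NAME; \
    codes[TY_FLOAT] = BI_##NAME##F; \
    codes[TY_LONG_DOUBLE] = BI_##NAME##L; \
    codes[TY_FLOAT16] = BI_##NAME##F16; \
    codes[TY_FLOAT32] = BI_##NAME##F32; \
    codes[TY_FLOAT64] = BI_##NAME##F64; \
    codes[TY_FLOAT128] = BI_##NAME##F128; \
    codes[TY_FLOAT32X] = BI_##NAME##F32X; \
    codes[TY_FLOAT64X] = BI_##NAME##F64X; \
    codes[TY_FLOAT128X] = BI_##NAME##F128X; \
    break;

/* Return the variant of FN that operates on TYPE, or BI_END if this
   sketch has no such variant.  */
enum builtin_code
lookup_builtin (enum math_fn fn, enum fp_type type)
{
  enum builtin_code codes[TY_COUNT];
  int i;

  assert ((unsigned int) type < TY_COUNT);

  /* Every slot defaults to "no such built-in".  */
  for (i = 0; i < TY_COUNT; i++)
    codes[i] = BI_END;

  switch (fn)
    {
    CASE_FN (SIN)
    CASE_FN_FLOATN (SQRT)
    default:
      return BI_END;
    }

  return codes[type];
}
```

With this arrangement lookup_builtin (FN_SQRT, TY_FLOAT128) finds
BI_SQRTF128, while lookup_builtin (FN_SIN, TY_FLOAT128) still reports BI_END
because SIN uses the three-variant macro.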

> > While I was at it, I changed the documentation so that it no longer
> > documents the 'q' built-in functions (to mirror libquadmath) but instead
> > just documents the 'f128' functions that match glibc 2.26 and the
> > technical report that added the _Float128 type.
> 
> Those *f128 built-in functions (inf / huge_val / nan / nans / fabs / 
> copysign) are not target-specific; they exist for all _FloatN / _FloatNx 
> types for all targets with such types.  So it doesn't seem appropriate to 
> document them in a target-specific section of the manual, beyond a brief 
> cross-reference to the documentation of the functions as 
> target-independent.

Highlights of the patch:

    1)  I switched to use DEF_EXT_LIB_BUILTIN to declare the _Float<N> and
        _Float<N>X functions.  This allows treating __builtin_sqrtf128 the same
        as sqrtf128.

    2)  Add support in gencfn-macros.c to build the appropriate CASE_CFN_*
        macros and the operator lists in cfn-operators.pd that can be used
        as needed.

    3)  I did not enable _Float128 support for all math built-ins, just for
        the built-in functions I currently need to support (much as was
        previously done for copysign and fabs).  I expect more will need to
        be added to the list over time.  I added fmin and fmax to the
        machine independent built-ins, but I will submit a patch later to
        enable them on PowerPC.

    4)  I went through and added support for copysign, fma, fmin, and fmax
        functions in the same places the current float/double/long double
        functions are handled.

    5)  I removed the PowerPC sqrtf128 and fmaf128 built-ins, since these are
        now handled by machine independent code.  In doing so, I deleted two
        tests that did not allow the built-ins where the software emulator is
        used.  The GLIBC 2.26 shipped with the Advance Toolchain 11.0-1
        contains these functions.

    6)  In the previous version of the patch, I put in a special warning for
        fmaf128 (that it might not be present if the h/w instructions weren't
        available).  When I wrote that patch, the initial release of Advance
        Toolchain 11.0-0 did not include a fmaf128 function.  It now includes
        the function, so I don't need the warning.
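
As a rough illustration of item 2, the sketch below formats the kind of
separate _FN case macro that the generator emits for the _Float<N> and
_Float<N>X variants.  It is a simplified standalone rewrite, not the patch's
print_case_cfn: the format_case_macro name, the buffer-based interface, and
the floatn_suffixes array are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Suffixes of the _Float<N> and _Float<N>X variants (assumed list).  */
static const char *const floatn_suffixes[] = {
  "F16", "F32", "F64", "F128", "F32X", "F64X", "F128X", NULL
};

/* Write into BUF (of SIZE bytes) a macro definition that expands to one
   case label per entry in SUFFIXES for the built-in named NAME.  If
   FLOATN_P is nonzero the macro gets an "_FN" suffix so that it stays
   separate from the float/double/long double macro.  The last label has
   no trailing colon; the use site supplies it.  Return the length of
   the generated text.  */
size_t
format_case_macro (char *buf, size_t size, const char *name,
                   const char *const *suffixes, int floatn_p)
{
  size_t len;
  unsigned int i;
  int n;

  n = snprintf (buf, size, "#define CASE_CFN_%s%s", name,
                floatn_p ? "_FN" : "");
  assert (n >= 0 && (size_t) n < size);
  len = (size_t) n;

  for (i = 0; suffixes[i]; ++i)
    {
      /* Terminate the previous label with ':' before starting a new one.  */
      n = snprintf (buf + len, size - len, "%s \\\n  case CFN_BUILT_IN_%s%s",
                    i > 0 ? ":" : "", name, suffixes[i]);
      assert (n >= 0 && (size_t) n < size - len);
      len += (size_t) n;
    }

  return len;
}
```

For the root name SQRT this yields a CASE_CFN_SQRT_FN definition listing
CFN_BUILT_IN_SQRTF16 through CFN_BUILT_IN_SQRTF128X, which a switch can then
use alongside CASE_CFN_SQRT.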

I have checked this patch on the following systems with bootstrap and make
check for gcc/g++/gfortran/lto:

    1)  Little endian power8 system using --with-cpu=power8
    2)  Big endian power7 system (both 64/32-bit) using --with-cpu=power7
    3)  Little endian power9 prototype system using --with-cpu=power9
    4)  A Fedora 21 x86_64 system

Can I check these patches into the trunk?

[gcc]
2017-10-19  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * builtins.c (CASE_MATHFN_FLOATN): New helper macro to support
        math functions that have _Float<N> and _Float<N>X variants.
        (mathfn_built_in_2): Add support for copysign, fma, fmax, fmin,
        and sqrt having _Float<N> and _Float<N>X variants.
        (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro to support math
        functions with _Float<N> and _Float<N>X variants.
        (expand_builtin_mathfn_ternary): Add fma _Float<N> and _Float<N>X
        support.
        (expand_builtin): Likewise.
        (fold_builtin_3): Likewise.
        * fold-const.c (tree_call_nonnegative_warnv_p): Add support for
        sqrt, fmax, fmin, and copysign with _Float<N> and _Float<N>X
        variants.
        (integer_valued_real_call_p): Likewise.
        * builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16): New
        function signatures for fma _Float<N> and _Float<N>X variants.
        (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
        (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
        (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
        (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
        (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
        (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
        * builtins.def (DEF_GCC_FLOATN_NX_BUILTINS): Use
        DEF_EXT_LIB_BUILTIN instead of DEF_GCC_BUILTIN, so that
        sqrtf128 is normally processed to be __builtin_sqrtf128.
        (BUILT_IN_FMA): Define _Float<N> and _Float<N>X variants.
        (BUILT_IN_FMAX): Likewise.
        (BUILT_IN_FMIN): Likewise.
        (BUILT_IN_SQRT): Likewise.
        * tree-call-cdce.c (can_test_argument_range): Add support for sqrt
        _Float<N> and _Float<N>X variants.
        (edom_only_function): Likewise.
        (get_no_error_domain): Likewise.
        * tree-ssa-math-opts.c (internal_fn_reciprocal): Likewise.
        * fold-const-call.c (fold_const_call_ss): Likewise.
        (fold_const_call_sss): Add support for copysign, fmin, and fmax
        _Float<N> and _Float<N>X variants.
        (fold_const_call_ssss): Add support for fma _Float<N> and
        _Float<N>X variants.
        * internal-fn.def (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro
        for math functions that have _Float<N> and _Float<N>X variants.
        (SQRT): Add support for sqrt, copysign, fmin and fmax _Float<N>
        and _Float<N>X variants.
        (COPYSIGN): Likewise.
        (FMIN): Likewise.
        (FMAX): Likewise.
        * gencfn-macros.c (print_case_cfn): Add support for math functions
        that have _Float<N> and _Float<N>X variants.
        (print_define_operator_list): Likewise.
        (fltfn_suffixes): Likewise.
        (main): Likewise.
        * tree-ssa-reassoc.c (attempt_builtin_copysign): Add support for
        copysign with _Float<N> and _Float<N>X variants.
        * gimple-ssa-backprop.c (backprop::process_builtin_call_use): Add
        support for copysign and fma with _Float<N> and _Float<N>X
        variants.
        (strip_sign_op_1): Add support for copysign with _Float<N> and
        _Float<N>X variants.
        * config/rs6000/rs6000-builtin.def (SQRTF128): Delete rs6000
        sqrtf128 and fmaf128 builtins, as this is handled by machine
        independent code.
        (FMAF128): Likewise.

[gcc/c-family]
2017-10-19  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * c-cppbuiltin.c (mode_has_fma): Add support for PowerPC fmakf3
        for float128 fma when long double is not __float128.
        (c_cpp_builtins): Define __FP_FAST_FMAF<N> and __FP_FAST_FMA<N>X
        if the _Float<N> and _Float<N>X variants for fma exist.

[gcc/c]
2017-10-19  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * c-decl.c (header_for_builtin_fn): Add support for fma with
        _Float<N> and _Float<N>X variants.

[gcc/testsuite]
2017-10-19  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * gcc.target/powerpc/float128-fma2.c: Delete, test is no longer
        relevant now that machine independent code handles sqrt and fma
        _Float<N> and _Float<N>X variants.
        * gcc.target/powerpc/float128-sqrt2.c: Likewise.
        * gcc.target/powerpc/float128-hw.c: Add more tests for FMA
        variants.  Test code generated to convert __float128 to float.
        * gcc.target/powerpc/float128-hw2.c: New test for machine
        independent handling of copysignf128, sqrtf128, and fmaf128.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c      (.../trunk/gcc) (revision 253857)
+++ gcc/builtins.c      (.../branches/ibm/ieee/gcc)     (working copy)
@@ -1816,14 +1816,26 @@ expand_builtin_classify_type (tree exp)
   return GEN_INT (no_type_class);
 }
 
-/* This helper macro, meant to be used in mathfn_built_in below,
-   determines which among a set of three builtin math functions is
-   appropriate for a given type mode.  The `F' and `L' cases are
-   automatically generated from the `double' case.  */
+/* This helper macro, meant to be used in mathfn_built_in below, determines
+   which among a set of builtin math functions is appropriate for a given type
+   mode.  The `F' (float) and `L' (long double) are automatically generated
+   from the 'double' case.  If a function supports the _Float<N> and _Float<N>X
+   types, there are additional types that are considered with 'F32', 'F64',
+   'F128', etc. suffixes.  */
 #define CASE_MATHFN(MATHFN) \
   CASE_CFN_##MATHFN: \
   fcode = BUILT_IN_##MATHFN; fcodef = BUILT_IN_##MATHFN##F ; \
   fcodel = BUILT_IN_##MATHFN##L ; break;
+/* Similar to the above, but also add support for the _Float<N> and _Float<N>X
+   types.  */
+#define CASE_MATHFN_FLOATN(MATHFN) \
+  CASE_CFN_##MATHFN: \
+  fcode = BUILT_IN_##MATHFN; fcodef = BUILT_IN_##MATHFN##F ; \
+  fcodel = BUILT_IN_##MATHFN##L ; fcodef16 = BUILT_IN_##MATHFN##F16 ; \
+  fcodef32 = BUILT_IN_##MATHFN##F32; fcodef64 = BUILT_IN_##MATHFN##F64 ; \
+  fcodef128 = BUILT_IN_##MATHFN##F128 ; fcodef32x = BUILT_IN_##MATHFN##F32X ; \
+  fcodef64x = BUILT_IN_##MATHFN##F64X ; fcodef128x = BUILT_IN_##MATHFN##F128X ;\
+  break;
 /* Similar to above, but appends _R after any F/L suffix.  */
 #define CASE_MATHFN_REENT(MATHFN) \
   case CFN_BUILT_IN_##MATHFN##_R: \
@@ -1840,7 +1852,15 @@ expand_builtin_classify_type (tree exp)
 static built_in_function
 mathfn_built_in_2 (tree type, combined_fn fn)
 {
+  tree mtype;
   built_in_function fcode, fcodef, fcodel;
+  built_in_function fcodef16 = END_BUILTINS;
+  built_in_function fcodef32 = END_BUILTINS;
+  built_in_function fcodef64 = END_BUILTINS;
+  built_in_function fcodef128 = END_BUILTINS;
+  built_in_function fcodef32x = END_BUILTINS;
+  built_in_function fcodef64x = END_BUILTINS;
+  built_in_function fcodef128x = END_BUILTINS;
 
   switch (fn)
     {
@@ -1854,7 +1874,7 @@ mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (CBRT)
     CASE_MATHFN (CEIL)
     CASE_MATHFN (CEXPI)
-    CASE_MATHFN (COPYSIGN)
+    CASE_MATHFN_FLOATN (COPYSIGN)
     CASE_MATHFN (COS)
     CASE_MATHFN (COSH)
     CASE_MATHFN (DREM)
@@ -1867,9 +1887,9 @@ mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (FABS)
     CASE_MATHFN (FDIM)
     CASE_MATHFN (FLOOR)
-    CASE_MATHFN (FMA)
-    CASE_MATHFN (FMAX)
-    CASE_MATHFN (FMIN)
+    CASE_MATHFN_FLOATN (FMA)
+    CASE_MATHFN_FLOATN (FMAX)
+    CASE_MATHFN_FLOATN (FMIN)
     CASE_MATHFN (FMOD)
     CASE_MATHFN (FREXP)
     CASE_MATHFN (GAMMA)
@@ -1923,7 +1943,7 @@ mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (SIN)
     CASE_MATHFN (SINCOS)
     CASE_MATHFN (SINH)
-    CASE_MATHFN (SQRT)
+    CASE_MATHFN_FLOATN (SQRT)
     CASE_MATHFN (TAN)
     CASE_MATHFN (TANH)
     CASE_MATHFN (TGAMMA)
@@ -1936,12 +1956,27 @@ mathfn_built_in_2 (tree type, combined_f
       return END_BUILTINS;
     }
 
-  if (TYPE_MAIN_VARIANT (type) == double_type_node)
+  mtype = TYPE_MAIN_VARIANT (type);
+  if (mtype == double_type_node)
     return fcode;
-  else if (TYPE_MAIN_VARIANT (type) == float_type_node)
+  else if (mtype == float_type_node)
     return fcodef;
-  else if (TYPE_MAIN_VARIANT (type) == long_double_type_node)
+  else if (mtype == long_double_type_node)
     return fcodel;
+  else if (mtype == float16_type_node)
+    return fcodef16;
+  else if (mtype == float32_type_node)
+    return fcodef32;
+  else if (mtype == float64_type_node)
+    return fcodef64;
+  else if (mtype == float128_type_node)
+    return fcodef128;
+  else if (mtype == float32x_type_node)
+    return fcodef32x;
+  else if (mtype == float64x_type_node)
+    return fcodef64x;
+  else if (mtype == float128x_type_node)
+    return fcodef128x;
   else
     return END_BUILTINS;
 }
@@ -1995,6 +2030,9 @@ associated_internal_fn (tree fndecl)
     {
 #define DEF_INTERNAL_FLT_FN(NAME, FLAGS, OPTAB, TYPE) \
     CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME;
+#define DEF_INTERNAL_FLT_FLOATN_FN(NAME, FLAGS, OPTAB, TYPE) \
+    CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME; \
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_##NAME): return IFN_##NAME;
 #define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
     CASE_INT_FN (BUILT_IN_##NAME): return IFN_##NAME;
 #include "internal-fn.def"
@@ -2068,6 +2106,7 @@ expand_builtin_mathfn_ternary (tree exp,
   switch (DECL_FUNCTION_CODE (fndecl))
     {
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       builtin_optab = fma_optab; break;
     default:
       gcc_unreachable ();
@@ -6559,6 +6598,7 @@ expand_builtin (tree exp, rtx target, rt
       break;
 
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       target = expand_builtin_mathfn_ternary (exp, target, subtarget);
       if (target)
        return target;
@@ -8989,6 +9029,7 @@ fold_builtin_3 (location_t loc, tree fnd
       return fold_builtin_sincos (loc, arg0, arg1, arg2);
 
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       return fold_builtin_fma (loc, arg0, arg1, arg2, type);
 
     CASE_FLT_FN (BUILT_IN_REMQUO):
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c    (.../trunk/gcc) (revision 253857)
+++ gcc/fold-const.c    (.../branches/ibm/ieee/gcc)     (working copy)
@@ -12761,6 +12761,7 @@ tree_call_nonnegative_warnv_p (tree type
       return true;
 
     CASE_CFN_SQRT:
+    CASE_CFN_SQRT_FN:
       /* sqrt(-0.0) is -0.0.  */
       if (!HONOR_SIGNED_ZEROS (element_mode (type)))
        return true;
@@ -12805,14 +12806,17 @@ tree_call_nonnegative_warnv_p (tree type
       return RECURSE (arg0);
 
     CASE_CFN_FMAX:
+    CASE_CFN_FMAX_FN:
       /* True if the 1st OR 2nd arguments are nonnegative.  */
       return RECURSE (arg0) || RECURSE (arg1);
 
     CASE_CFN_FMIN:
+    CASE_CFN_FMIN_FN:
       /* True if the 1st AND 2nd arguments are nonnegative.  */
       return RECURSE (arg0) && RECURSE (arg1);
 
     CASE_CFN_COPYSIGN:
+    CASE_CFN_COPYSIGN_FN:
       /* True if the 2nd argument is nonnegative.  */
       return RECURSE (arg1);
 
@@ -13311,7 +13315,9 @@ integer_valued_real_call_p (combined_fn
       return true;
 
     CASE_CFN_FMIN:
+    CASE_CFN_FMIN_FN:
     CASE_CFN_FMAX:
+    CASE_CFN_FMAX_FN:
       return RECURSE (arg0) && RECURSE (arg1);
 
     default:
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def       (.../trunk/gcc) (revision 253857)
+++ gcc/builtin-types.def       (.../branches/ibm/ieee/gcc)     (working copy)
@@ -544,6 +544,20 @@ DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE
                     BT_DOUBLE, BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
 DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
                     BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16,
+                    BT_FLOAT16, BT_FLOAT16, BT_FLOAT16, BT_FLOAT16)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32,
+                    BT_FLOAT32, BT_FLOAT32, BT_FLOAT32, BT_FLOAT32)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64,
+                    BT_FLOAT64, BT_FLOAT64, BT_FLOAT64, BT_FLOAT64)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128,
+                    BT_FLOAT128, BT_FLOAT128, BT_FLOAT128, BT_FLOAT128)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X,
+                    BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X,
+                    BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X,
+                    BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X)
 DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_FLOAT_FLOAT_INTPTR,
                     BT_FLOAT, BT_FLOAT, BT_FLOAT, BT_INT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_INTPTR,
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def    (.../trunk/gcc) (revision 253857)
+++ gcc/builtins.def    (.../branches/ibm/ieee/gcc)     (working copy)
@@ -92,13 +92,13 @@ along with GCC; see the file COPYING3.
    value for the type.  */
 #undef DEF_GCC_FLOATN_NX_BUILTINS
 #define DEF_GCC_FLOATN_NX_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS)      \
-  DEF_GCC_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), ATTRS) \
-  DEF_GCC_BUILTIN (ENUM ## F128X, NAME "f128x", TYPE_MACRO (FLOAT128X), ATTRS)
+  DEF_EXT_LIB_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), ATTRS) \
+  DEF_EXT_LIB_BUILTIN (ENUM ## F128X, NAME "f128x", TYPE_MACRO (FLOAT128X), ATTRS)
 
 /* A library builtin (like __builtin_strchr) is a builtin equivalent
    of an ANSI/ISO standard library function.  In addition to the
@@ -382,12 +382,21 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL,
 DEF_C99_BUILTIN        (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMA, "fma", FMA_TYPE, ATTR_MATHFN_FPROUNDING)
+#undef FMA_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_FMAX, "fmax", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMAXF, "fmaxf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMAXL, "fmaxl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define FMAX_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMAX, "fmax", FMAX_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef FMAX_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_FMIN, "fmin", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMINF, "fminf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMINL, "fminl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define FMIN_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMIN, "fmin", FMIN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef FMIN_TYPE
 DEF_LIB_BUILTIN        (BUILT_IN_FMOD, "fmod", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FMODF, "fmodf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FMODL, "fmodl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -564,6 +573,9 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_SINL, "
 DEF_LIB_BUILTIN        (BUILT_IN_SQRT, "sqrt", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SQRTF, "sqrtf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_SQRTL, "sqrtl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
+#define SQRT_TYPE(F) BT_FN_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_SQRT, "sqrt", SQRT_TYPE, ATTR_MATHFN_FPROUNDING_ERRNO)
+#undef SQRT_TYPE
 DEF_LIB_BUILTIN        (BUILT_IN_TAN, "tan", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_TANF, "tanf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_LIB_BUILTIN        (BUILT_IN_TANH, "tanh", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
Index: gcc/tree-call-cdce.c
===================================================================
--- gcc/tree-call-cdce.c        (.../trunk/gcc) (revision 253857)
+++ gcc/tree-call-cdce.c        (.../branches/ibm/ieee/gcc)     (working copy)
@@ -314,6 +314,7 @@ can_test_argument_range (gcall *call)
     CASE_FLT_FN (BUILT_IN_POW10):
     /* Sqrt.  */
     CASE_FLT_FN (BUILT_IN_SQRT):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
       return check_builtin_call (call);
     /* Special one: two argument pow.  */
     case BUILT_IN_POW:
@@ -342,6 +343,7 @@ edom_only_function (gcall *call)
     CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
     CASE_FLT_FN (BUILT_IN_SIN):
     CASE_FLT_FN (BUILT_IN_SQRT):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
     CASE_FLT_FN (BUILT_IN_FMOD):
     CASE_FLT_FN (BUILT_IN_REMAINDER):
       return true;
@@ -703,6 +705,7 @@ get_no_error_domain (enum built_in_funct
                          308, true, false);
     /* sqrt: [0, +inf)  */
     CASE_FLT_FN (BUILT_IN_SQRT):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
       return get_domain (0, true, true,
                          0, false, false);
     default:
Index: gcc/tree-ssa-math-opts.c
===================================================================
--- gcc/tree-ssa-math-opts.c    (.../trunk/gcc) (revision 253857)
+++ gcc/tree-ssa-math-opts.c    (.../branches/ibm/ieee/gcc)     (working copy)
@@ -515,6 +515,7 @@ internal_fn_reciprocal (gcall *call)
   switch (gimple_call_combined_fn (call))
     {
     CASE_CFN_SQRT:
+    CASE_CFN_SQRT_FN:
       ifn = IFN_RSQRT;
       break;
 
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c       (.../trunk/gcc) (revision 253857)
+++ gcc/fold-const-call.c       (.../branches/ibm/ieee/gcc)     (working copy)
@@ -596,6 +596,7 @@ fold_const_call_ss (real_value *result,
   switch (fn)
     {
     CASE_CFN_SQRT:
+    CASE_CFN_SQRT_FN:
       return (real_compare (GE_EXPR, arg, &dconst0)
              && do_mpfr_arg1 (result, mpfr_sqrt, arg, format));
 
@@ -1179,14 +1180,17 @@ fold_const_call_sss (real_value *result,
       return do_mpfr_arg2 (result, mpfr_hypot, arg0, arg1, format);
 
     CASE_CFN_COPYSIGN:
+    CASE_CFN_COPYSIGN_FN:
       *result = *arg0;
       real_copysign (result, arg1);
       return true;
 
     CASE_CFN_FMIN:
+    CASE_CFN_FMIN_FN:
       return do_mpfr_arg2 (result, mpfr_min, arg0, arg1, format);
 
     CASE_CFN_FMAX:
+    CASE_CFN_FMAX_FN:
       return do_mpfr_arg2 (result, mpfr_max, arg0, arg1, format);
 
     CASE_CFN_POW:
@@ -1473,6 +1477,7 @@ fold_const_call_ssss (real_value *result
   switch (fn)
     {
     CASE_CFN_FMA:
+    CASE_CFN_FMA_FN:
       return do_mpfr_arg3 (result, mpfr_fma, arg0, arg1, arg2, format);
 
     default:
Index: gcc/internal-fn.def
===================================================================
--- gcc/internal-fn.def (.../trunk/gcc) (revision 253857)
+++ gcc/internal-fn.def (.../branches/ibm/ieee/gcc)     (working copy)
@@ -80,6 +80,11 @@ along with GCC; see the file COPYING3.
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_FLT_FLOATN_FN
+#define DEF_INTERNAL_FLT_FLOATN_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_FLT_FN (NAME, FLAGS, OPTAB, TYPE)
+#endif
+
 #ifndef DEF_INTERNAL_INT_FN
 #define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
@@ -109,7 +114,7 @@ DEF_INTERNAL_FLT_FN (LOG2, ECF_CONST, lo
 DEF_INTERNAL_FLT_FN (LOGB, ECF_CONST, logb, unary)
 DEF_INTERNAL_FLT_FN (SIGNIFICAND, ECF_CONST, significand, unary)
 DEF_INTERNAL_FLT_FN (SIN, ECF_CONST, sin, unary)
-DEF_INTERNAL_FLT_FN (SQRT, ECF_CONST, sqrt, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (SQRT, ECF_CONST, sqrt, unary)
 DEF_INTERNAL_FLT_FN (TAN, ECF_CONST, tan, unary)
 
 /* FP rounding.  */
@@ -122,13 +127,13 @@ DEF_INTERNAL_FLT_FN (TRUNC, ECF_CONST, b
 
 /* Binary math functions.  */
 DEF_INTERNAL_FLT_FN (ATAN2, ECF_CONST, atan2, binary)
-DEF_INTERNAL_FLT_FN (COPYSIGN, ECF_CONST, copysign, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (COPYSIGN, ECF_CONST, copysign, binary)
 DEF_INTERNAL_FLT_FN (FMOD, ECF_CONST, fmod, binary)
 DEF_INTERNAL_FLT_FN (POW, ECF_CONST, pow, binary)
 DEF_INTERNAL_FLT_FN (REMAINDER, ECF_CONST, remainder, binary)
 DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
-DEF_INTERNAL_FLT_FN (FMIN, ECF_CONST, fmin, binary)
-DEF_INTERNAL_FLT_FN (FMAX, ECF_CONST, fmax, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
 DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
 
 /* FP scales.  */
@@ -230,5 +235,6 @@ DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF
 
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
+#undef DEF_INTERNAL_FLT_FLOATN_FN
 #undef DEF_INTERNAL_OPTAB_FN
 #undef DEF_INTERNAL_FN
Index: gcc/gencfn-macros.c
===================================================================
--- gcc/gencfn-macros.c (.../trunk/gcc) (revision 253857)
+++ gcc/gencfn-macros.c (.../branches/ibm/ieee/gcc)     (working copy)
@@ -98,11 +98,12 @@ is_group (string_set *builtins, const ch
 
 static void
 print_case_cfn (const char *name, bool internal_p,
-               const char *const *suffixes)
+               const char *const *suffixes, bool floatn_p)
 {
-  printf ("#define CASE_CFN_%s", name);
+  const char *floatn = (floatn_p) ? "_FN" : "";
+  printf ("#define CASE_CFN_%s%s", name, floatn);
   if (internal_p)
-    printf (" \\\n  case CFN_%s", name);
+    printf (" \\\n  case CFN_%s%s", name, floatn);
   for (unsigned int i = 0; suffixes[i]; ++i)
     printf ("%s \\\n  case CFN_BUILT_IN_%s%s",
            internal_p || i > 0 ? ":" : "", name, suffixes[i]);
@@ -115,9 +116,10 @@ print_case_cfn (const char *name, bool i
 
 static void
 print_define_operator_list (const char *name, bool internal_p,
-                           const char *const *suffixes)
+                           const char *const *suffixes, bool floatn_p)
 {
-  printf ("(define_operator_list %s\n", name);
+  const char *floatn = (floatn_p) ? "_FN" : "";
+  printf ("(define_operator_list %s%s\n", name, floatn);
   for (unsigned int i = 0; suffixes[i]; ++i)
     printf ("    BUILT_IN_%s%s\n", name, suffixes[i]);
   if (internal_p)
@@ -148,6 +150,8 @@ const char *const internal_fn_int_names[
 };
 
 static const char *const flt_suffixes[] = { "F", "", "L", NULL };
+static const char *const fltfn_suffixes[] = { "F16", "F32", "F64", "F128",
+                                             "F32X", "F64X", "F128X", NULL };
 static const char *const int_suffixes[] = { "", "L", "LL", "IMAX", NULL };
 
 static const char *const *const suffix_lists[] = {
@@ -200,15 +204,33 @@ main (int argc, char **argv)
        {
          const char *root = name + 9;
          for (unsigned int j = 0; suffix_lists[j]; ++j)
-           if (is_group (&builtins, root, suffix_lists[j]))
-             {
-               bool internal_p = internal_fns.contains (root);
-               if (type == 'c')
-                 print_case_cfn (root, internal_p, suffix_lists[j]);
-               else
-                 print_define_operator_list (root, internal_p,
-                                             suffix_lists[j]);
-             }
+           {
+             const char *const *const suffix = suffix_lists[j];
+
+             if (is_group (&builtins, root, suffix))
+               {
+                 bool internal_p = internal_fns.contains (root);
+
+                 if (type == 'c')
+                   print_case_cfn (root, internal_p, suffix, false);
+                 else
+                   print_define_operator_list (root, internal_p,
+                                               suffix, false);
+
+                     /* Support the _Float<N> and _Float<N>X math functions if
+                        they exist.  We put these out as a separate CFN macro,
+                        so code can add support or not as needed.  */
+                 if (suffix == flt_suffixes
+                     && is_group (&builtins, root, fltfn_suffixes))
+                   {
+                     if (type == 'c')
+                       print_case_cfn (root, false, fltfn_suffixes, true);
+                     else
+                       print_define_operator_list (root, false, fltfn_suffixes,
+                                                   true);
+                   }
+               }
+           }
        }
     }
 
Index: gcc/tree-ssa-reassoc.c
===================================================================
--- gcc/tree-ssa-reassoc.c      (.../trunk/gcc) (revision 253857)
+++ gcc/tree-ssa-reassoc.c      (.../branches/ibm/ieee/gcc)     (working copy)
@@ -5625,6 +5625,7 @@ attempt_builtin_copysign (vec<operand_en
              switch (gimple_call_combined_fn (old_call))
                {
                CASE_CFN_COPYSIGN:
+               CASE_CFN_COPYSIGN_FN:
                  arg0 = gimple_call_arg (old_call, 0);
                  arg1 = gimple_call_arg (old_call, 1);
                  /* The first argument of copysign must be a constant,
Index: gcc/gimple-ssa-backprop.c
===================================================================
--- gcc/gimple-ssa-backprop.c   (.../trunk/gcc) (revision 253857)
+++ gcc/gimple-ssa-backprop.c   (.../branches/ibm/ieee/gcc)     (working copy)
@@ -354,6 +354,7 @@ backprop::process_builtin_call_use (gcal
       break;
 
     CASE_CFN_COPYSIGN:
+    CASE_CFN_COPYSIGN_FN:
       /* The sign of the first input is ignored.  */
       if (rhs != gimple_call_arg (call, 1))
        info->flags.ignore_sign = true;
@@ -373,6 +374,7 @@ backprop::process_builtin_call_use (gcal
       }
 
     CASE_CFN_FMA:
+    CASE_CFN_FMA_FN:
       /* In X * X + Y, where Y is distinct from X, the sign of X doesn't
         matter.  */
       if (gimple_call_arg (call, 0) == rhs
@@ -689,6 +691,7 @@ strip_sign_op_1 (tree rhs)
     switch (gimple_call_combined_fn (call))
       {
       CASE_CFN_COPYSIGN:
+      CASE_CFN_COPYSIGN_FN:
        return gimple_call_arg (call, 0);
 
       default:
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def        (.../trunk/gcc) (revision 253857)
+++ gcc/config/rs6000/rs6000-builtin.def        (.../branches/ibm/ieee/gcc)     (working copy)
@@ -2374,17 +2374,13 @@ BU_FLOAT128_1 (FABSQ,           "fabsq",       CO
 BU_FLOAT128_2 (COPYSIGNQ,      "copysignq",   CONST, copysignkf3)
 
 /* 1, 2, and 3 argument IEEE 128-bit floating point functions that require ISA
-   3.0 hardware.  These functions use the new 'f128' suffix.  Eventually the
-   standard functions should be folded into the common built-in function
-   handling. */
-BU_FLOAT128_1_HW (SQRTF128,     "sqrtf128",               CONST, sqrtkf2)
+   3.0 hardware.  These functions use the new 'f128' suffix.  */
 BU_FLOAT128_1_HW (SQRTF128_ODD,         "sqrtf128_round_to_odd",  CONST, sqrtkf2_odd)
 BU_FLOAT128_1_HW (TRUNCF128_ODD, "truncf128_round_to_odd", CONST, trunckfdf2_odd)
 BU_FLOAT128_2_HW (ADDF128_ODD,  "addf128_round_to_odd",   CONST, addkf3_odd)
 BU_FLOAT128_2_HW (SUBF128_ODD,  "subf128_round_to_odd",   CONST, subkf3_odd)
 BU_FLOAT128_2_HW (MULF128_ODD,  "mulf128_round_to_odd",   CONST, mulkf3_odd)
 BU_FLOAT128_2_HW (DIVF128_ODD,  "divf128_round_to_odd",   CONST, divkf3_odd)
-BU_FLOAT128_3_HW (FMAF128,      "fmaf128",                CONST, fmakf4_hw)
 BU_FLOAT128_3_HW (FMAF128_ODD,  "fmaf128_round_to_odd",   CONST, fmakf4_odd)
 
 /* 1 argument crypto functions.  */
Index: gcc/c-family/c-cppbuiltin.c
===================================================================
--- gcc/c-family/c-cppbuiltin.c (.../trunk/gcc) (revision 253857)
+++ gcc/c-family/c-cppbuiltin.c (.../branches/ibm/ieee/gcc)     (working copy)
@@ -82,6 +82,11 @@ mode_has_fma (machine_mode mode)
       return !!HAVE_fmadf4;
 #endif
 
+#ifdef HAVE_fmakf4     /* PowerPC if long double != __float128.  */
+    case E_KFmode:
+      return !!HAVE_fmakf4;
+#endif
+
 #ifdef HAVE_fmaxf4
     case E_XFmode:
       return !!HAVE_fmaxf4;
@@ -1119,7 +1124,7 @@ c_cpp_builtins (cpp_reader *pfile)
               floatn_nx_types[i].extended ? "X" : "");
       sprintf (csuffix, "F%d%s", floatn_nx_types[i].n,
               floatn_nx_types[i].extended ? "x" : "");
-      builtin_define_float_constants (prefix, csuffix, "%s", NULL,
+      builtin_define_float_constants (prefix, csuffix, "%s", csuffix,
                                      FLOATN_NX_TYPE_NODE (i));
     }
 
Index: gcc/c/c-decl.c
===================================================================
--- gcc/c/c-decl.c      (.../trunk/gcc) (revision 253857)
+++ gcc/c/c-decl.c      (.../branches/ibm/ieee/gcc)     (working copy)
@@ -3171,6 +3171,7 @@ header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_FDIM):
     CASE_FLT_FN (BUILT_IN_FLOOR):
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
     CASE_FLT_FN (BUILT_IN_FMAX):
     CASE_FLT_FN (BUILT_IN_FMIN):
     CASE_FLT_FN (BUILT_IN_FMOD):
Index: gcc/testsuite/gcc.target/powerpc/float128-fma2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma2.c    (.../trunk/gcc) (revision 253857)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma2.c    (.../branches/ibm/ieee/gcc)     (nonexistent)
@@ -1,9 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mpower9-vector -mno-float128-hardware -O2" } */
-
-__float128
-xfma (__float128 a, __float128 b, __float128 c)
-{
-  return __builtin_fmaf128 (a, b, c); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
-}
Index: gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c   (.../trunk/gcc) (revision 253857)
+++ gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c   (.../branches/ibm/ieee/gcc)     (nonexistent)
@@ -1,9 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mpower9-vector -mno-float128-hardware -O2" } */
-
-__float128
-xsqrt (__float128 a)
-{
-  return __builtin_sqrtf128 (a); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
-}
Index: gcc/testsuite/gcc.target/powerpc/float128-hw.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw.c      (.../trunk/gcc) (revision 253857)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw.c      (.../branches/ibm/ieee/gcc)     (working copy)
@@ -7,11 +7,20 @@ __float128 f128_sub (__float128 a, __flo
 __float128 f128_mul (__float128 a, __float128 b) { return a*b; }
 __float128 f128_div (__float128 a, __float128 b) { return a/b; }
 __float128 f128_fma (__float128 a, __float128 b, __float128 c) { return (a*b)+c; }
+__float128 f128_fms (__float128 a, __float128 b, __float128 c) { return (a*b)-c; }
+__float128 f128_nfma (__float128 a, __float128 b, __float128 c) { return -((a*b)+c); }
+__float128 f128_nfms (__float128 a, __float128 b, __float128 c) { return -((a*b)-c); }
 long f128_cmove (__float128 a, __float128 b, long c, long d) { return (a == b) ? c : d; }
+float f128_to_flt (__float128 a) { return (float)a; }
+
+/* { dg-final { scan-assembler {\mxsaddqp\M}   } } */
+/* { dg-final { scan-assembler {\mxssubqp\M}   } } */
+/* { dg-final { scan-assembler {\mxsmulqp\M}   } } */
+/* { dg-final { scan-assembler {\mxsdivqp\M}   } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M}  } } */
+/* { dg-final { scan-assembler {\mxsmsubqp\M}  } } */
+/* { dg-final { scan-assembler {\mxsnmaddqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler {\mxscmpuqp\M}  } } */
+/* { dg-final { scan-assembler {\mxscvqpdpo\M} } } */
 
-/* { dg-final { scan-assembler "xsaddqp"  } } */
-/* { dg-final { scan-assembler "xssubqp"  } } */
-/* { dg-final { scan-assembler "xsmulqp"  } } */
-/* { dg-final { scan-assembler "xsdivqp"  } } */
-/* { dg-final { scan-assembler "xsmaddqp" } } */
-/* { dg-final { scan-assembler "xscmpuqp" } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw2.c     (.../trunk/gcc) (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw2.c     (.../branches/ibm/ieee/gcc)     (revision 253908)
@@ -0,0 +1,56 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math" } */
+
+/* Test to make sure the compiler handles the standard _Float128 functions that
+   have hardware support in ISA 3.0/power9.  */
+
+#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
+
+extern _Float128 copysignf128 (_Float128, _Float128);
+extern _Float128 sqrtf128 (_Float128);
+extern _Float128 fmaf128 (_Float128, _Float128, _Float128);
+
+_Float128
+do_copysign (_Float128 a, _Float128 b)
+{
+  return copysignf128 (a, b);
+}
+
+_Float128
+do_sqrt (_Float128 a)
+{
+  return sqrtf128 (a);
+}
+
+_Float128
+do_fma (_Float128 a, _Float128 b, _Float128 c)
+{
+  return fmaf128 (a, b, c);
+}
+
+_Float128
+do_fms (_Float128 a, _Float128 b, _Float128 c)
+{
+  return fmaf128 (a, b, -c);
+}
+
+_Float128
+do_nfma (_Float128 a, _Float128 b, _Float128 c)
+{
+  return -fmaf128 (a, b, c);
+}
+
+_Float128
+do_nfms (_Float128 a, _Float128 b, _Float128 c)
+{
+  return -fmaf128 (a, b, -c);
+}
+
+/* { dg-final { scan-assembler     {\mxscpsgnqp\M} } } */
+/* { dg-final { scan-assembler     {\mxssqrtqp\M}  } } */
+/* { dg-final { scan-assembler     {\mxsmaddqp\M}  } } */
+/* { dg-final { scan-assembler     {\mxsmsubqp\M}  } } */
+/* { dg-final { scan-assembler     {\mxsnmaddqp\M} } } */
+/* { dg-final { scan-assembler     {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler-not {\mbl\M}        } } */
