Some general comments, following what I said on libc-alpha:

1. Can you confirm that the ABI being used for 64-bit, for _Float16 and _Complex _Float16 argument passing and return, follows the current x86_64 ABI document?

2. Can you confirm that if you build with this instruction set extension enabled by default, and run GCC tests for a corresponding (emulated?) processor, all the existing float16 tests in the testsuite are enabled and PASS (both compilation and execution) (both 64-bit and 32-bit testing)?

3. There's an active 32-bit ABI mailing list (ia32-...@googlegroups.com). If you want to support _Float16 in the 32-bit case, please work with it to get the corresponding ABI documented (using only memory and general-purpose registers seems like a good idea, so that the ABI can be supported for the base architecture without depending on SSE registers being present). In the absence of 32-bit ABI support it might be better to disable the HFmode support for 32-bit.

4. Support for _Float16 really ought not to depend on whether a particular instruction set extension is present, just like with other floating-point types; it makes sense, as an API, for all x86 processors (and like many APIs, it will be faster on some processors than on others). More specific points here are:

(a) Basic arithmetic (+-*/) can be done by converting to SFmode, doing arithmetic there and converting back to HFmode; the results of doing so will be correctly rounded. Indeed, I think optabs.c handles that automatically when operations are available on a wider mode but not on the desired mode (but you'd need to check carefully that all the expected conversions do occur).

(b) Conversions to/from all other floating-point modes will always be needed, whether in hardware or in software.

(c) In the F16C (Ivy Bridge and later) case, where you have hardware conversions to/from float (only), it's fine to convert to double (or long double) via float.
(On efficiency grounds, widening from HFmode to TFmode should be a pure software operation; that should be faster than having an intermediate conversion to SFmode when the SFmode-to-TFmode conversion is itself a software operation.)

(d) In the F16C case (where there are hardware conversions only from SFmode, not from wider modes), conversion *from* DFmode (or XFmode or TFmode) to HFmode should be a software operation, to avoid double rounding; an intermediate conversion to SFmode would be incorrect.

(e) It's OK for conversions to/from integer modes to go via SFmode (although I don't know if that's efficient or not). Any case where a conversion from integer to SFmode is inexact would overflow HFmode (such integers exceed 2^24 in magnitude, far above the largest finite HFmode value, 65504), so there are no double rounding issues.

(f) In the F16C case, it seems the hardware instructions only work on vectors, not scalars, so care would need to be taken to use them for scalar conversions only if the other elements of the vector register are known to be safe to convert without raising any exceptions (e.g. all zero bits, or -fno-trapping-math in effect).

(g) If concerned about efficiency of intermediate truncations on processors without hardware _Float16 arithmetic, look at aarch64_excess_precision; you have the option of using excess precision for _Float16 by default, though that only really helps for C given the lack of excess precision support in the C++ front end. (Enabling this can cause trouble for code that only expects C99/C11 values of FLT_EVAL_METHOD, however; see the -fpermitted-flt-eval-methods option for more details.)

5. Suppose that in some cases you do disable _Float16 support (whether that's just for 32-bit until the ABI has been defined, or also in the absence of instruction set support despite my comments above). Then the way you do that in this patch series, enabling the type in ix86_scalar_mode_supported_p and ix86_libgcc_floating_mode_supported_p and giving an error later in ix86_expand_move, is a bad idea.
Errors in expanders are generally problematic (they don't have good location information available). But apart from that, ordinary user code should be able to tell whether _Float16 is supported by testing whether e.g. __FLT16_MANT_DIG__ is defined (like float.h does), or by including float.h (with __STDC_WANT_IEC_60559_TYPES_EXT__ defined) and then testing whether one of the FLT16_* macros is defined, or in a configure test by just declaring something using the _Float16 type.

Patch 1 changes check_effective_target_float16 to work around your technique for disabling _Float16 in ix86_expand_move, but it should be considered a stable user API that any of the above methods can be used in user code to check for _Float16 support. User code shouldn't need to know the implementation detail that it has to do something that goes through ix86_expand_move to see whether _Float16 is supported or not. (And user code shouldn't need to use a configure test at all for this: testing FLT16_* after including float.h should work as a fully portable way of testing it, since that uses only ISO C facilities.)

So enable HFmode in ix86_scalar_mode_supported_p and ix86_libgcc_floating_mode_supported_p exactly when all operations are supported in the rest of the compiler - don't enable it there and then disable it elsewhere, because that will break user code testing for whether _Float16 is available using FLT16_* macros.

-- 
Joseph S. Myers
jos...@codesourcery.com
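
To illustrate the double rounding concern in point 4(d), here is a minimal sketch in C that works on integer significands only, so it needs neither _Float16 nor any hardware support. The names round_significand, df_to_hf_direct and df_to_hf_via_sf are hypothetical, and the model is deliberately simplified: it ignores exponents, subnormals, and a carry out of the intermediate rounding. It only shows that rounding 53 -> 24 -> 11 bits can differ from rounding 53 -> 11 bits directly.

```c
#include <assert.h>
#include <stdint.h>

/* Significand widths (including the implicit leading bit) of the
   IEEE binary64, binary32 and binary16 formats. */
enum { DF_BITS = 53, SF_BITS = 24, HF_BITS = 11 };

/* Round an integer significand of from_bits bits down to to_bits bits
   using round-to-nearest, ties-to-even.  Simplification: the result may
   equal 1 << to_bits on carry; a real conversion would renormalize. */
static uint64_t round_significand(uint64_t sig, int from_bits, int to_bits)
{
    assert(to_bits > 0 && from_bits > to_bits && from_bits <= 64);
    int shift = from_bits - to_bits;
    uint64_t q = sig >> shift;                          /* kept bits */
    uint64_t rem = sig & ((UINT64_C(1) << shift) - 1);  /* discarded bits */
    uint64_t half = UINT64_C(1) << (shift - 1);         /* exact halfway */
    if (rem > half || (rem == half && (q & 1)))
        q++;
    return q;
}

/* Correct: a single rounding from the DFmode width to the HFmode width. */
static uint64_t df_to_hf_direct(uint64_t sig53)
{
    return round_significand(sig53, DF_BITS, HF_BITS);
}

/* Incorrect in general: round to the SFmode width first, then to HFmode. */
static uint64_t df_to_hf_via_sf(uint64_t sig53)
{
    uint64_t sig24 = round_significand(sig53, DF_BITS, SF_BITS);
    return round_significand(sig24, SF_BITS, HF_BITS);
}
```

For the value 1 + 2^-11 + 2^-30 (significand 2^52 + 2^41 + 2^22), the direct conversion rounds up because the value lies just above the HFmode halfway point. The conversion via the SFmode width first discards the 2^-30 term, landing exactly on the halfway point, and the tie then rounds to even, i.e. down.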
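
As a sketch of the float.h-based check described in point 5, user code could look roughly like the following. The names half_t, HAVE_FLOAT16, have_float16 and half_storage_size are hypothetical, and the fallback to float is just one possible choice for this illustration. Note that __STDC_WANT_IEC_60559_TYPES_EXT__ has to be defined before the first inclusion of float.h for the FLT16_* macros to appear.

```c
/* Must come before the first #include of <float.h>. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <assert.h>
#include <float.h>
#include <stddef.h>

#ifdef FLT16_MANT_DIG
/* The implementation advertises _Float16, so it is safe to use it. */
typedef _Float16 half_t;
#define HAVE_FLOAT16 1
#else
/* No _Float16 advertised: fall back to float (an assumption of this
   sketch, not something the compiler mandates). */
typedef float half_t;
#define HAVE_FLOAT16 0
#endif

/* Report whether the FLT16_* macros were visible at compile time. */
static int have_float16(void)
{
    return HAVE_FLOAT16;
}

/* Storage size of the selected type; never larger than float here. */
static size_t half_storage_size(void)
{
    assert(sizeof(half_t) <= sizeof(float));
    return sizeof(half_t);
}
```

If the type is enabled in the mode-supported hooks but rejected later in the expander, code like this would select _Float16 and then fail to compile, which is the breakage described above.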