On Wed, Jul 7, 2021 at 2:11 AM Joseph Myers <jos...@codesourcery.com> wrote:
>
> On Tue, 6 Jul 2021, Hongtao Liu via Gcc-patches wrote:
>
> > There may be inconsistent behavior between soft-fp and avx512fp16
> > instructions if we emulate _Float16 w/ float.
> > i.e.
> > 1) for a + b - c where b and c are variables with the same big value,
> > and a + b is NaN at _Float16 but a real value at float, the avx512fp16
> > instruction will raise an exception but soft-fp won't (unless it's
> > rounded after every operation).
>
> There are at least two variants of emulation using float:
>
> (a) Using the excess precision support, as on AArch64, which means the C
> front end converts the _Float16 operations to float ones, with explicit
> narrowing on assignment (and conversion as if by assignment - argument
> passing and return, casts, etc.).  Excess precision indeed involves
> different semantics compared to doing each operation directly in the range
> and precision of _Float16.

Yes, setting excess_precision_type to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 would round after each operation.

> (b) Letting the expand/optabs code generate operations in a wider mode.
> My understanding is that the result should get converted back to the
> narrower mode after each operation (by the expand/optabs code /
> convert_move called by it generating such a conversion), meaning (for
> basic arithmetic operations) that the semantics end up the same as if the
> operation had been done directly on _Float16 (but with more truncation
> operations occurring than would be the case with excess precision support
> used).

Yes, just with different behavior related to exceptions.
> > 2) for a / b where b is a denormal value, AVX512FP16 won't flush it to
> > zero even w/ -Ofast, but when it's extended to float and divss is used,
> > it will be flushed to zero and raise an exception when compiling w/
> > -Ofast.
>
> I don't think that's a concern; flush to zero is well outside the scope of
> standards defining _Float16 semantics.

Ok.

> > So the key point is that soft-fp and avx512fp16 instructions may
> > not behave the same on exceptions; is this acceptable?
>
> As far as I understand it, all cases within the standards will behave as
> expected for exceptions, whether pure software floating-point is used,
> pure hardware _Float16 arithmetic or one of the forms of emulation listed
> above.  (Where "as expected" itself depends on the value of
> FLT_EVAL_METHOD, i.e. whether excess precision is used for _Float16.)
> Flush to zero and trapping exceptions are outside the scope of the
> standards.  Since trapping exceptions is outside the scope of the
> standards, so is anything that distinguishes whether an arithmetic
> operation raises the same exception more than once or the order in which
> it raises different exceptions (e.g. the possibility of "inexact" being
> raised more than once, both by arithmetic on float and by narrowing from
> float to _Float16).

Setting excess_precision_type to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16, so that rounding happens after each operation, keeps the semantics right. And I'll document the behavior difference between soft-fp and AVX512FP16 instructions for exceptions.

> --
> Joseph S. Myers
> jos...@codesourcery.com
--
BR,
Hongtao