rjmccall added a comment.

In D113107#3137921 <https://reviews.llvm.org/D113107#3137921>, @zahiraam wrote:

> In D113107#3136464 <https://reviews.llvm.org/D113107#3136464>, @rjmccall 
> wrote:
>
>> Does GCC actually change the formal types of expressions to `float`, or is 
>> it doing this "no intermediate casts thing" as some sort of 
>> fp_contract-style custom emission of trees of expressions that involve 
>> `_Float16`?
>>
>> In any case, doing operation-by-operation emulation seems like the right 
>> first approach rather than starting by doing the less conformant thing.
>
> I have created another patch https://reviews.llvm.org/D114099 that does the 
> first step.
>
> Not sure what you mean by "no intermediate casts thing".

I think we keep dancing around this in this review, so let me go back and start 
from the basics.  There are four approaches I know of for evaluating a 
homogeneous `_Float16` expression like `a + b + c`:

1. You can perform each operation with normal `_Float16` semantics.  Ideally, 
you would have hardware support for this.  If that isn't available, you can 
emulate the operations in software.  It happens to be true that, for the basic 
operations (`+`, `-`, `*`, `/`, `sqrt`) on `_Float16`, this emulation can just 
involve converting to e.g. `float`, doing the operation, and immediately 
converting back, because `float` carries enough extra precision that the second 
rounding never changes the result.  The core property of this approach is that 
there are no detectable differences from hardware support.

2. As a slight twist on approach #1, you can ignore the differences between 
native `_Float16` and emulation with `float`; instead, you just always do 
arithmetic in `float`.  This can change the result in some cases; e.g. Steve 
Canon tells me that FMAs on `float` avoid some rounding errors that FMAs on 
`_Float16` are subject to.

3. Approaches #1 and #2 require a lot of intermediate conversions when hardware 
isn't available.  In our example, `a + b + c` has to be calculated as 
`(_Float16) ((float) (_Float16) ((float) a + (float) b) + (float) c)`, where 
the result of one addition is converted down and then converted back again.  
You can avoid this by specifically recognizing this pattern and eliminating the 
round-trip conversions on intermediate results that were themselves computed in 
`float`, so that in our example, `a + b + c` would be calculated as 
`(_Float16) ((float) a + (float) b + (float) c)`.  This is actually allowed by 
the C standard by default 
as a form of FP contraction; in fact, I believe C's rules for FP contraction 
were originally designed for exactly this kind of situation, except that it was 
emulating `float` with `double` on hardware that only provided arithmetic on 
the latter.  Obviously, this can change results.

4. The traditional language rule for `__fp16` is superficially similar to 
Approach #3 in terms of generated code, but it has some subtle differences in 
terms of the language.  `__fp16` is immediately promoted to `float` whenever it 
appears as an arithmetic operand.  What this means is that operations are 
performed in `float` but then not formally converted back (unless they're used 
in a context which requires a value of the original type, which entails a 
normal conversion, just as if you assigned a `double` into a `float` variable).  
Thus, for example, `a + b + c` would actually have type `float`, not type 
`__fp16`.
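
To make the difference between approaches #1 and #3 concrete, here is a minimal 
sketch.  It deliberately uses `float` emulated via `double` as a stand-in for 
`_Float16` emulated via `float`, because `_Float16` is not available on every 
compiler and target; the function names are made up for illustration and are 
not part of any patch.

```c
/* Approach #1 analogue: every addition is performed in the wide type
   (double) and immediately rounded back to the narrow type (float). */
float sum_rounded_each_step(float a, float b, float c) {
    float ab = (float)((double)a + (double)b);   /* round after a + b */
    return (float)((double)ab + (double)c);      /* round after + c   */
}

/* Approach #3 analogue: the intermediate round trip is dropped, so the
   whole expression is evaluated in double and rounded only once. */
float sum_rounded_once(float a, float b, float c) {
    return (float)((double)a + (double)b + (double)c);
}
```

With `a = 1.0f` and `b = c = 0x1p-24f` (half an ulp of `1.0f`), the first 
function returns `1.0f`, since each partial sum is a tie that rounds back down 
to even, while the second returns `1.0f + 0x1p-23f`.  That is the sense in 
which eliminating the intermediate conversions can change results.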

What this patch is doing to `_Float16` is approach #4, basically treating it 
like `__fp16`.  That is non-conformant, and it doesn't seem to be what GCC 
does.  You can see that quite clearly here: https://godbolt.org/z/55oaajoax
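
The language-level difference between approaches #3 and #4 is the formal type 
of the expression, which C11 `_Generic` can report.  The sketch below is only 
an analogy under stated assumptions: it uses `short`, which is promoted to 
`int` as an arithmetic operand much as `__fp16` is promoted to `float`, and 
plain `float`, which keeps its type the way a conforming `_Float16` should.  
The macro and function names are hypothetical.

```c
/* Maps the static type of an expression to a readable name (C11). */
#define STATIC_TYPE_NAME(expr)                                   \
    _Generic((expr), short: "short", int: "int", float: "float", \
                     double: "double", default: "other")

/* Approach #4 analogue: operands are promoted, so the expression has
   the promoted type rather than the operand type. */
const char *promoted_expression_type(void) {
    short a = 1, b = 2, c = 3;
    return STATIC_TYPE_NAME(a + b + c);   /* "int", not "short" */
}

/* Approach #1-#3 analogue: the expression keeps the operand type, no
   matter how the arithmetic is carried out internally. */
const char *unpromoted_expression_type(void) {
    float a = 1.0f, b = 2.0f, c = 3.0f;
    return STATIC_TYPE_NAME(a + b + c);   /* "float" */
}
```

The same kind of check applied to a `_Float16` expression, on a compiler that 
supports the type, would distinguish a conforming implementation from one that 
treats `_Float16` like `__fp16`.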

What I believe GCC is doing (when not forbidden by `-fexcess-precision`) is 
approach #3: basically, FP contraction on expressions of `_Float16` type.

https://reviews.llvm.org/D113107