On Tue, 05 Jun 2012 22:50:00 -0700 John McCall <[email protected]> wrote:
> On Jun 5, 2012, at 9:24 PM, Hal Finkel wrote:
> > On Tue, 05 Jun 2012 20:12:00 -0700 John McCall <[email protected]>
> > wrote:
> >> On Jun 5, 2012, at 3:35 PM, John McCall wrote:
> >>> On Jun 5, 2012, at 3:04 PM, Chandler Carruth wrote:
> >>>> On Tue, Jun 5, 2012 at 2:58 PM, Stephen Canon <[email protected]>
> >>>> wrote: On Jun 5, 2012, at 2:45 PM, John McCall
> >>>> <[email protected]> wrote:
> >>>>
> >>>>> On Jun 5, 2012, at 2:15 PM, Stephen Canon wrote:
> >>>>>
> >>>>>> On Jun 5, 2012, at 1:51 PM, Chandler Carruth
> >>>>>> <[email protected]> wrote:
> >>>>>>
> >>>>>>> That said, FP_CONTRACT doesn't apply to C++, and it's quite
> >>>>>>> unlikely to become a serious part of the standard given these
> >>>>>>> (among other) limitations. Curiously, in C++11, it may not be
> >>>>>>> needed to get the benefit of fused multiply-add:
> >>>>>>
> >>>>>> Perversely, a strict reading of C++11 seems (to me) to not
> >>>>>> allow FMA formation in C++ at all:
> >>>>>>
> >>>>>> • The values of the floating operands and the results of
> >>>>>> floating expressions may be represented in greater precision
> >>>>>> and range than that required by the type; the types are not
> >>>>>> changed thereby.
> >>>>>>
> >>>>>> FMA formation does not increase the precision or range of the
> >>>>>> result (it may or may not have smaller error, but it is not
> >>>>>> more precise), so this paragraph doesn't actually license FMA
> >>>>>> formation. I can't find anywhere else in the standard that
> >>>>>> could (though I am *far* less familiar with C++11 than C11, so
> >>>>>> I may not be looking in the right places).
> >>>>>
> >>>>> Correct me if I'm wrong, but I thought that an FMA could be
> >>>>> formalized as representing the result of the multiply with
> >>>>> greater precision than the operation's type actually provides,
> >>>>> and then using that as the operand of the addition.
> >>>>> It's understood that that can change the result of the addition
> >>>>> in ways that aren't just "more precise". Similarly, performing
> >>>>> 'float' operations using x87 long doubles can change the result
> >>>>> of the operation, but I'm pretty sure that the committees
> >>>>> explicitly had hardware limitations like that in mind when they
> >>>>> added this language.
> >>>>
> >>>> That's an interesting point. I'm inclined to agree with this
> >>>> interpretation (there are some minor details about whether or not
> >>>> 0*INF + NAN raises the invalid flag, but let's agree to ignore
> >>>> that).
> >>>>
> >>>> I'm not familiar enough with the language used in the C++ spec to
> >>>> know whether this makes C++ numerics equivalent to STDC
> >>>> FP_CONTRACT on, or equivalent to "allow greedy FMA formation".
> >>>> Anyone?
> >>>>
> >>>> If you agree w/ John's interpretation, and don't consider the
> >>>> flag case you mention, AFAICT, this allows greedy FMA formation,
> >>>> unless the intermediate values are round-tripped through a cast
> >>>> construct such as I described.
> >>>
> >>> I'm still not sure why you think this restriction *only* happens
> >>> when round-tripping through casts, rather than through anything
> >>> which is not an operand or result, e.g. an object.
> >>>
> >>> Remember that the builtin operators are privileged in C++; they
> >>> are not semantically like calls, even in the cases where they're
> >>> selected by overload resolution.
> >>>
> >>> I agree that my interpretation implies that a type which merely
> >>> wraps a double nonetheless forces stricter behavior. I also agree
> >>> that this sucks.
> >>
> >> To continue this thought, the most straightforward way to represent
> >> this in IR would be to (1) add a "contractable" bit to the LLVM
> >> operation (possibly as metadata) and (2) provide an explicit "value
> >> barrier" instruction (a unary operator preventing contraction
> >> "across" it).
> >> We would introduce the barrier in the appropriate
> >> circumstances, i.e. an explicit cast, a load from a variable, or
> >> whatever else we conclude requires these semantics. It would then
> >> be straightforward to produce FMAs from this, as well as just
> >> generally avoiding rounding when doing sequences of illegal FP
> >> ops. -ffast-math would imply never inserting the barriers.
> >>
> >> The disadvantages I see are:
> >> - there might be lots of peepholes and isel patterns that would
> >> need to be taught to look through a value barrier
> >> - the polarity of barriers is wrong, because code that lacks
> >> barriers is implicitly opting in to things, so e.g. LTO could pick
> >> a weak_odr function from an old tunit that lacks a barrier which a
> >> fresh compile would insist on.
> >
> > I don't like the barrier approach because it implies that the FE
> > must serialize each C expression as a distinct group of LLVM
> > instructions. While it may be true that this currently happens in
> > practice, I don't think we want to force it to be this way.
>
> I think you misunderstand.

Indeed I did misunderstand. Thank you for clarifying, and I agree, your
proposal makes sense.

 -Hal

> By a "barrier", I mean an instruction like this:
>   %1 = call float @llvm.fp_contract_barrier.float(float %0) readnone nounwind
> which states that %1 must be a representable float value and
> therefore blocks FP contraction "across" the intrinsic, in the sense
> that something using %1 can't be fused with the operation producing
> %0. I do not mean something like a memory barrier that divides things
> based on whether the instruction comes before or after the barrier;
> that's clearly not workable.
>
> John.

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
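
For readers following the thread, here is a minimal C++ sketch of the effect under discussion: fusing keeps the product unrounded, so the sum can differ from the one computed with a separately rounded product. The helper names (value_barrier, mul_then_add, fused, self_check) are hypothetical and chosen only for this illustration. In particular, value_barrier is an assumption that merely mimics "this value must be a representable double" with a volatile round trip; it is not the intrinsic proposed above and says nothing about how that intrinsic would be implemented.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the proposed value barrier: forces x to be
// rounded to a representable double before anything else can use it.
// Assumption: the volatile round trip is for illustration only.
double value_barrier(double x) {
    volatile double rounded = x;
    return rounded;
}

// Multiply, round the product to double, then add.
double mul_then_add(double a, double b, double c) {
    return value_barrier(a * b) + c;
}

// One fused operation: the exact product feeds the addition and only the
// final result is rounded.
double fused(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Inputs chosen so the exact product, 1 - 2^-54, is a rounding tie that
// rounds to 1.0 in double, while the fused form keeps the 2^-54 term.
void self_check() {
    const double eps = std::ldexp(1.0, -27);
    const double a = 1.0 + eps;
    const double b = 1.0 - eps;
    assert(mul_then_add(a, b, -1.0) == 0.0);
    assert(fused(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

Calling self_check() from a main function exercises both paths. The two results differ only because the product is rounded in one and not in the other, which is the distinction the "contractable" bit and the barrier are meant to make explicit in IR.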
