> -----Original Message-----
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Song, Ruiling
> Sent: Wednesday, March 11, 2015 11:05 AM
> To: Matt Turner
> Cc: Luo, Xionghu; beignet@lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH 6/7] replace mad with llvm intrinsic.
>
> > -----Original Message-----
> > From: Matt Turner [mailto:matts...@gmail.com]
> > Sent: Wednesday, March 11, 2015 10:20 AM
> > To: Song, Ruiling
> > Cc: Luo, Xionghu; beignet@lists.freedesktop.org
> > Subject: Re: [Beignet] [PATCH 6/7] replace mad with llvm intrinsic.
> >
> > On Tue, Mar 10, 2015 at 6:55 PM, Song, Ruiling <ruiling.s...@intel.com> wrote:
> > >> I'm not sure that it matters for this patch, but do we know if
> > >> Gen's MAD instruction is a fused multiply-add? That is, does it not
> > >> do an intermediate rounding step after the multiply?
> > > I had the same concern, so I ran a simple test:
> > > on the CPU side, I use "reference = (double)x1*(double)x2 + (double)x3;"
> >
> > Some recent CPUs have FMA instructions. You should make sure you know
> > whether your code is compiled to use FMA or not.
> >
> > > And on the GPU side, I use "result = mad(x1, x2, x3);"
> > > Then I compare the result against the reference; the bits are exactly
> > > the same, so I think Gen's MAD does not do intermediate rounding after
> > > the multiply.
> >
> > The intermediate rounding step will not affect many pairs of numbers
> > that are multiplied together. You need to make sure you're testing a
> > pair of numbers that is actually affected by the intermediate rounding step.
> >
> > I wrote a small program to find cases where fmaf(x, y, z) != x*y + z
> > (attached). Compile with -std=c99 -O2 -march=native -lm. I'm testing
> > on Haswell, which has FMA.
> >
> > It shows that
> >
> > fmaf(1, 0.333333, 0.666667) is 1 (0x1.000002p+0), but
> > 1 * 0.333333 + 0.666667 is 1 (0x1p+0)
> >
> > Please test that Gen's MAD instruction produces what fmaf() produces
> > for 1.0 * 0.333333 + 0.666667.
>
> I tried these numbers. The binary representation of 0.333333 is 0x1.55553ep-2.
> The binary representation of 0.666667 is 0x1.5555p-1. I manually summed them up.
Sorry, typo here: it should be "the binary representation of 0.666667 is 0x1.55556p-1".

> The mantissa of the sum is 24 one-bits (not counting the hidden one). As
> single-precision floating point only has 23 mantissa bits, I am not sure how
> to round it here; if it rounds up, the result would be 0x1p0. I need to check
> the IEEE 754 spec. But it cannot generate 0x1.000002p+0.
> I think you'd better not print the output using %g; %g does not show the
> exact binary representation. I always prefer the %a representation.
>
> > Assuming glibc's fmaf() is correct... I'm again surprised by
> > floating-point numbers. :)
>
> _______________________________________________
> Beignet mailing list
> Beignet@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet