On Tue, Mar 10, 2015 at 6:55 PM, Song, Ruiling <ruiling.s...@intel.com> wrote:
>> I'm not sure that it matters for this patch, but do we know if Gen's MAD
>> instruction is a fused-multiply-add? That is, does it not do an intermediate
>> rounding step after the multiply?
>
> I also have such kind of concern, so I did a simple test:
> on cpu side, I use "reference = (double)x1*(double)x2 + (double)x3;"

Some recent CPUs have FMA instructions. Make sure you know whether your
code is being compiled to use FMA or not.

> And on gpu side, I use "result = mad(x1, x2, x3);"
> Then compare the result and reference, the bits are exactly the same, so I
> think gen's MAD does not do intermediate rounding after multiply.

For many pairs of operands, the intermediate rounding step does not change
the result. You need to make sure you're testing a pair of numbers whose
result is actually affected by it.

I wrote a small program to find cases where fmaf(x, y, z) != x*y+z
(attached). Compile with -std=c99 -O2 -march=native -lm. I'm testing on
Haswell, which has FMA.

It shows that

fmaf(1, 0.333333, 0.666667) is 1 (0x1.000002p+0), but 1 * 0.333333 + 0.666667 is 1 (0x1p+0)

Please test that Gen's MAD instruction produces what fmaf() produces for
1.0 * 0.333333 + 0.666667.

Assuming glibc's fmaf() is correct... I'm again surprised by floating-point
numbers. :)

#include <stdio.h>
#include <math.h>

int main(void)
{
    const float y = 1.0f / 3.0f;
    const float z = 2.0f / 3.0f;

    /* Walk x upward one representable float at a time. Stepping toward
       10.0f (the loop bound) avoids stalling once x reaches 2.0f. */
    for (float x = 1.0f; x < 10.0f; x = nextafterf(x, 10.0f)) {
        float fma_result = fmaf(x, y, z);
        float opencoded_result = x * y + z;

        /* Stop at the first x where fused and open-coded results differ. */
        if (fma_result != opencoded_result) {
            printf("fmaf(%g, %g, %g) is %g (%a), but %g * %g + %g is %g (%a)\n",
                   x, y, z, fma_result, fma_result,
                   x, y, z, opencoded_result, opencoded_result);
            return -1;
        }
    }

    return 0;
}