On 11/19/2013 11:40 PM, Tom Musta wrote: > + /* NOTE: in order to get accurate results, we must first round back */ > \ > + /* to single precision and use the fused multiply add routine */ > \ > + /* for 32-bit floats. */ > \ > + float_status tstat = env->fp_status; > \ > + float32 a32 = float64_to_float32(xa.f64[0], &tstat); > \ > + float32 b32 = float64_to_float32(b->f64[0], &tstat); > \ > + float32 c32 = float64_to_float32(c->f64[0], &tstat); > \ > + > \ > + set_float_exception_flags(0, &tstat); > \ > + float32 t32 = float32_muladd(a32, b32, c32, maddflgs, &tstat); > \
While this will produce correct results for the "normal" use case of correctly rounded single-precision inputs, the spec says # Except for xsresp or xsrsqrtesp, any double-precision value can # be used in single-precision scalar arithmetic operations when # OE=0 and UE=0. Thus a more correct implementation would use the full double-precision inputs while also correctly rounding. I pointed you at the glibc implementation to show how that can be done using round-to-zero plus examining the inexact bit. float_status tstat = env->fp_status; set_float_exception_flags(0, &tstat); if (tstat.float_rounding_mode == float_round_nearest_even) { /* Avoid double rounding errors by rounding the intermediate result to odd. See http://hal.inria.fr/docs/00/08/04/27/PDF/odd-rounding.pdf */ set_float_rounding_mode(float_round_to_zero, &tstat); res = float64_muladd(...); res |= (get_float_exception_flags(&tstat) & float_flag_inexact) != 0; } else { res = float64_muladd(...); } res = helper_frsp(env, res); apply tstat exceptions; r~