I know it's been some time since I worked on this but I was looking at this again:
On Tuesday, 26. July 2011 19:46:17 Matthias Bentrup wrote:
> The rounding mode affects all floating point operations, not just float to
> int conversion. And round-to-zero has a higher relative error than round to
> nearest.

I see. How does this rounding mode play into the default FPU control word
then? Have a look at the little test program I attached. It seems to me that
on UNIX, the rounding mode on the FPU and control word is always "round to
zero" by default. What I am trying to do is to have the same rounding mode on
all systems, so that the physics are the same on all systems.

> Oh. The original SnapVector assembly loaded the control word to set
> round-to-nearest, but that should have been the default mode anyway.

Well, it isn't. At least not on Linux.

> I somehow remembered that SnapVector was using the default rounding mode,
> but I didn't think that it explicitly loaded the control word.

That's what started the different behaviour between Windows and Linux in the
first place. And this is the reason why id wrote an assembly version of these
fp operations, so that it would work the same on all platforms.

> I have run both versions of snapvector 10 million times in a loop and
> measured them with CodeAnalyst:
> The maskmovdqu version: CPU Clocks 278678, IPC 0.06, DC miss rate 0,02
> The andps/orps version: CPU Clocks 61028, IPC 0.36, DC miss rate 0

Alright. I'll have a look at it.

> If SSE is available, we can use the cvtt* opcodes for fast truncating
> conversions and still keep the standard round-to-nearest mode for
> everything else. If we have no SSE we can chose the "correct" or "fast"
> float to int conversion, but I'd prefer to keep the default rounding mode
> for all the other operations.

That's the problem. There are no guarantees on what is standard and what is
not. At all.

-- 
Thilo Schulz
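To illustrate the cvtt* suggestion quoted above, here is a minimal sketch, not taken from the ioquake3 source. It assumes an x86 compiler that ships `<xmmintrin.h>`; the three function names are hypothetical helpers made up for this example. It contrasts a truncating conversion (cvttss2si), which ignores the MXCSR rounding mode, with a conversion that follows it (cvtss2si).

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86 only (assumption) */

/* Hypothetical helper: truncating conversion via cvttss2si.
 * The result does not depend on the MXCSR rounding mode. */
int float_to_int_truncate(float value)
{
    /* volatile keeps the compiler from folding the conversion at
     * compile time or moving it across a rounding mode change */
    volatile float input = value;
    return _mm_cvtt_ss2si(_mm_set_ss(input));
}

/* Hypothetical helper: conversion via cvtss2si, which rounds
 * according to whatever mode MXCSR currently holds. */
int float_to_int_current_mode(float value)
{
    volatile float input = value;
    return _mm_cvt_ss2si(_mm_set_ss(input));
}

/* Hypothetical helper: set the SSE rounding mode and return the
 * previous one. mode is one of the _MM_ROUND_* constants. */
unsigned int set_sse_rounding_mode(unsigned int mode)
{
    unsigned int previous = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE(mode);
    return previous;
}
```

With the mode set to `_MM_ROUND_NEAREST`, `float_to_int_current_mode(2.9f)` gives 3 while `float_to_int_truncate(2.9f)` gives 2. After `set_sse_rounding_mode(_MM_ROUND_TOWARD_ZERO)` both give 2. Note that this only touches MXCSR; code compiled to x87 instructions is governed by the x87 control word instead.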
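#include <stdio.h>
#include <fenv.h>

int main(void)
{
	float test1, test2;
	unsigned int oldcw;	/* receives the SSE MXCSR register */
	unsigned short fpucw;	/* receives the x87 FPU control word */

	printf("FE_TONEAREST: %d, FE_TOWARDZERO: %d\n", FE_TONEAREST, FE_TOWARDZERO);
	printf("%d\n", fegetround());

	/* Both instructions store to memory, so the operands must be
	 * declared as outputs ("=m"), not as inputs. */
	__asm__ volatile
	(
		"stmxcsr %0\n"
		"fnstcw %1\n"
		: "=m" (oldcw), "=m" (fpucw)
	);

	test1 = 2.389364562823984748499f;
	test2 = 38.395723739347593847474594f;
	test1 = test1 + test2;

	/* Prints MXCSR, the x87 control word, and the float sum. */
	printf("%X, %hX, %f\n", oldcw, fpucw, test1);
	return 0;
}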
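The test program prints the two control registers as raw hex, so it may help to decode the rounding-control field from each value. The following is a small sketch with hypothetical helper names. It relies on the x86 layout, where the rounding-control field occupies bits 10-11 of the x87 control word and bits 13-14 of MXCSR, with the same two-bit encoding in both.

```c
/* Two-bit rounding-control encoding shared by the x87 control
 * word and MXCSR on x86. */
enum rounding_control {
    RC_NEAREST     = 0,  /* round to nearest (ties to even) */
    RC_DOWN        = 1,  /* round toward negative infinity */
    RC_UP          = 2,  /* round toward positive infinity */
    RC_TOWARD_ZERO = 3   /* truncate */
};

/* Hypothetical helper: extract bits 10-11 of the x87 control word. */
unsigned int x87_rounding_control(unsigned short fpucw)
{
    return (fpucw >> 10) & 0x3;
}

/* Hypothetical helper: extract bits 13-14 of MXCSR. */
unsigned int sse_rounding_control(unsigned int mxcsr)
{
    return (mxcsr >> 13) & 0x3;
}

/* Hypothetical helper: human-readable name for the two-bit field. */
const char *rounding_control_name(unsigned int rc)
{
    switch (rc & 0x3) {
    case RC_NEAREST:     return "round to nearest";
    case RC_DOWN:        return "round down";
    case RC_UP:          return "round up";
    default:             return "round toward zero";
    }
}
```

For example, an x87 control word of 0x037F and an MXCSR value of 0x1F80 both decode to `RC_NEAREST`, whereas 0x0F7F and 0x7F80 decode to `RC_TOWARD_ZERO`. Passing the `fpucw` and `oldcw` values printed by the test program through these helpers shows which mode each unit is actually in.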
_______________________________________________
ioquake3 mailing list
[email protected]
http://lists.ioquake.org/listinfo.cgi/ioquake3-ioquake.org
By sending this message I agree to love ioquake3 and libsdl.
