Alex Bennée <alex.ben...@linaro.org> writes:

> This is an attempt to save some of the cost of sqrt by using the
> inbuilt support of the host hardware. The idea is assuming we start
> with a valid input we can use the hardware. If any tininess issues
> occur this will trip an FPU exception where:
>
>   - we turn off cpu->use_host_fpu
>   - mask the FPU exceptions
>   - return to what we were doing
>
> Once we return we should pick up the fact that there was something
> weird about the operation and fall-back to the pure software
> implementation.
>
> You could imagine this being extended for code generation but instead
> of returning to the code we could exit and re-generate the TB but this
> time with pure software helpers rather than any support from the
> hardware.
>
> This is a sort of fix-it-up after the fact approach because reading
> the FP state is an expensive operation for everything so let's only
> worry about exceptions when they trip...
>
<snip>
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -20,6 +20,7 @@
>  #include "qemu/bitops.h"
>  #include <sys/ucontext.h>
>  #include <sys/resource.h>
> +#include <fenv.h>
>
>  #include "qemu.h"
>  #include "qemu-common.h"
> @@ -639,6 +640,21 @@ static void host_signal_handler(int host_signum, siginfo_t *info,
>      ucontext_t *uc = puc;
>      struct emulated_sigtable *k;
>
> +    /* Catch any FPU exceptions we might get from having tried to use
> +     * the host FPU to speed up some calculations
> +     */
> +    if (host_signum == SIGFPE && cpu->use_host_fpu) {
> +        cpu->use_host_fpu = false;
> +        /* sadly this gets lost on the context switch when we return */
> +        fedisableexcept(FE_INVALID |
> +                        FE_OVERFLOW |
> +                        FE_UNDERFLOW |
> +                        FE_INEXACT);
> +        /* sigaddset(&uc->uc_sigmask, SIGFPE); */
> +        uc->__fpregs_mem.mxcsr |= 0x1f80;
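
This is a bug. On return from the signal handler the FP state is
restored from the area that uc_mcontext.fpregs points to, not from
__fpregs_mem, so the correct place to reset mxcsr for the return is:

    (uc->uc_mcontext.fpregs)->mxcsr |= 0x1f80;

--
Alex Bennée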
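
For anyone wanting to try the corrected fix-up outside of QEMU, here is a minimal standalone sketch, assuming an x86-64 Linux host with glibc. The names host_fpu_enabled, mask_sse_exceptions, sigfpe_handler and install_sigfpe_fixup are made up for this illustration and are not part of the patch. The flag stands in for cpu->use_host_fpu. On other hosts the handler only clears the flag.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

/* MXCSR bits 7..12 are the SSE exception mask bits (IM, DM, ZM, OM, UM, PM). */
#define MXCSR_EXCEPTION_MASKS 0x1f80u

/* Hypothetical stand-in for cpu->use_host_fpu in the quoted patch. */
static volatile sig_atomic_t host_fpu_enabled = 1;

/* Return the given MXCSR value with every SSE exception masked. */
static uint32_t mask_sse_exceptions(uint32_t mxcsr)
{
    return mxcsr | MXCSR_EXCEPTION_MASKS;
}

/* On SIGFPE: stop using the host FPU and mask the exceptions in the
 * context that will be restored when the handler returns. */
static void sigfpe_handler(int signum, siginfo_t *info, void *puc)
{
    ucontext_t *uc = puc;

    (void)signum;
    (void)info;
    host_fpu_enabled = 0;
#if defined(__x86_64__) && defined(__linux__)
    /* The saved FP state is reached through uc_mcontext.fpregs;
     * writing to uc->__fpregs_mem would not affect what is restored. */
    if (uc->uc_mcontext.fpregs != NULL) {
        uc->uc_mcontext.fpregs->mxcsr =
            mask_sse_exceptions(uc->uc_mcontext.fpregs->mxcsr);
    }
#else
    (void)uc;
#endif
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigfpe_fixup(void)
{
    struct sigaction sa;

    sa.sa_sigaction = sigfpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

A caller would run install_sigfpe_fixup() once at start-up, unmask the exceptions it cares about, and check host_fpu_enabled after each hardware-assisted operation to decide whether to redo it in software.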