Alex Bennée <alex.ben...@linaro.org> writes:

> This is an attempt to save some of the cost of sqrt by using the
> inbuilt support of the host hardware. The idea is assuming we start
> with a valid input we can use the hardware. If any tininess issues
> occur this will trip an FPU exception where:
>
>   - we turn off cpu->use_host_fpu
>   - mask the FPU exceptions
>   - return to what we were doing
>
> Once we return we should pick up the fact that there was something
> weird about the operation and fall-back to the pure software
> implementation.
>
> You could imagine this being extended for code generation but instead
> of returning to the code we could exit and re-generate the TB but this
> time with pure software helpers rather than any support from the
> hardware.
>
> This is a sort of fix-it-up after the fact approach because reading
> the FP state is an expensive operation for everything so let's only
> worry about exceptions when they trip...
>
<snip>
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -20,6 +20,7 @@
>  #include "qemu/bitops.h"
>  #include <sys/ucontext.h>
>  #include <sys/resource.h>
> +#include <fenv.h>
>
>  #include "qemu.h"
>  #include "qemu-common.h"
> @@ -639,6 +640,21 @@ static void host_signal_handler(int host_signum, siginfo_t *info,
>      ucontext_t *uc = puc;
>      struct emulated_sigtable *k;
>
> +    /* Catch any FPU exceptions we might get from having tried to use
> +     * the host FPU to speed up some calculations
> +     */
> +    if (host_signum == SIGFPE && cpu->use_host_fpu) {
> +        cpu->use_host_fpu = false;
> +        /* sadly this gets lost on the context switch when we return */
> +        fedisableexcept(FE_INVALID   |
> +                        FE_OVERFLOW  |
> +                        FE_UNDERFLOW |
> +                        FE_INEXACT);
> +        /* sigaddset(&uc->uc_sigmask, SIGFPE); */
> +        uc->__fpregs_mem.mxcsr |= 0x1f80;

This is a bug. The FP state that gets restored on return from the
handler is the one uc_mcontext.fpregs points to, so the correct place
to reset mxcsr is:

        (uc->uc_mcontext.fpregs)->mxcsr |= 0x1f80;
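
For illustration, here is a minimal standalone sketch of the same
fix-up outside QEMU. It assumes x86-64 Linux and a GCC or Clang
compiler. The global use_host_fpu stands in for cpu->use_host_fpu
from the patch, and host_div() and fallback_demo() are hypothetical
names made up for this example. It unmasks the invalid-operation
exception, triggers it with 0.0/0.0, and lets the handler mask all
SSE exceptions in the saved context so the faulting instruction can
re-run quietly:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

#if defined(__linux__) && defined(__x86_64__)
#include <xmmintrin.h>

#define MXCSR_INVALID_MASK 0x0080u /* IM: invalid-operation mask bit */
#define MXCSR_ALL_MASKS    0x1f80u /* all six SSE exception mask bits */

/* Stand-in for cpu->use_host_fpu from the patch. */
static volatile sig_atomic_t use_host_fpu = 1;

static void sigfpe_handler(int signum, siginfo_t *info, void *puc)
{
    ucontext_t *uc = puc;
    (void)signum;
    (void)info;

    if (uc->uc_mcontext.fpregs == NULL) {
        _exit(2); /* no saved FP state to patch; returning would re-trap */
    }
    use_host_fpu = 0;
    /* Patch the saved state, which is reloaded when the handler returns. */
    uc->uc_mcontext.fpregs->mxcsr |= MXCSR_ALL_MASKS;
}

/* Divide on the host FPU; volatile and noinline keep the division
 * from being folded away or moved around the MXCSR accesses. */
__attribute__((noinline)) static double host_div(double a, double b)
{
    volatile double x = a, y = b;
    return x / y;
}

/* Returns 1 if the trap was caught and masked, 0 otherwise. */
int fallback_demo(void)
{
    struct sigaction sa, old;
    unsigned int saved_csr = _mm_getcsr();
    double r;
    int ok;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigfpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0) {
        return 0;
    }

    use_host_fpu = 1;
    _mm_setcsr(saved_csr & ~MXCSR_INVALID_MASK); /* unmask invalid-op */
    r = host_div(0.0, 0.0); /* traps once, then re-runs with traps masked */

    ok = !use_host_fpu &&
         (_mm_getcsr() & MXCSR_ALL_MASKS) == MXCSR_ALL_MASKS &&
         r != r; /* the masked re-run yields NaN */

    _mm_setcsr(saved_csr);          /* restore the caller's MXCSR */
    sigaction(SIGFPE, &old, NULL);  /* restore the previous handler */
    return ok;
}
#else
/* Not x86-64 Linux: nothing to demonstrate, report -1. */
int fallback_demo(void)
{
    return -1;
}
#endif
```

On such a host fallback_demo() is expected to return 1: the handler
runs once, clears use_host_fpu, and the masked mxcsr survives the
return because it was written into the saved context rather than
the live register. A real caller would then check use_host_fpu and
redo the operation with the software implementation.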

--
Alex Bennée
