Some of us (game developers especially) greatly prefer a minor inaccuracy
to a potentially major slowdown; I would personally opposed this change, as
you're noticeably increasing the cost of something that's used heavily in
tightly looped code.  Perhaps an appropriately named #ifdef switch would be
a way to please everyone here?

Regards,
Riot

On 7 September 2016 at 16:21, lhmouse <lh_mo...@126.com> wrote:

> (I don't write AT&T assembly so I am unable to make a patch.
> Nevertheless I hope someone who writes AT&T assembly could fix it.)
>
> The x87 `sinl` instruction has been suffering from an accuracy problem
> since decades ago, which is described in this article:
> https://software.intel.com/blogs/2014/10/09/fsin-
> documentation-improvements-in-the-intel-64-and-ia-32-
> architectures-software
>
> Long story short: Before we can calculate `sin(x)`, we have to reduce `x`
> such that it falls in (-π/2,π/2]. This can be easily done via dividing `x`
> with π and getting the remainder. The problem is that, instead of
> a reasonably accurate value of π, `FSIN` uses a 66-bit approximate value
> as the divisor, which makes the result very inaccurate if `x` is proximate
> to
> some multiple of π, because the remainder would end up with most of
> its upper bits being zero and very few significant bits left.
>
> To compromise with Intel people, as the article suggests, it is essential
> to reduce `x` before executing the `fsin` instruction. This is done as
> follows:
>
> 1. Use `FLDPI` instruction to get an accurate value of π.
> 2. Run `FPREM1` instruction repeatly until the _C2_ bit in FPU status word
>     is cleared. The result remainder will be in (-π/2,π/2], and
>     the _C0_,_C3_,_C1_ bits are the least three significant bits of
>     the quotient, from left to right.
> 3. Calculate the sine value using `FSIN` instruction. This never fails.
> 4. Acknowledging that `sin(x) = -sin(x+kπ)` when `k` is odd and
>     `sin(x) = sin(x+kπ)` when `k` is even, because the parity bit of the
> quotient
>     is the _C1_ bit in the FPU status word, if it is set, negate the
> result with
>     `FCHS` instruction. We get the sine value now.
>
> The above process is the same for cosine.
> In the case of tangent, step 4 should be removed.
>
>
> The following code fragment compares `sinl` from mingw-w64 and
> my own implementation:
>
>     volatile auto one = 1.0l;
>     auto theta = atanl(one) * 4; // This function is from mingw-w64.
>     std::printf("sinl   (theta) = %.16Le\n", sinl   (theta));
>     std::printf("my_sinl(theta) = %.16Le\n", my_sinl(theta));
>
> It produces the following result:
>
>     sinl   (theta) = 1.6263032587282567e-019
>     my_sinl(theta) = -0.0000000000000000e+000
>
> My implementation could be found here:
> https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/sin.c#L12
>
>     static inline long double fpu_sin(long double x){
>         unsigned fsw;
>         const long double reduced = __MCFCRT_fremainder(&fsw, x,
> __MCFCRT_fldpi());
>         long double ret = __MCFCRT_fsin_unsafe(reduced);
>         if(fsw & 0x0200){
>             ret = __MCFCRT_fneg(ret);
>         }
>         return ret;
>     }
>
> --------------
> Best regards,
> lh_mouse
> 2016-09-07
>
>
> ------------------------------------------------------------
> ------------------
> _______________________________________________
> Mingw-w64-public mailing list
> Mingw-w64-public@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
>
------------------------------------------------------------------------------
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to