Re: [Mingw-w64-public] Does MinGW support Signals and sigset_t ?
SSE2 is mandatory on x64. On x86 you probably want to take a look at Vectored Exception Handling, which needs no compiler-specific magic. However, the canonical way to test a CPU for SSE support is to use the CPUID instruction.

--
Best regards,
lh_mouse
2016-08-01

-----
From: Jeffrey Walton
Date: 2016-08-01 19:32
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] Does MinGW support Signals and sigset_t ?

On Mon, Aug 1, 2016 at 5:30 AM, LRN wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> On 01.08.2016 11:25, Jeffrey Walton wrote:
>> My question is, does MinGW support Signals and sigset_t ?
>
> No.
>
>> Or does MinGW use SEH and try/except/finally?
>
> No.

Perfect, thanks.

Our problem is runtime feature detection. On x86/x64, an SSE2 instruction can raise an illegal-instruction fault *if* the OS does not support it. We use SEH on Windows and a SIGILL handler on Unix/Linux to guard the code that performs the probe.

I realize the OSes that will trap are basically extinct. However, our governance dictates we still support them. It's not too difficult in practice.

My next question is, what are the MinGW alternatives to guard the code if neither SEH nor Signals are available?

Jeff

--
___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
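For reference, the CPUID check mentioned in the reply could look roughly like the sketch below. It assumes a GCC-compatible compiler that ships `<cpuid.h>`; the function name `cpu_reports_sse2` is made up for illustration. Note that it only reports what the CPU advertises, so it does not by itself answer the question of whether the OS preserves SSE state.

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Hypothetical helper: returns true if CPUID leaf 1 reports SSE2
   (EDX bit 26). This reflects CPU capability only; it does not prove
   that the OS saves and restores SSE state. */
static bool cpu_reports_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;             /* leaf 1 is not available */
    return (edx & (1u << 26)) != 0;
#else
    return false;                 /* not an x86 target */
#endif
}
```

Because CPUID itself never faults on processors that implement it, this probe needs no SEH or signal guard around it.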
Re: [Mingw-w64-public] [PATCH] for missing voids
GCC (but not G++) doesn't enable this warning unless it is requested explicitly with `-Wstrict-prototypes`.

--
Best regards,
lh_mouse
2016-08-22

-----
From: David Wohlferd
Date: 2016-08-22 11:20
To: mingw-w64-public@lists.sourceforge.net
Cc:
Subject: [Mingw-w64-public] [PATCH] for missing voids

To my surprise, these two statements have (slightly) different meanings:

    STDAPI MFUnregisterPlatformFromMMCSS ();
    STDAPI MFUnregisterPlatformFromMMCSS (void);

And clang complains about it (warning: function with no prototype cannot use the stdcall calling convention).

I have added 'void' every place clang complained (attached).

dw
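As a small illustration of the difference discussed here, the two declaration styles can be put side by side. The function names are placeholders, and the comments describe the C rules as they stood before C23.

```c
/* Before C23, empty parentheses declare a function without a prototype:
   the number and types of its parameters are left unspecified. */
int legacy_style();

/* (void) is a prototype stating that the function takes no arguments. */
int prototyped(void);

int legacy_style() { return 1; }
int prototyped(void) { return 2; }
```

Compiling this as C with `-Wstrict-prototypes` should flag the `legacy_style` declaration but not `prototyped`.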
[Mingw-w64-public] GCC 6.2 available with C++11 and C11 thread support from mcfgthread
Hello all,

GCC 6.2 has been released today and new toolchains targeting i686 and x86_64 are now available. Pre-built binaries can be found just under the description on https://github.com/lhmouse/mcfgthread. (SF was rejecting this mail as possible spam, so I have to put the link on that page instead.)

These toolchains are not built from GCC release tarballs, but from the corresponding git branch HEADs. New builds will be made available regularly.

Like mingwbuilds, these toolchains are all-in-one packages consisting of GCC and GDB with C, C++ and LTO enabled. High-performance C++11 and C11 thread support is provided by the mcfgthread library. The mcfgthread library cannot be linked statically at the moment.

Because testing is incomplete, these toolchains are highly experimental. Please send comments and bug reports to https://github.com/lhmouse/lhmouse.github.io/issues.

--
Best regards,
lh_mouse
2016-08-22
[Mingw-w64-public] GCC 6.2 available with C++11 and C11 thread support from mcfgthread
Hello all,

GCC 6.2 has been released today and new toolchains targeting i686 and x86_64 are now available at http://www.lhmouse.com/gcc-mcf/ .

Like mingwbuilds, these toolchains are all-in-one packages consisting of GCC and GDB with C, C++ and LTO enabled. High-performance C++11 and C11 thread support is provided by the mcfgthread library https://github.com/lhmouse/mcfgthread . The mcfgthread library cannot be linked statically at the moment.

Because testing is incomplete, these toolchains are highly experimental. Please send comments and bug reports to https://github.com/lhmouse/lhmouse.github.io/issues.

--
Best regards,
lh_mouse
2016-08-22
Re: [Mingw-w64-public] Building mingw-w64 and include paths
I haven't had a deep look at the patch, but AFAICS the problem is a simple one that can be solved by adding `-nostdinc -I"path/to/the/directory/of/new/headers"` to `CPPFLAGS` when building the CRT.

--
Best regards,
lh_mouse
2016-08-26

-----
From: David Wohlferd
Date: 2016-08-23 07:34
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] Building mingw-w64 and include paths

Let's try this again. This time the proposed patch is attached, which may help. For 'ease of review,' this patch does not include the 'generated' makefile.in files.

The problem I am trying to fix is that when building mingw-w64, the compiler will often use headers from the Tools Directories instead of the mingw-w64 source directory. This is due to the fact that while some mingw-w64 SourceDir paths are passed to the compiler, several are not.

This causes 3 problems:

1) In order for me to make and test changes to intrin.h (for example), I have to remember to copy it to ToolsDir after every change.

2) Users who want to build mingw-w64 have to *know* that they need to copy certain files (from a variety of locations) to their ToolsDir before trying to build mingw-w64. While the build may succeed without this, the results will be uncertain.

3) Header files that are in ToolsDir are usually treated as 'System Headers' (https://gcc.gnu.org/onlinedocs/cpp/System-Headers.html). Among other things, this means that warnings in these files tend to get suppressed rather than found and fixed.

While there are multiple ways to fix this, I have chosen to add the necessary paths to Makefile.am. I also needed to duplicate 2 files (_mingw_directx.h _mingw_ddk.h) into appropriately named directories so that _mingw.h's #include "sdks/_mingw_directx.h" would work (Is there a better way to resolve this? I can't just generate them into the sdks directory, since 'distribution' needs them where they are.).
Kai has suggested that we modify the configure script to copy all the header files to a single directory (a la 'make install') and use that single directory instead of the 9 above. My scripting skills are not sufficient to do this. If this is how we want to proceed, someone else will need to write it. dw -- ___ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
[Mingw-w64-public] Wrong quotient results of `remquo()`?
Hello guys,

While testing my own `remquo()` implementation, I found that `remquo` on Linux (using glibc) and on Windows (using mingw-w64) generates different results. I don't think this is the correct behavior. Any ideas?

The test cases in the file `remquo.txt` in the attached zip file were generated on my VPS running Debian. MinGW-w64 fails some of them:

    E:\Desktop\remquo_test>gcc test.c -std=c99 && a.exe > nul
    passed: 37864
    failed: 2537

--
Best regards,
lh_mouse
2016-09-05
Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?
Found an example on cppreference: http://en.cppreference.com/w/cpp/numeric/math/remquo

The example shows that, since `cos()` is periodic, adding 1 * PI to its parameter doesn't change the result. But we can also say that subtracting 1 * PI from its parameter should not change the result either. However, with mingw-w64 and MSVCRT, it DOES change the result, as shown on the last line:

    E:\Desktop>g++ test.cpp -std=c++14
    E:\Desktop>a.exe
    cos(pi * -0.25) = 0.707107
    cos(pi * -1.25) = -0.707107
    cos(pi * -1.25) = 0.707123
    cos(pi * -10001.25) = -0.707117
    cos(pi * -1.25) = 0.707107
    cos(pi * -10001.25) = 0.707107

This could be a bug.

--
Best regards,
lh_mouse
2016-09-05
Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?
The disagreement between glibc and mingw-w64 is, in my opinion, definitely glibc's bug:

    lh_mouse@lhmouse-dev:~$ cat test3.c
    #include <stdio.h>
    #include <math.h>
    int main(){
        double x = 10.001000;
        double y = 0.701000;
        int quo;
        double rem = remquo(x, y, &quo);
        printf("%f %f %d %f\n", x, y, quo, rem);
    }
    lh_mouse@lhmouse-dev:~$ gcc test3.c -lm -O0 && ./a.out   # use glibc
    10.001000 0.701000 8 4.393000
    lh_mouse@lhmouse-dev:~$ gcc test3.c -lm -O2 && ./a.out   # performs constant folding
    10.001000 0.701000 14 0.187000
    lh_mouse@lhmouse-dev:~$

The remainder returned by `remquo` from mingw-w64 seems all right. However, the value (or rather, its 3 least significant bits) stored through the third parameter still seems problematic.

--
Best regards,
lh_mouse
2016-09-06
Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?
More likely a bug in mingw-w64:

    #include <stdio.h>
    #include <math.h>
    volatile double x = 10.001000, y = -1.299000;
    int main(){
        int quo;
        double rem = remquo(x, y, &quo);
        printf("rem = %f, quo = %d\n", rem, quo);
    }

With mingw-w64 this program gives the following output:

    E:\Desktop>gcc test.c
    E:\Desktop>a
    rem = -0.391000, quo = 0

However, according to the ISO C11 draft:

> 7.12.10.3 The remquo functions
> 2 The remquo functions compute the same remainder as the remainder functions. In
> the object pointed to by quo they store a value whose sign is the sign of x/y and whose
> magnitude is congruent modulo 2^n to the magnitude of the integral quotient of x/y, where
> n is an implementation-defined integer greater than or equal to 3.

the value stored in `quo` must have the same sign as `x/y`. In the above example, since `x/y`, which is about -7.699, is negative, returning a non-negative value (zero in the above example) should be a bug.

--
Best regards,
lh_mouse
2016-09-07
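The sign clause as it is read in this message can be written down as a tiny checker. This is only a sketch of that strict reading (a negative `x/y` requires a negative `quo`); the helper name `quo_sign_matches` is hypothetical, and it assumes finite, non-zero `x` and `y`.

```c
#include <stdbool.h>

/* Hypothetical helper: checks the sign clause of C11 7.12.10.3 under the
   strict reading used in this thread, where a negative x/y requires a
   negative quo. Assumes x and y are finite and non-zero. */
static bool quo_sign_matches(double x, double y, int quo)
{
    bool quotient_negative = (x < 0.0) != (y < 0.0);
    return quotient_negative ? (quo < 0) : (quo >= 0);
}
```

With the values from the report, `quo_sign_matches(10.001, -1.299, 0)` is false, while the same call with `-8` is true.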
[Mingw-w64-public] sinl/cosl/tanl accuracy problem
(I don't write AT&T assembly so I am unable to make a patch. Nevertheless I hope someone who writes AT&T assembly could fix it.)

The x87 `FSIN` instruction has suffered from an accuracy problem for decades, which is described in this article:
https://software.intel.com/blogs/2014/10/09/fsin-documentation-improvements-in-the-intel-64-and-ia-32-architectures-software

Long story short: Before we can calculate `sin(x)`, we have to reduce `x` such that it falls in (-π/2,π/2]. This can be easily done by dividing `x` by π and taking the remainder. The problem is that, instead of a reasonably accurate value of π, `FSIN` uses a 66-bit approximation as the divisor, which makes the result very inaccurate if `x` is close to some multiple of π, because the remainder ends up with most of its upper bits being zero and very few significant bits left.

To work around this, as the article suggests, it is essential to reduce `x` before executing the `FSIN` instruction. This is done as follows:

1. Use the `FLDPI` instruction to get an accurate value of π.
2. Run the `FPREM1` instruction repeatedly until the _C2_ bit in the FPU status word is cleared. The resulting remainder will be in (-π/2,π/2], and the _C0_,_C3_,_C1_ bits are the three least significant bits of the quotient, from left to right.
3. Calculate the sine value using the `FSIN` instruction. This never fails.
4. Since `sin(x) = -sin(x+kπ)` when `k` is odd and `sin(x) = sin(x+kπ)` when `k` is even, and since the parity bit of the quotient is the _C1_ bit in the FPU status word, negate the result with the `FCHS` instruction if that bit is set. We have the sine value now.

The above process is the same for cosine. In the case of tangent, step 4 should be removed.

The following code fragment compares `sinl` from mingw-w64 and my own implementation:

    volatile auto one = 1.0l;
    auto theta = atanl(one) * 4;
    // This function is from mingw-w64.
    std::printf("sinl   (theta) = %.16Le\n", sinl   (theta));
    std::printf("my_sinl(theta) = %.16Le\n", my_sinl(theta));

It produces the following result:

    sinl   (theta) = 1.6263032587282567e-019
    my_sinl(theta) = -0.e+000

My implementation can be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/sin.c#L12

    static inline long double fpu_sin(long double x){
        unsigned fsw;
        const long double reduced = __MCFCRT_fremainder(&fsw, x, __MCFCRT_fldpi());
        long double ret = __MCFCRT_fsin_unsafe(reduced);
        if(fsw & 0x0200){
            ret = __MCFCRT_fneg(ret);
        }
        return ret;
    }

--
Best regards,
lh_mouse
2016-09-07
Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem
If performance is the problem there are a number of solutions such as inline assembly, static lookup tables, etc. `fsinl()` is apparently not one of them. But yes, I am all ears.

--
Best regards,
lh_mouse
2016-09-08

-----
From: Riot
Date: 2016-09-08 04:00
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

Some of us (game developers especially) greatly prefer a minor inaccuracy to a potentially major slowdown; I would personally oppose this change, as you're noticeably increasing the cost of something that's used heavily in tightly looped code.

Perhaps an appropriately named #ifdef switch would be a way to please everyone here?

Regards,
Riot
Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem
It is merely a function that is guaranteed to be declared implicitly (and thus requires no `<math.h>`) and has the same semantics as the standard function `sinl()`.

The GCC optimizer can perform certain types of optimization, such as constant folding and inlining, only if `sinl()` is assumed to do what the C standard specifies; this assumption can be explicitly disabled using `-fno-builtin` or `-ffreestanding`. AFAICS there is otherwise no difference. `__builtin_sinl()` may result in a call to `sinl()`.

--
Best regards,
lh_mouse
2016-09-08

-----
From: NightStrike
Date: 2016-09-08 15:06
To: mingw-w64-public@lists.sourceforge.net
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

What does gcc's __builtin_sinl() do?
[Mingw-w64-public] `fma()` functions are completely wrong in mingw-w64
Reading `mingw-w64/mingw-w64-crt/math/fmal.c`:

    long double fmal ( long double _x, long double _y, long double _z)
    {
      return ((_x * _y) + _z);
    }

This implementation is completely wrong. See
https://en.wikipedia.org/wiki/Multiply–accumulate_operation#Fused_multiply.E2.80.93add

The multiplication in a single FMA operation must behave as if the result had infinite precision. That is, multiplying two x87 extended-precision floating-point numbers (1 sign + 15 exp + 64 frac = 80 bits) yields a result of 144 bits (1 sign + 15 exp + 64 frac * 2 = 144 bits).

For example, with a conforming `fmal()`, the expression `fmal(1.2l, 3.4l, -3.00010l)` shall yield approximately `8e-18`, because `1.2l * 3.4l` yields `3.000108l`. But in mingw-w64, this intermediate result is truncated when converted to `long double`, yielding `3.0001l`, and adding `-3.00010l` to it yields zero.

Since x87 does not have 128-bit registers, FMA must be done in software:

1. Split both multipliers into higher and lower parts. Since a `long double` has 64 significant bits (it does not have a hidden bit), each of the two parts must have 32 bits so that we don't lose precision when multiplying them.
2. Keeping in mind that `(a+b)(c+d)=ac+ad+bc+bd`, calculate the sum IN THE FOLLOWING ORDER:

    long double ret = z;
    ret += xhi * yhi;
    ret += xhi * ylo + xlo * yhi;
    ret += xlo * ylo;

A conforming implementation can be found here:
https://github.com/lhmouse/MCF/blob/master/MCFCRT/src/stdc/math/fma.c

--
Best regards,
lh_mouse
2016-09-08
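The effect of splitting the multipliers can be demonstrated in plain `double` arithmetic. The sketch below is not the code linked above: it swaps in the well-known Veltkamp split and Dekker product for `double` rather than the 32-bit halves of a `long double`, and the names `split` and `fma_sketch` are made up. It assumes strict IEEE double evaluation (no x87 excess precision), and its final sum is not correctly rounded in general, so it is not a conforming `fma()`.

```c
/* Veltkamp split: hi and lo each fit in 26 significand bits or fewer,
   so products of the halves are exact in double. */
static void split(double a, double *hi, double *lo)
{
    double c = 134217729.0 * a;   /* 2^27 + 1 */
    *hi = c - (c - a);
    *lo = a - *hi;
}

/* Naive fused multiply-add sketch: recovers the rounding error of x*y
   and adds it back after z. Not a conforming fma(): the last additions
   can themselves round, and overflow/underflow is ignored. */
static double fma_sketch(double x, double y, double z)
{
    double xhi, xlo, yhi, ylo;
    /* volatile keeps the compiler from fusing x*y with later additions */
    volatile double p = x * y;
    double prod = p;
    split(x, &xhi, &xlo);
    split(y, &yhi, &ylo);
    /* err = exact(x*y) - prod, computed from the exact partial products */
    double err = ((xhi * yhi - prod) + xhi * ylo + xlo * yhi) + xlo * ylo;
    return (prod + z) + err;
}
```

With `x = 1 + 2^-30`, `y = 1 - 2^-30` and `z = -1`, the plain expression `x * y + z` rounds the product to 1 and returns zero, whereas `fma_sketch` returns `-2^-60`, the same kind of low-order term that the naive `fmal()` above loses.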
Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem
Thanks for such nice work! I hope someone will accept it. Kai has been away for days.

--
Best regards,
lh_mouse
2016-09-08

-----
From: Thomas Bickel
Date: 2016-09-08 21:01
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] sinl/cosl/tanl accuracy problem

On 07.09.2016 17:21, lhmouse wrote:
> (I don't write AT&T assembly so I am unable to make a patch.
> Nevertheless I hope someone who writes AT&T assembly could fix it.)

I don't have time to write a patch but I can donate some code that AFAIK does what you need for the sin functions.

    >gcc -m32 sinl32.s sin.c
    >a
    sinl = -5.421010862427522170037264e-020
    my_sinl = -0.e+000
    sin = 1.224606353822377258211418e-016
    my_sin = 1.225148454908620010428422e-016
    sinf = -8.7422776573475858e-008
    my_sinf = -8.7422780003674585e-008

    >gcc -m64 sinl64.s sin.c
    >a
    sinl = -5.421010862427522170037264e-020
    my_sinl = -0.e+000
    sin = 1.224606353822377258211418e-016
    my_sin = 1.225148454908620010428422e-016
    sinf = -8.7422776573475858e-008
    my_sinf = -8.7422776573475858e-008

Regards
Thomas
[Mingw-w64-public] [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.
--- mingw-w64-crt/Makefile.am | 4 ++-- mingw-w64-crt/math/fma.S | 42 mingw-w64-crt/math/fma.c | 12 ++ mingw-w64-crt/math/fmaf.S | 43 - mingw-w64-crt/math/fmaf.c | 10 mingw-w64-crt/math/fmal.c | 61 ++- 6 files changed, 84 insertions(+), 88 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 886fcf0..1c6e534 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -244,7 +244,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -252,7 +251,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. 
- */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..3703e00 --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,12 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. + */ +long double fmal ( long double _x, long double _y, long double _z); + +double +fma ( double _x, double _y, double _z) +{ + return (double)fmal(_x, _y, _z); +} diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S deleted file mode 100644 index 6bc7ef0..000 --- a/mingw-w64-crt/math/fmaf.S +++ /dev/null @@ -1,43 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. 
- */ -#include <_mingw_mac.h> - - .file "fmaf.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 2 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fmaf) - .def__MINGW_USYMBOL(fmaf); .scl2; .type 32; .endef -__MINGW_USYMBOL(fmaf): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movss %xmm0,(%rsp) - movss %xmm1,16(%rsp) - movss %xmm2,32(%rsp) - flds(%rsp) - fmuls 16(%rsp) - flds32(%rsp) - faddp - fstps (%rsp) - movss (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacs s2, s0, s1 - fcpys s0, s2 - bx
Re: [Mingw-w64-public] [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions.
Oops. Are there any volunteers to implement `fma()` functions for ARM ? -- Best regards, lh_mouse 2016-09-08 -- ___ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
[Mingw-w64-public] Wrong output from %La specifier in printf()?
When the `%La` specifier is used in `printf()` to format the C99 hexadecimal floating-point value `0x5p-80l`, a wrong result is generated, as shown in this example:

    E:\Desktop>cat test.c
    extern int __mingw_printf(const char *, ...);
    int main(){
        __mingw_printf("%La\n", 0x5p-80l);
    }
    E:\Desktop>gcc test.c -std=c99
    E:\Desktop>a.exe
    0x0p-141

Removing the `__mingw_` prefix and testing the same program on Linux gives the correct result:

    lh_mouse@lhmouse-dev:~$ uname -a
    Linux lhmouse-dev 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) x86_64 GNU/Linux
    lh_mouse@lhmouse-dev:~$ gcc test.c -std=c99
    lh_mouse@lhmouse-dev:~$ ./a.out
    0xap-81

Is this a bug in `printf()`?

--
Best regards,
lh_mouse
2016-09-09
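Since the exact digits printed by `%La` may legitimately differ between C libraries, one way to test for this kind of failure without comparing strings is a round trip through `strtold()`. The helper below is a sketch with a made-up name, `la_round_trips`; it uses the standard `snprintf()`, so testing the mingw-w64 formatter specifically would presumably mean swapping in its `__mingw_`-prefixed counterpart, which is an assumption here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats v with %La and parses the text back with
   strtold(). Returns 1 if the value survives the round trip, else 0. */
static int la_round_trips(long double v)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%La", v);
    if (n < 0 || (size_t)n >= sizeof buf)
        return 0;                 /* formatting failed or was truncated */
    return strtold(buf, NULL) == v;
}
```

A formatter that prints `0x0p-141` for `0x5p-80l`, as reported above, would fail this check because the text parses back to zero.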
[Mingw-w64-public] [PATCH] [FIXED_UP] Added standard-conforming fmaf(), fma() and fmal() functions.
>From 52dd6b38d01e1f30bf1821a2621d707d07ec8f15 Mon Sep 17 00:00:00 2001 From: lhmouse Date: Fri, 9 Sep 2016 22:52:26 +0800 Subject: [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions. Signed-off-by: lhmouse --- mingw-w64-crt/Makefile.am | 4 +-- mingw-w64-crt/math/fma.S | 42 - mingw-w64-crt/math/fma.c | 12 +++ mingw-w64-crt/math/fmaf.S | 43 - mingw-w64-crt/math/fmaf.c | 10 ++ mingw-w64-crt/math/fmal.c | 80 ++- 6 files changed, 103 insertions(+), 88 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 886fcf0..1c6e534 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -244,7 +244,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -252,7 +251,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. 
- * No warranty is given; refer to the file DISCLAIMER.PD within this package. - */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..3703e00 --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,12 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. + */ +long double fmal ( long double _x, long double _y, long double _z); + +double +fma ( double _x, double _y, double _z) +{ + return (double)fmal(_x, _y, _z); +} diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S deleted file mode 100644 index 6bc7ef0..000 --- a/mingw-w64-crt/math/fmaf.S +++ /dev/null @@ -1,43 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. 
- */ -#include <_mingw_mac.h> - - .file "fmaf.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 2 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fmaf) - .def__MINGW_USYMBOL(fmaf); .scl2; .type 32; .endef -__MINGW_USYMBOL(fmaf): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movss %xmm0,(%rsp) - movss %xmm1,16(%rsp) - movss %xmm2,32(%rsp) - flds(%rsp) -
Re: [Mingw-w64-public] [PATCH] [FIXED_UP] Added standard-conforming fmaf(), fma() and fmal() functions.
Please discard the previous patch as it does not handle denormal numbers correctly. Sorry for that. --- >From 890aab4fc63264074b6e1e16f5bd64e3c4a6f795 Mon Sep 17 00:00:00 2001 From: lhmouse Date: Fri, 9 Sep 2016 22:52:26 +0800 Subject: [PATCH] Added standard-conforming fmaf(), fma() and fmal() functions. Signed-off-by: lhmouse --- mingw-w64-crt/Makefile.am | 4 +-- mingw-w64-crt/math/fma.S | 42 mingw-w64-crt/math/fma.c | 12 +++ mingw-w64-crt/math/fmaf.S | 43 mingw-w64-crt/math/fmaf.c | 10 ++ mingw-w64-crt/math/fmal.c | 84 ++- 6 files changed, 107 insertions(+), 88 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 886fcf0..1c6e534 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -244,7 +244,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -252,7 +251,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is 
placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. - */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..3703e00 --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,12 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. + */ +long double fmal ( long double _x, long double _y, long double _z); + +double +fma ( double _x, double _y, double _z) +{ + return (double)fmal(_x, _y, _z); +} diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S deleted file mode 100644 index 6bc7ef0..000 --- a/mingw-w64-crt/math/fmaf.S +++ /dev/null @@ -1,43 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. 
- */ -#include <_mingw_mac.h> - - .file "fmaf.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 2 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fmaf) - .def__MINGW_USYMBOL(fmaf); .scl2; .type 32; .endef -__MINGW_USYMBOL(fmaf): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movss %x
Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?
Reliable results can usually be generated using GCC's constant folding. For example, the following program:

    #include <stdio.h>
    int main(){
      const double x = 10.001000, y = -1.299000;
      int quo;
      double rem = __builtin_remquo(x, y, &quo);
      printf("rem = %f, quo = %d\n", rem, quo);
    }

when compiled with `gcc test.c -O3 -masm=intel -S`, produces the following assembly:

    movsd xmm0, QWORD PTR .LC0[rip]
    lea rcx, .LC1[rip]
    mov r8d, -8    # <== folded constant goes here
    movapd xmm1, xmm0
    movq rdx, xmm0
    call printf

And yes, the result -8 is correct. 0 isn't. It doesn't matter whether we return -16 or -8 here, as both ISO C and POSIX only require at least three bits. Both values are conforming as long as we document our `remquo()` as returning only three bits into `quo`.

-- Best regards, lh_mouse 2016-09-12

-
From: "K. Frank"
Date sent: 2016-09-12 22:23
To: mingw64
Cc:
Subject: Re: [Mingw-w64-public] Wrong quotient results of `remquo()`?

Hello Lefty! I do think you have found a bug here, and it does appear to be in the mingw-w64 code. Disclaimer: I don't understand this completely. Further comments in line, below.

On Tue, Sep 6, 2016 at 11:52 PM, lhmouse wrote:
> More likely a bug in mingw-w64:
>
> #include <stdio.h>
> #include <math.h>
> volatile double x = 10.001000, y = -1.299000;
> int main(){
> int quo;
> double rem = remquo(x, y, &quo);
> printf("rem = %f, quo = %d\n", rem, quo);
> }
>
> With mingw-w64 this program gives the following output:
>
> E:\Desktop>gcc test.c
>
> E:\Desktop>a
> rem = -0.391000, quo = 0

I get the same result as you in a c++ test program using: g++ (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.2

> However, according to ISO C11 draft:
>
>> 7.12.10.3 The remquo functions
>> 2 The remquo functions compute the same remainder as the remainder
>> functions.
>> In the object pointed to by quo they store a value whose sign is the sign of
>> x/y and whose
>> magnitude is congruent modulo 2^n to the magnitude of the integral quotient
>> of x/y, where
>> n is an implementation-defined integer greater than or equal to 3.
>
> the value stored in `quo` must have the same sign as `x/y`.
>
> In the above example, since `x/y`, which is about -7.699, is negative,
> returning a non-negative value (zero in the above example) should be a bug.

I agree with lh_mouse's reading of the standard, and that quo should be negative to match the sign of x / y. Here is my imperfect analysis of what is going on.

First, I found a copy of remquo.S here: https://sourceforge.net/p/mingw-w64/code/6570/tree//stable/v1.x/mingw-w64-crt/math/remquo.S (but I don't understand it). I also found a "softmath" copy of remquo.c here: https://github.com/msys2/mingw-w64/blob/master/mingw-w64-crt/math/softmath/remquo.c (I have no idea whether remquo.c is equivalent in detail to remquo.S.)

    #include "softmath_private.h"
    double remquo(double x, double y, int *quo) {
      double r;
      if (isnan(x) || isnan(y)) return NAN;
      if (y == 0.0) {
        errno = EDOM;
        __mingw_raise_matherr (_DOMAIN, "remquo", x, y, NAN);
        return NAN;
      }
      r = remainder(x, y);
      if (quo) *quo = (int)((x - r) / y) % 8;
      return r;
    }

First, the expression "(int)((x - r) / y)" is undefined behavior when (x - r) / y is too large for an int. (This can easily happen with floats and doubles.) (remquo.S uses the intel floating-point instruction fprem1, and therefore -- if written correctly -- should not have this problem.)

But ignoring the possible integer overflow, the error here, which is the result lh_mouse gets in his test, is that if (int)((x - r) / y) is a multiple of 8, then (int)((x - r) / y) % 8 will evaluate to zero, losing the sign information. In lh_mouse's test case x / y = -7.699, which rounds to nearest as -8, which equals 0 mod 8.

How might one fix this (in C code)?
My reading of the standard says that quo doesn't have to be exactly the last three bits -- or even the last n bits -- of (int)((x - r) / y) Rather, it only has to be congruent to this mod 8. So (ignoring overflow in the integer conversion), one could do something like this: r = remainder(x, y); if (quo) *quo = (int)((x - r) / y) % 8; if (quo && *quo == 0 && x/y < 0.0) *quo = -16; return r; (Here, we are deeming the sign of 0 to be positive. I don't know whether this would be language-lawyer consistent
[Mingw-w64-public] Bootstrap gcc for i686 with SJLJ exception model in MSYS2 ?
Today I tried bootstrapping GCC 6.2.1 using a PKGBUILD modified from the MSYS2 one for the gcc-git package. I changed the line https://github.com/lhmouse/MINGW-packages/blob/master/mingw-w64-gcc-git/PKGBUILD#L148 from `local _conf="--disable-sjlj-exceptions --with-dwarf2"` to `local _conf="--enable-sjlj-exceptions"`, and the 3-stage bootstrap failed at the end of stage 1 with thousands of undefined references to _Unwind_* functions. I tried both the MSYS2 toolchain (with DWARF) and the mingw-builds toolchain (with SJLJ); the latter failed with fewer errors, but of the same kind. Do you have any idea why this error happened? -- Best regards, lh_mouse 2016-10-14 ___ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
Re: [Mingw-w64-public] [Msys2-users] Bootstrap gcc for i686 with SJLJ exception model in MSYS2 ?
Problem solved. I hadn't deleted the MSYS2 gcc's libraries, and it was these libraries that were linked rather than the mingw-builds libraries. However, even though GCC itself does not ask for libgcc_s_dw2, the libgomp DLL from the MSYS2 packages still asks for it, as do other packages. Hence the DLL must not be removed.

-- Best regards, lh_mouse 2016-10-14

-----
From: "lhmouse"
Date sent: 2016-10-14 01:21
To: Msys2, mingw-w64-public
Cc:
Subject: [Msys2-users] Bootstrap gcc for i686 with SJLJ exception model in MSYS2 ?

Today I tried bootstrapping GCC 6.2.1 using a PKGBUILD modified from the MSYS2 one for the gcc-git package. I changed the line https://github.com/lhmouse/MINGW-packages/blob/master/mingw-w64-gcc-git/PKGBUILD#L148 from `local _conf="--disable-sjlj-exceptions --with-dwarf2"` to `local _conf="--enable-sjlj-exceptions"`, and the 3-stage bootstrap failed at the end of stage 1 with thousands of undefined references to _Unwind_* functions. I tried both the MSYS2 toolchain (with DWARF) and the mingw-builds toolchain (with SJLJ); the latter failed with fewer errors, but of the same kind. Do you have any idea why this error happened? -- Best regards, lh_mouse 2016-10-14
[Mingw-w64-public] Make GCC emit ASM instructions in 'gcc/except.c' for i686 MinGW targets ?
Hi there,

I have come up with an idea for implementing stack unwinding for the i686-w64-mingw32 target using native Windows Structured Exception Handling (a.k.a. SEH) for efficiency reasons.

Unlike DWARF and SEH for x64, SEH for x86 is stack-based and works like the SJLJ exception model: the operating system keeps a thread-specific pointer to an SEH node on the stack, and nodes must be installed and uninstalled at run time. The SEH-head pointer is stored in `fs:[0]`. Typically, an SEH handler is installed like this, in Intel syntax:

    # typedef EXCEPTION_DISPOSITION
    #   filter_function(
    #     EXCEPTION_RECORD *record, void *establisher_frame,
    #     CONTEXT *machine_context, void *dispatcher_context)
    #   __attribute__((__cdecl__));
    # struct x86_seh_node_header {
    #   struct x86_seh_node_header *next;
    #   filter_function *filter;
    #   char extra_data[];
    # };
    sub esp, 8                                    # struct x86_seh_node_header this_node;
    mov ecx, dword ptr fs:[0]                     #
    mov dword ptr[esp], ecx                       # this_node.next = get_thread_seh_head();
    mov dword ptr[esp + 4], offset my_seh_filter  # this_node.filter = &my_seh_filter
    mov dword ptr fs:[0], esp                     # set_thread_seh_head(&this_node);

Before the function exits and its frame is destroyed, the node must be uninstalled like this:

    mov ecx, dword ptr[esp]                       #
    mov dword ptr fs:[0], ecx                     # set_thread_seh_head(this_node.next);

Since I am looking at the SJLJ exception model, and it seems to use a slim, inlined version of `setjmp()` with `__builtin_longjmp()` that only stores 3 or 4 pointers, extending that structure should be a simple matter. The problem is that installation and uninstallation of SEH nodes require target-specific ASM code generation. Is it possible to do that in 'gcc/except.c'?

-- Best regards, lh_mouse 2016-10-17
Re: [Mingw-w64-public] Make GCC emit ASM instructions in 'gcc/except.c' for i686 MinGW targets ?
> I'd probably create a new exception handling model and conditionalize
> whatever code you need based on that.

That would require copy-n-paste of tons of code... All this remains contingent on Microsoft's generosity because they don't provide APIs for SEH on x86, unlike on x64. So I have to reuse the stack unwinding code from SJLJ at the moment.

> Emission of code for that new
> exception model would likely require some amount of target specific code
> called via target hooks.

Hooks... Er, are you talking about those global pointers to functions? There are a lot, indeed.

-- Best regards, lh_mouse 2016-10-17
Re: [Mingw-w64-public] What is _pei386_runtime_relocator?
See the comments about --enable-auto-import at https://www.sourceware.org/binutils/docs/ld/Options.html . When you refer to a member of a struct or an element of an array with static storage duration, the compiler may generate instructions that read from or write to a constant address that compares unequal to the address of the struct or array itself. Such an address can't be resolved by the DLL loader (because it is not exported), can't be fixed up by LD either, and so has to be resolved at run time.

-- Best regards, lh_mouse 2016-12-05

-
From: Иван Иванов
Date sent: 2016-11-29 16:47
To: Mingw W64 Public
Cc:
Subject: [Mingw-w64-public] What is _pei386_runtime_relocator?

Could you please tell me what operation the _pei386_runtime_relocator function performs exactly? In which cases does the compiler generate calls to this function? How can I get rid of this function when doing "-nostdlib" development (think of it like bare metal development)?
Re: [Mingw-w64-public] FLT_EPSILON missing
That might be because gcc has its own float.h:

    /mingw32/lib/gcc/i686-w64-mingw32/6.2.1/include/float.h:113:#define FLT_EPSILON __FLT_EPSILON__

-- Best regards, lh_mouse 2016-12-20

-
From: niXman
Date sent: 2016-12-20 19:38
To: mingw-w64-public
Cc:
Subject: Re: [Mingw-w64-public] FLT_EPSILON missing

Vincent Torri 2016-12-20 09:04:
> Hello

Hi,

> it seems that FLT_EPSILON and DBL_EPSILON are missing in float.h. at
> least, i can't find it here :
>
> https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/crt/float.h
>
> for reference, see :
>
> https://msdn.microsoft.com/fr-fr/library/k15zsh48.aspx
>
> can this be added ?

This is strange, because I have just tested it using i686-6.2.0-posix-dwarf and everything works fine. My example:

    #include <stdio.h>
    #include <float.h>
    int main() {
      printf("%g\n", FLT_EPSILON);
      printf("%g\n", DBL_EPSILON);
    }
[Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
Patch is attached. This patch removes assembly files that implement FMA on ARM and merges them into the corresponding C files with the same name using inline assembly. A re-generation of Makefile.in is required. I don't have any knowledge about ARM assembly. Those functions for ARM were created using my x86 assembly knowledge and the actual instructions are copy-n-paste'd from old .S files. I don't have an ARM compiler to test those functions. Please fix them should they be broken. -- Best regards, lh_mouse 2017-01-18 From 0534577644f12e94cc408d37083277f133d1ca47 Mon Sep 17 00:00:00 2001 From: LH_Mouse Date: Wed, 18 Jan 2017 19:35:43 +0800 Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am: Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with the same names, respectively. --- mingw-w64-crt/Makefile.am | 4 +- mingw-w64-crt/math/fma.S | 42 --- mingw-w64-crt/math/fma.c | 31 +++ mingw-w64-crt/math/fmaf.S | 43 --- mingw-w64-crt/math/fmaf.c | 31 +++ mingw-w64-crt/math/fmal.c | 135 -- 6 files changed, 194 insertions(+), 92 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 44360db..5eba234 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -227,7 +227,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -235,7 +234,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c 
math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. - */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..98249aa --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,31 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. 
+ */ +double fma(double x, double y, double z); + +#if defined(_AMD64_) || defined(__x86_64__) || defined(_X86_) || defined(__i386__) + +long double fmal(long double x, long double y, long double z); + +double fma(double x, double y, double z){ + return (double)fmal(x, y, z); +} + +#elif defined(_ARM_) || defined(__arm__) + +double fma(double x, double y, double z){ + __asm__ ( +
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
The correctness of the fma() function can be verified using the following program:

---
#include <stdio.h>
#include <math.h>
volatile double x = 0x1.0000000000003p52;
volatile double y = 0x1.0000000000005p52;
volatile double z = -0x1.0000000000008p104;
int main(){
  printf("x * y + z= %f\n", x * y + z);
  printf("fma(x, y, z) = %f\n", fma(x, y, z));
}
---

A naive multiply-then-add loses the low-order bits of the product when the product is rounded, and yields zero once the high-order bits are cancelled by the negative addend. A true FMA function yields 15 in this example.

-- Best regards, lh_mouse 2017-01-18
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
> I see that you have replaced the x86 parts for fma and fmaf with C > code. That seems like a good thing. Is there some reason you can't do > that with the ARM versions too? ARM has hardware FMA and software emulation is not optimal. > Reducing the amount of platform-specific code also seems like a good thing. The x87 80-bit floating point format is already platform-specific. > There are a number of reasons not to use inline asm (for example > https://gcc.gnu.org/wiki/DontUseInlineAsm ). Are you sure this is a > good idea? I am not sure about the inline asm itself. The primary reason I did that is because, if we have `fma.S` and `fma.c` in the same directory they will compile to the same file `fma.o`, and `make` complains about that. Inline asm is indeed hard to maintain and I am aware of it. Personally I only write asm statements that contain very few instructions, simulating builtin functions or intrinsics for use in C code. > Yup, that's one of the downsides to using inline asm. > > I'm no ARM expert, but I'm not sure about this ARM code for fmal: > > +long double fmal(long double x, long double y, long double z){ > + __asm__ ( > +"fmacd %2, %0, %1 \n" > +"fcpyd %0, %2 \n" > +: "+&w"(z) > +: "w"(x), "w"(y) > + ); > > Doesn't fmacd modify %2? That would be (y), which is listed as an input > parameter (and therefore is read-only). What's more, I thought fmacd > was calculating "Fd + Fn * Fm" where the parameters were "fmacd Fd, Fn, > Fm". Such being the case, I would have expected "fmacd %0, %1 %2"? I > don't have a way to run this either, but this looks wrong. Thanks for pointing it out. That is a mistake. I forgot to fix it after copying it from the asm code. The `fma()` function was the correct one. > Under the nit-picky heading: > > +double fma(double x, double y, double z){ > + __asm__ ( > +"fmacd %0, %1, %2 \n" > +: "+&w"(z) > +: "w"(x), "w"(y) > + ); > > The \n is redundant. And doesn't the + make the & redundant as well? 
I just prefer to terminate every line of asm code with \n. I believe the & is redundant not only because of the +, but also because there is only one instruction, so nothing can be written before the other operands are read.

> Lastly I gotta ask: Can we use __builtin_fmal? Or is mingw-w64 the one
> providing the implementations for these?

We would have to ask a GCC developer to be sure. In my experience this function is guaranteed to be semantically equivalent to the standard library function without the __builtin_ prefix. Sometimes the compiler cannot assume that all functions from the standard C library are available and have the specified behavior, e.g. when compiling the Linux kernel. The `__builtin_fmal()` function is then considered to be a standard FMA, suitable for constant folding. It may result in an inline instruction where possible, but could also result in a call to the external `fmal()` function, which means infinite recursion if it is used inside `fmal()`.

-- Best regards, lh_mouse 2017-01-19
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
New patch attached. This patch fixes ARM functions and adds a check in `fpu_fma()` for potential NaN or INF results. -- Best regards, lh_mouse 2017-01-19 From 3c55daec84dac190b9e3cb032371960e1acbc38f Mon Sep 17 00:00:00 2001 From: LH_Mouse Date: Wed, 18 Jan 2017 19:35:43 +0800 Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am: Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with the same names, respectively. --- mingw-w64-crt/Makefile.am | 4 +- mingw-w64-crt/math/fma.S | 42 - mingw-w64-crt/math/fma.c | 31 ++ mingw-w64-crt/math/fmaf.S | 43 -- mingw-w64-crt/math/fmaf.c | 31 ++ mingw-w64-crt/math/fmal.c | 146 -- 6 files changed, 205 insertions(+), 92 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 44360db..5eba234 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -227,7 +227,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -235,7 +234,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git 
a/mingw-w64-crt/math/fma.S b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. - */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..00f100c --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,31 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. + */ +double fma(double x, double y, double z); + +#if defined(_AMD64_) || defined(__x86_64__) || defined(_X86_) || defined(__i386__) + +long double fmal(long double x, long double y, long double z); + +double fma(double x, double y, double z){ + return (double)fmal(x, y, z); +} + +#elif defined(_ARM_) || defined(__arm__) + +double fma(double x, double y, double z){ + __asm__ ( +"fmacd %0, %1, %2 \n" +: "+w"(z) +: "w"(x), "w"(y) + ); + return z; +} + +#else + +#error This platform is not supported. 
+ +#endif diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S deleted file mode 100644 index 6bc7ef0..000 --- a/mingw-w64-crt/math/fmaf.S +++ /dev/null @@ -1,43 +0,0 @@ -/** - * This file has no copyrig
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
> So you have decided that __builtins can't be used then? That's too bad.

Yes, it results in a call to `fma()` on x64. Can't test it on ARM though.

> I know almost nothing about the guts of floating point, so I'm prepared
> to defer to your judgement, but here's what I think:
>
> Let me propose an alternative for fma.c:
> ... ...
> In other words, remove all the platform specific code. This (greatly)
> simplifies this file. You were already using fmal for x86. And it
> doesn't lose anything for ARM, since both fma() and fmal() use the exact
> same inline asm. Why have the exact same (hard to maintain) code in 2
> places?

Keeping asm code in fmaf.c but not in fma.c seems stylistically inconsistent. However, the opposite is doable: in the case of ARM, call `fma()` in `fmal()`.

> As for fmaf, what about:
> ... ...
> The case here is less compelling, but I assert that if fmal is
> supported, it can always be used to calculate fmaf. If there is a
> shorter/more efficient method (such as there is with ARM), it can be
> added here.

Fair enough. Updated.

> As for fmal, I have a question about your code. Not the implementation,
> but the design. Looking at https://en.wikipedia.org/wiki/Long_double,
> it says "Microsoft Windows with Visual C++ also sets the processor in
> double-precision mode by default." Since (it appears?) you aren't
> following _controlfp_s, won't this give us a different answer than fmal
> from msvcr120.dll?

MSVC doesn't support 80-bit `long double` (it is 64 bits there), so the results can't be equal unless the value fits into 64 bits exactly.

My FMA algorithm basically splits each operand into two 32-bit halves, multiplies them using elementary arithmetic, then adds the four 64-bit results together: (a+b)(c+d) = ac+(bc+ad)+bd. So the precision of x87 does affect the result. I doubt whether it is necessary to save the x87 control word and set it to 64-bit precision before the calculation and restore it thereafter. MinGW-w64 already sets it to 64-bit precision during CRT initialization, and if people set it lower they aren't going to need `fma()` either.

An interesting look at https://msdn.microsoft.com/en-us/library/c9676k6h.aspx reminds me that _PC_64 isn't supported on x64. Sounds incredible, no? Does `_controlfp_s()` return an error if we try to set _PC_64 on x64? I have no idea. Nevertheless the precision flags can be set and restored using inline assembly - yet another dirty solution.

> More nits:
>
> s/whecher/whether
> s/#x86_Extended_Precision_Format/#x86_extended_precision_format

Fixed. The bookmark to Wikipedia was copied from my browser at least half a year ago, and the page has probably been modified since.

-- Best regards, lh_mouse 2017-01-20
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) functions for x86 families properly
The mail has been being rejected for spamming for a few hours. Hope it wouldn't be this time. -- Best regards, lh_mouse From 82fd24e992a402ff2f7c55780fd76945ef83e094 Mon Sep 17 00:00:00 2001 From: LH_Mouse Date: Wed, 18 Jan 2017 19:35:43 +0800 Subject: [PATCH] mingw-w64-crt/math/fma{,f,l}.c: Implement fused multiply-add (FMA) funcitons for x86 families properly. mingw-w64-crt/Makefile.am: Likewise. mingw-w64-crt/math/fma{,f}.S: Merge into corresponding C files with the same names, respectively. --- mingw-w64-crt/Makefile.am | 4 +- mingw-w64-crt/math/fma.S | 42 -- mingw-w64-crt/math/fma.c | 29 ++ mingw-w64-crt/math/fmaf.S | 43 -- mingw-w64-crt/math/fmaf.c | 29 ++ mingw-w64-crt/math/fmal.c | 143 -- 6 files changed, 198 insertions(+), 92 deletions(-) delete mode 100644 mingw-w64-crt/math/fma.S create mode 100644 mingw-w64-crt/math/fma.c delete mode 100644 mingw-w64-crt/math/fmaf.S create mode 100644 mingw-w64-crt/math/fmaf.c diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am index 44360db..5eba234 100644 --- a/mingw-w64-crt/Makefile.am +++ b/mingw-w64-crt/Makefile.am @@ -227,7 +227,6 @@ src_libmingwex=\ \ math/_chgsignl.S math/ceil.Smath/ceilf.S math/ceill.S math/copysignl.S \ math/floor.S math/floorf.S math/floorl.S \ - math/fma.Smath/fmaf.S\ math/nearbyint.S math/nearbyintf.S math/nearbyintl.S \ math/trunc.S math/truncf.S \ math/cbrt.c \ @@ -235,7 +234,8 @@ src_libmingwex=\ math/coshf.c math/coshl.c math/erfl.c \ math/expf.c \ math/fabs.c math/fabsf.c math/fabsl.c math/fdim.c math/fdimf.c math/fdiml.c \ - math/fmal.c math/fmax.cmath/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ + math/fma.cmath/fmaf.cmath/fmal.c \ + math/fmax.c math/fmaxf.c math/fmaxl.c math/fmin.c math/fminf.c \ math/fminl.c math/fp_consts.c math/fp_constsf.c \ math/fp_constsl.c math/fpclassify.c math/fpclassifyf.c math/fpclassifyl.c math/frexpf.c\ math/hypotf.c math/hypot.c math/hypotl.c math/isnan.c math/isnanf.cmath/isnanl.c\ diff --git a/mingw-w64-crt/math/fma.S 
b/mingw-w64-crt/math/fma.S deleted file mode 100644 index 74becde..000 --- a/mingw-w64-crt/math/fma.S +++ /dev/null @@ -1,42 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warranty is given; refer to the file DISCLAIMER.PD within this package. - */ -#include <_mingw_mac.h> - - .file "fma.S" - .text -#ifdef __x86_64__ - .align 8 -#else - .align 4 -#endif - .p2align 4,,15 - .globl __MINGW_USYMBOL(fma) - .def__MINGW_USYMBOL(fma); .scl2; .type 32; .endef -__MINGW_USYMBOL(fma): -#if defined(_AMD64_) || defined(__x86_64__) - subq$56, %rsp - movsd %xmm0,(%rsp) - movsd %xmm1,16(%rsp) - movsd %xmm2,32(%rsp) - fldl(%rsp) - fmull 16(%rsp) - fldl32(%rsp) - faddp - fstpl (%rsp) - movsd (%rsp),%xmm0 - addq$56, %rsp - ret -#elif defined(_ARM_) || defined(__arm__) - fmacd d2, d0, d1 - fcpyd d0, d2 - bx lr -#elif defined(_X86_) || defined(__i386__) - fldl4(%esp) - fmull 12(%esp) - fldl20(%esp) - faddp - ret -#endif diff --git a/mingw-w64-crt/math/fma.c b/mingw-w64-crt/math/fma.c new file mode 100644 index 000..645a3d1 --- /dev/null +++ b/mingw-w64-crt/math/fma.c @@ -0,0 +1,29 @@ +/** + * This file has no copyright assigned and is placed in the Public Domain. + * This file is part of the mingw-w64 runtime package. + * No warranty is given; refer to the file DISCLAIMER.PD within this package. + */ +double fma(double x, double y, double z); + +#if defined(_ARM_) || defined(__arm__) + +/* Use hardware FMA on ARM. */ +double fma(double x, double y, double z){ + __asm__ ( +"fmacd %0, %1, %2 \n" +: "+w"(z) +: "w"(x), "w"(y) + ); + return z; +} + +#else + +long double fmal(long double x, long double y, long double z); + +/* For platforms that don't have hardware FMA, emulate it. 
*/ +double fma(double x, double y, double z){ + return (double)fmal(x, y, z); +} + +#endif diff --git a/mingw-w64-crt/math/fmaf.S b/mingw-w64-crt/math/fmaf.S deleted file mode 100644 index 6bc7ef0..000 --- a/mingw-w64-crt/math/fmaf.S +++ /dev/null @@ -1,43 +0,0 @@ -/** - * This file has no copyright assigned and is placed in the Public Domain. - * This file is part of the mingw-w64 runtime package. - * No warrant
Re: [Mingw-w64-public] Implement fused multiply-add (FMA) funcitons for x86 families properly
On 2017/1/23 9:08, David Wohlferd wrote: > Hmm. > > It seems a bit backwards to have the function that takes a 'long double' > calling the function that takes a 'double.' Yes, they are both the same > size on ARM, but I think I would have gone the other way. Plus I kinda > like having all the implementations in one file (fmal.c). I prefer that too. For now I have to follow what mingw-w64 has been doing, that is, keeping separate functions for {f,,l} in different files. > Other than that, this looks ok to me. Building for ARM with clang seems > to work (although I have no way to run it). Thanks for testing. -- Best regards, LH_Mouse -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot ___ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
Re: [Mingw-w64-public] Issue with headers.
On 2017/1/26 19:00, Petri Hodju wrote: > Hi! > I ran into the same problem earlier and I posted patches here on the > list on 2nd December 2016 for this problem. > In short, the ComPtr has specialized constructors that can't access > the protected members as they are not working on class level. The fix > was trivial, changing the direct member variable access to use the > already available accessor methods. > Other problem I encountered was a missing BitmapBrushProperties1 in > the d2d1_1helper1.h, for which I also posted a patch. > With these patches I'm able to build the Qt-5.8.0 just fine : ) > Have I not followed some step of providing patches as these have not > been commented at all so far... ? I am afraid [1] isn't the correct way to fix it. For consistency, the correct solution is to add a `friend` declaration, as the Microsoft people did. Patch attached, please test. [1] https://sourceforge.net/p/mingw-w64/mailman/message/35527066/ -- Best regards, LH_Mouse From 2ab50e9a9b1d3a8d8c6e33d1e2e9077a872166a8 Mon Sep 17 00:00:00 2001 From: LH_Mouse Date: Thu, 26 Jan 2017 19:28:51 +0800 Subject: [PATCH] mingw-w64-headers/include/wrl/client.h: Fix error: 'ptr_' is protected within this context. --- mingw-w64-headers/include/wrl/client.h | 1 + 1 file changed, 1 insertion(+) diff --git a/mingw-w64-headers/include/wrl/client.h b/mingw-w64-headers/include/wrl/client.h index 83b4cb3..448c7a2 100644 --- a/mingw-w64-headers/include/wrl/client.h +++ b/mingw-w64-headers/include/wrl/client.h @@ -252,6 +252,7 @@ namespace Microsoft { */ protected: InterfaceType *ptr_; +template friend class ComPtr; void InternalAddRef() const throw() { if(ptr_) -- 2.10.2
Re: [Mingw-w64-public] delay loading of dll
On 2017/2/7 0:26, Hannes Domani wrote:
> Hello
>
> Does delay-loading work with 32bit executables?
>
> In the following example it crashes for me on the dll_function() call.
> I've used i686-6.3.0-release-win32-dwarf-rt_v5-rev1.7z for my tests.

I compiled the program and it did crash. The assembly code generated looks like this:

    00401570 | push ebp                       | int main(){
    00401571 | mov ebp, esp                   |
    00401573 | and esp, FFFFFFF0              |
    00401576 | call app.401700                | __main();
    0040157B | mov eax, dword ptr ds:[407204] |
    00401580 | call eax                       |
    00401582 | mov eax, 0                     | return 0;
    00401587 | leave                          |
    00401588 | ret                            | }
    0040158C | push ecx                       |
    0040158D | push edx                       |
    0040158E | push eax                       |
    0040158F | push                           |
    00401594 | call app.4026A0                |
    00401599 | pop edx                        |
    0040159A | pop ecx                        |
    0040159B | jmp eax                        |

The pointer at address 407204 should initially point to the DLL loader function, which is located at 0040158C. Here the pointer is initially null, so the call jumps to address zero, hence the crash.

In addition, the assembly code of the DLL loader function is incorrect. The loader function requires the caller to pass the address of the function pointer above (which is 407204) in the EAX register. That is, the first instruction at 0040158C should have been `lea eax, dword ptr ds:[407204]`.

Compiling app.c with `-S -masm=intel` produces the following assembly code, with directives removed:

    _main:
        push ebp
        mov ebp, esp
        and esp, -16
        call ___main
        mov eax, DWORD PTR __imp__dll_function
        call eax
        mov eax, 0
        leave
        ret

The DLL loader function `__imp__dll_function` does not seem to be generated by the compiler. So it seems that dlltool for i686 isn't generating correct machine code for delay-loaded functions.

-- Best regards, LH_Mouse
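The mechanism described above (an import slot that initially points at a loader stub, which then patches the slot) can be sketched in portable C. This is only a simulation under stated assumptions: `real_function`, `loader_stub`, `import_slot` and the counter are hypothetical names, and no actual DLL is loaded.

```c
/* Stand-in for the function exported by the delay-loaded DLL. */
static int real_function(int x)
{
    return x + 1;
}

static int load_count = 0;

static int loader_stub(int x);

/* The "import slot". It must start out pointing at the loader stub;
   if it were null instead, the first call would jump to address zero,
   which is the crash described in the message above. */
static int (*import_slot)(int) = loader_stub;

/* First call lands here: resolve the target, patch the slot, forward the call. */
static int loader_stub(int x)
{
    ++load_count;                /* a real stub would locate the DLL export here */
    import_slot = real_function; /* later calls bypass the stub */
    return import_slot(x);
}

/* What the compiled call site does: an indirect call through the slot. */
int call_through_slot(int x)
{
    return import_slot(x);
}

int times_loaded(void)
{
    return load_count;
}
```

Calling `call_through_slot` twice runs the stub only once; the second call goes straight to `real_function`.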
Re: [Mingw-w64-public] [PATCH] remove cast to int from mantissa for __mingw_printf
On 2017/6/12 23:24, Martell Malone wrote: > In this thread https://sourceforge.net/p/mingw-w64/bugs/459/ > there is a suggested fix for print with whole numbers > > The builtin __mingw_printf is inconsistent with printf on %a format. >> I think __mingw_printf is wrong, because obviously 1.0 != 0x0p-63. > > > vacaboja opened an issue on msys2 for this > https://github.com/msys2/msys2/issues/35 > and suggested a fix of removing the cast to int > here is a patch that does just that. > According to the discussion on that ticket, this patch looks correct. But why was there such a suspicious cast in the first place? It might be there to silence a warning about comparison between signed and unsigned integers, which is enabled by `-Wsign-compare` or `-Wall` in C++ or `-Wextra` in C. If you do see such a warning, I suggest you 1) redeclare `c` as `unsigned` instead of `int`, or 2) cast `c` to `unsigned` before the comparison. -- Best regards, LH_Mouse
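As a hypothetical illustration of why a cast to `int` can produce a result like `0x0p-63` for 1.0, consider a 64-bit mantissa whose only set bit is the top one. The code below is not the actual `__mingw_printf` source; the function names are invented, and the narrowing result is implementation-defined in C (GCC and Clang keep the low 32 bits).

```c
#include <stdint.h>

/* Suspicious pattern: narrowing the 64-bit mantissa to int before testing it.
   The upper 32 bits are discarded, so a mantissa with only high bits set
   looks like zero. */
int mantissa_is_zero_narrowed(uint64_t mantissa)
{
    return (int)mantissa == 0;
}

/* Without the cast, the full 64-bit value is compared. */
int mantissa_is_zero(uint64_t mantissa)
{
    return mantissa == 0;
}
```

For the mantissa 0x8000000000000000 the narrowed test reports zero while the full-width test does not.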
Re: [Mingw-w64-public] [PATCHv2] crt: Add an ldexpl function for arm and arm64
On 2017/12/18 3:43, Martin Storsjö wrote: > Since long double just is normal double on arm and arm64, just > call the normal ldexp function. > > Signed-off-by: Martin Storsjö > --- > Fixed the parameters in the C wrappers. > --- > > mingw-w64-crt/Makefile.am | 3 ++- > mingw-w64-crt/math/arm/ldexpl.c > | 16 > mingw-w64-crt/math/arm64/ldexpl.c | 16 > > 3 files changed, 34 insertions(+), 1 deletion(-) > > create mode 100644 mingw-w64-crt/math/arm/ldexpl.c > create mode 100644 > mingw-w64-crt/math/arm64/ldexpl.c > > diff --git a/mingw-w64-crt/Makefile.am > b/mingw-w64-crt/Makefile.am > index 6812a5e..7d6c395 100644 > --- > a/mingw-w64-crt/Makefile.am > +++ b/mingw-w64-crt/Makefile.am > @@ -390,13 > +390,14 @@ src_libmingwexarm32+=\ >math/softmath/sinf.c > math/softmath/sinl.c math/softmath/tanf.c math/softmath/tanl.c > > else > src_libmingwexarm32+=\ > - math/arm/exp2.c math/arm/log2.c > math/arm/scalbn.c math/arm/sincos.c > + math/arm/exp2.c >math/arm/ldexpl.c math/arm/log2.c math/arm/scalbn.c >math/arm/sincos.c > endif > > # these only go into the ARM64 > version: > src_libmingwexarm64=\ >math/arm64/_chgsignl.S > math/arm64/ceil.S math/arm64/ceilf.Smath/arm64/ceill.S > math/arm64/copysignl.c\ >math/arm64/exp2.S math/arm64/exp2f.S >math/arm64/floor.Smath/arm64/floorf.S > math/arm64/floorl.S \ > + math/arm64/ldexpl.c \ >math/arm64/log2.c > math/arm64/nearbyint.Smath/arm64/nearbyintf.S > math/arm64/nearbyintl.S math/arm64/scalbn.c \ > > math/arm64/sincos.c math/arm64/trunc.Smath/arm64/truncf.S > > > diff --git a/mingw-w64-crt/math/arm/ldexpl.c > b/mingw-w64-crt/math/arm/ldexpl.c > new file mode 100644 > index > 000..7d3bffc > --- /dev/null > +++ b/mingw-w64-crt/math/arm/ldexpl.c > @@ > -0,0 +1,16 @@ > +/** > + * This file has no copyright assigned and is placed > in the Public Domain. > + * This file is part of the mingw-w64 runtime > package. > + * No warranty is given; refer to the file DISCLAIMER.PD within > this package. 
> + */ > + > +#include> + > +long double ldexpl(long double x, > int n) > +{ > +#if defined(__arm__) || defined(_ARM_) > +return ldexp(x, > n); > +#else > +#error Not supported on your platform yet > +#endif > +} > > diff --git a/mingw-w64-crt/math/arm64/ldexpl.c > b/mingw-w64-crt/math/arm64/ldexpl.c > new file mode 100644 > index > 000..bfa3287 > --- /dev/null > +++ b/mingw-w64-crt/math/arm64/ldexpl.c > > @@ -0,0 +1,16 @@ > +/** > + * This file has no copyright assigned and is > placed in the Public Domain. > + * This file is part of the mingw-w64 runtime > package. > + * No warranty is given; refer to the file DISCLAIMER.PD within > this package. > + */ > + > +#include
Re: [Mingw-w64-public] [PATCH 3/5] libmsvcr*.a: Added compatibility implementation of onexit table functions.
On 2018/1/12 4:31, Jacek Caban wrote: > Signed-off-by: Jacek Caban
Re: [Mingw-w64-public] [PATCH] intrin-impl.h: Added missing volatile to _interlockedbittestandset and _interlockedbittestandset64 declarations.
On 2018/1/12 4:28, Jacek Caban wrote: > > Fixes compilation with clang. > > Signed-off-by: Jacek Caban
Re: [Mingw-w64-public] [PATCH 2/5] corecrt_startup.h: Added _onexit_table_t and related functions declarations.
On 2018/1/12 4:30, Jacek Caban wrote: > Signed-off-by: Jacek Caban
Re: [Mingw-w64-public] [ucrt]missing quick_exit in stdlib.h/cstdlib
On 4/20/21 9:31 PM, yume todo wrote: > > In ucrt64, quick_exit is not found. > > This has been fixed on master now: https://sourceforge.net/p/mingw-w64/mingw-w64/ci/7dda261 -- Best regards, Liu Hao
Re: [Mingw-w64-public] wine FTBFS with mingw64 gcc 11: undefined reference to `sincos'
On 5/15/21 1:27 AM, Jacek Caban wrote: > > I think that the decision was unfortunate on GCC side, but there is little we > can do. We will > probably need to provide it in msvcrt importlibs. Please try the attached > patch, it should help. > > Wouldn't GCC transform such a pair of calls back into `sincos()`, resulting in infinite recursion? -- Best regards, Liu Hao
Re: [Mingw-w64-public] patch to add htonll/ntohll
On 2021-12-17 02:13, Michel Zou wrote: > Hi, > It turns out that these are inline functions, here is a new patch. > xan > Thanks. Pushed to master. Next time, please send a patch created by `git format-patch`, and please sign off the commit with `git commit -s`. -- Best regards, LIU Hao
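For context, a 64-bit host-to-network conversion can be written without knowing the host byte order. The sketch below is not the code from the pushed patch; `htonll_sketch` and `ntohll_sketch` are hypothetical names chosen to avoid clashing with any real declarations.

```c
#include <stdint.h>
#include <string.h>

/* Host to network order: store the most significant byte first. */
uint64_t htonll_sketch(uint64_t host)
{
    unsigned char bytes[8];
    for (int i = 0; i < 8; ++i)
        bytes[i] = (unsigned char)(host >> (56 - 8 * i));

    uint64_t net;
    memcpy(&net, bytes, sizeof net);
    return net;
}

/* Network to host order: read the bytes back, most significant first. */
uint64_t ntohll_sketch(uint64_t net)
{
    unsigned char bytes[8];
    memcpy(bytes, &net, sizeof bytes);

    uint64_t host = 0;
    for (int i = 0; i < 8; ++i)
        host = (host << 8) | bytes[i];
    return host;
}
```

Because the functions work on individual bytes, they behave the same on little-endian and big-endian hosts, at the cost of relying on the compiler to turn the loops into a byte swap where appropriate.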