ArnoldTheresius <arnold.we...@siemens.com> writes:

> trueroad wrote
>> Hello Arnold.
>> Thank you for your patch.
>>
>> If I understand correctly, we only need to check whether `__x86__`
>> and `__i386__` are defined.
>>
>> In an x86_64 environment, neither `__x86__` nor `__i386__` is defined.
>>
>> ```
>> $ echo | x86_64-w64-mingw32-gcc -dM -E - | grep "__x86__"
>>
>> $ echo | x86_64-w64-mingw32-gcc -dM -E - | grep "__i386__"
>>
>> $ echo | i686-w64-mingw32-gcc -dM -E - | grep "__x86__"
>>
>> $ echo | i686-w64-mingw32-gcc -dM -E - | grep "__i386__"
>> #define __i386__ 1
>>
>> $
>> ```
>>
>> Therefore, `defined (__code_model_32__)` is not necessary.
>>
>> If `__SSE2_MATH__` is defined, it only indicates that main.cc is
>> compiled with SSE2 math enabled.
>> Shared libraries such as libguile may still use the x86 FPU.
>> If the floating-point calculation inside GUILE uses the x86 FPU, we
>> need to set the precision even if C++ uses SSE2.
>>
>> Therefore, `defined (__SSE2_MATH__)` is not necessary.
>>
>> Furthermore, if the floating-point operations of all modules use SSE2,
>> setting the x86 FPU precision has no effect because the FPU is not used.
>> In other words, there is no problem even if the precision is set on a
>> platform that does not need it.
>>
>> Therefore, environment definitions such as `defined (__MINGW32__)` are
>> not necessary.
>>
>>
>> https://codereview.appspot.com/577450043/
>
> Thank you for the clear answer.
>
> Therefore, do it as suggested [move asm() call closer to _FPU_SETCW() call]
>
> ArnoldTheresius.
>
> --
> Sent from: http://lilypond.1069038.n5.nabble.com/Dev-f88644.html
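For illustration, the guard described in the quoted message could look roughly like the sketch below. This is not the code from the patch: the function name `set_x87_double_precision` is hypothetical, and the sketch uses GCC-style inline assembly directly instead of `_FPU_SETCW()` so that it does not depend on `<fpu_control.h>`. It assumes a GCC-compatible compiler.

```cpp
// Sketch only: set_x87_double_precision is a hypothetical name,
// not a function from LilyPond or from the patch under review.
// Returns true if the x87 precision-control field now selects double
// precision, false if the guard is inactive (for example on x86_64).
bool
set_x87_double_precision ()
{
#if defined (__x86__) || defined (__i386__)
  unsigned short cw;
  // Read the current x87 control word.
  __asm__ __volatile__ ("fnstcw %0" : "=m" (cw));
  // Bits 8-9 are the precision-control field;
  // 0x0200 selects a 53-bit significand (double precision).
  cw = static_cast<unsigned short> ((cw & ~0x0300) | 0x0200);
  // Write the modified control word back.
  __asm__ __volatile__ ("fldcw %0" : : "m" (cw));
  // Read it again to confirm the setting took effect.
  unsigned short check;
  __asm__ __volatile__ ("fnstcw %0" : "=m" (check));
  return (check & 0x0300) == 0x0200;
#else
  return false;
#endif
}
```

Because the guard tests only `__x86__` and `__i386__`, the body compiles to a plain `return false` on x86_64, which matches the preprocessor output shown in the quoted message.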
None of the following GCC options would help?

    The following options control compiler behavior regarding
    floating-point arithmetic.  These options trade off between speed
    and correctness.  All must be specifically enabled.

    -ffloat-store
        Do not store floating-point variables in registers, and inhibit
        other options that might change whether a floating-point value
        is taken from a register or memory.

        This option prevents undesirable excess precision on machines
        such as the 68000 where the floating registers (of the 68881)
        keep more precision than a "double" is supposed to have.
        Similarly for the x86 architecture.  For most programs, the
        excess precision does only good, but a few programs rely on the
        precise definition of IEEE floating point.  Use -ffloat-store
        for such programs, after modifying them to store all pertinent
        intermediate computations into variables.

    -fexcess-precision=style
        This option allows further control over excess precision on
        machines where floating-point operations occur in a format with
        more precision or range than the IEEE standard and interchange
        floating-point types.  By default, -fexcess-precision=fast is
        in effect; this means that operations may be carried out in a
        wider precision than the types specified in the source if that
        would result in faster code, and it is unpredictable when
        rounding to the types specified in the source code takes place.
        When compiling C, if -fexcess-precision=standard is specified
        then excess precision follows the rules specified in ISO C99;
        in particular, both casts and assignments cause values to be
        rounded to their semantic types (whereas -ffloat-store only
        affects assignments).  This option is enabled by default for C
        if a strict conformance option such as -std=c99 is used.
        -ffast-math enables -fexcess-precision=fast by default
        regardless of whether a strict conformance option is used.

        -fexcess-precision=standard is not implemented for languages
        other than C.  On the x86, it has no effect if -mfpmath=sse or
        -mfpmath=sse+387 is specified; in the former case, IEEE
        semantics apply without excess precision, and in the latter,
        rounding is unpredictable.

    -ffast-math
        Sets the options -fno-math-errno, -funsafe-math-optimizations,
        -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans,
        -fcx-limited-range and -fexcess-precision=fast.

        This option causes the preprocessor macro "__FAST_MATH__" to be
        defined.

        This option is not turned on by any -O option besides -Ofast
        since it can result in incorrect output for programs that
        depend on an exact implementation of IEEE or ISO
        rules/specifications for math functions.  It may, however,
        yield faster code for programs that do not require the
        guarantees of these specifications.

-- 
David Kastrup
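As a rough way to see which of the situations described in the GCC excerpt applies to a given build, one can inspect the standard `FLT_EVAL_METHOD` macro from `<cfloat>`. The sketch below is only an illustration: the function name `describe_fp_evaluation` is hypothetical, and the mapping in the comments reflects what GCC typically reports (0 with SSE math, 2 with plain x87 math, a negative value when evaluation is indeterminate), which is an assumption about the toolchain rather than a guarantee.

```cpp
#include <cfloat>
#include <string>

// Sketch only: describe_fp_evaluation is a hypothetical helper name.
// It reports how the compiler says it evaluates floating-point
// expressions for this translation unit.
std::string
describe_fp_evaluation ()
{
#if defined (FLT_EVAL_METHOD)
  switch (FLT_EVAL_METHOD)
    {
    case 0:
      // No excess precision; typical for SSE math.
      return "each operation is evaluated in its own type";
    case 1:
      return "float and double are evaluated as double";
    case 2:
      // Excess precision; typical for x87 math.
      return "all operations are evaluated as long double";
    default:
      // Negative or other values: nothing can be assumed.
      return "indeterminate or implementation-defined";
    }
#else
  return "FLT_EVAL_METHOD is not available";
#endif
}
```

This only describes the translation unit in which it is compiled. It says nothing about shared libraries such as libguile, which may have been built with different floating-point settings.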