Forgot to mention: strictly speaking, only __SSE__ is necessary for _mesa_lroundevenf, so it would work on those shiny Pentium 3 and Athlon XP chips... The double version (though it's unused), however, requires __SSE2__.
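To make the split concrete, here is a rough sketch (not the patch itself): the float conversion only needs xmmintrin.h, while the double one needs emmintrin.h. The sketch_* function names and SKETCH_HAVE_* macros are made up for this example, it only looks at the gcc-style __SSE__/__SSE2__ macros, and it uses the 32-bit conversions everywhere for brevity, so it assumes the result fits in an int.

```c
#include <math.h>

/* Hypothetical feature macros for this sketch; not part of the patch. */
#if defined(__SSE__)
#define SKETCH_HAVE_SSE 1
#include <xmmintrin.h>   /* SSE: _mm_load_ss, _mm_cvtss_si32 */
#endif

#if defined(__SSE2__)
#define SKETCH_HAVE_SSE2 1
#include <emmintrin.h>   /* SSE2: _mm_load_sd, _mm_cvtsd_si32 */
#endif

/* Float version: cvtss2si is a plain SSE instruction, so __SSE__ suffices.
 * It rounds according to MXCSR, which is round-to-nearest-even by default. */
static inline long
sketch_lroundevenf(float x)
{
#ifdef SKETCH_HAVE_SSE
   return _mm_cvtss_si32(_mm_load_ss(&x));
#else
   /* Fallback: lrintf honours the current rounding mode
    * (round-to-nearest-even unless it was changed). */
   return lrintf(x);
#endif
}

/* Double version: cvtsd2si was only added with SSE2. */
static inline long
sketch_lroundeven(double x)
{
#ifdef SKETCH_HAVE_SSE2
   return _mm_cvtsd_si32(_mm_load_sd(&x));
#else
   return lrint(x);
#endif
}
```

With the default rounding mode, both functions round halfway cases to the even neighbour, e.g. 2.5 goes to 2 and 3.5 goes to 4.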
Roland

On 09.08.2015 at 19:23, Roland Scheidegger wrote:
> On 09.08.2015 at 18:47, Matt Turner wrote:
>> On Sun, Aug 9, 2015 at 3:57 AM, Jose Fonseca <jfons...@vmware.com> wrote:
>>> As currently only GCC x86_64 builds were using it.
>>> ---
>>>  src/util/rounding.h | 16 +++++++++++++---
>>>  1 file changed, 13 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/util/rounding.h b/src/util/rounding.h
>>> index ec31b47..38c1c2f 100644
>>> --- a/src/util/rounding.h
>>> +++ b/src/util/rounding.h
>>> @@ -27,7 +27,17 @@
>>>  #include <math.h>
>>>  #include <limits.h>
>>>
>>> -#ifdef __x86_64__
>>> +/* SSE2 is supported on: all x86_64 targets, on x86 targets when -msse2 is
>>> + * passed to GCC, and should also be enabled on all Windows builds. */
>>> +#if defined(__x86_64__) /* gcc */ || \
>>> +    defined(_M_X64) /* msvc */ || \
>>> +    defined(_M_AMD64) /* msvc */ || \
>>> +    defined(__SSE2__) /* gcc -msse2 */ || \
>>
>> I don't think we should include __SSE2__ in this. On x86-32,
>> floating-point operations will be using the x87 FPU, so using SSE
>> intrinsics will force some transfers to and from memory.
> My understanding is that you will only get __SSE2__ if you actually used
> -msse2 to build; otherwise it will not be defined. And in this case
> gcc will use sse/sse2 for all float math, thus this is quite appropriate.
>
> As for the _WIN32, I think we always enable sse math for windows.
> scons gallium.py always defines /arch:SSE2 for windows msvc builds (and
> mentions it's the default for msvc 2012 anyway), again meaning these
> builds will not touch the x87 fpu (and not run on your 80486...).
> Actually, reading some MS docs, the compiler may well choose to run a mix
> of x87 and sse/sse2, but in this case there'd be no way to know if the
> sse conversion instruction should be used anyway.
> Don't ask me about other compilers on windows, though...
> I guess though this means that you could not make mesa run on such chips
> if you wanted by just changing compile flags, but I don't know if that's
> really a problem. (I think the problem here is that if you'd use
> something like /arch:AVX then indeed __AVX__ would get defined by msvc,
> however for /arch:SSE2 no __SSE2__ macro (or any other) would get
> defined (boo...). Though I guess it could be defined manually for such
> builds somewhere in the build infrastructure, depending on the /arch
> flag used for compiling.)
>
> Roland
>
>
>>
>>> +    defined(_WIN32)
>>> +#define HAVE_SSE2 1
>>
>> Does MSVC define __amd64, __amd64__, or __x86_64? The AMD64 ABI
>> document says these, and __x86_64__ should be defined if compiling on
>> x86-64. With the removal of __SSE2__ above, I'd make this
>>
>> #ifndef __x86_64__
>> #if defined(_M_X64) /* msvc */ || \
>>     defined(_M_AMD64) /* msvc */ || \
>>     defined(_WIN32)
>> #define __x86_64__
>> #endif
>> #endif
>>
>>> +#endif
>>> +
>>> +#ifdef HAVE_SSE2
>>>  #include <xmmintrin.h>
>>>  #include <emmintrin.h>
>>>  #endif
>>> @@ -93,7 +103,7 @@ _mesa_roundeven(double x)
>>>  static inline long
>>>  _mesa_lroundevenf(float x)
>>>  {
>>> -#ifdef __x86_64__
>>> +#ifdef HAVE_SSE2
>>>  #if LONG_BIT == 64
>>>     return _mm_cvtss_si64(_mm_load_ss(&x));
>>>  #elif LONG_BIT == 32 || defined(_WIN32)
>>> @@ -113,7 +123,7 @@ _mesa_lroundevenf(float x)
>>>  static inline long
>>>  _mesa_lroundeven(double x)
>>>  {
>>> -#ifdef __x86_64__
>>> +#ifdef HAVE_SSE2
>>>  #if LONG_BIT == 64
>>>     return _mm_cvtsd_si64(_mm_load_sd(&x));
>>>  #elif LONG_BIT == 32 || defined(_WIN32)
>>> --
>>> 2.1.4
>>>
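On the MSVC point in the quoted reply (no __SSE2__ macro for /arch:SSE2): as far as I know, 32-bit MSVC does report the /arch level through _M_IX86_FP (0 for IA32, 1 for SSE, 2 for SSE2 and higher), so the check could perhaps be written without the blanket _WIN32 test. A minimal sketch of that idea follows; SKETCH_HAVE_SSE2 and sketch_have_sse2() are hypothetical names for this example, and this is not what the patch does.

```c
/* SKETCH_HAVE_SSE2 is a hypothetical name used only in this example. */
#if defined(__SSE2__)                         /* gcc/clang: x86_64, or x86 with -msse2 */
#define SKETCH_HAVE_SSE2 1
#elif defined(_M_X64) || defined(_M_AMD64)    /* msvc, 64-bit: SSE2 is baseline */
#define SKETCH_HAVE_SSE2 1
#elif defined(_M_IX86_FP) && _M_IX86_FP >= 2  /* msvc, 32-bit: /arch:SSE2 or higher */
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

#if SKETCH_HAVE_SSE2
#include <emmintrin.h>   /* only pulled in when the SSE2 intrinsics are usable */
#endif

/* Reports the compile-time decision: 1 if SSE2 code may be emitted, else 0. */
static inline int
sketch_have_sse2(void)
{
   return SKETCH_HAVE_SSE2;
}
```

This would keep a 32-bit MSVC build with /arch:IA32 on the non-SSE path instead of assuming SSE2 for every Windows build; other Windows compilers would still need their own check.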