Matt and I talked about whether we needed the compile check. He didn't think we did, because we require a GCC that has -msse4.1 support. The HAVE vs USE mismatch is a real bug, though (someone else noticed that too).
CC'ing Matt in case I'm misremembering something.

Quoting Scott D Phillips (2018-01-24 10:28:53)
> Before we were adding -DHAVE_SSE41 which isn't what the code is
> looking for, so some uses of the sse4.1 code were always being
> skipped.
>
> Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
> ---
>  meson.build | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/meson.build b/meson.build
> index 97619f786b..3bbda53ccf 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -771,9 +771,9 @@ foreach a : ['-Werror=pointer-arith', '-Werror=vla']
>    endif
>  endforeach
>
> +with_sse41 = false
> +sse41_args = []
>  if host_machine.cpu_family().startswith('x86')
> -  pre_args += '-DHAVE_SSE41'
> -  with_sse41 = true
>    sse41_args = ['-msse4.1']
>
>    # GCC on x86 (not x86_64) with -msse* assumes a 16 byte aligned stack, but
> @@ -781,9 +781,19 @@ if host_machine.cpu_family().startswith('x86')
>    if host_machine.cpu_family() == 'x86'
>      sse41_args += '-mstackrealign'
>    endif
> -else
> -  with_sse41 = false
> -  sse41_args = []
> +
> +  if cc.compiles('''#include <smmintrin.h>
> +                    int param;
> +                    int main () {
> +                      __m128i a = _mm_set1_epi32 (param), b = _mm_set1_epi32 (param + 1), c;
> +                      c = _mm_max_epu32(a, b);
> +                      return _mm_cvtsi128_si32(c);
> +                    }''',
> +                 name : 'SSE4.1 intrinsics',
> +                 args : sse41_args)
> +    with_sse41 = true
> +    pre_args += '-DUSE_SSE41'
> +  endif
>  endif
>
>  # Check for GCC style atomics
> --
> 2.14.3
>
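
For anyone following along, here is a minimal sketch of why the old define had no effect. It assumes the C sources guard their SSE4.1 paths with `#ifdef USE_SSE41`, which is what the patch description implies. The helper name `sse41_path_enabled` is made up for illustration, and the `#define` stands in for the old build passing -DHAVE_SSE41 on the command line.

```c
/* Stand-in for the old meson build passing -DHAVE_SSE41. */
#define HAVE_SSE41 1

/* Hypothetical helper: reports which path the preprocessor selected.
 * The guard tests USE_SSE41, so defining HAVE_SSE41 does not enable it. */
static int sse41_path_enabled(void)
{
#ifdef USE_SSE41
    return 1;   /* an SSE4.1 implementation would be used here */
#else
    return 0;   /* the generic fallback is used instead */
#endif
}
```

With only HAVE_SSE41 defined the helper returns 0, so the guarded code is compiled out even though the build believed SSE4.1 was enabled. Defining USE_SSE41 instead, as the patch does, makes it return 1.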
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev