On Thu, Mar 02, 2017 at 10:03:28AM +0100, Thomas Koenig wrote: > Am 02.03.2017 um 09:43 schrieb Jakub Jelinek: > > On Wed, Mar 01, 2017 at 10:00:08PM +0100, Thomas Koenig wrote: > > > @@ -101,7 +93,7 @@ > > > `static void > > > 'matmul_name` ('rtype` * const restrict retarray, > > > 'rtype` * const restrict a, 'rtype` * const restrict b, int try_blas, > > > - int blas_limit, blas_call gemm) __attribute__((__target__("avx2"))); > > > + int blas_limit, blas_call gemm) __attribute__((__target__("avx2,fma"))); > > > static' include(matmul_internal.m4)dnl > > > `#endif /* HAVE_AVX2 */ > > > > > > > I guess the question here is if there are any CPUs that have AVX2 but don't > > have FMA3. If there are none, then this is not controversial, if there are > > some, it depends on how widely they are used compared to ones that have both > > AVX2 and FMA3. Going just from our -march= bitsets, it seems if there is > > PTA_AVX2, then there is also PTA_FMA: haswell, broadwell, skylake, > > skylake-avx512, knl, > > bdver4, znver1, there are CPUs that have just PTA_AVX and not PTA_AVX2 and > > still have PTA_FMA: bdver2, bdver3 (but that is not relevant to this patch). > > In a previous incantation of the patch, I saw that the compiler > generated the same floating point code for AVX and AVX2 (which why > there currently is no AVX2 floating point version). I could also > generate an AVX+FMA version for floating point and an AVX2 version > for integer (if anybody cares about integer matmul).
I think having another avx,fma version is not worth it, avx+fma is far less common than avx without fma. > > > @@ -147,7 +141,8 @@ > > > #endif /* HAVE_AVX512F */ > > > > > > #ifdef HAVE_AVX2 > > > - if (__cpu_model.__cpu_features[0] & (1 << FEATURE_AVX2)) > > > + if ((__cpu_model.__cpu_features[0] & (1 << FEATURE_AVX2)) > > > + && (__cpu_model.__cpu_features[0] & (1 << FEATURE_FMA))) > > > { > > > matmul_p = matmul_'rtype_code`_avx2; > > > goto tailcall; > > > > and this too. > > Will do. Note I meant obviously the FEATURE_AVX512F related hunk, not this one, sorry. Jakub