"Smith, Barry F. via petsc-dev" <petsc-dev@mcs.anl.gov> writes:
> This needs to be fixed properly with a configure test(s) and not with huge > and inconsistent checks like this > > #if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && > defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && > !defined(PETSC_USE_64BIT_INDICES) or this > > #elif defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H) && > defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && > !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) && > !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND) > > > > self.useAVX512Kernels = self.framework.argDB['with-avx512-kernels'] > if self.useAVX512Kernels: > self.addDefine('USE_AVX512_KERNELS', 1) > > Here you should check that the needed include files are available, that > it is 32 bit integers, that defined(__AVX512F__) exists and that appropriate > functions exist. What happens when one AVX512 kernel gets a 64-bit integer or single-precision implementation? You'll have new macros for each combination of criteria instead of using && on the normal macros? We should investigate whether #pragma omp simd or ivdep can generate comparable vectorization without directly using the intrinsics -- those are quite a bit more portable when they work, and they often work with some experimentation. > Maybe you need two configure checks if there are two different types of > functionality you are trying to catch? > > Continuing to hack away with gross #if def means this crud is > unmaintainable and will always haunt you. Yes the incremental cost of doing a > proper configure test is there but once that is done the maintenance costs > (which have been haunting use for months) will be gone. > > Barry > > > > > > > > > >> On Oct 25, 2019, at 2:49 AM, Lisandro Dalcin via petsc-dev >> <petsc-dev@mcs.anl.gov> wrote: >> >> >> >> On Fri, 25 Oct 2019 at 01:40, Balay, Satish <ba...@mcs.anl.gov> wrote: >> I'm curious why this issue comes up for you. The code was unrelated to >> --with-avx512-kernels=0 option. >> >> Its relying on __AVX512F__and PETSC_HAVE_IMMINTRIN_H flags. And >> assumes immintrin.h has a definition for _mm512_reduce_add_pd() >> >> Is the flag __AVX512F__ always set on your machine by gcc? >> >> And does this change based on the hardware? I just tried this build >> [same os/compiler] on "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz" - and >> can't reproduce the issue. >> >> I do see _mm512_reduce_add_pd is missing from immintrin.h - but the >> flag __AVX512F__ is not set for me. >> >> >> Of course it is not set, you are just invoking the preprocessor. Try this >> way: >> >> $ cat xyz.c >> #if defined __AVX512F__ >> #error "avx512f flag set" >> #endif >> >> $ gcc -march=native -c xyz.c >> xyz.c:2:2: error: #error "avx512f flag set" >> #error "avx512f flag set" >> >> I forgot to mention my XXXOPTFLAGS, full reconfigure script below. >> Do we have some Ubuntu 16 builder using system GCC? >> Maybe we should use `-march=native -O3 -g3` in one of these builders? >> >> $ cat arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py >> #!/usr/bin/python >> if __name__ == '__main__': >> import sys >> import os >> sys.path.insert(0, os.path.abspath('config')) >> import configure >> configure_options = [ >> '--COPTFLAGS=-march=native -mtune=native -O3', >> '--CXXOPTFLAGS=-march=native -mtune=native -O3', >> '--FOPTFLAGS=-march=native -mtune=native -O3', >> '--download-metis=1', >> '--download-p4est=1', >> '--download-parmetis=1', >> '--with-avx512-kernels=0', >> '--with-debugging=0', >> '--with-fortran-bindings=0', >> '--with-zlib=1', >> 'CC=mpicc', >> 'CXX=mpicxx', >> 'FC=mpifort', >> 'PETSC_ARCH=arch-gnu-opt', >> ] >> configure.petsc_configure(configure_options) >> >> >> -- >> Lisandro Dalcin >> ============ >> Research Scientist >> Extreme Computing Research Center (ECRC) >> King Abdullah University of Science and Technology (KAUST) >> http://ecrc.kaust.edu.sa/