On Tue, 10 Apr 2018, Jeff Hammond wrote:

> This should generate an SSE2 binary:
>
> 'COPTFLAGS=-g',
> 'FOPTFLAGS=-g',
>
> This should generate a KNL binary:
>
> 'COPTFLAGS=-g -xMIC-AVX512 -O3',
> 'FOPTFLAGS=-g -xMIC-AVX512 -O3',
>
> This should generate an SSE2 binary that also supports CORE-AVX2 dispatch.
>
> '--COPTFLAGS=-g -axcore-avx2',
> '--FOPTFLAGS=-g -axcore-avx2',
>
> I don't see a good reason for the third option to fail. Please report this
> bug to Intel.
>
> You might also verify that this works:
>
> '--COPTFLAGS=-g -xCORE-AVX2',
> '--FOPTFLAGS=-g -xCORE-AVX2',

This fails the same way as -axcore-avx2.

> In general, one should avoid compiling for SSE on KNL, because SSE-AVX
> transition penalties need to be avoided (google should find the details).
> Are you trying to generate a single binary that is portable to ancient
> Core/Xeon and KNL?

My usage here is to reproduce the issue reported by Randy. I assumed the
KNL box we have would be the easiest way to do that.

Satish

> I recommend that you use AVX (Sandy Bridge) -
> preferably AVX2 (Haswell) - as your oldest ISA target when generating a
> portable binary that includes KNL support.
>
> Jeff
>
> On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>
> > I tried a few builds with:
> >
> > '--with-64-bit-indices=1',
> > '--with-memalign=64',
> > '--with-blaslapack-dir=/home/intel/18/compilers_and_libraries_2018.0.128/linux/mkl',
> > '--with-cc=icc',
> > '--with-fc=ifort',
> > '--with-cxx=0',
> > '--with-debugging=0',
> > '--with-mpi=0',
> >
> > And then changed the OPTFLAGS:
> >
> > 1. 'basic -g' - works fine
> >
> > 'COPTFLAGS=-g',
> > 'FOPTFLAGS=-g',
> >
> > 2. 'avx512' - works fine
> >
> > 'COPTFLAGS=-g -xMIC-AVX512 -O3',
> > 'FOPTFLAGS=-g -xMIC-AVX512 -O3',
> >
> > 3. 'avx2' - breaks.
> >
> > '--COPTFLAGS=-g -axcore-avx2',
> > '--FOPTFLAGS=-g -axcore-avx2',
> >
> > With a breakpoint at dmdavecrestorearrayf903_() in gdb, I see that the
> > stack is fine during the first call to dmdavecrestorearrayf903_()
> > but is corrupted by the time of the second call to
> > dmdavecrestorearrayf903_() [i.e., ierr=0x7fffffffb4a0 changes to
> > ierr=0x0]
> >
> > >>>>>>>>>>
> >
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,
> >     v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>,
> >     ierr=0x7fffffffb4a0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > 153 {
> > (gdb) where
> > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,
> >     v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>, ierr=0x7fffffffb4a0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > #1  0x0000000000401abd in test () at ex1f.F90:80
> > #2  0x00000000004011ae in main ()
> > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > #4  0x00000000004010b9 in _start ()
> > (gdb) c
> > Continuing.
> >
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,
> >     v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > 153 {
> > (gdb) where
> > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>,
> >     v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > #1  0x0000000000401ada in test () at ex1f.F90:81
> > #2  0x00000000004011ae in main ()
> > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > #4  0x00000000004010b9 in _start ()
> > (gdb)
> >
> > >>>>>>>>>
> >
> > It's not clear to me why this happens [and why it would work with
> > -xMIC-AVX512 but break with -axcore-avx2].
> >
> > Perhaps Richard or Jeff have better insight on this.
> >
> > BTW: The above run is with:
> >
> > bash-4.2$ icc --version
> > icc (ICC) 18.0.0 20170811
> >
> > Satish
> >
> > On Mon, 9 Apr 2018, Satish Balay wrote:
> >
> > > I'm able to reproduce this problem on the knl box [with the attached
> > > test code]. But it goes away if I rebuild without the option
> > > --with-64-bit-indices.
> > >
> > > Will have to check further..
> > >
> > > Satish
> > >
> > >
> > > On Thu, 5 Apr 2018, Randall Mackie wrote:
> > >
> > > > Dear PETSc users,
> > > >
> > > > I’m curious if anyone else experiences problems using
> > > > DMDAVecGetArrayF90 in conjunction with Intel compilers?
> > > > We have had many problems (typically 11 SEGV segmentation violations)
> > > > when PETSc is compiled in optimize mode (with various combinations of
> > > > options).
> > > > These same codes run valgrind clean with gfortran, so I assume this is
> > > > an Intel bug, but before we submit a bug report I wanted to see if
> > > > anyone else had similar experiences?
> > > > We have basically gone back and replaced our calls to
> > > > DMDAVecGetArrayF90 with calls to VecGetArrayF90 and pass those pointers
> > > > into a “local” subroutine that works fine.
> > > >
> > > > In case anyone is curious, the attached test code shows this behavior
> > > > when PETSc is compiled with the following options:
> > > >
> > > > ./configure \
> > > > --with-clean=1 \
> > > > --with-debugging=0 \
> > > > --with-fortran=1 \
> > > > --with-64-bit-indices \
> > > > --download-mpich=../mpich-3.3a2.tar.gz \
> > > > --with-blas-lapack-dir=/opt/intel/mkl \
> > > > --with-cc=icc \
> > > > --with-fc=ifort \
> > > > --with-cxx=icc \
> > > > --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > > --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > > --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > >
> > > > Thanks, Randy M.
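
For reference, the flag variants compared in this thread can be laid out side by side. What follows is only a minimal Python sketch and is not taken from any message above: the names BASE_OPTIONS, VARIANTS, and configure_args are hypothetical, the option strings and outcomes are copied from the messages, and the script merely assembles argument lists in the same quoted-string style. It does not invoke PETSc's configure.

```python
# Hypothetical helper: assembles the configure arguments for each build
# variant discussed in the thread. Nothing here runs PETSc's configure.

# Options common to Satish's test builds (copied from the thread).
BASE_OPTIONS = [
    "--with-64-bit-indices=1",
    "--with-memalign=64",
    "--with-blaslapack-dir=/home/intel/18/compilers_and_libraries_2018.0.128/linux/mkl",
    "--with-cc=icc",
    "--with-fc=ifort",
    "--with-cxx=0",
    "--with-debugging=0",
    "--with-mpi=0",
]

# Optimization flags per variant, paired with the outcome reported in the thread.
VARIANTS = {
    "basic -g": ("-g", "works"),
    "avx512": ("-g -xMIC-AVX512 -O3", "works"),
    "avx2 dispatch": ("-g -axcore-avx2", "fails"),
    "avx2 only": ("-g -xCORE-AVX2", "fails"),
}


def configure_args(variant):
    """Return the full argument list for one named variant."""
    flags, _outcome = VARIANTS[variant]
    return BASE_OPTIONS + [f"--COPTFLAGS={flags}", f"--FOPTFLAGS={flags}"]


if __name__ == "__main__":
    # Print a one-line summary per variant: name, reported outcome, flags.
    for name, (flags, outcome) in VARIANTS.items():
        print(f"{name:14s} {outcome:6s} {flags}")
```

Keeping the common options in one list makes it easier to see that only the optimization flags differ between the working and failing builds.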