On Thu, Mar 18, 2021 at 8:49 AM Steve Kargl via Fortran <fortran@gcc.gnu.org> wrote:
>
> It seems that gfortran will inline MATMUL with optimization.
> This produces very poor performance.  In fact, gfortran will
> inline MATMUL even if one specifies -fexternal-blas.  This is
> very bad.
>
> % cat a.f90
> program main
>
>   implicit none
>
>   integer, parameter :: imax = 20000, jmax = 10000
>   real, allocatable :: inVect(:), matrix(:,:), outVect(:)
>   real :: start, finish
>
>   allocate(invect(imax), matrix(imax,jmax), outvect(jmax))
>
>   call random_number(inVect)
>   call random_number(matrix)
>
>   call cpu_time(start)
>   outVect = matmul(inVect, matrix)
>   call cpu_time(finish)
>
>   print '("Time = ",f10.7," seconds. – First Value = ",f10.4)',finish-start,outVect(1)
> end program main
>
> % gfcx -o z -O0 a.f90 && ./z
> Time = 0.2234111 seconds. – First Value = 4982.6362
> % nm z | grep matmul
>                  U _gfortran_matmul_r4@@GFORTRAN_8
> % gfcx -o z -O1 a.f90 && ./z
> Time = 0.3295890 seconds. – First Value = 4971.0962
> % nm z | grep matmul
> % gfcx -o z -O2 a.f90 && ./z
> Time = 0.3299561 seconds. – First Value = 5025.4902
> % nm z | grep matmul
> % gfcx -o z -O2 -fexternal-blas a.f90 && ./z
> Time = 0.3295580 seconds. – First Value = 5022.8291
>
> This last one is definitely broken.  I did not link with
> an external BLAS library.  Please fix before 11.1 is
> released.

Since the libgfortran MATMUL should be vectorized I think it's not
reasonable to inline any but _very_ small MATMUL at optimization levels
that do not enable vectorization.

Richard.

>
> --
> Steve