https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #3 from kargl at gcc dot gnu.org ---

>       INTEGER, PARAMETER :: m = 200, n = 300, nn = 150
>       REAL :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
>       INTEGER :: i, j, k, L


If you are doing a problem of this size or larger, you want to use the
-fexternal-blas option and link in OpenBLAS.

I added timing code and replicated the loop to do both in one go.

% gfcx -o z -O3 -march=native a.f90 && ./z
   1.16500998       1615.08594    
   5.32258606       1615.08020    
% gfcx -o z -O3 -march=native a.f90 -fexternal-blas -lopenblas && ./z
   2.44668889       1615.08301    
   1.99379802       1615.08301
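
For anyone who wants to repeat the comparison, here is a minimal sketch of
what such a timing harness might look like. It is not the a.f90 used above.
The repeat count nrep, the choice of the two MATMUL variants, and the use of
C(1,1) as a check value are all assumptions made for illustration. Only the
array shapes come from the quoted declarations.

```fortran
program time_matmul
  implicit none
  integer, parameter :: m = 200, n = 300, nn = 150
  integer, parameter :: nrep = 100        ! hypothetical repeat count
  real :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
  real :: t0, t1
  integer :: L

  call random_number(A)
  call random_number(B)
  BB = transpose(B)                       ! pre-transposed copy of B

  ! Variant 1 (assumed): multiply by the pre-transposed copy.
  call cpu_time(t0)
  do L = 1, nrep
     C = matmul(A, BB)
  end do
  call cpu_time(t1)
  print *, t1 - t0, C(1,1)                ! elapsed CPU time, check value

  ! Variant 2 (assumed): transpose inside the MATMUL call.
  call cpu_time(t0)
  do L = 1, nrep
     C = matmul(A, transpose(B))
  end do
  call cpu_time(t1)
  print *, t1 - t0, C(1,1)
end program time_matmul
```

Built with -fexternal-blas, gfortran hands MATMUL to BLAS only when the
arrays exceed the size set by -fblas-matmul-limit, so the smallest cases
still go through the library's own routines. Note that cpu_time reports
processor time, which can differ from wall-clock time if the BLAS library
runs multithreaded.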
