https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565

            Bug ID: 106565
           Summary: Using a transposed matrix in matmul (GCC-10.3.0) is
                    very slow
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: quanhua.liu at noaa dot gov
  Target Milestone: ---

gcc version 10.3.0 (GCC)
linux
Variant (2)
           BB = transpose(B)
           C = matmul(A, BB)
is 5 times faster than variant (1)
           C = matmul(A, transpose(B))

ifort 19 doesn't have the problem.


      PROGRAM test_matrixCal
! ------------------------------------------------------
! This code tests
!  (1)   C = matmul(A, transpose(B))        
!  against 
!  (2)   BB = transpose(B)
!        C = matmul(A, BB)
!  (2) is 5 times faster than (1)
!   gfortran -O3 test_matrixCal
!   time ./a.out
! ------------------------------------------------------
      INTEGER, PARAMETER :: m = 200, n = 300, nn = 150
      REAL :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
      INTEGER :: i, j, k, L
      A(:,:) = 3.0
      B(:,:) = 1.7

      iterative_loop: DO L = 1, 1000
         A(:,10) = A(:,10) + 0.0001*L
!         C = matmul(A, transpose(B))
         BB = transpose(B)
         C = matmul(A, BB)
         IF (mod(L,50) == 0) print *, L, C(10,20)
      END DO iterative_loop
      STOP
      END PROGRAM test_matrixCal
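
The test above needs an edit and a recompile to switch between the two variants. A minimal sketch that times both in one run might look like the following. It is written in free form (save it with a .f90 extension), and the program name, the niter constant, and the check accumulator are illustrative additions that are not part of the original test. Timings will vary with compiler version, flags, and machine.

```fortran
program time_matmul_transpose
   ! Sketch only: times variant (1) and variant (2) back to back.
   ! Array shapes and the update of A follow the original test.
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
   integer, parameter :: m = 200, n = 300, nn = 150
   integer, parameter :: niter = 1000          ! illustrative name
   real :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
   real :: check                               ! keeps the result live
   integer :: L
   integer(int64) :: t0, t1, rate

   A = 3.0
   B = 1.7

   ! Variant (1): transpose passed directly to matmul
   check = 0.0
   call system_clock(t0, rate)
   do L = 1, niter
      A(:,10) = A(:,10) + 0.0001*L
      C = matmul(A, transpose(B))
      check = check + C(10,20)
   end do
   call system_clock(t1)
   print *, 'variant (1) seconds:', real(t1 - t0) / real(rate), check

   ! Variant (2): explicit temporary for the transpose
   A = 3.0
   check = 0.0
   call system_clock(t0)
   do L = 1, niter
      A(:,10) = A(:,10) + 0.0001*L
      BB = transpose(B)
      C = matmul(A, BB)
      check = check + C(10,20)
   end do
   call system_clock(t1)
   print *, 'variant (2) seconds:', real(t1 - t0) / real(rate), check
end program time_matmul_transpose
```

Printing check alongside each timing keeps the compiler from discarding the loop bodies and gives a quick sanity check that both variants compute the same values.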
