https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106565

            Bug ID: 106565
           Summary: Using a transposed matrix in matmul (GCC-10.3.0) is very
                    slow
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: quanhua.liu at noaa dot gov
  Target Milestone: ---

gcc version 10.3.0 (GCC)
linux

Using
(2) BB = transpose(B)
    C = matmul(A, BB)
is 5 times faster than using
(1) C = matmul(A, transpose(B))

ifort 19 doesn't have the problem.

PROGRAM test_matrixCal
! ------------------------------------------------------
!  This code tests
!  (1) C = matmul(A, transpose(B))
!  against
!  (2) BB = transpose(B)
!      C = matmul(A, BB)
!  (2) is 5 times faster than (1)
!  gfortran -O3 test_matrixCal
!  time a.out
! ------------------------------------------------------
  INTEGER, PARAMETER :: m = 200, n = 300, nn = 150
  REAL :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
  INTEGER :: i, j, k, L

  A(:,:) = 3.0
  B(:,:) = 1.7

  iterative_loop: DO L = 1, 1000
    A(:,10) = A(:,10) + 0.0001*L
!   C = matmul(A, transpose(B))
    BB = transpose(B)
    C = matmul(A, BB)
    IF(mod(L,50) == 0) print *,L, C(10,20)
  END DO iterative_loop

  STOP
END PROGRAM test_matrixCal
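
For convenience, the two variants can also be timed in a single run instead of editing the comment and rebuilding. The following is only a sketch and not part of the original test case: the program name time_matmul_transpose, the parameter niter, and the timing variables are made up for illustration, and it assumes SYSTEM_CLOCK has enough resolution on the target system. The array shapes and the loop body are the same as in the program above.

```fortran
PROGRAM time_matmul_transpose
  ! Illustrative timing harness (hypothetical, not from the original report).
  ! Times variant (1) and variant (2) back to back with SYSTEM_CLOCK.
  USE, INTRINSIC :: iso_fortran_env, ONLY: int64
  IMPLICIT NONE
  INTEGER, PARAMETER :: m = 200, n = 300, nn = 150, niter = 1000
  REAL :: A(m,n), B(nn,n), C(m,nn), BB(n,nn)
  INTEGER :: L
  INTEGER(int64) :: t0, t1, t2, rate

  B(:,:) = 1.7

  ! Variant (1): transpose(B) passed directly to matmul
  A(:,:) = 3.0
  CALL SYSTEM_CLOCK(count=t0, count_rate=rate)
  DO L = 1, niter
    A(:,10) = A(:,10) + 0.0001*L
    C = matmul(A, transpose(B))
  END DO
  CALL SYSTEM_CLOCK(count=t1)
  ! Print one element so the result is actually used
  PRINT *, 'variant (1) C(10,20) =', C(10,20)

  ! Variant (2): explicit temporary BB, same starting A as variant (1)
  A(:,:) = 3.0
  CALL SYSTEM_CLOCK(count=t1)
  DO L = 1, niter
    A(:,10) = A(:,10) + 0.0001*L
    BB = transpose(B)
    C = matmul(A, BB)
  END DO
  CALL SYSTEM_CLOCK(count=t2)
  PRINT *, 'variant (2) C(10,20) =', C(10,20)

  ! t0 is overwritten implicitly below only through the differences;
  ! elapsed time for (1) was captured before t1 was reset, so recompute
  ! nothing here and just report the second interval plus a rerun hint.
  PRINT *, 'variant (2) seconds:', REAL(t2 - t1) / REAL(rate)
END PROGRAM time_matmul_transpose
```

As written, this sketch reports the elapsed time only for variant (2); to get the number for variant (1) as well, store t1 - t0 in a separate variable right after the first loop, before t1 is reused. Building with the same flags as in the report (gfortran -O3) keeps the comparison consistent with the numbers quoted above.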