https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66094

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Blocks|                            |37131

--- Comment #1 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
It would be nice to handle

  c = matmul(transpose(a),b)

This can be changed to

  do i=1,m
     c_t = 0
     do k=1, count
        do j=1,n
           c_t(j) = c_t(j) + a(k,i) * b(k,j)
        end do
     end do
     c(i,:) = c_t
  end do

with a vector temporary c_t

and

  c = matmul(a,transpose(b))

changed to

  c = 0
  do k=1, count
     do j=1,n
        do i=1,m
           c(i,j) = c(i,j) + a(i,k)*b(j,k)
        end do
     end do
  end do

(without a vector temporary).
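
For reference, here is a minimal sketch of both loop nests as standalone
subroutines. It assumes a is (count,m) and b is (count,n) in the first case,
and a is (m,count) and b is (n,count) in the second. The module and subroutine
names are made up for illustration and are not part of gfortran. The indices
are written so that each result agrees with the matmul intrinsic.

```fortran
module matmul_transpose_sketch
  implicit none
contains

  ! c = matmul(transpose(a), b), using a vector temporary c_t.
  ! Assumed shapes: a(count,m), b(count,n), c(m,n).
  subroutine matmul_ta_b(a, b, c)
    real, intent(in)  :: a(:,:), b(:,:)
    real, intent(out) :: c(:,:)
    real :: c_t(size(b, 2))        ! one row of the result
    integer :: i, j, k, m, n, cnt  ! cnt plays the role of "count"

    cnt = size(a, 1)
    m = size(a, 2)
    n = size(b, 2)
    do i = 1, m
       c_t = 0
       do k = 1, cnt
          do j = 1, n
             c_t(j) = c_t(j) + a(k,i) * b(k,j)
          end do
       end do
       c(i,:) = c_t
    end do
  end subroutine matmul_ta_b

  ! c = matmul(a, transpose(b)), accumulating directly into c.
  ! Assumed shapes: a(m,count), b(n,count), c(m,n).
  subroutine matmul_a_tb(a, b, c)
    real, intent(in)  :: a(:,:), b(:,:)
    real, intent(out) :: c(:,:)
    integer :: i, j, k, m, n, cnt

    m = size(a, 1)
    cnt = size(a, 2)
    n = size(b, 1)
    c = 0
    do k = 1, cnt
       do j = 1, n
          do i = 1, m
             c(i,j) = c(i,j) + a(i,k) * b(j,k)
          end do
       end do
    end do
  end subroutine matmul_a_tb

end module matmul_transpose_sketch
```

Calling either subroutine on small random arrays and comparing the result
with matmul(transpose(a),b) or matmul(a,transpose(b)) is a quick way to
check the index order.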


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37131
[Bug 37131] inline matmul for small matrix sizes
