------- Comment #7 from dominiq at lps dot ens dot fr  2010-08-02 13:39 -------
> I fail to see why a scalar temporary is not enough:
>
> do j = 1, N
>   do i = 1, N
>     tmp = 0.0
>     do k = 1, N
 >      tmp = a(i,k)*b(k,j)
>     end do
>     a(i,j) = tmp
>   end do
> end do
>
> Note: it won't work with a scalar if one reverses the i and the j loop.

Well, try:


real :: a(3,3), b(3,3), c(3,3), tmp
integer :: i, j, k, N=3
a = reshape([(i,i=1,9)],[3,3])
b = reshape([(10-i,i=1,9)],[3,3])
c = matmul(a, b)
do j = 1, N
  do i = 1, N
    tmp = 0.0
    do k = 1, N
      tmp = tmp + a(i,k)*b(k,j)
    end do
    a(i,j) = tmp
  end do
end do
print *, a
print *, c
if(any(a/=c)) call abort()
print *, 'OK'
end

The problem is that in order to compute a(1,2) you need a(1,1)*b(1,2), so you
cannot have already written a(1,1). Note that the effective way to compute the
matrix product on modern CPUs is to exchange the i and k loops and use a rank 1
tmp:

real :: a(3,3), b(3,3), c(3,3), tmp(3)
integer :: i, j, k, N=3
a = reshape([(i,i=1,9)],[3,3])
b = reshape([(10-i,i=1,9)],[3,3])
c = matmul(a, b)
do j = 1, N
  tmp = 0.0
  do k = 1, N
    do i = 1, N
      tmp(i) = tmp(i) + a(i,k)*b(k,j)
    end do
  end do
  b(:,j) = tmp
end do
if(any(b/=c)) call abort()
print *, 'OK'
end

and this work with a rank 1 temporary.

If one want to enter this kind of reduced temporaries, another candidate is
a=transpose(a) which requires a scalar temporary only.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45159

Reply via email to