Look more carefully at the pattern of your code.

It seems that you can reshape v1 into a matrix of size (m, n) 
and v0 into a matrix of size (n, m), and then do a single matrix-matrix 
multiplication instead of looping over multiple strided vectors.
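
For instance, something like this (a rough sketch, untested; it assumes 
nall == m*n so the reshapes line up, and that stemp is n-by-n):

V0 = reshape(v0, n, m)        # column i is v0[(n*(i-1)) + (1:n)]
v1[:] = vec((stemp * V0)')    # one gemm call; vec of the transpose puts
                              # (stemp * column i)[j] at v1[m*(j-1) + i],
                              # matching the strided writes in your loop

The transpose makes a copy, but you trade m small mat-vec products for a 
single BLAS mat-mat multiplication, which is usually a big win.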

Also, if you are using the ArrayViews package, you can write 
view(v0, n*(i-1)+1:n*i) instead of v0[(n*(i-1)) + (1:n)], which avoids 
copying each block out of v0.
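
That is (again just a sketch, not tested), the inner loop would become:

using ArrayViews
for i in 1:m
    v1[m*(0:(n-1)) + i] = stemp * view(v0, n*(i-1)+1:n*i)
end

Note the strided left-hand side still allocates an index vector on each 
iteration, so the reshape approach above should be the bigger win.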

Dahua


On Tuesday, July 29, 2014 4:22:32 PM UTC-5, Florian Oswald wrote:
>
> Hi all,
>
> I've got an algorithm that hinges critically on fast matrix 
> multiplication. I put the function up in this gist:
>
> https://gist.github.com/floswald/6dea493417912536688d#file-tensor-jl-L45
>
> indicating the line (45) that takes most of the time, as you can see in 
> the profile output that is there as well. I am trying to figure out if I'm 
> doing something wrong here or if that line just takes as long as it takes. 
> I have to do this many times, so if this takes too long I have to change my 
> strategy.
>
> The core of the problem looks like this:
>
> for imat in 2:nbm
>     v0    = copy(v1)
>     stemp = ibm[ks[imat]]
>     n     = size(stemp,1)
>     m     = div(nall, n)   # integer division; nall / n would give a Float64
>     for i in 1:m
>         v1[m*(0:(n-1)) + i] = stemp * v0[(n*(i-1)) + (1:n)]
>     end
> end
>
>
> where v0 and v1 are vectors and stemp is a matrix. I spend a lot of time 
> in the matrix multiplication line in the innermost loop. Any suggestions 
> would be much appreciated. Thanks!
