I'm trying with 4 procs and 300x300 dense matrices. I'm not used to git, so 
I'm posting the code here:

function cannon_par(a,b) # for square matrices, nworkers() must be set

    s = size(a,1)
    nblocks = nworkers()          # number of procs
    size_B = int(sqrt(nblocks))   # size of a,b w.r.t. the blocks

    # isinteger(size_B) is always true after int(), so compare the square instead
    if size_B^2 != nblocks
        error("nworkers() must be a perfect square")
    end

    if s % size_B != 0
        error("Argument matrices can not be divided equally among nworkers()")
    end

    bs = div(s,size_B)            # block size = bs x bs, kept integer for indexing

    # typed arrays: cell(s,s) would give Array{Any}, which rules out BLAS multiplication
    A = similar(a)
    B = similar(b)
    C = zeros(s,s)

    I(i) = (i-1)*bs+1:i*bs        # block indexing, block A_ij = A[I(i),I(j)]

    #### initial shifting ####

    for i = 1:size_B
        # shift the ith block row of a left by i-1 blocks
        A[I(i),:] = circshift(a[I(i),:],[0 bs*(1-i)])

        # shift the ith block column of b up by i-1 blocks
        B[:,I(i)] = circshift(b[:,I(i)], bs*(1-i))
    end

    #### A and B are distributed ####

    dA = distribute(A)
    dB = distribute(B)

    #### Cannon iterations ####

    for k = 1:size_B

        #tic()
        C_local = pmap(fetch, {@spawnat p localpart(dA)*localpart(dB) for p in procs(dA)})
        #toc()

        # accumulate the local products into C (procs are ordered column-major)
        #tic()
        for i = 1:size_B
            for j = 1:size_B
                C[I(i),I(j)] += C_local[(j-1)*size_B+i]
            end
        end
        #toc()

        if k < size_B
            #tic()
            A = circshift(A,[0 -bs])   # shifted
            B = circshift(B,-bs)

            dA = distribute(A)         # and distributed again
            dB = distribute(B)
            #toc()
        end

    end

    C

end
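
To check the block indexing and shifts separately from the distribution, 
the same scheme can be run serially and compared against a*b. This is only a 
minimal sketch, assuming current Julia syntax (tuple shifts for circshift) 
rather than the version used above. The names cannon_serial and blk and the 
grid-size argument q are hypothetical, introduced just for illustration.

```julia
# Serial sketch of Cannon's algorithm on a q x q grid of blocks.
# `cannon_serial`, `blk` and `q` are illustrative names, not part of any library.
function cannon_serial(a::AbstractMatrix, b::AbstractMatrix, q::Integer)
    s = size(a, 1)
    size(a) == size(b) == (s, s) || error("a and b must be square and of equal size")
    s % q == 0 || error("matrix size must be divisible by q")
    bs = div(s, q)                          # block size = bs x bs
    blk(i) = ((i - 1) * bs + 1):(i * bs)    # index range of the i-th block

    # Initial skew: block row i of a moves left by i-1 blocks,
    # block column i of b moves up by i-1 blocks.
    A = copy(a)
    B = copy(b)
    for i in 1:q
        A[blk(i), :] = circshift(a[blk(i), :], (0, bs * (1 - i)))
        B[:, blk(i)] = circshift(b[:, blk(i)], (bs * (1 - i), 0))
    end

    C = zeros(promote_type(eltype(a), eltype(b)), s, s)
    for k in 1:q
        # Each grid position multiplies the two blocks it currently holds.
        for i in 1:q, j in 1:q
            C[blk(i), blk(j)] += A[blk(i), blk(j)] * B[blk(i), blk(j)]
        end
        # Shift A one block to the left and B one block up for the next step.
        A = circshift(A, (0, -bs))
        B = circshift(B, (-bs, 0))
    end
    return C
end
```

With 4 workers the grid is 2 x 2, so comparing cannon_serial(a, b, 2) with 
a*b on the 300x300 inputs checks the indexing before any data is moved 
between procs.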



On Sunday, June 22, 2014 16:54:01 UTC+2, Viral Shah wrote:
>
> The communication is probably happening in other parts of the code. How 
> large a problem are you trying? Can you post the full code in a gist or a 
> git repository? I will try it out. This is a good example to have in our 
> manual as well, and I just haven't got around to it.
>
> -viral
>
> On Sunday, June 22, 2014 4:53:02 PM UTC+5:30, Pietro Benedusi wrote:
>>
>> Yes, I'm using the function distribute(). This is the hotspot of my code 
>> (C = A*B)
>>
>>                 C_local = pmap(fetch, {@spawnat p localpart(dA)*localpart(dB) for p in procs(dA)})
>>
>>
>> Is this the right way to proceed? Done this way, the multiplication is 
>> very slow (I'm using 4 workers).
>>
>> Many thanks for helping.
>>
>>
>>
>> On Sunday, June 22, 2014 07:14:52 UTC+2, Viral Shah wrote:
>>>
>>> Are you using DArrays? You should be able to move data with indexing. 
>>> For the Cannon algorithm, you should be able to organize your communication 
>>> so that each processor moves the data it needs - IIRC.
>>>
>>> -viral
>>>
>>> On Saturday, June 21, 2014 11:08:06 PM UTC+5:30, Pietro Benedusi wrote:
>>>>
>>>> Hello,
>>>>
>>>> I need to write a distributed Cannon algorithm for matrix 
>>>> multiplication. 
>>>> In every iteration I have to shift all the blocks of the matrices 
>>>> involved or, equivalently, move blocks between remote procs. How can I 
>>>> move blocks from one remote proc to another?
>>>>
>>>> Thanks
>>>>
>>>
