I'm trying 4 procs and 300x300 dense matrices. I'm not used to git, so I put the code here:

function cannon_par(a,b) # for square matrices, nworkers() must be set

    s = size(a,1)
    nblocks = nworkers()            # number of procs
    size_B = int(sqrt(nblocks))     # size of a,b w.r.t. the blocks

    if ~ isinteger(size_B)
        error("nworkers() must be a perfect square")
    end

    bs = s/size_B                   # block size = bs x bs

    if ~ isinteger(bs)
        error("Argument matrices can not be divided equally among nworkers()")
    end

    A = cell(s,s)
    B = cell(s,s)
    C = zeros(s,s)

    I(i) = (i-1)*bs+1:i*bs  # function for block indexing, block A_ij = A[I(i),I(j)]

    #### initial shifting ####
    for i = 1:size_B
        # shift the ith block of a by i-1 horizontally
        A[I(i),:] = circshift(a[I(i),:],[0 bs*(1-i)])
        # shift the ith block of b by i-1 vertically
        B[:,I(i)] = circshift(b[:,I(i)], bs*(1-i))
    end

    #### A and B are distributed ####
    dA = distribute(A)
    dB = distribute(B)

    #### Cannon iterations ####
    for k = 1:size_B

        #tic()
        C_local = pmap(fetch, {@spawnat p localpart(dA)*localpart(dB) for p in procs(dA)})
        #toc()

        #tic()
        for i = 1:size_B
            for j = 1:size_B
                C[I(i),I(j)] += C_local[(j-1)*size_B+i]
            end
        end
        #toc()

        if k < size_B
            #tic()
            A = circshift(A,[0 -bs]);   # shifted
            B = circshift(B,-bs);
            dA = distribute(A);         # and distributed again
            dB = distribute(B);
            #toc()
        end
    end

    C
end

On Sunday, June 22, 2014 at 16:54:01 UTC+2, Viral Shah wrote:
>
> The communication is probably happening in other parts of the code. How
> large a problem are you trying? Can you post the full code in a gist or a
> git repository? I will try it out. This is a good example to have in our
> manual as well, and I just haven't got around to it.
>
> -viral
>
> On Sunday, June 22, 2014 4:53:02 PM UTC+5:30, Pietro Benedusi wrote:
>>
>> Yes, I'm using the function distribute(). This is the hotspot of my code
>> (C = A*B):
>>
>> C_local = pmap(fetch, {@spawnat p localpart(dA)*localpart(dB) for p in procs(dA)})
>>
>> Is this the right way to proceed? Done this way, the multiplication is very
>> slow (I'm using 4 workers).
>>
>> Many thanks for helping.
>>
>> On Sunday, June 22, 2014 at 07:14:52 UTC+2, Viral Shah wrote:
>>>
>>> Are you using DArrays? You should be able to move data with indexing.
>>> For the Cannon algorithm, you should be able to organize your communication
>>> so that each processor moves the data it needs - IIRC.
>>>
>>> -viral
>>>
>>> On Saturday, June 21, 2014 11:08:06 PM UTC+5:30, Pietro Benedusi wrote:
>>>>
>>>> Hello,
>>>>
>>>> I need to write a distributed Cannon algorithm for matrix
>>>> multiplication.
>>>> In every iteration I have to shift all the blocks of the involved
>>>> matrices or, equivalently, move blocks between remote procs. How can I
>>>> move a block from one remote proc to another?
>>>>
>>>> Thanks
>>>>
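
For reference, the block movement the thread is discussing can be sketched serially, without any workers. This is only an illustration of the shifting pattern, not the code from the posts above: it assumes Julia 1.x syntax (the thread uses 2014-era syntax), and the names cannon_serial, blk and q are hypothetical. Each entry of the q x q grids A and B stands for the block one worker would hold; each circshift stands for every worker passing its A block to its left neighbour and its B block to the neighbour above.

```julia
# Serial sketch of Cannon's algorithm on a q x q grid of blocks.
# Assumes Julia 1.x; cannon_serial, blk and q are illustrative names only.
function cannon_serial(a::AbstractMatrix, b::AbstractMatrix, q::Integer)
    s = size(a, 1)
    (size(a) == (s, s) && size(b) == (s, s)) ||
        error("a and b must be square and of equal size")
    s % q == 0 || error("matrix size must be divisible by q")
    bs = s ÷ q                                   # block size is bs x bs
    blk(i) = ((i - 1) * bs + 1):(i * bs)         # index range of the i-th block

    # Split both matrices into q x q grids of blocks.
    A = [a[blk(i), blk(j)] for i in 1:q, j in 1:q]
    B = [b[blk(i), blk(j)] for i in 1:q, j in 1:q]
    c = zeros(promote_type(eltype(a), eltype(b)), s, s)

    # Initial skew: block row i of A moves left by i-1 positions,
    # block column j of B moves up by j-1 positions.
    for i in 1:q
        A[i, :] = circshift(A[i, :], -(i - 1))
    end
    for j in 1:q
        B[:, j] = circshift(B[:, j], -(j - 1))
    end

    for k in 1:q
        # Local step: grid position (i, j) multiplies the two blocks it holds.
        for i in 1:q, j in 1:q
            c[blk(i), blk(j)] += A[i, j] * B[i, j]
        end
        # Communication step: A blocks move one position left, B blocks one up.
        A = circshift(A, (0, -1))
        B = circshift(B, (-1, 0))
    end
    return c
end

# Usage: compare against the built-in product on a small example.
a = reshape(collect(1.0:36.0), 6, 6)
b = reshape(collect(36.0:-1.0:1.0), 6, 6)
@assert cannon_serial(a, b, 3) ≈ a * b
```

In a distributed version only the two circshift calls inside the loop would involve communication, and each worker would exchange a single block with its neighbours rather than the whole matrix being shifted and redistributed.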