(I apologize in advance for the simplistic/newbie question.) I'm performing an ALLREDUCE operation on a multi-dimensional array. This operation is the biggest bottleneck in the code, and I'm wondering if there's a way to do it more efficiently than what I'm doing now. Here's a representative example of what's happening:

    ir=1
    do ikl=1,km
      do ij=1,jm
        do ii=1,im
          albuf(ir)=array(ii,ij,ikl,nl,0,ng)
          ir=ir+1
        enddo
      enddo
    enddo
    agbuf=0.0
    call mpi_allreduce(albuf,agbuf,im*jm*kmloc(coords(2)+1),mpi_real,mpi_sum,ang_com,ierr)
    ir=1
    do ikl=1,km
      do ij=1,jm
        do ii=1,im
          phim(ii,ij,ikl,nl,0,ng)=agbuf(ir)
          ir=ir+1
        enddo
      enddo
    enddo

Is there any way to just do this in one fell swoop, rather than buffering, transmitting, and unbuffering? This operation is looped over many times. Are there savings to be had here?

Thanks,
Greg