Have you thought about trying out MPI_Scatter/MPI_Gather and at least seeing 
how efficient the internal algorithms are?

If you are always going to be running on the same platform and want to 
tune-n-tweak for that, then have at it.  If you are going to run this code on 
different platforms w/ different network architectures then I would be 
concerned about the performance "portability".  In other words, a solution 
that ran well on one cluster may not run well on another, due to a number of 
factors.

Good luck,

-bill



From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Toon Knapen
Sent: Monday, January 31, 2011 5:05 AM
To: Open MPI Users
Subject: Re: [OMPI users] maximising bandwidth


> So when you say you want your master to send "as fast as possible", I suppose
> you meant get back to running your code as soon as possible.  In that case you
> would want nonblocking.  However when you say you want the slaves to receive
> data faster, it seems you're implying the actual data transmission across the
> network.  I believe the data transmission speed is not dependent on whether
> the send is blocking or nonblocking.

Sorry, I did not express myself clearly. With 'as fast as possible' I meant 
that I want to have all data available in my slave nodes ASAP. The master has 
nothing to do but send, so I do not care whether the sends are blocking or 
non-blocking. Actually the master will use separate threads for the sending 
anyway, so either I launch one thread per blocking send or just one thread 
that does all the sending using nonblocking sends.

I do think there is plenty of reason for a difference (in the time at which 
the slaves receive the data). If OpenMPI is not able to offload the sending 
to some dedicated card (which is probably my situation, since I'm on a stock 
Linux with stock Ethernet cards) and OpenMPI tries to send the data from 
multiple nonblocking sends simultaneously, then OpenMPI itself probably needs 
to multi-thread the sending of each message.

Well, I do not know anything about the internals of OpenMPI, so I really have 
no clue how OpenMPI would do this or how it would try to optimise the use of 
bandwidth on the network.

toon

