Hi,

On 11/8/06, Greg Lindahl <greg.lind...@qlogic.com> wrote:
On Tue, Nov 07, 2006 at 05:02:54PM +0000, Miguel Figueiredo Mascarenhas Sousa 
Filipe wrote:

> if your application is on one given node, sharing data is better than
> copying data.

Unless sharing data repeatedly leads you to false sharing and a loss
in performance.


What does that mean? I didn't understand that.


> the MPI model assumes you don't have a "shared memory" system..
> therefore it is "message passing" oriented, and not designed to
> perform optimally on shared memory systems (like SMPs, or numa-CCs).

For many programs with both MPI and shared memory implementations, the
MPI version runs faster on SMPs and numa-CCs. Why? See the previous
paragraph...

Of course it does... it's faster to copy data in main memory than it is
to do it through any kind of network interface. You can optimize your
message passing implementation down to a couple of memory-to-memory
copies when ranks are on the same node. In the worst case, even when
using local IP addresses to communicate between peers/ranks on the same
node, the operating system doesn't even touch the interface: it just
copies data from a TCP sender buffer to a TCP receiver buffer. In the
end, that's always faster than going through a physical network link.
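
To make that concrete, here is a minimal sketch of the kind of rank-to-rank
transfer I mean (standard MPI calls only; the buffer size and contents are
just made up for illustration). When both ranks sit on the same node, a good
MPI implementation turns this into little more than a memory-to-memory copy:

/* minimal sketch: rank 0 sends a buffer to rank 1; on the same node
 * a decent MPI implementation reduces this to a shared-memory copy */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[1024] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 got the buffer\n");
    }

    MPI_Finalize();
    return 0;
}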



But you still have a message passing API that is doing memory-to-memory
copies; it's a worse framework for doing memory copies than an API
designed just for that.
One could argue that MPI is more than a message passing API, since it
also provides APIs to apply operators to the data, as in the sketch below.
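
For example (a trivial sketch, standard MPI calls, the values are just
illustrative): MPI_Reduce applies an operator such as MPI_SUM across the
ranks' data, which is already more than plain message passing.

/* trivial sketch: each rank contributes a partial value and MPI applies
 * the MPI_SUM operator across ranks, leaving the result on rank 0 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, total = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int partial = rank + 1;   /* each rank's local value */
    MPI_Reduce(&partial, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over all ranks = %d\n", total);

    MPI_Finalize();
    return 0;
}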


But, for instance, try to benchmark real applications with both MPI and
POSIX threads implementations on the same numa-cc or big SMP machine:
my bet is that the POSIX threads implementation is going to be faster.
There are always exceptions, like having a very well designed MPI
application but a terrible POSIX threads one, or a design that's just
not well suited to a POSIX threads programming model (or to an MPI
model).
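
Just to be clear about what I mean by the threads side of such a
benchmark, here is a minimal POSIX threads sketch (the array size, thread
count and the scale_slice name are all made up for illustration): the
workers operate on one shared array in place, with no copying at all.

/* minimal sketch: threads scale disjoint slices of one shared array
 * in place -- no message passing, no copies between workers */
#include <pthread.h>
#include <stdio.h>

#define N        (1 << 20)
#define NTHREADS 4

static double data[N];            /* shared by all threads */

static void *scale_slice(void *arg)
{
    long id = (long)arg;
    long chunk = N / NTHREADS;
    for (long i = id * chunk; i < (id + 1) * chunk; i++)
        data[i] *= 2.0;
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];

    for (long i = 0; i < N; i++)
        data[i] = 1.0;

    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&threads[t], NULL, scale_slice, (void *)t);
    for (long t = 0; t < NTHREADS; t++)
        pthread_join(threads[t], NULL);

    printf("data[0] = %f\n", data[0]);
    return 0;
}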


--
Miguel Sousa Filipe
