What is really difficult with MPI is data distribution. A lot of
applications are parallelised using replicated data. That's fine if you
are CPU bound, but if you are limited by the amount of memory per
processor, a shared memory approach (which is the default with OpenMP)
is the easiest way of using all the memory.
MPI may also add a lot of overhead if you parallelise inner loops, which
is easy and cheap with OpenMP. OTOH, coarse-grain parallelism with
OpenMP is difficult; MPI is usually more suitable here. It depends on
your application, and you may find candidates for both approaches in the
same application.
Herbert
[EMAIL PROTECTED] wrote:
Date: Tue, 02 Oct 2007 14:52:21 -0400
From: Larry Stewart <[EMAIL PROTECTED]>
Subject: Re: [Beowulf] Naive question: mpi-parallel program in multicore CPUs
The question of OpenMP vs MPI has been around for a long time,
for example:
http://www.beowulf.org/archive/2001-March/002718.html
My general impression is that it is a waste of time to convert from
pure MPI to a hybrid approach. For example:
www.sc2000.org/techpapr/papers/pap.pap214.pdf
On the other hand, here's a fellow who got a 4X speedup by going to hybrid:
www.nersc.gov/nusers/services/training/classes/NUG/Jun04/NUG2004_yhe_hybrid.ppt
My own view is that on a modern cluster, with fast processors and
inter-node communication not much slower than a cache miss to main
memory, the unified MPI model makes more sense, but there are many,
many papers arguing about this topic.
-L
End of Beowulf Digest, Vol 44, Issue 4
**************************************
--
Herbert Fruchtl
EaStCHEM Fellow
School of Chemistry
University of St Andrews
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf