What is really difficult with MPI is data distribution. A lot of applications are parallelised using replicated data. That's fine if you are CPU bound, but if you are limited by the amount of memory per processor, a shared-memory approach (which is the default with OpenMP) is the easiest way of using all the memory.

MPI may also add a lot of overhead if you parallelise inner loops, which is easy and cheap with OpenMP. OTOH, coarse-grain parallelism with OpenMP is difficult; MPI is usually more suitable here. It depends on your application, and you may find candidates for both approaches in the same application.

  Herbert

[EMAIL PROTECTED] wrote:


Date: Tue, 02 Oct 2007 14:52:21 -0400
From: Larry Stewart <[EMAIL PROTECTED]>
Subject: Re: [Beowulf] Naive question: mpi-parallel program in
        multicore CPUs
To: [EMAIL PROTECTED], Bo <[EMAIL PROTECTED]>
Cc: [email protected], Kwan Wing Keung <[EMAIL PROTECTED]>

The question of OpenMP vs MPI has been around for a long time,
for example:

http://www.beowulf.org/archive/2001-March/002718.html

My general impression is that it is a waste of time to convert from pure
MPI to a hybrid approach. For example:

www.sc2000.org/techpapr/papers/pap.pap214.pdf

On the other hand, here's a fellow who got a 4X speedup by going to hybrid:

www.nersc.gov/nusers/services/training/classes/NUG/Jun04/NUG2004_yhe_hybrid.ppt

My own view is that on a modern cluster, with fast processors and
inter-node communication not that much slower than a cache miss to main
memory, the unified MPI model makes more sense; but there are many, many
papers arguing about this topic.

-L


------------------------------

_______________________________________________
Beowulf mailing list
[email protected]
http://www.beowulf.org/mailman/listinfo/beowulf


End of Beowulf Digest, Vol 44, Issue 4
**************************************

--
Herbert Fruchtl
EaStCHEM Fellow
School of Chemistry
University of St Andrews
