There is a project called "MVAPICH2-GPU", which is developed by D. K.
Panda's research group at Ohio State University. You will find lots of
references on Google... and I have just briefly gone through the slides of
"MVAPICH2-GPU: Optimized GPU to GPU Communication for InfiniBand
Clusters":
http://no
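For context, this is roughly the usage pattern a CUDA-aware MPI such as
MVAPICH2-GPU enables: handing a device pointer straight to MPI instead of
staging through host memory first. A minimal sketch, assuming a CUDA-aware
build and two ranks with one GPU each; the buffer size and names are
illustrative, not taken from the slides:

    /* Sketch: with a CUDA-aware MPI, a cudaMalloc'd device pointer can
       be passed directly to MPI_Send/MPI_Recv; the library handles the
       GPU-to-GPU transfer internally. Assumes 2 ranks, one GPU each. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        int rank;
        double *d_buf;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        cudaMalloc((void **)&d_buf, 1024 * sizeof(double));

        if (rank == 0)
            MPI_Send(d_buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(d_buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);

        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }

Without CUDA-aware support, each MPI call would need an explicit
cudaMemcpy to and from a host buffer around it.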
On Dec 12, 2011, at 8:42 AM, amjad ali wrote:
Thank you all very much for the replies.

I would request some references for what Tim Prince & Andreas have
said.

Tim said that Open MPI has had effective shared memory message passing. Is
that anything to do with the --enable-mpi-threads switch while installing
Open MPI?

regards,
AA
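As far as I know, Open MPI's shared memory message passing (the sm BTL) is
used automatically between ranks on the same node, and is separate from
--enable-mpi-threads, which governs the thread-support level the build can
grant. A program can check that level at run time; a minimal sketch:

    /* Sketch: ask for full thread support and see what the library
       provides; with Open MPI this depends on how it was configured
       (e.g. --enable-mpi-threads). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        printf("provided thread level: %d (MPI_THREAD_MULTIPLE = %d)\n",
               provided, MPI_THREAD_MULTIPLE);
        MPI_Finalize();
        return 0;
    }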
On 12/11/2011 12:16 PM, Andreas Schäfer wrote:
Hey,
on an SMP box threaded codes CAN always be faster than their MPI
equivalents. One reason why MPI sometimes turns out to be faster is
that with MPI every process actually initializes its own
data. Therefore it'll end up in the NUMA domain to which the core
running that process belongs.
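What Andreas describes is the usual first-touch placement policy: a page
lands in the NUMA domain of the core that first writes it. A threaded code
can recover the same layout by initializing its data in parallel; a minimal
sketch (compile with -fopenmp; the array size and names are illustrative):

    /* First-touch sketch: each thread writes the slice of the array it
       will later work on, so those pages are mapped into that thread's
       NUMA domain, mimicking what per-process init gives MPI for free. */
    #include <stdlib.h>

    int main(void)
    {
        long n = 1L << 26;
        double *a = malloc(n * sizeof(double));

        #pragma omp parallel for schedule(static)
        for (long i = 0; i < n; i++)
            a[i] = 0.0;   /* first touch decides the page's placement */

        /* ... later loops should reuse the same static schedule ... */
        free(a);
        return 0;
    }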
... shared across the threads in the same process.
I'd be curious to see some timing comparisons.
MM
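One way to produce such comparisons is a ping-pong microbenchmark, run once
with both ranks on the same node (the shared memory path) and once across
two nodes. A minimal sketch, assuming exactly two ranks; the iteration
count is arbitrary:

    /* Ping-pong timing sketch: rank 0 and rank 1 bounce a 1-byte
       message and report the average round-trip latency. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        char buf[1] = {0};
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < 10000; i++) {
            if (rank == 0) {
                MPI_Send(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();
        if (rank == 0)
            printf("round-trip latency: %g us\n",
                   (t1 - t0) / 10000 * 1e6);
        MPI_Finalize();
        return 0;
    }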
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of amjad ali
Sent: 10 December 2011 20:22
To: Open MPI Users
Subject: [OMPI users] How to justify the use MPI codes on multicore
systems/PCs?
Hello All,

I developed my MPI-based parallel code for clusters, but now I use it on
multicore/manycore computers (PCs) as well. How can I justify (in some
thesis/publication) the use of a distributed memory code (in MPI) on a
shared memory (multicore) machine? I guess I need to explain two reasons:

(1) Pl