Pasha,
see attached file.
I have traced how MPI_IPROBE is called and also managed to significantly
reduce the number of calls to MPI_IPROBE. Unfortunately this only
resulted in the program spending time in other routines. Basically the
code runs through a number of timesteps and after each
Ralph,
I can't get "opal_paffinity_alone" to work (see below). However, there
is a "mpi_affinity_alone" that I tried without any improvement.
However, setting:
-mca btl_openib_eager_limit 65536
gave a 15% improvement, so OpenMPI is now down to 326 seconds (from the
previous 376). Still a lot more
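For anyone following along, the eager-limit change can be tried as below. The default can be inspected with ompi_info; the process counts and binary name are placeholders modelled on the command line quoted in this thread, not the exact invocation:

```shell
# Check the current default eager limit for the openib BTL
# (ompi_info ships with Open MPI).
ompi_info --param btl openib | grep eager_limit

# Re-run with the larger eager limit; -np/-npernode and the
# binary name are placeholders.
mpirun -np 144 -npernode 8 \
    -mca btl_openib_eager_limit 65536 \
    ./rco2.24pe
```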
Okay, one problem is fairly clear. As Terry indicated, you have to tell us
to bind or else you lose a lot of performance. Set -mca opal_paffinity_alone
1 on your cmd line and it should make a significant difference.
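A minimal sketch of that command line (the process counts and binary name are placeholders, not the actual job):

```shell
# Enable processor binding on Open MPI 1.3.x via the MCA parameter
# suggested above; everything else here is a placeholder.
mpirun -np 144 -npernode 8 \
    -mca opal_paffinity_alone 1 \
    ./rco2.24pe
```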
On Wed, Aug 5, 2009 at 8:10 AM, Torgny Faxen wrote:
> Ralph,
Ralph,
I am running through a locally provided wrapper but it translates to:
/software/mpi/openmpi/1.3b2/i101017/bin/mpirun -np 144 -npernode 8 -mca
mpi_show_mca_params env,file
/nobackup/rossby11/faxen/RCO_scobi/src_161.openmpi/rco2.24pe
a) Upgrade... This will take some time; it will have
A comment on the below: I meant the 2x performance was for shared-memory
communications.
--td
Message: 3
Date: Wed, 05 Aug 2009 09:55:42 -0400
From: Terry Dontje <terry.don...@sun.com>
Subject: Re: [OMPI users] Performance difference on OpenMPI, IntelMPI
and ScaliMPI
To: us..
If the above doesn't improve anything the next question is do you know
what the sizes of the messages are? For very small messages I believe
Scali shows a 2x better performance than Intel and OMPI (I think this
is due to a fastpath optimization).
I remember that mvapich was faster than
--td
Message: 1
Date: Wed, 05 Aug 2009 15:15:52 +0200
From: Torgny Faxen <fa...@nsc.liu.se>
Subject: Re: [OMPI users] Performance difference on OpenMPI, IntelMPI
and ScaliMPI
To: pa...@dev.mellanox.co.il, Open MPI Users <us...@open-mpi.org>
Torgny,
We have one known issue in the openib btl that is related to IPROBE -
https://svn.open-mpi.org/trac/ompi/ticket/1362
Theoretically it may be the source of the performance degradation, but
to me the performance difference sounds too big.
* Do you know what the typical message size is for this
Could you send us the mpirun cmd line? I wonder if you are missing some
options that could help. Also, you might:
(a) upgrade to 1.3.3 - it looks like you are using some kind of pre-release
version
(b) add -mca mpi_show_mca_params env,file - this will cause rank=0 to output
what MCA params it is using
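Both suggestions can be checked from the shell; the process count and binary name below are placeholders:

```shell
# (a) Confirm which Open MPI version mpirun actually resolves to
#     (pre-release builds report e.g. "1.3b2" here).
mpirun --version

# (b) Have rank 0 dump the MCA parameters it picks up from the
#     environment and from config files.
mpirun -np 144 \
    -mca mpi_show_mca_params env,file \
    ./rco2.24pe
```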
Pasha,
no collectives are being used.
A simple grep in the code reveals the following MPI functions being used:
MPI_Init
MPI_wtime
MPI_COMM_RANK
MPI_COMM_SIZE
MPI_BUFFER_ATTACH
MPI_BSEND
MPI_PACK
MPI_UNPACK
MPI_PROBE
MPI_GET_COUNT
MPI_RECV
MPI_IPROBE
MPI_FINALIZE
where MPI_IPROBE is the clear hot spot.
Do you know if the application uses any collective operations?
Thanks
Pasha
Torgny Faxen wrote:
Hello,
we are seeing a large difference in performance for some applications
depending on what MPI is being used.
Attached are performance numbers and oprofile output (first 30 lines)
from one out of 14 nodes from one application run using OpenMPI,
IntelMPI and Scali MPI respectively.