Re: [OMPI users] MPI_Reduce performance

2010-09-22 Thread Jeff Squyres
On Sep 9, 2010, at 4:31 PM, Ashley Pittman wrote: >> What is the exact semantics of an asynchronous barrier, > > I'm not sure of the exact semantics but once you've got your head around the > concept it's fairly simple to understand how to use it, you call > MPI_IBarrier() and it gives you a handle

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Eugene Loh
Richard Treumann wrote: Hi Ashley I understand the problem with descriptor flooding can be serious in an application with unidirectional data dependency. Perhaps we have a different perception of how common that is. Ashley speculated it was a "significant minority." I don't know
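A hypothetical sketch of the unidirectional pattern being discussed (two ranks, made-up message counts, not code from the thread): rank 0 streams eager messages to rank 1 and never waits for any reply, so a slow receiver can accumulate unexpected messages and descriptors; an occasional barrier is the kind of throttle the thread refers to.

    /* Sketch: one-way data flow with a sporadic barrier as a throttle.
     * NITER and FLUSH_INTERVAL are illustrative values. */
    #include <mpi.h>

    #define NITER          100000
    #define FLUSH_INTERVAL 1000

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double payload = 42.0;

        for (int i = 0; i < NITER; i++) {
            if (rank == 0) {
                MPI_Send(&payload, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                /* ... receiver-side work here makes rank 1 fall behind ... */
                MPI_Recv(&payload, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            }

            /* Sporadic synchronisation keeps the fast sender from racing
             * arbitrarily far ahead of the receiver. */
            if (i % FLUSH_INTERVAL == 0)
                MPI_Barrier(MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }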

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Alex A. Granovsky
because of different speed of execution on different nodes, delays, etc... If you account for this, you get the result I mentioned. Alex

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 21:40, Richard Treumann wrote: > > Ashley > > Can you provide an example of a situation in which these semantically > redundant barriers help? I'm not making the case for semantically redundant barriers, I'm making a case for implicit synchronisation in every iteration of

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley Can you provide an example of a situation in which these semantically redundant barriers help? I may be missing something but my statement for the text book would be "If adding a barrier to your MPI program makes it run faster, there is almost certainly a flaw in it that is better solv

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 21:10, jody wrote: > Hi > @Ashley: > What is the exact semantics of an asynchronous barrier, I'm not sure of the exact semantics but once you've got your head around the concept it's fairly simple to understand how to use it, you call MPI_IBarrier() and it gives you a handle
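A minimal sketch of the usage Ashley describes is shown below, written against the MPI_Ibarrier call that was later standardized in MPI-3 (at the time of this thread it was not yet part of the standard, which is what jody's question is about); the polling loop and all names are illustrative only.

    /* Sketch: enter a barrier without blocking, keep a handle, and poll it
     * while doing other work.  Assumes an MPI-3 implementation. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        MPI_Request req;
        int done = 0;

        MPI_Ibarrier(MPI_COMM_WORLD, &req);   /* returns immediately with a handle */

        while (!done) {
            /* ... overlap useful computation or drain pending messages here ... */
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);  /* completes once every rank has entered */
        }

        MPI_Finalize();
        return 0;
    }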

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread jody
Hi @Ashley: What is the exact semantics of an asynchronous barrier, and is it part of the MPI specs? Thanks Jody On Thu, Sep 9, 2010 at 9:34 PM, Ashley Pittman wrote: > > On 9 Sep 2010, at 17:00, Gus Correa wrote: > >> Hello All >> >> Gabriele's question, Ashley's recipe, and Dick Treumann's

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 17:00, Gus Correa wrote: > Hello All > > Gabriele's question, Ashley's recipe, and Dick Treumann's cautionary words > may be part of a larger context of load balance, or not? > > Would Ashley's recipe of sporadic barriers be a silver bullet to > improve load imbalance prob

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Eugene Loh
Alex A. Granovsky wrote: Isn't it evident from the theory of random processes and probability theory that in the limit of an infinitely large cluster and parallel process, the probability of deadlocks with the current implementation is unfortunately quite a finite quantity and in

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Alex A. Granovsky
n any particular details of the program. Just my two cents. Alex Granovsky - Original Message - From: Richard Treumann To: Open MPI Users Sent: Thursday, September 09, 2010 10:10 PM Subject: Re: [OMPI users] MPI_Reduce performance I was pointing out that most programs have

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Eugene Loh
Gus Correa wrote: More often than not some components lag behind (regardless of how much you tune the number of processors assigned to each component), slowing down the whole scheme. The coupler must sit and wait for that late component, the other components must sit and wait for the coupler, an

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gus Correa

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ralph Castain
On Sep 9, 2010, at 1:46 AM, Ashley Pittman wrote: > > On 9 Sep 2010, at 08:31, Terry Frankcombe wrote: > >> On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: >>> As people have said, these time values are to be expected. All they >>> reflect is the time difference spent in reduce waiting

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ashley Pittman
On 9 Sep 2010, at 08:31, Terry Frankcombe wrote: > On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: >> As people have said, these time values are to be expected. All they >> reflect is the time difference spent in reduce waiting for the slowest >> process to catch up to everyone else. The

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Yes Terry, that's right. 2010/9/9 Terry Frankcombe > On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: > > As people have said, these time values are to be expected. All they > > reflect is the time difference spent in reduce waiting for the slowest > > process to catch up to everyone els

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Mm, I don't understand. The experiments on my application show that an intensive use of Barrier + Reduce is faster than a single Reduce. 2010/9/9 Ralph Castain > As people have said, these time values are to be expected. All they reflect > is the time difference spent in reduce waiting for

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Terry Frankcombe
On Thu, 2010-09-09 at 01:24 -0600, Ralph Castain wrote: > As people have said, these time values are to be expected. All they > reflect is the time difference spent in reduce waiting for the slowest > process to catch up to everyone else. The barrier removes that factor > by forcing all processes t

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Ralph Castain
As people have said, these time values are to be expected. All they reflect is the time difference spent in reduce waiting for the slowest process to catch up to everyone else. The barrier removes that factor by forcing all processes to start from the same place. No mystery here - just a reflec
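As a rough illustration of the point Ralph and Terry are making, here is a hypothetical timing harness (not the original poster's code; counts and values are made up): the skew between ranks does not go away when a barrier is added, it is simply charged to MPI_Barrier instead of MPI_Reduce.

    /* Sketch: time the barrier and the reduce separately.  With uneven work
     * per rank, the wait for the slowest process shows up in whichever call
     * is entered first. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double value = (double)rank, sum = 0.0;
        double t_barrier = 0.0, t_reduce = 0.0;

        for (int iter = 0; iter < 1000; iter++) {
            /* ... uneven application work here makes ranks arrive at different times ... */

            double t0 = MPI_Wtime();
            MPI_Barrier(MPI_COMM_WORLD);               /* absorbs the skew */
            double t1 = MPI_Wtime();
            MPI_Reduce(&value, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            double t2 = MPI_Wtime();

            t_barrier += t1 - t0;
            t_reduce  += t2 - t1;
        }

        if (rank == 0)
            printf("time in barrier %.3f s, time in reduce %.3f s\n",
                   t_barrier, t_reduce);

        MPI_Finalize();
        return 0;
    }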

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
In more detail, the total execution time without Barrier is about 1 sec. The total execution time with Barrier+Reduce is 9453, with 128 procs. 2010/9/9 Terry Frankcombe > Gabriele, > > Can you clarify... those timings are what is reported for the reduction > call specifically, not the total executi

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Gabriele Fatigati
Hi Terry, this time is spent in MPI_Reduce; it isn't the total execution time. 2010/9/9 Terry Frankcombe > Gabriele, > > Can you clarify... those timings are what is reported for the reduction > call specifically, not the total execution time? > > If so, then the difference is, to a first approxima

Re: [OMPI users] MPI_Reduce performance

2010-09-08 Thread Terry Frankcombe
Gabriele, Can you clarify... those timings are what is reported for the reduction call specifically, not the total execution time? If so, then the difference is, to a first approximation, the time you spend sitting idly by doing absolutely nothing waiting at the barrier. Ciao Terry -- Dr. Ter

Re: [OMPI users] MPI_Reduce performance

2010-09-08 Thread Ashley Pittman
On 8 Sep 2010, at 10:21, Gabriele Fatigati wrote: > So, in my opinion, it is better to put MPI_Barrier before any MPI_Reduce to > mitigate the "asynchronous" behaviour of MPI_Reduce in OpenMPI. I suspect the > same for other collective communications. Can someone explain to me why > MPI_Reduce has this

Re: [OMPI users] MPI_Reduce performance

2010-09-08 Thread David Zhang
Doing Reduce without Barrier first allows one process to call Reduce and exit immediately without waiting for other processes to call Reduce. Therefore, this allows one process to advance faster than other processes. I suspect the 2671 second result is the difference between the fastest and slowest

[OMPI users] MPI_Reduce performance

2010-09-08 Thread Gabriele Fatigati
Dear OpenMPI users, I'm using OpenMPI 1.3.3 on an InfiniBand 4x interconnection network. My parallel application makes intensive use of MPI_Reduce communication over communicators created with MPI_Comm_split. I've noted strange behaviour during execution. My code is instrumented with Scalasca 1.3 to report s
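For context, a rough sketch of the setup the post describes follows; the split criterion, counts, and iteration numbers are illustrative, not taken from the actual application.

    /* Sketch: repeated MPI_Reduce over a sub-communicator built with
     * MPI_Comm_split.  A tracing tool such as Scalasca attributes the wait
     * for the slowest rank in each sub-communicator to the reduce itself. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Hypothetical split: even and odd ranks form two groups. */
        int color = world_rank % 2;
        MPI_Comm subcomm;
        MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &subcomm);

        double local = (double)world_rank, total = 0.0;

        for (int i = 0; i < 10000; i++)
            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, subcomm);

        MPI_Comm_free(&subcomm);
        MPI_Finalize();
        return 0;
    }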