Re: [OMPI devel] SM BTL hang issue

2007-08-31 Thread Terry D. Dontje
Scott Atchley wrote:
> Terry, Are you testing on Linux? If so, which kernel?
No, I am running into issues on Solaris, but Ollie's run of the test code on Linux seems to work fine. --td
> See the patch to iperf to handle kernel 2.6.21 and the issue that they had with usleep(0): http://das

Re: [OMPI devel] SM BTL hang issue

2007-08-31 Thread Scott Atchley
Terry, Are you testing on Linux? If so, which kernel? See the patch to iperf to handle kernel 2.6.21 and the issue that they had with usleep(0): http://dast.nlanr.net/Projects/Iperf2.0/patch-iperf-linux-2.6.21.txt Scott On Aug 31, 2007, at 1:36 PM, Terry D. Dontje wrote: Ok, I have an up

Re: [OMPI devel] SM BTL hang issue

2007-08-31 Thread Terry D. Dontje
Ok, I have an update to this issue. I believe sched_yield is implemented differently on Linux and Solaris. If I change the sched_yield in opal_progress to a usleep(500), my program completes quite quickly. I have sent a few questions to a Solaris engineer and hopefu
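The sched_yield versus usleep(500) change described above can be sketched in C. This is only an illustration under stated assumptions, not the opal_progress code: idle_policy_t, idle_once, and wait_for_flag are hypothetical names, the flag stands in for whatever a progress loop polls, and nanosleep with a 500-microsecond interval is substituted for usleep(500).

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical idle strategies for a polling loop (names are not from Open MPI). */
typedef enum { IDLE_YIELD, IDLE_SLEEP } idle_policy_t;

/* Give up the CPU once, either by yielding or by sleeping for about 500 microseconds. */
static void idle_once(idle_policy_t policy)
{
    if (policy == IDLE_YIELD) {
        /* Whether another runnable process actually gets the CPU here
           depends on the operating system's scheduler. */
        sched_yield();
    } else {
        /* Stand-in for usleep(500): block for at least 500 microseconds. */
        struct timespec ts = { 0, 500L * 1000L };
        nanosleep(&ts, NULL);
    }
}

/* Poll *flag until it becomes nonzero or max_spins passes have elapsed.
   Returns the number of idle passes taken, or -1 if the flag never changed. */
static long wait_for_flag(atomic_int *flag, idle_policy_t policy, long max_spins)
{
    for (long spins = 0; spins < max_spins; ++spins) {
        if (atomic_load(flag)) {
            return spins;
        }
        idle_once(policy);
    }
    return -1;
}
```

With more processes than cores, as in the "mpirun -np 6" run on a 2 core system, the two policies may behave very differently: a sleep always blocks the caller for a while, whereas what a yield does is left to the scheduler.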

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Li-Ta Lo
On Thu, 2007-08-30 at 12:45 -0400, terry.don...@sun.com wrote:
> Li-Ta Lo wrote:
> > On Thu, 2007-08-30 at 12:25 -0400, terry.don...@sun.com wrote:
> > > Li-Ta Lo wrote:
> > > > On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote:

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Terry . Dontje
Li-Ta Lo wrote: On Thu, 2007-08-30 at 12:25 -0400, terry.don...@sun.com wrote: Li-Ta Lo wrote: On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote: hmmm, interesting since my version doesn't abort at all. Some problem with fortran compiler/language binding

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Li-Ta Lo
On Thu, 2007-08-30 at 12:25 -0400, terry.don...@sun.com wrote:
> Li-Ta Lo wrote:
> > On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote:
> > > hmmm, interesting since my version doesn't abort at all.
> > Some problem with fortran compiler/language binding?

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Terry . Dontje
Li-Ta Lo wrote: On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote: hmmm, interesting since my version doesn't abort at all. Some problem with fortran compiler/language binding? My C translation doesn't have any problem. [ollie@exponential ~]$ mpirun -np 4 a.out 10 Target d

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Li-Ta Lo
On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote: > hmmm, interesting since my version doesn't abort at all. > Some problem with fortran compiler/language binding? My C translation doesn't have any problem. [ollie@exponential ~]$ mpirun -np 4 a.out 10 Target duration (seconds): 10.

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Terry D. Dontje
hmmm, interesting since my version doesn't abort at all. --td Li-Ta Lo wrote: On Wed, 2007-08-29 at 11:36 -0400, Terry D. Dontje wrote: To run the code I usually do "mpirun -np 6 a.out 10" on a 2 core system. It'll print out the following and then hang: Target duration (seconds):

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Li-Ta Lo
On Wed, 2007-08-29 at 11:36 -0400, Terry D. Dontje wrote:
> To run the code I usually do "mpirun -np 6 a.out 10" on a 2 core
> system. It'll print out the following and then hang:
> Target duration (seconds): 10.00
> # of messages sent in that time: 589207
> Microseconds per mess

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Terry D. Dontje
To run the code I usually do "mpirun -np 6 a.out 10" on a 2 core system. It'll print out the following and then hang:

    Target duration (seconds): 10.00
    # of messages sent in that time: 589207
    Microseconds per message: 16.972

--td

Terry D. Dontje wrote: Heard you th

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Terry D. Dontje
Heard you the first time Gleb, just been backed up with other stuff. Following is the code:

    include "mpif.h"
    character(20) cmd_line_arg ! We'll use the first command-line argument
                               ! to set the duration of the test.
    real(8) :: duration = 10   ! The de

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 11:01:14AM -0400, Richard Graham wrote: > If you are going to look at it, I will not bother with this. I need the code to reproduce the problem. Otherwise I have nothing to look at. > > Rich > > > On 8/29/07 10:47 AM, "Gleb Natapov" wrote: > > > On Wed, Aug 29, 2007 a

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Richard Graham
If you are going to look at it, I will not bother with this. Rich On 8/29/07 10:47 AM, "Gleb Natapov" wrote: > On Wed, Aug 29, 2007 at 10:46:06AM -0400, Richard Graham wrote: >> Gleb, >> Are you looking at this ? > Not today. And I need the code to reproduce the bug. Is this possible? > >>

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 10:46:06AM -0400, Richard Graham wrote: > Gleb, > Are you looking at this ? Not today. And I need the code to reproduce the bug. Is this possible? > > Rich > > > On 8/29/07 9:56 AM, "Gleb Natapov" wrote: > > > On Wed, Aug 29, 2007 at 04:48:07PM +0300, Gleb Natapov wr

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Richard Graham
Gleb, Are you looking at this ? Rich On 8/29/07 9:56 AM, "Gleb Natapov" wrote: > On Wed, Aug 29, 2007 at 04:48:07PM +0300, Gleb Natapov wrote: >> Is this trunk or 1.2? > Oops. I should read more carefully :) This is trunk. > >> >> On Wed, Aug 29, 2007 at 09:40:30AM -0400, Terry D. Dontje w

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 04:48:07PM +0300, Gleb Natapov wrote: > Is this trunk or 1.2? Oops. I should read more carefully :) This is trunk. > > On Wed, Aug 29, 2007 at 09:40:30AM -0400, Terry D. Dontje wrote: > > I have a program that does a simple bucket brigade of sends and receives > > where r

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Terry D. Dontje
Trunk. --td Gleb Natapov wrote: Is this trunk or 1.2? On Wed, Aug 29, 2007 at 09:40:30AM -0400, Terry D. Dontje wrote: I have a program that does a simple bucket brigade of sends and receives where rank 0 is the start and repeatedly sends to rank 1 until a certain amount of time has passe

Re: [OMPI devel] SM BTL hang issue

2007-08-29 Thread Gleb Natapov
Is this trunk or 1.2? On Wed, Aug 29, 2007 at 09:40:30AM -0400, Terry D. Dontje wrote: > I have a program that does a simple bucket brigade of sends and receives > where rank 0 is the start and repeatedly sends to rank 1 until a certain > amount of time has passed and then it sends and all done