Re: [OMPI devel] SCTP noisy failure

2007-12-12 Thread Brad Penoff
On Dec 12, 2007 6:03 PM, Jeff Squyres wrote: > On Dec 12, 2007, at 8:58 PM, Brad Penoff wrote: > > >> That's not really the issue: I don't *want* SCTP support. :) > >> > >> I have a default RHEL4U4 install and now Open MPI is complaining on a > >> default mpirun. Open MPI should work out of the

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres
Tarballs available at: http://www.open-mpi.org/~jsquyres/unofficial/ On Dec 12, 2007, at 4:08 PM, Jeff Squyres (jsquyres) wrote: Heh, ok. I'll make a tarball against your patch later. It's against the trunk? -jms Sent from my PDA -Original Message- From: Gleb Natapov [mai

Re: [OMPI devel] SCTP noisy failure

2007-12-12 Thread Jeff Squyres
On Dec 12, 2007, at 8:58 PM, Brad Penoff wrote: That's not really the issue: I don't *want* SCTP support. :) I have a default RHEL4U4 install and now Open MPI is complaining on a default mpirun. Open MPI should work out of the box -- warning free -- on all supported operating systems. Haha,

Re: [OMPI devel] SCTP noisy failure

2007-12-12 Thread Brad Penoff
On Dec 12, 2007 5:44 PM, Jeff Squyres wrote: > On Dec 12, 2007, at 7:16 PM, Brad Penoff wrote: > > > Does your system have sctp in the kernel as a module? This is the > > default for most Linux systems so you may have to "modprobe sctp" to > > get rid of the ESOCKTNOSUPPORT... > > That's not real

Re: [OMPI devel] SCTP noisy failure

2007-12-12 Thread Jeff Squyres
On Dec 12, 2007, at 7:16 PM, Brad Penoff wrote: Does your system have sctp in the kernel as a module? This is the default for most Linux systems so you may have to "modprobe sctp" to get rid of the ESOCKTNOSUPPORT... That's not really the issue: I don't *want* SCTP support. :) I have a defa

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16909 (f77_hello compiler error)

2007-12-12 Thread George Bosilca
The logic was wrong. I only got half of it. Commit 16950 solves the problem. Sorry for this. Thanks, george. On Dec 12, 2007, at 2:44 PM, Jeff Squyres wrote: Yes -- something changed; I tested all 4 languages extensively before I committed (but not on mac). This fails for me on Linux a

Re: [OMPI devel] SCTP noisy failure

2007-12-12 Thread Brad Penoff
hey Jeff, Does your system have sctp in the kernel as a module? This is the default for most Linux systems so you may have to "modprobe sctp" to get rid of the ESOCKTNOSUPPORT... brad On Dec 12, 2007 3:57 PM, Jeff Squyres wrote: > After the exclusivity change today, I notice that I am getting

[OMPI devel] SCTP noisy failure

2007-12-12 Thread Jeff Squyres
After the exclusivity change today, I notice that I am getting warnings for *every* mpirun from the SCTP BTL on RHEL4: [15:52] svbu-mpi:~/mpi % mpirun -np 2 hello [svbu-mpi.cisco.com][1,0][btl_sctp_component.c: 615:mca_btl_sctp_component_create_listen] socket() failed with errno=94 [svbu-mpi.c
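For readers following this thread: errno 94 on Linux is ESOCKTNOSUPPORT, which is what socket() reports when the kernel has no SCTP support loaded. The following is a minimal sketch, not the actual btl_sctp_component.c code, of how a component could probe for SCTP and treat "protocol not available" as a quiet reason to disable itself. The function names example_sctp_probe and example_failure_is_quiet are hypothetical, and treating these two errno values as quiet is an assumption about the desired policy.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try to open an SCTP socket.
 * Returns 1 if SCTP is usable, 0 otherwise; on failure the errno
 * value from socket() is stored in *saved_errno. */
static int example_sctp_probe(int *saved_errno)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0) {
        *saved_errno = errno;
        return 0;
    }
    close(fd);
    *saved_errno = 0;
    return 1;
}

/* Hypothetical policy: these errno values only mean "SCTP is not
 * available on this host", so a component might skip the warning
 * and simply disable itself. Anything else is worth reporting. */
static int example_failure_is_quiet(int err)
{
    return err == ESOCKTNOSUPPORT || err == EPROTONOSUPPORT;
}
```

A caller would invoke example_sctp_probe() once at component open time and print a warning only when the probe fails and example_failure_is_quiet() returns 0.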

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres
Was Rich referring to ensuring that the test codes checked that their payloads were correct (and not re-assembled in the wrong order)? On Dec 12, 2007, at 4:10 PM, Brian W. Barrett wrote: On Wed, 12 Dec 2007, Gleb Natapov wrote: On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrot

Re: [OMPI devel] New BTL parameter

2007-12-12 Thread Paul H. Hargrove
Gleb Natapov wrote: On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote: On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote: Currently BTL has parameter btl_min_send_size that is no longer used. I want to change it to be btl_rndv_eager_limit. This new parameter will determine

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Brian W. Barrett
On Wed, 12 Dec 2007, Gleb Natapov wrote: On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote: This is better than nothing, but really not very helpful for looking at the specific issues that can arise with this, unless these systems have several parallel networks, with tests that wil

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres (jsquyres)
Heh, ok. I'll make a tarball against your patch later. It's against the trunk? -jms Sent from my PDA -Original Message- From: Gleb Natapov [mailto:gl...@voltaire.com] Sent: Wednesday, December 12, 2007 03:54 PM Eastern Standard Time To: Open MPI Developers Subject:Re: [O

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:52:17PM -0500, Jeff Squyres wrote: > On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote: > > >> How about making a tarball with this patch in it that can be thrown > >> at > >> everyone's MTT? (we can put the tarball on www.open-mpi.org > >> somewhere) > > I don't have

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres
On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote: How about making a tarball with this patch in it that can be thrown at everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere) I don't have access to www.open-mpi.org, but I can send you the patch. I can send you a tarball too,

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote: > This is better than nothing, but really not very helpful for looking at the > specific issues that can arise with this, unless these systems have several > parallel networks, with tests that will generate a lot of parallel network >

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Richard Graham
This is better than nothing, but really not very helpful for looking at the specific issues that can arise with this, unless these systems have several parallel networks, with tests that will generate a lot of parallel network traffic, and be able to self check for out-of-order received - i.e. this

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jon Mason
On Wed, Dec 12, 2007 at 01:35:33PM -0500, Jeff Squyres wrote: > I agree with Gleb's idea. More below. > > On Dec 12, 2007, at 12:24 PM, Jon Mason wrote: > > > Ok, glad I got this conversation started :) > > > > So, we need a slight redesign to determine the cm method (unless > > forced > > via

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 11:57:11AM -0500, Jeff Squyres wrote: > Gleb -- > > How about making a tarball with this patch in it that can be thrown at > everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere) I don't have access to www.open-mpi.org, but I can send you the patch. I can

Re: [OMPI devel] New BTL parameter

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 02:03:02PM -0500, Jeff Squyres wrote: > On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote: > > > Currently BTL has parameter btl_min_send_size that is no longer used. > > I want to change it to be btl_rndv_eager_limit. This new parameter > > will > > determine a size of a

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16909 (f77_hello compiler error)

2007-12-12 Thread Jeff Squyres
Yes -- something changed; I tested all 4 languages extensively before I committed (but not on mac). This fails for me on Linux as well; I'll check into it... On Dec 12, 2007, at 2:15 PM, Ethan Mallove wrote: Hello, Is this change (or r16908) causing the below error in the MTT trivial test

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16909 (f77_hello compiler error)

2007-12-12 Thread Ethan Mallove
Hello, Is this change (or r16908) causing the below error in the MTT trivial test (f77_hello)? The error occurs on Solaris and Linux. ... NOTICE: Invoking /ws/ompi-tools/SUNWspro/SOS11/bin/f90 -f77 -ftrap=%none -I/installs/cGmK/install/include/v9 -xarch=amd64 hello.f -o f77_hello -R/install

Re: [OMPI devel] New BTL parameter

2007-12-12 Thread Jeff Squyres
On Dec 9, 2007, at 10:34 AM, Gleb Natapov wrote: Currently BTL has parameter btl_min_send_size that is no longer used. I want to change it to be btl_rndv_eager_limit. This new parameter will determine a size of a first fragment of rendezvous protocol. Now we use btl_eager_limit to set its

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jeff Squyres
I agree with Gleb's idea. More below. On Dec 12, 2007, at 12:24 PM, Jon Mason wrote: Ok, glad I got this conversation started :) So, we need a slight redesign to determine the cm method (unless forced via commandline arg). This can be determined by calling all the individual open routines

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jon Mason
Ok, glad I got this conversation started :) So, we need a slight redesign to determine the cm method (unless forced via commandline arg). This can be determined by calling all the individual open routines, and having them return a priority based on their ability to function. For example, the xoo
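The selection scheme described in this message (call every open routine, have each return a priority, pick the highest) can be sketched in C as follows. This is only an illustration under stated assumptions: the type and function names (example_cpc_entry, example_select_cpc, the example_method_* routines) and the convention that a negative priority means "cannot function here" are all hypothetical, not the openib BTL's actual interface.

```c
#include <stddef.h>

/* Hypothetical open routine: returns a priority >= 0 if the
 * connection method can function, or a negative value if it cannot. */
typedef int (*example_cpc_open_fn)(void);

typedef struct {
    const char *name;
    example_cpc_open_fn open;
} example_cpc_entry;

/* Call every open routine and return the entry with the highest
 * non-negative priority, or NULL if none can function.
 * Ties are resolved in favor of the earlier entry. */
static const example_cpc_entry *
example_select_cpc(const example_cpc_entry *entries, size_t count)
{
    const example_cpc_entry *best = NULL;
    int best_priority = -1;

    for (size_t i = 0; i < count; ++i) {
        int priority = entries[i].open();
        if (priority > best_priority) {
            best_priority = priority;
            best = &entries[i];
        }
    }
    return best;
}

/* Made-up methods for illustration only. */
static int example_method_a_open(void) { return 10; }
static int example_method_b_open(void) { return -1; } /* unusable here */
static int example_method_c_open(void) { return 50; }
```

With the three made-up methods above placed in one table, example_select_cpc() would return the entry for method C. A command-line override, as mentioned in the message, would simply bypass this loop.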

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres
Gleb -- How about making a tarball with this patch in it that can be thrown at everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere) On Dec 11, 2007, at 4:14 PM, Richard Graham wrote: I will re-iterate my concern. The code that is there now is mostly nine years old (with

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Karol Mroz
I just read this thread... many thanks for applying the fix. Jeff Squyres wrote: > Done in r16942. > > On Dec 12, 2007, at 10:45 AM, Gleb Natapov wrote: > >> On Wed, Dec 12, 2007 at 10:31:37AM -0500, Jeff Squyres wrote: >>> I'd be in favor of setting the TCP exclusivity to LOW+100 and setting >>

Re: [OMPI devel] initial SCTP BTL commit comments?

2007-12-12 Thread Andrew Friedley
Jeff Squyres wrote: Alternatively, you could do what the ofud BTL does (a currently experimental BTL): look for the string "ofud" in the "btl" MCA parameter -- i.e., see if the user explicitly asked for the ofud BTL. If not found (doing the Right Things with the "^" operator, of course),

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Jeff Squyres
Done in r16942. On Dec 12, 2007, at 10:45 AM, Gleb Natapov wrote: On Wed, Dec 12, 2007 at 10:31:37AM -0500, Jeff Squyres wrote: I'd be in favor of setting the TCP exclusivity to LOW+100 and setting SCTP exclusivity to LOW. Fine with me. On Dec 12, 2007, at 10:07 AM, Gleb Natapov wrote:

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 10:31:37AM -0500, Jeff Squyres wrote: > I'd be in favor of setting the TCP exclusivity to LOW+100 and setting > SCTP exclusivity to LOW. Fine with me. > > > On Dec 12, 2007, at 10:07 AM, Gleb Natapov wrote: > > > On Wed, Dec 12, 2007 at 10:02:07AM -0500, Jeff Squyres w

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Jeff Squyres
I'd be in favor of setting the TCP exclusivity to LOW+100 and setting SCTP exclusivity to LOW. On Dec 12, 2007, at 10:07 AM, Gleb Natapov wrote: On Wed, Dec 12, 2007 at 10:02:07AM -0500, Jeff Squyres wrote: Yes -- this came up in a prior thread. See what I proposed: http://www.open-mp

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 10:02:07AM -0500, Jeff Squyres wrote: > Yes -- this came up in a prior thread. See what I proposed: > > http://www.open-mpi.org/community/lists/devel/2007/12/2698.php > > (no one replied, so no action was taken) > > Are you on a system where the SCTP BTL is being bu

Re: [OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Jeff Squyres
Yes -- this came up in a prior thread. See what I proposed: http://www.open-mpi.org/community/lists/devel/2007/12/2698.php (no one replied, so no action was taken) Are you on a system where the SCTP BTL is being built? What kind of environment is it? On Dec 12, 2007, at 9:38 AM, Gle

[OMPI devel] SCTP BTL exclusivity value problem

2007-12-12 Thread Gleb Natapov
Hi, SCTP BTL sets its exclusivity value to MCA_BTL_EXCLUSIVITY_LOW - 1, but MCA_BTL_EXCLUSIVITY_LOW is zero, so it is actually set to the maximum possible exclusivity. Can somebody fix this, please? Maybe we should not define MCA_BTL_EXCLUSIVITY_LOW to be zero? -- Gleb.
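The wraparound reported here can be reproduced in a few lines of C. This is a minimal sketch, assuming the exclusivity field is an unsigned 32-bit integer; EXAMPLE_EXCLUSIVITY_LOW and the helper functions are hypothetical stand-ins, not the actual Open MPI definitions. The last two helpers illustrate the values proposed in the replies to this thread (TCP at LOW+100, SCTP at LOW).

```c
#include <stdint.h>

/* Hypothetical stand-in for MCA_BTL_EXCLUSIVITY_LOW, which the
 * report says is defined as zero. */
#define EXAMPLE_EXCLUSIVITY_LOW 0u

/* The reported setting: LOW - 1. Unsigned arithmetic wraps around,
 * so 0 - 1 becomes the largest representable value. */
static uint32_t example_sctp_exclusivity_before(void)
{
    uint32_t low = EXAMPLE_EXCLUSIVITY_LOW;
    return (uint32_t)(low - 1u);
}

/* The proposed values: raise TCP instead of lowering SCTP,
 * so no subtraction below zero is needed. */
static uint32_t example_tcp_exclusivity_proposed(void)
{
    return EXAMPLE_EXCLUSIVITY_LOW + 100u;
}

static uint32_t example_sctp_exclusivity_proposed(void)
{
    return EXAMPLE_EXCLUSIVITY_LOW;
}
```

Under the stated assumption, example_sctp_exclusivity_before() yields UINT32_MAX, which ranks SCTP above every other BTL, while the proposed pair keeps SCTP below TCP.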

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 04:08:31PM +0200, Pavel Shamis (Pasha) wrote: > Gleb Natapov wrote: >> On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: >> >>> Gleb Natapov wrote: >>> On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > Isn't th

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)
Gleb Natapov wrote: On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: Gleb Natapov wrote: On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: > Gleb Natapov wrote: > > On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > > > >> Isn't there a better way somehow? Perhaps we should have "select" > >> call *all* the functions and accept back a priority. T

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)
Gleb Natapov wrote: On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the ot

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > Isn't there a better way somehow? Perhaps we should have "select" > call *all* the functions and accept back a priority. The one with the > highest priority then wins. This is quite similar to much of the > other selection log

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jeff Squyres
On Dec 12, 2007, at 5:13 AM, Pavel Shamis (Pasha) wrote: Hmm. I don't think that we want to put knowledge of XRC in the OOB CPC (and vice versa). That seems like an abstraction violation. I didn't like that XRC knowledge was put in the connect base either, but I was too busy to argue with it.

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: Hmm. I don't think that we want to put knowledge of XRC in the OOB CPC (and vice versa). That seems like an abstraction violation. I didn't like that XRC knowledge was put in the connect base either, but I was too busy to argue with it. :-) Isn't there a better way s