[O-MPI devel] Fwd: [O-MPI users] HOWTO turn of "multi-rail" support at runtime?

2005-09-21 Thread Brian Barrett

Tim -

Just to make sure I'm not losing it - if any of the "high speed"  
networks is found between peers, tcp shouldn't be used between that  
pair, right?  I was pretty sure that's what the priority code did  
now, but wanted to make sure I wasn't losing it ;).


Brian

Begin forwarded message:


From: "Tim S. Woodall" 
Date: September 20, 2005 7:51:42 PM GMT+02:00
To: Open MPI Users 
Subject: Re: [O-MPI users] HOWTO turn of "multi-rail" support at  
runtime?

Reply-To: Open MPI Users 


Daryl,

Try setting:

-mca btl_base_include self,mvapi

To specify that only loopback (self) and mvapi btls should be used.

Can you forward me the config.log from your build?

Thanks,
Tim

Daryl W. Grunau wrote:

Hi, I've got a dual-homed IB + GigE connected cluster for which I've built
a very recent drop of OpenMPI (w/ mvapi support).  I'm having difficulty
making OMPI solely use native verbs as its communication between nodes.
I've tried all incantations of the following mca parameters to no avail:


   --mca btl_tcp_if_exclude "lo,eth0,eth1,ib0,ib1"
   --mca ptl_tcp_if_include "lo,eth0,eth1,ib0,ib1"

Note I'm putting ib in the list because I really don't wish to use IP/IB;
OMPI should be able to communicate at the native verbs level, right?  If I
leave ib0/1 unconfigured on my host, OMPI uses eth0 for its communication.
If I bring up ib0, OMPI uses both eth0 and ib0!  Is there any way I can
specify for it to use none of these TCP interfaces?  TIA!

Daryl

P.s.  I can send output of ompi_info if that is helpful.
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users








Re: [O-MPI devel] Registration Cache changes

2005-09-21 Thread Gleb Natapov
Hello Galen,

Finally I've got some time to look through the new code.
I have a couple of notes.  In pml_ob1_rdma.c you try to merge
registrations in a number of places. The code looks like this:
  btl_mpool->mpool_deregister(btl_mpool, reg);
  btl_mpool->mpool_register(btl_mpool,
                            new_base,
                            new_len,
                            MCA_MPOOL_FLAGS_CACHE,
                            &reg);
How do you know reg is not in use? You can't deregister it if somebody
is using the registration!
Also, I thought about merging registrations and I am not sure this is such
a good idea. The registration may grow too large, and you will not be able
to shrink it if only a small part of it is in use. This may waste memory.
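
To make the concern about merged registrations concrete, here is a minimal sketch of how merging two ranges produces one covering span. The type sketch_span_t and the function sketch_span_merge are hypothetical names used only for illustration; they are not taken from pml_ob1_rdma.c, and the actual merge logic there may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte range [base, base + len); not an Open MPI type. */
typedef struct {
    uintptr_t base;
    size_t    len;
} sketch_span_t;

/* Smallest single span that covers both a and b.
 * Any gap between the two ranges becomes part of the result. */
sketch_span_t sketch_span_merge(sketch_span_t a, sketch_span_t b)
{
    assert(a.len > 0 && b.len > 0);

    uintptr_t lo    = a.base < b.base ? a.base : b.base;
    uintptr_t a_end = a.base + a.len;
    uintptr_t b_end = b.base + b.len;
    uintptr_t hi    = a_end > b_end ? a_end : b_end;

    sketch_span_t merged = { lo, (size_t)(hi - lo) };
    return merged;
}
```

With two 0x100-byte buffers at 0x1000 and 0x9000, the merged span is 0x8100 bytes long, so 0x200 bytes of buffers in use would keep 0x8100 bytes pinned.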

In the mca_mpool_base_registration_t structure you save base/bound at byte
granularity, but we know that the kernel works at a much coarser resolution.
Why not exploit this fact? We can round base/bound to page boundaries.
We are going to pin this memory anyway. In my patch I introduced
mpool_pageshift for this.
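
The rounding can be sketched as follows. This is only an illustration under stated assumptions: page_range_t and round_to_pages are hypothetical names, the page shift is passed in as a plain argument (12 for 4 KiB pages), and bound is assumed to be the last byte of the range. None of this is the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type; not mca_mpool_base_registration_t. */
typedef struct {
    uintptr_t base;   /* first byte of the range, page aligned */
    uintptr_t bound;  /* last byte of the range, end of a page */
} page_range_t;

/* Round the byte range [addr, addr + len - 1] outward to page boundaries.
 * page_shift is log2 of the page size, e.g. 12 for 4 KiB pages. */
page_range_t round_to_pages(uintptr_t addr, size_t len, unsigned page_shift)
{
    assert(len > 0);
    assert(page_shift < sizeof(uintptr_t) * 8);

    uintptr_t page_mask = ((uintptr_t)1 << page_shift) - 1;

    page_range_t r;
    r.base  = addr & ~page_mask;             /* round down to page start */
    r.bound = (addr + len - 1) | page_mask;  /* round up to page end */
    return r;
}
```

For example, a 16-byte buffer at 0x1ff8 straddles two 4 KiB pages and rounds to base 0x1000 and bound 0x2fff.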

--
Gleb.


Re: [O-MPI devel] Fwd: [O-MPI users] HOWTO turn of "multi-rail" support at runtime?

2005-09-21 Thread Tim S. Woodall

That's correct. Not sure why TCP would have been used - unless the IB
interfaces weren't up.
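
As a rough illustration of the behaviour described here, the sketch below picks transports for a peer pair by priority. Everything in it is an assumption made for the example: the struct, the function names, and the priority values are hypothetical and are not Open MPI's actual data structures or numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport descriptor; names and values are illustrative. */
typedef struct {
    const char *name;
    int priority;   /* higher wins; assumed non-negative */
    int reachable;  /* nonzero if this transport connects the peer pair */
} sketch_transport_t;

/* Highest priority among reachable transports, or -1 if none is reachable. */
int sketch_best_priority(const sketch_transport_t *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (t[i].reachable && t[i].priority > best) {
            best = t[i].priority;
        }
    }
    return best;
}

/* Transport i is used for the pair only if it is reachable and has the
 * top priority among the reachable transports. */
int sketch_is_selected(const sketch_transport_t *t, size_t n, size_t i)
{
    assert(i < n);
    return t[i].reachable && t[i].priority == sketch_best_priority(t, n);
}
```

Under these assumptions, tcp is skipped whenever mvapi is reachable between the pair, and is used only when the IB side is not up.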


Brian Barrett wrote:

Tim -

Just to make sure I'm not losing it - if any of the "high speed"  
networks is found between peers, tcp shouldn't be used between that  
pair, right?  I was pretty sure that's what the priority code did  
now, but wanted to make sure I wasn't losing it ;).


Brian


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [O-MPI devel] Registration Cache changes

2005-09-21 Thread Tim S. Woodall



Gleb Natapov wrote:

Hello Galen,

Finally I've got some time to look through the new code.
I have a couple of notes.  In pml_ob1_rdma.c you try to merge
registrations in a number of places. The code looks like this:

  btl_mpool->mpool_deregister(btl_mpool, reg);
  btl_mpool->mpool_register(btl_mpool,
                            new_base,
                            new_len,
                            MCA_MPOOL_FLAGS_CACHE,
                            &reg);
How do you know reg is not in use? You can't deregister it if somebody
is using the registration!


Good catch... this should check the reference count and
only deregister when the reference count actually goes to zero...
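
A minimal sketch of that check, assuming a simplified registration with an explicit reference count. The names sketch_reg_t, sketch_reg_retain, and sketch_reg_release are hypothetical, and the registered flag stands in for the real mpool_deregister call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified registration; not Open MPI's actual structure. */
typedef struct {
    int  ref_count;   /* number of current users of this registration */
    bool registered;  /* true while the memory is still pinned */
} sketch_reg_t;

/* Take one more reference on the registration. */
void sketch_reg_retain(sketch_reg_t *reg)
{
    reg->ref_count++;
}

/* Drop one reference; unpin only when the last user is gone.
 * Returns true if the registration was actually deregistered. */
bool sketch_reg_release(sketch_reg_t *reg)
{
    assert(reg->ref_count > 0);
    if (--reg->ref_count > 0) {
        return false;         /* still in use elsewhere: leave it pinned */
    }
    reg->registered = false;  /* stand-in for the real deregister call */
    return true;
}
```

A merge path would then release the old registration rather than deregister it directly, so a registration that another request still holds stays pinned.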


Re: [O-MPI devel] Registration Cache changes

2005-09-21 Thread Galen M. Shipman

Gleb,



Gleb Natapov wrote:


Hello Galen,

Finally I've got some time to look through the new code.
I have a couple of notes.  In pml_ob1_rdma.c you try to merge
registrations in a number of places. The code looks like this:
  btl_mpool->mpool_deregister(btl_mpool, reg);
  btl_mpool->mpool_register(btl_mpool,
                            new_base,
                            new_len,
                            MCA_MPOOL_FLAGS_CACHE,
                            &reg);
How do you know reg is not in use? You can't deregister it if somebody
is using the registration!



Good catch... this should check the reference count and
only deregister when the reference count actually goes to zero...



Yes, this was a good catch.. This was causing all sorts of fun for us!
Thanks,

Galen



[O-MPI devel] [Fwd: OMPI mpif.h problems]

2005-09-21 Thread Tim S. Woodall

Can anyone comment on this?


 Original Message 
Subject: OMPI mpif.h problems
List-Post: devel@lists.open-mpi.org
Date: Wed, 21 Sep 2005 12:27:13 -0600
From: David R. (Chip) Kent IV 
To: Tim S. Woodall 


Tim,

I managed to find a number of problems with the mpif.h when I tried it on
a big application.  It looks like a lot of key constants are not defined
in this file.  So far, MPI_SEEK_SET, MPI_MODE_CREATE, MPI_MODE_WRONLY
have broken the build.  I add them into the mpif.h file as I find them,
but it takes ~10 minutes to redo the build.  Let me know if you make a fix
for this, and I'll test it out.

Chip

-
David R. "Chip" Kent IV

Parallel Tools Team
High Performance Computing Environments Group (CCN-8)
Los Alamos National Laboratory

(505)665-5021
drk...@lanl.gov
-

This message is "Technical data or Software  Publicly
Available" or "Correspondence".



[O-MPI devel] mpif.h problems

2005-09-21 Thread David R. (Chip) Kent IV
I managed to find a number of problems with the mpif.h when I tried it on 
a big application.  It looks like a lot of key constants are not defined 
in this file.  So far, MPI_SEEK_SET, MPI_MODE_CREATE, MPI_MODE_WRONLY 
have broken the build.  I've added them to mpif.h as I find them so that 
I can get the build to go, but I assume there are many more values still 
missing.

Chip

-
David R. "Chip" Kent IV

Parallel Tools Team
High Performance Computing Environments Group (CCN-8)
Los Alamos National Laboratory

(505)665-5021
drk...@lanl.gov
-

This message is "Technical data or Software  Publicly 
Available" or "Correspondence".


[O-MPI devel] --with-mvapi/--with-btl-mvapi???

2005-09-21 Thread Tim S. Woodall


Note that the recent change to the configure script(s) to use --with-mvapi
instead of --with-btl-mvapi is not complete. I've recently had to use both
to compile mvapi. This is causing a great deal of pain for external users.

Can someone please look at this?