[O-MPI devel] Fwd: [O-MPI users] HOWTO turn of "multi-rail" support at runtime?
Tim - Just to make sure I'm not losing it - if any of the "high speed" networks is found between peers, tcp shouldn't be used between that pair, right? I was pretty sure that's what the priority code did now, but wanted to make sure I wasn't losing it ;).

Brian

Begin forwarded message:

From: "Tim S. Woodall"
Date: September 20, 2005 7:51:42 PM GMT+02:00
To: Open MPI Users
Subject: Re: [O-MPI users] HOWTO turn of "multi-rail" support at runtime?
Reply-To: Open MPI Users

Daryl,

Try setting:

-mca btl_base_include self,mvapi

to specify that only the loopback (self) and mvapi btls should be used. Can you forward me the config.log from your build?

Thanks,
Tim

Daryl W. Grunau wrote:

Hi,

I've got a dual-homed IB + GigE connected cluster for which I've built a very recent drop of OpenMPI (w/ mvapi support). I'm having difficulty making OMPI use only native verbs for its communication between nodes. I've tried all incantations of the following mca parameters to no avail:

--mca btl_tcp_if_exclude "lo,eth0,eth1,ib0,ib1"
--mca ptl_tcp_if_include "lo,eth0,eth1,ib0,ib1"

Note I'm putting ib in the list because I really don't wish to use IP/IB; OMPI should be able to communicate at the native verbs level, right? If I leave ib0/1 unconfigured on my host, OMPI uses eth0 for its communication. If I bring up ib0, OMPI uses both eth0 and ib0! Is there any way I can specify for it to use none of these TCP interfaces?

TIA!
Daryl

P.s. I can send output of ompi_info if that is helpful.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [O-MPI devel] Registration Cache changes
Hello Galen,

Finally I've got some time to look through the new code. I have a couple of notes.

In pml_ob1_rdma.c you try to merge registrations in a number of places. The code looks like this:

    btl_mpool->mpool_deregister(btl_mpool, reg);
    btl_mpool->mpool_register(btl_mpool, new_base, new_len, MCA_MPOOL_FLAGS_CACHE, &reg);

How do you know reg is not in use? You can't deregister it if somebody is using the registration!

Also, I thought about merging registrations, and I am not sure it is such a good idea. A registration may grow too large, and you will not be able to shrink it if only a small part of it is in use. This may waste memory.

In the mca_mpool_base_registration_t structure you save base/bound at byte granularity, but we know that the kernel works at a much coarser resolution. Why not exploit this fact? We can round base/bound to page boundaries. We are going to pin this memory anyway. In my patch I introduced mpool_pageshift for this.

--
Gleb.
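For readers following the thread, the page-rounding idea can be sketched roughly as below. This is only an illustration under stated assumptions: the helper names (page_down, page_bound, pages_spanned) are hypothetical and not taken from the Open MPI source or from Gleb's patch, "bound" is assumed to be an inclusive last byte, and the page_shift argument merely stands in for the mpool_pageshift value mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the Open MPI source. */

/* Round an address down to the first byte of the page that contains it. */
static uintptr_t page_down(uintptr_t addr, unsigned page_shift)
{
    assert(page_shift < sizeof(uintptr_t) * 8);
    return (addr >> page_shift) << page_shift;
}

/* Return the last byte of the page that contains addr (inclusive bound,
 * which is an assumption about how "bound" is stored). */
static uintptr_t page_bound(uintptr_t addr, unsigned page_shift)
{
    assert(page_shift < sizeof(uintptr_t) * 8);
    uintptr_t mask = ((uintptr_t)1 << page_shift) - 1;
    return addr | mask;
}

/* Number of pages touched by the byte range [base, base + len), len > 0. */
static size_t pages_spanned(uintptr_t base, size_t len, unsigned page_shift)
{
    assert(len > 0);
    uintptr_t first = base >> page_shift;
    uintptr_t last  = (base + len - 1) >> page_shift;
    return (size_t)(last - first + 1);
}
```

With a page shift of 12 (4 KiB pages), a 32-byte buffer starting at 0x1ff0 touches two pages, so a cache keyed on page-rounded base/bound would treat any other buffer inside those two pages as already covered by the same registration.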
Re: [O-MPI devel] Fwd: [O-MPI users] HOWTO turn of "multi-rail" support at runtime?
That's correct. Not sure why TCP would have been used, unless the IB interfaces weren't up.

Brian Barrett wrote:

Tim - Just to make sure I'm not losing it - if any of the "high speed" networks is found between peers, tcp shouldn't be used between that pair, right? I was pretty sure that's what the priority code did now, but wanted to make sure I wasn't losing it ;).

Brian

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [O-MPI devel] Registration Cache changes
Gleb Natapov wrote:

Hello Galen,

Finally I've got some time to look through the new code. I have a couple of notes.

In pml_ob1_rdma.c you try to merge registrations in a number of places. The code looks like this:

    btl_mpool->mpool_deregister(btl_mpool, reg);
    btl_mpool->mpool_register(btl_mpool, new_base, new_len, MCA_MPOOL_FLAGS_CACHE, &reg);

How do you know reg is not in use? You can't deregister it if somebody is using the registration!

Good catch... this should check the reference count and only deregister when the reference count actually goes to zero.
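The reference-count check described above might look something like the following sketch. It is not the actual ob1 or mpool code: reg_t, reg_retain and reg_release are hypothetical names, the struct is a simplified stand-in for mca_mpool_base_registration_t, and the place where the real mpool_deregister call would go is only marked with a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached registration; this is not the real
 * mca_mpool_base_registration_t layout. */
typedef struct {
    void  *base;       /* start of the registered region */
    size_t len;        /* length of the region in bytes */
    int    ref_count;  /* number of users currently holding the registration */
    bool   pinned;     /* true while the memory is still registered */
} reg_t;

/* Take an additional reference on a live registration. */
static void reg_retain(reg_t *reg)
{
    assert(reg->pinned);
    reg->ref_count++;
}

/* Drop one reference. The registration is torn down only when the count
 * reaches zero; returns true if that happened. */
static bool reg_release(reg_t *reg)
{
    assert(reg->ref_count > 0);
    if (--reg->ref_count > 0) {
        return false;      /* somebody else is still using it */
    }
    reg->pinned = false;   /* the real deregister call would go here */
    return true;
}
```

Under this scheme the merge path would release its own reference and register the larger region separately, rather than deregistering unconditionally; the old registration then disappears only once its last user has released it.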
Re: [O-MPI devel] Registration Cache changes
Gleb,

Gleb Natapov wrote:

Hello Galen,

Finally I've got some time to look through the new code. I have a couple of notes.

In pml_ob1_rdma.c you try to merge registrations in a number of places. The code looks like this:

    btl_mpool->mpool_deregister(btl_mpool, reg);
    btl_mpool->mpool_register(btl_mpool, new_base, new_len, MCA_MPOOL_FLAGS_CACHE, &reg);

How do you know reg is not in use? You can't deregister it if somebody is using the registration!

Good catch... this should check the reference count and only deregister when the reference count actually goes to zero.

Yes, this was a good catch. This was causing all sorts of fun for us!

Thanks,
Galen
[O-MPI devel] [Fwd: OMPI mpif.h problems]
Can anyone comment on this?

Original Message
Subject: OMPI mpif.h problems
Date: Wed, 21 Sep 2005 12:27:13 -0600
From: David R. (Chip) Kent IV
To: Tim S. Woodall

Tim,

I managed to find a number of problems with the mpif.h when I tried it on a big application. It looks like a lot of key constants are not defined in this file. So far, MPI_SEEK_SET, MPI_MODE_CREATE, MPI_MODE_WRONLY have broken the build. I add them into the mpif.h file as I find them, but it takes ~10 minutes to redo the build.

Let me know if you make a fix for this, and I'll test it out.

Chip

David R. "Chip" Kent IV
Parallel Tools Team
High Performance Computing Environments Group (CCN-8)
Los Alamos National Laboratory
(505)665-5021
drk...@lanl.gov

This message is "Technical data or Software Publicly Available" or "Correspondence".
[O-MPI devel] mpif.h problems
I managed to find a number of problems with the mpif.h when I tried it on a big application. It looks like a lot of key constants are not defined in this file. So far, MPI_SEEK_SET, MPI_MODE_CREATE, MPI_MODE_WRONLY have broken the build. I've added them to mpif.h as I find them so that I can get the build to go, but I assume there are many more values still missing.

Chip

David R. "Chip" Kent IV
Parallel Tools Team
High Performance Computing Environments Group (CCN-8)
Los Alamos National Laboratory
(505)665-5021
drk...@lanl.gov
[O-MPI devel] --with-mvapi/--with-btl-mvapi???
Note that the recent change to the configure script(s) to use --with-mvapi instead of --with-btl-mvapi is not complete. I recently had to pass both options to build with mvapi support. This is causing a great deal of pain for external users. Can someone please look at this?