Re: [OMPI devel] Using MTT to test the newly added SCTP BTL
Hi, Jeff... thanks for getting back to me.

Jeff Squyres wrote:
> On Nov 29, 2007, at 12:13 PM, Karol Mroz wrote:
>
>>> One solution might be to remove the .ompi_ignore but to only enable
>>> the SCTP BTL when an explicit --with-sctp flag is given to configure
>>> (or something similar). You might want to run this by the [OMPI]
>>> group first, but there's precedent for it, so I doubt anyone would
>>> object.
>> The situation at present is that the SCTP BTL only builds on FreeBSD,
>> OSX and Linux and only if the SCTP is found to be in a standard place.
>> On Linux, for instance, you need to have installed the lksctp package
>> in order for the SCTP BTL to build. We also have a --with-sctp configure
>> option where you can specify the SCTP path should it not be in a
>> standard location. If SCTP does not exist on the system, then the BTL
>> will not build and more importantly, will not break the build of the
>> overall system.
>
> Is this SCTP lksctp package installed by default on any Linux? OS X?
> Solaris?

The lksctp package is not installed by default on any Linux distribution
that I'm aware of. For OSX, SCTP support is provided via the SCTP Network
Kernel Extension (http://sctp.fh-muenster.de/sctp-nke.html) and this too
is not installed by default. Solaris does have SCTP support by default,
but we currently do not build on Solaris systems regardless.

>> My question now, is it necessary for us to alter the above
>> behavior (as initially mentioned by Jeff), or is having the SCTP BTL
>> build iff SCTP is found sufficient?
>
> I think the only thing that matters is what the current default
> behavior is -- if the .ompi_ignore is removed, will it hose anyone
> unexpectedly? I.e., if they build and run today and it works, then
> the .ompi_ignore is removed and you build and run... and it doesn't
> work. That's my only real concern.

Removal of .ompi_ignore should not create build problems for anyone who
is running without some form of SCTP support. To test this claim, we
built Open MPI with .ompi_ignore removed and no SCTP support on both an
Ubuntu Linux machine and an OSX machine. Both builds succeeded without
any problems.

We also had a couple of other questions about SCTP BTL exclusivity, which
refer back to an earlier email. The relevant message is linked below, and
any advice would be appreciated:

http://www.open-mpi.org/community/lists/devel/2007/11/2609.php

Thanks.

--
Karol Mroz
km...@cs.ubc.ca
[OMPI devel] Another patch for v1.2.5
Inspired by this thread:

http://www.open-mpi.org/community/lists/users/2007/11/4547.php

Brian kindly donated a patch to make Linux ECONNREFUSED behavior better
in the oob TCP. I filed CMR 1192 to get this into 1.2.5. It's not
critical for 1.2.5, but it would be nice to have.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Using ompi_proc_t's proc_name.vpid as Universal rank
Hi,

Thanks for the clarification. So, now I am wondering how rank information
for processes in MPI_COMM_WORLD is assigned. Is there a table that stores
unique integer values for processes, or is rank assignment done in some
other manner?

Thanks,

Sajjad Tabib

Tim Prins (sent by devel-boun...@open-mpi.org), 11/30/07 07:22 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Using ompi_proc_t's proc_name.vpid as Universal rank

Hi Sajjad,

The vpid is not unique. If you do a comm_spawn then the newly launched
processes will have a new jobid, and their vpids will start at 0. So only
the whole process name is unique.

However, there is talk now of being able to join 2 jobs that were started
completely independently. This may lead to the point where a process name
is no longer unique. However, this work appears to be a ways out, and as
far as I know no decisions have been made on it yet.

Hope this helps,

Tim

Sajjad Tabib wrote:
>
> Hello,
>
> I have a proprietary transport/messaging layer that sits below an MTL
> component. This transport layer requires OpenMPI to assign it a rank
> that is unique and specific to that process and will not change from
> execution to termination. In a way, I am trying to find a one-to-one
> correspondence between a process's universal rank in OpenMPI and this
> transport layer. I began looking at ompi_proc_t from different processes
> and seemingly found a unique identifier, proc_name.vpid. Consequently, I
> assigned the ranks to each process in my transport layer based on the
> value of the local vpid of each process.
> I have not tested this thoroughly, but it has been working so far.
> However, I would like to make sure that this is a good approach, or
> know, at least, whether there are other ways to do this. I would
> appreciate it if you could leave me feedback or give suggestions on how
> to assign universal ranks to a proprietary transport software.
>
> Thanks for your help,
>
> Sajjad Tabib

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
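To illustrate Tim's point that the vpid alone is not unique but the whole process name is, here is a minimal sketch in C. The type and function names (example_proc_name_t, example_name_equal, example_name_key, example_spawn_demo) are hypothetical and are not the actual Open MPI definitions; the sketch only assumes that a name carries a jobid and a vpid, as described above, and that both fit in 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a process name; NOT the real Open MPI type. */
typedef struct {
    uint32_t jobid; /* a comm_spawn gives the new processes a new jobid */
    uint32_t vpid;  /* starts again at 0 inside every job */
} example_proc_name_t;

/* Two names denote the same process only if both fields match. */
bool example_name_equal(example_proc_name_t a, example_proc_name_t b)
{
    return a.jobid == b.jobid && a.vpid == b.vpid;
}

/* Pack (jobid, vpid) into one 64-bit key that a transport layer could
 * use as an identifier instead of the bare vpid. */
uint64_t example_name_key(example_proc_name_t n)
{
    return ((uint64_t)n.jobid << 32) | (uint64_t)n.vpid;
}

/* The comm_spawn case: same vpid, different jobid. */
void example_spawn_demo(void)
{
    example_proc_name_t parent  = { .jobid = 1, .vpid = 0 };
    example_proc_name_t spawned = { .jobid = 2, .vpid = 0 };

    assert(parent.vpid == spawned.vpid);           /* vpids collide */
    assert(!example_name_equal(parent, spawned));  /* full names do not */
    assert(example_name_key(parent) != example_name_key(spawned));
}
```

A key built this way stays distinct across spawned jobs, but it would not survive the possible future case Tim mentions, where two independently started jobs are joined.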
Re: [OMPI devel] tmp XRC branches
On Fri, Nov 30, 2007 at 02:06:02PM -0500, Jeff Squyres wrote:
> Are any of the XRC tmp SVN branches still relevant? Or have they now
> been integrated into the trunk?
>
> I ask because I see 4 XRC-related branches out there under /tmp and
> /tmp-public.

They are not relevant any more. I'll remove the one I created.

--
Gleb.
Re: [OMPI devel] Using MTT to test the newly added SCTP BTL
On Nov 29, 2007, at 12:13 PM, Karol Mroz wrote:

>> One solution might be to remove the .ompi_ignore but to only enable
>> the SCTP BTL when an explicit --with-sctp flag is given to configure
>> (or something similar). You might want to run this by the [OMPI]
>> group first, but there's precedent for it, so I doubt anyone would
>> object.
>
> The situation at present is that the SCTP BTL only builds on FreeBSD,
> OSX and Linux and only if the SCTP is found to be in a standard place.
> On Linux, for instance, you need to have installed the lksctp package
> in order for the SCTP BTL to build. We also have a --with-sctp configure
> option where you can specify the SCTP path should it not be in a
> standard location. If SCTP does not exist on the system, then the BTL
> will not build and more importantly, will not break the build of the
> overall system.

Is this SCTP lksctp package installed by default on any Linux? OS X?
Solaris?

> My question now, is it necessary for us to alter the above
> behavior (as initially mentioned by Jeff), or is having the SCTP BTL
> build iff SCTP is found sufficient?

I think the only thing that matters is what the current default
behavior is -- if the .ompi_ignore is removed, will it hose anyone
unexpectedly? I.e., if they build and run today and it works, then
the .ompi_ignore is removed and you build and run... and it doesn't
work. That's my only real concern.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Indirect calls to wait* and test*
I would find this a useful feature. I haven't played with the diff so I
can't comment on it, but the idea of it sounds good to me.

Cheers,
Josh

On Nov 29, 2007, at 6:37 PM, Aurelien Bouteiller wrote:

> This patch introduces customisable wait/test for requests as discussed
> at the face-to-face ompi meeting in Paris.
>
> A new global structure (ompi_request_functions) holding all the
> pointers to the wait/test functions has been added. ompi_request_wait*
> and ompi_request_test* have been #defined to be replaced by
> ompi_request_functions.req_wait. The default implementations of the
> wait/test functions have been renamed from ompi_request_% to
> ompi_request_default_%. Those functions are the static initializers of
> the ompi_request_functions structure.
>
> To modify the defaults, a component 1) copies the
> ompi_request_functions structure (the type ompi_request_fns_t can be
> used to declare a suitable variable), and 2) changes some of the
> functions according to its needs. This is best done at MPI_Init time,
> when there are no threads. Should this component be unloaded, it has to
> restore the defaults. The ompi_request_default_* functions should never
> be called directly anywhere in the code. If a component needs to access
> the previously defined implementation of wait, it should call its local
> copy of the function. Component implementors should keep in mind that
> another component might have already changed the defaults and needs to
> be called.
>
> NetPipe -a (async recv mode) does not show measurable overhead. Here
> follows the "diff -y" between the original and modified ompi assembly
> code from ompi/mpi/c/wait.c. The only significant difference is an
> extra movl to load the address of the ompi_request_functions structure
> in eax. This explains why there is no measurable cost on latency.
>
> ORIGINAL                                                       MODIFIED
> L2:                                                            L2:
> movl L_ompi_request_null$non_lazy_ptr-"L001$pb"(%ebx), %eax    movl L_ompi_request_null$non_lazy_ptr-"L001$pb"(%ebx), %eax
> cmpl %eax, (%edi)                                              cmpl %eax, (%edi)
> je L18                                                         je L18
>                                                              > movl L_ompi_request_functions$non_lazy_ptr-"L001$pb"(%ebx), %eax
> movl %esi, 4(%esp)                                             movl %esi, 4(%esp)
> movl %edi, (%esp)                                              movl %edi, (%esp)
> call L_ompi_request_wait$stub                                | call *16(%eax)
>
> Here is the patch for those who want to try it themselves. If I receive
> comments outlining the need, thread-safe accessors could be added to
> allow components to change the functions at any time during execution
> and not only during MPI_Init/Finalize. Please make noise if you find
> this useful. If the comments do not suggest extra work, I expect this
> code to be committed to the trunk next week.
>
> Aurelien
>
> On Oct 8, 2007, at 06:01, Aurelien Bouteiller wrote:
>
>> For message logging purposes, we need to interface with wait_any,
>> wait_some, test, test_any, test_some, test_all. It is not possible to
>> use PMPI for this purpose. During the face-to-face meeting in Paris
>> (5-12 October 2007) we discussed this issue and came to the conclusion
>> that the best way to achieve this is to replace direct calls to
>> ompi_request_wait* and test* by indirect calls (the same way as PML
>> send, recv, etc).
>>
>> The basic idea is to declare a static structure containing the 8
>> pointers to all the functions. This structure is initialized at
>> compilation time with the current basic wait/test functions. Before
>> the end of MPI_Init, any component might replace the basics with
>> specialized functions.
>>
>> The expected cost is less than .01us of latency according to a
>> preliminary test. The method is consistent with the way we call pml
>> send/recv. The mechanism could be used later for stripping grequest
>> out of the critical path when it is not used.
>>
>> --
>> Aurelien Bouteiller, PhD
>> Innovative Computing Laboratory - MPI group
>> +1 865 974 6321
>> 1122 Volunteer Boulevard
>> Claxton Education Building Suite 350
>> Knoxville, TN 37996
>
> --
> Dr. Aurelien Bouteiller, Sr. Research Associate
> Innovative Computing Laboratory - MPI group
> +1 865 974 6321
> 1122 Volunteer Boulevard
> Claxton Education Building Suite 350
> Knoxville, TN 37996
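For readers without the patch at hand, the indirection Aurelien describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: every name prefixed with example_ is hypothetical, the table holds a single wait pointer instead of the eight wait/test pointers mentioned in the thread, and the request type is reduced to one flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified request type. */
typedef struct example_request {
    int complete;
} example_request_t;

typedef int (*example_wait_fn_t)(example_request_t *req);

/* Hypothetical function table; the proposal has one slot per wait/test
 * variant, this sketch keeps only one. */
typedef struct {
    example_wait_fn_t req_wait;
} example_request_fns_t;

/* Default implementation, used as the static initializer of the table. */
int example_request_default_wait(example_request_t *req)
{
    req->complete = 1;
    return 0;
}

/* Global table, initialized at compile time with the defaults. */
example_request_fns_t example_request_functions = {
    example_request_default_wait
};

/* Callers keep the old name; the macro turns it into an indirect call. */
#define example_request_wait (example_request_functions.req_wait)

/* A component that interposes on wait, e.g. for message logging. */
static example_request_fns_t example_saved_fns; /* component's local copy */
int example_logged_waits = 0;

int example_logging_wait(example_request_t *req)
{
    example_logged_waits++;
    /* Chain to the previously installed function through the local copy,
     * never to the default directly: another component may have replaced
     * it before us. */
    return example_saved_fns.req_wait(req);
}

/* Meant to run during initialization, while there are no threads. */
void example_component_open(void)
{
    assert(example_request_functions.req_wait != NULL);
    example_saved_fns = example_request_functions;              /* 1) copy   */
    example_request_functions.req_wait = example_logging_wait;  /* 2) change */
}

/* On unload, put back what was installed before this component. */
void example_component_close(void)
{
    example_request_functions = example_saved_fns;
}
```

With this layout a call such as example_request_wait(&req) costs one extra load of the table address followed by an indirect call, which matches the extra movl and the call *16(%eax) in the assembly comparison above. Restoring the saved copy rather than the compiled-in defaults is a choice made for this sketch; it gives back the defaults only when no other component changed the table earlier.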