Re: [OMPI devel] SDP support for OPEN-MPI
On Jan 1, 2008, at 1:11 PM, Andrew Friedley wrote:

> We would like to add SDP support for OPENMPI.

I have a few points -- this is the first:

I would do this patch slightly differently. I prefer to have as few #if's as possible, so I'd always have the struct members and the logic for the MCA enable/disable of SDP support, but only actually enable it if HAVE_DECL_AF_INET_SDP. Hence, the number of #if's is dramatically reduced -- you only need to #if the parts of the code that actually try to use AF_INET_SDP (etc.).

I'd also ditch the --enable-sdp; I think configure can figure that stuff out by itself without an --enable switch. Perhaps if people really want the ability to turn SDP off at configure time, --disable-sdp could be useful. But that might not be too useful.

Don't forget that we always have the "bool" type available; you can use that for logicals (instead of int).

I'd also add another MCA param, read-only, that indicates whether SDP support was compiled in or not (i.e., HAVE_DECL_AF_INET_SDP is 1, and therefore there was a value for AF_INET_SDP). This will allow you to query ompi_info and see if your OMPI was configured for SDP support. That way, you can have a consistent set of MCA params for the TCP components regardless of platform. I think that's somewhat important.

To be user-friendly, I'd also emit a warning if someone tries to enable SDP support and it's not available. Note that SDP could be unavailable for multiple reasons:

- wasn't available at compile time
- isn't available for the peer IP address that was used

Hence, if HAVE_DECL_AF_INET_SDP==1 and using AF_INET_SDP fails to that peer, it might be desirable to try to fail over to using AF_INET_something_else. I'm still technically on vacation :-), so I didn't look *too* closely at your patch, but I think you're doing that (failing over if AF_INET_SDP doesn't work because of EAFNOSUPPORT), which is good.

I would think the following would apply:

- Error (or warning?): user requests SDP and HAVE_DECL_AF_INET_SDP is 0
- Error (or warning?): user requests SDP and HAVE_DECL_AF_INET_SDP is 1, but using AF_INET_SDP failed
- Not an error: user does not request SDP, but HAVE_DECL_AF_INET_SDP is 1 and AF_INET_SDP works
- Not an error: user does not request SDP, but HAVE_DECL_AF_INET_SDP is 1 and AF_INET_SDP does not work, but is able to fail over to AF_INET_something_else

With all this, the support is still somewhat inconsistent -- you could be using an OMPI that has HAVE_DECL_AF_INET_SDP==0, but you're running on a system that has SDP available. Perhaps a more general approach would be to [perhaps additionally] provide an MCA param to allow the user to specify the AF_* value? (AF_INET_SDP is a standardized value, right? I.e., will it be the same on all Linux variants [and someday Solaris]?)

> SDP can be used to accelerate job start ( oob over sdp ) and IPoIB performance.

> I fail to see the reason to pollute the TCP btl with IB-specific SDP stuff. For the oob, this is arguable, but doesn't SDP allow for *transparent* socket replacement at runtime ? In this case, why not use this mechanism and keep the code clean ?

Patrick's got a good point: is there a reason not to do this? (LD_PRELOAD and the like) Is it problematic with the remote orted's?

> Furthermore, why would a user choose to use SDP and TCP/IPoIB when the OpenIB BTL is available using the native verbs interface? FWIW, this same sort of question gets asked of the uDAPL BTL -- the answer there being that the uDAPL BTL runs in places the OpenIB BTL does not. Is this true here as well?

Andrew's got a point here, too -- accelerating the TCP BTL with SDP seems kinda pointless. I'm guessing that you did it because it was just about the same work as was done in the TCP OOB (for which we have no corresponding verbs interface). Is that right?

--
Jeff Squyres
Cisco Systems
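A minimal sketch of the structure suggested in the message above, not the actual patch or any Open MPI code. The struct, its members, and both function names are hypothetical; in a real build HAVE_DECL_AF_INET_SDP would come from configure, and the stand-in definition here only exists so the sketch compiles on its own. It assumes SDP is attempted only when the user requests it. The settings members always exist, a read-only flag records whether SDP was compiled in, and only the code that touches AF_INET_SDP sits inside an #if, with failover to AF_INET on EAFNOSUPPORT.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>   /* close(), for callers that release the socket */

/* Stand-in for the configure result (assumption for this sketch). */
#ifndef HAVE_DECL_AF_INET_SDP
#  ifdef AF_INET_SDP
#    define HAVE_DECL_AF_INET_SDP 1
#  else
#    define HAVE_DECL_AF_INET_SDP 0
#  endif
#endif

/* Hypothetical settings: both members exist on every platform. */
struct tcp_sdp_settings {
    bool sdp_requested;    /* user-settable: try SDP first */
    bool sdp_compiled_in;  /* read-only: was AF_INET_SDP known at build time? */
};

static void tcp_sdp_settings_init(struct tcp_sdp_settings *s, bool requested)
{
    s->sdp_requested = requested;
    s->sdp_compiled_in = (HAVE_DECL_AF_INET_SDP != 0);
}

/* Open a stream socket, preferring SDP when requested and available.
 * Returns the fd (or -1 on failure); *used_sdp reports which family won. */
static int tcp_open_stream_socket(const struct tcp_sdp_settings *s,
                                  bool *used_sdp)
{
    *used_sdp = false;

    if (s->sdp_requested && !s->sdp_compiled_in) {
        fprintf(stderr,
                "warning: SDP requested but not compiled in; using AF_INET\n");
    }

#if HAVE_DECL_AF_INET_SDP
    /* The only region that names AF_INET_SDP, hence the only #if. */
    if (s->sdp_requested) {
        int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);
        if (fd >= 0) {
            *used_sdp = true;
            return fd;
        }
        if (errno != EAFNOSUPPORT) {
            return -1;  /* unrelated failure: do not mask it */
        }
        fprintf(stderr,
                "warning: AF_INET_SDP unsupported at run time; using AF_INET\n");
    }
#endif

    return socket(AF_INET, SOCK_STREAM, 0);
}
```

A caller would initialize the settings once (for example from a user parameter), call tcp_open_stream_socket() per connection, and close() the returned descriptor when done. Whether a failed SDP request should be a warning or a hard error is left open in the message; the sketch picks a warning.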
Re: [OMPI devel] Cisco MTT runs
On Jan 1, 2008, at 10:08 AM, Josh Hursey wrote:

> You can see the difference in the one weekly contrib mark on the MTT contribution graph:
> http://osl.iu.edu/~jjhursey/research/mtt-contrib.pdf
>
> I set up a cron job to update this graph every Mon, Wed, Fri at 2 am, so you can check that link whenever you feel bored.

Should we link to this graph somewhere off the OMPI MTT results web site? It should be easy enough to add some prefix/suffix/hook-like HTML somewhere in the standard MTT config that can customize the output HTML (i.e., put some additional information on the page, such as a link to this graph)...

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] SDP support for OPEN-MPI
Patrick Geoffray wrote:
> Lenny Verkhovsky wrote:
>> We would like to add SDP support for OPENMPI. SDP can be used to accelerate job start ( oob over sdp ) and IPoIB performance.
>
> I fail to see the reason to pollute the TCP btl with IB-specific SDP stuff. For the oob, this is arguable, but doesn't SDP allow for *transparent* socket replacement at runtime ? In this case, why not use this mechanism and keep the code clean ?

Furthermore, why would a user choose to use SDP and TCP/IPoIB when the OpenIB BTL is available using the native verbs interface? FWIW, this same sort of question gets asked of the uDAPL BTL -- the answer there being that the uDAPL BTL runs in places the OpenIB BTL does not. Is this true here as well?

Andrew
Re: [OMPI devel] Cisco MTT runs
Awesome!

You can see the difference in the one weekly contrib mark on the MTT contribution graph:
http://osl.iu.edu/~jjhursey/research/mtt-contrib.pdf

I set up a cron job to update this graph every Mon, Wed, Fri at 2 am, so you can check that link whenever you feel bored.

As a side note I'm supposed to meet with Joseph about MTT visualization stuff when we both get back to IU next week. I'll keep you all posted on that progress.

Happy holidays!

Josh

On Dec 31, 2007, at 5:01 PM, Jeff Squyres wrote:

> In case you hadn't noticed, Cisco resumed running MTT literally right before the holiday weekend -- I got about 9 days of runs:
> http://www.open-mpi.org/mtt/stats/index.php?dates=2007-12-01+-+2007-12-31&org_name=all&platform_name=all&os_name=all&mpi_install_compiler_name=all&mpi_get_name=all&test_suite=all
>
> I also just bumped up the number of variants we're running per some recent openib btl activity on the trunk, so our nightly contribution should be going up. We used to run 9 variants on both v1.2 and the trunk; we're now running 8 variants on v1.2 and 12 variants on the trunk. I will likely add more as some other internal test clusters [finally] come on-line in the new year... :-)
>
> Happy holidays!
>
> --
> Jeff Squyres
> Cisco Systems
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Minor patch for !IPV6_V6ONLY
On Mon, Dec 31, 2007 at 08:05:38PM -0800, Paul H. Hargrove wrote:
> I just tried today to build the OMPI trunk on an old RH8 box and found
> that for
>    OPAL_WANT_IPV6 && !defined(IPV6_V6ONLY)
> the file oob_tcp.c fails to compile due to unbalanced braces.
>
> Swapping an #endif with a closing brace (patch below) fixed the problem
> for me.

Thanks for the patch, you were absolutely right. Fixed in r17028.

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de
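For readers unfamiliar with this class of bug, here is a small illustration of the pattern, written from scratch; it is not the code from oob_tcp.c or the patch in r17028, and the function name is hypothetical. The point is that a brace opened inside a conditional region must also close inside it, so the braces balance whether or not IPV6_V6ONLY is defined.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an IPv6 stream socket and, where the platform
 * defines IPV6_V6ONLY, restrict it to IPv6 traffic. Returns fd or -1. */
int open_ipv6_socket(void)
{
    int fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

#if defined(IPV6_V6ONLY)
    {
        int on = 1;
        if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0) {
            close(fd);
            return -1;
        }
    }   /* closing brace first ... */
#endif  /* ... then #endif: both branches of the #if stay balanced */

    return fd;
}
```

If the last brace and the #endif were swapped, a platform without IPV6_V6ONLY would see a closing brace with no matching opener, which is the kind of failure reported on the RH8 box.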
Re: [OMPI devel] SDP support for OPEN-MPI
Lenny Verkhovsky wrote:
> We would like to add SDP support for OPENMPI. SDP can be used to accelerate job start ( oob over sdp ) and IPoIB performance.

I fail to see the reason to pollute the TCP btl with IB-specific SDP stuff. For the oob, this is arguable, but doesn't SDP allow for *transparent* socket replacement at runtime ? In this case, why not use this mechanism and keep the code clean ?

Patrick
[OMPI devel] Trac nit-pick
I've noticed that the trac search capability (https://svn.open-mpi.org/trac/ompi/search) requires a minimum query length of three characters. That makes it impractical to search for "gm" or "mx", which have obvious relevance to OMPI.

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900