[ewg] OFED 4.17
Can someone please pick up libfabric 1.6.1 for 4.17 if it isn't already there? https://github.com/ofiwg/libfabric/releases/tag/v1.6.1 https://github.com/ofiwg/libfabric/releases/download/v1.6.1/libfabric-1.6.1.tar.bz2 https://github.com/ofiwg/fabtests/releases/tag/v1.6.1 https://github.com/ofiwg/fabtests/releases/download/v1.6.1/fabtests-1.6.1.tar.bz2 Thanks, - Sean ___ ewg mailing list ewg@lists.openfabrics.org https://lists.openfabrics.org/mailman/listinfo/ewg
Re: [ewg] [ofiwg] Query regarding OFED
> I am working on Linux Kernel 4.13.10. I want to install OFA OFED for > RDMA. > > I wanted to know if Wireshark support is available with OFED. Also I > would like to know the setup steps for OFA OFED. > > Kindly guide me through the correct path. OFED questions are better directed to ewg@lists.openfabrics.org. - Sean
Re: [ewg] OFA EWG Meeting: Monday, Nov 20th, 2017, 09:00 AM US Pacific Time (12pm EST) - Minutes
> I assume RC3 testing is going well; no new issues reported on EWG or > Bugzilla so I would like to package up GA tomorrow if possible. Once > complete, I would like to immediately move onto OFED 4.8-2 and add > EL7.4 and SLES12.3 backports. Maybe I am being overly optimistic but I > would like to finalize OFED 4.8-2 GA by end of year if possible. You're talking about a 4 week gap between releases. Why even release 4.8-1 with that short of a release window?
Re: [ewg] OFED 4.8-1 RC3 prep work
> Note: there may also be a libfabric v1.5.2 library update for RC3. Vlad, please pull in v1.5.2 libfabric and fabtests for rc3. These are bug fix only releases relative to v1.5.1 (and v1.5.0). https://github.com/ofiwg/libfabric/releases/tag/v1.5.2 https://github.com/ofiwg/fabtests/releases/tag/v1.5.2 Thanks, Sean
Re: [ewg] [ANNOUNCE] OFED 4.8-rc2 release is available
> > OFED is a software product of OFA, not Linux. OFA can put anything > > that they want in it. Why do you even care? It's no different than > > Intel or Mellanox or any other company shipping out of tree > > software. > > The primary answer to your question depends on whether or not the > software will ever be upstreamed. If it will, then it really should > go > there first and not later, and the reason is well exemplified by what > happened with XRC where the version that landed in OFED and the > version > that landed in upstream were two totally different things, and users > had > to go back and fix up all their code because of the difference once it > finally did land upstream. It's not nice to put users in that > position > again, and this does sound like it might end up going down that exact > road since upstream is pursuing ways of doing peer to peer PCI > operations and such without any input from the Xeon Phi folks. I'm not defending whatever business decisions any organization (including a multi-company non-profit like OFA) wants to make wrt their software distributions. I'm claiming that that's their decision. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg
Re: [ewg] [ANNOUNCE] OFED 4.8-rc2 release is available
> This is exactly why we so strongly discourage this out of tree stuff - > getting something unmergable in OFED is *NOT* Job Done, Time to Go > Home. Down this path just creates another Lustre mess. Actually, this may be 'job done'. No individual or company is obligated to provide upstream software for any of their hardware. OFA decides what to ship in their software products, not the greater linux kernel community. Individual companies can decide if out of tree maintenance is more cost effective than trying to merge code upstream. Because that's what this ultimately comes down to. IMO, the only people who have legitimate complaints here are those people running Xeon Phi with Mellanox HCAs who are being forced to use OFED, rather than upstream code. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg
Re: [ewg] [ANNOUNCE] OFED 4.8-rc2 release is available
> So in plain words: Intel abuses their influence in OFED to ship crap > that has absolutely no chance to get upstream in the current form > instead of working with the community to improve infrastructure. > > That's exactly what I guessed, thanks for confirming. OFED is a software product of OFA, not Linux. OFA can put anything that they want in it. Why do you even care? It's no different than Intel or Mellanox or any other company shipping out of tree software.
Re: [ewg] OFA EWG Meeting: Monday, Feb 27, 2017, 09:00 AM US Pacific Time (12pm EST) - Minutes
> I have a question regarding rdma-core inclusion into OFED 4.8. > Currently the tagged release is rdma-core-12 but we seem to be pulling > in the head of the master github tree as it moves. I recommend pulling > release tags rather than the head of a master branch. Can we roll back > to rdma-core-12? What do others think? Pulling from released tags is the only thing that makes sense. If a new release becomes available during the 4.8 development cycle, you can then decide whether to pull in that release or not.
Re: [ewg] OFA EWG Meeting: Monday, Oct 24, 2016, 09:00 AM US Pacific Time (12pm EST) - Agenda
> Not sure if there are any agenda items we need to cover on Monday since > Vlad is still on vacation and the formal ask for GPL only components is > still waiting for official response. Doesn't OFED already ship GPL only code?
Re: [ewg] libiwpm change breaks OFED build
Thanks - I didn't realize that this was the package for the daemon. Yes, a name change does actually make sense. :) > The name change was requested by a major distro, because the package > was named libiwpm although it isn't a library. > > We decided to change the name of the package to iwpmd since the port > mapper is a daemon process. > > Also it is worth noting that it is only a package name change. The name > of the executable stays the same - it is iwpmd. > > Thank you, > Tatyana > > -----Original Message- > From: Hefty, Sean > Sent: Thursday, June 30, 2016 12:01 PM > To: Nikolova, Tatyana E <tatyana.e.nikol...@intel.com>; Vladimir > Sokolovsky <v...@dev.mellanox.co.il>; Woodruff, Robert J > <robert.j.woodr...@intel.com>; Rupert Dance (rsda...@soft-forge.com) > <rsda...@soft-forge.com>; Davis, Arlin R <arlin.r.da...@intel.com> > Cc: ewg@lists.openfabrics.org > Subject: RE: libiwpm change breaks OFED build > > > I am sorry for the trouble. I am going to change latest.txt to > > libiwpm- 1.0.5.tar.gz. The new iwpmd-1.0.6.tar.gz package is intended > > for the next OFED-4-x release. > > Independent from OFED, changing the package name will be confusing for > users. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg
Re: [ewg] libiwpm change breaks OFED build
> I am sorry for the trouble. I am going to change latest.txt to libiwpm- > 1.0.5.tar.gz. The new iwpmd-1.0.6.tar.gz package is intended for the > next OFED-4-x release. Independent from OFED, changing the package name will be confusing for users.
Re: [ewg] failure building fabtests with OFED-3.18-2
> Anybody see this error? I hit it on one RHEL7 machine. But another > machine > built ok. Perhaps some missing prerequisite RPM? > > unit/eq_test.c: In function 'eq_wait_fd_poll': > unit/eq_test.c:260:2: warning: implicit declaration of function > 'fi_trywait' You probably need to update your version of libfabric. The ofiwg decided that fabtests would only target the latest version of libfabric and not try to provide backwards compatibility with older versions. This is purely a maintenance decision. - Sean
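One quick way to confirm that an installed libfabric predates fi_trywait is to search its headers for the symbol. A minimal sketch, assuming the headers live under /usr/include/rdma (adjust for your install prefix); header_declares is a hypothetical helper name, not part of libfabric or fabtests:

```shell
# header_declares SYMBOL DIR
# Succeeds if SYMBOL appears as a whole word in any file under DIR.
# Hypothetical helper for illustration; not part of libfabric.
header_declares() {
    grep -rqw -- "$1" "$2" 2>/dev/null
}

# Assumed header location; a source build may use /usr/local/include/rdma.
if header_declares fi_trywait /usr/include/rdma; then
    echo "libfabric headers declare fi_trywait"
else
    echo "fi_trywait not found: update libfabric before building fabtests"
fi
```

A match only shows that the headers mention the symbol; it does not prove that the library the build links against is the same version.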
[ewg] please pull libfabric-1.1.1 and fabtests-1.1.1 into OFED 3.18-1
New packages are available at: http://downloads.openfabrics.org/downloads/ofi/ libfabric-1.1.1.tar.gz libfabric-1.1.1.tar.bz2 fabtests-1.1.1.tar.gz fabtests-1.1.1.tar.bz2 Thanks, Sean
Re: [ewg] OFA EWG Meeting Minutes
Rupert, Attached are release notes for libfabric and librdmacm for OFED 3.18. - Sean Open Fabrics Enterprise Distribution (OFED) RDMA CM in OFED 3.18 Release Notes June 2015 === Table of Contents === 1. Overview 2. New Features 3. Known Issues === 1. Overview === The RDMA CM is a communication manager used to setup reliable, connected and unreliable datagram data transfers. It provides an RDMA transport neutral interface for establishing connections. The API concepts are based on sockets, but adapted for queue pair (QP) based semantics: communication must be over a specific RDMA device, and data transfers are message based. The RDMA CM can control both the QP and communication management (connection setup / teardown) portions of an RDMA API, or only the communication management piece. It works in conjunction with the verbs API defined by the libibverbs library. The libibverbs library provides the underlying interfaces needed to send and receive data. The RDMA CM can operate asynchronously or synchronously. The mode of operation is controlled by the user. The RDMA CM also provides the rsocket implementation. Rsockets is a protocol over RDMA that supports a socket-level API for applications. Rsocket APIs are intended to match the behavior of corresponding socket calls. A preload library is included as part of the RDMA CM package, which allows many socket based applications to run unmodified over rsockets. === 2. New Features === for OFED 3.18 Enhancements to the librdmacm release 1.0.21, versus the 1.0.18 release, mostly centered around bug fixes. Notable feature enhancements include the following. * There were several updates and bug fixes to the rsockets code based on more extensive testing and use cases. Rsockets now supports the use of native InfiniBand addressing. * The RDMA CM was updated to support XRC QPs. * The rsocket preload library allows for fine grained interception of socket calls. === 3. 
Known Issues === The RDMA CM relies on the operating system's network configuration tables to map IP addresses to RDMA devices. Incorrectly configured network configurations can result in the RDMA CM being unable to locate the correct RDMA device. If you experience problems using the RDMA CM, you may want to check the following: * Verify that you have IP connectivity over the RDMA devices. For example, ping between iWarp or IPoIB devices. * Ensure that IP network addresses assigned to RDMA devices do not overlap with IP network addresses assigned to standard Ethernet devices. * For multicast issues, either bind directly to a specific RDMA device, or configure the IP routing tables to route multicast traffic over an RDMA device's IP address. Version 1.0 Released on May 03, 2015 Introduction Libfabric is a communication library that exports interfaces for fabric services to applications. Libfabric is the core component of the Open Fabrics Interfaces (OFI) framework. Libfabric has the following objectives: * High-performance: provide optimized software paths to hardware - Independent of hardware implementations * Scalable: targets support for millions of processes - Designed to reduce cache and memory footprint - Scalable address resolution and storage - Tight data structures * Application-centric - Interfaces co-designed with application developers and hardware vendors * Extensible - Easily adaptable to support future application needs OFI is being developed by the OFI Working Group (OFIWG) a subgroup of the OpenFabrics Alliance (OFA). Participation in OFIWG (pronounced o-fee-wig) is open to anyone, regardless of their membership in OFA. The goal of OFI and libfabric is to define interfaces that enable a tight semantic map between applications and underlying fabric services. Specifically, libfabric software interfaces have been co-designed with fabric hardware providers and application developers, with an initial focus on the needs of HPC users. 
OFI supports multiple interface semantics, is fabric and hardware implementation agnostic, and leverages and expands the existing RDMA open source community. For more information regarding the OFI project, please visit the OFIWG GitHub site: http://ofiwg.github.io/libfabric/ Support
Re: [ewg] OFA EWG Meeting Minutes
FYI - There are 2 open, but somewhat related issues, for libfabric. We are trying to find a solution for those ASAP. Once we have a proposed fix, I'll create a 1.0.1rc1 package that can be pulled into the next OFED rc. I should have release notes ready this week. -Original Message- From: ewg-boun...@lists.openfabrics.org [mailto:ewg- boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R Sent: Monday, June 15, 2015 8:50 AM To: Rupert Dance; 'OpenFabrics EWG' Subject: Re: [ewg] OFA EWG Meeting Minutes Where are we on RC3? From: ewg-boun...@lists.openfabrics.org [mailto:ewg- boun...@lists.openfabrics.org] On Behalf Of Rupert Dance Sent: Sunday, June 14, 2015 3:14 PM To: 'OpenFabrics EWG' Subject: [ewg] OFA EWG Meeting Minutes Hello, The OFA EWG Meeting Minutes from the 6/8/2015 meeting are available here: https://www.openfabrics.org/downloads/WorkGroups/ewg/EWG_Minutes/2015-06- 08-EWG-Meeting-Minutes.pdf Please remember that we want to release OFED 3.18 GA but we are waiting for everyone to submit updates to their package release notes. Please let me know when you have completed this. Here are the commitments made in the last meeting. a) Arlin has updated the release notes and will submit b) Steve will double check and create an update if needed. c) Pradeep Avago: will double check and submit some updates d) Tatyana – will create release notes for libiwpm e) Arlin will ask Sean for an update to librdmacm and release notes for libfabric f) Jeff Becker – will provide updates for NFSRDMA. Thanks Rupert ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg
Re: [ewg] OFA EWG Meeting Minutes
Even if we have a fix within a day, the fix would just now go into 1.0.1rc1. The official 1.0.1 release, which is what you really want in OFED 3.18, wouldn't be until the end of this month. We would need time to test the other bug fixes that went in between 1.0 and 1.0.1. - Sean -Original Message- From: Woodruff, Robert J Sent: Monday, June 15, 2015 1:44 PM To: Hefty, Sean; Davis, Arlin R; Rupert Dance; 'OpenFabrics EWG' Subject: RE: [ewg] OFA EWG Meeting Minutes If these can be fixed quickly, then we can probably pull them in, but I think these are the last items holding up RC3 and I am not sure that they are really show stoppers, and since we plan to do a OFED-3.18-1 soon after OFED-3.18 to pick up support for SLES 11 SP4 and RHEL EL 6.7, perhaps these libfabric fixes can wait till then, unless the fixes are easy and can be fixed in a day or two. My 2 cents. Woody -Original Message- From: ewg-boun...@lists.openfabrics.org [mailto:ewg- boun...@lists.openfabrics.org] On Behalf Of Hefty, Sean Sent: Monday, June 15, 2015 9:48 AM To: Davis, Arlin R; Rupert Dance; 'OpenFabrics EWG' Subject: Re: [ewg] OFA EWG Meeting Minutes FYI - There are 2 open, but somewhat related issues, for libfabric. We are trying to find a solution for those ASAP. Once we have a proposed fix, I'll create a 1.0.1rc1 package that can be pulled into the next OFED rc. I should have release notes ready this week. -Original Message- From: ewg-boun...@lists.openfabrics.org [mailto:ewg- boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R Sent: Monday, June 15, 2015 8:50 AM To: Rupert Dance; 'OpenFabrics EWG' Subject: Re: [ewg] OFA EWG Meeting Minutes Where are we on RC3? 
From: ewg-boun...@lists.openfabrics.org [mailto:ewg- boun...@lists.openfabrics.org] On Behalf Of Rupert Dance Sent: Sunday, June 14, 2015 3:14 PM To: 'OpenFabrics EWG' Subject: [ewg] OFA EWG Meeting Minutes Hello, The OFA EWG Meeting Minutes from the 6/8/2015 meeting are available here: https://www.openfabrics.org/downloads/WorkGroups/ewg/EWG_Minutes/2015- 06- 08-EWG-Meeting-Minutes.pdf Please remember that we want to release OFED 3.18 GA but we are waiting for everyone to submit updates to their package release notes. Please let me know when you have completed this. Here are the commitments made in the last meeting. a) Arlin has updated the release notes and will submit b) Steve will double check and create an update if needed. c) Pradeep Avago: will double check and submit some updates d) Tatyana – will create release notes for libiwpm e) Arlin will ask Sean for an update to librdmacm and release notes for libfabric f) Jeff Becker – will provide updates for NFSRDMA. Thanks Rupert ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/ewg
[ewg] librdmacm 1.0.20
Vlad, Please pull in librdmacm 1.0.20 into OFED 3.18. www.openfabrics.org/downloads/rdmacm/librdmacm-1.0.20.tar.gz - Sean
[ewg] please pull libfabric-1.0.0rc4
Vlad, Please update the libfabric package with rc4 from ofa download site: http://downloads.openfabrics.org/downloads/ofi/ Thanks, Sean
Re: [ewg] rdma_accept and non-rdma QPs
Since (as explained in my other email), rdma_create_qp() has not been called, id->qp is NULL, causing ucma_modify_qp_rtr() and subsequently rdma_accept() to fail. Other than directly setting id->qp, I don't see how rdma_accept() can be used without calling rdma_create_qp() first. Is the conclusion that rdmacm doesn't support outside QP creation/destruction (at least not for the passive side) and that the documentation (rdma_accept manpage) is incorrect? Tracing through the code, I don't think that the rdma cm will support outside QP creation on either the active or passive side. rdma_connect() should handle outside QP creation, but the corresponding event processing code will not, for the same reason that rdma_accept() does not. Either the man page or code needs to be updated. Is there a reason why you are using external QP creation?
Re: [ewg] where can I find librsocket?
rsockets is part of librdmacm. It first appeared in v1.0.16 (7/12/12) but there are bug fixes in v1.0.17 (3/6/13) and some beyond that but not yet released. As Hal mentioned, the rsocket API is part of librdmacm. The rsocket preload library is librspreload.
Re: [ewg] IPv6 over Ethernet (RoCE) is broken
I have installed the latest OFED-3.5-2 daily software on SLES 11.2 x86-64 with Mellanox Technologies MT26428 [ConnectX VPI. PCIe 2.0 5GT/s - IB QDR / 10GigE]. When the MT26428 is in IB mode, all works fine. But when it is switched into Ethernet mode, becoming a RoCE card, an error occurs. If the server application uses the rdma cm APIs (rdma_bind_addr and rdma_listen) to listen on an IPv6 address like ::, it refuses the connection initiated by rdma_connect. The errno is set to ECONNREFUSED. However, if the server listens on an IPv4 address like 0.0.0.0 or 0, the connection is successful. I've found a patch: http://comments.gmane.org/gmane.linux.drivers.rdma/16448 But I'm not sure whether it's relevant to the issue I experienced above. Has anyone seen this bug? I don't recall that patch series touching anything outside of iwarp, so it shouldn't be relevant. Do you know if this worked in the previous version of OFED? I recall some patches on the mail list regarding how RoCE formed GIDs. I don't know if those were pulled into 3.5-2 or not. - Sean
Re: [ewg] Automated Port Migration on RC connected QPs
Does anyone have example code on how to set up automatic port migration correctly for RC connected QPs that use RDMA CM? The RDMA CM does not support APM. You could use the RDMA CM to connect, but would need to exchange alternate path information in band and modify the QP directly to set the alternate path information and APM state. - Sean
Re: [ewg] ib_acme fails for requests with IPv4 addresses (ofed 3.5)
Now I have another problem with 3 out of 18 nodes. All 3 get the correct information for the other 15 nodes if I run ib_acme, and also the other 15 can obtain the right information for the 3, but if I run ib_acme among those 3 nodes then I get a Connection timed out. On all three nodes the command for 'localhost' does work, too. Here the ouput: === = rc001 ~ $ pdsh -w rc0[00-17] 'for x in `seq 100 117`; do ib_acme -f i -d 10.1.4.${x} -v; done' | grep failed -B 1 rc002: Destination: 10.1.4.106 rc002: ib_acm_resolve_ip failed: Connection timed out rc002: SA verification: failed Cannot assign requested address -- rc011: Destination: 10.1.4.102 rc011: ib_acm_resolve_ip failed: Connection timed out rc011: SA verification: failed Cannot assign requested address -- rc006: Destination: 10.1.4.102 rc006: ib_acm_resolve_ip failed: Connection timed out rc006: SA verification: failed Cannot assign requested address -- rc002: Destination: 10.1.4.111 rc002: ib_acm_resolve_ip failed: Connection timed out rc002: SA verification: failed Cannot assign requested address -- rc011: Destination: 10.1.4.106 rc011: ib_acm_resolve_ip failed: Connection timed out rc011: SA verification: failed Cannot assign requested address -- rc006: Destination: 10.1.4.111 rc006: ib_acm_resolve_ip failed: Connection timed out rc006: SA verification: failed Cannot assign requested address === = Do you have seen this type of problem before? In this case it should not be related to the ibacm_addr.cfg, right? Maybe its a problem with the switch or links, I will try some other ports of the switch tomorrow. I have not seen this problem before. The log file that you provided looks okay to me. The following snippet from the rc011 log file indicates that the address resolution message sent from rc011 is correctly being routed back to rc011. (rc011 simply discards the message.) 
1363971114.607: acm_process_recv: base endpoint name rc011 1363971114.607: acm_process_acm_recv: 1363971114.607: acm_process_acm_recv: src 10.1.4.111 1363971114.607: acm_process_acm_recv: dest 10.1.4.106 1363971114.607: acm_process_acm_recv: unsolicited request 1363971114.607: acm_process_addr_req: 1363971114.607: acm_acquire_dest: 10.1.4.111 1363971114.607: acm_get_dest: 10.1.4.111 1363971114.607: acm_process_addr_req: dest state 4 1363971114.607: acm_complete_queued_req: status 0 1363971114.607: acm_put_dest: 10.1.4.111 What would be interesting to know is if the log file on rc006 shows that it received the message from rc011. That is, do we see something like this: : acm_process_recv: base endpoint name rc006 : acm_process_acm_recv: : acm_process_acm_recv: src 10.1.4.111 : acm_process_acm_recv: dest 10.1.4.106 : acm_process_acm_recv: unsolicited request It's curious that only a select group of nodes can't communicate with each other. I'm inclined to agree with your assessment that it may be an issue with the switch, or possibly how the multicast group was configured. - Sean ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
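To answer that question on rc006, one could count how often its log records an incoming message from rc011's address. A minimal sketch, assuming the log is at /var/log/ibacm.log and that each log entry sits on its own line; count_acm_recv_from is a hypothetical helper name:

```shell
# count_acm_recv_from SRC_ADDR LOGFILE
# Prints the number of log lines that record a received message whose
# source address is SRC_ADDR. -F treats the pattern as a fixed string
# (dots are literal) and -w keeps 10.1.4.11 from matching 10.1.4.111.
count_acm_recv_from() {
    grep -cwF -- "acm_process_acm_recv: src $1" "$2"
}

# Example, run on rc006 (log path is an assumption):
#   count_acm_recv_from 10.1.4.111 /var/log/ibacm.log
```

A count of 0 on rc006 while rc011 shows the request leaving would point at the fabric (switch, links, or multicast group) rather than at ibacm itself.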
Re: [ewg] ib_acme fails for requests with IPv4 addresses (ofed 3.5)
Note that you can test each node separately by making the source/destination addresses the same. This may show that your first system, rc002, is working, but rc003 is not. On the second node, the ib_acme command fails only for IPs, too. But it returns with a different message ('Cannot assign requested address'): === == rc003 ~/tmp/ibacm-1.0.7 $ ib_acme -f i -s 10.0.0.52 -d 10.0.0.51 -v -P -V Service: localhost Destination: 10.0.0.51 Source: 10.0.0.52 ib_acm_resolve_ip failed: Cannot assign requested address SA verification: failed Cannot assign requested address Error Count,Resolve Count,No Data,Addr Query Count,Addr Cache Count,Route Query Count,Route Cache Count localhost,1,2,0,0,0,0,0 return status 0x0 rc003 ~/ $ cat /var/log/ibacm.log ... 1363872021.460: acm_svr_accept: 1363872021.460: acm_svr_accept: assigned client 0 1363872021.460: acm_server: receiving from client 0 1363872021.460: acm_svr_receive: client 0 1363872021.460: acm_svr_resolve_dest: client 0 1363872021.460: acm_svr_resolve_dest: src 10.0.0.52 1363872021.460: acm_get_ep: 10.0.0.52 1363872021.460: acm_get_ep: notice - could not find 10.0.0.52 It doesn't appear that the ibacm address information is correct. Having the complete log file may help. The assigned address configuration would end up being near the top of the log file. ibacm uses an address file, ibacm_addr.cfg, to assign address information to ports. If this file is not present, it will be created. It's a text file, and the format is hopefully straightforward to follow. As a couple of places to look , the file may be in: /etc/rdma/ibacm_addr.cfg /usr/local/etc/rdma/ibacm_addr.cfg If you find the file, the simplest thing to do may be to just remove it. You can look at the existing file to see that the correct IP address has been assigned to the right port. - Sean ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
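Locating the address file can be scripted. A minimal sketch that checks the two locations named above; find_ibacm_addr_cfg is a hypothetical helper name, and other install prefixes may place the file elsewhere:

```shell
# find_ibacm_addr_cfg [PATH...]
# Prints every candidate path that exists as a regular file. With no
# arguments it checks the two locations mentioned in this thread.
find_ibacm_addr_cfg() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/rdma/ibacm_addr.cfg /usr/local/etc/rdma/ibacm_addr.cfg
    fi
    for f in "$@"; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

find_ibacm_addr_cfg
```

Before removing a file that this prints, it may be worth keeping a copy so the old address assignments can be compared with the regenerated ones.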
Re: [ewg] Changing min_rnr_timer and timeout attributes of QP
I am trying to change the min_rnr_timer and timeout attributes of a QP from the Linux kernel (RHEL6.3). The QP is created using rdma_create_qp() of the rdma_cm module and not ib_create_qp(), as I want generic code that works over IB as well as RoCE. These cannot be changed directly. The min_rnr_timer is set to 0, which encodes the maximum delay. The timeout value is determined based on the packet lifetime reported by the SA. You may be able to configure the SM to set a higher/lower packet lifetime. - Sean
Re: [ewg] Link available/unavailable notification
I initially thought that I could achieve this with ib_register_event_handler(). But after reading a little bit more (IB spec chapter 11 and chapter 7) it looks like only one such handler can be registered with the device. So if my module has to co-exist with any other kernel module using the IB device for such events, this approach may not work. Any comments on this? I am simply interested in learning of a change in port status as soon as it happens (an event based approach is preferred over polling for it). ib_register_event_handler supports multiple users. The device 'invokes' a handler, which is responsible for reporting the event to all registered clients.
Re: [ewg] Changing min_rnr_timer and timeout attributes of QP
I am trying to change the min_rnr_timer and timeout attributes of a QP from the Linux kernel (RHEL6.3). The QP is created using rdma_create_qp() of the rdma_cm module and not ib_create_qp(), as I want generic code that works over IB as well as RoCE. These cannot be changed directly. The min_rnr_timer is set to 0, which encodes the maximum delay. The timeout value is determined based on the packet lifetime reported by the SA. You may be able to configure the SM to set a higher/lower packet lifetime. Actually... from the kernel, I think calling ib_create_qp() works with IB and RoCE, but not iWarp. You can still use the rdma_cm to connect, but transition the QPs yourself using rdma_init_qp_attr() and ib_modify_qp(). - Sean
Re: [ewg] Interop test failure using OFED-3.5 RC4
We have investigated and found that perftest was upgraded from v1.8 to v2.0 on 11/19/12, between RC3 and RC4. Er, I meant between RC2 and RC3. Why would there be a _major_ version change in any component done in the middle of a release cycle?!
Re: [ewg] EWG/OFED meeting minutes for May 29, 2012
This issue is caused by the following lines in the ibacm daemon (/etc/init.d/ibacm): # Required-Start: rdma $network # Required-Stop: rdma $network - These lines fit RHEL6.x in-box driver that includes /etc/init.d/rdma. Note: SLES, RHEL5.x Distros and OFED use /etc/init.d/openibd to load the driver. So, I used openibd instead of rdma. Thanks! I'll see if I can figure out how to fix the ibacm.init.in file, which is used to generate the ibacm script, to account for the differences. - Sean
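The substitution described above can be expressed as a small filter over the init script header. A minimal sketch, assuming the LSB header lines have the form quoted in this message; use_openibd is a hypothetical helper name, and this is not the actual change made to ibacm.init.in:

```shell
# use_openibd: read an init script on stdin and write it to stdout with
# the "rdma" facility in the Required-Start / Required-Stop header lines
# replaced by "openibd". All other lines pass through unchanged.
use_openibd() {
    sed -e 's/^\(# Required-Start:[[:space:]]*\)rdma\([[:space:]]\)/\1openibd\2/' \
        -e 's/^\(# Required-Stop:[[:space:]]*\)rdma\([[:space:]]\)/\1openibd\2/'
}

# Example (output file name is illustrative):
#   use_openibd < /etc/init.d/ibacm > ibacm.openibd
```

Only the two header lines are anchored, so other uses of the word rdma in the script body are left alone.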
Re: [ewg] server migration
There is a new direct subdomain to the downloads: downloads.openfabrics.org What is the path to the download directory?
Re: [ewg] Status of SDP? (and ib_sdp patch for Linux-3.4.x)
What makes SDP so useful, and why it will be sorely missed, is that it does not require users to refactor existing TCP socket code, or even be in possession of the source code for that matter. Simply preloading libsdp is all you need to do to convert that old, cranky tcp program to run (almost) native over InfiniBand. rsockets also comes with a preload library that allows TCP apps to run over RDMA devices. It has been tested with OSU and Intel MPI binaries. (I, personally, do not have access to Intel MPI source code.) The major difficulty is supporting applications which call fork. - Sean
Re: [ewg] [RFC] – Proposal for new process for OFED releases
We propose a new process for the OFED releases starting from next OFED release: - OFED content will be the relevant kernel.org modules and user space released packages - OFED will offer only backports to the distros (no fixes) I think this point needs to be clarified - at least to me anyway. :) - OFED package will be used for easy installation of all packages in a friendly manner The main goals of this change: 1. Ensure OFED and the upstream kernel are the same 2. Provide customers a way to use the new features in latest kernels on existing distros 3. OFED qualification will contribute to the stability of the upstream code I like this approach. We think that at this point of the RDMA technology maturity this is the right way to go. In this way OFED is not conflicting with the kernel or the distros, and still provide a valuable value for early adopters of new features. Versions: We suggest that the OFED version will be the same as kernel.org For example, for kernel 3.2 the OFED release would be OFED-3.2. This would make it easy for people to associate the OFED code with the corresponding kernel.org code. Some open questions that we should consider: - How to handle experimental features? - Need to follow up kernel stable releases if bug fixes are relevant to OFA modules - Should we have a release for every kernel release (I think yes) This would help test upstream submissions, which would be good. It may be desirable to have stable versions of previous releases, so that customers can get bug fixes without pulling in a bunch of new features. E.g. OFED-3.2.1, OFED-3.2.2, etc. If maintaining stable releases of every version is too expensive, maybe mark specific versions (i.e. 3.2) as stable, with intermediate releases (i.e. 3.3, 3.4) as experimental. Just some ideas. - What should we do with modules like SDP that are not in kernel? Either remove them or carry them forward as experimental features. 
- Sean
Re: [ewg] T3 change needed for OFED-1.5.4
I got lost here. The librdmacm has release 1.0.14. This included previous fixes. Release 1.0.14.1 reverted commit 93635fa33b41d356fa096242fec4ce788194b42f from the 1.0.14 release.

So... if you want commit 93635fa33b41d356fa096242fec4ce788194b42f, then release 1.0.14 should work. If you don't want it, release 1.0.14.1 is needed. If you want the commit, but also want a version number after 1.0.14.1, I need to push out a new release. :) I may be able to create a new release depending on how entwined xrc support has become in the librdmacm code base. :P

Can you let me know which option you're looking for?

- Sean

-----Original Message-----
From: Steve Wise [mailto:sw...@opengridcomputing.com]
Sent: Wednesday, September 14, 2011 7:55 AM
To: Hefty, Sean
Cc: Vladimir Sokolovsky; Tziporet Koren; OpenFabrics EWG; Divy Le Ray; Kumar A S; Vipul Pandya
Subject: T3 change needed for OFED-1.5.4

Hey Sean,

Do you remember we pulled out 2 upstream rping fixes in ofed-1.5.2 (or maybe 1.5.3) because of a regression that was due to a change in iw_cxgb3/libcxgb3 that was not included in 1.5.2. The commits are below. Anyway, I want to pull this iw_cxgb3/libcxgb3 change into OFED-1.5.4 and I'll need the 2 rping fixes pulled in also.

We need this because the T4 iwarp driver/lib has this same functionality that isn't in the OFED T3 code. With this functionality, rping is broken and needs the 2 librdmacm fixes. However, if you pull in the 2 librdmacm fixes, then the existing T3 code will cause rping to hang, I think, which is why we pulled the rping changes from OFED-1.5.2. So we need to bring T3 up to date, and get these librdmacm fixes pulled in.

Can you please submit the librdma release that includes these rping changes once we get the T3 changes included?

Here are the details:

We need this iw_cxgb3 upstream fix. Chelsio will do this asap.

commit b955150ea784af4c193b708a2e8091673bf23004
Author: Steve Wise <sw...@opengridcomputing.com>
Date: Thu Oct 21 12:37:06 2010 +

    RDMA/cxgb3: When a user QP is marked in error, also mark the CQs in error

Also we need these librdmacm changes:

commit 93635fa33b41d356fa096242fec4ce788194b42f
Author: Sean Hefty <sean.he...@intel.com>
Date: Mon Nov 1 11:12:13 2010 -0700

    librdmacm/rping: Make sure CQ event thread exits before destroying the CQ

commit 8c6aeb3e70bbf275f9b618775133b9ae6f07ad15
Author: Steve Wise <sw...@opengridcomputing.com>
Date: Wed Oct 20 12:34:55 2010 -0700

    RPING: Remove printf for FLUSH completion.

And finally, we need to pull in libcxgb3-1.3.0.
Re: [ewg] T3 change needed for OFED-1.5.4
So I think we want the bits of 1.0.14. I'm ok with just changing OFED-1.5.4 to use that vs making a new release if that's ok with everyone (sort of backing up release number-wise).

I went ahead and pushed out release 1.0.15, so that OFED doesn't have to go backwards with their version number. The XRC code to the librdmacm is there, but that shouldn't matter, since it's not supported in the kernel or through the libibverbs in a way that the librdmacm will make use of. (I had fixes that came after the xrc patches, and I don't want to redo them.)

If 1.5.4 testing reveals any issues, let me know, and I'll try to get them corrected. But we should be okay.

- Sean
[ewg] new releases for ofed 1.5.4
Please pull in librdmacm 1.0.15 and ibacm 1.0.5 into ofed 1.5.4. Both distribution files are available in the ofa server download section. - Sean
Re: [ewg] [PATCH 2/10] DAPL v2.0: common: new IB collective provider for Mellanox Fabric Collective Agent
+static int create_member(struct dapl_hca *hca)
+{
+	ib_hca_transport_t *tp = &hca->ib_trans;
+	int size, ret = EFAULT;
+
+	dapl_log(DAPL_DBG_TYPE_EXTENSION,
+		 "create_member: tp=%p, ctx=%p\n", tp, tp->m_ctx);
+
+	if (!tp->m_ctx)
+		goto bail;
+
+	/* FCA address information */
+	tp->f_info = fca_get_rank_info(tp->m_ctx, &size);
+	if (!tp->f_info) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 "create_member: fca_get_rank_info() ERR ret=%s ctx=%p\n",
+			 strerror(errno), tp->m_ctx);
+		ret = errno;
+		goto err;
+	}
+
+	tp->m_info = malloc(sizeof(DAT_SOCK_ADDR) + size);
+	if (!tp->m_info) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 "create_member: malloc() ERR ret=%s ctx=%p\n",
+			 strerror(errno), tp->m_ctx);
+		fca_free_rank_info(tp->f_info);
+		goto err;
+	}
+	dapl_os_memzero(tp->m_info, sizeof(DAT_SOCK_ADDR) + size);
+
+	if ((tp->l_sock = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
+		dapl_log(DAPL_DBG_TYPE_ERR,
+			 "create_member: socket() ERR ret=%s \n",
+			 strerror(errno));
+		ret = errno;
+		goto err;
+	}
+
+	dapl_log(DAPL_DBG_TYPE_EXTENSION, "create_member listen socket\n");
+
+	/*
+	 * only rank0 needs listen, but we don't know who is rank0 yet.
+	 * Everyone listen, start on seed port until find one unused
+	 */
+	memcpy((void*)&tp->m_addr, (void*)&hca->hca_address, sizeof(DAT_SOCK_ADDR));
+	tp->m_addr.sin_port = htons(DAT_COLL_SID-1);
+
+	do {
+		tp->m_addr.sin_port++;

You're in network-byte order here. ++ probably isn't what you want here.

+		ret = bind(tp->l_sock,
+			   (struct sockaddr *)&tp->m_addr,
+			   sizeof(DAT_SOCK_ADDR));
+
+	} while (ret == -1 && errno == EADDRINUSE);
+
+	if (ret == -1)
+		goto err;
+
+	if ((ret = listen(tp->l_sock, 1024)) < 0)
+		goto err;
+
+	dapl_log(DAPL_DBG_TYPE_EXTENSION,
+		 "create_member: listen port 0x%x,%d \n",
+		 ntohs(tp->m_addr.sin_port),
+		 ntohs(tp->m_addr.sin_port));
+
+	/* local fca_info and sock_addr to member buffer for MPI exchange */
+	tp->f_size = size;
+	tp->m_size = size + sizeof(DAT_SOCK_ADDR);
+	memcpy(tp->m_info, tp->f_info, size);
+	memcpy( ((char*)tp->m_info + size), &tp->m_addr, sizeof(DAT_SOCK_ADDR));
+
+	/* free rank info after getting */
+	fca_free_rank_info(tp->f_info);
+	tp->f_info = NULL;
+
+	dapl_log(DAPL_DBG_TYPE_EXTENSION,
+		 "create_member: m_ptr=%p, sz=%d exit SUCCESS\n",
+		 tp->m_info, tp->m_size);
+
+	return 0;
+err:
+	/* cleanup */
+	if (tp->f_info) {
+		fca_free_rank_info(tp->f_info);
+		tp->f_info = NULL;
+	}
+
+	if (tp->m_info) {
+		free(tp->m_info);
+		tp->m_info = NULL;
+	}
+	if (tp->l_sock > 0)
+		close(tp->l_sock);
+bail:
+	return 1;
+}
+
Re: [ewg] rping/cxgb3 regression
I'm wondering if pulling the rping changes for ofed-1.5.3 would be ok? I guess to do this you would have to push a 1-off librdmacm without those changes? Or maybe back up what is in OFED-1.5.3 to the previous release without this rping change? Thoughts?

Is the commit (93635fa33b41d356fa096242fec4ce788194b42f) below the issue? (Btw, the author listed in my git tree is wrong.)

I don't think I want to drop back to 1.0.13 for 1.5.3, so maybe reverting this change and pushing out 1.0.14.1 would work. There's just one other change after 1.0.14 at the moment, and it's to the build, so I'd skip a full release for now. Let me know if you think this would work.

- Sean

---
librdmacm/rping: Make sure CQ event thread exits before destroying the CQ

It is possible for the CQ event thread to poll the CQ after it has been destroyed which can result in a seg fault on T3 interfaces. This patch waits for the thread to exit before destroying the CQ.

Signed-off-by: Steve Wise <sw...@opengridcomputing.com>
Signed-off-by: Sean Hefty <sean.he...@intel.com>

diff --git a/examples/rping.c b/examples/rping.c
index 2d4c2de..ee292ec 100644
--- a/examples/rping.c
+++ b/examples/rping.c
@@ -280,12 +280,11 @@ static int rping_cq_event_handler(struct rping_cb *cb)
 		ret = 0;
 
 		if (wc.status) {
-			if (wc.status != IBV_WC_WR_FLUSH_ERR) {
+			if (wc.status != IBV_WC_WR_FLUSH_ERR)
 				fprintf(stderr,
 					"cq completion failed status %d\n",
 					wc.status);
-				ret = -1;
-			}
+			ret = -1;
 			goto error;
 		}
 
@@ -802,10 +801,9 @@ static void *rping_persistent_server_thread(void *arg)
 
 	rping_test_server(cb);
 	rdma_disconnect(cb->child_cm_id);
+	pthread_join(cb->cqthread, NULL);
 	rping_free_buffers(cb);
 	rping_free_qp(cb);
-	pthread_cancel(cb->cqthread);
-	pthread_join(cb->cqthread, NULL);
 	rdma_destroy_id(cb->child_cm_id);
 	free_cb(cb);
 	return NULL;
@@ -890,6 +888,7 @@ static int rping_run_server(struct rping_cb *cb)
 
 	rping_test_server(cb);
 	rdma_disconnect(cb->child_cm_id);
+	pthread_join(cb->cqthread, NULL);
 	rdma_destroy_id(cb->child_cm_id);
 err2:
 	rping_free_buffers(cb);
@@ -1057,6 +1056,7 @@ static int rping_run_client(struct rping_cb *cb)
 
 	rping_test_client(cb);
 	rdma_disconnect(cb->cm_id);
+	pthread_join(cb->cqthread, NULL);
 err2:
 	rping_free_buffers(cb);
 err1:
Re: [ewg] rping/cxgb3 regression
I placed a 1.0.14.1 package on the ofa server in the downloads/rdmacm section. Can you verify that it works? If so, I'll ask to pull it into 1.5.3 -Original Message- From: Steve Wise [mailto:sw...@opengridcomputing.com] Sent: Tuesday, February 15, 2011 10:37 AM To: Hefty, Sean Cc: OpenFabrics EWG; Tziporet Koren Subject: Re: rping/cxgb3 regression On 02/15/2011 12:18 PM, Hefty, Sean wrote: I'm wondering if pulling the rping changes for ofed-1.5.3 would be ok? I guess to do this you would have to push a 1-off librdmacm without those changes? Or maybe back up what is in OFED- 1.5.3 to the previous release without this rping change? Thoughts? Is the commit (93635fa33b41d356fa096242fec4ce788194b42f) below the issue? (Btw, the author listed in my git tree is wrong.) Yes. I don't think I want to drop back to 1.0.13 for 1.5.3, so maybe reverting this change and pushing out 1.0.14.1 would work. There's just one other change after 1.0.14 at the moment, and it's to the build, so I'd skip a full release for now. Let me know if you think this would work. I just tested that removing this from 1.0.14 will resolve the issue for 1.5.3. - Sean --- librdmacm/rping: Make sure CQ event thread exits before destroying the CQ It is possible for the CQ event thread to poll the CQ after it has been destroyed which can result in a seg fault on T3 interfaces. This patch waits for the thread to exit before destroying the CQ. 
Re: [ewg] rping/cxgb3 regression
Not a big deal. Vlad, can you pull librdmacm 1.0.14.1 into the next OFED 1.5.3 RC? The only change versus 1.0.14 is reverting a patch to the rping sample. Thanks, Sean -Original Message- From: Steve Wise [mailto:sw...@opengridcomputing.com] Sent: Tuesday, February 15, 2011 5:57 PM To: Hefty, Sean Cc: OpenFabrics EWG; Tziporet Koren Subject: Re: rping/cxgb3 regression I pulled it down, built/installed it on 2 nodes, then ran a bunch of rpings. No hangs. Looks good! Thanks Sean. Sorry about this. Steve. On 2/15/2011 7:46 PM, Hefty, Sean wrote: I placed a 1.0.14.1 package on the ofa server in the downloads/rdmacm section. Can you verify that it works? If so, I'll ask to pull it into 1.5.3 -Original Message- From: Steve Wise [mailto:sw...@opengridcomputing.com] Sent: Tuesday, February 15, 2011 10:37 AM To: Hefty, Sean Cc: OpenFabrics EWG; Tziporet Koren Subject: Re: rping/cxgb3 regression On 02/15/2011 12:18 PM, Hefty, Sean wrote: I'm wondering if pulling the rping changes for ofed-1.5.3 would be ok? I guess to do this you would have to push a 1-off librdmacm without those changes? Or maybe back up what is in OFED- 1.5.3 to the previous release without this rping change? Thoughts? Is the commit (93635fa33b41d356fa096242fec4ce788194b42f) below the issue? (Btw, the author listed in my git tree is wrong.) Yes. I don't think I want to drop back to 1.0.13 for 1.5.3, so maybe reverting this change and pushing out 1.0.14.1 would work. There's just one other change after 1.0.14 at the moment, and it's to the build, so I'd skip a full release for now. Let me know if you think this would work. I just tested that removing this from 1.0.14 will resolve the issue for 1.5.3. - Sean --- librdmacm/rping: Make sure CQ event thread exits before destroying the CQ It is possible for the CQ event thread to poll the CQ after it has been destroyed which can result in a seg fault on T3 interfaces. This patch waits for the thread to exit before destroying the CQ. 
Re: [ewg] [PATCH] IB/core: Control number of retries for SA to leave an MCG
The worst thing is that there is no indication to the user about this state

There can be multiple users of the same group. Only the last one leaving causes the leave request to be sent. Notifying that one user of the group doesn't help. All they can do is ask for the leave request to be retried anyway, which has to be coordinated with potential new users.

(a host is joined with no one to ever try and make it leave) The patch I sent also puts a message in the kernel log so users can read and react.

IMO, this is always a possibility and something that the SA must be able to handle. If a node, switch, link, etc. goes down, there's no guarantee that any leave request will be generated, let alone make it to the SA. Architecturally, I think the only currently available option is for the SA to ask a client to reregister.

- Sean
Re: [ewg] user SA notifications, redux
As I mentioned earlier, the reason ib_sa acts as a single access point for SA/SM traps and notices is because traps and notices are sent to ports, not to queue pairs and not to processes. That means only one entity can be subscribed for notices and traps at any particular time, and must manage them, sharing them out among all processes that are interested in them.

Can you provide a brief description of the intended usage model?
Re: [ewg] Splitting of the management git tree in Open Fabrics
With GA of OFED 1.5.2 scheduled for Sept 13, I would like to request comments from the community about the following split after that GA. On openfabrics.org/git split management.git into the following trees.

openfabrics.org/git/infiniband-diags.git
openfabrics.org/git/libibumad.git
openfabrics.org/git/libibmad.git
openfabrics.org/git/opensm.git

I think this makes sense. I'm not sure that it would have any effect on the Windows port, but I don't think there's anything that couldn't be dealt with. Are there any source files shared between the different components?
Re: [ewg] OFED 1.5.2rc4 fails to build on RHEL4 Update 8 x86_64
Can you prepare a new ibacm tarball with this patch?

There's an ibacm-1.0.3.tar.gz file on the ofa server under downloads/rdmacm. I won't mark this as the official 1.0.3 release until we know that it works, but it seems to so far.

- Sean
Re: [ewg] OFED 1.5.2rc4 fails to build on RHEL4 Update 8 x86_64
So how do you suggest to solve this? I think we cannot go to GA in this way

I'll look at this and see what possibilities there are.
Re: [ewg] OFED 1.5.2rc4 fails to build on RHEL4 Update 8 x86_64
ibacm: support distros with older versions of gcc

From: Sean Hefty <sean.he...@intel.com>

ibacm implements atomics using gcc intrinsics that were introduced in gcc 4.1.2. If an older version of gcc is used to compile the code, an error results. Check that the required atomic calls are supported, and if not, provide our own implementation.

This fixes a build issue on RH 5.x.

Signed-off-by: Sean Hefty <sean.he...@intel.com>
---
This is very lightly tested, but may fix the issue.

 configure.in  | 10 ++
 linux/osd.h   | 32 ++--
 src/acm.c     |  8
 src/acme.c    |  4
 src/libacm.c  |  4
 windows/osd.h |  1 +
 6 files changed, 57 insertions(+), 2 deletions(-)
 mode change 100644 => 100755 configure.in

diff --git a/configure.in b/configure.in
old mode 100644
new mode 100755
index 997c775..dfddeac
--- a/configure.in
+++ b/configure.in
@@ -39,6 +39,16 @@ AC_CHECK_HEADER(infiniband/umad.h, [],
     AC_MSG_ERROR([infiniband/umad.h not found. Is libibumad installed?]))
 fi
 
+dnl Check for gcc atomic intrinsics
+AC_MSG_CHECKING(compiler support for atomics)
+AC_TRY_LINK([int i = 0;],
+    [ return __sync_add_and_fetch(&i, 1) != __sync_sub_and_fetch(&i, 1); ],
+    [ AC_MSG_RESULT(yes) ],
+    [
+        AC_MSG_RESULT(no)
+        AC_DEFINE(DEFINE_ATOMICS, 1, [Set to 1 to implement atomics])
+    ])
+
 AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script,
     if test -n "`$LD --help < /dev/null 2>/dev/null | grep version-script`"; then
         ac_cv_version_script=yes
diff --git a/linux/osd.h b/linux/osd.h
index 722e1b1..28c3647 100644
--- a/linux/osd.h
+++ b/linux/osd.h
@@ -65,9 +65,37 @@
 #endif
 #define ntohll(x) htonll(x)
 
+#if DEFINE_ATOMICS
+typedef struct { pthread_mutex_t mut; int val; } atomic_t;
+static inline int atomic_inc(atomic_t *atomic)
+{
+	int v;
+
+	pthread_mutex_lock(&atomic->mut);
+	v = ++(atomic->val);
+	pthread_mutex_unlock(&atomic->mut);
+	return v;
+}
+static inline int atomic_dec(atomic_t *atomic)
+{
+	int v;
+
+	pthread_mutex_lock(&atomic->mut);
+	v = --(atomic->val);
+	pthread_mutex_unlock(&atomic->mut);
+	return v;
+}
+static inline void atomic_init(atomic_t *atomic)
+{
+	pthread_mutex_init(&atomic->mut, NULL);
+	atomic->val = 0;
+}
+#else
 typedef struct { volatile int val; } atomic_t;
-#define atomic_inc(v) (__sync_fetch_and_add(&(v)->val, 1) + 1)
-#define atomic_dec(v) (__sync_fetch_and_sub(&(v)->val, 1) - 1)
+#define atomic_inc(v) (__sync_add_and_fetch(&(v)->val, 1))
+#define atomic_dec(v) (__sync_sub_and_fetch(&(v)->val, 1))
+#define atomic_init(v) ((v)->val = 0)
+#endif
 #define atomic_get(v) ((v)->val)
 #define atomic_set(v, s) ((v)->val = s)
 
diff --git a/src/acm.c b/src/acm.c
index 7c8b84b..820365c 100644
--- a/src/acm.c
+++ b/src/acm.c
@@ -27,6 +27,10 @@
  * SOFTWARE.
  */
 
+#if HAVE_CONFIG_H
+# include <config.h>
+#endif /* HAVE_CONFIG_H */
+
 #include <stdio.h>
 #include <stdarg.h>
 #include <string.h>
@@ -268,6 +272,7 @@ acm_init_dest(struct acm_dest *dest, uint8_t addr_type, uint8_t *addr, size_t si
 	memcpy(dest->address, addr, size);
 	dest->addr_type = addr_type;
 	DListInit(&dest->req_queue);
+	atomic_init(&dest->refcnt);
 	atomic_set(&dest->refcnt, 1);
 	lock_init(&dest->lock);
 }
@@ -1560,6 +1565,7 @@ static void acm_init_server(void)
 		lock_init(&client[i].lock);
 		client[i].index = i;
 		client[i].sock = INVALID_SOCKET;
+		atomic_init(&client[i].refcnt);
 	}
 }
 
@@ -2680,6 +2686,8 @@ int CDECL_FUNC main(int argc, char **argv)
 	acm_log(0, "Assistant to the InfiniBand Communication Manager\n");
 	acm_log_options();
 
+	atomic_init(&tid);
+	atomic_init(&wait_cnt);
 	DListInit(&dev_list);
 	DListInit(&timeout_list);
 	event_init(&timeout_event);
diff --git a/src/acme.c b/src/acme.c
index 7428a57..e03679f 100644
--- a/src/acme.c
+++ b/src/acme.c
@@ -27,6 +27,10 @@
  * SOFTWARE.
  */
 
+#if HAVE_CONFIG_H
+# include <config.h>
+#endif /* HAVE_CONFIG_H */
+
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
diff --git a/src/libacm.c b/src/libacm.c
index 32fd7e2..9d56cd2 100644
--- a/src/libacm.c
+++ b/src/libacm.c
@@ -27,6 +27,10 @@
  * SOFTWARE.
  */
 
+#if HAVE_CONFIG_H
+# include <config.h>
+#endif /* HAVE_CONFIG_H */
+
 #include <osd.h>
 #include "libacm.h"
 #include <infiniband/acm.h>
diff --git a/windows/osd.h b/windows/osd.h
index 10e5e18..9587c51 100644
--- a/windows/osd.h
+++ b/windows/osd.h
@@ -44,6 +44,7 @@ typedef struct { volatile LONG val; } atomic_t;
 #define atomic_dec(v) InterlockedDecrement(&(v)->val)
 #define atomic_get(v) ((v)->val)
 #define atomic_set(v, s) ((v)->val = s)
+#define atomic_init(v) ((v)->val = 0)
 
 #define event_t HANDLE
 #define event_init(e) *(e) = CreateEvent(NULL, FALSE, FALSE, NULL)
Re: [ewg] OFED 1.5.2rc4 fails to build on RHEL4 Update 8 x86_64
An alternative might be to use /usr/include/asm/atomic.h as this has atomic_inc and atomic_dec macros already defined. It would also be more portable.

This file doesn't exist on my systems.
[ewg] library updates for OFED 1.5.2
Vlad,

Please pull in librdmacm-1.0.13 and ibacm-1.0.2 into OFED 1.5.2. Both are available from the OFA server: http://www.openfabrics.org/downloads/rdmacm/

librdmacm updates:
- mostly documentation updates with several new man pages
- fixes to set errno correctly

ibacm updates:
- allows running as a daemon
- changes default location of input/log files

Thanks,
- Sean
[ewg] git tree for ofed docs
Can someone point me to the git tree that contains the release notes/docs that get pulled into the OFED releases?
[ewg] [PATCH] librdmacm: release notes
Signed-off-by: Sean Hefty sean.he...@intel.com --- rdma_cm_release_notes.txt | 117 + 1 files changed, 75 insertions(+), 42 deletions(-) diff --git a/rdma_cm_release_notes.txt b/rdma_cm_release_notes.txt index 977ebfa..07e4cab 100644 --- a/rdma_cm_release_notes.txt +++ b/rdma_cm_release_notes.txt @@ -1,7 +1,7 @@ Open Fabrics Enterprise Distribution (OFED) RDMA CM in OFED 1.5 Release Notes - December 2009 + July 2010 === @@ -10,8 +10,6 @@ Table of Contents 1. Overview 2. New Features 3. Known Issues -4. Fixed bugs since OFED 1.3 -5. Fixed bugs since OFED 1.4.2 === 1. Overview @@ -30,32 +28,80 @@ API for data transfers. === 2. New Features === -for OFED 1.3: -Added support for valgrind checks. - -Added quality of service support. Quality of service (QoS) is automatically -enabled through the use of the kernel rdma_cm, if the local subnet is -configured for QoS. Additionally, the librdmacm allows users to request -a specific type of service to use when connecting through a new API. -Support for QoS is fabric dependent, and usually configured by an -administrator. Details of QoS are outside the scope of this document; -additional information may be found in subnet management (SM) documentation. - -Added sanity checks and fixes for maximum outstanding RDMA operations in an -effort to detect application errors earlier in the connection process. - -Various documentation updates. - -for OFED 1.2: -The RDMA CM now supports connected, datagram, and multicast data transfers. - -When used over Infiniband, the RDMA CM will make use of a local path record -cache, if it is enabled. On large fabrics, use of the local cache can greatly -reduce connection time. Use of a cache is not necessary for iWarp. - -Man pages have been created to describe the various interfaces and test -programs available. For a full list, users should refer to the rdma_cm.7 man -page. 
+for OFED 1.5.2: + +Several enhancements were added to librdmacm release 1.0.12 that +are intended to simplify using RDMA devices and address scalability issues. +These changes were in response to long standing requests to make +connection establishment 'more like sockets'. For full details, +users should refer to the appropriate man pages. Major changes include: + +* Support synchronous operation for library calls. Users can control + whether an rdma_cm_id operates asynchronously or synchronously based on + the rdma_event_channel parameter. Use of synchronous operations + reduces the amount of application code required to use the librdmacm + by eliminating the need for event processing code. + + An rdma_cm_id will be marked for synchronous operation if the + rdma_event_channel parameter is NULL for rdma_create_id or + rdma_migrate_id. Users can toggle between synchronous and + asynchronous operation through the rdma_migrate_id call. + + Calls that operate synchronously include rdma_resolve_addr, + rdma_resolve_route, rdma_connect, rdma_accept, and rdma_get_request. + Synchronous event data is returned to the user through the + rdma_cm_id. + +* The addition of a new API: rdma_getaddrinfo. This call is modeled + after getaddrinfo, but for RDMA devices and connections. It has the + following notable deviations from getaddrinfo: + + A source address is returned as part of the call to allow the + user to allocate necessary local HW resources for connections. + + Optional routing information may be returned to support + Infiniband fabrics. IB routing information includes necessary + path record data. rdma_getaddrinfo will obtain this information + if IB ACM support (see below) is enabled. The use of IB ACM + is not required for rdma_getaddrinfo. + + rdma_getaddrinfo provides future extensions to support + more complex address and route resolution mechanisms, such as + multiple path support and failover. 
+ +* Support for a new APIs: rdma_get_request, rdma_create_ep, and + rdma_destroy_ep. rdma_get_request simplifies the passive side + implementation by adding synchronous support for accepting new + connections. rdma_create_ep combines the functionality of + rdma_create_id, rdma_create_qp, rdma_resolve_addr, and rdma_resolve_route + in a single API that uses the output of rdma_getaddrinfo as its input. + +* Support for optional parameters. To simplify support for casual RDMA + developers and researchers, the librdmacm can allocate protection + domains, completion queues, and queue pairs on a user's behalf. + This simplifies the amount of information that
[ewg] [PATCH] ibacm: release notes
Signed-off-by: Sean Hefty sean.he...@intel.com
---
 ibacm_release_notes.txt |  144 +++
 1 files changed, 144 insertions(+), 0 deletions(-)
 create mode 100644 ibacm_release_notes.txt

diff --git a/ibacm_release_notes.txt b/ibacm_release_notes.txt
new file mode 100644
index 000..4048b39
--- /dev/null
+++ b/ibacm_release_notes.txt
@@ -0,0 +1,144 @@
+Open Fabrics Enterprise Distribution (OFED)
+ IB ACM in OFED 1.5 Release Notes
+
+ July 2010
+
+
+===
+Table of Contents
+===
+1. Overview
+2. Quick Start Guide
+3. Operation Details
+4. Known Issues
+
+===
+1. Overview
+===
+The IB ACM package implements and provides a framework for experimental name,
+address, and route resolution services over InfiniBand. It is intended to
+address connection setup scalability issues running MPI applications on
+large clusters. The IB ACM provides information needed to establish a
+connection, but does not implement the CM protocol.
+
+The librdmacm can invoke IB ACM services when built using the --with-ib_acm
+option. The IB ACM services tie in under the rdma_resolve_addr,
+rdma_resolve_route, and rdma_getaddrinfo routines. For maximum benefit,
+the rdma_getaddrinfo routine should be used; however, existing applications
+should still see significant connection scaling benefits using the calls
+available in librdmacm 1.0.11 and previous releases.
+
+The IB ACM is focused on being scalable and efficient. The current
+implementation limits network traffic, SA interactions, and centralized
+services. ACM supports multiple resolution protocols in order to handle
+different fabric topologies.
+
+The IB ACM package is comprised of two components: the ib_acm service
+and a test/configuration utility - ib_acme. Both are userspace components
+and are available for Linux and Windows. Additional details are given below.
+
+===
+2. Quick Start Guide
+===
+
+1. Prerequisites: libibverbs and libibumad must be installed.
+   The IB stack should be running with IPoIB configured.
+   These steps assume that the user has administrative privileges.
+2. Install the IB ACM package
+   This installs ib_acm, and ib_acme.
+3. Run ib_acme -A -O
+   This will generate IB ACM address and options configuration files.
+   (acm_addr.cfg and acm_opts.cfg)
+4. Run ib_acm and leave running.
+   ib_acm will eventually be converted to a service/daemon, but for now
+   is a userspace application. Because ib_acm uses the libibumad
+   interfaces, it should be run with administrative privileges.
+5. Optionally, run ib_acme -s source_ip -d dest_ip -v
+   This will verify that the ib_acm service is running.
+6. Install librdmacm using the build option --with-ib_acm.
+   The librdmacm will automatically use the ib_acm service.
+   On failures, the librdmacm will fall back to normal resolution.
+
+===
+3. Operation Details
+===
+
+ib_acme:
+The ib_acme program serves a dual role. It acts as a utility to test
+ib_acm operation and help verify if the ib_acm service and selected
+protocol are usable for a given cluster configuration. Additionally,
+it automatically generates ib_acm configuration files to assist with
+or eliminate manual setup.
+
+
+acm configuration files:
+The ib_acm service relies on two configuration files.
+
+The acm_addr.cfg file contains name and address mappings for each IB
+device, port, pkey endpoint. Although the names in the acm_addr.cfg
+file can be anything, ib_acme maps the host name and IP addresses to
+the IB endpoints.
+
+The acm_opts.cfg file provides a set of configurable options for the
+ib_acm service, such as timeout, number of retries, logging level, etc.
+ib_acme generates the acm_opts.cfg file using static information. A
+future enhancement would adjust options based on the current system
+and cluster size.
+
+
+ib_acm:
+The ib_acm service is responsible for resolving names and addresses to
+InfiniBand path information and caching such data. It is currently
+implemented as an executable application, but is a conceptual service
+or daemon that should execute with administrative privileges.
+
+The ib_acm implements a client interface over TCP sockets, which is
+abstracted by the librdmacm library. One or more back-end protocols are
+used by the ib_acm service to satisfy user requests. Although
Re: [ewg] librdmacm 1.0.12: message size constrain?
> I write a test code from your example. It works fine if message buffer size (BUF_SIZE) is smaller than 64 bytes. Is there some constrain?

There should not be any constraint beyond whatever may be imposed by the device.

- Sean
Re: [ewg] [PATCH] Handling busy responses from the SA
> Also, I guess, it would be a good API choice if the caller could say 'get me a reply for this mad or error within 60s' rather than specify details like retry counts, etc. The timeout values should be globally set and derived from the usual SA provided data for network transits...

I agree with this. Within the framework of the existing umad ABI, this could be specified by setting the high bit in the ib_user_mad_hdr:timeout_ms field, assuming that no one is using that bit in practice. The kernel could then freely select the retry/timeout policy for these clients, which for starters could include dropping BUSY responses and adjusting the timeout using an approach similar to what Mike mentioned in a separate email. Kernel clients could be updated to use this new mode.

Any disagreements to this approach?
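To make the high-bit proposal above concrete, here is a minimal sketch in C. The flag name UMAD_TIMEOUT_KERNEL_POLICY and the three helper functions are hypothetical; nothing like them exists in the umad ABI today. Treating the remaining 31 bits as an overall deadline in milliseconds (the "within 60s" idea from the quoted text) is also an assumption, not something the thread settled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: the high bit of the 32-bit timeout_ms field.
 * This is NOT part of the existing umad ABI. */
#define UMAD_TIMEOUT_KERNEL_POLICY (UINT32_C(1) << 31)

/* Build a timeout_ms value that asks the kernel to choose the
 * retry/timeout policy. The low 31 bits carry an overall deadline
 * in milliseconds (an assumption made for this sketch). */
static inline uint32_t umad_timeout_encode(uint32_t deadline_ms)
{
    /* The deadline itself must not collide with the flag bit. */
    assert((deadline_ms & UMAD_TIMEOUT_KERNEL_POLICY) == 0);
    return deadline_ms | UMAD_TIMEOUT_KERNEL_POLICY;
}

/* Nonzero if the caller delegated retry/timeout policy to the kernel. */
static inline int umad_timeout_kernel_policy(uint32_t timeout_ms)
{
    return (timeout_ms & UMAD_TIMEOUT_KERNEL_POLICY) != 0;
}

/* Recover the millisecond value with the flag bit masked off. */
static inline uint32_t umad_timeout_deadline(uint32_t timeout_ms)
{
    return timeout_ms & ~UMAD_TIMEOUT_KERNEL_POLICY;
}
```

Existing clients that never set the high bit would keep their current per-request timeout and retry behavior, which is why the approach depends on no one using that bit in practice.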
[ewg] librdmacm 1.0.12 release notes for OFED 1.5.2
Here is the first cut at release notes -- attached and inline -- for the OFED 1.5.2 release of the librdmacm.

- Sean

--- librdmacm release notes ---

Several enhancements were added to librdmacm release 1.0.12 that are intended to simplify using RDMA devices and address scalability issues. These changes were in response to long-standing requests to make connection establishment 'more like sockets'. For full details, users should refer to the appropriate man pages.

Major changes include:

* Support synchronous operation for library calls.
  Users can control whether an rdma_cm_id operates asynchronously or synchronously based on the rdma_event_channel parameter. Use of synchronous operations reduces the amount of application code required to use the librdmacm by eliminating the need for event processing code.
  An rdma_cm_id will be marked for synchronous operation if the rdma_event_channel parameter is NULL for rdma_create_id or rdma_migrate_id. Users can toggle between synchronous and asynchronous operation through the rdma_migrate_id call.
  Calls that operate synchronously include rdma_resolve_addr, rdma_resolve_route, rdma_connect, rdma_accept, and rdma_get_request. Synchronous event data is returned to the user through the rdma_cm_id.

* The addition of a new API: rdma_getaddrinfo.
  This call is modeled after getaddrinfo, but for RDMA devices and connections. It has the following notable deviations from getaddrinfo:
  A source address is returned as part of the call to allow the user to allocate necessary local HW resources for connections.
  Optional routing information may be returned to support InfiniBand fabrics. IB routing information includes necessary path record data. rdma_getaddrinfo will obtain this information if IB ACM support (see below) is enabled. The use of IB ACM is not required for rdma_getaddrinfo.
  rdma_getaddrinfo provides future extensions to support more complex address and route resolution mechanisms, such as multiple path support and failover.

* Support for new APIs: rdma_get_request, rdma_create_ep, and rdma_destroy_ep.
  rdma_get_request simplifies the passive side implementation by adding synchronous support for accepting new connections.
  rdma_create_ep combines the functionality of rdma_create_id, rdma_create_qp, rdma_resolve_addr, and rdma_resolve_route in a single API that uses the output of rdma_getaddrinfo as its input.

* Support for optional parameters.
  To simplify support for casual RDMA developers and researchers, the librdmacm can allocate protection domains, completion queues, and queue pairs on a user's behalf. This reduces the amount of information that a developer must learn in order to use RDMA, and allows the user to take advantage of higher-level completion processing abstractions.
  In addition to optional parameters, a user can also specify that the librdmacm should automatically select usable values for RDMA read operations.

* Add support for IB ACM.
  IB ACM (InfiniBand Assistant for Communication Management) defines a socket based protocol to an IB address and route resolution service. One implementation of that service is provided separately by the ibacm package, but anyone can implement the service provided that they adhere to the IB ACM socket protocol.
  IB ACM is an experimental service targeted at increasing the scalability of applications running on a large cluster.
  Use of IB ACM is not required and is controlled through the build option '--with-ib_acm'. If the librdmacm fails to contact the IB ACM service, it reverts to using kernel services to resolve address and routing data.

* Add RDMA helper routines.
  The librdmacm provides a set of simpler verbs calls for posting work requests, registering memory, and checking for completions. These calls are wrappers around libibverbs routines.
[ewg] ibacm 1.0.0 release notes for OFED 1.5.2
Here are release notes -- attached and inline -- for IB ACM 1.0.0 for OFED 1.5.2.

- Sean

---

Assistant for InfiniBand Communication Management (IB ACM)

Note: The IB ACM should be considered experimental.

Overview
--------
The IB ACM package implements and provides a framework for experimental name, address, and route resolution services over InfiniBand. It is intended to address connection setup scalability issues running MPI applications on large clusters. The IB ACM provides information needed to establish a connection, but does not implement the CM protocol.

The librdmacm can invoke IB ACM services when built using the --with-ib_acm option. The IB ACM services tie in under the rdma_resolve_addr, rdma_resolve_route, and rdma_getaddrinfo routines. For maximum benefit, the rdma_getaddrinfo routine should be used; however, existing applications should still see significant connection scaling benefits using the calls available in librdmacm 1.0.11 and previous releases.

The IB ACM is focused on being scalable and efficient. The current implementation limits network traffic, SA interactions, and centralized services. ACM supports multiple resolution protocols in order to handle different fabric topologies. This release 1.0.0 is limited in its handling of dynamic changes.

The IB ACM package is comprised of two components: the ib_acm service and a test/configuration utility - ib_acme. Both are userspace components and are available for Linux and Windows. Additional details are given below.

Quick Start Guide
-----------------
1. Prerequisites: libibverbs and libibumad must be installed.
   The IB stack should be running with IPoIB configured.
   These steps assume that the user has administrative privileges.
2. Install the IB ACM package
   This installs ib_acm, and ib_acme.
3. Run ib_acme -A -O
   This will generate IB ACM address and options configuration files.
   (acm_addr.cfg and acm_opts.cfg)
4. Run ib_acm and leave running.
   ib_acm will eventually be converted to a service/daemon, but for now is a userspace application. Because ib_acm uses the libibumad interfaces, it should be run with administrative privileges.
5. Optionally, run ib_acme -s source_ip -d dest_ip -v
   This will verify that the ib_acm service is running.
6. Install librdmacm using the build option --with-ib_acm.
   The librdmacm will automatically use the ib_acm service.
   On failures, the librdmacm will fall back to normal resolution.

Details
-------
ib_acme:
The ib_acme program serves a dual role. It acts as a utility to test ib_acm operation and help verify if the ib_acm service and selected protocol are usable for a given cluster configuration. Additionally, it automatically generates ib_acm configuration files to assist with or eliminate manual setup.

acm configuration files:
The ib_acm service relies on two configuration files.

The acm_addr.cfg file contains name and address mappings for each IB device, port, pkey endpoint. Although the names in the acm_addr.cfg file can be anything, ib_acme maps the host name and IP addresses to the IB endpoints.

The acm_opts.cfg file provides a set of configurable options for the ib_acm service, such as timeout, number of retries, logging level, etc. ib_acme generates the acm_opts.cfg file using static information. A future enhancement would adjust options based on the current system and cluster size.

ib_acm:
The ib_acm service is responsible for resolving names and addresses to InfiniBand path information and caching such data. It is currently implemented as an executable application, but is a conceptual service or daemon that should execute with administrative privileges.

The ib_acm implements a client interface over TCP sockets, which is abstracted by the librdmacm library. One or more back-end protocols are used by the ib_acm service to satisfy user requests. Although the ib_acm supports standard SA path record queries on the back-end, it provides an experimental multicast resolution protocol in hope of achieving greater scalability. The latter is not usable on all fabric topologies, specifically ones that may not have reversible paths. Users should use the ib_acme utility to verify that multicast protocol is usable before running other applications.

Conceptually, the ib_acm service implements an ARP like protocol and either uses IB multicast records to construct path record data or queries the SA directly, depending on the selected route protocol. By default, the ib_acm service uses and caches SA path record queries.

Specifically, all IB endpoints join a number of multicast groups. Multicast groups differ based on rates, mtu, sl, etc., and are prioritized. All participating endpoints must be able to communicate on the lowest priority multicast group. The ib_acm assigns one or more names/addresses to each IB endpoint using the acm_addr.cfg file. Clients provide source and destination names or addresses as input to the service, and receive as
Re: [ewg] [PATCH] Handling busy responses from the SA
> This ensures that naïve IB applications cannot overwhelm the SA with queries, which could happen when a cluster is being rebooted, or when a large HPC application is started.

I don't object to the concept of treating a busy response as a timeout, but how does this help prevent overwhelming the SA? It continues to retry the queries, without adjusting the timeout specified by the user, even if the SA says that it's too busy to respond. I would think that you'd at least want to adjust the timeout (double it or use some random backoff).

The general guideline that we've been using for adjusting timeouts has been to report the failures and let the caller make the necessary adjustments. As far as I know, the only ways for user space applications to query the SA are through the librdmacm, which sets retries to 0, or through the libibumad interface directly. I would expect any application using the latter to be intelligent enough to handle a busy response.

Maybe we should re-think that guideline and allow users to simply indicate that the MAD layer should use reasonable defaults. This would enable the ib_mad module to adjust the timeout values for all consumers based on actual destination response times. It could also back off retrying multiple requests that were initiated around the same time, instead only retrying the first request, while simply increasing the timeout values for the others. This is more complex, but we should be able to start with something fairly simple.

- Sean
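As an illustration of the "double it" suggestion above, here is a minimal sketch in C of a timeout adjustment that treats a BUSY response the same as a timeout and doubles the wait up to a cap. The enum, the function name, and the 32-second ceiling are all assumptions made for this example; none of them come from the ib_mad module or from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for one SA query attempt. */
enum sa_result { SA_OK, SA_TIMEOUT, SA_BUSY };

/* Assumed ceiling so that repeated doubling cannot grow without bound. */
#define SA_TIMEOUT_MAX_MS 32000u

/* Return the timeout to use for the next attempt.
 * A BUSY response is handled exactly like a timeout: the wait is doubled,
 * saturating at SA_TIMEOUT_MAX_MS. A successful reply leaves it unchanged. */
static inline uint32_t sa_next_timeout(uint32_t cur_ms, enum sa_result r)
{
    assert(cur_ms > 0);
    if (r == SA_OK)
        return cur_ms;
    if (cur_ms > SA_TIMEOUT_MAX_MS / 2)
        return SA_TIMEOUT_MAX_MS;   /* doubling would exceed the cap */
    return cur_ms * 2;
}
```

Doubling alone still lets many nodes retry in lockstep, so in practice it would likely be combined with some randomization, as raised elsewhere in this thread.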
Re: [ewg] [PATCH] Handling busy responses from the SA
> A common method for handling this sort of thing is to randomize the retry timeout. It would be a good idea to randomize all timeouts, but the BUSY replies should probably randomize over a longer time period. Randomization prevents nodes in the cluster from self-synchronizing and making the load on the SA worse.

I agree that randomization would be nice, but I think we want even more than that. Part of the issue that we've seen with the current implementation is that when a large HPC job starts, everyone and their dog sends the SA a query. These time out around the same time and get resent, and the SA ends up processing a huge number of duplicates. The mad layer could be a lot more intelligent and avoid sending more than a handful (1?) of retries (or even initial requests) at a time until some complete.

- Sean
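The two ideas in this exchange, randomizing retry delays (over a longer window after BUSY) and capping how many requests are outstanding at once, can be sketched in C as follows. Every name and constant here is hypothetical and chosen only for illustration; the random value is passed in by the caller so the sketch stays deterministic and does not assume any particular random source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed jitter windows: BUSY replies spread retries over a longer period. */
#define RETRY_SPREAD_TIMEOUT_MS 500u
#define RETRY_SPREAD_BUSY_MS    4000u

/* Delay before the next retry: the base timeout plus a random offset in
 * [0, spread]. 'rnd' is any caller-supplied random 32-bit value.
 * Overflow of base_ms + spread is not handled in this sketch. */
static inline uint32_t retry_delay(uint32_t base_ms, int was_busy, uint32_t rnd)
{
    uint32_t spread = was_busy ? RETRY_SPREAD_BUSY_MS : RETRY_SPREAD_TIMEOUT_MS;
    return base_ms + rnd % (spread + 1);
}

/* Gate limiting how many requests to one destination are in flight. */
struct retry_gate {
    unsigned int inflight;      /* requests currently outstanding */
    unsigned int max_inflight;  /* "a handful (1?)" per the discussion */
};

/* Returns 1 and claims a slot if a send is allowed now, else 0. */
static inline int retry_gate_try_send(struct retry_gate *g)
{
    assert(g->max_inflight > 0);
    if (g->inflight >= g->max_inflight)
        return 0;
    g->inflight++;
    return 1;
}

/* Release a slot when a request completes or is abandoned. */
static inline void retry_gate_complete(struct retry_gate *g)
{
    assert(g->inflight > 0);
    g->inflight--;
}
```

A request refused by the gate would simply wait with a lengthened timeout rather than being resent, which is the duplicate-suppression effect described above.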
Re: [ewg] [ANNOUNCE] OFED 1.5.2 rc1 is available
>> Known issues:
>> librdmacm-1.0.12 compilation fails on RHEL4.x
>
> Sean, When will you fix the RHEL 4.x compilation issue?

I just saw the mail about it this morning. I'll look at it today.
[ewg] [PATCH] librdmacm: support 2.6.9
Redhat 4.x is based on 2.6.9. Add support for older kernels.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
This should fix the OFED build errors on RH 4.x. When testing this on a RH 4.x system, I noticed additional build warnings on 32-bit systems. I'll add a fix for these warnings separately.

 include/infiniband/ib.h |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/infiniband/ib.h b/include/infiniband/ib.h
index 3a97322..2e5029a 100644
--- a/include/infiniband/ib.h
+++ b/include/infiniband/ib.h
@@ -43,6 +43,16 @@
 #define PF_IB AF_IB
 #endif
 
+#ifndef __be16
+#define __be16 __u16
+#endif
+#ifndef __be32
+#define __be32 __u32
+#endif
+#ifndef __be64
+#define __be64 __u64
+#endif
+
 struct ib_addr {
     union {
         __u8 uib_addr8[16];
Re: [ewg] [PATCH] librdmacm: support 2.6.9
> This should fix the OFED build errors on RH 4.x. When testing this on a RH 4.x system, I noticed additional build warnings on 32-bit systems. I'll add a fix for these warnings separately.

There's a daily build of the librdmacm which should fix both of these issues. The build is at:

openfabrics.org/home/shefty/src/topdir/SOURCES/

Let me know if you need anything else. I'll release 1.0.13 closer to the actual OFED 1.5.2 release, once any other bug fixes have been incorporated.

- Sean
Re: [ewg] OFED-1.5.1 failure over iWarp
I'm just now getting around to this, and iwarp seems to be 100% dead on ofed-1.5.1. I see the same error running mvapich2 and rping. Has anybody looked into why? I will be looking at this today. Has anyone tested iwarp against the upstream kernel lately? 2.6.32 or 2.6.33? Woody suspects that the IPv6 patches, which were pulled into 1.5.1, may be a contributing factor.