Re: [ewg] EWG/OFED meeting June 7, 2010 meeting minutes
Tziporet - The fix so PSM doesn't try to build on unsupported systems was submitted last week, so that should be covered. I was a bit surprised to see it raised as an issue here, but you might not have had a chance to catch up. Also, we have done some testing on OFED 1.5.2, including with PSM, and that is going well.

- Betsy

On Tue, 2010-06-08 at 01:44 -0700, Tziporet Koren wrote:

Meeting summary:
1. OFED 1.5.2 is progressing as planned
2. We plan to have RC2 on Thursday this week (Jun 10, 2010)

Meeting details:

1. OFED 1.5.2 - features status
- Add new OSes:
  - RHEL 5.5 - done
  - SLES11 SP1 - Jeff Becker volunteered to do it
  - Add RHEL6 beta - done
- Update the management package - new package was provided (not final)
- Update with new libibverbs 1.1.4 from Roland - in progress
- Add-on packages that do not touch the core:
  - Qlogic wishes to add the PSM library
  - Need to fix the PSM library so it does not build on systems that are not supported: ia64 and PPC - Betsy should be responsible for this
- New libehca tarball - done
- iWarp Multicast Acceleration (IBV_QPT_RAW_ETH) - done
- Add IBV_QPT_RAW_ETH for mlx4 - Voltaire - in discussion between Voltaire and Mellanox. Moni to coordinate the change for the nes driver on RAW Eth QP
- ACM - Sean - done
- uDAPL package with bug fixes - better support for RoCE - done
- SDP Zcopy in GA - in progress - toward completion
- Critical bug fixes - ongoing

2. OFED 1.5.2 testing status - all
Voltaire - will start more testing after RC2
Intel - Woody - continue testing - all good so far
Nes - have one issue that they should fix
IBM - not much testing. Will start on RC2
Qlogic - no info
HP - not testing
Mellanox - regression is running, focused on SDP
Open: Is anyone interested in adding more kernel.org support beyond the 2.6.32 we already support?

3. Schedule:
Beta - May 3 - done - used in the interop. Report on OFED issues will be provided.
RC1 - May 31 - done
RC2 - Jun 10
RC3 - Jun 22
GA - Jun 29

Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] EWG/OFED meeting today - I am not feeling well
What are the critical topics for the meeting today? Do we need to have one, or can we wait till the next one? If there's a list of topics, I can chair the meeting. Otherwise, I suggest we cancel.

- Betsy

-----Original Message-----
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On Behalf Of Tziporet Koren
Sent: Monday, May 10, 2010 8:18 AM
To: OpenFabrics EWG
Subject: [ewg] EWG/OFED meeting today - I am not feeling well

Hi all,

I am not feeling well, so I will not be able to participate in and manage the EWG meeting today. If someone else is willing to take the lead today, please handle the meeting without me. Sorry for the late notice.

Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] EWG/OFED meeting today [CANCELLED] - I am not feeling well
Given there was no response to the request for agenda items, and no one has dialed in, the EWG meeting is cancelled today.

- Betsy

-----Original Message-----
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On Behalf Of Tziporet Koren
Sent: Monday, May 10, 2010 8:18 AM
To: OpenFabrics EWG
Subject: [ewg] EWG/OFED meeting today - I am not feeling well

Hi all,

I am not feeling well, so I will not be able to participate in and manage the EWG meeting today. If someone else is willing to take the lead today, please handle the meeting without me. Sorry for the late notice.

Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] qperf maintainer
I spoke with Johann George, and he confirmed that he still owns qperf. Much of the work can be done on the TCP stack. The one place where he will occasionally need help is with RDMA-specific issues, as he doesn't have that equipment readily available to him. We've worked with him to resolve a couple of RDMA-related issues, and I know he welcomes working with others.

- Betsy

From: ewg-boun...@openfabrics.org [ewg-boun...@openfabrics.org] On Behalf Of Steve Wise [sw...@opengridcomputing.com]
Sent: Monday, April 12, 2010 8:41 AM
To: OpenFabrics EWG
Subject: [ewg] qperf maintainer

Does Johann George still own qperf? If not, who does?

Thanks,
Steve.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] What's remaining to GA OFED 1.4.2?
Vladimir - thanks, that sounds good. Will this final build also remove the user visible references to RC2 (they seemed still to be there in yesterday's build) and refer to itself as OFED 1.4.2 GA? Thanks, Betsy On Tue, 2009-08-04 at 01:19 -0700, Vladimir Sokolovsky wrote: Betsy Zeller wrote: It looks as though Jon submitted his changes last week, and has hopefully had time to test them. What still remains to be done so we can release OFED 1.4.2 as GA? Thanks, Betsy Hi Betsy. I built OFED-1.4.2-20090802-0245.tgz with latest patches and we are running regression tests on it. We should finish till Thursday and if no new issues comparing to OFED-1.4.1 will be found, I will release OFED-1.4.2. Hope this schedule is OK with you. Regards, Vladimir ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] What's remaining to GA OFED 1.4.2?
It looks as though Jon submitted his changes last week, and has hopefully had time to test them. What still remains to be done so we can release OFED 1.4.2 as GA? Thanks, Betsy ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
RE: [ewg] Meeting today? (EOM)
Trying to get on, but I don't think any of us has the administrator's meeting number. Do you have it?

-Betsy

From: ewg-boun...@lists.openfabrics.org [ewg-boun...@lists.openfabrics.org] On Behalf Of Jeff Squyres (jsquyres) [jsquy...@cisco.com]
Sent: Monday, July 27, 2009 8:42 AM
To: ewg@lists.openfabrics.org
Subject: Re: [ewg] Meeting today? (EOM)

Yes. I have not been doing my Monday morning announcement anymore because, with the details in the Outlook invite, my reminders served little purpose. Also, I can't be there today. :)

-jms
Sent from my PDA. No type good.

From: ewg-boun...@lists.openfabrics.org
To: ewg@lists.openfabrics.org
Sent: Mon Jul 27 11:34:31 2009
Subject: [ewg] Meeting today? (EOM)

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Notes from EWG meeting 07/27/09
Here's the notes from today's EWG meeting. (7/27/09) 1) There are (at least) two backport related build breakages. QLogic will fix the VNIC build breakage this week. Oracle has not responded on a plan to fix the RDS build breakages. 2) Bill Boas will follow up with Andy Grover at Oracle to find out about future support/development plans for RDS, and will report back to the group. 3) Jack from Mellanox will follow up with Vladimir to remove unsupported distros from OFED 1.5. 4) Jon Mason has asked Vladimir to pull the fix for bug #1678. He'll then test with the nightly build, and verify the fix. Once he's gotten back to us on his results, and others have had a chance to test, we can declare GA for OFED 1.4.2. 5) In her message last week, Tziporet reminded us all that the folks from the distros would really like to see all user level libraries and services do their own releases, rather than just making pull requests, and using git tags to mark them. We'll follow up in the next meeting to get the list of which user level software may need to add this style of packaging. 6) We had a short discussion as to how we might improve the testing for new, non-driver features, but no conclusion was reached. This will be on the agenda for a future meeting. We'll also discuss in the next meeting what's remaining before we can declare feature freeze. Anyone on the call this morning - please add or update this list if you think I missed anything. - Betsy -- Betsy Zeller Director of Software Engineering NSG InfiniBand Engineering QLogic Corporation 205 Ravendale Mountain View, CA, 94043 1-650-934-8088 ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] infinipath rpms???
Rafaella - you can download rpms from the QLogic website to install on top of OFED 1.4 which provide support for PSM, and all the PSM-enabled MPIs. I'll give you the details off-line. - Betsy On Thu, 2009-06-25 at 19:03 -0700, rdau...@ucla.edu wrote: Hello All, I need to build, in a shared cluster location, openMPI 1.3.2 with PSM support (on top of OFED 1.4.1) since some of the nodes on our mixed fabric IB cluster have Qlogic InfiniPath_QLE7240 HCAs. I have noticed that the OFED 1.4.1 stack does not contain the infinipath rpms such as: infinipath-2.3-5347.923_rhel5_qlc.x86_64.rpm, infinipath-libs-2.3-5347.923_rhel5_qlc.x86_64.rpm, and mpi-devel-2.3-5347.923_rhel5_qlc.noarch.rpm and I think that without these rpms you can't build openmpi with PSM support. Are you not supporting the Qlogic InfiniPath_QLE7240 HCAs? Can I use the RPMs generated with the qlogic provided QLogicIB-Basic.4.4.1.0.37 package INSTALL script (even though they refer to the OFED 1.4 stack and the openmpi 1.2.8)? Please notice that without the PSM support openMPI on qlogic InfiniPath_QLE7240 cards fails on AlltoAll MPI calls (see my previous message need HELP with: error polling LP CQ with status RETRY EXCEEDED ERROR status number 12). While the PSM support takes care of the problem. Thanks, Raffaella. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: [Fwd: [Bug 1596] openibd stop failed when nfs is loaded]
Jeff - it would be great if you could work on this. I don't think we can ship with this bug open. Thanks, Betsy On Thu, 2009-05-07 at 10:01 -0700, Jeff Becker wrote: Hi all. I haven't heard any further discussion on this. Should we fix this for 1.4.1? I'm happy to work on this. Thanks. -jeff Jeff Becker wrote: Jon Mason wrote: On Wed, May 06, 2009 at 09:28:49AM -0700, Jeff Becker wrote: Hi Jon. I can take a crack at this, or do you have a solution already in mind? Thanks. The issue is whether or not the user will unmount the NFS dirs before stopping the openibd. If it is an NFSRDMA mount, they'll have to anyway. So I do not see the problem. I think the problem is that the dependence is there even if they are running plain NFS (not RDMA) and IB, and (for whatever reason) the OFED NFS modules got built (replacing the standard ones). That's what needs to be fixed. -jeff -jeff Original Message Subject: [Bug 1596] openibd stop failed when nfs is loaded Date: Tue, 5 May 2009 20:47:09 -0500 From: bugzilla-dae...@lists.openfabrics.org bugzilla-dae...@lists.openfabrics.org To: Becker, Jeffrey C. (ARC-TN)[COMPUTER SCIENCES CORPORATION] jeffrey.c.bec...@nasa.gov https://bugs.openfabrics.org/show_bug.cgi?id=1596 betsy.zel...@qlogic.com changed: What|Removed |Added Severity|major |critical Priority|P3 |P2 --- Comment #10 from betsy.zel...@qlogic.com 2009-05-05 18:47 --- I know we discussed this in Monday's meeting, but it is a really bad issue. Although we don't expect customers to often reload their driver, it does happen sometimes, and forcing them to reboot isn't really a reasonable workaround. It also makes testing almost unmanageable. I believe we have to fix this for 1.4.1, so I'm raising the priority and severity on this. -Betsy -- Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee. 
___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Notes from Sonoma OFED/Distros BOF
There were several discussions at Sonoma covering OFED, and how we can improve the integration of OFED into the distros. Here are the notes from the BOF we ran. Other BOF attendees, please add anything I missed.

--

Proposals people offered included:

1) Don't ship MPIs with OFED (need to get buy-in from the MVAPICH team)
2) Ensure that user level libraries that ship in OFED reflect what is in the maintainer's version of the code
3) If a HW vendor ships a slightly different version of SW from what is in the OFED release, give it a different release identifier, preferably one which indicates the company making the change, so there are no conflicts.
4) If the contents of any tarball change, the version number of that tarball needs to change as well. The fact that this is not universally true makes it much more difficult for the distro vendors to make sure they have the right software in their release.
5) One person suggested that we package the userland and kernel components of OFED, *minus* the hardware drivers. That way, any vendor could take a tested version of OFED, and ship a new release of their driver to go with it.
6) There are 231 bugs open against pre OFED 1.4.1 releases. If you opened a bug on a previous release, please check to see if it is still there with OFED 1.4.1. If it isn't, please close the bug. If it is, please update the bug to indicate that it is still there.

Actions we agreed on included:

1) Identify all user components which are not packaged as tarballs and notify them that it will be a condition of inclusion in OFED 1.5 that they are packaged as tarballs.
2) Ensure that each of the tarballs for OFED 1.5 has a unique version number, and that version numbers are updated appropriately as the tarball contents change.
3) Ensure that each user level component that depends on a specific kernel feature actually checks for the existence of that kernel feature, and politely declines to compile/install if that feature is not available.
4) Investigate making the OFED install work as non-root - Doug has offered tutorials using BUILD_ROOT and chroot mechanisms to do non-root builds.

Betsy

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
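As an illustration of action item 3 in the BOF notes (check for a kernel feature, and decline politely if it is missing), here is a minimal C sketch. The function names, the message text, and the example path are assumptions made for this illustration only; none of them come from OFED or from the notes.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the given path exists, 0 otherwise. */
int feature_path_present(const char *path)
{
    return access(path, F_OK) == 0;
}

/*
 * Hypothetical check for a user level component.
 * Returns 0 if the kernel feature appears to be available, or 1 if the
 * component should decline to build or install. The path to probe is an
 * assumption: a caller might pass something like "/sys/class/infiniband".
 */
int check_kernel_feature(const char *component, const char *path)
{
    if (feature_path_present(path))
        return 0;
    fprintf(stderr, "%s: required kernel support not found at %s, skipping\n",
            component, path);
    return 1;
}
```

A build or install script could call check_kernel_feature() early and skip the component on a nonzero result, rather than failing later at load time.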
Re: [ewg] OFED Jan 5, 2009 meeting minutes on OFED plans
Given that we GA'd OFED 1.3 on Dec 10/08, I'm pretty uncomfortable with scheduling our OFED 1.4 release almost a full 8 months later. If I remember right, our goal is to have two OFED releases a year. How about a schedule as follows: Feature Freeze: 3/25/09 Alpha Release: 3/25/09 Beta Release: 4/20/09 RC1: 4/06/09 RC2: 4/20/09 RC3: 5/04/09 RC4: 5/18/09 RC5: 6/01/09 RC6: 6/15/09 Release: 6/28/09 Or, we could be somewhat more firm about what constitutes a feature freeze, make that later, and go with less RCs. But, one way or another, I think we really need to tighten up the schedule. Betsy On Tue, 2009-01-06 at 23:41 +0200, Tziporet Koren wrote: OFED Jan 5, 2009 meeting minutes on future plans = Meeting minutes on the web: http://www.openfabrics.org/txt/documentation/linux/EWG_meeting_minutes Meeting Summary: == 1. OFED 1.4.1 release: We will look into it on middle of February 2. OFED 1.5: - Release target date is July 09 - Kernel base will be 2.6.29 - Features list - not closed yet Details: == 1. Conclusions from OFED 1.4 release: We got several comments from Doug (Redhat): - ammaso driver is not compiling - we should remove it from the kernel sources since its not compile - IBM ehca driver - does not compile on RHEL4 - Improvements for the build system - grep for undefined functions and report this since they will fail when trying to load the module Other conclusions: - Real feature freeze is only after we are on the right kernel base - We usually have at least 6 RCs - need to plan for this on the schedule 2. Do we wish to have OFED 1.4. 1: Pros: We can add the following features: - RDS iWARP - Steve Wise - NFS/RDMA backports - Steve Wise - Open MPI 1.3 - Jeff S. - Supporting new OSes: RHEL 5.3, SLES 11 - Fixes of fatal bugs - if found Cons: - Having 1.4.1 release will delay 1.5 schedule for one month or more. - A large QA effort. Decision: Since no fatal bugs were reported so far, we decide to revisit this in middle of Feb. 
Meanwhile Steve can work on RDS iWARP support and NFS/RDMA backports on the 1.4.1 branch. People that wish to work with Open MPI 1.3 can download it from its site. 3. OFED 1.5: Schedule and features. Schedule: Release is planned for July. (assuming no 1.4.1 release) Feature Freeze: 4/20/09 Alpha Release:4/20/09 Beta Release: 5/20/09 RC1: 5/05/09 RC2: 5/19/09 RC3: 6/02/09 RC4: 6/16/09 RC5: 6/30/09 RC6: 7/14/09 Release: 7/28/09 Features: * Kernel.org: 2.6.28 and 2.6.29 * Multiple Event Queues to support Multi-core CPUs * NFS/RDMA - GA * RDS support for iWARP * OpenMPI 1.3 * Add support/backports for RedHat EL 5.3 and EL 4.8, SLES 11 * Support for Mellanox vNIC (EoIB) and FCoIB with BridgeX device * SDP - performance improvements * Mellanox suggested to add IB over Eth - this is similar to iWARP but more like IB (e.g. including UD), and can work over ConnectX. A concern was raised by Intel (Dave Sommers) since it is not a standard transport. Decision: This request will be raised in the MWG, and they should decide if OFA can support it. More discussions on 1.5 features will be done in next meeting. Tziporet ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] OFED Nov 3 2008 meeting summary on OFED 1.4 status
Vlad - Yannick Cote will check in a fix tonight for 1326, plus some other driver fixes. Our internal testing indicates that this will also fix 1283. We do not currently have a fix for 1242, but are continuing to work on it. Given that IBM's fix isn't scheduled to come in till tomorrow, and that I haven't yet seen a response from the folks at Voltaire, I'd recommend holding off RC4 till Monday, Nov 10.

Thanks,
Betsy

On Wed, 2008-11-05 at 21:29 +0200, Vladimir Sokolovsky wrote:

Meeting Summary:
==
RC4 is delayed - will be released on Thursday Nov 6.

Details:
===
Bugs to be fixed in RC4:

ID    Severity  Pri  OS      Assignee           Status    Summary
1283  blocker   P1   RHEL 5  [EMAIL PROTECTED]  NEW       Intel MPI fails on Qlogic HCA
1326  blocker   P1   RHEL 4  [EMAIL PROTECTED]  NEW       ipath driver fails to build on IA64 in the 10/28/08 daily build
1335  major     P3   Other   [EMAIL PROTECTED]  NEW       Bonding: packet lost during failover
1301  major     P3   RHEL 4  [EMAIL PROTECTED]  NEW       Can not load rds module on RH4 up7
1323  blocker   P1   All     [EMAIL PROTECTED]  REOPENED  IB/ehca: possibility of kernel panic under certain circumstances
1242  critical  P2   RHEL 4  [EMAIL PROTECTED]  NEW       kernel panic while running mpi2007 against ofed1.4 -- ib_ipath: ipath_sdma_verbs_send
1336  critical  P1   RHEL 5  [EMAIL PROTECTED]  NEW       Can't unload the mlx4_ib module on ppc64

Hi all,

I see that the number of critical issues has not decreased. Do you think we should delay RC4 till Monday Nov 10, or do you expect these issues will be fixed by tomorrow?

Regards,
Vladimir

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] OFED-1.3.2-20080728-0355.tgz issues (now talking about warnings)
In last Monday's OFED meeting, we talked about the importance of resolving the 1800+ warnings currently generated when the nightly build is installed/built on an RHEL system. SLES seems just as bad. This many warnings makes it pretty much impossible to notice when a critical one appears - in fact, it's pretty likely that there are important ones we are already missing.

Also, we talked in Sonoma about all the things we needed to do to truly make OFED an Enterprise release. This is one of them.

Can the owners of the various components take a stab at getting rid of their warnings, both for SLES and RHEL? It would help us all to be able to look clearly at what's left when the ones that are just noisy have been cleaned up. The target we discussed in Monday's meeting was to get these cleaned up by the OFED 1.4 RC3 release. Let's give it a try.

- Betsy

On Thu, 2008-09-11 at 23:42 +0300, Eli Cohen wrote:

Whether it showed up as a problem or not, the compiler warning was an array out of bounds warning. That's not the type of compiler warning that should be ignored as it almost always points to a bug. I guess I was a little surprised that even though the compile tests on the kernel pass, that you guys allow that type of warning to go unchecked. I make a habit out of reviewing kernel compile warnings on the code I maintain and, when possible, I fix all the warnings just so things like this get caught.

I totally agree with you; this warning must have got lost in the multitude of compiler messages. Anyway, I think I prefer this fix to the same problem:

Index: ofed_kernel-fixes/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
===
--- ofed_kernel-fixes.orig/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2008-09-08 13:07:02.0 +0300
+++ ofed_kernel-fixes/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2008-09-08 13:08:41.0 +0300
@@ -234,7 +234,7 @@ int ipoib_transport_dev_init(struct net_
 		if (i < UD_POST_RCV_COUNT - 1)
 			priv->rx_wr_draft[i].next = &priv->rx_wr_draft[i + 1];
 	}
-	priv->rx_wr_draft[i].next = NULL;
+	priv->rx_wr_draft[UD_POST_RCV_COUNT - 1].next = NULL;
 
 	if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
 		for (i = 0; i < UD_POST_RCV_COUNT; ++i) {

What do you think?

If you're going to keep the setting of the last item to NULL outside the loop, then you can also remove the if inside the loop as you'll just overwrite the last entry when you exit the loop.

Well, yes, but then I am going to reference an entry outside the bounds of the array, which we want to prevent in the first place. In

	priv->rx_wr_draft[i].next = &priv->rx_wr_draft[i + 1];

without the if, priv->rx_wr_draft[i + 1] would reference priv->rx_wr_draft[UD_POST_RCV_COUNT] on the last iteration.

--
Betsy Zeller
Director of Software Engineering
HSG InfiniBand Engineering
QLogic Corporation
2071 Stierlin Court, Suite 200
Mountain View, CA, 94043
1-650-934-8088

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
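The off-by-one discussed in this thread can be reproduced outside the kernel. The following is a minimal sketch, not the ipoib code: struct recv_wr, POST_RCV_COUNT, and link_recv_chain are hypothetical stand-ins for the driver's work request type, UD_POST_RCV_COUNT, and the loop in ipoib_transport_dev_init. It links each entry to its successor without ever forming the address of an element past the end of the array, and terminates the chain at the last valid index, as the proposed patch does.

```c
#include <assert.h>
#include <stddef.h>

#define POST_RCV_COUNT 4 /* hypothetical stand-in for UD_POST_RCV_COUNT */

/* Hypothetical stand-in for the receive work request type. */
struct recv_wr {
    struct recv_wr *next;
};

/*
 * Link wr[0] -> wr[1] -> ... -> wr[POST_RCV_COUNT - 1] -> NULL.
 * The loop stops one short of the end, so wr[i + 1] is always a valid
 * element, and the terminator is written with a constant index rather
 * than with the loop counter.
 * Returns the number of entries reachable from wr[0].
 */
size_t link_recv_chain(struct recv_wr *wr)
{
    size_t i;
    size_t reachable = 0;
    const struct recv_wr *p;

    for (i = 0; i < POST_RCV_COUNT - 1; ++i)
        wr[i].next = &wr[i + 1];
    wr[POST_RCV_COUNT - 1].next = NULL;

    /* Walk the chain to confirm it covers exactly the array. */
    for (p = &wr[0]; p != NULL; p = p->next)
        ++reachable;
    return reachable;
}
```

Indexing the terminator by the constant keeps the last write in bounds regardless of how the loop above it is later edited, which is the property the array-bounds warning was pointing at.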
[ewg] RE: List of libraries in OFED
I've gotten a lot of feedback that the division of libraries into private and public is quite helpful. From what I've heard, there are currently applications using:
- libopensm
- libosmcomp
- libosmvendor
- libibcommon

Now that it is well understood that these libraries are intended to be private, developers can move away from using them. But, in the meantime, it would be helpful if any major planned changes in these could be posted to the list.

I've also heard it suggested that it would be easier to avoid some issues with private libraries if they were not in the standard compiler search path. There are pros and cons to deciding to move them, but I thought I would mention the suggestion.

Thanks for putting together and publishing the list.

- Betsy

On Thu, 2008-07-17 at 14:07 +0300, Tziporet Koren wrote:

This is a reminder to review the OFED public libraries. Specifically, Qlogic requested to review it before I publish it on the web.

Thanks
Tziporet

Public:
===
* libdat2
* libdat
* libibcm
* libibverbs
* libibmad
* libibumad
* libsdp
* librdmacm

Private:
* libcxgb3
* libehca
* libipathverbs
* libmlx4
* libmthca
* libnes
* libibdmcom
* libdaplcma
* libdaplofa
* libibdm
* libibis
* libibmscli
* libopensm
* libosmcomp
* libosmvendor
* libosmvendor_openib
* libumad2sim
* libibcommon

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Compatibility in OFED
Sasha - one of the things we discussed at the Sonoma conference was evolving our development process to emphasize the Enterprise in Open Fabrics Enterprise Distribution. Part of that is minimizing disruption when new versions of OFED are installed. We want customers to feel comfortable upgrading to new versions of OFED, and that's not going to be true if they get surprised by other (previously working) software breaking after installation.

What John is asking for here (at least deliver both versions of the library for some period of time, and preferably don't break backward compatibility at all) seems pretty reasonable from the user's perspective, though I acknowledge that actually doing this requires extra effort.

Separately, we should discuss how we manage version changes - introducing a version change in the middle of the RCs seems a bit late in the process.

Regards,
Betsy

On Mon, 2008-06-02 at 23:17 +0300, Sasha Khapyorsky wrote:

On 09:55 Mon 02 Jun, John Russo wrote:

In OFED 1.3rc2 we noticed that libosmcomp's version had changed from 1 to 2. Unfortunately this has caused forward compatibility problems for existing applications which were compiled against OFED 1.2.5.1. It would be preferred if, when such upgrades occur, both the .1 and .2 versions of the library are provided so that existing applications need to be neither recompiled nor reinstalled.

What is the problem with recompiling? Anyway, you can keep the old version if you like.

Alternatively, limiting the necessity for such library version changes (never break old interfaces) would be preferred.

It is hard to promise 100%. libosmcomp is not a great candidate for a stable API (it is not even documented). I'm sure that the changes were done for a reason, and obviously it was published/discussed on the list.

Sasha

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Notes from Improve the OFED Development Process at Sonoma
Here are the notes from the Improve the OFED Development Process session at Sonoma, including issues, proposals, questions and comments, collected from the session. I wanted to get the notes out in front of everyone before we forget what we actually talked about. Once people have had a chance to review them, we can follow up and determine how to execute. (Note, I'm out of town and away from email till next Monday, so hopefully other panelists will respond to questions and comments - no, I didn't plan it that way!)

- Betsy

Issues

1) How do we ensure we achieve enterprise quality?

PROPOSALS:
a) Evaluate features early in the process, so everyone is aware of potential impact - e.g., API changes, performance impact on other components (NOTE: we need a checklist)
b) Make it clear to all participants that OFED is not intended to be an experimental release. Components can be marked as tech preview, but it is not OK to put experimental changes in already released protocols.
c) Authors of changes to ULPs, or SW that may affect many HW vendors, should submit a list of potentially affected HW, along with a list of which HW they have tested on.

2) What do we do to minimize slippage of release date?

PROPOSALS:
a) Clear definition of feature vs bugfix
b) Clear statement and communication of feature freeze date
c) Review proposed features, including what impact they might have on other components, such as drivers or ULPs
d) Especially for late bug fixes, the implementor has to be responsible for making sure the change doesn't impact other SW components, or other vendors' HW.

PROPOSAL: Make OFED a date driven release. Implies that for the last period (4 weeks?) of the release, only specific bug fixes are allowed.

3) Process for fast turnaround, single vendor bugfix.

DOCUMENT on Open Fabrics webpage:
- submit patch
- get nightly build
- vendor tests
- all recognize that sub-minor (4 release numbers) means that this is for one vendor
- or, can we actually use something like 1.3.1.1.q to indicate a bugfix release from QLogic (NOTE: need decision)
- fix needs to be rolled into next point release
- Download page from openfabrics.org has only fully qualified, tested releases

4) QUESTION: Does kernel code always need to come through kernel.org?

PROPOSALS:
a) SDP is an exception, as existing non-conforming - it will not be submitted to kernel.org
b) IB kernel developers will try to get changes into the kernel first, to be pulled down into OFED. If they miss the kernel train, changes should be submitted to the next available kernel. Developers should try to make sure their changes are in Roland's queue, even if they are not actually in the kernel.
c) Components (other than SDP) which are not currently in the kernel should be submitted there. (e.g., RDS - what else?)

QUESTION: What happens if a component is not accepted to the kernel?
- if it is because of code style/quality - fix it
- if not considered an appropriate feature for the kernel - discuss it, but how do we resolve it? Note that we really don't want a component which is not in the kernel, and which has no long term owner
- other possible reason for rejection is that it is legally inappropriate, that is, either wrong license/copyright, or disputed patent, or something similar. This should not be taken into OFED.

QUESTION: How do we make sure fix-up patches get submitted to the upstream kernel?

COMMENT: All can benefit from the kernel review process.

COMMENT: We don't want to end up in a situation where the same functionality is handled one way in OFED, and another way in the kernel

5) Improve Housekeeping

PROPOSAL: Review licenses/copyrights for correctness

6) Enabling Distros

QUESTION: Is it true that RHEL doesn't take all of OFED 1.3, but instead pulls directly from the maintainers? (NOTE: Need followup)

QUESTION: Is it true that RHEL and SUSE write their own backports, rather than using the OFED ones? (concern for us because of testing)

QUESTION: How, exactly, are RHEL and SUSE distros put together? e.g., does the distro just pull IB kernel support from the relevant kernel, rather than taking OFED kernel components?

QUESTION: How do we coordinate OFED and distro releases, to maximize the opportunity for the most recent OFED release to go in the newest distro release?

PROPOSAL for packaging change:
a) Aggregate kernel patches and modules into one package
b) User code
- use tar-balls and sample RPM spec files for releases and RCs
- use git (for daily builds) and pull script
- distributors roll their own

7) Planning for interoperability events

PROPOSAL: Do initial interoperability testing on final RC (only expected changes are interop changes), and do formal testing on GA.

COMMENTS:
- the best way to reach enterprise quality is to submit everything through the kernel, because of their rigorous review process
- note that submitting to Roland's queue does not automatically mean the change goes into OFED - you have to specifically request that it go in.
Re: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness
One of the things driving the OFED 1.3 date is that OFED 1.3 has to be released before the Plugfest, which starts on March 10. I can deal with slipping OFED GA date to Feb 25, but I really don't think we should let it slip into March. How confident are the developers that, if they get the extra week, there won't be further slippage? Doug - thanks very much for letting us know the plans for RHEL5 U2 - it's great news that OFED 1.3 (final release) will be included. - Betsy On Wed, 2008-01-30 at 18:40 +0200, Tziporet Koren wrote: Doug Ledford wrote: Hmmm...I'd like to put my $.02 in here. I don't have any visibility into what drives the OFED schedule, so I have no clue as to why people don't want to slip the schedule for this change. I'm sure you guys have your reasons. However, I also happen to be a consumer of this code, and I know for a fact that no one has gotten my input on this issue. So, the deal is that I'm currently integrating OFED 1.3 into what will be RHEL5.2. The RHEL5.2 freeze date has already passed, but in order to keep what finally goes out from being too stale, I'm being allowed to submit the OFED-1.3-rc1 code prior to freeze, and then update to OFED-1.3 final during our beta test process. What this means, is that anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2 and RHEL4.7. So, that being said, there's a whole trickle down effect with various groups that would really like to be able to use 5.2 out of the box that may prefer a slip in 1.3 so that this can be part of it instead of punting to 1.3.1. I'm not saying this will change your mind, but I'm sure it wasn't part of the decision process before, so I'm bringing it up. 
Thanks for the input (BTW you are welcome to join our weekly meetings and give us feedback online) I think it is important to make sure RH new versions will include best OFED release This my suggestion is: * Delay 1.3 release in a week * Do RC4 next week - Feb 6 * Add RC5 on Feb 18 - this will be the GOLD version * GA release on Feb 25 All - please reply if this is acceptable 760 major [EMAIL PROTECTED] UDP performance on Rx is lower than Tx - for 1.3.1 761 major [EMAIL PROTECTED] Poor and jittery UDP performance at small messages - for 1.3.1 Ditto for requesting these two be in 1.3. We've already had customers bring up the UDP performance issue in our previous releases. We will push some fixes of these to RC4 if the above plan is accepted Tziporet ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[Fwd: Re: [ewg] OFED October 29 meeting summary on OFED 1.3 beta readiness]
I sent this out last night, but noticed it didn't appear in my inbox, so trying different address. Tziporet, can you let me know you got this? Thanks, Betsy Forwarded Message From: Betsy Zeller [EMAIL PROTECTED] To: [EMAIL PROTECTED], Tziporet Koren [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [ewg] OFED October 29 meeting summary on OFED 1.3 beta readiness Date: Tue, 30 Oct 2007 15:59:35 -0700 Tziporet, Vlad - I know the request went out on Monday to have all kernel backport patches etc updated to work with 2.6.24 by Wednesday. Unfortunately, due to other schedule commitments, we're not going to be able to turn those around in two days. My best estimate at the moment is that we will be able to submit the updated InfiniPath related backport patches by Friday. Regards, Betsy -- Betsy Zeller Director of Software Engineering HSG InfiniBand Engineering QLogic Corporation 2071 Stierlin Court, Suite 200 Mountain View, CA, 94043 1-650-934-8088 ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] OFED October 29 meeting summary on OFED 1.3 beta readiness
Tziporet, Vlad - I know the request went out on Monday to have all kernel backport patches etc updated to work with 2.6.24 by Wednesday. Unfortunately, due to other schedule commitments, we're not going to be able to turn those around in two days. My best estimate at the moment is that we will be able to submit the updated InfiniPath related backport patches by Friday. Regards, Betsy -- Betsy Zeller Director of Software Engineering HSG InfiniBand Engineering QLogic Corporation 2071 Stierlin Court, Suite 200 Mountain View, CA, 94043 1-650-934-8088 ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg