Re: [ewg] EWG/OFED meeting minutes for today (Dec 13, 2011)
On 12/29/2011 07:21 AM, Steve Wise wrote:
> On 12/22/2011 02:26 PM, Edward Mascarenhas wrote:
>>> There is a suggestion from Eduard from QLogic to have the backports at different branches and not as today. He will start a mail thread on his suggestion.
>>
>> The attached file in pdf format has the proposal. It is also inlined in text below so it's easier to comment on. The pdf has 2 diagrams which should make it easier to understand the changes that are proposed. Comments are welcome.
>>
>> - Edward
>
> I like this approach. I would also do the same thing with kernel_addons so that each branch is just a specific backport. Are there any disadvantages to your proposal?
>
> Did you get a chance to investigate the other backport projects from this thread?
> https://lkml.org/lkml/2011/9/9/327

Some of us have been reviewing this work. We don't see how this could be used with distro kernels (which we would need for OFED).

Edward

> Steve.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] EWG/OFED meeting minutes for today (Dec 13, 2011)
Hi Vladimir and Steve,

Thanks for your comments. See below.

- Edward

--

> Hi Edward,
> The main disadvantage of this proposal is that it will require much more time for maintenance, which is a show stopper, IMHO: any update in the main tree will require checking out every branch with backports and making specific changes.

The proposed development method does have some additional steps which the individual developers will need to perform before sending in a pull request:

* If the change is applicable to all backport branches, the developer will create a commit in the master (or fixes) branch. This commit will contain actual code changes rather than a patch file. Once the commit has been made in the master (or fixes) branch, the developer will have to merge the new commit into the applicable backports. Effectively, this is done with two commands (executed per backport branch): git checkout branch and git merge master. This is easily scriptable. The result of the above process can be one of two possibilities:

  1. The commit merges cleanly to all backport branches. In this case, the developer is ready to send the pull request.
  2. The commit creates a merge conflict in one or more of the backport branches. In this case, the developer fixes the merge conflicts in all applicable backports and then submits the pull request.

  The second scenario does involve a little more manual maintenance work; however, I don't consider this a show-stopper for the following reasons:

  * Even with the current structure of the repo, developers have to make sure that the patches for the individual backports apply cleanly and are correct (effectively, the same work as fixing the merge conflict).
  * The work is performed by the individual developers and is thus distributed among all members. It does not have to be performed by a single individual/maintainer.
  * In our experience, merge conflicts are rare. Almost all commits will merge cleanly and will not require additional effort.

* If the change is only applicable to a few backport branches, the developer will have to make the change in each of those backport branches. This is effectively the same work that the developer has to do in the existing structure, with some simplification. The developer will not have to prepare the tree for the particular backport (just checking out the particular branch will do that). In addition, the developer will not have to create a patch file to be checked in, as changes are in the form of actual code.

The goal of the newly proposed structure and development method is that pull requests are only submitted on clean repositories. By the time the pull request is submitted, all commits should have been applied to the appropriate backport branches and all merge conflicts resolved in the source repository (source as in the repository to pull from). There should be minimal to no maintenance work required by the person pulling into the master OFED repository.

> In addition, a lot of backports can be shared by the different kernels/Distros and it is easily maintainable today. The suggested approach will lead to spending more work on backports as they will be developed separately.

In general, we consider sharing backport directories/branches for multiple kernels a bad idea. It creates confusion, and as soon as there is a difference, the two will need to be separated in some way. The newly proposed structure and development method will have just as many backport branches as there are backport directories in the current OFED repository. Checking out the correct backport branch could be done easily by using the 'ofed_scripts/get_backport_dir.sh' script in conjunction with 'git checkout'.

> Regarding the issues you bring in your proposal, there is a mechanism that makes it possible to check that all patches apply cleanly (ofed_scripts/ofed_makedist.sh) and a build script that checks the compilation on the OFA server. Also, ofed_makedist.sh creates a tarball per backport directory, which makes it easier to test/develop for this specific kernel. Each developer should check his code using these tools before sending a pull request.

There is a way for developers to check their work, but the fact that broken patches have been checked into the repo on multiple occasions indicates that the method isn't used for every patch.
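The per-branch "git checkout branch" / "git merge master" loop that the reply above calls easily scriptable could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name merge_into_backports and the "backport" branch-name prefix are hypothetical, not part of the OFED tree, and the sketch expects a clean working tree. On a conflict it aborts the merge and reports the branch rather than attempting any resolution.

```shell
#!/bin/sh
# Sketch: merge the integration branch into every backport branch and
# report which branches still need manual conflict resolution.
# Assumption: backport branches live under a hypothetical "backport/" prefix.

merge_into_backports() {
    src="${1:-master}"        # branch that holds the new commit
    prefix="${2:-backport}"   # hypothetical naming scheme for backport branches
    conflicted=""

    for branch in $(git for-each-ref --format='%(refname:short)' "refs/heads/${prefix}"); do
        git checkout --quiet "$branch" || return 2
        if git merge --quiet --no-edit "$src" >/dev/null 2>&1; then
            echo "clean: $branch"
        else
            # Restore the branch to its pre-merge state; the developer
            # resolves this one by hand before sending the pull request.
            git merge --abort 2>/dev/null
            conflicted="$conflicted $branch"
            echo "conflict: $branch"
        fi
    done

    git checkout --quiet "$src"
    # A non-zero status tells the caller that manual work remains.
    [ -z "$conflicted" ]
}
```

Running "merge_into_backports master" from the top of the repository prints one line per backport branch and returns non-zero if any branch conflicted, which corresponds to the second of the two outcomes listed in the reply.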
[ewg] ibdiagpath return code incorrect
It appears that ibdiagpath returns zero even when it hits an error; it should return non-zero in error cases, so that scripts calling ibdiagpath can use the return code to take appropriate action. Here is an example.

==
service0: # ibdiagpath -i 1 -d 1,12
Loading IBDIAGPATH from: /usr/lib64/ibdiagpath1.2
-W- Topology file is not specified.
    Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib64/ibdm1.2
-I- Using port 1 as the local port (since that is the output port based on the provided direct route).
-I---
-I- Traversing the path from local to destination
-I---
-I- From: lid=0x000c guid=0x0002c90200291949 dev=25218 service0/P1
-I- To: lid=0x001d guid=0x080069005264 dev=48438 Port=28
can't read PATH(1): no such element in array
service0:/tmp/src # echo $?
0
service0:/tmp/src #

Assuming this is a bug, I don't have a fix, but I'm hoping someone who is more familiar with the ibdiagpath code will provide one.

Thanks,
Edward
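Until the return code is fixed, a calling script could work around the behaviour reported above by checking the output as well as the exit status. The sketch below is only an assumption-laden illustration: run_checked is a hypothetical helper, the "can't read" pattern is taken from the transcript in the report, and the "-E-" pattern is assumed by analogy with the "-W-" and "-I-" prefixes shown there, not confirmed against the ibdiagpath source.

```shell
#!/bin/sh
# Workaround sketch: treat a run as failed if the exit status is non-zero
# OR the output contains an error marker.
# Assumption: the patterns are illustrative. "can't read" comes from the
# reported transcript; "-E-" is assumed by analogy with "-W-" and "-I-".

run_checked() {
    output=$("$@" 2>&1)
    status=$?
    printf '%s\n' "$output"

    # A genuine non-zero exit status is passed through unchanged.
    if [ "$status" -ne 0 ]; then
        return "$status"
    fi

    # The status was zero, so fall back to scanning the output for errors.
    if printf '%s\n' "$output" | grep -q -e "^-E-" -e "can't read"; then
        return 1
    fi
    return 0
}

# Example call, using the command line from the report:
#   run_checked ibdiagpath -i 1 -d 1,12 || echo "path check failed" >&2
```

Scanning the output is fragile compared with a correct return code, so this is best treated as a stopgap for scripts rather than a substitute for a fix in ibdiagpath itself.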
Re: [ofa-general] Re: [ewg] OFED March 24 meeting summary on OFED 1.4 plans
The SGI Altix ICE cluster system supports 2 InfiniBand fabrics.
http://www.sgi.com/products/servers/altix/ice/

Each compute node has 2 HCAs, and each HCA is connected to a separate fabric. We recommend that users use one fabric for storage traffic and the other for MPI, but there is no reason why both fabrics could not be used for MPI. OpenMPI requires a separate subnet prefix on each fabric in order to use both fabrics for MPI, and OpenSM supports setting the subnet prefix. Other MPIs do not require this.

Edward

on 04/04/2008 08:08 AM Tang, Changqing said the following:
> What I mean by "claim to support" is having more people test with this config.
>
> --CQ
>
> -----Original Message-----
> From: Or Gerlitz [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 03, 2008 11:18 PM
> To: Tang, Changqing
> Cc: [EMAIL PROTECTED]; ewg@lists.openfabrics.org
> Subject: Re: [ofa-general] Re: [ewg] OFED March 24 meeting summary on OFED 1.4 plans
>
> On Thu, Apr 3, 2008 at 5:40 PM, Tang, Changqing [EMAIL PROTECTED] wrote:
>> The problem is, from the MPI side (and by default), we don't know which port is on which fabric, since the subnet prefix is the same. We rely on the system admin to configure two different subnet prefixes for HP-MPI to work. No vendor has claimed to support this.
>
> CQ, not supporting a different subnet prefix per IB subnet is against IB nature. I don't think there should be any problem configuring a different prefix for each OpenSM instance, and the Linux host stack would work perfectly under this config. If you are aware of any problem in OpenSM and/or the host stack, please let the community know and the maintainers will fix it.
>
> Or.
Re: [ewg] OFED 1.3 timeline
Hi,

What is the expected timeline for the remaining RCs and GA of OFED 1.3?

Thanks,
Edward