Has there been any new interoperability testing between the iWARP vendors since Oct 08?
Ranjit

On Tue, Oct 21, 2008 at 9:40 AM, Bob Noseworthy <[email protected]> wrote:
> Greetings EWG members,
> A bug for the observed IPoIB issue was logged last Friday, and updated
> yesterday confirming that RC3 still demonstrates the issue. This is logged
> as #1287 -- https://bugs.openfabrics.org/show_bug.cgi?id=1287
>
> Further issues/observations from the recent OFA Interoperability Logo
> Group's September Interoperability Event are at the end of this email.
>
> Summary of reported IPoIB issue:
> If IPoIB datagram mode is enabled, and IP frames of 8K or larger are sent,
> and no ARP entry exists for the destination, then the first IP frame is
> always lost (ping used), no matter what the timeout is set to (as high as
> 15s)
>
>
> The following is a short summary of various updates from the September
> OpenFabrics Interoperability Event. Due to confidentiality reasons, many
> details are occluded. Per the request of the IWG on Oct 14, this
> information is being shared with the EWG.
>
> ==================
>
>
> Below are rough notes from our testers, principally Nick Wood and Mike
> Hagen.
>
> IB update;
>
> 1. An SDP issue was observed once and not reproduced - suspected to be an
> issue with starting testing too soon after netserver was started while all
> three SDP tests were running simultaneously. When retesting was performed,
> tests were not run simultaneously and no issues were seen.
>
> 2. An SRP issue was observed once and not reproduced - A vendor's SRP target
> was seen to become unresponsive when srp_sg_tablesize was increased to 255.
> Subsequent testing did not reproduce this behavior but it is still being
> pursued.
>
> 2a. A vendor's HCA was seen to perform slowly on SRP transfers. This was
> traced to the default srp_sg_tablesize of 16, which had to be
> increased to 131 for reasonable performance. Reminder - performance is
> outside the scope of the Logo program.
> Tziporet - this default value perhaps
> could be increased as recommended unless there is a reason 16 is preferred.
>
>
>
> 3. There is a link issue between two vendors' HCA cards. The fix that
> was introduced allowed the link indication light to come up; however,
> ibdiagnet never completes (hangs at IPoIB subnets check) and had to be
> killed. Ibdiagnet also reports the following error:
>
>
> -I---------------------------------------------------
> -I- PM Counters Info
> -I---------------------------------------------------
> -E- Could not get PM info:
> "pmGetPortCounters 0xffff 1" failed 4 consecutive times.
> -E- Could not get PM info:
> "pmGetPortCounters 0xffff 1" failed 4 consecutive times.
> -I- No illegal PM counters values were found
>
> This happens with both VendorA cards when linked to any speed card from
> VendorB *without* an sm running. If there is an sm running and the fix is in
> place on the machines housing the VendorA cards then everything works
> flawlessly when linked with any speed VendorB card.
>
> Upon removal of the cable from the VendorA card, that card gets put into a
> bad state; with the fix in place and an sm running. The sm does not activate
> the newly established link. This happened with VendorA cards to any VendorB
> card. OpenSM also reports an error on screen; OpenSM: SM port is down.
> Reestablishing the connection that was in place when the opensm instance was
> started restores the active state.
>
> One final bit of information that I have been able to glean. It does not
> appear to matter if you restore the original connection that the opensm was
> started on. The only connection that brings the card back to an active state
> is if you link it with a qdr hca even if that connection was not the
> original. If you then attempt to restore the original the active state will
> not be restored.
> Currently this issue is presumed to be principally a vendor matter, but if
> evidence points to additional issues with ibdiagnet, or other OFED matters,
> then bugs will be filed.
>
>
> 4. Similar to the above issue, it was observed that two vendors' HCAs that
> should link at DDR when directly connected were actually linking at SDR
> speeds, regardless of the cable used. This is a known issue; however, it
> seems to be a failure of the Link Init test procedure, as the highest common
> speed is not achieved.
>
> 5. An issue with ibdiagnet was discovered by a vendor and bugs submitted
> (unrelated to issue 3 above)
>
> ==================
>
> iWARP update;
>
> 1. "dapltest -T P" will not work between two cards. Each has
> implemented a different peer2peer protocol that ensures that a client does a
> transfer before the server, to overcome the limitation in the iWARP standard
> that says a client must send first data or the connection must be torn
> down.
>
> 2. The section in the IWG test suite covering dapl must be updated to
> include at least some reference to /etc/dat.conf which must be configured in
> order to use any dapl based application including many MPIs and dapltest.
> (This was being addressed by Arlin Davis)
>
> 3. dapl2.0 and dapltest2.0 do not work with iWARP devices. From the base
> OFED1.4 install dapl2.0-utils must be uninstalled and compat-dapl must be
> installed from the OFED website.
>
> 4. Due to the dapl problems, Intel MPI works in single vendor environments
> but will not work in multi-vendor environments.
>
> 5. The default OpenMPI installed with OFED 1.4 is version 1.2.7. iWARP
> support is officially not added until OpenMPI 1.3.
>
> 6. Loopback functionality is still not seen by all vendors. (this has
> relevance to OFED feature enhancement #1275
> <https://bugs.openfabrics.org/show_bug.cgi?id=1275>)
>
> 7. Dynamic links support was not seen by all vendors when using Intel MPI.
>
>
> ==================
> ==================
>
>
> Testing is ongoing with RC3 and future 1.4RCs on a best effort basis until
> the GA, at which time the Logo Event will be held for those participating.
> If you have additional questions about these comments, the
> Interoperability Events, Logo Events, or the OFA Interoperability Test
> Plan, please feel free to contact us here at UNH-IOL, our OFA
> Interoperability Logo Group team can be reached at [email protected].
> The testplan, logo list and past logo reports can be reviewed at
> http://www.iol.unh.edu/services/testing/ofa/
>
>
> Best Regards,
> - Bob Noseworthy
> Chief Engineer / Technical Sherpa
> +1-909-891-0090 {unified phone number for office, cell, etc}
> +1-603-862-0090 {IOL Main number-associate this with any shipments}
> UNH-IOL
>
>
> Rupert Dance wrote:
>>
>> I have sent another reminder to UNH IOL to get this logged. I will
>> continue to follow up on this.
>>
>> Thanks
>>
>> Rupert
>> -----Original Message-----
>> From: Tziporet Koren [mailto:[email protected]]
>> Sent: Sunday, October 19, 2008 8:48 AM
>> To: Rupert Dance
>> Cc: EWG
>> Subject: Have you opened bugs to OFED 1.4
>>
>> I mean the bugs you explained in the last OFED meeting.
>>
>> Thanks
>> Tziporet
>>
>
> _______________________________________________
> ewg mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
