Bob Noseworthy wrote:
A bug for the observed IPoIB issue was logged last Friday, and updated yesterday confirming that RC3 still demonstrates the issue. This is logged as https://bugs.openfabrics.org/show_bug.cgi?id=1287

Further issues/observations from the recent OFA Interoperability Logo Group's September Interoperability Event are at the end of this email. Summary of reported IPoIB issue: If IPoIB datagram mode is enabled, and IP frames of 8K or larger are sent, and no ARP entry exists for the destination, then the first IP frame is always lost (ping used), no matter what the timeout is set to (as high as 15s).
Looking at the code, the issue you report seems to be related to the length of the internal queue that ipoib uses to hold skbs whose neighbour does not yet have an associated IB address handle (the L2 info needed for xmit):
drivers/infiniband/ulp/ipoib/ipoib.h:   IPOIB_MAX_PATH_REC_QUEUE  = 3,
drivers/infiniband/ulp/ipoib/ipoib_main.c: if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)
drivers/infiniband/ulp/ipoib/ipoib_main.c: skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
drivers/infiniband/ulp/ipoib/ipoib_main.c: if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)
The current code keeps up to three skbs and then drops all those that follow until a reply to the driver's path query is received from the SA. Unless I am missing something, this code has been there since day one (Q4/2005). Do you claim that this issue has not been observed with older code drops? I am cc-ing Roland, the maintainer of the driver, so you can check things with him.
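
The queue limit described above can be sketched as a small user-space C model. This is an illustration only, not the driver code: `struct pending_queue`, `enqueue_or_drop`, `simulate_burst` and `fragment_count` are hypothetical names, and only the constant `IPOIB_MAX_PATH_REC_QUEUE = 3` and the `len < IPOIB_MAX_PATH_REC_QUEUE` check come from the lines quoted above. The fragment arithmetic is one possible reading that neither message states explicitly: it assumes a datagram-mode IPoIB MTU of 2044 bytes and a 20-byte IP header without options, in which case an 8K ping splits into more fragments than the queue can hold while the path query is outstanding.

```c
#include <assert.h>

/* Value as quoted from drivers/infiniband/ulp/ipoib/ipoib.h. */
#define IPOIB_MAX_PATH_REC_QUEUE 3

/* Hypothetical stand-in for the queue of skbs waiting on path resolution. */
struct pending_queue {
    int len;     /* skbs held while the path is still unresolved */
    int dropped; /* skbs discarded because the queue was already full */
};

/* Mirrors the quoted check: hold the skb only while the queue is short. */
static int enqueue_or_drop(struct pending_queue *q)
{
    if (q->len < IPOIB_MAX_PATH_REC_QUEUE) {
        q->len++;
        return 1; /* queued */
    }
    q->dropped++;
    return 0;     /* dropped */
}

/* Simulate nskbs arriving before the SA answers; return how many are dropped. */
static int simulate_burst(int nskbs)
{
    struct pending_queue q = {0, 0};

    for (int i = 0; i < nskbs; i++)
        enqueue_or_drop(&q);
    return q.dropped;
}

/*
 * Number of IP fragments needed for ip_payload bytes over a link with the
 * given MTU. Assumes a 20-byte IP header; the per-fragment payload is
 * rounded down to a multiple of 8 as IP fragmentation requires.
 */
static int fragment_count(int ip_payload, int mtu)
{
    assert(mtu >= 28 && ip_payload >= 0);
    int per_frag = ((mtu - 20) / 8) * 8;
    return (ip_payload + per_frag - 1) / per_frag;
}
```

Under these assumptions, an 8192-byte ping payload plus the 8-byte ICMP header gives `fragment_count(8192 + 8, 2044) == 5`, and `simulate_burst(5)` reports 2 dropped skbs. A datagram that fits in three fragments or fewer would be held in full.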

The following is a short summary of various updates from the September OpenFabrics Interoperability Event. Due to confidentiality reasons, many details are occluded. Per the request of the IWG on Oct 14, this information is being shared with the EWG.

Testing is ongoing with RC3 and future 1.4RCs on a best effort basis until the GA, at which time the Logo Event will be held for those participating. If you have additional questions about these comments, the Interoperability Events, Logo Events, or the OFA Interoperability Test Plan, please feel free to contact us here at UNH-IOL
May I ask whose decision it was to test the Linux kernel RDMA stack in its "ofed" flavor, and what the reasoning behind it was? The mainline kernel IB/iWARP code is well maintained and has a small supporting developer community associated with it. The ofed kernel bits contain code that has not yet been accepted into the upstream kernel, so you are actually testing not the product delivered by the OFA maintainers but rather a different creature. Are you aware of that?


Or.
