[email protected] wrote @ http://lists.openfabrics.org/pipermail/general/2009-May/059730.html
> What would prevent a race between a tx completion (with an
> error) and the cleanup of a neighbour?
Okay, so maybe this code/design of using the stashed ipoib_neighbour in the tx completion code is the root cause of all these troubles?!

From a quick look at the code and at two patches that touched this area (f56bcd801... "Use separate CQ for UD send completions" and 57ce41d1... "Fix transmit queue stalling forever"), I see that the original tx cq handler, ipoib_ib_handle_tx_wc(), doesn't touch the neighbour, but today it is called only from the drain timer and dev-stop flows. Now ipoib_cm_handle_tx_wc() is called in the "normal" flow for both datagram and connected modes, and this function touches the neighbour.

I am not sure why commit f56bcd801... made UD completions go through ipoib_cm_handle_tx_wc(), nor why this function must use the neighbour to access the data structures it needs. Maybe Eli can comment on that?

Or.
