On Wed, 2010-07-28 at 11:16 -0700, Roland Dreier wrote:
> > > Actually, I tried to implement the completion callback
> > > in a workqueue thread but ipoib_cm_handle_tx_wc() calls
> > > netif_tx_lock() which isn't safe unless it is called
> > > from an IRQ handler or netif_tx_lock_bh() is called first.
> >
> > Oh, sounds like a bug in IPoIB. I guess we could fix it by just
> > changing it to netif_tx_lock_bh()? (Or is that not safe from an IRQ
> > handler?)
>
> Wait, is this still a problem with IPoIB? As far as I can tell, the
> IPoIB completion handlers don't do anything except enable the NAPI poll
> routine or the transmit ring timer (ie they just do napi_schedule() or
> mod_timer()), so the context that the CQ callback is called in doesn't
> matter. In particular I don't see any way ipoib_cm_handle_tx_wc() could
> be reached except from the NAPI polling loop.
>
> - R.

I don't remember now whether I hit the problem in a backported IPoIB or
in a recent kernel, but I did need to single-thread the completion
callbacks and call local_bh_disable() for them, or I would get
deadlocks. I just assumed that ULPs were being written with that as a
requirement.

This is what makes understanding the "locking conventions" for IPoIB
really complex. Sometimes you need a lock and sometimes you don't,
depending on the state of the network stack.
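
To make the pattern concrete, here is a rough userspace model of the workaround described above. It is not the actual driver code. The kernel calls local_bh_disable(), local_bh_enable(), netif_tx_lock() and netif_tx_unlock() are real, but here they are replaced by simplified stand-ins so the sketch compiles outside the kernel; ulp_handle_tx_wc(), completion_work_fn() and the counters are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct net_device; only models the TX lock. */
struct net_device {
    bool tx_locked;
};

/* Models the softirq-disable nesting count of the current context. */
static int bh_disable_depth;
/* Counts completions processed, for illustration only. */
static int completions_handled;

/* Userspace stand-ins for the kernel primitives of the same name. */
static void local_bh_disable(void) { bh_disable_depth++; }

static void local_bh_enable(void)
{
    assert(bh_disable_depth > 0);
    bh_disable_depth--;
}

static void netif_tx_lock(struct net_device *dev)
{
    /* The constraint discussed in the thread: the plain lock variant
     * expects bottom halves to be disabled already. */
    assert(bh_disable_depth > 0);
    assert(!dev->tx_locked);
    dev->tx_locked = true;
}

static void netif_tx_unlock(struct net_device *dev)
{
    assert(dev->tx_locked);
    dev->tx_locked = false;
}

/* Hypothetical ULP completion handler that takes the TX lock. */
static void ulp_handle_tx_wc(struct net_device *dev)
{
    netif_tx_lock(dev);
    completions_handled++;
    netif_tx_unlock(dev);
}

/* Hypothetical work function: runs in process context, so it disables
 * bottom halves around the callback before the TX lock is taken. */
static void completion_work_fn(struct net_device *dev)
{
    local_bh_disable();
    ulp_handle_tx_wc(dev);
    local_bh_enable();
}
```

Calling ulp_handle_tx_wc() directly in this model, without the local_bh_disable() wrapper, trips the first assertion in netif_tx_lock(), which is the userspace analogue of the unsafe call from a workqueue thread. The single-threading part is not modeled here; in the kernel that would presumably mean a single-threaded workqueue, but that is an assumption on top of what I described.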