On Wed, 2010-07-28 at 11:16 -0700, Roland Dreier wrote:
> > > Actually, I tried to implement the completion callback
> > > in a workqueue thread but ipoib_cm_handle_tx_wc() calls
> > > netif_tx_lock() which isn't safe unless it is called
> > > from an IRQ handler or netif_tx_lock_bh() is called first.
>
> > Oh, sounds like a bug in IPoIB.  I guess we could fix it by just
> > changing it to netif_tx_lock_bh()?  (Or is that not safe from an IRQ
> > handler?)
> 
> Wait, is this still a problem with IPoIB?  As far as I can tell, the
> IPoIB completion handlers don't do anything except enable the NAPI poll
> routine or the transmit ring timer (ie they just do napi_schedule() or
> mod_timer()), so the context that the CQ callback is called in doesn't
> matter.  In particular I don't see any way ipoib_cm_handle_tx_wc() could
> be reached except from the NAPI polling loop.
> 
>  - R.

I don't remember now whether I hit the problem in a backported IPoIB
or in a recent kernel, but I did need to single-thread the completion
callbacks and call local_bh_disable() around them, or I would get
deadlocks. I just assumed that ULPs were being written with that as a
requirement.
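
As a rough illustration of that workaround, here is a userspace model,
not the actual driver code. The local_bh_disable()/local_bh_enable()
definitions below are stubs standing in for the kernel primitives, and
comp_handler_t, call_comp_handler_bh_disabled(), record_depth_handler()
and bh_disable_depth are hypothetical names made up for this sketch.
The single-threading part is not modelled; it only shows the callback
being invoked with bottom halves disabled.

```c
/* Userspace model only; assumes nothing about the real driver. */
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's softirq-disable nesting count. */
int bh_disable_depth;

/* Stubs mimicking the nesting behaviour of the kernel primitives. */
void local_bh_disable(void)
{
	bh_disable_depth++;
}

void local_bh_enable(void)
{
	assert(bh_disable_depth > 0);
	bh_disable_depth--;
}

/* Hypothetical completion-handler type for this sketch. */
typedef void (*comp_handler_t)(void *cq_context);

/*
 * Hypothetical wrapper: invoke a completion handler from process
 * context (e.g. a workqueue) with bottom halves disabled, so that
 * code assuming softirq-safe context inside the handler holds.
 */
void call_comp_handler_bh_disabled(comp_handler_t handler, void *cq_context)
{
	local_bh_disable();
	handler(cq_context);
	local_bh_enable();
}

/* Example handler: records the BH-disable depth it was called with. */
void record_depth_handler(void *cq_context)
{
	*(int *)cq_context = bh_disable_depth;
}
```

Calling call_comp_handler_bh_disabled(record_depth_handler, &seen)
leaves seen at 1 (the handler ran with BHs disabled) and the depth
back at 0 afterwards.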

This is what makes understanding the "locking conventions" for
IPoIB really complex. Sometimes you need a lock and sometimes
you don't, depending on the state of the network stack.
