On 25/05/20(Mon) 12:56, Gerhard Roth wrote: > On 5/22/20 9:05 PM, Mark Kettenis wrote: > > > From: Łukasz Lejtkowski <emig...@gmail.com> > > > Date: Fri, 22 May 2020 20:51:57 +0200 > > > > > > Probably power supply 12 V is broken. Showing 16,87 V(Fluke 179) - > > > too high. Should be 12,25-12,50 V. I replaced to the new one. > > > > That might be why the device stops responding. The fact that cleaning > > up from a failed USB transaction leads to this panic is a bug though. > > > > And somebody just posted a very similar panic with ure(4). Something > > in the network stack is holding a mutex when it shouldn't. > > I think that holding the mutex is ok. The bug is calling the stop > routine in case of errors. > > This is what common foo_start() does: > > m_head = ifq_deq_begin(&ifp->if_snd); > if (foo_encap(sc, m_head, 0)) { > ifq_deq_rollback(&ifp->if_snd, m_head); > ... > return; > } > ifq_deq_commit(&ifp->if_snd, m_head); > > Here, ifq_deq_begin() grabs a mutex and it is held while > calling foo_encap(). > > For USB network interfaces foo_encap() mostly does this: > > err = usbd_transfer(sc->sc_xfer); > if (err != USBD_IN_PROGRESS) { > foo_stop(sc); > return EIO; > } > > And foo_stop() calls usbd_abort_pipe() -> xhci_command_submit(), > which might sleep. > > How to fix? We could do the foo_encap() after the ifq_deq_commit(), > possibly dropping the current mbuf if encap fails (who cares > for the packets after foo_stop() anyway).
That's the approach taken by drivers using ifq_dequeue(9) instead of ifq_deq_begin/commit(). > Or change all the drivers to follow the path that if_aue.c takes: > > err = usbd_transfer(c->aue_xfer); > if (err != USBD_IN_PROGRESS) { > ... > /* Stop the interface from process context. */ > usb_add_task(sc->aue_udev, &sc->aue_stop_task); > return (EIO); > } That's just trading the current problem for another one with higher complexity. > Any ideas, what's better? Or alternative proposals? Using ifq_dequeue(9) would have the advantage of unifying the code base. It introduces a behavior change. A simpler fix would be to call foo_stop() in the error path after ifq_deq_rollback().