Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
On Sat, 2007-09-22 at 13:53 -0700, David Brownell wrote: > > Sigh. I need a real deep look inside that code to understand, why > > tx_reqs is not a requestlist but a freelist. Very intuitive naming :) > > It *is* a list of requests: free ones -- the only kind this level of > driver is allowed to remember! ;) > > Yeah, I had to go back and read the driver again before I understood > just what problem this patch was trying to fix. Which is why I wanted > to make sure the mismatch between comments and contents was resolved. Fair enough. Thanks for sanitizing the comments. tglx - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
> Sigh. I need a real deep look inside that code to understand, why > tx_reqs is not a requestlist but a freelist. Very intuitive naming :) It *is* a list of requests: free ones -- the only kind this level of driver is allowed to remember! ;) Yeah, I had to go back and read the driver again before I understood just what problem this patch was trying to fix. Which is why I wanted to make sure the mismatch between comments and contents was resolved. - Dave == CUT HERE From: Benedikt Spranger <[EMAIL PROTECTED]> This patch fixes a longstanding race in the Ethernet gadget driver, which can cause an oops on device disconnect. The fix is just to make the TX path check whether its freelist is empty. That check is otherwise not necessary, since the queue is always stopped when that list empties (and restarted when request completion puts an entry back on that freelist). The race window starts when the network code decides to transmit a packet, and ends when hard_start_xmit() grabs the freelist lock. When disconnect() is called inside that window, it shuts down the TX queue and breaks the otherwise-solid assumption that packets are never sent through a TX queue that's stopped. Signed-off-by: Benedikt Spranger <[EMAIL PROTECTED]> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]> Signed-off-by: David Brownell <[EMAIL PROTECTED]> --- a/drivers/usb/gadget/ether.c +++ b/drivers/usb/gadget/ether.c @@ -1989,8 +1989,20 @@ static int eth_start_xmit (struct sk_buff *skb, struct net_device *net) } spin_lock_irqsave(&dev->req_lock, flags); + /* +* this freelist can be empty if an interrupt triggered disconnect() +* and reconfigured the gadget (shutting down this queue) after the +* network stack decided to xmit but before we got the spinlock. +*/ + if (list_empty(&dev->tx_reqs)) { + spin_unlock_irqrestore(&dev->req_lock, flags); + return 1; + } + req = container_of (dev->tx_reqs.next, struct usb_request, list); list_del (&req->list); + + /* temporarily stop TX queue when the freelist empties */ if (list_empty (&dev->tx_reqs)) netif_stop_queue (net); spin_unlock_irqrestore(&dev->req_lock, flags); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
On Sat, 2007-09-22 at 13:14 -0700, David Brownell wrote: > How's this? Note that the queue should already have been stopped, > so I removed what should be an extra call (as well as fixing the > comments). Yeah, stop queue should be not necessary. > - Dave > > > From: Thomas Gleixner <[EMAIL PROTECTED]> Please change to: From: Benedikt Spranger <[EMAIL PROTECTED]> He did all the grump work of figuring out what's going wrong. I was just the messenger. > This patch fixes a longstanding race in the Ethernet gadget driver, > which can cause an oops on device disconnect. The fix is just to > make the TX path check whether its freelist is empty. That check > is otherwise not necessary, since the queue is always stopped when > that list empties (and restarted when request completion puts an > entry back on that freelist). Sigh. I need a real deep look inside that code to understand, why tx_reqs is not a requestlist but a freelist. Very intuitive naming :) > The race window starts when the network code decides to transmit a > packet, and ends when hard_start_xmit() grabs the freelist lock. > If disconnect() is called inside that window, it shuts down the > TX queue and breaks the otherwise-solid assumption that packets are > never sent when the TX queue is stopped. Please add our signed offs as well Signed-off-by: Benedikt Spranger <[EMAIL PROTECTED]> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]> > Signed-off-by: David Brownell <[EMAIL PROTECTED]> Thanks, tglx > --- a/drivers/usb/gadget/ether.c > +++ b/drivers/usb/gadget/ether.c > @@ -1989,8 +1989,20 @@ static int eth_start_xmit (struct sk_buff *skb, struct > net_device *net) > } > > spin_lock_irqsave(&dev->req_lock, flags); > + /* > + * the freelist can be empty if an interrupt triggered disconnect() > + * and reconfigured the gadget (shutting down this queue) after the > + * network stack decided to xmit but before we got the spinlock. > + */ > + if (list_empty(&dev->tx_reqs)) { > + spin_unlock_irqrestore(&dev->req_lock, flags); > + return 1; > + } > + > req = container_of (dev->tx_reqs.next, struct usb_request, list); > list_del (&req->list); > + > + /* temporarily stop TX queue when the freelist empties */ > if (list_empty (&dev->tx_reqs)) > netif_stop_queue (net); > spin_unlock_irqrestore(&dev->req_lock, flags); > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
How's this? Note that the queue should already have been stopped, so I removed what should be an extra call (as well as fixing the comments). - Dave From: Thomas Gleixner <[EMAIL PROTECTED]> This patch fixes a longstanding race in the Ethernet gadget driver, which can cause an oops on device disconnect. The fix is just to make the TX path check whether its freelist is empty. That check is otherwise not necessary, since the queue is always stopped when that list empties (and restarted when request completion puts an entry back on that freelist). The race window starts when the network code decides to transmit a packet, and ends when hard_start_xmit() grabs the freelist lock. If disconnect() is called inside that window, it shuts down the TX queue and breaks the otherwise-solid assumption that packets are never sent when the TX queue is stopped. Signed-off-by: David Brownell <[EMAIL PROTECTED]> --- a/drivers/usb/gadget/ether.c +++ b/drivers/usb/gadget/ether.c @@ -1989,8 +1989,20 @@ static int eth_start_xmit (struct sk_buff *skb, struct net_device *net) } spin_lock_irqsave(&dev->req_lock, flags); + /* +* the freelist can be empty if an interrupt triggered disconnect() +* and reconfigured the gadget (shutting down this queue) after the +* network stack decided to xmit but before we got the spinlock. +*/ + if (list_empty(&dev->tx_reqs)) { + spin_unlock_irqrestore(&dev->req_lock, flags); + return 1; + } + req = container_of (dev->tx_reqs.next, struct usb_request, list); list_del (&req->list); + + /* temporarily stop TX queue when the freelist empties */ if (list_empty (&dev->tx_reqs)) netif_stop_queue (net); spin_unlock_irqrestore(&dev->req_lock, flags); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
On Sat, 2007-09-22 at 12:18 -0700, David Brownell wrote: > I think you misread my comment. Those requests are **NOT** pending!! > So this update has a *MORE* incorrect description of the issue. > > That's just the freelist ... it's a fairly conventional model whereby > there's a pool of "free" request slots which can be issued. When the > pool empties, the TX queue shuts down until one of the requests which > is pending in the hardware completes, and makes a slot free. > > The problem you're addressing is that there's a small window where a > disconnect IRQ can shut down the TX queue (and empty that freelist) > after upper layers in the network stack started a transmission on > an active (pre-disconnect) TX queue. > > That problem is *NOT* related to any pending requests at all!! Sorry, I misunderstood your comment. Can you please add the correct comment yourself before we play some more rounds of ping pong ? tglx - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
I think you misread my comment. Those requests are **NOT** pending!! So this update has a *MORE* incorrect description of the issue. That's just the freelist ... it's a fairly conventional model whereby there's a pool of "free" request slots which can be issued. When the pool empties, the TX queue shuts down until one of the requests which is pending in the hardware completes, and makes a slot free. The problem you're addressing is that there's a small window where a disconnect IRQ can shut down the TX queue (and empty that freelist) after upper layers in the network stack started a transmission on an active (pre-disconnect) TX queue. That problem is *NOT* related to any pending requests at all!! NAK... > From [EMAIL PROTECTED] Sat Sep 22 10:51:00 2007 > Subject: [PATCH] usb-gadget-ether: Prevent oops caused by error interrupt > race -V2 (comments update) > From: Thomas Gleixner <[EMAIL PROTECTED]> > Date: Sat, 22 Sep 2007 19:41:01 +0200 > > From: Benedikt Spranger <[EMAIL PROTECTED]> > > eth_start_xmit() can race against a disconnect interrupt in the gadget > device driver, which nukes all pending request. Right now we access the > pending request list unconditionally and dereference the request list > head itself in such a case, which results in an Oops. > > Check whether the list is empty before actually dereferencing > dev->tx_reqs.next. Also add a comment for the second list_empty check > further down to avoid confusion. > > Long standing bug. Patch should be applied to stable as well. > > Signed-off-by: Benedikt Spranger <[EMAIL PROTECTED]> > Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]> > > diff --git a/drivers/usb/gadget/ether.c b/drivers/usb/gadget/ether.c > index 593e235..f2a7bd5 100644 > ... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] usb-gadget-ether: Prevent oops caused by error interrupt race -V2 (comments update)
From: Benedikt Spranger <[EMAIL PROTECTED]> eth_start_xmit() can race against a disconnect interrupt in the gadget device driver, which nukes all pending request. Right now we access the pending request list unconditionally and dereference the request list head itself in such a case, which results in an Oops. Check whether the list is empty before actually dereferencing dev->tx_reqs.next. Also add a comment for the second list_empty check further down to avoid confusion. Long standing bug. Patch should be applied to stable as well. Signed-off-by: Benedikt Spranger <[EMAIL PROTECTED]> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]> diff --git a/drivers/usb/gadget/ether.c b/drivers/usb/gadget/ether.c index 593e235..f2a7bd5 100644 --- a/drivers/usb/gadget/ether.c +++ b/drivers/usb/gadget/ether.c @@ -1989,8 +1989,21 @@ static int eth_start_xmit (struct sk_buff *skb, struct net_device *net) } spin_lock_irqsave(&dev->req_lock, flags); + /* +* dev->tx_reqs may be empty. We raced against a disconnect +* interrupt in the gadget device driver, which nuked all +* pending requests. +*/ + if (list_empty(&dev->tx_reqs)) { + netif_stop_queue(net); + spin_unlock_irqrestore(&dev->req_lock, flags); + return 1; + } + req = container_of (dev->tx_reqs.next, struct usb_request, list); list_del (&req->list); + + /* last request in list: stop queue */ if (list_empty (&dev->tx_reqs)) netif_stop_queue (net); spin_unlock_irqrestore(&dev->req_lock, flags); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/