Re: [PATCH] usb: gadget: u_ether: remove interrupt throttling
On Tue, Nov 01, 2016 at 01:29:59PM +0200, Felipe Balbi wrote: > According to Dave Miller "the networking stack has a > hard requirement that all SKBs which are transmitted > must have their completion signalled in a fininte > amount of time. This is because, until the SKB is > freed by the driver, it holds onto socket, > netfilter, and other subsystem resources." > > In summary, this means that using TX IRQ throttling > for the networking gadgets is, at least, complex and > we should avoid it for the time being. > > Cc: <sta...@vger.kernel.org> > Reported-by: Ville Syrjälä <ville.syrj...@linux.intel.com> > Suggested-by: David Miller <da...@davemloft.net> > Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com> Fixes laggy/hangy g_ether on my BYT FFRD8 tablet, caused by commit 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit") Tested-by: Ville Syrjälä <ville.syrj...@linux.intel.com> > --- > drivers/usb/gadget/function/u_ether.c | 8 > 1 file changed, 8 deletions(-) > > diff --git a/drivers/usb/gadget/function/u_ether.c > b/drivers/usb/gadget/function/u_ether.c > index f4a640216913..119a2e5848e8 100644 > --- a/drivers/usb/gadget/function/u_ether.c > +++ b/drivers/usb/gadget/function/u_ether.c > @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb, > > req->length = length; > > - /* throttle high/super speed IRQ rate back slightly */ > - if (gadget_is_dualspeed(dev->gadget)) > - req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH || > -dev->gadget->speed == USB_SPEED_SUPER)) > && > - !list_empty(>tx_reqs)) > - ? ((atomic_read(>tx_qlen) % dev->qmult) != 0) > - : 0; > - > retval = usb_ep_queue(in, req, GFP_ATOMIC); > switch (retval) { > default: > -- > 2.10.1 -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote: > > Hi, > > Ville Syrjälä <ville.syrj...@linux.intel.com> writes: > > On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote: > >> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote: > >> > > >> > Hi, > >> > > >> > Felipe Balbi <felipe.ba...@linux.intel.com> writes: > >> > >> > Okay, I have found a regression on dwc3 and another patch follows: > >> > > >> > commit 5e1a2af3e46248c55098cdae643c4141851b703e > >> > Author: Felipe Balbi <felipe.ba...@linux.intel.com> > >> > Date: Wed Oct 5 14:24:37 2016 +0300 > >> > > >> > usb: dwc3: gadget: properly account queued requests > >> > > >> > Some requests could be accounted for multiple > >> > times. Let's fix that so each and every requests is > >> > accounted for only once. > >> > > >> > Cc: <sta...@vger.kernel.org> # v4.8 > >> > Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs > >> > for LST bit") > >> > Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com> > >> > > >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > >> > index 07cc8929f271..3c3ced128c77 100644 > >> > --- a/drivers/usb/dwc3/gadget.c > >> > +++ b/drivers/usb/dwc3/gadget.c > >> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, > >> > req->trb = trb; > >> > req->trb_dma = dwc3_trb_dma_offset(dep, trb); > >> > req->first_trb_index = dep->trb_enqueue; > >> > +dep->queued_requests++; > >> > } > >> > > >> > dwc3_ep_inc_enq(dep); > >> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, > >> > > >> > trb->ctrl |= DWC3_TRB_CTRL_HWO; > >> > > >> > -dep->queued_requests++; > >> > - > >> > trace_dwc3_prepare_trb(dep, trb); > >> > } > >> > > >> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 > >> > *dwc, struct dwc3_ep *dep, > >> > unsigned ints_pkt = 0; > >> > unsigned inttrb_status; > >> > > >> > -dep->queued_requests--; > >> > dwc3_ep_inc_deq(dep); > >> > + > >> > +if (req->trb == trb) > >> > +dep->queued_requests--; > >> > + > >> > trace_dwc3_complete_trb(dep, trb); > >> > > >> > /* > >> > > >> > I have also built a branch which you can use for testing. Here's a pull > >> > request, once you tell me it works for you, then I can send proper > >> > patches out: > >> > > >> > The following changes since commit > >> > c8d2bc9bc39ebea8437fd974fdbc21847bb897a3: > >> > > >> > Linux 4.8 (2016-10-02 16:24:33 -0700) > >> > > >> > are available in the git repository at: > >> > > >> > git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git > >> > tmp-test-v4.8 > >> > > >> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca: > >> > > >> > usb: dwc3: gadget: properly account queued requests (2016-10-06 > >> > 10:16:37 +0300) > >> > > >> > > >> > Felipe Balbi (2): > >> > usb: gadget: function: u_ether: don't starve tx request queue > >> > usb: dwc3: gadget: properly account queued requests > >> > > >> > drivers/usb/dwc3/gadget.c | 7 --- > >> > drivers/usb/gadget/function/u_ether.c | 5 +++-- > >> > 2 files changed, 7 insertions(+), 5 deletions(-) > >> > >> Tried your branch, but unfortunately I'm still seeing the lags. New trace > >> attached. > > > > Just a reminder that the regressions is still there in 4.9-rc2. > > Unfortunateyly with all the stuff already piled on top, getting USB > > working on my device is no longer a simple matter of reverting the > > one co
Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote: > On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote: > > > > Hi, > > > > Felipe Balbi <felipe.ba...@linux.intel.com> writes: > > > Okay, I have found a regression on dwc3 and another patch follows: > > > > commit 5e1a2af3e46248c55098cdae643c4141851b703e > > Author: Felipe Balbi <felipe.ba...@linux.intel.com> > > Date: Wed Oct 5 14:24:37 2016 +0300 > > > > usb: dwc3: gadget: properly account queued requests > > > > Some requests could be accounted for multiple > > times. Let's fix that so each and every requests is > > accounted for only once. > > > > Cc: <sta...@vger.kernel.org> # v4.8 > > Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for > > LST bit") > > Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com> > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > > index 07cc8929f271..3c3ced128c77 100644 > > --- a/drivers/usb/dwc3/gadget.c > > +++ b/drivers/usb/dwc3/gadget.c > > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, > > req->trb = trb; > > req->trb_dma = dwc3_trb_dma_offset(dep, trb); > > req->first_trb_index = dep->trb_enqueue; > > + dep->queued_requests++; > > } > > > > dwc3_ep_inc_enq(dep); > > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, > > > > trb->ctrl |= DWC3_TRB_CTRL_HWO; > > > > - dep->queued_requests++; > > - > > trace_dwc3_prepare_trb(dep, trb); > > } > > > > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 > > *dwc, struct dwc3_ep *dep, > > unsigned ints_pkt = 0; > > unsigned inttrb_status; > > > > - dep->queued_requests--; > > dwc3_ep_inc_deq(dep); > > + > > + if (req->trb == trb) > > + dep->queued_requests--; > > + > > trace_dwc3_complete_trb(dep, trb); > > > > /* > > > > I have also built a branch which you can use for testing. Here's a pull > > request, once you tell me it works for you, then I can send proper > > patches out: > > > > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3: > > > > Linux 4.8 (2016-10-02 16:24:33 -0700) > > > > are available in the git repository at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8 > > > > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca: > > > > usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 > > +0300) > > > > > > Felipe Balbi (2): > > usb: gadget: function: u_ether: don't starve tx request queue > > usb: dwc3: gadget: properly account queued requests > > > > drivers/usb/dwc3/gadget.c | 7 --- > > drivers/usb/gadget/function/u_ether.c | 5 +++-- > > 2 files changed, 7 insertions(+), 5 deletions(-) > > Tried your branch, but unfortunately I'm still seeing the lags. New trace > attached. Just a reminder that the regressions is still there in 4.9-rc2. Unfortunateyly with all the stuff already piled on top, getting USB working on my device is no longer a simple matter of reverting the one commit. I had to revert 10 patches to get even a clean revert, but unfortunately that approach just lead to the transfer hanging immediately. So what I ended up doing is reverting all of this: git log --no-merges --format=oneline 55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig which comes out at whopping 70 commits. Having to carry that around is going to be quite a pain especially as more stuff might be piled on top. /me thinks a stable backport of any fix (assuming one is found eventually) is going to be quite the challenge... -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote: > > Hi, > > Ville Syrjälä <ville.syrj...@linux.intel.com> writes: > > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote: > >> > >> Hi, > >> > >> ville.syrj...@linux.intel.com writes: > >> > From: Ville Syrjälä <ville.syrj...@linux.intel.com> > >> > > >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c. > >> > > >> > commit 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for > >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp > >> > is very sluggish and can even stall entirely. Revert cures it. > >> > > >> > Cc: Felipe Balbi <felipe.ba...@linux.intel.com> > >> > Cc: sta...@vger.kernel.org > >> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com> > >> > >> Reverting that commit is not the best idea. Let's start with the usual: > >> > >> - Kernel version > >> - dmesg on both sides (host and device) > >> - dwc3 tracepoints: > >> > >> # mkdir -p /t > >> # mount -t tracefs none /t > >> # echo 8192 > /t/buffer_size_kb > >> # echo 1 > /t/events/dwc3/enable > >> # echo 0 > /t/events/dwc3/dwc3_readl/enable > >> # echo 0 > /t/events/dwc3/dwc3_writel/enable > >> > >> This should be enough to tell me what's really going on. > > > > I've attached the (compressed due to size) dmesgs/traces from the device. > > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh > > session, and I did get the lag in the bad run, so presumably the trace > > should have caught it. Both were on 4.8 (+ the revert for the good run). > > > > Host side dmesg just showes me: > > usb 3-3.4: new high-speed USB device number 25 using xhci_hcd > > cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-:00:14.0-3.4, > > CDC Ethernet Device, ... > > when I boot the device. Nothing extra appears when things lag around. > > I'm running 4.5.something on the host. > > Looking at your logs what I can see is that "revert.trace" doesn't > respect gadget driver's IRQ throttling, which means you get an interrupt > for each and every request, while "bad.trace" shows dwc3 respecting > "no_interrupt" flag passed in by g_ether. > > The only explanation I have for this behavior when we start respecting > gadget driver's flags, is that there's a possible race in u_ether.c > where it can go for a long time without an interrupt due to that > throttling. > > This only happens with IN endpoints (the ones which are throttled) and > the behavior can be seen in log snippet below: > > 27782: irq/17-dwc3-2524 [000] d... 666.891969: tx_complete: req > 88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5 > 27802: ksoftirqd/0-3 [000] .Ns1 666.892047: eth_start_xmit: req > 88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5 > 27808: ksoftirqd/0-3 [000] .Ns2 666.892062: eth_start_xmit: req > 88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5 > 27814: ksoftirqd/0-3 [000] .Ns2 666.892069: eth_start_xmit: req > 88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5 > 27820: ksoftirqd/0-3 [000] .Ns2 666.892075: eth_start_xmit: req > 88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5 > 27826: ksoftirqd/0-3 [000] .Ns2 666.892082: eth_start_xmit: req > 88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5 > 27832: ksoftirqd/0-3 [000] .Ns2 666.892088: eth_start_xmit: req > 88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5 > 27884: irq/17-dwc3-2524 [000] d... 666.892213: tx_complete: req > 88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5 > 27889: irq/17-dwc3-2524 [000] d... 666.892215: tx_complete: req > 88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5 > 27897:sshd-2684 [002] ..s1 666.892223: eth_start_xmit: req > 88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5 > 27898: irq/17-dwc3-2524 [000] d... 666.892223: tx_complete: req > 88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5 > 27903: irq/17-dwc3-2524 [000] d... 666.892225: tx_complete: req > 88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5 > 27910:sshd-2684 [002] ..s1 666.892230: eth_start_xmit: req > 88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5 > 27913: irq/17-dwc3-2524 [000] d...
Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"
On Mon, Sep 07, 2015 at 09:56:06AM +0300, Heikki Krogerus wrote: > Hi, > > On Tue, Sep 01, 2015 at 06:37:54PM +0300, Ville Syrjälä wrote: > > On Tue, Sep 01, 2015 at 10:17:59AM -0500, Felipe Balbi wrote: > > > Hi, > > > > > > On Tue, Sep 01, 2015 at 05:39:28PM +0300, Ville Syrjälä wrote: > > > > On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote: > > > > > Hi, > > > > > > > > > > On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote: > > > > > > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote: > > > > > > > Hi, > > > > > > > > > > > > > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote: > > > > > > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote: > > > > > > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, > > > > > > > > > ville.syrj...@linux.intel.com wrote: > > > > > > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com> > > > > > > > > > > > > > > > > > > > > This reverts commit > > > > > > > > > > 8f2c9544aba636134303105ecb164190a39dece4. > > > > > > > > > > > > > > > > > > > > As it breaks g_ether on my Baytrail FFRD8 device. > > > > > > > > > > Everything starts out > > > > > > > > > > fine, but after a bit of data has been transferred it just > > > > > > > > > > stops > > > > > > > > > > flowing. > > > > > > > > > > > > > > > > > > > > Note that I do get a bunch of these "NOHZ: > > > > > > > > > > local_softirq_pending 08" > > > > > > > > > > when booting the machine, but I'm not really sure if > > > > > > > > > > they're related > > > > > > > > > > to this problem. > > > > > > > > > > > > > > > > > > I have a feeling your problem is elsewhere. We *are* > > > > > > > > > completing one TRB > > > > > > > > > at a time. By reverting that commit you're just masking the > > > > > > > > > real problem > > > > > > > > > and I'd rather get that one fixed. > > > > > > > > > > > > > > > > > > How do you reproduce your issue ? > > > > > > > > > > > > > > > > Just boot the system, it gets an IP from dnsmasq on my host, > > > > > > > > then I ssh > > > > > > > > into it and do something to produce a bit of console output, > > > > > > > > after which > > > > > > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it. > > > > > > > > > > > > > > which kernel version ? > > > > > > > > > > > > Anything since the patch went in, so 4.1-rc > > > > > > > > > > > > > Running as USB2 or USB3 ? > > > > > > > > > > > > speed:480, so USB2 I presume? > > > > > > > > > > > > > Have you tried > > > > > > > linux-next ? > > > > > > > > > > > > Tried it now (next-20150901). Equally bad as the rest. > > > > > > > > > > > > > I just did 1000 dmesg iterations over ssh with g_ether and > > > > > > > saw no issues. > > > > > > > > > > > > > > Can you enable dwc3 tracepoints and try again ? (use some very > > > > > > > large > > > > > > > trace buffer, something around 2 or 4 MiB should be enough). > > > > > > > > > > > > Attached one trace from linux-next, and another one with the revert > > > > > > on > > > > > > top. > > > > > > > > > > are you sure these come from next ? > > > > > > > > Yep. > > > > > > > > > It makes zero sense :-) Here's an > > > > > odd snippet: > > > > > > > > > > | sshd-1719 [000] d.s342.579785: dwc3_e
Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"
On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote: > Hi, > > On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote: > > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote: > > > Hi, > > > > > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote: > > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote: > > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, > > > > > ville.syrj...@linux.intel.com wrote: > > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com> > > > > > > > > > > > > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4. > > > > > > > > > > > > As it breaks g_ether on my Baytrail FFRD8 device. Everything starts > > > > > > out > > > > > > fine, but after a bit of data has been transferred it just stops > > > > > > flowing. > > > > > > > > > > > > Note that I do get a bunch of these "NOHZ: local_softirq_pending 08" > > > > > > when booting the machine, but I'm not really sure if they're related > > > > > > to this problem. > > > > > > > > > > I have a feeling your problem is elsewhere. We *are* completing one > > > > > TRB > > > > > at a time. By reverting that commit you're just masking the real > > > > > problem > > > > > and I'd rather get that one fixed. > > > > > > > > > > How do you reproduce your issue ? > > > > > > > > Just boot the system, it gets an IP from dnsmasq on my host, then I ssh > > > > into it and do something to produce a bit of console output, after which > > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it. > > > > > > which kernel version ? > > > > Anything since the patch went in, so 4.1-rc > > > > > Running as USB2 or USB3 ? > > > > speed:480, so USB2 I presume? > > > > > Have you tried > > > linux-next ? > > > > Tried it now (next-20150901). Equally bad as the rest. > > > > > I just did 1000 dmesg iterations over ssh with g_ether and > > > saw no issues. > > > > > > Can you enable dwc3 tracepoints and try again ? (use some very large > > > trace buffer, something around 2 or 4 MiB should be enough). > > > > Attached one trace from linux-next, and another one with the revert on > > top. > > are you sure these come from next ? Yep. > It makes zero sense :-) Here's an > odd snippet: > > | sshd-1719 [000] d.s342.579785: dwc3_ep_queue: ep1in: req > 880077afa540 length 822/1514 ==> 0 > | sshd-1719 [000] d.s342.580075: dwc3_ep_queue: ep1in: req > 880077afa6c0 length 0/334 ==> -108 > | systemd-network-1618 [003] d.s342.754796: dwc3_ep_queue: ep1in: req > 880077afa780 length 0/120 ==> -108 > > your requests are queued with -ESHUTDOWN!! Looking at the code the tracepoint is before the request is queued, so maybe there's just stale junk in req->status before it gets overwritten by __dwc3_gadget_ep_queue()? > > | -0 [000] d.h342.877628: dwc3_readl: addr > c940040c value 0004 > | -0 [000] d.h342.877635: dwc3_readl: addr > c9400408 value 0100 > | -0 [000] d.h342.877638: dwc3_writel: addr > c9400408 value 8100 > | irq/22-dwc3-753 [002] d..242.878300: dwc3_event: event 00c4 > > so you had an IRQ, fine. > > | irq/22-dwc3-753 [002] d..242.878312: dwc3_gadget: ep1out: reason > Transfer Not Active > | irq/22-dwc3-753 [002] d..242.878328: dwc3_gadget_ep_cmd: ep1out: > cmd 'Start Transfer' [6] params 77ad9030 > > a transfer is started. > > | irq/22-dwc3-753 [002] d..242.878332: dwc3_writel: addr > c9400828 value > | irq/22-dwc3-753 [002] d..242.878336: dwc3_writel: addr > c9400824 value 77ad9030 > | irq/22-dwc3-753 [002] d..242.878339: dwc3_writel: addr > c9400820 value > | irq/22-dwc3-753 [002] d..242.878342: dwc3_writel: addr > c940082c value 0406 > | irq/22-dwc3-753 [002] d..242.878345: dwc3_readl: addr > c940082c value 00050006 > | irq/22-dwc3-753 [002] d..242.878348: dwc3_gadget: Command > Complete --> 0 > | irq/22-dwc3-753 [002] d..242.878350: dwc3
Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"
On Tue, Sep 01, 2015 at 10:17:59AM -0500, Felipe Balbi wrote: > Hi, > > On Tue, Sep 01, 2015 at 05:39:28PM +0300, Ville Syrjälä wrote: > > On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote: > > > Hi, > > > > > > On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote: > > > > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote: > > > > > Hi, > > > > > > > > > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote: > > > > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote: > > > > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, > > > > > > > ville.syrj...@linux.intel.com wrote: > > > > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com> > > > > > > > > > > > > > > > > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4. > > > > > > > > > > > > > > > > As it breaks g_ether on my Baytrail FFRD8 device. Everything > > > > > > > > starts out > > > > > > > > fine, but after a bit of data has been transferred it just stops > > > > > > > > flowing. > > > > > > > > > > > > > > > > Note that I do get a bunch of these "NOHZ: > > > > > > > > local_softirq_pending 08" > > > > > > > > when booting the machine, but I'm not really sure if they're > > > > > > > > related > > > > > > > > to this problem. > > > > > > > > > > > > > > I have a feeling your problem is elsewhere. We *are* completing > > > > > > > one TRB > > > > > > > at a time. By reverting that commit you're just masking the real > > > > > > > problem > > > > > > > and I'd rather get that one fixed. > > > > > > > > > > > > > > How do you reproduce your issue ? > > > > > > > > > > > > Just boot the system, it gets an IP from dnsmasq on my host, then I > > > > > > ssh > > > > > > into it and do something to produce a bit of console output, after > > > > > > which > > > > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it. > > > > > > > > > > which kernel version ? > > > > > > > > Anything since the patch went in, so 4.1-rc > > > > > > > > > Running as USB2 or USB3 ? > > > > > > > > speed:480, so USB2 I presume? > > > > > > > > > Have you tried > > > > > linux-next ? > > > > > > > > Tried it now (next-20150901). Equally bad as the rest. > > > > > > > > > I just did 1000 dmesg iterations over ssh with g_ether and > > > > > saw no issues. > > > > > > > > > > Can you enable dwc3 tracepoints and try again ? (use some very large > > > > > trace buffer, something around 2 or 4 MiB should be enough). > > > > > > > > Attached one trace from linux-next, and another one with the revert on > > > > top. > > > > > > are you sure these come from next ? > > > > Yep. > > > > > It makes zero sense :-) Here's an > > > odd snippet: > > > > > > | sshd-1719 [000] d.s342.579785: dwc3_ep_queue: ep1in: > > > req 880077afa540 length 822/1514 ==> 0 > > > | sshd-1719 [000] d.s342.580075: dwc3_ep_queue: ep1in: > > > req 880077afa6c0 length 0/334 ==> -108 > > > | systemd-network-1618 [003] d.s342.754796: dwc3_ep_queue: ep1in: > > > req 880077afa780 length 0/120 ==> -108 > > > > > > your requests are queued with -ESHUTDOWN!! > > > > Looking at the code the tracepoint is before the request is queued, so > > maybe there's just stale junk in req->status before it gets overwritten > > by __dwc3_gadget_ep_queue()? > > right, something touched usb_request.status before and the request has > been recycled. > > > > more requests are queued and that's it. No further traffic. It just > > > stopped working. No further IRQs, nothing. > > > > > > mine looks very much different (see attached). I don't have any > > > -ESHUTDOWNs. How did you load g_ether ? Did you pass any extra options ? > > > > g_ether is builtin, and I just pass g_ether.dev_addr= via kernel > > cmdline. > > > > > Which IP version are you running ? > > > > ipv4 > > I mean the SNPS IP :-) (it's 2.10a, see below) It's all Greek to me :) > > > GSBUSCFG0 = 0x0006 > > GSBUSCFG1 = 0x0f00 > > GTXTHRCFG = 0x230a > > GRXTHRCFG = 0x2280 > > GCTL = 0x45802002 > > GEVTEN = 0x > > GSTS = 0x3e82 > > GSNPSID = 0x5533210a > > this could be a bug with 2.10a where completion IRQs are missed. Any > chance you can look for you Errata document and see if any exist ? I'm > using 2.40a. Ugh. USB isn't my thing, so I'm definitely not going to start hunting down any obscure docs. Cc:ing Mathias and Heikki since it looks like they've touched this beast before. You guys have any docs and/or clue as to what's happening here? -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"
On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote: > On Mon, Aug 31, 2015 at 07:48:28PM +0300, ville.syrj...@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrj...@linux.intel.com> > > > > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4. > > > > As it breaks g_ether on my Baytrail FFRD8 device. Everything starts out > > fine, but after a bit of data has been transferred it just stops > > flowing. > > > > Note that I do get a bunch of these "NOHZ: local_softirq_pending 08" > > when booting the machine, but I'm not really sure if they're related > > to this problem. > > I have a feeling your problem is elsewhere. We *are* completing one TRB > at a time. By reverting that commit you're just masking the real problem > and I'd rather get that one fixed. > > How do you reproduce your issue ? Just boot the system, it gets an IP from dnsmasq on my host, then I ssh into it and do something to produce a bit of console output, after which g_ether is dead. Eg. 'dmesg' a few times is enough to kill it. Here's what I have in my .config: CONFIG_USB_OTG=y CONFIG_USB_OTG_FSM=y CONFIG_USB_DWC3=y CONFIG_USB_DWC3_GADGET=y CONFIG_USB_DWC3_PCI=y CONFIG_USB_PHY=y CONFIG_NOP_USB_XCEIV=y CONFIG_USB_GPIO_VBUS=y CONFIG_USB_GADGET=y CONFIG_USB_GADGET_VBUS_DRAW=2 CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2 CONFIG_USB_LIBCOMPOSITE=y CONFIG_USB_U_ETHER=y CONFIG_USB_F_ECM=y CONFIG_USB_F_SUBSET=y CONFIG_USB_ETH=y -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [regression 3.15-rc3] Resume from s4 broken by 1f81b6d22a5980955b01e08cf27fb745dc9b686f
On Wed, May 07, 2014 at 04:48:48PM +0300, Mathias Nyman wrote: On 05/06/2014 02:41 PM, Ville Syrjälä wrote: On Mon, May 05, 2014 at 12:32:22PM -0700, Julius Werner wrote: Hmmm... very odd. I unfortunately don't have a machine that can easily do S4 at hand, but I did test this on an IVB with XHCI_RESET_ON_RESUME in S3 (essentially the same code path), and I didn't run into any problems. How exactly does your machine fail on resume? Is it a kernel crash or just a hang? Can you try getting some debug output (by setting 'echo N /sys/module/printk/parameters/console_suspend' and trying to catch the crash on the screen or a serial line, or maybe through pstore)? I really don't see much that could go wrong with this patch, so without more info it will be hard to understand your problem. Also, I noticed that you have two HID devices plugged in during suspend. Does it make a difference if you have different devices (e.g. a mass storage stick) or none at all? Looks like it doesn't like it when there's anything plugged into the SS ports. I tried with just a HID keyboard or with just a hub. In both cases it fails to resume. If I have nothing connected to the SS ports then it resumes just fine. I managed to catch something with ramoops. Looks like it's hitting POISON_FREE when trying to delete some list entry. 4[ 107.047230] xhci_hcd :00:14.0: Slot 1 endpoint 2 not removed from BW list! 4[ 107.047574] general protection fault: [#1] PREEMPT SMP I took a look at the xhci_mem_cleanup() function and to me it looks like it tries to access a list_head that is already freed. The struct list_head xhci-devs[].eps[].bw_endpoint_list is added to an endpoint list in xhci-rh_bw[].bw_table.interval_bw[].endpoints xhci_mem_cleanup() frees all devices (the allocated xhci-devs[], containing the bw_endpoint_list) before it starts to loop through, and delete entries from the xhci-rh_bw[].bw_table.interval_bw[].endpoints list. I can't see how this relates to Julius patch though, and I'm not sure yet why it only triggers when devices are connected to SS ports. Maybe just unlucky timing? I think the non-SS ports are connected to the EHCI controllers rather than the XHCI controllers. So that explains at least one detail. And I guess timing is as good an excuse as any why this gets exposed by the patch in question. Does this help?: Indeed it does. The machine just survived a dozen or so suspend+resume cycles without a hitch. The bug was 100% reproducible on this machine, so the fix seems solid. Tested-by: Ville Syrjälä ville.syrj...@linux.intel.com diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index c089668..b1a8a5f 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -1822,6 +1822,16 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci) kfree(cur_cd); } + num_ports = HCS_MAX_PORTS(xhci-hcs_params1); + for (i = 0; i num_ports; i++) { + struct xhci_interval_bw_table *bwt = xhci-rh_bw[i].bw_table; + for (j = 0; j XHCI_MAX_INTERVAL; j++) { + struct list_head *ep = bwt-interval_bw[j].endpoints; + while (!list_empty(ep)) + list_del_init(ep-next); + } + } + for (i = 1; i MAX_HC_SLOTS; ++i) xhci_free_virt_device(xhci, i); @@ -1857,16 +1867,6 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci) if (!xhci-rh_bw) goto no_bw; - num_ports = HCS_MAX_PORTS(xhci-hcs_params1); - for (i = 0; i num_ports; i++) { - struct xhci_interval_bw_table *bwt = xhci-rh_bw[i].bw_table; - for (j = 0; j XHCI_MAX_INTERVAL; j++) { - struct list_head *ep = bwt-interval_bw[j].endpoints; - while (!list_empty(ep)) - list_del_init(ep-next); - } - } - for (i = 0; i num_ports; i++) { struct xhci_tt_bw_info *tt, *n; list_for_each_entry_safe(tt, n, xhci-rh_bw[i].tts, tt_list) { -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [regression 3.15-rc3] Resume from s4 broken by 1f81b6d22a5980955b01e08cf27fb745dc9b686f
]--- -- Ville Syrjälä Intel OTC -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html