Re: [PATCH] usb: gadget: u_ether: remove interrupt throttling

2016-11-01 Thread Ville Syrjälä
On Tue, Nov 01, 2016 at 01:29:59PM +0200, Felipe Balbi wrote:
> According to Dave Miller "the networking stack has a
> hard requirement that all SKBs which are transmitted
> must have their completion signalled in a fininte
> amount of time. This is because, until the SKB is
> freed by the driver, it holds onto socket,
> netfilter, and other subsystem resources."
> 
> In summary, this means that using TX IRQ throttling
> for the networking gadgets is, at least, complex and
> we should avoid it for the time being.
> 
> Cc: <sta...@vger.kernel.org>
> Reported-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Suggested-by: David Miller <da...@davemloft.net>
> Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com>

Fixes laggy/hangy g_ether on my BYT FFRD8 tablet, caused by
commit 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")

Tested-by: Ville Syrjälä <ville.syrj...@linux.intel.com>

> ---
>  drivers/usb/gadget/function/u_ether.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/u_ether.c 
> b/drivers/usb/gadget/function/u_ether.c
> index f4a640216913..119a2e5848e8 100644
> --- a/drivers/usb/gadget/function/u_ether.c
> +++ b/drivers/usb/gadget/function/u_ether.c
> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>  
>   req->length = length;
>  
> - /* throttle high/super speed IRQ rate back slightly */
> - if (gadget_is_dualspeed(dev->gadget))
> - req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
> -dev->gadget->speed == USB_SPEED_SUPER)) 
> &&
> - !list_empty(>tx_reqs))
> - ? ((atomic_read(>tx_qlen) % dev->qmult) != 0)
> - : 0;
> -
>   retval = usb_ep_queue(in, req, GFP_ATOMIC);
>   switch (retval) {
>   default:
> -- 
> 2.10.1

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"

2016-10-28 Thread Ville Syrjälä
On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Ville Syrjälä <ville.syrj...@linux.intel.com> writes:
> > On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote:
> >> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> >> > 
> >> > Hi,
> >> > 
> >> > Felipe Balbi <felipe.ba...@linux.intel.com> writes:
> >> 
> >> > Okay, I have found a regression on dwc3 and another patch follows:
> >> > 
> >> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
> >> > Author: Felipe Balbi <felipe.ba...@linux.intel.com>
> >> > Date:   Wed Oct 5 14:24:37 2016 +0300
> >> > 
> >> > usb: dwc3: gadget: properly account queued requests
> >> > 
> >> > Some requests could be accounted for multiple
> >> > times. Let's fix that so each and every requests is
> >> > accounted for only once.
> >> > 
> >> > Cc: <sta...@vger.kernel.org> # v4.8
> >> > Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs 
> >> > for LST bit")
> >> > Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com>
> >> > 
> >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> > index 07cc8929f271..3c3ced128c77 100644
> >> > --- a/drivers/usb/dwc3/gadget.c
> >> > +++ b/drivers/usb/dwc3/gadget.c
> >> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  req->trb = trb;
> >> >  req->trb_dma = dwc3_trb_dma_offset(dep, trb);
> >> >  req->first_trb_index = dep->trb_enqueue;
> >> > +dep->queued_requests++;
> >> >  }
> >> >  
> >> >  dwc3_ep_inc_enq(dep);
> >> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  
> >> >  trb->ctrl |= DWC3_TRB_CTRL_HWO;
> >> >  
> >> > -dep->queued_requests++;
> >> > -
> >> >  trace_dwc3_prepare_trb(dep, trb);
> >> >  }
> >> >  
> >> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 
> >> > *dwc, struct dwc3_ep *dep,
> >> >  unsigned ints_pkt = 0;
> >> >  unsigned inttrb_status;
> >> >  
> >> > -dep->queued_requests--;
> >> >  dwc3_ep_inc_deq(dep);
> >> > +
> >> > +if (req->trb == trb)
> >> > +dep->queued_requests--;
> >> > +
> >> >  trace_dwc3_complete_trb(dep, trb);
> >> >  
> >> >  /*
> >> > 
> >> > I have also built a branch which you can use for testing. Here's a pull
> >> > request, once you tell me it works for you, then I can send proper
> >> > patches out:
> >> > 
> >> > The following changes since commit 
> >> > c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> >> > 
> >> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
> >> > 
> >> > are available in the git repository at:
> >> > 
> >> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git 
> >> > tmp-test-v4.8
> >> > 
> >> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> >> > 
> >> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 
> >> > 10:16:37 +0300)
> >> > 
> >> > 
> >> > Felipe Balbi (2):
> >> >   usb: gadget: function: u_ether: don't starve tx request queue
> >> >   usb: dwc3: gadget: properly account queued requests
> >> > 
> >> >  drivers/usb/dwc3/gadget.c | 7 ---
> >> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
> >> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >> 
> >> Tried your branch, but unfortunately I'm still seeing the lags. New trace
> >> attached.
> >
> > Just a reminder that the regressions is still there in 4.9-rc2.
> > Unfortunateyly with all the stuff already piled on top, getting USB
> > working on my device is no longer a simple matter of reverting the
> > one co

Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"

2016-10-27 Thread Ville Syrjälä
On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> > 
> > Hi,
> > 
> > Felipe Balbi <felipe.ba...@linux.intel.com> writes:
> 
> > Okay, I have found a regression on dwc3 and another patch follows:
> > 
> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
> > Author: Felipe Balbi <felipe.ba...@linux.intel.com>
> > Date:   Wed Oct 5 14:24:37 2016 +0300
> > 
> > usb: dwc3: gadget: properly account queued requests
> > 
> > Some requests could be accounted for multiple
> > times. Let's fix that so each and every requests is
> > accounted for only once.
> > 
> > Cc: <sta...@vger.kernel.org> # v4.8
> > Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for 
> > LST bit")
> > Signed-off-by: Felipe Balbi <felipe.ba...@linux.intel.com>
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 07cc8929f271..3c3ced128c77 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> > req->trb = trb;
> > req->trb_dma = dwc3_trb_dma_offset(dep, trb);
> > req->first_trb_index = dep->trb_enqueue;
> > +   dep->queued_requests++;
> > }
> >  
> > dwc3_ep_inc_enq(dep);
> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >  
> > trb->ctrl |= DWC3_TRB_CTRL_HWO;
> >  
> > -   dep->queued_requests++;
> > -
> > trace_dwc3_prepare_trb(dep, trb);
> >  }
> >  
> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 
> > *dwc, struct dwc3_ep *dep,
> > unsigned ints_pkt = 0;
> > unsigned inttrb_status;
> >  
> > -   dep->queued_requests--;
> > dwc3_ep_inc_deq(dep);
> > +
> > +   if (req->trb == trb)
> > +   dep->queued_requests--;
> > +
> > trace_dwc3_complete_trb(dep, trb);
> >  
> > /*
> > 
> > I have also built a branch which you can use for testing. Here's a pull
> > request, once you tell me it works for you, then I can send proper
> > patches out:
> > 
> > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> > 
> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
> > 
> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> > 
> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 
> > +0300)
> > 
> > 
> > Felipe Balbi (2):
> >   usb: gadget: function: u_ether: don't starve tx request queue
> >   usb: dwc3: gadget: properly account queued requests
> > 
> >  drivers/usb/dwc3/gadget.c | 7 ---
> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> Tried your branch, but unfortunately I'm still seeing the lags. New trace
> attached.

Just a reminder that the regressions is still there in 4.9-rc2.
Unfortunateyly with all the stuff already piled on top, getting USB
working on my device is no longer a simple matter of reverting the
one commit. I had to revert 10 patches to get even a clean revert, but
unfortunately that approach just lead to the transfer hanging immediately.

So what I ended up doing is reverting all of this:
git log --no-merges --format=oneline 
55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ 
include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig

which comes out at whopping 70 commits. Having to carry that around
is going to be quite a pain especially as more stuff might be piled on
top.

/me thinks a stable backport of any fix (assuming one is found
eventually) is going to be quite the challenge...

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"

2016-10-04 Thread Ville Syrjälä
On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Ville Syrjälä <ville.syrj...@linux.intel.com> writes:
> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
> >> 
> >> Hi,
> >> 
> >> ville.syrj...@linux.intel.com writes:
> >> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> >> >
> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
> >> >
> >> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
> >> > is very sluggish and can even stall entirely. Revert cures it.
> >> >
> >> > Cc: Felipe Balbi <felipe.ba...@linux.intel.com>
> >> > Cc: sta...@vger.kernel.org
> >> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> >> 
> >> Reverting that commit is not the best idea. Let's start with the usual:
> >> 
> >> - Kernel version
> >> - dmesg on both sides (host and device)
> >> - dwc3 tracepoints:
> >> 
> >> # mkdir -p /t
> >> # mount -t tracefs none /t
> >> # echo 8192 > /t/buffer_size_kb
> >> # echo 1 > /t/events/dwc3/enable
> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
> >> 
> >> This should be enough to tell me what's really going on.
> >
> > I've attached the (compressed due to size) dmesgs/traces from the device.
> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
> > session, and I did get the lag in the bad run, so presumably the trace
> > should have caught it. Both were on 4.8 (+ the revert for the good run).
> >
> > Host side dmesg just showes me:
> >  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
> >  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-:00:14.0-3.4, 
> > CDC Ethernet Device, ...
> > when I boot the device. Nothing extra appears when things lag around.
> > I'm running 4.5.something on the host.
> 
> Looking at your logs what I can see is that "revert.trace" doesn't
> respect gadget driver's IRQ throttling, which means you get an interrupt
> for each and every request, while "bad.trace" shows dwc3 respecting
> "no_interrupt" flag passed in by g_ether.
> 
> The only explanation I have for this behavior when we start respecting
> gadget driver's flags, is that there's a possible race in u_ether.c
> where it can go for a long time without an interrupt due to that
> throttling.
> 
> This only happens with IN endpoints (the ones which are throttled) and
> the behavior can be seen in log snippet below:
> 
>   27782: irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req 
> 88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>   27802: ksoftirqd/0-3 [000] .Ns1   666.892047: eth_start_xmit: req 
> 88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>   27808: ksoftirqd/0-3 [000] .Ns2   666.892062: eth_start_xmit: req 
> 88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>   27814: ksoftirqd/0-3 [000] .Ns2   666.892069: eth_start_xmit: req 
> 88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>   27820: ksoftirqd/0-3 [000] .Ns2   666.892075: eth_start_xmit: req 
> 88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27826: ksoftirqd/0-3 [000] .Ns2   666.892082: eth_start_xmit: req 
> 88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>   27832: ksoftirqd/0-3 [000] .Ns2   666.892088: eth_start_xmit: req 
> 88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>   27884: irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req 
> 88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>   27889: irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req 
> 88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>   27897:sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req 
> 88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>   27898: irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req 
> 88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>   27903: irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req 
> 88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>   27910:sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req 
> 88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27913: irq/17-dwc3-2524  [000] d...  

Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"

2015-09-28 Thread Ville Syrjälä
On Mon, Sep 07, 2015 at 09:56:06AM +0300, Heikki Krogerus wrote:
> Hi,
> 
> On Tue, Sep 01, 2015 at 06:37:54PM +0300, Ville Syrjälä wrote:
> > On Tue, Sep 01, 2015 at 10:17:59AM -0500, Felipe Balbi wrote:
> > > Hi,
> > > 
> > > On Tue, Sep 01, 2015 at 05:39:28PM +0300, Ville Syrjälä wrote:
> > > > On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote:
> > > > > Hi,
> > > > > 
> > > > > On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote:
> > > > > > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote:
> > > > > > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote:
> > > > > > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, 
> > > > > > > > > ville.syrj...@linux.intel.com wrote:
> > > > > > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > > > > > > > 
> > > > > > > > > > This reverts commit 
> > > > > > > > > > 8f2c9544aba636134303105ecb164190a39dece4.
> > > > > > > > > > 
> > > > > > > > > > As it breaks g_ether on my Baytrail FFRD8 device. 
> > > > > > > > > > Everything starts out
> > > > > > > > > > fine, but after a bit of data has been transferred it just 
> > > > > > > > > > stops
> > > > > > > > > > flowing.
> > > > > > > > > > 
> > > > > > > > > > Note that I do get a bunch of these "NOHZ: 
> > > > > > > > > > local_softirq_pending 08"
> > > > > > > > > > when booting the machine, but I'm not really sure if 
> > > > > > > > > > they're related
> > > > > > > > > > to this problem.
> > > > > > > > > 
> > > > > > > > > I have a feeling your problem is elsewhere. We *are* 
> > > > > > > > > completing one TRB
> > > > > > > > > at a time. By reverting that commit you're just masking the 
> > > > > > > > > real problem
> > > > > > > > > and I'd rather get that one fixed.
> > > > > > > > > 
> > > > > > > > > How do you reproduce your issue ?
> > > > > > > > 
> > > > > > > > Just boot the system, it gets an IP from dnsmasq on my host, 
> > > > > > > > then I ssh
> > > > > > > > into it and do something to produce a bit of console output, 
> > > > > > > > after which
> > > > > > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it.
> > > > > > > 
> > > > > > > which kernel version ?
> > > > > > 
> > > > > > Anything since the patch went in, so 4.1-rc
> > > > > > 
> > > > > > > Running as USB2 or USB3 ?
> > > > > > 
> > > > > > speed:480, so USB2 I presume?
> > > > > > 
> > > > > > > Have you tried
> > > > > > > linux-next ?
> > > > > > 
> > > > > > Tried it now (next-20150901). Equally bad as the rest.
> > > > > > 
> > > > > > > I just did 1000 dmesg iterations over ssh with g_ether and
> > > > > > > saw no issues.
> > > > > > > 
> > > > > > > Can you enable dwc3 tracepoints and try again ? (use some very 
> > > > > > > large
> > > > > > > trace buffer, something around 2 or 4 MiB should be enough).
> > > > > > 
> > > > > > Attached one trace from linux-next, and another one with the revert 
> > > > > > on
> > > > > > top.
> > > > > 
> > > > > are you sure these come from next ?
> > > > 
> > > > Yep.
> > > > 
> > > > > It makes zero sense :-) Here's an
> > > > > odd snippet:
> > > > > 
> > > > > | sshd-1719  [000] d.s342.579785: dwc3_e

Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"

2015-09-01 Thread Ville Syrjälä
On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote:
> Hi,
> 
> On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote:
> > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote:
> > > Hi,
> > > 
> > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote:
> > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote:
> > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, 
> > > > > ville.syrj...@linux.intel.com wrote:
> > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > > > 
> > > > > > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4.
> > > > > > 
> > > > > > As it breaks g_ether on my Baytrail FFRD8 device. Everything starts 
> > > > > > out
> > > > > > fine, but after a bit of data has been transferred it just stops
> > > > > > flowing.
> > > > > > 
> > > > > > Note that I do get a bunch of these "NOHZ: local_softirq_pending 08"
> > > > > > when booting the machine, but I'm not really sure if they're related
> > > > > > to this problem.
> > > > > 
> > > > > I have a feeling your problem is elsewhere. We *are* completing one 
> > > > > TRB
> > > > > at a time. By reverting that commit you're just masking the real 
> > > > > problem
> > > > > and I'd rather get that one fixed.
> > > > > 
> > > > > How do you reproduce your issue ?
> > > > 
> > > > Just boot the system, it gets an IP from dnsmasq on my host, then I ssh
> > > > into it and do something to produce a bit of console output, after which
> > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it.
> > > 
> > > which kernel version ?
> > 
> > Anything since the patch went in, so 4.1-rc
> > 
> > > Running as USB2 or USB3 ?
> > 
> > speed:480, so USB2 I presume?
> > 
> > > Have you tried
> > > linux-next ?
> > 
> > Tried it now (next-20150901). Equally bad as the rest.
> > 
> > > I just did 1000 dmesg iterations over ssh with g_ether and
> > > saw no issues.
> > > 
> > > Can you enable dwc3 tracepoints and try again ? (use some very large
> > > trace buffer, something around 2 or 4 MiB should be enough).
> > 
> > Attached one trace from linux-next, and another one with the revert on
> > top.
> 
> are you sure these come from next ?

Yep.

> It makes zero sense :-) Here's an
> odd snippet:
> 
> | sshd-1719  [000] d.s342.579785: dwc3_ep_queue: ep1in: req 
> 880077afa540 length 822/1514 ==> 0
> | sshd-1719  [000] d.s342.580075: dwc3_ep_queue: ep1in: req 
> 880077afa6c0 length 0/334 ==> -108
> |  systemd-network-1618  [003] d.s342.754796: dwc3_ep_queue: ep1in: req 
> 880077afa780 length 0/120 ==> -108
> 
> your requests are queued with -ESHUTDOWN!!

Looking at the code the tracepoint is before the request is queued, so
maybe there's just stale junk in req->status before it gets overwritten
by __dwc3_gadget_ep_queue()?

> 
> |   -0 [000] d.h342.877628: dwc3_readl: addr 
> c940040c value 0004
> |   -0 [000] d.h342.877635: dwc3_readl: addr 
> c9400408 value 0100
> |   -0 [000] d.h342.877638: dwc3_writel: addr 
> c9400408 value 8100
> |  irq/22-dwc3-753   [002] d..242.878300: dwc3_event: event 00c4
> 
> so you had an IRQ, fine.
> 
> |  irq/22-dwc3-753   [002] d..242.878312: dwc3_gadget: ep1out: reason 
> Transfer Not Active
> |  irq/22-dwc3-753   [002] d..242.878328: dwc3_gadget_ep_cmd: ep1out: 
> cmd 'Start Transfer' [6] params  77ad9030 
> 
> a transfer is started.
> 
> |  irq/22-dwc3-753   [002] d..242.878332: dwc3_writel: addr 
> c9400828 value 
> |  irq/22-dwc3-753   [002] d..242.878336: dwc3_writel: addr 
> c9400824 value 77ad9030
> |  irq/22-dwc3-753   [002] d..242.878339: dwc3_writel: addr 
> c9400820 value 
> |  irq/22-dwc3-753   [002] d..242.878342: dwc3_writel: addr 
> c940082c value 0406
> |  irq/22-dwc3-753   [002] d..242.878345: dwc3_readl: addr 
> c940082c value 00050006
> |  irq/22-dwc3-753   [002] d..242.878348: dwc3_gadget: Command 
> Complete --> 0
> |  irq/22-dwc3-753   [002] d..242.878350: dwc3

Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"

2015-09-01 Thread Ville Syrjälä
On Tue, Sep 01, 2015 at 10:17:59AM -0500, Felipe Balbi wrote:
> Hi,
> 
> On Tue, Sep 01, 2015 at 05:39:28PM +0300, Ville Syrjälä wrote:
> > On Tue, Sep 01, 2015 at 08:59:02AM -0500, Felipe Balbi wrote:
> > > Hi,
> > > 
> > > On Tue, Sep 01, 2015 at 04:17:00PM +0300, Ville Syrjälä wrote:
> > > > On Mon, Aug 31, 2015 at 01:50:10PM -0500, Felipe Balbi wrote:
> > > > > Hi,
> > > > > 
> > > > > On Mon, Aug 31, 2015 at 08:25:10PM +0300, Ville Syrjälä wrote:
> > > > > > On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote:
> > > > > > > On Mon, Aug 31, 2015 at 07:48:28PM +0300, 
> > > > > > > ville.syrj...@linux.intel.com wrote:
> > > > > > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > > > > > 
> > > > > > > > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4.
> > > > > > > > 
> > > > > > > > As it breaks g_ether on my Baytrail FFRD8 device. Everything 
> > > > > > > > starts out
> > > > > > > > fine, but after a bit of data has been transferred it just stops
> > > > > > > > flowing.
> > > > > > > > 
> > > > > > > > Note that I do get a bunch of these "NOHZ: 
> > > > > > > > local_softirq_pending 08"
> > > > > > > > when booting the machine, but I'm not really sure if they're 
> > > > > > > > related
> > > > > > > > to this problem.
> > > > > > > 
> > > > > > > I have a feeling your problem is elsewhere. We *are* completing 
> > > > > > > one TRB
> > > > > > > at a time. By reverting that commit you're just masking the real 
> > > > > > > problem
> > > > > > > and I'd rather get that one fixed.
> > > > > > > 
> > > > > > > How do you reproduce your issue ?
> > > > > > 
> > > > > > Just boot the system, it gets an IP from dnsmasq on my host, then I 
> > > > > > ssh
> > > > > > into it and do something to produce a bit of console output, after 
> > > > > > which
> > > > > > g_ether is dead. Eg. 'dmesg' a few times is enough to kill it.
> > > > > 
> > > > > which kernel version ?
> > > > 
> > > > Anything since the patch went in, so 4.1-rc
> > > > 
> > > > > Running as USB2 or USB3 ?
> > > > 
> > > > speed:480, so USB2 I presume?
> > > > 
> > > > > Have you tried
> > > > > linux-next ?
> > > > 
> > > > Tried it now (next-20150901). Equally bad as the rest.
> > > > 
> > > > > I just did 1000 dmesg iterations over ssh with g_ether and
> > > > > saw no issues.
> > > > > 
> > > > > Can you enable dwc3 tracepoints and try again ? (use some very large
> > > > > trace buffer, something around 2 or 4 MiB should be enough).
> > > > 
> > > > Attached one trace from linux-next, and another one with the revert on
> > > > top.
> > > 
> > > are you sure these come from next ?
> > 
> > Yep.
> > 
> > > It makes zero sense :-) Here's an
> > > odd snippet:
> > > 
> > > | sshd-1719  [000] d.s342.579785: dwc3_ep_queue: ep1in: 
> > > req 880077afa540 length 822/1514 ==> 0
> > > | sshd-1719  [000] d.s342.580075: dwc3_ep_queue: ep1in: 
> > > req 880077afa6c0 length 0/334 ==> -108
> > > |  systemd-network-1618  [003] d.s342.754796: dwc3_ep_queue: ep1in: 
> > > req 880077afa780 length 0/120 ==> -108
> > > 
> > > your requests are queued with -ESHUTDOWN!!
> > 
> > Looking at the code the tracepoint is before the request is queued, so
> > maybe there's just stale junk in req->status before it gets overwritten
> > by __dwc3_gadget_ep_queue()?
> 
> right, something touched usb_request.status before and the request has
> been recycled.
> 
> > > more requests are queued and that's it. No further traffic. It just
> > > stopped working. No further IRQs, nothing.
> > > 
> > > mine looks very much different (see attached). I don't have any
> > > -ESHUTDOWNs. How did you load g_ether ? Did you pass any extra options ?
> > 
> > g_ether is builtin, and I just pass g_ether.dev_addr= via kernel 
> > cmdline.
> > 
> > > Which IP version are you running ?
> > 
> > ipv4
> 
> I mean the SNPS IP :-) (it's 2.10a, see below)

It's all Greek to me :)

> 
> > GSBUSCFG0 = 0x0006
> > GSBUSCFG1 = 0x0f00
> > GTXTHRCFG = 0x230a
> > GRXTHRCFG = 0x2280
> > GCTL = 0x45802002
> > GEVTEN = 0x
> > GSTS = 0x3e82
> > GSNPSID = 0x5533210a
> 
> this could be a bug with 2.10a where completion IRQs are missed. Any
> chance you can look for you Errata document and see if any exist ? I'm
> using 2.40a.

Ugh. USB isn't my thing, so I'm definitely not going to start hunting down
any obscure docs.

Cc:ing Mathias and Heikki since it looks like they've touched this beast
before. You guys have any docs and/or clue as to what's happening here?

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "usb: dwc3: gadget: drop unnecessary loop when cleaning up TRBs"

2015-08-31 Thread Ville Syrjälä
On Mon, Aug 31, 2015 at 11:54:13AM -0500, Felipe Balbi wrote:
> On Mon, Aug 31, 2015 at 07:48:28PM +0300, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > This reverts commit 8f2c9544aba636134303105ecb164190a39dece4.
> > 
> > As it breaks g_ether on my Baytrail FFRD8 device. Everything starts out
> > fine, but after a bit of data has been transferred it just stops
> > flowing.
> > 
> > Note that I do get a bunch of these "NOHZ: local_softirq_pending 08"
> > when booting the machine, but I'm not really sure if they're related
> > to this problem.
> 
> I have a feeling your problem is elsewhere. We *are* completing one TRB
> at a time. By reverting that commit you're just masking the real problem
> and I'd rather get that one fixed.
> 
> How do you reproduce your issue ?

Just boot the system, it gets an IP from dnsmasq on my host, then I ssh
into it and do something to produce a bit of console output, after which
g_ether is dead. Eg. 'dmesg' a few times is enough to kill it.

Here's what I have in my .config:
CONFIG_USB_OTG=y
CONFIG_USB_OTG_FSM=y
CONFIG_USB_DWC3=y
CONFIG_USB_DWC3_GADGET=y
CONFIG_USB_DWC3_PCI=y
CONFIG_USB_PHY=y
CONFIG_NOP_USB_XCEIV=y
CONFIG_USB_GPIO_VBUS=y
CONFIG_USB_GADGET=y
CONFIG_USB_GADGET_VBUS_DRAW=2
CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2
CONFIG_USB_LIBCOMPOSITE=y
CONFIG_USB_U_ETHER=y
CONFIG_USB_F_ECM=y
CONFIG_USB_F_SUBSET=y
CONFIG_USB_ETH=y

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [regression 3.15-rc3] Resume from s4 broken by 1f81b6d22a5980955b01e08cf27fb745dc9b686f

2014-05-07 Thread Ville Syrjälä
On Wed, May 07, 2014 at 04:48:48PM +0300, Mathias Nyman wrote:
 On 05/06/2014 02:41 PM, Ville Syrjälä wrote:
  On Mon, May 05, 2014 at 12:32:22PM -0700, Julius Werner wrote:
  Hmmm... very odd. I unfortunately don't have a machine that can easily
  do S4 at hand, but I did test this on an IVB with XHCI_RESET_ON_RESUME
  in S3 (essentially the same code path), and I didn't run into any
  problems.
 
  How exactly does your machine fail on resume? Is it a kernel crash or
  just a hang? Can you try getting some debug output (by setting 'echo N
  /sys/module/printk/parameters/console_suspend' and trying to catch
  the crash on the screen or a serial line, or maybe through pstore)? I
  really don't see much that could go wrong with this patch, so without
  more info it will be hard to understand your problem.
 
  Also, I noticed that you have two HID devices plugged in during
  suspend. Does it make a difference if you have different devices (e.g.
  a mass storage stick) or none at all?
 
  Looks like it doesn't like it when there's anything plugged into the
  SS ports. I tried with just a HID keyboard or with just a hub. In
  both cases it fails to resume. If I have nothing connected to the SS
  ports then it resumes just fine.
 
  I managed to catch something with ramoops. Looks like it's hitting
  POISON_FREE when trying to delete some list entry.
 
 
  4[  107.047230] xhci_hcd :00:14.0: Slot 1 endpoint 2 not removed from 
  BW list!
  4[  107.047574] general protection fault:  [#1] PREEMPT SMP
 
 I took a look at the xhci_mem_cleanup() function and to me it looks
 like it tries to access a list_head that is already freed.
 
 The struct list_head xhci-devs[].eps[].bw_endpoint_list is added to an 
 endpoint 
 list in xhci-rh_bw[].bw_table.interval_bw[].endpoints
 
 xhci_mem_cleanup() frees all devices (the allocated xhci-devs[], containing 
 the 
 bw_endpoint_list) before it starts to loop through, and delete entries from 
 the 
 xhci-rh_bw[].bw_table.interval_bw[].endpoints list.
 
 I can't see how this relates to Julius patch though, and I'm not sure yet why 
 it 
 only triggers when devices are connected to SS ports. Maybe just unlucky 
 timing?

I think the non-SS ports are connected to the EHCI controllers rather
than the XHCI controllers. So that explains at least one detail. And I
guess timing is as good an excuse as any why this gets exposed by the
patch in question.

 
 Does this help?:

Indeed it does. The machine just survived a dozen or so suspend+resume
cycles without a hitch. The bug was 100% reproducible on this machine,
so the fix seems solid.

Tested-by: Ville Syrjälä ville.syrj...@linux.intel.com

 
 diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
 index c089668..b1a8a5f 100644
 --- a/drivers/usb/host/xhci-mem.c
 +++ b/drivers/usb/host/xhci-mem.c
 @@ -1822,6 +1822,16 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
  kfree(cur_cd);
  }
 
 +   num_ports = HCS_MAX_PORTS(xhci-hcs_params1);
 +   for (i = 0; i  num_ports; i++) {
 +   struct xhci_interval_bw_table *bwt = xhci-rh_bw[i].bw_table;
 +   for (j = 0; j  XHCI_MAX_INTERVAL; j++) {
 +   struct list_head *ep = bwt-interval_bw[j].endpoints;
 +   while (!list_empty(ep))
 +   list_del_init(ep-next);
 +   }
 +   }
 +
  for (i = 1; i  MAX_HC_SLOTS; ++i)
  xhci_free_virt_device(xhci, i);
 
 @@ -1857,16 +1867,6 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
  if (!xhci-rh_bw)
  goto no_bw;
 
 -   num_ports = HCS_MAX_PORTS(xhci-hcs_params1);
 -   for (i = 0; i  num_ports; i++) {
 -   struct xhci_interval_bw_table *bwt = xhci-rh_bw[i].bw_table;
 -   for (j = 0; j  XHCI_MAX_INTERVAL; j++) {
 -   struct list_head *ep = bwt-interval_bw[j].endpoints;
 -   while (!list_empty(ep))
 -   list_del_init(ep-next);
 -   }
 -   }
 -
  for (i = 0; i  num_ports; i++) {
  struct xhci_tt_bw_info *tt, *n;
  list_for_each_entry_safe(tt, n, xhci-rh_bw[i].tts, 
 tt_list) {

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [regression 3.15-rc3] Resume from s4 broken by 1f81b6d22a5980955b01e08cf27fb745dc9b686f

2014-05-06 Thread Ville Syrjälä
 ]---

-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html