Re: ERROR transfer event TRB DMA ptr not part of current TD

2013-04-30 Thread Sarah Sharp
On Mon, Apr 29, 2013 at 01:02:19PM +0200, Oliver Neukum wrote:
> On Thursday 10 January 2013 10:42:07 Sarah Sharp wrote:
> 
> > The new debugging shows that the host is giving *two* short status
> > completions for a TD.  This only happens when the isoc TD is split into
> > two TRBs because the buffer crosses a 64KB boundary.  The completion
> > event shows that none of the buffer in either TRB was sent.  So this
> > suggests a deeper hardware issue, and we may need to use bounce buffers
> > to work around it.
> > 
> > I'll need to file a bug with our hardware team and cook up a
> > work-around.  So don't send any patches just yet.
> 
> Hi Sarah,
> 
> has this issue been addressed?

I looked into it, and it's not a hardware bug.  It's caused by a change
in how xHCI 1.0 hosts handle short packets, that I didn't catch before
now.  It's relatively harmless, but does need to get fixed.  I've added
it to my JIRA bug tracker, but it's a low priority bug at this point.

Perhaps we just need to add the spurious success quirk for all xHCI 1.0
hosts?

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
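
Editor's sketch: the report above says the problem only shows up when an isoc TD is split into two TRBs because its buffer crosses a 64KB boundary. The snippet below is a minimal illustration of that splitting rule, not the driver's code; the function name and the example address are made up for the illustration.

```python
BOUNDARY = 1 << 16  # 64 KiB; a single TRB buffer may not cross this boundary


def split_at_64k(addr, length):
    """Return (address, length) chunks, none of which crosses a 64 KiB boundary."""
    chunks = []
    while length > 0:
        room = BOUNDARY - (addr % BOUNDARY)  # bytes left before the next boundary
        take = min(room, length)
        chunks.append((addr, take))
        addr += take
        length -= take
    return chunks


# Hypothetical 3072-byte buffer starting 1 KiB below a boundary: two chunks.
print(split_at_64k(0x1FC00, 3072))
```

A buffer that stays inside one 64KB window comes back as a single chunk, which would match the observation that only boundary-crossing buffers trigger the double completion.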


xhci error: Transfer event TRB DMA ptr not part of current TD

2014-07-10 Thread Hans de Goede
Hi Sarah, Matthias, et al,

I've been running a full Linux distro from a uas enclosure with an ssd
for testing purposes (mostly for testing the distro on different
hardware but also for uas testing).

While testing this on a Thinkpad T440s I noticed the error
from $subject happening exactly once in the log. This always happens
when initializing the uas disk enclosure with the ssd. This happens
with both 3.15 and 3.16-rc4.

I've run a battery of tests to try and pin this down, here is the
test matrix:

                          T440s (Ivy Bridge)   E6430 (Sandy Bridge)   Desktop (NEC)
Renesas uPD720231 + ssd:  FAIL (*)             OK                     OK (**)
ASM1053E + hdd:           OK                   OK                     OK (**)
ASM1053E + ssd:           OK                   OK                     OK (**)

*) Putting a USB-3 hub in between makes no difference
**) Tested with a USB-3 hub in between

Where FAIL means that the error shows up.

The 2 enclosures tested with are:
Renesas uPD720231: 
http://www.amazon.com/SEDNA-SE-EH-322-U-External-Enclosure-Support/dp/B00E0MLIVE
ASM1053E: http://plugable.com/products/usb3-sata-uasp1

The 3 xhci controllers tested with are:

T440s (Ivy Bridge):
00:14.0 USB controller [0c03]: Intel Corporation 8 Series USB xHCI HC [8086:9c31] (rev 04)

E6430 (Sandy Bridge):
00:14.0 USB controller [0c03]: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller [8086:1e31] (rev 04)

Desktop (NEC):
01:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev 04)

So this seems to only happen (and even then only once on init) when
pairing a Renesas uPD720231 with an Ivy Bridge chipset xHCI controller.

For completeness sake, the ssd used in both cases was: a 120G Crucial M500,
model string: Crucial_ CT120M500SSD1 .

If you want me to build a kernel with a patch added to add some extra
debugging around the problem area to pin this down, send me such a
patch and I'll happily run some tests with it.

Regards,

Hans


p.s.

While doing all this testing I've also found a regression with 3.16 and the
Renesas uPD720231, which I'm bisecting now, so more on that later.

I can work around the regression by limiting the max amount of streams
(and thus outstanding requests) to 16. Which means the regression might not
be specific to the uPD720231, as the ASM1053E only supports 16 streams to
begin with.


ASM3142 controller ERROR Transfer event TRB DMA ptr not part of current TD

2018-08-20 Thread Igor Kuzmin
Hi!

I'm experiencing an issue with the combination of an ASMedia ASM3142
controller
(http://www.asmedia.com.tw/eng/e_show_products.php?cate_index=175&item=179)
and a specific USB3 device type, an industrial camera. A few seconds
after image acquisition starts, the USB transfer hangs (the device reports
that the streaming endpoint has stalled) with the following messages
repeated several times in the kernel log:
[  137.337248] xhci_hcd :02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[  137.337259] xhci_hcd :02:00.0: Looking for event-dma 0004841d9000 trb-start 0004841dabf0 trb-end 0004841dafe0 seg-start 0004841da000 seg-end 0004841daff0
.
[  137.346576] xhci_hcd :02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
[  137.346585] xhci_hcd :02:00.0: Looking for event-dma 0004841d9b60 trb-start 0004841dabf0 trb-end 0004841dafe0 seg-start 0004841da000 seg-end 0004841daff0
.

Endpoint 2 (OUT) is used for control commands (with 0x82 IN endpoint for
ACKs) and image data is streamed through IN endpoint 0x81. The fact
that 2 bulk endpoints are used in parallel is most likely what triggers
this bug. Other devices work fine with this controller, e.g. USB3
storage, but they don't have such USB endpoints configuration. Also the
camera works fine with this controller on Windows, but only with
drivers from ASMedia, not with Microsoft's.
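
Editor's sketch: the ep_index and comp_code fields in the messages above can be decoded by hand. The snippet assumes that ep_index is the xHCI Device Context Index minus one (which is how the Linux driver is understood to number endpoints, but is not stated in this report) and uses completion code names from the xHCI specification. Under that assumption ep_index 2 would be the IN endpoint 0x81, not OUT endpoint 2, which may be worth double-checking against the description above. The helper names are hypothetical.

```python
# Completion code names as listed in the xHCI specification (subset only).
COMP_CODES = {
    1: "Success",
    3: "Babble Detected Error",
    13: "Short Packet",
    26: "Stopped",
    27: "Stopped - Length Invalid",
}


def ep_index_to_address(ep_index):
    """Map an endpoint index to a USB endpoint address, assuming index = DCI - 1."""
    if ep_index == 0:
        return 0x00  # default control endpoint
    dci = ep_index + 1
    number = dci // 2
    is_in = dci % 2 == 1  # odd DCI values are IN endpoints
    return number | (0x80 if is_in else 0x00)


def describe(ep_index, comp_code):
    """Render the two log fields in a human-readable form."""
    name = COMP_CODES.get(comp_code, "unknown")
    return "ep 0x%02x, %s" % (ep_index_to_address(ep_index), name)


print(describe(2, 1))
print(describe(2, 13))
```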

I've tried several Linux kernel versions including latest kernel.org
release 4.18.3.

Here is lspci -vvv -s :02:00.0 output:
02:00.0 USB controller: ASMedia Technology Inc. Device 2142 (prog-if 30 [XHCI])
Subsystem: ASMedia Technology Inc. Device 2142
Physical Slot: 6
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 

Re: xhci error: Transfer event TRB DMA ptr not part of current TD

2014-07-18 Thread Hans de Goede
Hi,

On 07/10/2014 01:17 PM, Hans de Goede wrote:
> Hi Sarah, Matthias, et al,
> 
> I've been running a full Linux distro from an uas enclosure with a ssd
> for testing purposes (mostly for testing the distro on different
> hardware but also for uas testing).
> 
> While testing this on a Thinkpad T440s I noticed the error
> from $subject happening exactly once in the log. This always happens
> when initializing the uas disk enclosure with the ssd. This happens
> with both 3.15 and 3.16-rc4.
> 
> I've run a battery of tests to try and pin this down, here is the
> test matrix:
> 
>                           T440s (Ivy Bridge)   E6430 (Sandy Bridge)   Desktop (NEC)
> Renesas uPD720231 + ssd:  FAIL (*)             OK                     OK (**)
> ASM1053E + hdd:           OK                   OK                     OK (**)
> ASM1053E + ssd:           OK                   OK                     OK (**)
> 
> *) Putting an USB-3 hub in between makes no difference
> **) Tested with an USB-3 hub in between
> 
> Where FAIL means that the error shows up.
> 
> The 2 enclosures tested with are:
> Renesas uPD720231: 
> http://www.amazon.com/SEDNA-SE-EH-322-U-External-Enclosure-Support/dp/B00E0MLIVE
> ASM1053E: http://plugable.com/products/usb3-sata-uasp1
> 
> The 3 xhci controllers tested with are:
> 
> T440s (Ivy Bridge):
> 00:14.0 USB controller [0c03]: Intel Corporation 8 Series USB xHCI HC 
> [8086:9c31] (rev 04)
> 
> E6430 (Sandy Bridge):
> 00:14.0 USB controller [0c03]: Intel Corporation 7 Series/C210 Series Chipset 
> Family USB xHCI Host Controller [8086:1e31] (rev 04)

Correction: the T440s is Haswell and the E6430 is Ivy Bridge; Sandy Bridge
never had USB 3 (in its companion chipset). Sorry about that. Note that the
lspci output is correct.

> Desktop (NEC):
> 01:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host 
> Controller [1033:0194] (rev 04)
> 
> So this seems to only happen (and even then only once on init) when
> pairing a Renesas uPD720231 with an Ivy Bridge chipset xHCI controller.
> 
> For completeness sake, the ssd used in both cases was: a 120G Crucial M500,
> model string: Crucial_ CT120M500SSD1 .

In the meantime I've found what I believe is a proper fix for this. We have
special handling for COMP_STOP_INVAL, but not for COMP_STOP. Judging from
the comment above the special handling, it should apply equally to COMP_STOP,
and doing so stops this error from being logged. I'll send a patch to fix
this separately.

Regards,

Hans
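
Editor's sketch: the idea described above is that an event which matches no TD should be dropped quietly for both "stopped" completion codes, not just one of them. The snippet below only illustrates that predicate. The numeric values are the xHCI completion codes for Stopped and Stopped - Length Invalid; the helper name is hypothetical and this is not the actual patch.

```python
COMP_STOP = 26        # xHCI completion code "Stopped"
COMP_STOP_INVAL = 27  # xHCI completion code "Stopped - Length Invalid"


def ignore_unmatched_event(comp_code):
    """Hypothetical helper: True if an event that matches no TD can be dropped quietly."""
    return comp_code in (COMP_STOP, COMP_STOP_INVAL)


print(ignore_unmatched_event(COMP_STOP))
```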




ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3

2017-07-25 Thread Julian Lyubenov
Hello,


I am trying to help somebody fix this long-standing bug in the xhci_hcd driver (or, more
likely, a bug in the ASMedia controller that we co
I will try to provide as much information as I can.


My setup: I am using a USB TV tuner card connected to the ASMedia USB 3.1 port
on the mainboard.


00:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller 
(prog-if 30 [XHCI])
        Subsystem: ASRock Incorporation ASM1142 USB 3.1 Host Controller
        Flags: bus master, fast devsel, latency 0, IRQ 34, NUMA node 0
        Memory at df20 (64-bit, non-prefetchable) [size=32K]
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
        Capabilities: [68] MSI-X: Enable+ Count=8 Masked-
        Capabilities: [78] Power Management version 3
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [280] #19
        Capabilities: [300] Latency Tolerance Reporting
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci


Device is 1b21:1242


Sometimes after a few hours, sometimes after 1-2 days, the xhci_hcd driver
crashes with an error like this in dmesg:


[125235.391809] xhci_hcd :00:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 3
[125235.391843] xhci_hcd :00:00.0: Looking for event-dma 00021231a690 trb-start 00021231a890 trb-end 00021231a8b0 seg-start 00021231a000 seg-end 00021231aff0


It never happens if I connect my USB TV tuner card to the onboard Intel USB 3.0 
ports. Only with ASMedia USB 3.1 port.
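
Editor's sketch: the "Looking for event-dma" message above can be checked mechanically. The snippet parses the hex fields and tests whether the event address falls inside the current TD. It assumes 16-byte TRBs and a TD that does not wrap across ring segments; the function name is made up for the illustration.

```python
import re

TRB_SIZE = 16  # bytes per TRB


def parse_looking_for(line):
    """Extract the hex fields of a 'Looking for event-dma ...' message as integers."""
    pairs = re.findall(
        r"(event-dma|trb-start|trb-end|seg-start|seg-end) ([0-9a-fA-F]+)", line
    )
    return {name: int(value, 16) for name, value in pairs}


line = ("Looking for event-dma 00021231a690 trb-start 00021231a890 "
        "trb-end 00021231a8b0 seg-start 00021231a000 seg-end 00021231aff0")
fields = parse_looking_for(line)

# Simple range checks; a TD that spans two segments would need more care.
in_td = fields["trb-start"] <= fields["event-dma"] <= fields["trb-end"]
in_segment = fields["seg-start"] <= fields["event-dma"] <= fields["seg-end"]
trbs_before_td = (fields["trb-start"] - fields["event-dma"]) // TRB_SIZE

print(in_td, in_segment, trbs_before_td)
```

For this log line the event address lies in the same segment but ahead of the TD the driver expects, which is the situation the error message complains about.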


The same issue is reported at
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750 but for a different
ASMedia controller.



I have logs: 
mount -t debugfs none /sys/kernel/debug
echo xhci-hcd >> /sys/kernel/debug/tracing/set_event
cat /sys/kernel/debug/tracing/trace


But people from launchpad told me they are not applicable.


Please help me to provide the necessary information to fix this issue.

Thanks







[V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Jörg Otte
If I plug in my USB DVB-T stick I get the following in dmesg:

dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
dvb-usb: will pass the complete MPEG2 transport stream to the software demuxer.
DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)...
input: IR-receiver inside an USB DVB receiver as /devices/pci:00/:00:14.0/usb1/1-1/input/input17
dvb-usb: schedule remote query interval to 50 msecs.
xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
xhci_hcd :00:14.0: Looking for event-dma 000207540400 trb-start 000207540420 trb-end 000207540420 seg-start 0002075404
xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 1 comp_code 1
xhci_hcd :00:14.0: Looking for event-dma 000207540410 trb-start 000207540420 trb-end 000207540420 seg-start 0002075404
dvb-usb: bulk message failed: -110 (2/0)

and DVB-T is not functional. The problem came in with:

1163d50 Merge tag 'usb-4.0-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

I never had this xhci_hcd error before so this is a regression.


Thanks, Jörg


xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13

2017-03-24 Thread Grey Christoforo
Dear kernel USB people,

Could you please have a look at the bug report here?
bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750

Regards,
grey


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Mathias Nyman
On 10.03.2015 11:40, Jörg Otte wrote:
> If I plug in my USB DVB-T stick I get the following in dmesg:
> 
> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm 
> state.
> dvb-usb: will pass the complete MPEG2 transport stream to the software 
> demuxer.
> DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
> usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
> Highspeed DVB-T Receiver)...
> input: IR-receiver inside an USB DVB receiver as
> /devices/pci:00/:00:14.0/usb1/1-1/input/input17
> dvb-usb: schedule remote query interval to 50 msecs.
> xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of
> current TD ep_index 1 comp_code 1
> xhci_hcd :00:14.0: Looking for event-dma 000207540400
> trb-start 000207540420 trb-end 000207540420 seg-start
> 00000002075404
> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
> current TD ep_index 1 comp_code 1
> xhci_hcd :00:14.0: Looking for event-dma 000207540410
> trb-start 000207540420 trb-end 000207540420 seg-start
> 0002075404
> dvb-usb: bulk message failed: -110 (2/0)
> 
> and DVB-T is not functional. The problem came in with:
> 
> 1163d50 Merge tag 'usb-4.0-rc3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
> 
> I never had this xhci_hcd error before so this is a regression.
> 
> 
> Thanks, Jörg

Oh, thanks.

Looks like we get an event for a TRB we just moved past. 

Any chance you could take a log with xhci debugging enabled before attaching
the DVB-T stick?

echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control


I'd suspect one of these two patches:

commit 45ba2154d12fc43b70312198ec47085f10be801a
xhci: fix reporting of 0-sized URBs in control endpoint

commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
xhci: Clear the host side toggle manually when endpoint is 'soft reset'

-Mathias



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Jörg Otte
2015-03-10 14:06 GMT+01:00 Mathias Nyman :
> On 10.03.2015 11:40, Jörg Otte wrote:
>> If I plug in my USB DVB-T stick I get the following in dmesg:
>>
>> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm 
>> state.
>> dvb-usb: will pass the complete MPEG2 transport stream to the software 
>> demuxer.
>> DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
>> usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
>> Highspeed DVB-T Receiver)...
>> input: IR-receiver inside an USB DVB receiver as
>> /devices/pci:00/:00:14.0/usb1/1-1/input/input17
>> dvb-usb: schedule remote query interval to 50 msecs.
>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>> current TD ep_index 1 comp_code 1
>> xhci_hcd :00:14.0: Looking for event-dma 000207540400
>> trb-start 000207540420 trb-end 0000000207540420 seg-start
>> 0002075404
>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>> current TD ep_index 1 comp_code 1
>> xhci_hcd :00:14.0: Looking for event-dma 000207540410
>> trb-start 000207540420 trb-end 000207540420 seg-start
>> 0002075404
>> dvb-usb: bulk message failed: -110 (2/0)
>>
>> and DVB-T is not functional. The problem came in with:
>>
>> 1163d50 Merge tag 'usb-4.0-rc3' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
>>
>> I never had this xhci_hcd error before so this is a regression.
>>
>>
>> Thanks, Jörg
>
> Oh, thanks.
>
> Looks like we get an event for a TRB we just moved past.
>
> Any chance you could take a log with xhci debugging enabled before attaching 
> the DVB-T
> stick?
>
> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>
>

Here it is, attached.


> I'd suspect one of these two patches:
>
> commit 45ba2154d12fc43b70312198ec47085f10be801a
> xhci: fix reporting of 0-sized URBs in control endpoint
>
> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>
> -Mathias
>

Thanks, Jörg


xhci-debug.gz
Description: GNU Zip compressed data


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Mathias Nyman
On 10.03.2015 17:36, Jörg Otte wrote:

>>> Any chance you could take a log with xhci debugging enabled before 
>>> attaching the DVB-T
>>> stick?
>>>
>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>
>>>
>>
>> here it comes attached.
>>
>>
>>> I'd suspect one of these two patches:
>>>
>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>
>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>>
> 
> Revert the commits.
> The second one  "xhci: Clear the host side..."  is it !
> 

Yes, thank you.

It seems it wasn't mature enough; I'll revert it.

From your logs I can see what went wrong.

If you still have some time, could you try out a patch (attached) and see if it
solves the issue for you (on top of a clean 4.0-rc3)? I can't reproduce it with
my own USB DVB-T device.
-Mathias 

From a895eb69a63dfef1943f0593da29167bea12100c Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Tue, 10 Mar 2015 18:50:45 +0200
Subject: [PATCH] xhci: set correct dequeue status in endpoint soft reset

The endpoint might have already processed some TRBs on the endpoint ring
before we soft reset the endpoint.
Make sure we set the dequeue pointer to where we were before the soft reset.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index b06d1a5..64527a4 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2972,6 +2972,8 @@ void xhci_endpoint_reset(struct usb_hcd *hcd,
 	unsigned int ep_index, ep_state;
 	unsigned long flags;
 	u32 ep_flag;
+	struct xhci_ep_ctx *ep_ctx;
+	dma_addr_t addr;
 
 	xhci = hcd_to_xhci(hcd);
 	udev = (struct usb_device *) ep->hcpriv;
@@ -3046,6 +3048,9 @@ void xhci_endpoint_reset(struct usb_hcd *hcd,
 	   virt_dev->out_ctx, ctrl_ctx,
 	   ep_flag, ep_flag);
 	xhci_endpoint_copy(xhci, command->in_ctx, virt_dev->out_ctx, ep_index);
+	ep_ctx = xhci_get_ep_ctx(xhci, command->in_ctx, ep_index);
+	addr = xhci_trb_virt_to_dma(virt_ep->ring->deq_seg, virt_ep->ring->dequeue);
+	ep_ctx->deq  = cpu_to_le64(addr | virt_ep->ring->cycle_state);
 
 	xhci_queue_configure_endpoint(xhci, command, command->in_ctx->dma,
  udev->slot_id, false);
-- 
1.8.3.2
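
Editor's sketch: the last added line of the patch ORs the ring's cycle state into the dequeue address before storing it in the endpoint context. The snippet below illustrates that packing, assuming 16-byte-aligned TRB addresses with the Dequeue Cycle State in bit 0. The function names and the example address are hypothetical, and the little-endian conversion done by cpu_to_le64() is left out.

```python
DCS_BIT = 0x1                       # Dequeue Cycle State sits in bit 0
PTR_MASK = (2**64 - 1) & ~0xF       # TRB addresses are 16-byte aligned


def pack_deq(dma_addr, cycle_state):
    """Combine a 16-byte-aligned dequeue address with the cycle state bit."""
    if dma_addr & 0xF:
        raise ValueError("dequeue pointer must be 16-byte aligned")
    return dma_addr | (cycle_state & DCS_BIT)


def unpack_deq(deq):
    """Split a packed dequeue field back into (address, cycle state)."""
    return deq & PTR_MASK, deq & DCS_BIT


# Illustrative address only; the thread refers to the pointer as "xxx430".
packed = pack_deq(0x207540430, 1)
print(hex(packed), unpack_deq(packed))
```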



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Alan Stern
On Tue, 10 Mar 2015, Mathias Nyman wrote:

> Yes, thank you
> 
> Seems that It wasn't mature enough, I'll revert it.
> 
> From your logs I can see what went wrong, 
> 
> If you still have some time, could you try out a patch (attached) and see if 
> it solves the
> issue for you. (on top of clean 4.0-rc3). I can't reproduce it with my own 
> USB DVB-T device

Mathias:

Your patch description says this:

> The endpoint might have already processed some TRBs on the endpoint ring
> before we soft reset the endpoint.
> Make sure we set the dequeue pointer to where we were before the soft reset.

However, if a driver tries to issue an endpoint reset while there are
still some URBs queued, it is a bug.  Host controller drivers shouldn't
have to worry about this -- xhci_endpoint_reset() should simply return 
an error if the endpoint ring isn't empty.

I suppose we should check for this in the USB core.  I'll write a patch
and CC: you.

Alan Stern



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Alan Stern
On Tue, 10 Mar 2015, Mathias Nyman wrote:

> > Mathias:
> > 
> > Your patch description says this:
> > 
> >> The endpoint might have already processed some TRBs on the endpoint ring
> >> before we soft reset the endpoint.
> >> Make sure we set the dequeue pointer to where we were before the soft reset.
> > 
> > However, if a driver tries to issue an endpoint reset while there are
> > still some URBs queued, it is a bug.  Host controller drivers shouldn't
> > have to worry about this -- xhci_endpoint_reset() should simply return 
> > an error if the endpoint ring isn't empty.
> > 
> > I suppose we should check for this in the USB core.  I'll write a patch
> > and CC: you.
> > 
> > Alan Stern
> > 
> 
> It's possible that there's something in usb core as well, 
> but I think the following was what happened:
> 
> 1. First a normal configure endpoint command is issued, it sets endpoint
>    dequeue pointer to xxx400 = start of ring segment
> 2. two urbs get queued -> two TDs put on endpoint ring.
> 3. xhci executes those, ring is in running (idle) state. sw dequeue at
>    xxx430, No TDs queued.
>    Endpoint dequeue pointer is not written to the endpoint output context
>    as the ring is still in running state (even if idle, not advancing with
>    no TDs queued) it still shows xxx400
> 4. -> something happens, xhci_endpoint_reset() is called, we do a new
>    configure endpoint to 'soft reset' the endpoint, but we copy the dequeue
>    pointer from the old endpoint output context to the configure endpoint
>    input context, which re-initializes the old dequeue xxx400 pointer to
>    xhci hardware, and it starts executing the old TDs from the ring.

Obviously that's bad.

But don't you have to stop the endpoint ring in order to configure it?  
When you stop the ring, doesn't the controller store the correct
current value of the dequeue pointer somewhere?

> 5. xhci driver notices that we get events for old TRBs that do not belong
>    to the TD the driver thinks we should be handling

Alan Stern



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Lu, Baolu



On 03/11/2015 02:49 AM, Alan Stern wrote:
> On Tue, 10 Mar 2015, Mathias Nyman wrote:
>
>>> Mathias:
>>>
>>> Your patch description says this:
>>>
>>>> The endpoint might have already processed some TRBs on the endpoint ring
>>>> before we soft reset the endpoint.
>>>> Make sure we set the dequeue pointer to where we were before the soft reset.
>>>
>>> However, if a driver tries to issue an endpoint reset while there are
>>> still some URBs queued, it is a bug.  Host controller drivers shouldn't
>>> have to worry about this -- xhci_endpoint_reset() should simply return
>>> an error if the endpoint ring isn't empty.
>>>
>>> I suppose we should check for this in the USB core.  I'll write a patch
>>> and CC: you.
>>>
>>> Alan Stern
>>
>> It's possible that there's something in usb core as well,
>> but I think the following was what happened:
>>
>> 1. First a normal configure endpoint command is issued, it sets endpoint
>>    dequeue pointer to xxx400 = start of ring segment
>> 2. two urbs get queued -> two TDs put on endpoint ring.
>> 3. xhci executes those, ring is in running (idle) state. sw dequeue at
>>    xxx430, No TDs queued.
>>    Endpoint dequeue pointer is not written to the endpoint output context
>>    as the ring is still in running state (even if idle, not advancing with
>>    no TDs queued) it still shows xxx400
>> 4. -> something happens, xhci_endpoint_reset() is called, we do a new
>>    configure endpoint to 'soft reset' the endpoint, but we copy the dequeue
>>    pointer from the old endpoint output context to the configure endpoint
>>    input context, which re-initializes the old dequeue xxx400 pointer to
>>    xhci hardware, and it starts executing the old TDs from the ring.

Is it possible to return an error message up to the client driver? The client
driver then decides how to handle this kind of error. It could, for example,
unlink all ongoing transfers and ask the host driver to soft reset this
endpoint. When xhci_endpoint_reset is called, there should be no ongoing
transfers.

Thanks,
Baolu

> Obviously that's bad.
>
> But don't you have to stop the endpoint ring in order to configure it?
> When you stop the ring, doesn't the controller store the correct
> current value of the dequeue pointer somewhere?
>
>> 5. xhci driver notices that we get events for old TRBs that do not belong
>>    to the TD the driver thinks we should be handling
>
> Alan Stern



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Mathias Nyman
On 10.03.2015 20:49, Alan Stern wrote:
> On Tue, 10 Mar 2015, Mathias Nyman wrote:
> 
>>> Mathias:
>>>
>>> Your patch description says this:
>>>
>>>> The endpoint might have already processed some TRBs on the endpoint ring
>>>> before we soft reset the endpoint.
>>>> Make sure we set the dequeue pointer to where we were before the soft reset.
>>>
>>> However, if a driver tries to issue an endpoint reset while there are
>>> still some URBs queued, it is a bug.  Host controller drivers shouldn't
>>> have to worry about this -- xhci_endpoint_reset() should simply return 
>>> an error if the endpoint ring isn't empty.
>>>
>>> I suppose we should check for this in the USB core.  I'll write a patch
>>> and CC: you.
>>>
>>> Alan Stern
>>>
>>
>> It's possible that there's something in usb core as well, 
>> but I think the following was what happened:
>>
>> 1. First a normal configure endpoint command is issued, it sets endpoint
>>    dequeue pointer to xxx400 = start of ring segment
>> 2. two urbs get queued -> two TDs put on endpoint ring.
>> 3. xhci executes those, ring is in running (idle) state. sw dequeue at
>>    xxx430, No TDs queued.
>>    Endpoint dequeue pointer is not written to the endpoint output context
>>    as the ring is still in running state (even if idle, not advancing with
>>    no TDs queued) it still shows xxx400
>> 4. -> something happens, xhci_endpoint_reset() is called, we do a new
>>    configure endpoint to 'soft reset' the endpoint, but we copy the dequeue
>>    pointer from the old endpoint output context to the configure endpoint
>>    input context, which re-initializes the old dequeue xxx400 pointer to
>>    xhci hardware, and it starts executing the old TDs from the ring.
> 
> Obviously that's bad.
> 
> But don't you have to stop the endpoint ring in order to configure it?  
> When you stop the ring, doesn't the controller store the correct
> current value of the dequeue pointer somewhere?
> 

Normally we stop the endpoint before configuring it, but in this case the
endpoint is already configured, and we don't really want to change the
configuration; we just want to reset the toggle so that it's in sync with the
device.

As I understand it, the xhci spec allows us to issue a configure endpoint
command for a running endpoint as long as it's empty.

xhci 1.0  4.6.6 Configure Endpoint:
"An endpoint shall be in the Stopped state or if in the Running state shall be
“idle” (e.g. no USB Transactions are in progress, the Transfer Ring is empty,
and software has processed all outstanding events for the Transfer Ring) if its
Drop Context flag is set. If this condition is not met undefined behavior may
occur."

But the output context we copy is from the last time the endpoint was stopped
or configured. So we need to update the dequeue pointer to the one we have in
the driver. I need to check whether the other old fields in the output context
can cause any issues as well.

-Mathias 
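
Editor's sketch: the sequence described in this thread can be played through in a toy model. It keeps three pointers: the driver's software dequeue, the controller's internal dequeue, and the copy in the endpoint output context, which in this model is only refreshed on configure. Everything here (class, method names, the 0x400 ring start, three TRBs executed) is a simplification chosen to match the "xxx400" and "xxx430" values quoted above, not the driver's data structures.

```python
TRB_SIZE = 16  # bytes per TRB


class ToyEndpoint:
    """Toy model of the three dequeue pointers discussed in the thread."""

    def __init__(self, ring_start):
        self.sw_dequeue = ring_start       # driver's view of the ring
        self.hw_dequeue = ring_start       # controller's internal pointer
        self.out_ctx_dequeue = ring_start  # output context copy, set at configure

    def run_trbs(self, count):
        """Controller executes TRBs and the driver handles their events.

        The output context copy is deliberately not updated: the endpoint
        stays in the running state the whole time.
        """
        self.hw_dequeue += count * TRB_SIZE
        self.sw_dequeue += count * TRB_SIZE

    def soft_reset(self, use_sw_dequeue):
        """Re-issue configure endpoint; return the pointer handed to hardware."""
        new = self.sw_dequeue if use_sw_dequeue else self.out_ctx_dequeue
        self.hw_dequeue = new
        self.out_ctx_dequeue = new
        return new


ep = ToyEndpoint(0x400)
ep.run_trbs(3)                       # software dequeue is now 0x430
stale = ep.soft_reset(use_sw_dequeue=False)
print(hex(stale), hex(ep.sw_dequeue))  # hardware restarts behind the driver
```

Calling soft_reset(use_sw_dequeue=True) instead hands the hardware the driver's own pointer, which is the effect the proposed patch is aiming for.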


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Jörg Otte
2015-03-10 18:04 GMT+01:00 Mathias Nyman :
> On 10.03.2015 17:36, Jörg Otte wrote:
>
>>>> Any chance you could take a log with xhci debugging enabled before
>>>> attaching the DVB-T stick?
>>>>
>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>
>>>>
>>>
>>> here it comes attached.
>>>
>>>
>>>> I'd suspect one of these two patches:
>>>>
>>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>>
>>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'

>>
>> Revert the commits.
>> The second one  "xhci: Clear the host side..."  is it !
>>
>
> Yes, thank you
>
> Seems that It wasn't mature enough, I'll revert it.
>
> From your logs I can see what went wrong,
>
> If you still have some time, could you try out a patch (attached) and see if 
> it solves the
> issue for you. (on top of clean 4.0-rc3). I can't reproduce it with my own 
> USB DVB-T device

Problems:
error: patch failed: drivers/usb/host/xhci.c:2972
error: drivers/usb/host/xhci.c: patch does not apply

To me the patch looks formally fine.
I have no idea why it does not apply.

Thanks, Jörg


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Alan Stern
On Wed, 11 Mar 2015, Lu, Baolu wrote:

> >> It's possible that there's something in usb core as well,
> >> but I think the following was what happened:
> >>
> >> 1. First a normal configure endpoint command is issued, it sets endpoint
> >>    dequeue pointer to xxx400 = start of ring segment
> >> 2. two urbs get queued -> two TDs put on endpoint ring.
> >> 3. xhci executes those, ring is in running (idle) state. sw dequeue at
> >>    xxx430, No TDs queued.
> >>    Endpoint dequeue pointer is not written to the endpoint output context
> >>    as the ring is still in running state (even if idle, not advancing with
> >>    no TDs queued) it still shows xxx400
> >> 4. -> something happens, xhci_endpoint_reset() is called, we do a new
> >>    configure endpoint to 'soft reset' the endpoint, but we copy the dequeue
> >>    pointer from the old endpoint output context to the configure endpoint
> >>    input context, which re-initializes the old dequeue xxx400 pointer to
> >>    xhci hardware, and it starts executing the old TDs from the ring.
> 
> Is it possible to return an error message up to client driver? The 
> client driver then decides
> how to handle this kind of error. It, possibly, unlink all ongoing 
> transfers and ask host driver
> to soft reset this endpoint. When xhci_endpoint_reset is called, there 
> should be no ongoing
> transfers.

That doesn't seem to be the problem here.  Mathias is saying that all
the transfers have indeed completed, but when reconfiguring the
endpoint, the driver tells the controller that some transfers are still
active (because it stores a stale copy of the dequeue pointer).

But Mathias, what about the cycle bits in the TRBs?  Wouldn't they be
set to indicate that the OS now owns the TRBs?  This would cause the
endpoint to stop working, not cause the sort of error that Jörg saw.  
Or does the reconfigure command also store a stale copy of the Dequeue
Cycle State setting?

Alan Stern



Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Mathias Nyman
On 11.03.2015 16:03, Alan Stern wrote:
> On Wed, 11 Mar 2015, Lu, Baolu wrote:
> 
>>>> It's possible that there's something in usb core as well,
>>>> but I think the following was what happened:
>>>>
>>>> 1. First a normal configure endpoint command is issued, it sets the
>>>>    endpoint dequeue pointer to xxx400 = start of ring segment.
>>>> 2. Two urbs get queued -> two TDs put on the endpoint ring.
>>>> 3. xhci executes those, ring is in running (idle) state. sw dequeue
>>>>    at xxx430, no TDs queued. The endpoint dequeue pointer is not
>>>>    written to the endpoint output context as the ring is still in
>>>>    running state (even if idle, not advancing with no TDs queued);
>>>>    it still shows xxx400.
>>>> 4. -> something happens, xhci_endpoint_reset() is called, we do a new
>>>>    configure endpoint to 'soft reset' the endpoint, but we copy the
>>>>    dequeue pointer from the old endpoint output context to the
>>>>    configure endpoint input context, which re-initializes the old
>>>>    dequeue pointer xxx400 to xhci hardware, and it starts executing
>>>>    the old TDs from the ring.
>>
>> Is it possible to return an error message up to the client driver? The
>> client driver then decides how to handle this kind of error. It could,
>> for instance, unlink all ongoing transfers and ask the host driver to
>> soft reset this endpoint. When xhci_endpoint_reset is called, there
>> should be no ongoing transfers.
> 
> That doesn't seem to be the problem here.  Mathias is saying that all
> the transfers have indeed completed, but when reconfiguring the
> endpoint, the driver tells the controller that some transfers are still
> active (because it stores a stale copy of the dequeue pointer).
> 
> But Mathias, what about the cycle bits in the TRBs?  Wouldn't they be
> set to indicate that the OS now owns the TRBs?  This would cause the
> endpoint to stop working, not cause the sort of error that Jörg saw.  
> Or does the reconfigure command also store a stale copy of the Dequeue
> Cycle State setting?

xhci keeps track of a producer cycle state and a consumer cycle state.

These are only updated when the producer or consumer (enqueue ptr =
producer, dequeue ptr = consumer in this case) passes the last link TRB
of the last segment. The cycle bit in a TRB is only written once, when
the producer writes the TRB to the ring.

The TRB cycle bit at the dequeue pointer is compared to the consumer
cycle state.

So the cycle bit check would only mismatch if the actual sw dequeue
pointer had just passed the last TRB of the last segment, and the stale
dequeue pointer in the output context rolled it back past that TRB
again.
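
To illustrate the comparison described above, here is a minimal sketch in
Python. It is a toy model, not the driver's code: the Ring class, its
field names and the four-slot ring size are all assumptions made up for
this example. It only shows why rolling the dequeue pointer back, with no
wrap in between, makes already-executed TRBs look valid again.

```python
from dataclasses import dataclass, field

RING_SIZE = 4  # hypothetical: 3 transfer TRB slots + 1 link TRB slot


@dataclass
class Ring:
    """Toy single-segment transfer ring (not the kernel's data structure)."""
    cycle_bits: list = field(default_factory=lambda: [0] * RING_SIZE)
    enqueue: int = 0          # producer index (software)
    dequeue: int = 0          # consumer index (controller)
    producer_cycle: int = 1   # producer cycle state
    consumer_cycle: int = 1   # consumer cycle state

    def queue_trb(self):
        # The cycle bit is written exactly once, when the TRB is queued.
        self.cycle_bits[self.enqueue] = self.producer_cycle
        self.enqueue += 1
        if self.enqueue == RING_SIZE - 1:   # reached the link TRB
            self.cycle_bits[self.enqueue] = self.producer_cycle
            self.enqueue = 0
            self.producer_cycle ^= 1        # toggles only when wrapping

    def trb_ready(self):
        # A TRB is executed if its cycle bit matches the consumer cycle state.
        return self.cycle_bits[self.dequeue] == self.consumer_cycle

    def consume_trb(self):
        # Consuming a TRB does not clear or rewrite its cycle bit.
        self.dequeue += 1
        if self.dequeue == RING_SIZE - 1:   # follow the link TRB
            self.dequeue = 0
            self.consumer_cycle ^= 1


def replay_after_stale_rollback():
    ring = Ring()
    ring.queue_trb()
    ring.queue_trb()          # two TDs, one TRB each
    ring.consume_trb()
    ring.consume_trb()        # both executed, ring is idle
    idle_before = not ring.trb_ready()
    ring.dequeue = 0          # stale dequeue pointer, cycle state unchanged
    return idle_before, ring.trb_ready()


print(replay_after_stale_rollback())  # (True, True)
```

In this model the ring is idle before the rollback, and the first old TRB
passes the cycle check again afterwards, because nothing ever rewrote its
cycle bit.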

-Mathias
 


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Jörg Otte
2015-03-11 12:01 GMT+01:00 Jörg Otte :
> 2015-03-10 18:04 GMT+01:00 Mathias Nyman :
>> On 10.03.2015 17:36, Jörg Otte wrote:
>>
>>>>> Any chance you could take a log with xhci debugging enabled before
>>>>> attaching the DVB-T stick?
>>>>>
>>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>>
>>>>>
>>>>
>>>> here it comes attached.
>>>>
>>>>
>>>>> I'd suspect one of these two patches:
>>>>>
>>>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>>>
>>>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>>>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>>>>
>>>
>>> Revert the commits.
>>> The second one  "xhci: Clear the host side..."  is it !
>>>
>>
>> Yes, thank you
>>
>> Seems that It wasn't mature enough, I'll revert it.
>>
>> From your logs I can see what went wrong,
>>
>> If you still have some time, could you try out a patch (attached) and see
>> if it solves the issue for you. (on top of clean 4.0-rc3). I can't
>> reproduce it with my own USB DVB-T device
>
> Problems:
> error: patch failed: drivers/usb/host/xhci.c:2972
> error: drivers/usb/host/xhci.c: patch does not apply
>
> For me the patch looks formally good.
> No idea why.

OK, finally I got it applied successfully.
I can confirm now it works for me.

Thanks, Jörg


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-12 Thread Jörg Otte
2015-03-10 15:03 GMT+01:00 Jörg Otte :
> 2015-03-10 14:06 GMT+01:00 Mathias Nyman :
>> On 10.03.2015 11:40, Jörg Otte wrote:
>>> If I plug in my USB DVB-T stick I get the following in dmesg:
>>>
>>> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm 
>>> state.
>>> dvb-usb: will pass the complete MPEG2 transport stream to the software 
>>> demuxer.
>>> DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
>>> usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
>>> Highspeed DVB-T Receiver)...
>>> input: IR-receiver inside an USB DVB receiver as
>>> /devices/pci:00/0000:00:14.0/usb1/1-1/input/input17
>>> dvb-usb: schedule remote query interval to 50 msecs.
>>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>>> current TD ep_index 1 comp_code 1
>>> xhci_hcd :00:14.0: Looking for event-dma 0000000207540400
>>> trb-start 000207540420 trb-end 000207540420 seg-start
>>> 0002075404
>>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>>> current TD ep_index 1 comp_code 1
>>> xhci_hcd :00:14.0: Looking for event-dma 000207540410
>>> trb-start 000207540420 trb-end 000207540420 seg-start
>>> 0002075404
>>> dvb-usb: bulk message failed: -110 (2/0)
>>>
>>> and DVB-T is not functional. The problem came in with:
>>>
>>> 1163d50 Merge tag 'usb-4.0-rc3' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
>>>
>>> I never had this xhci_hcd error before so this is a regression.
>>>
>>>
>>> Thanks, Jörg
>>
>> Oh, thanks.
>>
>> Looks like we get an event for a TRB we just moved past.
>>
>> Any chance you could take a log with xhci debugging enabled before
>> attaching the DVB-T stick?
>>
>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>
>>
>
> here it comes attached.
>
>
>> I'd suspect one of these two patches:
>>
>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>> xhci: fix reporting of 0-sized URBs in control endpoint
>>
>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>

Revert the commits.
The second one  "xhci: Clear the host side..."  is it !

Thanks, Jörg


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-12 Thread Mathias Nyman
On 10.03.2015 19:29, Alan Stern wrote:
> On Tue, 10 Mar 2015, Mathias Nyman wrote:
> 
>> Yes, thank you
>>
>> Seems that It wasn't mature enough, I'll revert it.
>>
>> From your logs I can see what went wrong, 
>>
>> If you still have some time, could you try out a patch (attached) and see
>> if it solves the issue for you. (on top of clean 4.0-rc3). I can't
>> reproduce it with my own USB DVB-T device
> 
> Mathias:
> 
> Your patch description says this:
> 
>> The endpoint might have already processed some TRBs on the endpoint ring
>> before we soft reset the endpoint.
>> Make sure we set the dequeue pointer to where we were before soft reset
> 
> However, if a driver tries to issue an endpoint reset while there are
> still some URBs queued, it is a bug.  Host controller drivers shouldn't
> have to worry about this -- xhci_endpoint_reset() should simply return 
> an error if the endpoint ring isn't empty.
> 
> I suppose we should check for this in the USB core.  I'll write a patch
> and CC: you.
> 
> Alan Stern
> 

It's possible that there's something in usb core as well, 
but I think the following was what happened:

1. First a normal configure endpoint command is issued, it sets the
   endpoint dequeue pointer to xxx400 = start of ring segment.
2. Two urbs get queued -> two TDs put on the endpoint ring.
3. xhci executes those, ring is in running (idle) state. sw dequeue at
   xxx430, no TDs queued. The endpoint dequeue pointer is not written to
   the endpoint output context as the ring is still in running state
   (even if idle, not advancing with no TDs queued); it still shows
   xxx400.
4. -> something happens, xhci_endpoint_reset() is called, we do a new
   configure endpoint to 'soft reset' the endpoint, but we copy the
   dequeue pointer from the old endpoint output context to the configure
   endpoint input context, which re-initializes the old dequeue pointer
   xxx400 to xhci hardware, and it starts executing the old TDs from the
   ring.
5. The xhci driver notices that we get events for old TRBs that do not
   belong to the TD the driver thinks we should be handling.
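
The sequence above can be sketched as a few lines of Python. This is only
an illustration under stated assumptions: soft_reset() and
replayed_trbs() are hypothetical helpers, not driver functions, and the
offsets 0x400 and 0x430 stand in for the xxx400 and xxx430 pointers
mentioned in the steps. The copy_from_output_ctx=False case models the
idea of restarting at the software dequeue pointer; it is not the actual
patch.

```python
TRB_SIZE = 0x10  # an xHCI TRB is 16 bytes


def replayed_trbs(new_dequeue, sw_dequeue):
    """TRB addresses that would run again if the ring restarts at new_dequeue.

    Assumes both pointers are in the same ring segment and no wrap occurred.
    """
    return list(range(new_dequeue, sw_dequeue, TRB_SIZE))


def soft_reset(output_ctx_dequeue, sw_dequeue, copy_from_output_ctx):
    """Pick the dequeue pointer placed in the configure-endpoint input context."""
    new_dequeue = output_ctx_dequeue if copy_from_output_ctx else sw_dequeue
    return new_dequeue, replayed_trbs(new_dequeue, sw_dequeue)


# Step 4: the stale output-context pointer is copied, old TRBs run again.
stale = soft_reset(0x400, 0x430, copy_from_output_ctx=True)
print([hex(a) for a in stale[1]])   # ['0x400', '0x410', '0x420']

# Restarting at the software dequeue pointer replays nothing.
fresh = soft_reset(0x400, 0x430, copy_from_output_ctx=False)
print(fresh[1])                     # []
```

Every address in the first list is a TRB the driver has already moved
past, which is where the "not part of current TD" events in step 5 would
come from.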

-Mathias


Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-12 Thread Mathias Nyman
On 11.03.2015 18:16, Jörg Otte wrote:
> 2015-03-11 12:01 GMT+01:00 Jörg Otte :
>> 2015-03-10 18:04 GMT+01:00 Mathias Nyman :
>>> On 10.03.2015 17:36, Jörg Otte wrote:
>>>
>>>>>> I'd suspect one of these two patches:
>>>>>>
>>>>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>>>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>>>>
>>>>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>>>>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>>>>>
>>>>
>>>> Revert the commits.
>>>> The second one  "xhci: Clear the host side..."  is it !
>>>>
>>>
>>> Yes, thank you
>>>
>>> Seems that It wasn't mature enough, I'll revert it.
>>>
>>> From your logs I can see what went wrong,
>>>
>>> If you still have some time, could you try out a patch (attached) and
>>> see if it solves the issue for you. (on top of clean 4.0-rc3). I can't
>>> reproduce it with my own USB DVB-T device
>>
>> Problems:
>> error: patch failed: drivers/usb/host/xhci.c:2972
>> error: drivers/usb/host/xhci.c: patch does not apply
>>
>> For me the patch looks formally good.
>> No idea why.
> 
> OK, finally I got it applied successfully.
> I can confirm now it works for me.
> 

Great, Thanks

-Mathias 



xhci_hcd error Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

2015-03-23 Thread Alistair Grant
Hi Mathias & Devin,

I've changed the subject to avoid any confusion with the patch series
that Mathias just posted for inclusion in 4.0-rc.

On Tue, Mar 17, 2015 at 4:21 PM, Alistair Grant  wrote:
>>>>>>> It looks like I may have signed-off a little too soon.  While the patch 
>>>>>>> is
>>>>>>> working correctly if the Hauppauge Live2 is plugged in after the system 
>>>>>>> has
>>>>>>> booted and settled down (my normal use case), it fails if the Live2 is
>>>>>>> plugged in while the system is booted up.
>>>>>>>
>>>>>>> Unplugging the Live2 after recording (which appears to succeed from the
>>>>>>> command line, but had no audio), executing lsusb just hangs.
>>>>>>>
>>>>>>> I've included what I think is the relevant portions of /var/log/syslog
>>>>>>> below.  If you'd like the entire log file posted somewhere please let me
>>>>>>> know.

I've been trying to narrow down the issues with the Hauppauge Live2 and
xhci.  I upgraded to 4.0rc5 + XHCI_AVOID_BEI and found a couple of things:

1) Plugging the Live2 in after booting produced the same error message as
earlier:

ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

I had only previously seen this if the Live2 was plugged in during boot.
Maybe this is a timing / race condition that has become a bit worse in
4.0rc5?

This only happened once, and I haven't been able to reproduce it, so I'll
leave it to you how useful this is.

I've included the usb related messages from syslog at the end of the message.



2) I captured a couple of traces using usbmon, tshark and wireshark
comparing video recording with:

a) The Live2 plugged in after boot (which succeeds), and
b) The Live2 plugged in during boot (which usually fails).

There are some small differences in packet ordering, however the first
major difference is an isochronous in packet where the Live2 returns
"URB status: No such file or directory (-ENOENT) (-2)".

Devin, I'm trying to learn a bit about USB and the Live2 protocol, however I'm
not sure how far I'll get.  Are you able to offer any hints on what to do
next?

If you'd like me to post the complete traces or capture something else,
please let me know.

P.S.: I confirmed that the problem is still present if the
"usb: xhci: handle Config Error Change (CEC) in xhci driver"
patch is added to the kernel I used above (since Mathias requested
that it be added to 4.0).

I was also able to successfully record once with the Live2 plugged in during
boot, supporting the timing hypothesis mentioned above.


Thanks,
Alistair
--

The relevant packet is:

No.  Time          Source  Destination  Protocol  Length  Info
290  16.005881000  2.3     host         USB       1088    URB_ISOCHRONOUS in

Frame 290: 1088 bytes on wire (8704 bits), 1088 bytes captured (8704
bits) on interface 0
Interface id: 0 (usbmon1)
Encapsulation type: USB packets with Linux header and padding (115)
Arrival Time: Mar 23, 2015 10:03:26.691823000 CET
[Time shift for this packet: 0.0 seconds]
Epoch Time: 1427101406.691823000 seconds
[Time delta from previous captured frame: 15.868813000 seconds]
[Time delta from previous displayed frame: 15.868813000 seconds]
[Time since reference or first frame: 16.005881000 seconds]
Frame Number: 290
Frame Length: 1088 bytes (8704 bits)
Capture Length: 1088 bytes (8704 bits)
[Frame is marked: False]
[Frame is ignored: False]
[Protocols in frame: usb]
USB URB
URB id: 0x8801ee7a2000
URB type: URB_COMPLETE ('C')
URB transfer type: URB_ISOCHRONOUS (0x00)
Endpoint: 0x83, Direction: IN
1...  = Direction: IN (1)
.000 0011 = Endpoint value: 3
Device: 2
URB bus id: 1
Device setup request: not relevant ('-')
Data: present (0)
URB sec: 1427101406
URB usec: 691823
URB status: No such file or directory (-ENOENT) (-2)
URB length [bytes]: 0
Data length [bytes]: 1024
[Request in: 70]
[Time from request: 15.97074 seconds]
[bInterfaceClass: Unknown (0x)]
ISO error count: 0
Number of ISO descriptors: 64
Interval: 1
Start frame: 4468
Copy of Transfer Flags: 0x0202
Number of ISO descriptors: 64
Leftover Capture Data: eeffeeff1c00...

0000  00 20 7a ee 01 88 ff ff 43 00 83 02 01 00 2d 00   . z.....C.....-.
0010  de d6 0f 55 00 00 00 00 6f 8e 0a 00 fe ff ff ff   ...U....o.......
0020  00 00 00 00 00 04 00 00 00 00 00 00 40 00 00 00   ............@...
0030  01 00 00 00 74 11 00 00 02 02 00 00 40 00 00 00   ....t.......@...
0040

Re: xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13

2017-03-24 Thread Felipe Balbi

Hi,

Grey Christoforo  writes:
> Dear kernel USB people,
>
> Could you please have a look at the bug report here?
> bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750

Send us a proper bug report to mailing list, not a link. Also, make sure
you're testing against v4.10 or v4.11-rc3, otherwise we can't help.

Good luck

-- 
balbi


Re: xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13

2017-03-24 Thread Mathias Nyman

On 24.03.2017 12:07, Grey Christoforo wrote:

> Dear kernel USB people,
>
> Could you please have a look at the bug report here?
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750




ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
Looking for event-dma 0004777d9010 trb-start 000475a14fe0 trb-end 
000475a14fe0
seg-start 000475a14000 seg-end 000475a14ff0

So we get an interrupt saying the TRB at 0x4777d9010 is a short packet
(comp code 13), but the xhci driver thinks we are currently working on
the TRB at 0x475a14fe0. 0x475a14fe0 is the last transfer TRB on that
segment, which ends at the link TRB at 0x475a14ff0.

I bet 0x4777d9010 is in the next segment, and for some reason the driver
isn't there yet (maybe the TRB was split because data > 64k, or we're
missing an event for the previous TD).
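
The reading of that log line can be written down as a small Python
sketch. The helper names classify_event() and is_last_transfer_trb() are
hypothetical and only mirror the reasoning above; they are not functions
from the xhci driver. The sketch assumes the current TD does not wrap
across a segment boundary and that seg-end is the address of the link
TRB.

```python
TRB_SIZE = 0x10  # an xHCI TRB is 16 bytes


def classify_event(event_dma, trb_start, trb_end, seg_start, seg_end):
    """Say where the event's TRB address lies relative to the current TD."""
    if trb_start <= event_dma <= trb_end:
        return "in current TD"
    if seg_start <= event_dma <= seg_end:
        return "same segment, outside current TD"
    return "outside current segment"


def is_last_transfer_trb(trb, seg_end):
    """True if trb is the slot directly before the link TRB at seg_end."""
    return trb + TRB_SIZE == seg_end


# Values taken from the log line quoted above.
where = classify_event(
    event_dma=0x4777D9010,
    trb_start=0x475A14FE0,
    trb_end=0x475A14FE0,
    seg_start=0x475A14000,
    seg_end=0x475A14FF0,
)
print(where)                                            # outside current segment
print(is_last_transfer_trb(0x475A14FE0, 0x475A14FF0))   # True
```

Both results match the observation: the current TD sits in the last
transfer slot of its segment, and the event points somewhere else
entirely, which is consistent with (but does not prove) the next-segment
guess.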

Can you try with a 4.11 kernel, and take traces with it?

Just before sending the data over the network, do:

mount -t debugfs none /sys/kernel/debug
echo xhci-hcd >> /sys/kernel/debug/tracing/set_event

Then add content of
/sys/kernel/debug/tracing/trace
and
dmesg
to the bug

-Mathias


Re: xhci_hcd error Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

2015-03-23 Thread Devin Heitmueller
Hi Alistair,

> There are some small differences in packet ordering, however the first
> major difference is an isochronous in packet where the Live2 returns
> "URB status: No such file or directory (-ENOENT) (-2)".
>
> Devin, I'm trying to learn a bit about USB and the Live2 protocol, however I'm
> not sure how far I'll get.  Are you able to offer any hints on what to do
> next?

I'm sorry if I'm asking something already answered in your original
description of the problems - but have you confirmed the device works
properly on a system with a standard EHCI controller (as opposed to
XHCI)?

The reason I ask is that the cx231xx driver is pretty unstable
in general, and I'm wondering if perhaps you're just hitting some
issue completely unrelated to the recent XHCI problems (which
obviously needed to be addressed in order for you to get this far into
testing).

If you haven't tried it yet on a standard EHCI controller because you
don't have one in your PC, it might be worth spending the $20 to buy a
PCI card with an USB 2.0 host controller on it for testing.

It's also worth mentioning that the process you're exercising isn't
just the code path for device insertion.  The udev process and
PulseAudio will both try to connect to the device as soon as the
underlying device nodes appear, so it's possible there is some race
condition where the device is being accessed and registers are being
poked before initialization is complete.  I cannot say for certain
this would be an issue with cx231xx, but I've definitely seen it with
other V4L2 drivers.  That might also explain why you see different
behavior at boot - the driver loads early enough in the boot process
such that initialization completes before processes like udev and
pulseaudio get a chance to interact with it.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com


Re: xhci_hcd error Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

2015-03-23 Thread Alistair Grant
Hi Devin,

On Mon, Mar 23, 2015 at 8:16 PM, Devin Heitmueller
 wrote:
> Hi Alistair,
>
>> There are some small differences in packet ordering, however the first
>> major difference is an isochronous in packet where the Live2 returns
>> "URB status: No such file or directory (-ENOENT) (-2)".
>>
>> Devin, I'm trying to learn a bit about USB and the Live2 protocol, however I'm
>> not sure how far I'll get.  Are you able to offer any hints on what to do
>> next?
>
> I'm sorry if I'm asking something already answered in your original
> description of the problems - but have you confirmed the device works
> properly on a system with a standard EHCI controller (as opposed to
> XHCI)?
>
> The reason I ask is that the cx231xx driver is pretty unstable
> in general, and I'm wondering if perhaps you're just hitting some
> issue completely unrelated to the recent XHCI problems (which
> obviously needed to be addressed in order for you to get this far into
> testing).
>
> If you haven't tried it yet on a standard EHCI controller because you
> don't have one in your PC, it might be worth spending the $20 to buy a
> PCI card with an USB 2.0 host controller on it for testing.

Thanks for your follow-up.  I've run the same series of tests over
an EHCI controller without any problems, i.e. I can record
successfully regardless of whether the Live2 is plugged in
during boot or after the system has settled down.

At the moment I only have access to laptops, not a desktop,
so the EHCI testing was on a different machine, with the Ubuntu
3.13 kernel.  If you'd like me to test on a more recent kernel,
please let me know.


> It's also worth mentioning that the process you're exercising isn't
> just the code path for device insertion.  The udev process and
> PulseAudio will both try to connect to the device as soon as the
> underlying device nodes appear, so it's possible there is some race
> condition where the device is being accessed and registers are being
> poked before initialization is complete.  I cannot say for certain
> this would be an issue with cx231xx, but I've definitely seen it with
> other V4L2 drivers.  That might also explain why you see different
> behavior at boot - the driver loads early enough in the boot process
> such that initialization completes before processes like udev and
> pulseaudio get a chance to interact with it.

Good point, although the problems occur when the device is present
during boot.  It works fine if it is plugged in after boot.  However it
does appear to be some sort of timing / race condition.

Any other suggestions?

Do you know if the cx23102 datasheet is available?  All I've been able
to find is a two page sheet with a block diagram.

Thanks again,
Alistair


Re: xhci_hcd error Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

2015-03-23 Thread Devin Heitmueller
> At the moment I only have access to laptops, not a desktop,
> so the EHCI testing was on a different machine, with the Ubuntu
> 3.13 kernel.  If you'd like me to test on a more recent kernel,
> please let me know.

Ok.  That's a useful data point.

>> It's also worth mentioning that the process you're exercising isn't
>> just the code path for device insertion.  The udev process and
>> PulseAudio will both try to connect to the device as soon as the
>> underlying device nodes appear, so it's possible there is some race
>> condition where the device is being accessed and registers are being
>> poked before initialization is complete.  I cannot say for certain
>> this would be an issue with cx231xx, but I've definitely seen it with
>> other V4L2 drivers.  That might also explain why you see different
>> behavior at boot - the driver loads early enough in the boot process
>> such that initialization completes before processes like udev and
>> pulseaudio get a chance to interact with it.
>
> Good point, although the problems occur when the device is present
> during boot.  It works fine if it is plugged in after boot.  However it
> does appear to be some sort of timing / race condition.
>
> Any other suggestions?

My guess would be that there's some bug in the cx231xx that
exacerbates an edge case in the XHCI core - like prematurely setting
the USB alternate back to zero when stopping streaming and not
canceling all the URBs first.

> Do you know if the cx23102 datasheet is available?  All I've been able
> to find is a two page sheet with a block diagram.

It is not publicly available.  I have it under NDA.  That said, I
suspect we're not dealing with anything that is really specific to the
cx231xx chip, but rather just some general bug in the URB handling in
the cx231xx driver.

As a test, you may wish to try disabling the cx231xx-audio module.
That will help narrow down an interaction with ALSA, pulseaudio, and
one of the two sources of URB queueing.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com


Re: xhci_hcd error Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 1

2015-03-26 Thread Alistair Grant
On Mon, Mar 23, 2015 at 10:14 PM, Devin Heitmueller
 wrote:
>
> My guess would be that there's some bug in the cx231xx that
> exacerbates an edge case in the XHCI core - like prematurely setting
> the USB alternate back to zero when stopping streaming and not
> canceling all the URBs first.
>
> ...
>
> As a test, you may wish to try disabling the cx231xx-audio module.
> That will help narrow down an interaction with ALSA, pulseaudio, and
> one of the two sources of URB queueing.


I think you're on the right track here - blacklisting the audio module
(cx231xx_alsa) allowed successful video recording with the device inserted
during boot.

I added some logging to drivers/usb/core/message.c to check that the URB
list is empty before changing the alternate interface.  While it wasn't
ever called with a non-empty list, it did perhaps help narrow down the area
of code causing trouble.

The normal sequence of log messages when stopping recording is:

Mar 26 08:50:41 alistair-XPS13 kernel: [  771.866435] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.900280] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.943487] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988004] cx231xx 1-2:1.1:
stopping capture
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988009] cx231xx 1-2:1.1:
Stopping isoc
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988022] cx231xx 1-2:1.1:
Stop capture, if needed
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988056] cx231xx 1-2:1.1:
Stop capture, if needed
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988058] cx231xx 1-2:1.1:
closing device
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988062] cx231xx 1-2:1.1:
cx231xx_stop_stream: ep_mask = 4
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.988167] usb 1-2: akg URB
list: next=0x880073299d90, prev=0x880073299d90, eq=1, empty=1
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.993115] cx231xx 1-2:1.1:
cx231xx_stop_stream: ep_mask = 8
Mar 26 08:50:41 alistair-XPS13 kernel: [  771.993192] usb 1-2: akg URB
list: next=0x8800732992b0, prev=0x8800732992b0, eq=1, empty=1

The "akg URB" record is the log I added to usb_set_interface().
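
For readers unfamiliar with how an empty URB list looks, here is a
minimal Python model of a circular doubly linked list in the style of the
kernel's struct list_head. The ListHead class and the describe() helper
are made up for this sketch, and reading "eq" as "next == prev" is an
assumption about the log fields, not something taken from the logging
code itself.

```python
class ListHead:
    """Toy stand-in for a kernel-style circular doubly linked list node."""

    def __init__(self):
        # An empty list head points at itself in both directions.
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    """Insert new just before head, i.e. at the tail of the list."""
    new.prev = head.prev
    new.next = head
    head.prev.next = new
    head.prev = new


def list_del(entry):
    """Unlink entry and leave it pointing at itself."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = entry


def list_empty(head):
    return head.next is head


def describe(head):
    # Assumed meaning of the log fields: eq = (next == prev), empty = list_empty.
    return {"eq": int(head.next is head.prev), "empty": int(list_empty(head))}


urb_list = ListHead()
print(describe(urb_list))        # {'eq': 1, 'empty': 1}

urb = ListHead()
list_add_tail(urb, urb_list)
print(describe(urb_list))        # {'eq': 1, 'empty': 0}

list_del(urb)
print(describe(urb_list))        # {'eq': 1, 'empty': 1}
```

Note that under this reading next == prev also holds with exactly one
queued entry, so "empty" is the field that actually answers the question.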

The failures appear to have several different paths, however one of them is:

Mar 26 08:57:17 alistair-XPS13 kernel: [   87.915849] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:57:17 alistair-XPS13 kernel: [   87.960923] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.004771] cx231xx 1-2:1.1:
urb_init=0 dev->video_mode.max_pkt_size=2892
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032245] cx231xx 1-2:1.1:
stopping capture
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032253] cx231xx 1-2:1.1:
Stopping isoc
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032273] cx231xx 1-2:1.1:
Stop capture, if needed
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032326] cx231xx 1-2:1.1:
Stop capture, if needed
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032331] cx231xx 1-2:1.1:
closing device
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032337] cx231xx 1-2:1.1:
cx231xx_stop_stream: ep_mask = 4
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032526] xhci_hcd
:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
ep_index 8 comp_code 1
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032542] xhci_hcd
:00:14.0: Looking for event-dma 0001f8671400 trb-start
0001facb2090 trb-end 0001facb2090 seg-start 0001facb2000
seg-end 0001facb23f0
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032565] usb 1-2: akg URB
list: next=0x880215504eb0, prev=0x880215504eb0, eq=1, empty=1
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032579] xhci_hcd
:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
ep_index 8 comp_code 1
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032596] xhci_hcd
:00:14.0: Looking for event-dma 0001f8671410 trb-start
0001facb2090 trb-end 0001facb2090 seg-start 0001facb2000
seg-end 0001facb23f0
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032703] xhci_hcd
:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
ep_index 8 comp_code 1
Mar 26 08:57:17 alistair-XPS13 kernel: [   88.032712] xhci_hcd
:00:14.0: Looking for event-dma 0001f8671420 trb-start
0001facb2090 trb-end 0001facb2090 seg-start 0001facb2000
seg-end 0001facb23f0

Hopefully this will give some clues about where to look next.

Thanks again for all your assistance.

Cheers,
Alistair


xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-18 Thread Arkadiusz Miskiewicz

Hi.

I'm on the 4.2.0-rc2-00077-gf760b87 kernel, and when I try to copy a file from
USB storage (a SATA disk behind a SATA-USB bridge, or a pendrive; it happens in
both cases) the copy hangs shortly after it starts with:
[   77.372137] usb 2-1: new SuperSpeed USB device number 2 using xhci_hcd
[   77.388945] usb 2-1: New USB device found, idVendor=1f75, idProduct=0611
[   77.388952] usb 2-1: New USB device strings: Mfr=4, Product=5, SerialNumber=6
[   77.388956] usb 2-1: SerialNumber: 20130514
[   77.402599] usb-storage 2-1:1.0: USB Mass Storage device detected
[   77.403177] scsi host6: usb-storage 2-1:1.0
[   77.403318] usbcore: registered new interface driver usb-storage
[   77.407529] usbcore: registered new interface driver uas
[   78.400954] scsi 6:0:0:0: scsi scan: INQUIRY result too short (5), using 36
[   78.400961] scsi 6:0:0:0: Direct-Access Hitachi  HDS723020BLA642   
PQ: 0 ANSI: 0
[   78.401401] sd 6:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 
TB/1.81 TiB)
[   78.402126] sd 6:0:0:0: [sdb] Write Protect is off
[   78.402130] sd 6:0:0:0: [sdb] Mode Sense: 3b 00 00 00
[   78.402876] sd 6:0:0:0: [sdb] No Caching mode page found
[   78.402882] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[   78.444310] sd 6:0:0:0: [sdb] Attached SCSI disk
[   85.907972] EXT4-fs (sdb): mounted filesystem with ordered data mode. Opts: 
(null)
[  113.556376] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 1
[  113.556383] xhci_hcd :00:14.0: Looking for event-dma fffa4000 
trb-start fffa5fe0 trb-end fffa6000 seg-start fffa5000 
seg-end fffa5ff0
[  234.236311] usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
[  234.890862] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 1
[  234.890869] xhci_hcd :00:14.0: Looking for event-dma fff94000 
trb-start fff95fd0 trb-end fff96000 seg-start fff95000 
seg-end fff95ff0
[  355.339935] usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
[  355.574012] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 1
[  355.574018] xhci_hcd :00:14.0: Looking for event-dma fff84000 
trb-start fff85fb0 trb-end fff86000 seg-start fff85000 
seg-end fff85ff0
[  476.430339] usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
[  476.728729] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 1
[  476.728738] xhci_hcd :00:14.0: Looking for event-dma fff74000 
trb-start fff75fe0 trb-end fff76000 seg-start fff75000 
seg-end fff75ff0

3.19.3 works fine
4.0.8 fails
4.2.0-rc2-00077-gf760b87 fails

There was a similar thread in March 2015, but the issue there got resolved
by reverting one USB patch, so my issue has to be different (that revert is
included in 4.0 final, I think).

.config -> http://sprunge.us/bACC
lsusb -> http://sprunge.us/DOWb
dmesg -> http://sprunge.us/UbTF

machine is dell xps 15 9530

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


[PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-07-25 Thread Hans de Goede
Lately (with the use of uas / bulk-streams) we have been seeing several
cases where this error triggers (which should never happen).

Add some extra logging to make debugging these errors easier.

Signed-off-by: Hans de Goede 
---
 drivers/usb/host/xhci-mem.c  |  4 +++-
 drivers/usb/host/xhci-ring.c | 22 ++
 drivers/usb/host/xhci.h  |  6 +++---
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 8056d90..26129d3 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1903,7 +1903,7 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
start_dma = xhci_trb_virt_to_dma(input_seg, start_trb);
end_dma = xhci_trb_virt_to_dma(input_seg, end_trb);
 
-   seg = trb_in_td(input_seg, start_trb, end_trb, input_dma);
+   seg = trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma, false);
if (seg != result_seg) {
xhci_warn(xhci, "WARN: %s TRB math test %d failed!\n",
test_name, test_number);
@@ -1917,6 +1917,8 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
end_trb, end_dma);
xhci_warn(xhci, "Expected seg %p, got seg %p\n",
result_seg, seg);
+   trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma,
+ true);
return -1;
}
return 0;
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d9b3286..213b28a 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1718,10 +1718,12 @@ cleanup:
  * TRB in this TD, this function returns that TRB's segment.  Otherwise it
  * returns 0.
  */
-struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
+struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
+   struct xhci_segment *start_seg,
union xhci_trb  *start_trb,
union xhci_trb  *end_trb,
-   dma_addr_t  suspect_dma)
+   dma_addr_t  suspect_dma,
+   bool    debug)
 {
dma_addr_t start_dma;
dma_addr_t end_seg_dma;
@@ -1740,6 +1742,15 @@ struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
/* If the end TRB isn't in this segment, this is set to 0 */
end_trb_dma = xhci_trb_virt_to_dma(cur_seg, end_trb);
 
+   if (debug)
+   xhci_warn(xhci,
+   "Looking for event-dma %016llx trb-start %016llx trb-end %016llx seg-start %016llx seg-end %016llx\n",
+   (unsigned long long)suspect_dma,
+   (unsigned long long)start_dma,
+   (unsigned long long)end_trb_dma,
+   (unsigned long long)cur_seg->dma,
+   (unsigned long long)end_seg_dma);
+
if (end_trb_dma > 0) {
/* The end TRB is in this segment, so suspect should be here */
if (start_dma <= end_trb_dma) {
@@ -2472,8 +2483,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
td_num--;
 
/* Is this a TRB in the currently executing TD? */
-   event_seg = trb_in_td(ep_ring->deq_seg, ep_ring->dequeue,
-   td->last_trb, event_dma);
+   event_seg = trb_in_td(xhci, ep_ring->deq_seg, ep_ring->dequeue,
+   td->last_trb, event_dma, false);
 
/*
 * Skip the Force Stopped Event. The event_trb(event_dma) of FSE
@@ -2508,6 +2519,9 @@ static int handle_tx_event(struct xhci_hcd *xhci,
"part of current TD ep_index %d "
"comp_code %u\n", ep_index,
trb_comp_code);
+   trb_in_td(xhci, ep_ring->deq_seg,
+ ep_ring->dequeue, td->last_trb,
+ event_dma, true);
return -ESHUTDOWN;
}
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 88b2958..8429b7e 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1806,9 +1806,9 @@ void xhci_reset_bandwidth(struct usb_hcd *hcd, struct usb_device *udev);
 
 /* xHCI ring, segment, TRB, and TD functions */
 dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg, union xhci_trb *trb);
-struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
-   union xhci_trb *start_trb, union xhci_trb *end_trb,
-   dma_addr_t suspect_dma);
+struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
+   struct xhci_segment *start_seg, union xhci_trb *start_trb,
+  

[PATCH 1/7] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-20 Thread Mathias Nyman
From: Hans de Goede 

Lately (with the use of uas / bulk-streams) we have been seeing several
cases where this error triggers (which should never happen).

Add some extra logging to make debugging these errors easier.

Signed-off-by: Hans de Goede 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-mem.c  |  4 +++-
 drivers/usb/host/xhci-ring.c | 26 +-
 drivers/usb/host/xhci.h  |  6 +++---
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 8056d90..26129d3 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1903,7 +1903,7 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
start_dma = xhci_trb_virt_to_dma(input_seg, start_trb);
end_dma = xhci_trb_virt_to_dma(input_seg, end_trb);
 
-   seg = trb_in_td(input_seg, start_trb, end_trb, input_dma);
+   seg = trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma, false);
if (seg != result_seg) {
xhci_warn(xhci, "WARN: %s TRB math test %d failed!\n",
test_name, test_number);
@@ -1917,6 +1917,8 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
end_trb, end_dma);
xhci_warn(xhci, "Expected seg %p, got seg %p\n",
result_seg, seg);
+   trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma,
+ true);
return -1;
}
return 0;
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 60fb52a..fd0e784 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1722,10 +1722,12 @@ cleanup:
  * TRB in this TD, this function returns that TRB's segment.  Otherwise it
  * returns 0.
  */
-struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
+struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
+   struct xhci_segment *start_seg,
union xhci_trb  *start_trb,
union xhci_trb  *end_trb,
-   dma_addr_t  suspect_dma)
+   dma_addr_t  suspect_dma,
+   bool    debug)
 {
dma_addr_t start_dma;
dma_addr_t end_seg_dma;
@@ -1744,6 +1746,15 @@ struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
/* If the end TRB isn't in this segment, this is set to 0 */
end_trb_dma = xhci_trb_virt_to_dma(cur_seg, end_trb);
 
+   if (debug)
+   xhci_warn(xhci,
+   "Looking for event-dma %016llx trb-start %016llx trb-end %016llx seg-start %016llx seg-end %016llx\n",
+   (unsigned long long)suspect_dma,
+   (unsigned long long)start_dma,
+   (unsigned long long)end_trb_dma,
+   (unsigned long long)cur_seg->dma,
+   (unsigned long long)end_seg_dma);
+
if (end_trb_dma > 0) {
/* The end TRB is in this segment, so suspect should be here */
if (start_dma <= end_trb_dma) {
@@ -2476,8 +2487,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
td_num--;
 
/* Is this a TRB in the currently executing TD? */
-   event_seg = trb_in_td(ep_ring->deq_seg, ep_ring->dequeue,
-   td->last_trb, event_dma);
+   event_seg = trb_in_td(xhci, ep_ring->deq_seg, ep_ring->dequeue,
+   td->last_trb, event_dma, false);
 
/*
 * Skip the Force Stopped Event. The event_trb(event_dma) of FSE
@@ -2508,7 +2519,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
/* HC is busted, give up! */
    xhci_err(xhci,
            "ERROR Transfer event TRB DMA ptr not "
-   "part of current TD\n");
+   "part of current TD ep_index %d "
+   "comp_code %u\n", ep_index,
+   trb_comp_code);
+   trb_in_td(xhci, ep_ring->deq_seg,
+ ep_ring->dequeue, td->last_trb,
+ event_dma, true);
return -ESHUTDOWN;
}
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index dace515..d544c75 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1804,9 +1804,9 @@ void xhci_reset_bandwidth(struct usb_hcd *hcd, struct usb_device *udev);
 
 /* xHCI ri

Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-20 Thread Arkadiusz Miskiewicz
On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
> Hi.
> 
> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some file
> from usb storage (sata disk behind sata-usb bridge or pendrive; hapens in
> both cases) copying process hangs just early after start with:

Looks like suspend & resume is enough. The bluetooth firmware reload done by
the kernel triggers the problem:

[  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
[  106.313280] PM: resume of devices complete after 3003.032 msecs
[  106.314079] Restarting tasks ... done.
[  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
[  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file: 
intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
[  106.398223] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.398230] xhci_hcd :00:14.0: Looking for event-dma fffd3000 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.400396] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.400402] xhci_hcd :00:14.0: Looking for event-dma fffd3030 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.402225] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.402228] xhci_hcd :00:14.0: Looking for event-dma fffd3060 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.404401] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.404408] xhci_hcd :00:14.0: Looking for event-dma fffd3090 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.406229] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.406232] xhci_hcd :00:14.0: Looking for event-dma fffd30c0 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.408389] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.408395] xhci_hcd :00:14.0: Looking for event-dma fffd30f0 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.410291] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.410294] xhci_hcd :00:14.0: Looking for event-dma fffd3120 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.412427] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.412433] xhci_hcd :00:14.0: Looking for event-dma fffd3150 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0
[  106.414315] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 0 comp_code 1
[  106.414318] xhci_hcd :00:14.0: Looking for event-dma fffd3180 
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 
seg-end fffd4ff0

[...]



http://sprunge.us/IDUh


-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-21 Thread Mathias Nyman
On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
> On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
>> Hi.
>>
>> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some file
>> from usb storage (sata disk behind sata-usb bridge or pendrive; hapens in
>> both cases) copying process hangs just early after start with:
> 
> Looks like suspend & resume is enough. Reloading bluetooth firmware done by 
> kernel
> triggers problem:
> 
> [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
> [  106.313280] PM: resume of devices complete after 3003.032 msecs
> [  106.314079] Restarting tasks ... done.
> [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
> [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file: 
> intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
> [  106.398223] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not 
> part of current TD ep_index 0 comp_code 1

Looks like we get an event for a really old transfer for some reason; it should
probably have been handled already.

I have a patch that adds more paranoid checks for TRB cancellation, which has
been one of the major causes of the "Transfer event TRB DMA ptr not part of
current TD" errors. It also adds some logging to show what went wrong (patch
attached, applies on 4.2-rc3). Can you see if it helps?

If it doesn't, then adding xhci debugging could give us some clue:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control

You said 3.19.3 works fine, but 4.0.8 and 4.2-rc2 fail. Any chance you could
bisect it?

Thanks

-Mathias   
  

>From cd27574451792569dff002ab33148cbaf9d52faf Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Tue, 25 Nov 2014 17:35:27 +0200
Subject: [PATCH] xhci: Don't touch URB TD memory if they are no longer on the
 endpoint ring

If a URB is cancelled we want to make sure the URB's TRBs still point
to memory on the endpoint ring. If the ring was already dropped then that
TRB may point to memory already in use by another ring, or freed.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 33 ++---
 drivers/usb/host/xhci.c  | 29 -
 drivers/usb/host/xhci.h  |  1 +
 3 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 94416ff..1e46d4f 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -136,6 +136,25 @@ static void next_trb(struct xhci_hcd *xhci,
 	}
 }
 
+/* check if the TD is on the ring */
+bool xhci_td_on_ring(struct xhci_td *td, struct xhci_ring *ring)
+{
+	struct xhci_segment *seg;
+
+	if (!td->start_seg || !ring || !ring->first_seg)
+		return false;
+
+	seg = ring->first_seg;
+	do {
+		if (td->start_seg == seg)
+			return true;
+		seg = seg->next;
+	} while (seg != ring->first_seg);
+
+	return false;
+}
+
+
 /*
  * See Cycle bit rules. SW is the consumer for the event ring only.
  * Don't make a ring full of link TRBs.  That would be dumb and this would loop.
@@ -685,10 +704,16 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
 	cur_td->urb->stream_id);
 			goto remove_finished_td;
 		}
-		/*
-		 * If we stopped on the TD we need to cancel, then we have to
-		 * move the xHC endpoint ring dequeue pointer past this TD.
+		/* In case ring was dropped and segments freed or cached we
+		 * don't want to touch that memory anymore, or, if we stopped
+		 * on the TD we want to remove we need to move the dq pointer
+		 * past this TD, otherwise just turn TD to no-op
 		 */
+		if (!xhci_td_on_ring(cur_td, ep_ring)) {
+			xhci_err(xhci, "Cancelled TD not on stopped ring\n");
+			goto remove_finished_td;
+		}
+
 		if (cur_td == ep->stopped_td)
 			xhci_find_new_dequeue_state(xhci, slot_id, ep_index,
 	cur_td->urb->stream_id,
@@ -1295,11 +1320,13 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
 	/* Is the command ring deq ptr out of sync with the deq seg ptr? */
 	if (cmd_dequeue_dma == 0) {
 		xhci->error_bitmask |= 1 << 4;
+		xhci_err(xhci, "Command completion ptr and seg out of sync\n");
 		return;
 	}
 	/* Does the DMA address match our internal dequeue pointer address? */
 	if (cmd_dma != (u64) cmd_dequeue_dma) {
 		xhci->error_bitmask |= 1 << 5;
+		xhci_err(xhci, "Command completion DMA address mismatch\n");
 		return;
 	}
 
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 7da0d60..d72b46e 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1527,6 +1527,7 @@ int xhci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 	struct xhci_ring *ep_ring;
 	struct xhci_virt_ep *ep;
 	struct xhci_command *command;
+	bool remove_td_from_ring = false;
 
 	xhci = hcd_to_xhci(hcd);
 	spin_lock

Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-22 Thread Arkadiusz Miskiewicz

[sorry, resend from different email - vger postmaster team has stupid filters 
in place]

On Tuesday 21 of July 2015, Mathias Nyman wrote:
> On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
> > On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
> >> Hi.
> >> 
> >> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some
> >> file from usb storage (sata disk behind sata-usb bridge or pendrive;
> >> hapens in
> > 
> >> both cases) copying process hangs just early after start with:
> > Looks like suspend & resume is enough. Reloading bluetooth firmware done
> > by kernel triggers problem:
> > 
> > [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
> > [  106.313280] PM: resume of devices complete after 3003.032 msecs
> > [  106.314079] Restarting tasks ... done.
> > [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
> > [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file:
> > intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq [  106.398223] xhci_hcd
> > :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
> > ep_index 0 comp_code 1
> 
> Looks like we get an event for a really old transfer for some reason, it
> should probably be handled already.
> 
> I got a patch that adds more paranoid checks for TRB cancel, which has been
> one major reasons for the "Transfer event TRB DMA ptr not part of current
> TD" Errors. It also adds some logging to show what's went wrong. (patch
> attached, applies on 4.2-rc3) Can you see if it helps?

It doesn't unfortunately. 4.2.0-rc3-00017-gd725e66 + that patch
-> dmesg http://sprunge.us/PDIE

around 91s - after resume from ram bluetooth driver reloads

around 754s - tried to copy data from external usb disk


> If it doesn't, then adding xhci debugging could give us some clue:
> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control

Ok, http://sprunge.us/GiHX

mounted fs around 1347s and started copying; TRB problems were still there but 
file was being copied (in bursts)

> You said 3.19.3 works fine, but 4.0.8 and 4.2-rc2 fail, any chance you
> could bisect it?

Unfortunately some kernels around 3.19 don't even boot (grub says it loads 
initrd and no progress from that) - some commit fixed that but no idea which 
one.

> Thanks
> -Mathias

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-24 Thread Mathias Nyman
On 22.07.2015 17:12, Arkadiusz Miskiewicz wrote:
> 
> [sorry, resend from different email - vger postmaster team has stupid filters 
> in place]
> 
> On Tuesday 21 of July 2015, Mathias Nyman wrote:
>> On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
>>> On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
>>>> Hi.
>>>>
>>>> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some
>>>> file from usb storage (sata disk behind sata-usb bridge or pendrive;
>>>> hapens in
>>>
>>>> both cases) copying process hangs just early after start with:
>>> Looks like suspend & resume is enough. Reloading bluetooth firmware done
>>> by kernel triggers problem:
>>>
>>> [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
>>> [  106.313280] PM: resume of devices complete after 3003.032 msecs
>>> [  106.314079] Restarting tasks ... done.
>>> [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
>>> [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file:
>>> intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq [  106.398223] xhci_hcd
>>> :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
>>> ep_index 0 comp_code 1
>>
>> Looks like we get an event for a really old transfer for some reason, it
>> should probably be handled already.
>>
>> I got a patch that adds more paranoid checks for TRB cancel, which has been
>> one major reasons for the "Transfer event TRB DMA ptr not part of current
>> TD" Errors. It also adds some logging to show what's went wrong. (patch
>> attached, applies on 4.2-rc3) Can you see if it helps?
> 
> It doesn't unfortunately. 4.2.0-rc3-00017-gd725e66 + that patch
> -> dmesg http://sprunge.us/PDIE
> 
> around 91s - after resume from ram bluetooth driver reloads
> 
> around 754s - tried to copy data from external usb disk
> 
> 
>> If it doesn't, then adding xhci debugging could give us some clue:
>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
> 
> Ok, http://sprunge.us/GiHX
> 

Thanks for the logs. They show that the error is related to transfer
descriptors that wrap around on the endpoint ring buffer by exactly one
transfer block.

I don't know yet why this happens, and I might need some help running
additional debug patches to solve this. I'll take a more in-depth look at
the code one more time first.

A short explanation of the error, mostly for myself:

To transfer data we have a ring buffer that holds transfer request blocks
(TRBs). The ring buffer is made up of two segments, and the last TRB in each
segment is a special link TRB that points to the next segment.

Segment A: 0x3000 - 0x3ff0   (where link TRB at 0x3ff0 points to Segment B, 0x4000)
Segment B: 0x4000 - 0x4ff0   (where link TRB at 0x4ff0 points back at Segment A, 0x3000)

A transfer descriptor (TD) can consist of many TRBs, in this case three TRBs.
When a TD is completed we get an event telling us where the last transferred
TRB of the TD was.

So in this case the error:
xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
xhci_hcd :00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

tells us the first TRB of the TD (trb-start) is at 4fd0.
The second TRB is at 4fe0.
Then we have the special link TRB at 4ff0, pointing us back to the first
segment.
The third and final TRB should be back in the first segment, at 3000.

We get an event for the last TRB at 3000 and all should be fine, but the driver
claims the TD's TRBs start at 4fd0 and the last TRB is at 5000, missing the
link TRB that wraps us back to the first segment.
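
[Editorial sketch, not part of the original message.] The walk-through above
can be modelled in a few lines of C. This is only an illustration under stated
assumptions: `model_seg`, `link_trb_addr`, `seg_of` and `td_trb_addrs` are
hypothetical names, not kernel functions, and the addresses are the shortened
ones from the explanation (segment A at 0x3000, segment B at 0x4000, 16-byte
TRBs, 256 TRBs per segment).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRB_SIZE          16u   /* bytes per TRB */
#define TRBS_PER_SEGMENT  256u  /* one 4096-byte page per segment */

/* Hypothetical stand-in for a ring segment: just its bus (DMA) address
 * and the address its trailing link TRB points to. */
struct model_seg {
	uint64_t dma;       /* address of the first TRB in the segment */
	uint64_t next_dma;  /* target of the link TRB in the last slot */
};

/* Address of the link TRB, i.e. the last slot of the segment. */
uint64_t link_trb_addr(const struct model_seg *seg)
{
	return seg->dma + (uint64_t)(TRBS_PER_SEGMENT - 1u) * TRB_SIZE;
}

/* Return the segment whose address range contains addr, or NULL. */
const struct model_seg *seg_of(const struct model_seg *ring, size_t nsegs,
			       uint64_t addr)
{
	for (size_t i = 0; i < nsegs; i++) {
		uint64_t end = ring[i].dma + (uint64_t)TRBS_PER_SEGMENT * TRB_SIZE;

		if (addr >= ring[i].dma && addr < end)
			return &ring[i];
	}
	return NULL;
}

/* Store the addresses of the n data TRBs of a TD that starts at 'start'.
 * A link TRB is followed to the next segment but is not counted. */
void td_trb_addrs(const struct model_seg *ring, size_t nsegs,
		  uint64_t start, size_t n, uint64_t *out)
{
	uint64_t cur = start;

	for (size_t i = 0; i < n; i++) {
		const struct model_seg *seg = seg_of(ring, nsegs, cur);

		assert(seg != NULL);
		if (cur == link_trb_addr(seg))
			cur = seg->next_dma;  /* hop over the link TRB */
		out[i] = cur;
		cur += TRB_SIZE;
	}
}
```

With a ring of `{0x3000 -> 0x4000}` and `{0x4000 -> 0x3000}`, a three-TRB TD
starting at 0x4fd0 yields 0x4fd0, 0x4fe0 and 0x3000, so the expected last TRB
is 0x3000 (the event address), not 0x5000 as the driver computed.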


-Mathias



Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-24 Thread Mathias Nyman
On 24.07.2015 14:59, Mathias Nyman wrote:
> On 22.07.2015 17:12, Arkadiusz Miskiewicz wrote:
>>
>> On Tuesday 21 of July 2015, Mathias Nyman wrote:
>>> On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
>>>> On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
>>>>> Hi.
>>>>>
>>>>> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some
>>>>> file from usb storage (sata disk behind sata-usb bridge or pendrive;
>>>>> hapens in
>>>>
>>>>> both cases) copying process hangs just early after start with:
>>>> Looks like suspend & resume is enough. Reloading bluetooth firmware done
>>>> by kernel triggers problem:
>>>>
>>>> [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
>>>> [  106.313280] PM: resume of devices complete after 3003.032 msecs
>>>> [  106.314079] Restarting tasks ... done.
>>>> [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
>>>> [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file:
>>>> intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq [  106.398223] xhci_hcd
>>>> :00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
>>>> ep_index 0 comp_code 1
>>>
> 
> Thanks for the logs, They show that the error is related to transfer 
> descriptors that wrap around
> on the endpoint ring buffer by exactly one transfer block. 
> 
> I don't know yet why this happens, and I might need some help running 
> additional debug
> patches to solve this. I'll take a more in depth look at the code one more 
> time first.
> 

I think I found something. The recent ring segment size increase exposed an
off-by-one error that has been in the driver for a long time, but you need to
be unlucky and have your memory pages allocated in a specific order to
trigger it.

The small fix looks like this:

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 94416ff..77da8fe 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
return 0;
/* offset in TRBs */
segment_offset = trb - seg->trbs;
-   if (segment_offset > TRBS_PER_SEGMENT)
+   if (segment_offset > TRBS_PER_SEGMENT - 1)
return 0;
return seg->dma + (segment_offset * sizeof(*trb));
 }


Patch attached, could you try it out?

Thanks
-Mathias   

>From 10e909ee20846793e41973941b1367e2303ec313 Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Fri, 24 Jul 2015 15:56:23 +0300
Subject: [PATCH] xhci: fix off by one error in TRB DMA address boundary check

We need to check that a TRB is part of the current segment in use
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off by one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
caused issues as it doesn't catch the case when a TRB was the first element
of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each other in increasing order where the ring buffer wraps around,
and causes errors like:

[  106.398223] xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not
 part of current TD ep_index 0 comp_code 1
[  106.398230] xhci_hcd :00:14.0: Looking for event-dma fffd3000
 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address is one TRB outside the seg-end address.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 94416ff..77da8fe 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
 		return 0;
 	/* offset in TRBs */
 	segment_offset = trb - seg->trbs;
-	if (segment_offset > TRBS_PER_SEGMENT)
+	if (segment_offset > TRBS_PER_SEGMENT - 1)
 		return 0;
 	return seg->dma + (segment_offset * sizeof(*trb));
 }
-- 
1.8.3.2
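
[Editorial sketch, not part of the original message.] The boundary case the
patch describes can be reproduced in a small user-space model. Everything here
is an assumption made for illustration: `model_trb`, `model_seg`,
`virt_to_dma_old` and `virt_to_dma_fixed` are hypothetical names that only
mimic the shape of the check in the diff above, and the fixed variant uses the
`>=` form, which is equivalent to `> TRBS_PER_SEGMENT - 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEGMENT 256

/* Simplified 16-byte TRB; the field layout is illustrative only. */
struct model_trb {
	uint32_t field[4];
};
static_assert(sizeof(struct model_trb) == 16, "a TRB is 16 bytes");

struct model_seg {
	struct model_trb *trbs;  /* virtual address of the first TRB */
	uint64_t dma;            /* bus address of the first TRB */
};

/* Pre-fix bounds check: an offset equal to TRBS_PER_SEGMENT slips through. */
uint64_t virt_to_dma_old(const struct model_seg *seg,
			 const struct model_trb *trb)
{
	ptrdiff_t off;

	if (!seg || !trb || trb < seg->trbs)
		return 0;
	off = trb - seg->trbs;            /* offset in TRBs */
	if (off > TRBS_PER_SEGMENT)       /* off by one */
		return 0;
	return seg->dma + (uint64_t)off * sizeof(*trb);
}

/* Post-fix bounds check: valid offsets are 0 .. TRBS_PER_SEGMENT - 1. */
uint64_t virt_to_dma_fixed(const struct model_seg *seg,
			   const struct model_trb *trb)
{
	ptrdiff_t off;

	if (!seg || !trb || trb < seg->trbs)
		return 0;
	off = trb - seg->trbs;            /* offset in TRBs */
	if (off >= TRBS_PER_SEGMENT)
		return 0;
	return seg->dma + (uint64_t)off * sizeof(*trb);
}
```

To mimic the unlucky layout, place two segments back to back in one array of
512 TRBs, give the first one a bus address of 0x4000, and ask for the address
of element 256 (the first TRB of the following segment). The old check returns
0x4000 + 256 * 16 = 0x5000, which matches the bogus trb-end in the log, while
the fixed check returns 0, meaning "not in this segment".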



Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-24 Thread Peter Stuge
Mathias Nyman wrote:
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
> return 0;
> /* offset in TRBs */
> segment_offset = trb - seg->trbs;
> -   if (segment_offset > TRBS_PER_SEGMENT)
> +   if (segment_offset > TRBS_PER_SEGMENT - 1)

Maybe change the > comparison to >= rather than add the extra "- 1"?


//Peter


Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-24 Thread Mathias Nyman
On 24.07.2015 16:53, Peter Stuge wrote:
> Mathias Nyman wrote:
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
>> return 0;
>> /* offset in TRBs */
>> segment_offset = trb - seg->trbs;
>> -   if (segment_offset > TRBS_PER_SEGMENT)
>> +   if (segment_offset > TRBS_PER_SEGMENT - 1)
> 
> Maybe change the > comparison to >= rather than add the extra "- 1"?

Yes, sure, I'll change it if this really was the cause.

Just happy to finally find a probable cause after staring at logs and code for
some time.

-Mathias  

 



Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-07-24 Thread Arkadiusz Miskiewicz
On Friday 24 of July 2015, Mathias Nyman wrote:
> On 24.07.2015 14:59, Mathias Nyman wrote:
> > On 22.07.2015 17:12, Arkadiusz Miskiewicz wrote:
> >> On Tuesday 21 of July 2015, Mathias Nyman wrote:
> >>> On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
> >>>> On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
> >>>>> Hi.
> >>>>> 
> >>>>> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some
> >>>>> file from usb storage (sata disk behind sata-usb bridge or pendrive;
> >>>>> hapens in
> >>>> 
> >>>>> both cases) copying process hangs just early after start with:
> >>>> Looks like suspend & resume is enough. Reloading bluetooth firmware
> >>>> done by kernel triggers problem:
> >>>> 
> >>>> [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
> >>>> [  106.313280] PM: resume of devices complete after 3003.032 msecs
> >>>> [  106.314079] Restarting tasks ... done.
> >>>> [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
> >>>> [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
> >>>> [  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
> > 
> > Thanks for the logs. They show that the error is related to transfer
> > descriptors that wrap around on the endpoint ring buffer by exactly one
> > transfer block.
> > 
> > I don't know yet why this happens, and I might need some help running
> > additional debug patches to solve this. I'll take a more in depth look
> > at the code one more time first.
> 
> I think I found something. The recent ring segment size increase exposed an
> off-by-one error that has been in the driver for a long time. But you need
> to be unlucky and have your memory pages allocated in a specific order to
> trigger it.
> 
> A small fix looks like this:
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 94416ff..77da8fe 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
>  		return 0;
>  	/* offset in TRBs */
>  	segment_offset = trb - seg->trbs;
> -	if (segment_offset > TRBS_PER_SEGMENT)
> +	if (segment_offset > TRBS_PER_SEGMENT - 1)
>  		return 0;
>  	return seg->dma + (segment_offset * sizeof(*trb));
>  }
> 
> 
> Patch attached, could you try it out?

Works fine with this patch, so:

Tested-by: Arkadiusz Miśkiewicz 

Thanks!

ps. please push to stable@, too

> 
> Thanks
> -Mathias

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


Re: xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1

2015-08-03 Thread Mathias Nyman
On 24.07.2015 18:33, Arkadiusz Miskiewicz wrote:
> On Friday 24 of July 2015, Mathias Nyman wrote:
>> On 24.07.2015 14:59, Mathias Nyman wrote:
>>> On 22.07.2015 17:12, Arkadiusz Miskiewicz wrote:
>>>> On Tuesday 21 of July 2015, Mathias Nyman wrote:
>>>>> On 20.07.2015 23:13, Arkadiusz Miskiewicz wrote:
>>>>>> On Saturday 18 of July 2015, Arkadiusz Miskiewicz wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> I'm on 4.2.0-rc2-00077-gf760b87 kernel and while trying to copy some
>>>>>>> file from usb storage (sata disk behind sata-usb bridge or pendrive;
>>>>>>> happens in both cases) copying process hangs just early after start with:
>>>>>>
>>>>>> Looks like suspend & resume is enough. Reloading bluetooth firmware
>>>>>> done by kernel triggers problem:
>>>>>>
>>>>>> [  106.302783] rtc_cmos 00:02: System wakeup disabled by ACPI
>>>>>> [  106.313280] PM: resume of devices complete after 3003.032 msecs
>>>>>> [  106.314079] Restarting tasks ... done.
>>>>>> [  106.326434] Bluetooth: hci0: read Intel version: 370710018002030d00
>>>>>> [  106.330422] Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
>>>>>> [  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
>>>
>>> Thanks for the logs. They show that the error is related to transfer
>>> descriptors that wrap around on the endpoint ring buffer by exactly one
>>> transfer block.
>>>
>>> I don't know yet why this happens, and I might need some help running
>>> additional debug patches to solve this. I'll take a more in depth look
>>> at the code one more time first.
>>
>> I think I found something. The recent ring segment size increase exposed an
>> off-by-one error that has been in the driver for a long time. But you need
>> to be unlucky and have your memory pages allocated in a specific order to
>> trigger it.
>>
>> A small fix looks like this:
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index 94416ff..77da8fe 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
>>  		return 0;
>>  	/* offset in TRBs */
>>  	segment_offset = trb - seg->trbs;
>> -	if (segment_offset > TRBS_PER_SEGMENT)
>> +	if (segment_offset > TRBS_PER_SEGMENT - 1)
>>  		return 0;
>>  	return seg->dma + (segment_offset * sizeof(*trb));
>>  }
>>
>>
>> Patch attached, could you try it out?
> 
> Works fine with this patch, so:
> 
> Tested-by: Arkadiusz Miśkiewicz 
> 
> Thanks!
> 
> ps. please push to stable@, too
> 

Patch sent forward, added Tested-by and stable tags
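
To illustrate why the page allocation order matters for the quoted fix, here is a rough user-space sketch. It assumes 256 TRBs of 16 bytes each per segment (one 4096-byte page) and uses made-up bus addresses; all names (`struct trb`, `struct seg`, `virt_to_dma_old()`, `virt_to_dma_fixed()`, `seg_a`, `seg_b`) are hypothetical. When two segments sit in adjacent, ascending virtual pages, the first TRB of the second segment is at offset `TRBS_PER_SEGMENT` from the first segment, so the old `>` check lets it through and produces a bus address that belongs to neither segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEGMENT 256	/* assumed: one 4096-byte page of 16-byte TRBs */

/* Hypothetical stand-ins for union xhci_trb and struct xhci_segment. */
struct trb { uint32_t field[4]; };
struct seg { struct trb *trbs; uint64_t dma; };

/* Translate trb relative to seg; offsets >= first_bad are rejected. */
static uint64_t virt_to_dma_limit(const struct seg *seg, const struct trb *trb,
				  ptrdiff_t first_bad)
{
	ptrdiff_t off;

	if (!seg || !trb || trb < seg->trbs)
		return 0;
	off = trb - seg->trbs;
	if (off >= first_bad)
		return 0;
	return seg->dma + (uint64_t)off * sizeof(*trb);
}

/* Old bound: "offset > TRBS_PER_SEGMENT" still accepts offset 256. */
static uint64_t virt_to_dma_old(const struct seg *seg, const struct trb *trb)
{
	return virt_to_dma_limit(seg, trb, TRBS_PER_SEGMENT + 1);
}

/* Fixed bound: only offsets 0 .. 255 are accepted. */
static uint64_t virt_to_dma_fixed(const struct seg *seg, const struct trb *trb)
{
	return virt_to_dma_limit(seg, trb, TRBS_PER_SEGMENT);
}

/* Two segments whose virtual pages happen to be adjacent and ascending,
 * while their (made-up) bus addresses are unrelated. */
static struct trb pages[2 * TRBS_PER_SEGMENT];
static const struct seg seg_a = { &pages[0], 0x5000 };
static const struct seg seg_b = { &pages[TRBS_PER_SEGMENT], 0x2000 };
```

In this model, asking `seg_a` about the first TRB of `seg_b` yields 0x6000 with the old bound, which is not the real address 0x2000, while the fixed bound returns 0 so the caller can move on to the next segment.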

Thanks
- Mathias



Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-01 Thread Greg Kroah-Hartman
On Fri, Jul 25, 2014 at 10:01:20PM +0200, Hans de Goede wrote:
> Lately (with the use of uas / bulk-streams) we have been seeing several
> cases where this error triggers (which should never happen).
> 
> Add some extra logging to make debugging these errors easier.
> 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/usb/host/xhci-mem.c  |  4 +++-
>  drivers/usb/host/xhci-ring.c | 22 ++
>  drivers/usb/host/xhci.h  |  6 +++---
>  3 files changed, 24 insertions(+), 8 deletions(-)

This patch just fails to apply:
checking file drivers/usb/host/xhci-mem.c
checking file drivers/usb/host/xhci-ring.c
Hunk #4 FAILED at 2519.
1 out of 4 hunks FAILED
checking file drivers/usb/host/xhci.h
Hunk #1 succeeded at 1804 (offset -2 lines).

So while I wanted to apply it, I can't :(

greg k-h


Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-02 Thread Hans de Goede
Hi,

On 08/02/2014 12:58 AM, Greg Kroah-Hartman wrote:
> On Fri, Jul 25, 2014 at 10:01:20PM +0200, Hans de Goede wrote:
>> Lately (with the use of uas / bulk-streams) we have been seeing several
>> cases where this error triggers (which should never happen).
>>
>> Add some extra logging to make debugging these errors easier.
>>
>> Signed-off-by: Hans de Goede 
>> ---
>>  drivers/usb/host/xhci-mem.c  |  4 +++-
>>  drivers/usb/host/xhci-ring.c | 22 ++
>>  drivers/usb/host/xhci.h  |  6 +++---
>>  3 files changed, 24 insertions(+), 8 deletions(-)
> 
> This patch just fails to apply:
> checking file drivers/usb/host/xhci-mem.c
> checking file drivers/usb/host/xhci-ring.c
> Hunk #4 FAILED at 2519.
> 1 out of 4 hunks FAILED
> checking file drivers/usb/host/xhci.h
> Hunk #1 succeeded at 1804 (offset -2 lines).
> 
> So while I wanted to apply it, I can't :(

Oops, I based it on the latest 3.16-rc# at the time of sending,
there probably was something in next already which conflicts.

Thanks for picking up (most of) the other patches!

I'll rebase the remainder of the patches and resend.

Regards,

Hans


Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Mathias Nyman
On 07/25/2014 11:01 PM, Hans de Goede wrote:
> Lately (with the use of uas / bulk-streams) we have been seeing several
> cases where this error triggers (which should never happen).
> 
> Add some extra logging to make debugging these errors easier.
> 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/usb/host/xhci-mem.c  |  4 +++-
>  drivers/usb/host/xhci-ring.c | 22 ++
>  drivers/usb/host/xhci.h  |  6 +++---
>  3 files changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> index 8056d90..26129d3 100644
> --- a/drivers/usb/host/xhci-mem.c
> +++ b/drivers/usb/host/xhci-mem.c
> @@ -1903,7 +1903,7 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
>   start_dma = xhci_trb_virt_to_dma(input_seg, start_trb);
>   end_dma = xhci_trb_virt_to_dma(input_seg, end_trb);
>  
> - seg = trb_in_td(input_seg, start_trb, end_trb, input_dma);
> + seg = trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma, false);
>   if (seg != result_seg) {
>   xhci_warn(xhci, "WARN: %s TRB math test %d failed!\n",
>   test_name, test_number);
> @@ -1917,6 +1917,8 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
>   end_trb, end_dma);
>   xhci_warn(xhci, "Expected seg %p, got seg %p\n",
>   result_seg, seg);
> + trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma,
> +   true);
>   return -1;
>   }
>   return 0;
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index d9b3286..213b28a 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -1718,10 +1718,12 @@ cleanup:
>   * TRB in this TD, this function returns that TRB's segment.  Otherwise it
>   * returns 0.
>   */
> -struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
> +struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
> + struct xhci_segment *start_seg,
>   union xhci_trb  *start_trb,
>   union xhci_trb  *end_trb,
> - dma_addr_t  suspect_dma)
> + dma_addr_t  suspect_dma,
> + bool  debug)

The added debug information is useful, but I'm not a big fan of the boolean
debug function parameter.

I get that we only want to print the information when we really expect the
TRB to be in the TD, and fail. This is a simple way of doing it, but reading
code with lots of true / false function arguments is difficult (xhci has too
many of them already).

I haven't got a better solution; all other variants seem to have their own
drawbacks. New suggestions are welcome.

-Mathias




Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Alan Stern
On Mon, 4 Aug 2014, Mathias Nyman wrote:

> > --- a/drivers/usb/host/xhci-ring.c
> > +++ b/drivers/usb/host/xhci-ring.c
> > @@ -1718,10 +1718,12 @@ cleanup:
> >   * TRB in this TD, this function returns that TRB's segment.  Otherwise it
> >   * returns 0.
> >   */
> > -struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
> > +struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
> > +   struct xhci_segment *start_seg,
> > union xhci_trb  *start_trb,
> > union xhci_trb  *end_trb,
> > -   dma_addr_t  suspect_dma)
> > +   dma_addr_t  suspect_dma,
> > +   bool debug)
> 
> The added debug information is useful, but I'm not a big fan of the boolean 
> debug function parameter.
> 
> I get that we only want to print the information when we really expect the 
> trb to be in the TD, and fail.
> This is a simple way of doing it, but reading the code with lots of true / 
> false function arguments is difficult.
> (xhci has too many of them already)
> 
> I haven't got a better solution, all other variants seems to have their own 
> drawbacks.
> New suggestions are welcome

This function could be made private.  Then there could be two separate 
public functions to call it, one with the debug parameter set to true 
and one with the parameter set to false.

The parameter's name could be changed to something more readable, such 
as warn_if_not_found.
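
A minimal sketch of the wrapper idea, assuming a much simplified lookup: the real trb_in_td() walks ring segments, whereas this model only checks one address range, and the names `dma_in_range_common()`, `dma_in_range()` and `dma_in_range_debug()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Private worker: the only place that sees the boolean. */
static bool dma_in_range_common(uint64_t start, uint64_t end,
				uint64_t suspect, bool debug)
{
	if (debug)
		fprintf(stderr, "Looking for %016llx in %016llx-%016llx\n",
			(unsigned long long)suspect,
			(unsigned long long)start,
			(unsigned long long)end);
	return start <= suspect && suspect <= end;
}

/* Public entry points: call sites no longer carry a bare true/false. */
static bool dma_in_range(uint64_t start, uint64_t end, uint64_t suspect)
{
	return dma_in_range_common(start, end, suspect, false);
}

static bool dma_in_range_debug(uint64_t start, uint64_t end, uint64_t suspect)
{
	return dma_in_range_common(start, end, suspect, true);
}
```

A caller then writes dma_in_range(...) on the normal path and dma_in_range_debug(...) on the error path, so the intent is visible in the function name.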

Alan Stern




Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Hans de Goede
Hi,

On 08/04/2014 05:24 PM, Alan Stern wrote:
> On Mon, 4 Aug 2014, Mathias Nyman wrote:
> 
>>> --- a/drivers/usb/host/xhci-ring.c
>>> +++ b/drivers/usb/host/xhci-ring.c
>>> @@ -1718,10 +1718,12 @@ cleanup:
>>>   * TRB in this TD, this function returns that TRB's segment.  Otherwise it
>>>   * returns 0.
>>>   */
>>> -struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
>>> +struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
>>> +   struct xhci_segment *start_seg,
>>> union xhci_trb  *start_trb,
>>> union xhci_trb  *end_trb,
>>> -   dma_addr_t  suspect_dma)
>>> +   dma_addr_t  suspect_dma,
>>> +   bool debug)
>>
>> The added debug information is useful, but I'm not a big fan of the boolean 
>> debug function parameter.
>>
>> I get that we only want to print the information when we really expect the 
>> trb to be in the TD, and fail.
>> This is a simple way of doing it, but reading the code with lots of true / 
>> false function arguments is difficult.
>> (xhci has too many of them already)
>>
>> I haven't got a better solution, all other variants seems to have their own 
>> drawbacks.
>> New suggestions are welcome
> 
> This function could be made private.  Then there could be two separate 
> public functions to call it, one with the debug parameter set to true 
> and one with the parameter set to false.

Not sure that kind of indirection makes things better, it feels like
obfuscation to me.

> The parameter's name could be changed to something more readable, such 
> as warn_if_not_found.

But that is not what it does, it prints debug information *while* it is
searching. If things were as simple as warn if not found, we could simply always
warn. The problem is that by the time we know the TRB is not found in the TD,
we may have already gone through the loop multiple times, and we want a log
message for each iteration of the loop when this happens. So one way or
the other we need to go through the loop twice.
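
The two-pass approach described here can be sketched roughly as follows. This is a simplified user-space model, not the driver code: `struct range`, `find_range()`, `find_range_or_explain()` and the `log_lines` counter are made up for illustration, with each range standing in for one ring segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the DMA span of one ring segment. */
struct range { uint64_t start, end; };

static int log_lines;	/* counts debug lines so the behaviour can be checked */

/* Returns the index of the range containing suspect, or -1 if none does. */
static int find_range(const struct range *r, size_t n, uint64_t suspect,
		      bool debug)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (debug) {
			/* one log line per loop iteration */
			fprintf(stderr,
				"Looking for %016llx in %016llx-%016llx\n",
				(unsigned long long)suspect,
				(unsigned long long)r[i].start,
				(unsigned long long)r[i].end);
			log_lines++;
		}
		if (r[i].start <= suspect && suspect <= r[i].end)
			return (int)i;
	}
	return -1;
}

/* Quiet first pass; only a failed lookup is replayed with logging on. */
static int find_range_or_explain(const struct range *r, size_t n,
				 uint64_t suspect)
{
	int idx = find_range(r, n, suspect, false);

	if (idx < 0)
		find_range(r, n, suspect, true);
	return idx;
}
```

The successful path stays silent, and the cost of walking the loop a second time is only paid once the lookup has already failed.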

Regards,

Hans


Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Hans de Goede
Hi,

On 08/02/2014 10:41 AM, Hans de Goede wrote:
> Hi,
> 
> On 08/02/2014 12:58 AM, Greg Kroah-Hartman wrote:
>> On Fri, Jul 25, 2014 at 10:01:20PM +0200, Hans de Goede wrote:
>>> Lately (with the use of uas / bulk-streams) we have been seeing several
>>> cases where this error triggers (which should never happen).
>>>
>>> Add some extra logging to make debugging these errors easier.
>>>
>>> Signed-off-by: Hans de Goede 
>>> ---
>>>  drivers/usb/host/xhci-mem.c  |  4 +++-
>>>  drivers/usb/host/xhci-ring.c | 22 ++
>>>  drivers/usb/host/xhci.h  |  6 +++---
>>>  3 files changed, 24 insertions(+), 8 deletions(-)
>>
>> This patch just fails to apply:
>> checking file drivers/usb/host/xhci-mem.c
>> checking file drivers/usb/host/xhci-ring.c
>> Hunk #4 FAILED at 2519.
>> 1 out of 4 hunks FAILED
>> checking file drivers/usb/host/xhci.h
>> Hunk #1 succeeded at 1804 (offset -2 lines).
>>
>> So while I wanted to apply it, I can't :(
> 
> Oops, I based it on the latest 3.16-rc# at the time of sending,
> there probably was something in next already which conflicts.

Ok, I've figured out where the conflict came from. I posted
another series of 3 patches earlier which already made some
logging changes to the same code path, specifically
this patch depends on the

"xhci: Log ep-index and comp-code on TRB DMA ptr not part of current TD"

patch from that series being present. I've squashed the 2 together
for the resend I'm about to do.

More importantly, that earlier series contained a bug fix, which
Mathias said he would pick up, but which does not seem to
have made its way into usb-next:

http://www.spinics.net/lists/linux-usb/msg110950.html

I'll also include that one in the resend.

Regards,

Hans


[PATCH v2 2/7] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Hans de Goede
Lately (with the use of uas / bulk-streams) we have been seeing several
cases where this error triggers (which should never happen).

Add some extra logging to make debugging these errors easier.

Signed-off-by: Hans de Goede 
---
 drivers/usb/host/xhci-mem.c  |  4 +++-
 drivers/usb/host/xhci-ring.c | 26 +-
 drivers/usb/host/xhci.h  |  6 +++---
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 8056d90..26129d3 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1903,7 +1903,7 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
 	start_dma = xhci_trb_virt_to_dma(input_seg, start_trb);
 	end_dma = xhci_trb_virt_to_dma(input_seg, end_trb);
 
-	seg = trb_in_td(input_seg, start_trb, end_trb, input_dma);
+	seg = trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma, false);
 	if (seg != result_seg) {
 		xhci_warn(xhci, "WARN: %s TRB math test %d failed!\n",
 				test_name, test_number);
@@ -1917,6 +1917,8 @@ static int xhci_test_trb_in_td(struct xhci_hcd *xhci,
 				end_trb, end_dma);
 		xhci_warn(xhci, "Expected seg %p, got seg %p\n",
 				result_seg, seg);
+		trb_in_td(xhci, input_seg, start_trb, end_trb, input_dma,
+			  true);
 		return -1;
 	}
 	return 0;
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index ac8cf23..2e19986 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1722,10 +1722,12 @@ cleanup:
  * TRB in this TD, this function returns that TRB's segment.  Otherwise it
  * returns 0.
  */
-struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
+struct xhci_segment *trb_in_td(struct xhci_hcd *xhci,
+		struct xhci_segment *start_seg,
 		union xhci_trb	*start_trb,
 		union xhci_trb	*end_trb,
-		dma_addr_t	suspect_dma)
+		dma_addr_t	suspect_dma,
+		bool		debug)
 {
 	dma_addr_t start_dma;
 	dma_addr_t end_seg_dma;
@@ -1744,6 +1746,15 @@ struct xhci_segment *trb_in_td(struct xhci_segment *start_seg,
 		/* If the end TRB isn't in this segment, this is set to 0 */
 		end_trb_dma = xhci_trb_virt_to_dma(cur_seg, end_trb);
 
+		if (debug)
+			xhci_warn(xhci,
+				"Looking for event-dma %016llx trb-start %016llx trb-end %016llx seg-start %016llx seg-end %016llx\n",
+				(unsigned long long)suspect_dma,
+				(unsigned long long)start_dma,
+				(unsigned long long)end_trb_dma,
+				(unsigned long long)cur_seg->dma,
+				(unsigned long long)end_seg_dma);
+
 		if (end_trb_dma > 0) {
 			/* The end TRB is in this segment, so suspect should be here */
 			if (start_dma <= end_trb_dma) {
@@ -2476,8 +2487,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 			td_num--;
 
 		/* Is this a TRB in the currently executing TD? */
-		event_seg = trb_in_td(ep_ring->deq_seg, ep_ring->dequeue,
-				td->last_trb, event_dma);
+		event_seg = trb_in_td(xhci, ep_ring->deq_seg, ep_ring->dequeue,
+				td->last_trb, event_dma, false);
 
 		/*
 		 * Skip the Force Stopped Event. The event_trb(event_dma) of FSE
@@ -2509,7 +2520,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 				/* HC is busted, give up! */
 				xhci_err(xhci,
 					"ERROR Transfer event TRB DMA ptr not "
-					"part of current TD\n");
+					"part of current TD ep_index %d "
+					"comp_code %u\n", ep_index,
+					trb_comp_code);
+				trb_in_td(xhci, ep_ring->deq_seg,
+					  ep_ring->dequeue, td->last_trb,
+					  event_dma, true);
 				return -ESHUTDOWN;
 			}
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index dace515..d544c75 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1804,9 +1804,9 @@ void xhci_reset_bandwidth(struct usb_hcd *hcd, struct usb_device *udev);
 
 /* xHCI ring, segment, TRB, and TD functions */
 dma_addr_t xhci

Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Mathias Nyman
On 08/04/2014 06:56 PM, Hans de Goede wrote:
> Hi,
> 
> On 08/02/2014 10:41 AM, Hans de Goede wrote:
>> Hi,
>>
>> On 08/02/2014 12:58 AM, Greg Kroah-Hartman wrote:
>>> On Fri, Jul 25, 2014 at 10:01:20PM +0200, Hans de Goede wrote:
>>>> Lately (with the use of uas / bulk-streams) we have been seeing several
>>>> cases where this error triggers (which should never happen).
>>>>
>>>> Add some extra logging to make debugging these errors easier.
>>>>
>>>> Signed-off-by: Hans de Goede
>>>> ---
>>>>  drivers/usb/host/xhci-mem.c  |  4 +++-
>>>>  drivers/usb/host/xhci-ring.c | 22 ++
>>>>  drivers/usb/host/xhci.h  |  6 +++---
>>>>  3 files changed, 24 insertions(+), 8 deletions(-)
>>>
>>> This patch just fails to apply:
>>> checking file drivers/usb/host/xhci-mem.c
>>> checking file drivers/usb/host/xhci-ring.c
>>> Hunk #4 FAILED at 2519.
>>> 1 out of 4 hunks FAILED
>>> checking file drivers/usb/host/xhci.h
>>> Hunk #1 succeeded at 1804 (offset -2 lines).
>>>
>>> So while I wanted to apply it, I can't :(
>>
>> Oops, I based it on the latest 3.16-rc# at the time of sending,
>> there probably was something in next already which conflicts.
> 
> Ok, I've figured out where the conflict came from. I posted
> another series of 3 patches earlier which already made some
> logging changes to the same code path, specifically
> this patch depends on the
> 
> "xhci: Log ep-index and comp-code on TRB DMA ptr not part of current TD"
> 
> patch from that series being present. I've squashed the 2 together
> for the resend I'm about to do.
> 
> More importantly, that earlier series contained a bug fix, which
> Mathias said he would pick up, but which does not seem to
> have made its way into usb-next:
> 
> http://www.spinics.net/lists/linux-usb/msg110950.html
> 

That's right, I was under the impression we were too late for usb-next already,
and planned on sending it as a fix to usb-linus once 3.17-rc1 is out.

> I'll also include that one in the resend.

You don't think we should try to get it to 3.17 still? 

-Mathias



Re: [PATCH 03/11] xhci: Log extra info on "ERROR Transfer event TRB DMA ptr not part of current TD"

2014-08-04 Thread Hans de Goede
Hi,

On 08/04/2014 06:19 PM, Mathias Nyman wrote:
> On 08/04/2014 06:56 PM, Hans de Goede wrote:
>> Hi,
>>
>> On 08/02/2014 10:41 AM, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 08/02/2014 12:58 AM, Greg Kroah-Hartman wrote:
>>>> On Fri, Jul 25, 2014 at 10:01:20PM +0200, Hans de Goede wrote:
>>>>> Lately (with the use of uas / bulk-streams) we have been seeing several
>>>>> cases where this error triggers (which should never happen).
>>>>>
>>>>> Add some extra logging to make debugging these errors easier.
>>>>>
>>>>> Signed-off-by: Hans de Goede
>>>>> ---
>>>>>  drivers/usb/host/xhci-mem.c  |  4 +++-
>>>>>  drivers/usb/host/xhci-ring.c | 22 ++
>>>>>  drivers/usb/host/xhci.h  |  6 +++---
>>>>>  3 files changed, 24 insertions(+), 8 deletions(-)
>>>>
>>>> This patch just fails to apply:
>>>> checking file drivers/usb/host/xhci-mem.c
>>>> checking file drivers/usb/host/xhci-ring.c
>>>> Hunk #4 FAILED at 2519.
>>>> 1 out of 4 hunks FAILED
>>>> checking file drivers/usb/host/xhci.h
>>>> Hunk #1 succeeded at 1804 (offset -2 lines).
>>>>
>>>> So while I wanted to apply it, I can't :(
>>>
>>> Oops, I based it on the latest 3.16-rc# at the time of sending,
>>> there probably was something in next already which conflicts.
>>
>> Ok, I've figured out where the conflict came from. I posted
>> another series of 3 patches earlier which already made some
>> logging changes to the same code path, specifically
>> this patch depends on the
>>
>> "xhci: Log ep-index and comp-code on TRB DMA ptr not part of current TD"
>>
>> patch from that series being present. I've squashed the 2 together
>> for the resend I'm about to do.
>>
>> More importantly, that earlier series contained a bug fix, which
>> Mathias said he would pick up, but which does not seem to
>> have made its way into usb-next:
>>
>> http://www.spinics.net/lists/linux-usb/msg110950.html
>>
> 
> That's right, I was under the impression we were too late for usb-next
> already, and planned on sending it as a fix to usb-linus once 3.17-rc1 is out.
> 
>> I'll also include that one in the resend.
> 
> You don't think we should try to get it to 3.17 still? 

I do think it should be added to 3.17 still, see the cover letter of
the resend. I included it for completeness.

Regards,

Hans


USB headset doesn't work with USB 3.0 port in kernel 2.6.37 - xhci_hcd error: "ERROR Transfer event TRB DMA ptr not part of current TD"

2016-07-15 Thread Goutham BG
Hi,

I'm working on a project to support a USB headset on a device containing the
TI DM8168 SoC. It runs Linux kernel version 2.6.37. The headset works fine,
with proper audio, when connected to the USB 2.0 ports, which use the musb
host controller driver. However, when I connect the headset to the USB 3.0
ports, which use the xhci host controller driver, it sometimes throws the
following error repeatedly.
---
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
---

After this error, most of the PCM read/writes fail and xhci starts throwing 
other "ep ring" errors as shown below.
---
cannot submit datapipe for urb 2, error -22: internal error
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
cannot submit datapipe for urb 2, error -22: internal error
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
cannot submit datapipe for urb 2, error -22: internal error
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
cannot submit datapipe for urb 2, error -22: internal error
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
cannot submit datapipe for urb 2, error -22: internal error
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
xhci_hcd 0000:09:00.0: ERROR Transfer event TRB DMA ptr not part of current TD
xhci_hcd 0000:09:00.0: ERROR no room on ep ring
cannot submit datapipe for urb 0, error -12: unknown error
xhci_hcd 0000:09:00.0: ERROR no room on ep ring
cannot submit datapipe for urb 0, error -12: unknown error
xhci_hcd 0000:09:00.0: ERROR no room on ep ring
---

I suspect this could be a bug fixed by the patch titled "xhci: Fix bug after
deq ptr set to link TRB", which went into the kernel in v3.6 (commit
50d0206fcaea3e736f912fd5b00ec6233fb4ce44). The commit message says that the
fix has to be backported to kernels as old as 2.6.31 and that a separate patch
will be created for older kernels. Following is the excerpt from the commit
message:
---
This patch should be backported to kernels as old as 2.6.31.  A separate
patch will be created for kernels older than 3.4, since inc_deq was
modified in 3.4 and this patch will not apply.
---

Please let me know if a patch is available for this fix which can be applied
to kernel 2.6.37.

Appreciate your help on this.

Thanks,
Goutham BG



Re: USB headset doesn't work with USB 3.0 port in kernel 2.6.37 - xhci_hcd error: "ERROR Transfer event TRB DMA ptr not part of current TD"

2016-07-15 Thread Greg KH
On Fri, Jul 15, 2016 at 10:38:04AM +0000, Goutham BG wrote:
> Hi,
> 
> I'm working on a project to support USB headset on a device containing TI
> chipset DM8168 SoC. It runs Linux kernel version 2.6.37.

Wow that's an obsolete and very very old kernel version, many hundreds
of thousands of changes old.  Please work with your vendor who is
forcing you to use such a kernel version, as you are already paying for
that support from them.  There is nothing that we can do here to help
you with this, unless you can reproduce this on a 4.6 kernel release.

good luck!

greg k-h