Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-02 Thread Durval Menezes
Hello,

On Mon, Aug 1, 2016 at 1:44 PM, Durval Menezes  wrote:
> Hi Mike,
> 
> On Mon, Aug 1, 2016 at 12:05 PM, Mike Murdoch  
> wrote:
> > On 2016-08-01 13:57, Durval Menezes wrote:
> > > Hi Mathias,
> > >
> > > On Mon, Aug 1, 2016 at 8:20 AM, Mathias Nyman 
> > >  wrote:
> > >>> On 01.08.2016 13:15, Durval Menezes wrote:
> > >>> Hello Mike, Mathias, list,
> > >>>
> > >>> On 06.02.2016 19:08, Mike Murdoch wrote:
> > >>> Bug ID: 111251
> > >>>
> > >>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
> > >>> kernel 4.4.1-gentoo.
> > >>>
> > >>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > >>> Controller (rev 04) (prog-if 30 [XHCI])
> > >>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
> > >>>  Flags: bus master, fast devsel, latency 0
> > >>>  Memory at f380 (64-bit, non-prefetchable) [size=8K]
> > >>>  Capabilities: [50] Power Management version 3
> > >>>  Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> > >>>  Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
> > >>>  Capabilities: [a0] Express Endpoint, MSI 00
> > >>>  Capabilities: [100] Advanced Error Reporting
> > >>>  Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> > >>>  Capabilities: [150] Latency Tolerance Reporting
> > >>>  Kernel driver in use: xhci_hcd
> > >>>  Kernel modules: xhci_pci
> > >>>
> > >>> When runtime power control for this controller is disabled
> > >>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
> > >>> works fine and reaches over 120MB/s transfer rates.
> > >>>
> > >>> When runtime power control for this controller is enabled
> > >>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
> > >>> can be observed:
> > >>>
> > >>> - Transfer rates are much lower at around 30MB/s
> > >>> - During transfers, the controller dies after a couple of seconds:
> > >>>
> > >>> I found this message in the list archives, and I have the exact same
> > >>> issues on exactly the same hardware (Thinkpad W520 laptop with the same
> > >>> USB3 controller showing on lspci -v); otherwise, I'm running distro 
> > >>> kernel
> > >>> 2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.
> > >>>
> > >>> I just verified that my controller's PM was set by default to "auto":
> > >>> cat /sys/bus/pci/devices/\:0e\:00.0/power/control
> > >>> auto
> > >>> I have now set it to "on" and will test whether this will work around
> > >>> the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
> > >>> released from another system that is using them right now).

The docks (actually a 4-disk Mediasonic Probox enclosure and a
single-disk USpeed SATA-to-USB adapter) have returned. I rebooted my
machine (to make sure I was starting from as clean a slate as possible),
then, before plugging anything in, set the controller's PM to "on" (i.e., no
power management) with the "echo" command above, and confirmed it with the
"cat" command above. Then I tried plugging the adapter into each of
the two ports controlled by the uPD720200 in turn; the result of each attempt
(as recorded in syslog at debug level) was just:

Aug  2 11:45:16 localhost kernel: hub 3-0:1.0: unable to enumerate USB device on port 1

for the lower port, and

Aug  2 11:54:19 localhost kernel: hub 3-0:1.0: unable to enumerate USB device on port 2

for the upper port.

To confirm that the adapter is working, I connected it to the "combo"
USB/eSATA port (which on the W520 is right beside the two uPD720200
ports) and it worked great (albeit limited to USB2.1, as that is this
port's type):

Aug  2 11:56:46 localhost kernel: usb 2-1.2: new high speed USB device number 3 using ehci_hcd
Aug  2 11:56:47 localhost kernel: usb 2-1.2: New USB device found, idVendor=174c, idProduct=5106
Aug  2 11:56:47 localhost kernel: usb 2-1.2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
Aug  2 11:56:47 localhost kernel: usb 2-1.2: Product: AS2105
Aug  2 11:56:47 localhost kernel: usb 2-1.2: Manufacturer: ASMedia
Aug  2 11:56:47 localhost kernel: usb 2-1.2: SerialNumber: W2A87518
Aug  2 11:56:47 localhost kernel: usb 2-1.2: configuration #1 chosen from 1 choice
Aug  2 11:56:47 localhost kernel: Initializing USB Mass Storage driver...
Aug  2 11:56:47 localhost kernel: scsi6 : SCSI emulation for USB Mass Storage devices
Aug  2 11:56:47 localhost kernel: usb-storage: device found at 3
Aug  2 11:56:47 localhost kernel: usb-storage: waiting for device to settle before scanning
Aug  2 11:56:47 localhost kernel: usbcore: registered new interface driver usb-storage
Aug  2 11:56:47 localhost kernel: USB Mass Storage support registered.
Aug  2 11:56:48 localhost kernel: 

Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-01 Thread Durval Menezes
Hi Mike,

On Mon, Aug 1, 2016 at 12:05 PM, Mike Murdoch  wrote:
> On 2016-08-01 13:57, Durval Menezes wrote:
> > Hi Mathias,
> >
> > On Mon, Aug 1, 2016 at 8:20 AM, Mathias Nyman 
> >  wrote:
> >>> On 01.08.2016 13:15, Durval Menezes wrote:
> >>> Hello Mike, Mathias, list,
> >>>
> >>> On 06.02.2016 19:08, Mike Murdoch wrote:
> >>> Bug ID: 111251
> >>>
> >>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
> >>> kernel 4.4.1-gentoo.
> >>>
> >>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> >>> Controller (rev 04) (prog-if 30 [XHCI])
> >>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
> >>>  Flags: bus master, fast devsel, latency 0
> >>>  Memory at f380 (64-bit, non-prefetchable) [size=8K]
> >>>  Capabilities: [50] Power Management version 3
> >>>  Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> >>>  Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
> >>>  Capabilities: [a0] Express Endpoint, MSI 00
> >>>  Capabilities: [100] Advanced Error Reporting
> >>>  Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> >>>  Capabilities: [150] Latency Tolerance Reporting
> >>>  Kernel driver in use: xhci_hcd
> >>>  Kernel modules: xhci_pci
> >>>
> >>> When runtime power control for this controller is disabled
> >>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
> >>> works fine and reaches over 120MB/s transfer rates.
> >>>
> >>> When runtime power control for this controller is enabled
> >>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
> >>> can be observed:
> >>>
> >>> - Transfer rates are much lower at around 30MB/s
> >>> - During transfers, the controller dies after a couple of seconds:
> >>>
> >>> I found this message in the list archives, and I have the exact same
> >>> issues on exactly the same hardware (Thinkpad W520 laptop with the same
> >>> USB3 controller showing on lspci -v); otherwise, I'm running distro kernel
> >>> 2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.
> >>>
> >>> I just verified that my controller's PM was set by default to "auto":
> >>> cat /sys/bus/pci/devices/\:0e\:00.0/power/control
> >>> auto
> >>> I have now set it to "on" and will test whether this will work around
> >>> the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
> >>> released from another system that is using them right now).
> >>>
> >>> I have one question for Mike: have you upgraded your uPD720200 controller
> >>> firmware (as per [1], [2]) or are you still running stock?
> >>>
> >>> Also, one question for Mathias: do you know whether your patches at [3]
> >>> can be applied to kernel 2.6.32?
> >> The last patch in [3] is faulty. So don't use the patches from the mail.
> >>
> >> I just force updated that branch, so if you like you can try to backport
> >> patches from:
> >>
> >>  git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git 
> >> bug_usb3_enum_rtresume
> >>
> >> only 2 patches are relevant:
> >>
> >> 8caabe9 xhci: Don't suspend the xhci bus it there is a pending event.
> >> 4427456 xhci: resume USB 3 roothub first
> > Thanks Mathias. Now I only need Mike's response concerning the firmware
> > in order to proceed.
> > 
> No, I haven't tried updating the firmware. Feel free to give it a go,
> I'm curious if it'll make a difference.

As long as the patch (or the workaround) allows me to avoid the issue,
I'd rather let sleeping dragons lie (Windows-only update procedure, etc.)

:-)

> As for the patches. All three of them did fix this bug, but introduced
> other problems (I don't remember details, sorry). As Mathias said, the
> last one is faulty. However, using only the first two patches is *not*
> enough to completely fix this bug (I verified it just now).

No prob. Mathias sent me a reference to the relevant (and presumably
working) patches. I will first try the "disable PM mode" workaround and
later (possibly *much* later) try to backport the patches to my kernel.
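
The "disable PM mode" workaround amounts to writing "on" to the controller's power/control attribute in sysfs. A minimal sketch follows. The function name set_runtime_pm is made up for illustration, and the device directory is passed as an argument rather than hard-coded, because the full PCI address is not spelled out in this thread. On a real system it would need root.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): write a runtime-PM policy
# to a device's power/control attribute in sysfs.
#   $1 = sysfs device directory, e.g. /sys/bus/pci/devices/<pci-address>
#   $2 = "on" (runtime PM disabled) or "auto" (runtime PM enabled)
set_runtime_pm() {
    dev_dir=$1
    mode=$2
    case "$mode" in
        on|auto) ;;
        *) echo "mode must be 'on' or 'auto'" >&2; return 2 ;;
    esac
    ctl="$dev_dir/power/control"
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (wrong path, or not root?)" >&2
        return 1
    fi
    printf '%s\n' "$mode" > "$ctl"
    # Read the value back so the caller can confirm it took effect.
    cat "$ctl"
}
```

Calling set_runtime_pm with the controller's sysfs directory and "on" should print "on" if the write succeeded. The setting is not persistent, so it would have to be reapplied after each boot.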

> Unfortunately I don't have the time to do much testing. The Thinkpad is
> used by someone else and I only have access to it on the weekends. A
> workaround is to just disable runtime power management.

Thanks for the feedback.

> Let me know how things work for you!

Will do: will post a progress report (on the workaround) later today to the
list and, as you asked, directly to your email too.

Thanks again,
-- 
  Durval Menezes (durval AT tmp DOT com DOT br, http://www.tmp.com.br/)

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-01 Thread Mike Murdoch
Hello,

On 2016-08-01 13:57, Durval Menezes wrote:
> Hi Mathias,
>
> On Mon, Aug 1, 2016 at 8:20 AM, Mathias Nyman  
> wrote:
>>> On 01.08.2016 13:15, Durval Menezes wrote:
>>> Hello Mike, Mathias, list,
>>>
>>> On 06.02.2016 19:08, Mike Murdoch wrote:
>>> Bug ID: 111251
>>>
>>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
>>> kernel 4.4.1-gentoo.
>>>
>>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
>>> Controller (rev 04) (prog-if 30 [XHCI])
>>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
>>>  Flags: bus master, fast devsel, latency 0
>>>  Memory at f380 (64-bit, non-prefetchable) [size=8K]
>>>  Capabilities: [50] Power Management version 3
>>>  Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
>>>  Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
>>>  Capabilities: [a0] Express Endpoint, MSI 00
>>>  Capabilities: [100] Advanced Error Reporting
>>>  Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
>>>  Capabilities: [150] Latency Tolerance Reporting
>>>  Kernel driver in use: xhci_hcd
>>>  Kernel modules: xhci_pci
>>>
>>> When runtime power control for this controller is disabled
>>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
>>> works fine and reaches over 120MB/s transfer rates.
>>>
>>> When runtime power control for this controller is enabled
>>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
>>> can be observed:
>>>
>>> - Transfer rates are much lower at around 30MB/s
>>> - During transfers, the controller dies after a couple of seconds:
>>>
>>> I found this message in the list archives, and I have the exact same
>>> issues on exactly the same hardware (Thinkpad W520 laptop with the same
>>> USB3 controller showing on lspci -v); otherwise, I'm running distro kernel
>>> 2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.
>>>
>>> I just verified that my controller's PM was set by default to "auto":
>>> cat /sys/bus/pci/devices/\:0e\:00.0/power/control
>>> auto
>>> I have now set it to "on" and will test whether this will work around
>>> the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
>>> released from another system that is using them right now).
>>>
>>> I have one question for Mike: have you upgraded your uPD720200 controller
>>> firmware (as per [1], [2]) or are you still running stock?
>>>
>>> Also, one question for Mathias: do you know whether your patches at [3]
>>> can be applied to kernel 2.6.32?
>> The last patch in [3] is faulty. So don't use the patches from the mail.
>>
>> I just force updated that branch, so if you like you can try to backport
>> patches from:
>>
>>  git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git 
>> bug_usb3_enum_rtresume
>>
>> only 2 patches are relevant:
>>
>> 8caabe9 xhci: Don't suspend the xhci bus it there is a pending event.
>> 4427456 xhci: resume USB 3 roothub first
> Thanks Mathias. Now I only need Mike's response concerning the firmware
> in order to proceed.
>
> Cheers,
No, I haven't tried updating the firmware. Feel free to give it a go,
I'm curious if it'll make a difference.

As for the patches. All three of them did fix this bug, but introduced
other problems (I don't remember details, sorry). As Mathias said, the
last one is faulty. However, using only the first two patches is *not*
enough to completely fix this bug (I verified it just now).

Unfortunately I don't have the time to do much testing. The Thinkpad is
used by someone else and I only have access to it on the weekends. A
workaround is to just disable runtime power management.

Let me know how things work for you!

Cheers,
Mike


Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-01 Thread Durval Menezes
Hi Mathias,

On Mon, Aug 1, 2016 at 8:20 AM, Mathias Nyman  
wrote:
> > On 01.08.2016 13:15, Durval Menezes wrote:
> > Hello Mike, Mathias, list,
> > 
> > On 06.02.2016 19:08, Mike Murdoch wrote:
> > Bug ID: 111251
> > 
> > I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
> > kernel 4.4.1-gentoo.
> > 
> > 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > Controller (rev 04) (prog-if 30 [XHCI])
> >  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
> >  Flags: bus master, fast devsel, latency 0
> >  Memory at f380 (64-bit, non-prefetchable) [size=8K]
> >  Capabilities: [50] Power Management version 3
> >  Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> >  Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
> >  Capabilities: [a0] Express Endpoint, MSI 00
> >  Capabilities: [100] Advanced Error Reporting
> >  Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> >  Capabilities: [150] Latency Tolerance Reporting
> >  Kernel driver in use: xhci_hcd
> >  Kernel modules: xhci_pci
> > 
> > When runtime power control for this controller is disabled
> > (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
> > works fine and reaches over 120MB/s transfer rates.
> > 
> > When runtime power control for this controller is enabled
> > (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
> > can be observed:
> > 
> > - Transfer rates are much lower at around 30MB/s
> > - During transfers, the controller dies after a couple of seconds:
> > 
> > I found this message in the list archives, and I have the exact same
> > issues on exactly the same hardware (Thinkpad W520 laptop with the same
> > USB3 controller showing on lspci -v); otherwise, I'm running distro kernel
> > 2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.
> > 
> > I just verified that my controller's PM was set by default to "auto":
> > cat /sys/bus/pci/devices/\:0e\:00.0/power/control
> > auto
> > I have now set it to "on" and will test whether this will work around
> > the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
> > released from another system that is using them right now).
> > 
> > I have one question for Mike: have you upgraded your uPD720200 controller
> > firmware (as per [1], [2]) or are you still running stock?
> > 
> > Also, one question for Mathias: do you know whether your patches at [3]
> > can be applied to kernel 2.6.32?
> 
> The last patch in [3] is faulty. So don't use the patches from the mail.
> 
> I just force updated that branch, so if you like you can try to backport
> patches from:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git 
> bug_usb3_enum_rtresume
> 
> only 2 patches are relevant:
> 
> 8caabe9 xhci: Don't suspend the xhci bus it there is a pending event.
> 4427456 xhci: resume USB 3 roothub first

Thanks Mathias. Now I only need Mike's response concerning the firmware
in order to proceed.

Cheers,
-- 
  Durval Menezes (durval AT tmp DOT com DOT br, http://www.tmp.com.br/)


Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-01 Thread Mathias Nyman

Hi

On 01.08.2016 13:15, Durval Menezes wrote:

Hello Mike, Mathias, list,

On 06.02.2016 19:08, Mike Murdoch wrote:

Bug ID: 111251

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
 Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
 Flags: bus master, fast devsel, latency 0
 Memory at f380 (64-bit, non-prefetchable) [size=8K]
 Capabilities: [50] Power Management version 3
 Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
 Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
 Capabilities: [a0] Express Endpoint, MSI 00
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
 Capabilities: [150] Latency Tolerance Reporting
 Kernel driver in use: xhci_hcd
 Kernel modules: xhci_pci

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:


I found this message in the list archives, and I have the exact same
issues on exactly the same hardware (Thinkpad W520 laptop with the same
USB3 controller showing on lspci -v); otherwise, I'm running distro kernel
2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.

I just verified that my controller's PM was set by default to "auto":
cat /sys/bus/pci/devices/\:0e\:00.0/power/control
auto
I have now set it to "on" and will test whether this will work around
the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
released from another system that is using them right now).

I have one question for Mike: have you upgraded your uPD720200 controller
firmware (as per [1], [2]) or are you still running stock?

Also, one question for Mathias: do you know whether your patches at [3]
can be applied to kernel 2.6.32?


The last patch in [3] is faulty. So don't use the patches from the mail.

I just force updated that branch, so if you like you can try to backport
patches from:

 git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git bug_usb3_enum_rtresume

only 2 patches are relevant:

8caabe9 xhci: Don't suspend the xhci bus it there is a pending event.
4427456 xhci: resume USB 3 roothub first
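
For anyone attempting the backport, one possible way to pull the two commits out as patch files is sketched below. This is only an illustration: the helper name print_backport_cmds and the default output directory are invented, and the helper prints the git commands instead of running them so they can be reviewed first. The commit IDs are the ones quoted above; since the branch has been force-updated, they may not resolve in a later fetch.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): print, without running them,
# the git commands that would fetch the topic branch and export the two
# named commits as patch files. Intended to be used inside a kernel tree.
XHCI_REPO='git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git'
XHCI_BRANCH='bug_usb3_enum_rtresume'

print_backport_cmds() {
    # $1 = directory for the generated patch files (assumed default below)
    out_dir=${1:-xhci-patches}
    echo "git fetch $XHCI_REPO $XHCI_BRANCH"
    # Commit IDs as quoted in the message; they may have changed since.
    for commit in 8caabe9 4427456; do
        echo "git format-patch -1 -o $out_dir $commit"
    done
}
```

Running print_backport_cmds and piping its output to sh inside a kernel git tree would perform the fetch and write one patch file per commit. Applying those patches to a 2.6.32-based tree is a separate step and will likely need manual work.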

-Mathias



References
 [1] https://forums.lenovo.com/t5/ThinkPad-P-and-W-Series-Mobile/Anyone-updated-their-W520-USB-3-0-firmware/td-p/1164719
 [2] http://pete.akeo.ie/2011/10/flashing-necrenesas-usb-30.html
 [3] http://marc.info/?l=linux-usb&m=145684596900873&w=2





Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-08-01 Thread Durval Menezes
Hello Mike, Mathias, list,

On 06.02.2016 19:08, Mike Murdoch wrote:
> Bug ID: 111251
> 
> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
> kernel 4.4.1-gentoo.
> 
> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> Controller (rev 04) (prog-if 30 [XHCI])
> Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
> Flags: bus master, fast devsel, latency 0
> Memory at f380 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [50] Power Management version 3
> Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
> Capabilities: [a0] Express Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> Capabilities: [150] Latency Tolerance Reporting
> Kernel driver in use: xhci_hcd
> Kernel modules: xhci_pci
> 
> When runtime power control for this controller is disabled
> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
> works fine and reaches over 120MB/s transfer rates.
> 
> When runtime power control for this controller is enabled
> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
> can be observed:
> 
> - Transfer rates are much lower at around 30MB/s
> - During transfers, the controller dies after a couple of seconds:

I found this message in the list archives, and I have the exact same
issues on exactly the same hardware (Thinkpad W520 laptop with the same
USB3 controller showing on lspci -v); otherwise, I'm running distro kernel
2.6.32-573.7.1.el6.x86_64 on a Springdale Linux 6.7 (RHEL6) install.

I just verified that my controller's PM was set by default to "auto":
cat /sys/bus/pci/devices/\:0e\:00.0/power/control
auto
I have now set it to "on" and will test whether this will work around
the issue (I'm waiting for my USB3.0 "heavy duty" disk docks to be
released from another system that is using them right now).
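
The check above can be wrapped in a small read-only helper. This is a sketch under assumptions: the name show_runtime_pm is invented, the device directory is supplied by the caller, and the power/runtime_status attribute is treated as optional because older kernels may not expose it.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): print a device's runtime-PM
# policy and, where available, its current runtime status.
#   $1 = sysfs device directory, e.g. /sys/bus/pci/devices/<pci-address>
show_runtime_pm() {
    dev_dir=$1
    if [ ! -r "$dev_dir/power/control" ]; then
        echo "no readable power/control under $dev_dir" >&2
        return 1
    fi
    printf 'control=%s\n' "$(cat "$dev_dir/power/control")"
    # runtime_status may be missing on older kernels, so it is optional.
    if [ -r "$dev_dir/power/runtime_status" ]; then
        printf 'runtime_status=%s\n' "$(cat "$dev_dir/power/runtime_status")"
    fi
}
```

A control value of "auto" means the kernel may runtime-suspend the device, while "on" keeps it powered.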

I have one question for Mike: have you upgraded your uPD720200 controller
firmware (as per [1], [2]) or are you still running stock?

Also, one question for Mathias: do you know whether your patches at [3]
can be applied to kernel 2.6.32?

References
[1] https://forums.lenovo.com/t5/ThinkPad-P-and-W-Series-Mobile/Anyone-updated-their-W520-USB-3-0-firmware/td-p/1164719
[2] http://pete.akeo.ie/2011/10/flashing-necrenesas-usb-30.html
[3] http://marc.info/?l=linux-usb&m=145684596900873&w=2

Cheers, 
-- 
  Durval Menezes (durval AT tmp DOT com DOT br, http://www.tmp.com.br/)



Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-03-14 Thread Mike Murdoch


On 2016-03-14 10:06, Mathias Nyman wrote:
> On 13.03.2016 11:16, Mike Murdoch wrote:
>>
>>
>> On 2016-03-01 16:32, Mathias Nyman wrote:
>>> On 18.02.2016 18:34, Mike Murdoch wrote:


>>>> On 2016-02-18 16:12, Mathias Nyman wrote:
>>>>> On 16.02.2016 23:58, main.ha...@googlemail.com wrote:
>>>>>>
>>>>>>
>>>>>> On 2016-02-08 15:31, Mathias Nyman wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> On 06.02.2016 19:08, Mike Murdoch wrote:
>>>>>>>> Bug ID: 111251
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
>>>>>>>> kernel 4.4.1-gentoo.
>>>>>>>>
>>>>>>>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
>>>>>>>> Controller (rev 04) (prog-if 30 [XHCI])
>>>>>>>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
>>>>>>>>
>>>>>>>> When runtime power control for this controller is disabled
>>>>>>>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
>>>>>>>> works fine and reaches over 120MB/s transfer rates.
>>>>>>>>
>>>>>>>> When runtime power control for this controller is enabled
>>>>>>>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
>>>>>>>> can be observed:
>>>>>>>>
>>>>>>>> - Transfer rates are much lower at around 30MB/s
>>>>>>>> - During transfers, the controller dies after a couple of seconds:
>>>>>>>>
>>>>>>>> At this point, a reboot is required to reactivate the controller,
>>>>>>>> unloading and reloading the xhci_* modules does not work.
>>>>>>>>
>>>
>>>
>>> ...
>>>
>>> I did some more digging, there are a few things that need to be
>>> addressed:
>>> 1. We should resume USB3 bus before USB2 bus to let devices enumerate
>>> as USB3 better,
>>> this gives them more time to finish the link training.
>>>
>>> 2. After resuming xhci we don't see any port changes immediately, hub
>>> thinks nothing
>>> happened and stops polling the ports, hub will suspend again ->
>>> xhci will try to
>>> suspend.
>>> 3. Roothubs will autosuspend immediately after autoresume,
>>> (autosuspend timeout = 0)
>>> This could be a reason why we see the "xhci_suspend" entry in the
>>> log. We either
>>> need to increase the autosuspend timeout, or prevent suspend if we
>>> can see the pending
>>> event in a xhci status register.
>>>
>>> inserting usb3 storage device
>>> Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address
>>> to 0xe001
>>> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port
>>> polling.
>>> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping
>>> port polling.
>>> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_suspend: stopping port
>>> polling.
>>>
>>> I got a few patches, attached. They both partially try to fix the
>>> issue, and add more logging.
>>> Same changes can be found in a topic branch from in:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
>>> bug_usb3_enum_rtresume
>>>
>>> Any chance to try them out?
>>>
>>> -Mathias
>>
>> Hello,
>>
>> I've come around to testing these patches. I applied them all at once
>> (did you want me to test them individually?) and they appear to fix this
>> issue completely! Full speed and no dead controllers. Do you need any
>> further logs?
>>
>
> That's good news.
>
> Can I add your "Tested-by:" tag to two of the patches?
> I'll send them as fixes after rc1 is out.
>
> No more logs needed as it works, I'll send the third additional debug
> info
> patch to usb-next later. It will be useful for future debugging
>
> Thanks
> Mathias
>
>
>
> for further debugging this case
> The third patch is just additional debug info and useful for future
> debugging (or if those
>
>
Hello,

yes, feel free to add the tag. Thanks for everything!

- Mike


Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-03-14 Thread Mathias Nyman

On 13.03.2016 11:16, Mike Murdoch wrote:



On 2016-03-01 16:32, Mathias Nyman wrote:

On 18.02.2016 18:34, Mike Murdoch wrote:



On 2016-02-18 16:12, Mathias Nyman wrote:

On 16.02.2016 23:58, main.ha...@googlemail.com wrote:



On 2016-02-08 15:31, Mathias Nyman wrote:

Hi

On 06.02.2016 19:08, Mike Murdoch wrote:

Bug ID: 111251

Hello,

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520
laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
Subsystem: Lenovo uPD720200 USB 3.0 Host Controller

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the
controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two
effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:

At this point, a reboot is required to reactivate the controller,
unloading and reloading the xhci_* modules does not work.





...

I did some more digging, there are a few things that need to be addressed:

1. We should resume the USB3 bus before the USB2 bus to let devices enumerate
   as USB3 better; this gives them more time to finish the link training.

2. After resuming xhci we don't see any port changes immediately, the hub
   thinks nothing happened and stops polling the ports, the hub will suspend
   again -> xhci will try to suspend.

3. Roothubs will autosuspend immediately after autoresume
   (autosuspend timeout = 0). This could be a reason why we see the
   "xhci_suspend" entry in the log. We either need to increase the
   autosuspend timeout, or prevent suspend if we can see the pending
   event in an xhci status register.
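
The roothub autosuspend settings mentioned in point 3 can be inspected from userspace. The sketch below is only an illustration: the helper name list_roothub_autosuspend is invented, and it assumes the timeout in question is the one exposed through each root hub's power/autosuspend_delay_ms attribute, which may not exist on every kernel.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): list USB root hubs together
# with their runtime-PM policy and autosuspend delay.
#   $1 = directory holding USB device nodes (default: /sys/bus/usb/devices)
list_roothub_autosuspend() {
    usb_dir=${1:-/sys/bus/usb/devices}
    # Root hubs appear as usb1, usb2, ... in this directory.
    for hub in "$usb_dir"/usb[0-9]*; do
        [ -d "$hub" ] || continue
        # Either attribute may be missing or unreadable; print "?" then.
        control=$(cat "$hub/power/control" 2>/dev/null) || control='?'
        delay=$(cat "$hub/power/autosuspend_delay_ms" 2>/dev/null) || delay='?'
        printf '%s control=%s autosuspend_delay_ms=%s\n' \
            "${hub##*/}" "$control" "$delay"
    done
}
```

A root hub showing control=auto with a delay of 0 would be allowed to suspend as soon as it goes idle, which is the situation described above.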

inserting usb3 storage device
Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address to 0xe001
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_suspend: stopping port polling.

I have a few patches, attached. They each try to fix part of the
issue, and add more logging.
The same changes can be found in a topic branch in:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git bug_usb3_enum_rtresume

Any chance to try them out?

-Mathias


Hello,

I've come around to testing these patches. I applied them all at once
(did you want me to test them individually?) and they appear to fix this
issue completely! Full speed and no dead controllers. Do you need any
further logs?



That's good news.

Can I add your "Tested-by:" tag to two of the patches?
I'll send them as fixes after rc1 is out.

No more logs are needed since it works. I'll send the third patch, which adds
debug info, to usb-next later. It will be useful for future debugging.

Thanks
Mathias



for further debugging this case
The third patch is just additional debug info and useful for future debugging 
(or if those



Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-03-13 Thread Mike Murdoch


On 2016-03-01 16:32, Mathias Nyman wrote:
> On 18.02.2016 18:34, Mike Murdoch wrote:
>>
>>
>> On 2016-02-18 16:12, Mathias Nyman wrote:
>>> On 16.02.2016 23:58, main.ha...@googlemail.com wrote:


>>>> On 2016-02-08 15:31, Mathias Nyman wrote:
>>>>> Hi
>>>>>
>>>>> On 06.02.2016 19:08, Mike Murdoch wrote:
>>>>>> Bug ID: 111251
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
>>>>>> kernel 4.4.1-gentoo.
>>>>>>
>>>>>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
>>>>>> Controller (rev 04) (prog-if 30 [XHCI])
>>>>>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
>>>>>>
>>>>>> When runtime power control for this controller is disabled
>>>>>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
>>>>>> works fine and reaches over 120MB/s transfer rates.
>>>>>>
>>>>>> When runtime power control for this controller is enabled
>>>>>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
>>>>>> can be observed:
>>>>>>
>>>>>> - Transfer rates are much lower at around 30MB/s
>>>>>> - During transfers, the controller dies after a couple of seconds:
>>>>>>
>>>>>> At this point, a reboot is required to reactivate the controller,
>>>>>> unloading and reloading the xhci_* modules does not work.
>>>>>>
>
>
> ...
>
> I did some more digging, there are a few things that need to be
> addressed:
> 1. We should resume USB3 bus before USB2 bus to let devices enumerate
> as USB3 better,
>this gives them more time to finish the link training.
>
> 2. After resuming xhci we don't see any port changes immediately, hub
> thinks nothing
> happened and stops polling the ports, hub will suspend again ->
> xhci will try to
>suspend.  
> 3. Roothubs will autosuspend immediately after autoresume,
> (autosuspend timeout = 0)
>This could be a reason why we see the "xhci_suspend" entry in the
> log. We either
>need to increase the autosuspend timeout, or prevent suspend if we
> can see the pending
>event in a xhci status register.
>  
> inserting usb3 storage device
> Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address
> to 0xe001
> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port
> polling.
> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping
> port polling.
> Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_suspend: stopping port
> polling.
>
> I got a few patches, attached. They both partially try to fix the
> issue, and add more logging.
> The same changes can be found in a topic branch in:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> bug_usb3_enum_rtresume
>
> Any chance to try them out?
>
> -Mathias

Hello,

I've gotten around to testing these patches. I applied them all at once
(did you want me to test them individually?) and they appear to fix this
issue completely! Full speed and no dead controllers. Do you need any
further logs?

Many thanks so far! :)

Cheers,
- Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-03-01 Thread Mathias Nyman

On 18.02.2016 18:34, Mike Murdoch wrote:



On 2016-02-18 16:12, Mathias Nyman wrote:

On 16.02.2016 23:58, main.ha...@googlemail.com wrote:



On 2016-02-08 15:31, Mathias Nyman wrote:

Hi

On 06.02.2016 19:08, Mike Murdoch wrote:

Bug ID: 111251

Hello,

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
   Subsystem: Lenovo uPD720200 USB 3.0 Host Controller

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:

At this point, a reboot is required to reactivate the controller,
unloading and reloading the xhci_* modules does not work.





...

I did some more digging, there are a few things that need to be addressed:
1. We should resume the USB3 bus before the USB2 bus to let devices
   enumerate as USB3 more reliably; this gives them more time to finish
   link training.

2. After resuming xhci we don't see any port changes immediately, so the
   hub thinks nothing happened and stops polling the ports. The hub will
   suspend again -> xhci will try to suspend.

3. Roothubs will autosuspend immediately after autoresume (autosuspend
   timeout = 0). This could be a reason why we see the "xhci_suspend"
   entry in the log. We either need to increase the autosuspend timeout,
   or prevent suspend if we can see the pending event in an xhci status
   register.
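
[Editor's sketch] Point 3 above names two possible remedies: a longer
autosuspend timeout, or refusing to suspend while the status register
shows a pending event. The following user-space C fragment is only an
illustration of that decision, not the content of the attached patches.
The STS_* bit positions follow the xHCI USBSTS register layout; the
struct rh_policy type and the may_runtime_suspend() helper are
hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* USBSTS bits, per the xHCI register layout. */
#define STS_EINT (1u << 3) /* event interrupt pending */
#define STS_PORT (1u << 4) /* port change detected */

/* Hypothetical policy: minimum idle time before autosuspend is allowed. */
struct rh_policy {
	unsigned int autosuspend_delay_ms;
};

/*
 * Hypothetical helper: return true only if a runtime suspend may proceed.
 * A pending event or port change always blocks the suspend; otherwise the
 * roothub must have been idle for at least the configured delay.
 */
static bool may_runtime_suspend(uint32_t usbsts, unsigned int idle_ms,
				const struct rh_policy *policy)
{
	assert(policy != NULL);

	if (usbsts & (STS_EINT | STS_PORT))
		return false;

	return idle_ms >= policy->autosuspend_delay_ms;
}
```

With autosuspend_delay_ms set to 0 the helper allows a suspend as soon
as no event is pending, which corresponds to the "timeout = 0" situation
described in point 3.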
 
inserting usb3 storage device

Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address to 
0xe001
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping port 
polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_suspend: stopping port polling.

I got a few patches, attached. They both partially try to fix the issue, and 
add more logging.
The same changes can be found in a topic branch in:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git 
bug_usb3_enum_rtresume

Any chance to try them out?

-Mathias
From 4427456ee6228155e72f38c740e0bf78c8ad7792 Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Mon, 29 Feb 2016 11:33:25 +0200
Subject: [PATCH 1/3] xhci: resume USB 3 roothub first

Give USB 3 devices a better chance to enumerate at USB 3 speeds if
they are connected to a suspended host.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index d51ee0c..b609288 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1108,8 +1108,8 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
 		/* Resume root hubs only when have pending events. */
 		status = readl(&xhci->op_regs->status);
 		if (status & STS_EINT) {
-			usb_hcd_resume_root_hub(hcd);
 			usb_hcd_resume_root_hub(xhci->shared_hcd);
+			usb_hcd_resume_root_hub(hcd);
 		}
 	}
 
@@ -1124,10 +1124,10 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
 
 	/* Re-enable port polling. */
 	xhci_dbg(xhci, "%s: starting port polling.\n", __func__);
-	set_bit(HCD_FLAG_POLL_RH, &hcd->flags);
-	usb_hcd_poll_rh_status(hcd);
 	set_bit(HCD_FLAG_POLL_RH, &xhci->shared_hcd->flags);
 	usb_hcd_poll_rh_status(xhci->shared_hcd);
+	set_bit(HCD_FLAG_POLL_RH, &hcd->flags);
+	usb_hcd_poll_rh_status(hcd);
 
 	return retval;
 }
-- 
1.9.1

From a37d2160f8e801fe8ba3c1a4938f0fa17d56de7b Mon Sep 17 00:00:00 2001
From: Mathias Nyman 
Date: Mon, 29 Feb 2016 16:07:03 +0200
Subject: [PATCH 2/3] xhci: Add bus number to debug output

Enumeration races between the USB2 and USB3 buses are hard to debug
when there is no indication of which message belongs to which bus.

Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/xhci-hub.c  | 26 ++
 drivers/usb/host/xhci-ring.c |  8 +---
 drivers/usb/host/xhci.c  |  2 +-
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index d61fcc4..8e61925 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -458,8 +458,8 @@ static void xhci_disable_port(struct usb_hcd *hcd, struct xhci_hcd *xhci,
 	/* Write 1 to disable the port */
 	writel(port_status | PORT_PE, addr);
 	port_status = readl(addr);
-	xhci_dbg(xhci, "disable port, actual port %d status  = 0x%x\n",
-			wIndex, port_status);
+	xhci_dbg(xhci, "disable port, bus%d port %d portsc  = 0x%x\n",
+		 hcd->self.busnum, 

Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-02-18 Thread Mike Murdoch


On 2016-02-18 16:12, Mathias Nyman wrote:
> On 16.02.2016 23:58, main.ha...@googlemail.com wrote:
>>
>>
>> On 2016-02-08 15:31, Mathias Nyman wrote:
>>> Hi
>>>
>>> On 06.02.2016 19:08, Mike Murdoch wrote:
 Bug ID: 111251

 Hello,

 I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
 kernel 4.4.1-gentoo.

 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
 Controller (rev 04) (prog-if 30 [XHCI])
   Subsystem: Lenovo uPD720200 USB 3.0 Host Controller

 When runtime power control for this controller is disabled
 (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
 works fine and reaches over 120MB/s transfer rates.

 When runtime power control for this controller is enabled
 (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
 can be observed:

 - Transfer rates are much lower at around 30MB/s
 - During transfers, the controller dies after a couple of seconds:

 xhci_hcd :0e:00.0: xHCI host not responding to stop endpoint
 command.
 xhci_hcd :0e:00.0: Assuming host is dying, halting host.
 xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
 xhci_hcd :0e:00.0: Non-responsive xHCI host is not halting.
 xhci_hcd :0e:00.0: Completing active URBs anyway.
 xhci_hcd :0e:00.0: HC died; cleaning up
 sd 9:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR
 driverbyte=DRIVER_OK
 sd 9:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 00 19 a9 00 00 00 f0 00
 blk_update_request: I/O error, dev sdc, sector 1681664
 xhci_hcd :0e:00.0: Stopped the command ring failed, maybe the host
 is dead
 xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
 xhci_hcd :0e:00.0: Abort command ring failed
 xhci_hcd :0e:00.0: HC died; cleaning up

 At this point, a reboot is required to reactivate the controller,
 unloading and reloading the xhci_* modules does not work.

>>>
>>> With 120MB/s I assume it was a USB3 device.
>>> Was there any USB 2 device connected as well?
>>> Does this occur with only a USB2 device connected to xhci?
>>>
>>> xhci handles suspend/resume a bit differently for USB2 and USB3
>>> roothubs.
>>>
>>> Does this happen on older kernels as well? 4.3 or 4.2 based?
>>>
>>> For more xhci debugging, do:
>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>> and check dmesg for more xhci info.
>>>
>>> If reloading the module did not help it is more likely that the
>>> controller is in some
>>> unexpected state.
>>> If however, it would instead be just bad timeout timer handling we
>>> could just return immediately
>>> in the timeout handler, and check if the usb device(s) continue to
>>> work normally.
>>>
>>> This could be done by editing drivers/usb/host/xhci-ring.c
>>>
>>> +++ b/drivers/usb/host/xhci-ring.c
>>> @@ -831,6 +831,7 @@ void xhci_stop_endpoint_command_watchdog(unsigned
>>> long arg)
>>>  struct xhci_virt_ep *ep;
>>>  int ret, i, j;
>>>  unsigned long flags;
>>> +   return;
>>>
>>> -Mathias
>>>
>>>
>> Hello Mat,
>>
>> thanks for your response. I have experimented with your suggestions.
>>
>> As for your questions: No, there was only one USB3 stick connected to
>> the host controller during the tests. USB2 devices work fine too.
>>
>> Yes, I encountered this problem on a 4.1 series kernel as well as the 4.4
>> series.
>>
>> I have enabled the debug controls and attached the results to this mail,
>> along with some commentary. I am hoping this works in the mailing list.
>>
>> I've also tried your suggested modification, and it does seem to work!
>> With it, the controller does not die, but it still sacrifices a lot of
>> speed (as I had mentioned in the first mail of this thread)
>>
>>
>> I hope this is helpful!
>>
>
> Thanks, it is helpful
>
> Looks like when the USB3 device is inserted it is first detected as a
> USB2 device, then immediately afterwards as a USB3 device. The USB2
> device stops responding, so 5 seconds later we time out and kill
> everything.
>
> selected parts of the log:
>
> inserting usb3 storage device
> 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
> 20:03:33 xhci_hcd :0e:00.0: Port Status Change Event for port 3
> 20:03:33 xhci_hcd :0e:00.0: get port status, actual port 0 status 
> = 0x202e1  /* PORT 0
> 20:03:33 xhci_hcd :0e:00.0: get port status, actual port 1 status 
> = 0x2a0 /* PORT 1
> 20:03:33 usb 1-1: new high-speed USB device number 2 using xhci_hcd
> 20:03:33 xhci_hcd :0e:00.0: Slot ID 1 Input Context:/*
> Found a HS device
> 20:03:33 xhci_hcd :0e:00.0: IN Endpoint 00 Context (ep_index 00):
> 20:03:33 xhci_hcd :0e:00.0: @8805fc8a5048 (virt) @a048
> (dma) 0xfffdf001 - deq
> 20:03:33 xhci_hcd :0e:00.0: Successful setup context command
>  *   now we have a device at SLOT 

Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-02-18 Thread Mathias Nyman

On 16.02.2016 23:58, main.ha...@googlemail.com wrote:



On 2016-02-08 15:31, Mathias Nyman wrote:

Hi

On 06.02.2016 19:08, Mike Murdoch wrote:

Bug ID: 111251

Hello,

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:

xhci_hcd :0e:00.0: xHCI host not responding to stop endpoint
command.
xhci_hcd :0e:00.0: Assuming host is dying, halting host.
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Non-responsive xHCI host is not halting.
xhci_hcd :0e:00.0: Completing active URBs anyway.
xhci_hcd :0e:00.0: HC died; cleaning up
sd 9:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR
driverbyte=DRIVER_OK
sd 9:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 00 19 a9 00 00 00 f0 00
blk_update_request: I/O error, dev sdc, sector 1681664
xhci_hcd :0e:00.0: Stopped the command ring failed, maybe the host
is dead
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Abort command ring failed
xhci_hcd :0e:00.0: HC died; cleaning up

At this point, a reboot is required to reactivate the controller,
unloading and reloading the xhci_* modules does not work.



With 120MB/s I assume it was a USB3 device.
Was there any USB 2 device connected as well?
Does this occur with only a USB2 device connected to xhci?

xhci handles suspend/resume a bit differently for USB2 and USB3 roothubs.

Does this happen on older kernels as well? 4.3 or 4.2 based?

For more xhci debugging, do:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
and check dmesg for more xhci info.

If reloading the module did not help it is more likely that the
controller is in some
unexpected state.
If however, it would instead be just bad timeout timer handling we
could just return immediately
in the timeout handler, and check if the usb device(s) continue to
work normally.

This could be done by editing drivers/usb/host/xhci-ring.c

+++ b/drivers/usb/host/xhci-ring.c
@@ -831,6 +831,7 @@ void xhci_stop_endpoint_command_watchdog(unsigned
long arg)
 struct xhci_virt_ep *ep;
 int ret, i, j;
 unsigned long flags;
+   return;

-Mathias



Hello Mat,

thanks for your response. I have experimented with your suggestions.

As for your questions: No, there was only one USB3 stick connected to
the host controller during the tests. USB2 devices work fine too.

Yes, I encountered this problem on a 4.1 series kernel as well as the 4.4
series.

I have enabled the debug controls and attached the results to this mail,
along with some commentary. I am hoping this works in the mailing list.

I've also tried your suggested modification, and it does seem to work!
With it, the controller does not die, but it still sacrifices a lot of
speed (as I had mentioned in the first mail of this thread)


I hope this is helpful!



Thanks, it is helpful

Looks like when the USB3 device is inserted it is first detected as a
USB2 device, then immediately afterwards as a USB3 device. The USB2
device stops responding, so 5 seconds later we time out and kill
everything.

selected parts of the log:

inserting usb3 storage device
20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
20:03:33 xhci_hcd :0e:00.0: Port Status Change Event for port 3
20:03:33 xhci_hcd :0e:00.0: get port status, actual port 0 status  = 
0x202e1  /* PORT 0
20:03:33 xhci_hcd :0e:00.0: get port status, actual port 1 status  = 0x2a0  
   /* PORT 1
20:03:33 usb 1-1: new high-speed USB device number 2 using xhci_hcd
20:03:33 xhci_hcd :0e:00.0: Slot ID 1 Input Context:
/* Found a HS device
20:03:33 xhci_hcd :0e:00.0: IN Endpoint 00 Context (ep_index 00):
20:03:33 xhci_hcd :0e:00.0: @8805fc8a5048 (virt) @a048 (dma) 
0xfffdf001 - deq
20:03:33 xhci_hcd :0e:00.0: Successful setup context command
 *   now we have a device at SLOT 1 with control endpoint 0 buffer at address  
0xfffdf000
20:03:33 xhci_hcd :0e:00.0: Slot ID 2 Input Context:
20:03:33 xhci_hcd :0e:00.0: IN Endpoint 00 Context (ep_index 00):
20:03:33 xhci_hcd :0e:00.0: @8800b68d7048 (virt) @2048 (dma) 
0xfffe1001 - deq
 * now we have another device at SLOT 2 with control endpoint buffer at 
0xfffe1000
20:03:33 usb 2-1: new SuperSpeed USB device number 3 using xhci_hcd /* 
found SS device
20:03:33 usb 2-1: 
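
[Editor's sketch] The "get port status" lines in the log excerpt above
print raw PORTSC register values (0x202e1 and 0x2a0). The fragment below
is a small user-space decoder for the fields that matter here, assuming
the standard xHCI PORTSC bit layout. The struct portsc_fields type and
both function names are hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Selected PORTSC fields, per the xHCI register layout. */
struct portsc_fields {
	bool connected;          /* CCS, bit 0 */
	bool enabled;            /* PED, bit 1 */
	unsigned int link_state; /* PLS, bits 8:5 */
	bool powered;            /* PP, bit 9 */
	unsigned int speed;      /* Port Speed, bits 13:10 */
	bool connect_change;     /* CSC, bit 17 */
};

/* Split a raw PORTSC value into the fields above. */
static struct portsc_fields decode_portsc(uint32_t portsc)
{
	struct portsc_fields f;

	f.connected = (portsc & (1u << 0)) != 0;
	f.enabled = (portsc & (1u << 1)) != 0;
	f.link_state = (portsc >> 5) & 0xfu;
	f.powered = (portsc & (1u << 9)) != 0;
	f.speed = (portsc >> 10) & 0xfu;
	f.connect_change = (portsc & (1u << 17)) != 0;
	return f;
}

/* Name a few of the port link states; PLS is a 4-bit field. */
static const char *link_state_name(unsigned int pls)
{
	assert(pls <= 15);

	switch (pls) {
	case 0: return "U0";
	case 3: return "U3";
	case 4: return "Disabled";
	case 5: return "RxDetect";
	case 7: return "Polling";
	case 15: return "Resume";
	default: return "other";
	}
}
```

Under that layout, 0x202e1 decodes as connected, powered, not yet
enabled, link state Polling, with the connect status change bit set,
while 0x2a0 decodes as powered but not connected, link state RxDetect.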

Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-02-16 Thread main . haarp


On 2016-02-08 15:31, Mathias Nyman wrote:
> Hi
>
> On 06.02.2016 19:08, Mike Murdoch wrote:
>> Bug ID: 111251
>>
>> Hello,
>>
>> I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
>> kernel 4.4.1-gentoo.
>>
>> 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
>> Controller (rev 04) (prog-if 30 [XHCI])
>>  Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
>>
>> When runtime power control for this controller is disabled
>> (/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
>> works fine and reaches over 120MB/s transfer rates.
>>
>> When runtime power control for this controller is enabled
>> (/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
>> can be observed:
>>
>> - Transfer rates are much lower at around 30MB/s
>> - During transfers, the controller dies after a couple of seconds:
>>
>> xhci_hcd :0e:00.0: xHCI host not responding to stop endpoint
>> command.
>> xhci_hcd :0e:00.0: Assuming host is dying, halting host.
>> xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
>> xhci_hcd :0e:00.0: Non-responsive xHCI host is not halting.
>> xhci_hcd :0e:00.0: Completing active URBs anyway.
>> xhci_hcd :0e:00.0: HC died; cleaning up
>> sd 9:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR
>> driverbyte=DRIVER_OK
>> sd 9:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 00 19 a9 00 00 00 f0 00
>> blk_update_request: I/O error, dev sdc, sector 1681664
>> xhci_hcd :0e:00.0: Stopped the command ring failed, maybe the host
>> is dead
>> xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
>> xhci_hcd :0e:00.0: Abort command ring failed
>> xhci_hcd :0e:00.0: HC died; cleaning up
>>
>> At this point, a reboot is required to reactivate the controller,
>> unloading and reloading the xhci_* modules does not work.
>>
>
> With 120MB/s I assume it was a USB3 device.
> Was there any USB 2 device connected as well?
> Does this occur with only a USB2 device connected to xhci?
>
> xhci handles suspend/resume a bit differently for USB2 and USB3 roothubs.
>
> Does this happen on older kernels as well? 4.3 or 4.2 based?
>
> For more xhci debugging, do:
> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
> and check dmesg for more xhci info.
>
> If reloading the module did not help it is more likely that the
> controller is in some
> unexpected state.
> If however, it would instead be just bad timeout timer handling we
> could just return immediately
> in the timeout handler, and check if the usb device(s) continue to
> work normally.
>
> This could be done by editing drivers/usb/host/xhci-ring.c
>
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -831,6 +831,7 @@ void xhci_stop_endpoint_command_watchdog(unsigned
> long arg)
> struct xhci_virt_ep *ep;
> int ret, i, j;
> unsigned long flags;
> +   return;
>
> -Mathias
>
>
Hello Mat,

thanks for your response. I have experimented with your suggestions.

As for your questions: No, there was only one USB3 stick connected to
the host controller during the tests. USB2 devices work fine too.

Yes, I encountered this problem on a 4.1 series kernel as well as the 4.4
series.

I have enabled the debug controls and attached the results to this mail,
along with some commentary. I am hoping this works in the mailing list.

I've also tried your suggested modification, and it does seem to work!
With it, the controller does not die, but it still sacrifices a lot of
speed (as I had mentioned in the first mail of this thread)


I hope this is helpful!

Cheers,
- Mike
enabling auto powersave for the host controller

Feb 16 20:03:22 xhci_hcd :0e:00.0: xhci_suspend: stopping port polling.
Feb 16 20:03:22 xhci_hcd :0e:00.0: // Setting command ring address to 
0xe001

inserting usb3 storage device

Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address to 
0xe001
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping port 
polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_suspend: stopping port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: Port Status Change Event for port 3
Feb 16 20:03:33 xhci_hcd :0e:00.0: resume root hub
Feb 16 20:03:33 xhci_hcd :0e:00.0: handle_port_status: starting port 
polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address to 
0xe001
Feb 16 20:03:33 xhci_hcd :0e:00.0: // Setting command ring address to 
0xe001
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_resume: starting port polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: xhci_hub_status_data: stopping port 
polling.
Feb 16 20:03:33 xhci_hcd :0e:00.0: get port status, actual port 0 status  = 
0x202e1
Feb 16 20:03:33 xhci_hcd :0e:00.0: Get port status returned 0x10101
Feb 16 20:03:33 xhci_hcd :0e:00.0: clear port connect change, actual port 0 
status  = 0x2e1
Feb 16 20:03:33 

Re: NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-02-08 Thread Mathias Nyman

Hi

On 06.02.2016 19:08, Mike Murdoch wrote:

Bug ID: 111251

Hello,

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
 Subsystem: Lenovo uPD720200 USB 3.0 Host Controller

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:

xhci_hcd :0e:00.0: xHCI host not responding to stop endpoint command.
xhci_hcd :0e:00.0: Assuming host is dying, halting host.
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Non-responsive xHCI host is not halting.
xhci_hcd :0e:00.0: Completing active URBs anyway.
xhci_hcd :0e:00.0: HC died; cleaning up
sd 9:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR
driverbyte=DRIVER_OK
sd 9:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 00 19 a9 00 00 00 f0 00
blk_update_request: I/O error, dev sdc, sector 1681664
xhci_hcd :0e:00.0: Stopped the command ring failed, maybe the host
is dead
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Abort command ring failed
xhci_hcd :0e:00.0: HC died; cleaning up

At this point, a reboot is required to reactivate the controller,
unloading and reloading the xhci_* modules does not work.



With 120MB/s I assume it was a USB3 device.
Was there any USB 2 device connected as well?
Does this occur with only a USB2 device connected to xhci?

xhci handles suspend/resume a bit differently for USB2 and USB3 roothubs.

Does this happen on older kernels as well? 4.3 or 4.2 based?

For more xhci debugging, do:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
and check dmesg for more xhci info.

If reloading the module did not help, it is more likely that the controller
is in some unexpected state.
If, however, it is just bad timeout timer handling, we could return
immediately in the timeout handler and check whether the USB device(s)
continue to work normally.

This could be done by editing drivers/usb/host/xhci-ring.c

+++ b/drivers/usb/host/xhci-ring.c
@@ -831,6 +831,7 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
struct xhci_virt_ep *ep;
int ret, i, j;
unsigned long flags;
+   return;

-Mathias



NEC uPD720200 xHCI Controller dies when Runtime PM enabled

2016-02-06 Thread Mike Murdoch
Bug ID: 111251

Hello,

I have a NEC uPD720200 USB3.0 controller in a Thinkpad W520 laptop on
kernel 4.4.1-gentoo.

0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
Subsystem: Lenovo uPD720200 USB 3.0 Host Controller
Flags: bus master, fast devsel, latency 0
Memory at f380 (64-bit, non-prefetchable) [size=8K]
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
Capabilities: [150] Latency Tolerance Reporting
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci

When runtime power control for this controller is disabled
(/sys/bus/pci/devices/:0e:00.0/power/control = on), the controller
works fine and reaches over 120MB/s transfer rates.

When runtime power control for this controller is enabled
(/sys/bus/pci/devices/:0e:00.0/power/control = auto), two effects
can be observed:

- Transfer rates are much lower at around 30MB/s
- During transfers, the controller dies after a couple of seconds:

xhci_hcd :0e:00.0: xHCI host not responding to stop endpoint command.
xhci_hcd :0e:00.0: Assuming host is dying, halting host.
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Non-responsive xHCI host is not halting.
xhci_hcd :0e:00.0: Completing active URBs anyway.
xhci_hcd :0e:00.0: HC died; cleaning up
sd 9:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR
driverbyte=DRIVER_OK
sd 9:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 00 19 a9 00 00 00 f0 00
blk_update_request: I/O error, dev sdc, sector 1681664
xhci_hcd :0e:00.0: Stopped the command ring failed, maybe the host
is dead
xhci_hcd :0e:00.0: Host not halted after 16000 microseconds.
xhci_hcd :0e:00.0: Abort command ring failed
xhci_hcd :0e:00.0: HC died; cleaning up

At this point, a reboot is required to reactivate the controller,
unloading and reloading the xhci_* modules does not work.

I'll be happy to assist in getting this fixed :)
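
[Editor's sketch] The report above switches runtime PM by writing "on"
or "auto" to the controller's power/control file in sysfs. A minimal C
helper for doing that from a test program might look like the fragment
below. The function name set_runtime_pm() is hypothetical, the caller
must supply the full path of the power/control file for its own device,
and writing to sysfs normally requires root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "on" (runtime PM disabled) or "auto"
 * (runtime PM enabled) to the given power/control file.
 * Returns 0 on success, -1 on an invalid mode or an I/O error.
 */
static int set_runtime_pm(const char *control_path, const char *mode)
{
	FILE *f;
	int ok;

	assert(control_path != NULL && mode != NULL);

	/* Only the two values used in the report are accepted here. */
	if (strcmp(mode, "on") != 0 && strcmp(mode, "auto") != 0)
		return -1;

	f = fopen(control_path, "w");
	if (f == NULL)
		return -1;

	ok = fputs(mode, f) >= 0;
	/* Buffered write errors may only be reported when the stream is closed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling set_runtime_pm(path, "on") before a transfer test and
set_runtime_pm(path, "auto") afterwards would reproduce the two
configurations compared in the report.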