Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
Sure. Done.

On 05/03/2018 07:04 PM, Mika Westerberg wrote:
> Could you then attach full dmesg of the failure without revert to the
> bugzilla bug?

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On Thu, May 03, 2018 at 06:53:13PM +0200, Esokrates wrote:
> Hi,
>
> Thanks very much for pointing out that commit!
> Indeed, reverting makes the problem go away!
> Also, interestingly it also makes the errors in
> https://bugzilla.kernel.org/show_bug.cgi?id=199557
> go away, tested using 4.16.7!

OK, good.

Could you then attach full dmesg of the failure without revert to the
bugzilla bug?
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
Hi,

Thanks very much for pointing out that commit!
Indeed, reverting makes the problem go away!
Also, interestingly it also makes the errors in
https://bugzilla.kernel.org/show_bug.cgi?id=199557
go away, tested using 4.16.7!

On 05/03/2018 05:18 PM, Mika Westerberg wrote:
> Could you try to revert:
>
>   13d3047c8150 ("ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()")
>
> and see if the problem goes away?
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On Thu, May 03, 2018 at 04:13:05PM +0300, Mathias Nyman wrote:
> On 03.05.2018 12:30, Esokrates wrote:
> > Hi,
> > Beginning with Linux 4.16 rc7 (4.16 rc6 was NOT affected), I do
> > find the following regularly in dmesg (often it does not happen during
> > boot, but after suspend to ram / resume):
> >
>
> Hi
>
> Can you try 'git bisect' to find the patch that causes the issues?
> (Adding Mika to Cc)

Could you try to revert:

  13d3047c8150 ("ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()")

and see if the problem goes away?

> > [ 216.443309] pcieport :00:1c.0: AER: Corrected error received: id=00e0
> > [ 216.443951] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
> > [ 216.444607] pcieport :00:1c.0: device [8086:9d10] error status/mask=0001/2000
> > [ 216.445300] pcieport :00:1c.0: [ 0] Receiver Error (First)
> > [ 216.517886] xhci_hcd :39:00.0: remove, state 4
> > [ 216.518573] usb usb4: USB disconnect, device number 1
> > [ 216.519438] xhci_hcd :39:00.0: USB bus 4 deregistered
> > [ 216.520320] xhci_hcd :39:00.0: xHCI host controller not responding, assume dead
> > [ 216.521908] xhci_hcd :39:00.0: remove, state 4
> > [ 216.522950] usb usb3: USB disconnect, device number 1
> > [ 216.523891] xhci_hcd :39:00.0: Host halt failed, -19
> > [ 216.524994] xhci_hcd :39:00.0: Host not accessible, reset failed.
> > [ 216.526153] xhci_hcd :39:00.0: USB bus 3 deregistered
> >
> >
> > Running 4.16.0 I also observed
> >
> > [ 31.509282] ACPI: Waking up from system sleep state S3
> > [ 31.809429] ACPI: EC: interrupt unblocked
> > [ 31.828849] pci_raw_set_power_state: 62 callbacks suppressed
> > [ 31.828852] pcieport :01:00.0: Refused to change power state, currently in D3
> > [ 31.830422] pcieport :02:01.0: Refused to change power state, currently in D3
> > [ 31.830423] pcieport :02:02.0: Refused to change power state, currently in D3
> > [ 31.848853] pcieport :02:00.0: Refused to change power state, currently in D3
> > [ 31.852520] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> > [ 31.872529] thunderbolt :03:00.0: Refused to change power state, currently in D3
> > [ 31.933970] thunderbolt :03:00.0: control channel starting...
> > [ 31.937403] ACPI: EC: event unblocked
> > [ 31.938385] sd 2:0:0:0: [sda] Starting disk
> > [ 31.938886] ACPI: button: The lid device is not compliant to SW_LID.
> > [ 31.956574] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> > [ 31.956624] xhci_hcd :39:00.0: WARN: xHC restore state timeout
> > [ 31.956631] xhci_hcd :39:00.0: PCI post-resume error -110!
> > [ 31.956656] xhci_hcd :39:00.0: HC died; cleaning up
> > [ 31.956658] xhci_hcd :39:00.0: HC died; cleaning up
> > [ 31.956664] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
> > [ 31.956668] PM: Device :39:00.0 failed to resume async: error -110
> >
> > Furthermore sometimes I also get a bunch of these errors before the errors
> > above:
> > May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:51 debian kernel: usb 1-3: device not accepting address 2, error -71
> >
> > All of this never happened before 4.16rc7. All kernels since 4.16.rc7 are
> > reproducibly affected (suspend/resume helps triggering), including
> > 4.17.0rc3.
> >
> > My hardware is an XPS 13 9360 Kabylake, lsusb and lspci output are attached.
> >
> > I am not subscribed to the mailing list, so please CC me when replying to
> > the list.
> >
> > Thanks very much!
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On 03.05.2018 12:30, Esokrates wrote:
> Hi,
> Beginning with Linux 4.16 rc7 (4.16 rc6 was NOT affected), I do find the
> following regularly in dmesg (often it does not happen during boot, but
> after suspend to ram / resume):

Hi

Can you try 'git bisect' to find the patch that causes the issues?
(Adding Mika to Cc)

> [ 216.443309] pcieport :00:1c.0: AER: Corrected error received: id=00e0
> [ 216.443951] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
> [ 216.444607] pcieport :00:1c.0: device [8086:9d10] error status/mask=0001/2000
> [ 216.445300] pcieport :00:1c.0: [ 0] Receiver Error (First)
> [ 216.517886] xhci_hcd :39:00.0: remove, state 4
> [ 216.518573] usb usb4: USB disconnect, device number 1
> [ 216.519438] xhci_hcd :39:00.0: USB bus 4 deregistered
> [ 216.520320] xhci_hcd :39:00.0: xHCI host controller not responding, assume dead
> [ 216.521908] xhci_hcd :39:00.0: remove, state 4
> [ 216.522950] usb usb3: USB disconnect, device number 1
> [ 216.523891] xhci_hcd :39:00.0: Host halt failed, -19
> [ 216.524994] xhci_hcd :39:00.0: Host not accessible, reset failed.
> [ 216.526153] xhci_hcd :39:00.0: USB bus 3 deregistered
>
> Running 4.16.0 I also observed
>
> [ 31.509282] ACPI: Waking up from system sleep state S3
> [ 31.809429] ACPI: EC: interrupt unblocked
> [ 31.828849] pci_raw_set_power_state: 62 callbacks suppressed
> [ 31.828852] pcieport :01:00.0: Refused to change power state, currently in D3
> [ 31.830422] pcieport :02:01.0: Refused to change power state, currently in D3
> [ 31.830423] pcieport :02:02.0: Refused to change power state, currently in D3
> [ 31.848853] pcieport :02:00.0: Refused to change power state, currently in D3
> [ 31.852520] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> [ 31.872529] thunderbolt :03:00.0: Refused to change power state, currently in D3
> [ 31.933970] thunderbolt :03:00.0: control channel starting...
> [ 31.937403] ACPI: EC: event unblocked
> [ 31.938385] sd 2:0:0:0: [sda] Starting disk
> [ 31.938886] ACPI: button: The lid device is not compliant to SW_LID.
> [ 31.956574] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> [ 31.956624] xhci_hcd :39:00.0: WARN: xHC restore state timeout
> [ 31.956631] xhci_hcd :39:00.0: PCI post-resume error -110!
> [ 31.956656] xhci_hcd :39:00.0: HC died; cleaning up
> [ 31.956658] xhci_hcd :39:00.0: HC died; cleaning up
> [ 31.956664] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
> [ 31.956668] PM: Device :39:00.0 failed to resume async: error -110
>
> Furthermore sometimes I also get a bunch of these errors before the errors
> above:
> May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:51 debian kernel: usb 1-3: device not accepting address 2, error -71
>
> All of this never happened before 4.16rc7. All kernels since 4.16.rc7 are
> reproducibly affected (suspend/resume helps triggering), including
> 4.17.0rc3.
>
> My hardware is an XPS 13 9360 Kabylake, lsusb and lspci output are attached.
>
> I am not subscribed to the mailing list, so please CC me when replying to
> the list.
>
> Thanks very much!
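When bisecting, each test kernel has to be judged good or bad from its dmesg after a suspend/resume cycle. A rough sketch of that check follows, assuming the failure signatures are the strings quoted in the report. `classify` and `SIGNATURES` are hypothetical names used for illustration, not part of git or any kernel tool. Because the report describes the failure as intermittent, a "good" verdict from a single cycle is only weak evidence.

```python
# Failure strings copied from the dmesg excerpts in the report.
# Which of them are reliable indicators is an assumption.
SIGNATURES = (
    "xHCI host controller not responding, assume dead",
    "HC died; cleaning up",
    "PCI post-resume error",
)


def classify(dmesg_text):
    """Return "bad" if any known failure signature appears, else "good"."""
    if any(sig in dmesg_text for sig in SIGNATURES):
        return "bad"
    return "good"
```

One way to use it would be to capture the output of `dmesg` after resume, pass the text to `classify`, and then run `git bisect good` or `git bisect bad` by hand according to the result.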
[REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
Hi,

Beginning with Linux 4.16 rc7 (4.16 rc6 was NOT affected), I do find the
following regularly in dmesg (often it does not happen during boot, but
after suspend to ram / resume):

[ 216.443309] pcieport :00:1c.0: AER: Corrected error received: id=00e0
[ 216.443951] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[ 216.444607] pcieport :00:1c.0: device [8086:9d10] error status/mask=0001/2000
[ 216.445300] pcieport :00:1c.0: [ 0] Receiver Error (First)
[ 216.517886] xhci_hcd :39:00.0: remove, state 4
[ 216.518573] usb usb4: USB disconnect, device number 1
[ 216.519438] xhci_hcd :39:00.0: USB bus 4 deregistered
[ 216.520320] xhci_hcd :39:00.0: xHCI host controller not responding, assume dead
[ 216.521908] xhci_hcd :39:00.0: remove, state 4
[ 216.522950] usb usb3: USB disconnect, device number 1
[ 216.523891] xhci_hcd :39:00.0: Host halt failed, -19
[ 216.524994] xhci_hcd :39:00.0: Host not accessible, reset failed.
[ 216.526153] xhci_hcd :39:00.0: USB bus 3 deregistered

Running 4.16.0 I also observed

[ 31.509282] ACPI: Waking up from system sleep state S3
[ 31.809429] ACPI: EC: interrupt unblocked
[ 31.828849] pci_raw_set_power_state: 62 callbacks suppressed
[ 31.828852] pcieport :01:00.0: Refused to change power state, currently in D3
[ 31.830422] pcieport :02:01.0: Refused to change power state, currently in D3
[ 31.830423] pcieport :02:02.0: Refused to change power state, currently in D3
[ 31.848853] pcieport :02:00.0: Refused to change power state, currently in D3
[ 31.852520] xhci_hcd :39:00.0: Refused to change power state, currently in D3
[ 31.872529] thunderbolt :03:00.0: Refused to change power state, currently in D3
[ 31.933970] thunderbolt :03:00.0: control channel starting...
[ 31.937403] ACPI: EC: event unblocked
[ 31.938385] sd 2:0:0:0: [sda] Starting disk
[ 31.938886] ACPI: button: The lid device is not compliant to SW_LID.
[ 31.956574] xhci_hcd :39:00.0: Refused to change power state, currently in D3
[ 31.956624] xhci_hcd :39:00.0: WARN: xHC restore state timeout
[ 31.956631] xhci_hcd :39:00.0: PCI post-resume error -110!
[ 31.956656] xhci_hcd :39:00.0: HC died; cleaning up
[ 31.956658] xhci_hcd :39:00.0: HC died; cleaning up
[ 31.956664] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
[ 31.956668] PM: Device :39:00.0 failed to resume async: error -110

Furthermore sometimes I also get a bunch of these errors before the errors
above:

May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
May 03 10:58:51 debian kernel: usb 1-3: device not accepting address 2, error -71

All of this never happened before 4.16rc7. All kernels since 4.16.rc7 are
reproducibly affected (suspend/resume helps triggering), including
4.17.0rc3.

My hardware is an XPS 13 9360 Kabylake, lsusb and lspci output are attached.

I am not subscribed to the mailing list, so please CC me when replying to
the list.

Thanks very much!
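The negative numbers in the dmesg excerpts above are kernel errno values. A minimal sketch for annotating such lines follows, assuming the standard Linux numbering (19 = ENODEV, 71 = EPROTO, 110 = ETIMEDOUT). The function name `annotate` and the regular expression are illustrative only and cover just the message shapes quoted in this report.

```python
import re

# Linux errno values seen in the report, hard-coded so the result does not
# depend on the platform this script runs on.
LINUX_ERRNO = {19: "ENODEV", 71: "EPROTO", 110: "ETIMEDOUT"}

# Matches "error -110", "returns -110" and "failed, -19" (illustrative pattern).
ERR_RE = re.compile(r"(?:error|returns|failed,)\s+-(\d+)")


def annotate(line):
    """Return the line with a symbolic errno name appended, if one is known."""
    match = ERR_RE.search(line)
    if not match:
        return line
    name = LINUX_ERRNO.get(int(match.group(1)))
    return f"{line}  <- {name}" if name else line
```

Read this way, the resume failure (-110) is a timeout waiting for the controller, "Host halt failed, -19" reports that the device is no longer there, and the -71 descriptor reads are protocol errors on the USB link.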
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 02)
        Subsystem: Dell Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
        Flags: bus master, fast devsel, latency 0
        Capabilities: [e0] Vendor Specific Information: Len=10

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02) (prog-if 00 [VGA controller])
        Subsystem: Dell HD Graphics 620
        Flags: bus master, fast devsel, latency 0, IRQ 128
        Memory at db00 (64-bit, non-prefetchable) [size=16M]
        Memory at 9000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Kernel driver in use: i915
        Kernel modules: i915

00:04.0 Signal processing controller: Intel Corporation Skylake Processor Thermal Subsystem (rev 02)
        Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
        Flags: fast devsel, IRQ 16
        Memory at dc22 (64-bit, non-prefetchable) [size=32K]
        Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-