I recently replaced the motherboard on my desktop with an MSI B450-A Pro Max motherboard. Since then my keyboards, mouse and tablet have become very unreliable. I see messages like this over and over in dmesg:

Aug 23 00:01:49 rhapsody kernel: [198769.314732] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci
Aug 23 00:01:49 rhapsody kernel: [198769.562234] usb 1-2.1: reset full-speed USB device number 28 using ehci-pci
Aug 23 00:01:52 rhapsody kernel: [198772.570704] usb 1-2.1: reset full-speed USB device number 28 using ehci-pci
Aug 23 00:02:02 rhapsody kernel: [198782.526669] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci
Aug 23 00:02:03 rhapsody kernel: [198782.714660] usb 1-2.1: reset full-speed USB device number 28 using ehci-pci
Aug 23 00:02:04 rhapsody kernel: [198784.210171] usb 1-2.3: reset low-speed USB device number 26 using ehci-pci
Aug 23 00:02:06 rhapsody kernel: [198786.110181] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci
Aug 23 00:02:08 rhapsody kernel: [198787.726158] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci
Aug 23 00:02:10 rhapsody kernel: [198790.126628] usb 1-2.1: reset full-speed USB device number 28 using ehci-pci
Aug 23 00:02:10 rhapsody kernel: [198790.314141] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci
Aug 23 00:02:12 rhapsody kernel: [198792.518765] usb 1-2.4: reset full-speed USB device number 27 using ehci-pci

The devices I am using are:

- Logitech K360 wireless keyboard
- Wired Lenovo USB keyboard
- Wired Lenovo USB mouse
- Wired Wacom Intuos tablet

After a reset, the wireless keyboard simply stops working. The rest of the devices keep failing intermittently.

I tried various combinations of hubs and USB controllers to see what works. The MSI B450-A motherboard has USB 3.0 and USB 3.1 controllers. I added a USB 2.0 PCI card as well for this test:

03:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
29:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
29:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
29:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
2c:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

I have a bus powered USB 3.0 hub, a bus powered USB 2.0 hub and a self powered USB 2.0 hub built into my monitor. I have connected my devices directly to the ports on the motherboard and the PCI card as well as to the external hubs. Here are the results I saw when the devices were plugged into various combinations of ports:

1. USB 3.0/3.1 controller - does NOT work
2. USB 2.0 controller - WORKS
3. USB 3.0/3.1 controller -> Self powered USB 2.0 hub in monitor - does NOT work
4. USB 3.0/3.1 controller -> Bus powered USB 3.0 hub - does NOT work
5. USB 3.0/3.1 controller -> Bus powered USB 2.0 hub - WORKS
7. USB 2.0 controller -> Bus powered USB 3.0 hub - does NOT work
8. USB 2.0 controller -> Bus powered USB 2.0 hub - does NOT work

I narrowed the failure down to the following lines (this code was added in 5.5 with commit 64cc3f12d1c7 "USB: EHCI: Do not return -EPIPE when hub is disconnected"):

drivers/usb/host/ehci-q.c:

	} else if ((token & QTD_STS_MMF) &&
				(QTD_PID(token) == PID_CODE_IN)) {
		status = -EPROTO;
	/* CERR nonzero + halt --> stall */

At the time of failure, when we reach this conditional, token is either 0x80408d46 or 0x408d46, which means the following bits are set (QTD_TOGGLE only in the first value):

QTD_STS_STS, QTD_STS_MMF, QTD_STS_HALT, QTD_IOC, QTD_TOGGLE

and

QTD_PID = 1
QTD_CERR = 3
QTD_LENGTH = 0x40 (64)

This causes the branch "(token & QTD_STS_MMF) && (QTD_PID(token) == PID_CODE_IN)" to be taken, and qtd_copy_status() returns -EPROTO.
This return value in qh_completions() results in ehci_clear_tt_buffer() being called:

drivers/usb/host/ehci-q.c:

	/* As part of low/full-speed endpoint-halt processing
	 * we must clear the TT buffer (11.17.5).
	 */
	if (unlikely(last_status != -EINPROGRESS &&
			last_status != -EREMOTEIO)) {
		/* The TT's in some hubs malfunction when they
		 * receive this request following a STALL (they
		 * stop sending isochronous packets).  Since a
		 * STALL can't leave the TT buffer in a busy
		 * state (if you believe Figures 11-48 - 11-51
		 * in the USB 2.0 spec), we won't clear the TT
		 * buffer in this case.  Strictly speaking this
		 * is a violation of the spec.
		 */
		if (last_status != -EPIPE)
			ehci_clear_tt_buffer(ehci, qh, urb,
					token);
	}

It seems that clearing the TT buffer in this case causes the hub to hang. A USB reset gets it going again until the cycle repeats. The comment in this code says "The TT's in some hubs malfunction when they receive this request following a STALL (they stop sending isochronous packets)". That may be what is happening.

Removing the code that returns -EPROTO for this case solves the problem on my machine (as in the RFC patch), but that is probably not the right solution. I do not understand the USB protocol well enough to propose a better one. Does anyone have a better idea?

Khalid Aziz (1):
  usb: ehci: Remove erroneous return of EPROTO upon detection of stall

 drivers/usb/host/ehci-q.c | 4 ----
 1 file changed, 4 deletions(-)

--
2.25.1