On 18.10.2017 17:50, Alan Stern wrote:
On Tue, 17 Oct 2017 wenxi...@linux.vnet.ibm.com wrote:

From: Wen Xiong <wenxi...@linux.vnet.ibm.com>


We repeatedly saw a "Host halt failed, -110" error when rebooting,
shutting down, or running kexec.

This patch calls usb_disconnect() before calling xhci_halt().
usb_disconnect() disconnects the parent and all of its children,
cleans up hardware state, and makes sure that the hardware is ready
to be halted.

Rebooting.
[18648.996035] sd 0:2:1:0: [sdb] Synchronizing SCSI cache
[18678.831197] mpt3sas_cm1: sending message unit reset !!
[18678.832774] mpt3sas_cm1: message unit reset: SUCCESS
[18683.900798] mpt3sas_cm0: sending message unit reset !!
[18683.902370] mpt3sas_cm0: message unit reset: SUCCESS
[18693.921103] xhci_hcd 0005:01:00.0: Host halt failed, -110
[18693.924483] reboot: Restarting system
[18861.282906007,5] OPAL: Reboot request...

Signed-off-by: Wen Xiong <wenxi...@linux.vnet.ibm.com>

Thanks,
Wendy

---
  drivers/usb/host/xhci.c |    7 +++++++
  1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ee198ea..67fdb0f 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -709,10 +709,17 @@ static void xhci_stop(struct usb_hcd *hcd)
  static void xhci_shutdown(struct usb_hcd *hcd)
  {
        struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+       struct usb_device *rhdev = hcd->self.root_hub;
+
+       dev_info(hcd->self.controller, "remove, state %x\n", hcd->state);

        if (xhci->quirks & XHCI_SPURIOUS_REBOOT)
                usb_disable_xhci_ports(to_pci_dev(hcd->self.sysdev));

+       mutex_lock(&usb_bus_idr_lock);
+       usb_disconnect(&rhdev);
+       mutex_unlock(&usb_bus_idr_lock);
+
        spin_lock_irq(&xhci->lock);
        xhci_halt(xhci);
        /* Workaround for spurious wakeups at shutdown with HSW */

This does not seem right.  For one thing, shutdown routines are
supposed to avoid taking locks (as much as possible), so that system
shutdown can proceed without deadlocking.

For another, this is a layering violation.  The xhci-hcd driver does
not register the root hub, so it should not unregister it.

The right way to fix this is to make sure that the shutdown routine
will succeed even if the host controller is busy with other activity.
The other activities should all be aborted, and all future requests
should fail.

The current shutdown routine just forces the host controller to stop: it
clears the run bit and polls the "halted" status for 16ms. Apparently we
don't see the halted bit within 16ms.
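
To make the clear-and-poll step concrete, here is a minimal user-space
sketch. It is a simulation, not the driver code: struct sim_xhc,
sim_delay_usec() and sim_halt() are hypothetical names, and the simulated
hardware simply sets the halted bit a fixed time after the run bit is
cleared. The bit positions (Run/Stop is bit 0 of USBCMD, HCHalted is bit 0
of USBSTS) follow the xHCI register layout, and the -110 in the log is
-ETIMEDOUT on Linux.

```c
#include <errno.h>
#include <stdint.h>

#define CMD_RUN        (1u << 0)    /* USBCMD Run/Stop bit */
#define STS_HALT       (1u << 0)    /* USBSTS HCHalted bit */
#define MAX_HALT_USEC  (16 * 1000)  /* 16ms halt limit */

/* Hypothetical simulated controller; not a kernel structure. */
struct sim_xhc {
	uint32_t usbcmd;
	uint32_t usbsts;
	int halt_latency_usec;  /* time the simulated hardware needs to halt */
	int elapsed_usec;       /* time since Run/Stop was cleared */
};

/* Advance simulated time; the hardware reports halted once it has had enough. */
static void sim_delay_usec(struct sim_xhc *hc, int usec)
{
	hc->elapsed_usec += usec;
	if (!(hc->usbcmd & CMD_RUN) && hc->elapsed_usec >= hc->halt_latency_usec)
		hc->usbsts |= STS_HALT;
}

/* Clear the run bit, then poll the halted bit for at most timeout_usec. */
static int sim_halt(struct sim_xhc *hc, int timeout_usec)
{
	hc->usbcmd &= ~CMD_RUN;
	hc->elapsed_usec = 0;
	sim_delay_usec(hc, 0);  /* covers hardware that halts immediately */

	for (int waited = 0; ; waited++) {
		if (hc->usbsts & STS_HALT)
			return 0;
		if (waited >= timeout_usec)
			return -ETIMEDOUT;
		sim_delay_usec(hc, 1);
	}
}
```

With halt_latency_usec above 16000, sim_halt(&hc, MAX_HALT_USEC) returns
-ETIMEDOUT, which is the "Host halt failed, -110" symptom. Passing a larger
timeout makes the same call succeed without changing what the simulated
hardware does.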

The spec says that the correct way to stop is to first command all transfer
rings to stop, then stop the command ring, and after that stop the host
controller.

If we just bluntly stop the host (as we do), the spec says (xhci 5.4.1.1) it
should stop anyway within 16ms, but undefined behavior may occur.

So the options in xHCI are down to:
1. Just clear the run bit and skip checking any status.
   - really fast shutdown routine for xhci
2. Clear the run bit and increase the status polling time, to see if we get
   rid of the error message.
   - can get rid of the error message but makes no actual change; well, we
     would know if the host stopped
3. Properly stop all transfer rings and the command ring, and then stop the
   host.
   - cleanest and slowest way; do we care about this? Everything should be
     reset after shutdown.
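
The ordering in option 3 can be sketched with a toy model. This is not
driver code, and every name in it (struct sim_host, sim_stop_endpoint(),
sim_orderly_shutdown() and so on) is made up for illustration. The one real
constraint it encodes is that a transfer ring is stopped by a Stop Endpoint
command queued on the command ring, so the command ring has to stay running
until all endpoints are stopped.

```c
#include <errno.h>
#include <stdbool.h>

#define SIM_NUM_EP 4  /* arbitrary endpoint count for the model */

/* Hypothetical host model; not a kernel structure. */
struct sim_host {
	bool ep_running[SIM_NUM_EP];  /* transfer rings */
	bool cmd_ring_running;
	bool host_running;
	bool clean_halt;  /* true if the host was halted with all rings idle */
};

static struct sim_host sim_host_init(void)
{
	struct sim_host h = { .cmd_ring_running = true, .host_running = true };

	for (int i = 0; i < SIM_NUM_EP; i++)
		h.ep_running[i] = true;
	return h;
}

/* Stopping a transfer ring needs a working command ring. */
static int sim_stop_endpoint(struct sim_host *h, int ep)
{
	if (ep < 0 || ep >= SIM_NUM_EP)
		return -EINVAL;
	if (!h->cmd_ring_running)
		return -EIO;
	h->ep_running[ep] = false;
	return 0;
}

static void sim_stop_cmd_ring(struct sim_host *h)
{
	h->cmd_ring_running = false;
}

static bool sim_rings_idle(const struct sim_host *h)
{
	for (int i = 0; i < SIM_NUM_EP; i++)
		if (h->ep_running[i])
			return false;
	return !h->cmd_ring_running;
}

/* Halting always stops the host; it is only "clean" if the rings were idle. */
static void sim_halt_host(struct sim_host *h)
{
	h->clean_halt = sim_rings_idle(h);
	h->host_running = false;
}

/* Option 3: transfer rings first, then the command ring, then the host. */
static int sim_orderly_shutdown(struct sim_host *h)
{
	for (int i = 0; i < SIM_NUM_EP; i++) {
		int ret = sim_stop_endpoint(h, i);

		if (ret)
			return ret;
	}
	sim_stop_cmd_ring(h);
	sim_halt_host(h);
	return 0;
}
```

Calling sim_halt_host() directly on a fresh model corresponds to what the
shutdown routine does today: the host stops, but clean_halt stays false
because the rings were still active.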

-Mathias
