On Thu, 4 Feb 2016, Chris Bainbridge wrote:

> On Thu, Feb 04, 2016 at 04:00:51PM -0500, Alan Stern wrote:
> > On Thu, 4 Feb 2016, Chris Bainbridge wrote:
> >
> > > The XHCI controller presents two USB buses to the system - one for USB 2
> > > and one for USB 3. When only one bus is locked there is a race condition
> > > during hub init that results in errors like:
> > >
> > > [ 13.183701] usb 3-3: device descriptor read/all, error -110
> >
> > What exactly is the race condition? Why does locking both buses fix
> > it?
...
> hub_port_init is called in parallel for both buses.
> The first thread is in usb_get_device_descriptor when the second one
> enters the function and calls the code to get an address. I don't know
> precisely how it fails - it looks like the functions for doing the
> initialisation are synchronous and sleeping waiting for a response and
> that gets disrupted when the second thread tries to initialise the hub.
> What was the basis for using a lock on the bus rather than the
> controller?

I don't remember exactly. At the time the code was written, there was
no important distinction between a bus and a controller. This was long
before USB-3 appeared. When USB-3 support was added, the basis for
keeping the lock on the bus was that I assumed there would be no problem
talking to different devices at address 0 if they were on different
buses.

> Does the spec say that buses of the same controller can be
> initialised in parallel? Mathias previously said:
>
> > Just found an additional note in the xhci specs section 4.5.3 saying that:
> > "Note: Software shall not transition more than one Device Slot to the
> > Default State at a time"
> > which is what xhci_setup_device() does in addition to moving slots to the
> > addressed state
>
> But I don't know if that means you can do the reset/set address/read
> descriptors in parallel?

In fact you can do the Set-Address and Read-Descriptor parts in
parallel. (In USB-3 the Set-Address thing is a no-op anyhow; the
hardware takes over that role completely and does it during the reset.)
But that quote from the spec implies that the resets must not be done in
parallel.

> > I don't think this is a good idea. The driver core needs to be able to
> > access the controller while this function is running. You can
> > introduce a new mutex if you want, perhaps in the primary hcd
> > structure, but don't use bus->controller->mutex.
>
> An explicit lock might be a good idea. I was trying to avoid adding
> another lock so used the one in struct device as it appeared unused.

It gets used by the driver core. Don't worry about the overhead of
adding a new lock if it really is needed; the number of USB buses or
controllers on any computer isn't big enough to matter.

> The
> XHCI code seems to only use the lock in struct xhci_hcd and ehci uses
> struct ehci->lock.
>
> btw I think this bug may be the same as reported at
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437492

It could well be.

Alan Stern
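
For illustration only, here is a rough userspace sketch of the "new mutex
in the primary hcd structure" idea discussed above. It is not kernel code
and not the eventual patch: struct demo_hcd, struct demo_bus, address0_lock
and the demo_* functions are all hypothetical names, and a pthread mutex
stands in for a kernel mutex. It only models the rule quoted from the xhci
spec, that no more than one slot should be moved to the Default state at a
time, by having both buses of one controller take the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the primary hcd; NOT the real struct usb_hcd. */
struct demo_hcd {
	pthread_mutex_t address0_lock;	/* one lock per controller (assumed name) */
	int slots_in_default;		/* slots currently in the Default state */
	int max_seen;			/* highest concurrent count observed */
};

/* Hypothetical stand-in for a bus; both buses share one primary hcd. */
struct demo_bus {
	struct demo_hcd *primary;
	int ports;
};

/* Models the reset step of port init, serialised per controller. */
static void demo_port_init(struct demo_bus *bus)
{
	struct demo_hcd *hcd = bus->primary;

	pthread_mutex_lock(&hcd->address0_lock);
	hcd->slots_in_default++;
	/* Invariant from the quoted spec note: at most one slot at a time. */
	assert(hcd->slots_in_default == 1);
	if (hcd->slots_in_default > hcd->max_seen)
		hcd->max_seen = hcd->slots_in_default;
	sched_yield();			/* invite the other hub thread to run */
	hcd->slots_in_default--;
	pthread_mutex_unlock(&hcd->address0_lock);
}

/* One hub thread per bus, initialising each of its ports in turn. */
static void *demo_hub_thread(void *arg)
{
	struct demo_bus *bus = arg;
	int i;

	for (i = 0; i < bus->ports; i++)
		demo_port_init(bus);
	return NULL;
}

/* Runs both buses in parallel; returns the peak number of slots in Default. */
int demo_run(int ports_per_bus)
{
	struct demo_hcd hcd = { .slots_in_default = 0, .max_seen = 0 };
	struct demo_bus usb2 = { &hcd, ports_per_bus };
	struct demo_bus usb3 = { &hcd, ports_per_bus };
	pthread_t t2, t3;

	pthread_mutex_init(&hcd.address0_lock, NULL);
	pthread_create(&t2, NULL, demo_hub_thread, &usb2);
	pthread_create(&t3, NULL, demo_hub_thread, &usb3);
	pthread_join(t2, NULL);
	pthread_join(t3, NULL);
	pthread_mutex_destroy(&hcd.address0_lock);
	return hcd.max_seen;
}
```

Because both demo_bus instances point at the same demo_hcd, the peak
reported by demo_run() cannot exceed 1. Giving each bus its own lock
instead would let the two hub threads overlap, which is the situation
described in the original report.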