On Thu, 4 Feb 2016, Chris Bainbridge wrote:

> On Thu, Feb 04, 2016 at 04:00:51PM -0500, Alan Stern wrote:
> > On Thu, 4 Feb 2016, Chris Bainbridge wrote:
> > 
> > > The XHCI controller presents two USB buses to the system - one for USB 2
> > > and one for USB 3. When only one bus is locked there is a race condition
> > > during hub init that results in errors like:
> > > 
> > >  [   13.183701] usb 3-3: device descriptor read/all, error -110
> > 
> > What exactly is the race condition?  Why does locking both buses fix 
> > it?

...

> hub_port_init is called in parallel for both buses.
> The first thread is in usb_get_device_descriptor when the second one
> enters the function and calls the code to get an address. I don't know
> precisely how it fails - it looks like the functions for doing the
> initialisation are synchronous and sleeping waiting for a response and
> that gets disrupted when the second thread tries to initialise the hub.
> What was the basis for using a lock on the bus rather than the
> controller?

I don't remember exactly.  At the time the code was written, there was
no important distinction between a bus and a controller.  This was long
before USB-3 appeared.

When USB-3 support was added, the basis for keeping the lock on the bus 
was that I assumed there would be no problem talking to different 
devices at address 0 if they were on different buses.

>  Does the spec say that buses of the same controller can be
> initialised in parallel? Mathias previously said:
> 
> > Just found an additional note in the xhci specs section 4.5.3 saying that:
> > "Note: Software shall not transition more than one Device Slot to the 
> > Default State at a time"
> > which is what xhci_setup_device() does in addition to moving slots to the 
> > addressed state
> 
> But I don't know if that means you can do the reset/set address/read
> descriptors in parallel?

In fact you can do the Set-Address and Read-Descriptor parts in
parallel.  (In USB-3 the Set-Address thing is a no-op anyhow; the
hardware takes over that role completely and does it during the reset.)  
But that quote from the spec implies that the resets must not be done
in parallel.

> > I don't think this is a good idea.  The driver core needs to be able to
> > access the controller while this function is running.  You can
> > introduce a new mutex if you want, perhaps in the primary hcd
> > structure, but don't use bus->controller->mutex.
> 
> An explicit lock might be a good idea. I was trying to avoid adding
> another lock so used the one in struct device as it appeared unused.

It gets used by the driver core.  Don't worry about the overhead of 
adding a new lock if it really is needed; the number of USB buses or 
controllers on any computer isn't big enough to matter.
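
As a rough illustration of the separate-mutex idea, here is a small
userspace model (pthreads, not kernel code).  Every name in it
(model_hcd, addr0_mutex, model_port_reset, and so on) is hypothetical
and exists only for this sketch.  It assumes the lock lives in the
primary hcd and that the USB-2 and USB-3 sides of one controller both
reach it through a "primary" pointer.  Only the reset step is placed
under the lock here; how much of hub_port_init really has to be covered
is a separate question that this sketch does not answer.

```c
/* Userspace model only; all names are hypothetical, not kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_hcd {
	struct model_hcd *primary;   /* points to itself on the primary hcd */
	pthread_mutex_t addr0_mutex; /* only the primary's copy is ever taken */
	int in_reset;                /* ports currently being reset */
	int max_in_reset;            /* high-water mark of in_reset */
};

struct model_hub_work {
	struct model_hcd *hcd;
	int nports;
};

/* Pass primary == NULL to make this hcd the primary one. */
void model_hcd_init(struct model_hcd *hcd, struct model_hcd *primary)
{
	hcd->primary = primary ? primary : hcd;
	pthread_mutex_init(&hcd->addr0_mutex, NULL);
	hcd->in_reset = 0;
	hcd->max_in_reset = 0;
}

/* Stand-in for the reset step; serialised per controller, not per bus. */
void model_port_reset(struct model_hcd *hcd)
{
	struct model_hcd *p = hcd->primary;

	pthread_mutex_lock(&p->addr0_mutex);
	p->in_reset++;
	if (p->in_reset > p->max_in_reset)
		p->max_in_reset = p->in_reset;
	assert(p->in_reset == 1);    /* never two resets at once */
	/* ... the actual port reset would happen here ... */
	p->in_reset--;
	pthread_mutex_unlock(&p->addr0_mutex);
}

/* One hub thread per bus, resetting its ports one after another. */
static void *model_hub_thread(void *arg)
{
	struct model_hub_work *w = arg;

	for (int i = 0; i < w->nports; i++)
		model_port_reset(w->hcd);
	return NULL;
}

/* Run both buses of one controller in parallel; returns the largest
 * number of simultaneous resets observed, or -1 on thread failure. */
int model_run_both_buses(int nports)
{
	struct model_hcd usb2, usb3;
	struct model_hub_work w2 = { &usb2, nports };
	struct model_hub_work w3 = { &usb3, nports };
	pthread_t t2, t3;

	model_hcd_init(&usb2, NULL);   /* primary (USB-2) hcd */
	model_hcd_init(&usb3, &usb2);  /* shared (USB-3) hcd */

	if (pthread_create(&t2, NULL, model_hub_thread, &w2) != 0)
		return -1;
	if (pthread_create(&t3, NULL, model_hub_thread, &w3) != 0) {
		pthread_join(t2, NULL);
		return -1;
	}
	pthread_join(t2, NULL);
	pthread_join(t3, NULL);
	return usb2.max_in_reset;
}
```

Because both hcds take the primary's mutex, the two hub threads can
never be in the reset step at the same time, while the driver core's
own use of the lock in struct device is left alone.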

> The
> XHCI code seems to only use the lock in struct xhci_hcd and ehci uses
> struct ehci->lock.
> 
> btw I think this bug may be the same as reported at
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437492

It could well be.

Alan Stern
