Re: [PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-04-07 Thread Chris Bainbridge
On Wed, Feb 10, 2016 at 07:13:38PM +0200, Mathias Nyman wrote:
> 
> Most likely xhci is messed up after two device slots are in default state at 
> the same time.
> This happens when both threads are in hub_port_init() have called 
> hub_port_reset()
> 
> The issue becomes visible when the the descriptor read and set address both 
> fail after
> the port resets.
> 
> xhci specs 4.5.3 has one tiny note about this:
> "Note: Software shall not transition more than one Device Slot to the Default 
> State at a time"
> 
> So to me, and from xhci pov this patch looks like the correct solution,
> but I might be missing some usb core side details.
> 
> -Mathias
> 

Just following up to see if this patch disappeared into the void?


Re: [PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-04-07 Thread Chris Bainbridge
On Wed, Feb 10, 2016 at 07:13:38PM +0200, Mathias Nyman wrote:
> 
> Most likely xhci is messed up after two device slots are in default state at 
> the same time.
> This happens when both threads are in hub_port_init() have called 
> hub_port_reset()
> 
> The issue becomes visible when the the descriptor read and set address both 
> fail after
> the port resets.
> 
> xhci specs 4.5.3 has one tiny note about this:
> "Note: Software shall not transition more than one Device Slot to the Default 
> State at a time"
> 
> So to me, and from xhci pov this patch looks like the correct solution,
> but I might be missing some usb core side details.
> 
> -Mathias
> 

Just following up to see if this patch disappeared into the void?


Re: [PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-02-10 Thread Mathias Nyman

On 08.02.2016 15:49, Chris Bainbridge wrote:

The XHCI controller presents two USB buses to the system - one for USB 2
and one for USB 3. When only one bus is locked there is a race condition
with two threads in hub_port_init:

[8.984500] Call Trace:
[8.985698]  [] schedule+0x37/0x90
[8.986918]  [] usb_kill_urb+0x8d/0xd0
[8.988130]  [] ? wake_up_atomic_t+0x30/0x30
[8.989343]  [] usb_start_wait_urb+0xbe/0x150
[8.990561]  [] usb_control_msg+0xbc/0xf0
[8.991766]  [] hub_port_init+0x51e/0xb70
[8.992964]  [] hub_event+0x817/0x1570
[8.994156]  [] process_one_work+0x1ff/0x620
[8.995342]  [] ? process_one_work+0x15f/0x620
[8.996528]  [] worker_thread+0x64/0x4b0
[8.997707]  [] ? rescuer_thread+0x390/0x390
[8.998883]  [] kthread+0x105/0x120
[9.56]  [] ? kthread_create_on_node+0x200/0x200
[9.001241]  [] ret_from_fork+0x3f/0x70
[9.002420]  [] ? kthread_create_on_node+0x200/0x200

[9.870837] Call Trace:
[9.875664]  [] xhci_setup_device+0x53d/0xa40
[9.876871]  [] xhci_address_device+0xe/0x10
[9.878068]  [] hub_port_init+0x1bf/0xb70
[9.879262]  [] ? trace_hardirqs_on+0xd/0x10
[9.880465]  [] hub_event+0x817/0x1570
[9.881668]  [] process_one_work+0x1ff/0x620
[9.882869]  [] ? process_one_work+0x15f/0x620
[9.884074]  [] worker_thread+0x64/0x4b0
[9.885268]  [] ? rescuer_thread+0x390/0x390
[9.886457]  [] kthread+0x105/0x120
[9.887634]  [] ? kthread_create_on_node+0x200/0x200
[9.17]  [] ret_from_fork+0x3f/0x70
[9.889995]  [] ? kthread_create_on_node+0x200/0x200

Which results from the two call chains:

hub_port_init
  usb_get_device_descriptor
   usb_get_descriptor
usb_control_msg
 usb_internal_control_msg
  usb_start_wait_urb
   usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
  hub_set_address
   xhci_address_device
xhci_setup_device

The logged kernel errors are:

[8.034843] xhci_hcd :00:14.0: Timeout while waiting for setup device 
command
[   13.183701] usb 3-3: device descriptor read/all, error -110

On a test system this failure occurred on 6% of all boots.

Hypothetically, it should perhaps be possible to set the address of the
hub on one bus while the hub on the other bus already has an outstanding
descriptor read request. But as this is not working reliably, fix it by
locking the controller in hub_port_init to prevent parallel
initialisation of both buses.



Most likely xhci is messed up after two device slots are in default state at 
the same time.
This happens when both threads are in hub_port_init() have called 
hub_port_reset()

The issue becomes visible when the the descriptor read and set address both 
fail after
the port resets.

xhci specs 4.5.3 has one tiny note about this:
"Note: Software shall not transition more than one Device Slot to the Default State 
at a time"

So to me, and from xhci pov this patch looks like the correct solution,
but I might be missing some usb core side details.

-Mathias



Re: [PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-02-10 Thread Mathias Nyman

On 08.02.2016 15:49, Chris Bainbridge wrote:

The XHCI controller presents two USB buses to the system - one for USB 2
and one for USB 3. When only one bus is locked there is a race condition
with two threads in hub_port_init:

[8.984500] Call Trace:
[8.985698]  [] schedule+0x37/0x90
[8.986918]  [] usb_kill_urb+0x8d/0xd0
[8.988130]  [] ? wake_up_atomic_t+0x30/0x30
[8.989343]  [] usb_start_wait_urb+0xbe/0x150
[8.990561]  [] usb_control_msg+0xbc/0xf0
[8.991766]  [] hub_port_init+0x51e/0xb70
[8.992964]  [] hub_event+0x817/0x1570
[8.994156]  [] process_one_work+0x1ff/0x620
[8.995342]  [] ? process_one_work+0x15f/0x620
[8.996528]  [] worker_thread+0x64/0x4b0
[8.997707]  [] ? rescuer_thread+0x390/0x390
[8.998883]  [] kthread+0x105/0x120
[9.56]  [] ? kthread_create_on_node+0x200/0x200
[9.001241]  [] ret_from_fork+0x3f/0x70
[9.002420]  [] ? kthread_create_on_node+0x200/0x200

[9.870837] Call Trace:
[9.875664]  [] xhci_setup_device+0x53d/0xa40
[9.876871]  [] xhci_address_device+0xe/0x10
[9.878068]  [] hub_port_init+0x1bf/0xb70
[9.879262]  [] ? trace_hardirqs_on+0xd/0x10
[9.880465]  [] hub_event+0x817/0x1570
[9.881668]  [] process_one_work+0x1ff/0x620
[9.882869]  [] ? process_one_work+0x15f/0x620
[9.884074]  [] worker_thread+0x64/0x4b0
[9.885268]  [] ? rescuer_thread+0x390/0x390
[9.886457]  [] kthread+0x105/0x120
[9.887634]  [] ? kthread_create_on_node+0x200/0x200
[9.17]  [] ret_from_fork+0x3f/0x70
[9.889995]  [] ? kthread_create_on_node+0x200/0x200

Which results from the two call chains:

hub_port_init
  usb_get_device_descriptor
   usb_get_descriptor
usb_control_msg
 usb_internal_control_msg
  usb_start_wait_urb
   usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
  hub_set_address
   xhci_address_device
xhci_setup_device

The logged kernel errors are:

[8.034843] xhci_hcd :00:14.0: Timeout while waiting for setup device 
command
[   13.183701] usb 3-3: device descriptor read/all, error -110

On a test system this failure occurred on 6% of all boots.

Hypothetically, it should perhaps be possible to set the address of the
hub on one bus while the hub on the other bus already has an outstanding
descriptor read request. But as this is not working reliably, fix it by
locking the controller in hub_port_init to prevent parallel
initialisation of both buses.



Most likely xhci is messed up after two device slots are in default state at 
the same time.
This happens when both threads are in hub_port_init() have called 
hub_port_reset()

The issue becomes visible when the the descriptor read and set address both 
fail after
the port resets.

xhci specs 4.5.3 has one tiny note about this:
"Note: Software shall not transition more than one Device Slot to the Default State 
at a time"

So to me, and from xhci pov this patch looks like the correct solution,
but I might be missing some usb core side details.

-Mathias



[PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-02-08 Thread Chris Bainbridge
The XHCI controller presents two USB buses to the system - one for USB 2
and one for USB 3. When only one bus is locked there is a race condition
with two threads in hub_port_init:

[8.984500] Call Trace:
[8.985698]  [] schedule+0x37/0x90
[8.986918]  [] usb_kill_urb+0x8d/0xd0
[8.988130]  [] ? wake_up_atomic_t+0x30/0x30
[8.989343]  [] usb_start_wait_urb+0xbe/0x150
[8.990561]  [] usb_control_msg+0xbc/0xf0
[8.991766]  [] hub_port_init+0x51e/0xb70
[8.992964]  [] hub_event+0x817/0x1570
[8.994156]  [] process_one_work+0x1ff/0x620
[8.995342]  [] ? process_one_work+0x15f/0x620
[8.996528]  [] worker_thread+0x64/0x4b0
[8.997707]  [] ? rescuer_thread+0x390/0x390
[8.998883]  [] kthread+0x105/0x120
[9.56]  [] ? kthread_create_on_node+0x200/0x200
[9.001241]  [] ret_from_fork+0x3f/0x70
[9.002420]  [] ? kthread_create_on_node+0x200/0x200

[9.870837] Call Trace:
[9.875664]  [] xhci_setup_device+0x53d/0xa40
[9.876871]  [] xhci_address_device+0xe/0x10
[9.878068]  [] hub_port_init+0x1bf/0xb70
[9.879262]  [] ? trace_hardirqs_on+0xd/0x10
[9.880465]  [] hub_event+0x817/0x1570
[9.881668]  [] process_one_work+0x1ff/0x620
[9.882869]  [] ? process_one_work+0x15f/0x620
[9.884074]  [] worker_thread+0x64/0x4b0
[9.885268]  [] ? rescuer_thread+0x390/0x390
[9.886457]  [] kthread+0x105/0x120
[9.887634]  [] ? kthread_create_on_node+0x200/0x200
[9.17]  [] ret_from_fork+0x3f/0x70
[9.889995]  [] ? kthread_create_on_node+0x200/0x200

Which results from the two call chains:

hub_port_init
 usb_get_device_descriptor
  usb_get_descriptor
   usb_control_msg
usb_internal_control_msg
 usb_start_wait_urb
  usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
 hub_set_address
  xhci_address_device
   xhci_setup_device

The logged kernel errors are:

[8.034843] xhci_hcd :00:14.0: Timeout while waiting for setup device 
command
[   13.183701] usb 3-3: device descriptor read/all, error -110

On a test system this failure occurred on 6% of all boots.

Hypothetically, it should perhaps be possible to set the address of the
hub on one bus while the hub on the other bus already has an outstanding
descriptor read request. But as this is not working reliably, fix it by
locking the controller in hub_port_init to prevent parallel
initialisation of both buses.

Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in 
parallel")
Signed-off-by: Chris Bainbridge 
---
 drivers/usb/core/hcd.c  | 15 +--
 drivers/usb/core/hub.c  |  8 
 include/linux/usb.h |  3 +--
 include/linux/usb/hcd.h |  1 +
 4 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index df0e3b92533a..5824e819176a 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -966,7 +966,7 @@ static void usb_bus_init (struct usb_bus *bus)
bus->bandwidth_allocated = 0;
bus->bandwidth_int_reqs  = 0;
bus->bandwidth_isoc_reqs = 0;
-   mutex_init(>usb_address0_mutex);
+   mutex_init(>devnum_next_mutex);
 
INIT_LIST_HEAD (>bus_list);
 }
@@ -2497,6 +2497,14 @@ struct usb_hcd *usb_create_shared_hcd(const struct 
hc_driver *driver,
return NULL;
}
if (primary_hcd == NULL) {
+   hcd->address0_mutex = kmalloc(sizeof(*hcd->address0_mutex),
+   GFP_KERNEL);
+   if (!hcd->address0_mutex) {
+   kfree(hcd);
+   dev_dbg(dev, "hcd address0 mutex alloc failed\n");
+   return NULL;
+   }
+   mutex_init(hcd->address0_mutex);
hcd->bandwidth_mutex = kmalloc(sizeof(*hcd->bandwidth_mutex),
GFP_KERNEL);
if (!hcd->bandwidth_mutex) {
@@ -2508,6 +2516,7 @@ struct usb_hcd *usb_create_shared_hcd(const struct 
hc_driver *driver,
dev_set_drvdata(dev, hcd);
} else {
mutex_lock(_port_peer_mutex);
+   hcd->address0_mutex = primary_hcd->address0_mutex;
hcd->bandwidth_mutex = primary_hcd->bandwidth_mutex;
hcd->primary_hcd = primary_hcd;
primary_hcd->primary_hcd = primary_hcd;
@@ -2574,8 +2583,10 @@ static void hcd_release(struct kref *kref)
struct usb_hcd *hcd = container_of (kref, struct usb_hcd, kref);
 
mutex_lock(_port_peer_mutex);
-   if (usb_hcd_is_primary_hcd(hcd))
+   if (usb_hcd_is_primary_hcd(hcd)) {
+   kfree(hcd->address0_mutex);
kfree(hcd->bandwidth_mutex);
+   }
if (hcd->shared_hcd) {
struct usb_hcd *peer = hcd->shared_hcd;
 
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 350dcd9af5d8..72ee2338bde0 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2066,7 +2066,7 @@ static void 

[PATCH] usb: core: hub: hub_port_init lock controller instead of bus

2016-02-08 Thread Chris Bainbridge
The XHCI controller presents two USB buses to the system - one for USB 2
and one for USB 3. When only one bus is locked there is a race condition
with two threads in hub_port_init:

[8.984500] Call Trace:
[8.985698]  [] schedule+0x37/0x90
[8.986918]  [] usb_kill_urb+0x8d/0xd0
[8.988130]  [] ? wake_up_atomic_t+0x30/0x30
[8.989343]  [] usb_start_wait_urb+0xbe/0x150
[8.990561]  [] usb_control_msg+0xbc/0xf0
[8.991766]  [] hub_port_init+0x51e/0xb70
[8.992964]  [] hub_event+0x817/0x1570
[8.994156]  [] process_one_work+0x1ff/0x620
[8.995342]  [] ? process_one_work+0x15f/0x620
[8.996528]  [] worker_thread+0x64/0x4b0
[8.997707]  [] ? rescuer_thread+0x390/0x390
[8.998883]  [] kthread+0x105/0x120
[9.56]  [] ? kthread_create_on_node+0x200/0x200
[9.001241]  [] ret_from_fork+0x3f/0x70
[9.002420]  [] ? kthread_create_on_node+0x200/0x200

[9.870837] Call Trace:
[9.875664]  [] xhci_setup_device+0x53d/0xa40
[9.876871]  [] xhci_address_device+0xe/0x10
[9.878068]  [] hub_port_init+0x1bf/0xb70
[9.879262]  [] ? trace_hardirqs_on+0xd/0x10
[9.880465]  [] hub_event+0x817/0x1570
[9.881668]  [] process_one_work+0x1ff/0x620
[9.882869]  [] ? process_one_work+0x15f/0x620
[9.884074]  [] worker_thread+0x64/0x4b0
[9.885268]  [] ? rescuer_thread+0x390/0x390
[9.886457]  [] kthread+0x105/0x120
[9.887634]  [] ? kthread_create_on_node+0x200/0x200
[9.17]  [] ret_from_fork+0x3f/0x70
[9.889995]  [] ? kthread_create_on_node+0x200/0x200

Which results from the two call chains:

hub_port_init
 usb_get_device_descriptor
  usb_get_descriptor
   usb_control_msg
usb_internal_control_msg
 usb_start_wait_urb
  usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
 hub_set_address
  xhci_address_device
   xhci_setup_device

The logged kernel errors are:

[8.034843] xhci_hcd :00:14.0: Timeout while waiting for setup device 
command
[   13.183701] usb 3-3: device descriptor read/all, error -110

On a test system this failure occurred on 6% of all boots.

Hypothetically, it should perhaps be possible to set the address of the
hub on one bus while the hub on the other bus already has an outstanding
descriptor read request. But as this is not working reliably, fix it by
locking the controller in hub_port_init to prevent parallel
initialisation of both buses.

Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in 
parallel")
Signed-off-by: Chris Bainbridge 
---
 drivers/usb/core/hcd.c  | 15 +--
 drivers/usb/core/hub.c  |  8 
 include/linux/usb.h |  3 +--
 include/linux/usb/hcd.h |  1 +
 4 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index df0e3b92533a..5824e819176a 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -966,7 +966,7 @@ static void usb_bus_init (struct usb_bus *bus)
bus->bandwidth_allocated = 0;
bus->bandwidth_int_reqs  = 0;
bus->bandwidth_isoc_reqs = 0;
-   mutex_init(>usb_address0_mutex);
+   mutex_init(>devnum_next_mutex);
 
INIT_LIST_HEAD (>bus_list);
 }
@@ -2497,6 +2497,14 @@ struct usb_hcd *usb_create_shared_hcd(const struct 
hc_driver *driver,
return NULL;
}
if (primary_hcd == NULL) {
+   hcd->address0_mutex = kmalloc(sizeof(*hcd->address0_mutex),
+   GFP_KERNEL);
+   if (!hcd->address0_mutex) {
+   kfree(hcd);
+   dev_dbg(dev, "hcd address0 mutex alloc failed\n");
+   return NULL;
+   }
+   mutex_init(hcd->address0_mutex);
hcd->bandwidth_mutex = kmalloc(sizeof(*hcd->bandwidth_mutex),
GFP_KERNEL);
if (!hcd->bandwidth_mutex) {
@@ -2508,6 +2516,7 @@ struct usb_hcd *usb_create_shared_hcd(const struct 
hc_driver *driver,
dev_set_drvdata(dev, hcd);
} else {
mutex_lock(_port_peer_mutex);
+   hcd->address0_mutex = primary_hcd->address0_mutex;
hcd->bandwidth_mutex = primary_hcd->bandwidth_mutex;
hcd->primary_hcd = primary_hcd;
primary_hcd->primary_hcd = primary_hcd;
@@ -2574,8 +2583,10 @@ static void hcd_release(struct kref *kref)
struct usb_hcd *hcd = container_of (kref, struct usb_hcd, kref);
 
mutex_lock(_port_peer_mutex);
-   if (usb_hcd_is_primary_hcd(hcd))
+   if (usb_hcd_is_primary_hcd(hcd)) {
+   kfree(hcd->address0_mutex);
kfree(hcd->bandwidth_mutex);
+   }
if (hcd->shared_hcd) {
struct usb_hcd *peer = hcd->shared_hcd;
 
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 350dcd9af5d8..72ee2338bde0 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@