Re: usb: dwc2: crash regression in commit 3bc04e28a030 (bisected)

2018-07-03 Thread Antti Seppälä
On 3 July 2018 at 09:56, Doug Anderson  wrote:
> Hi,
>
> On Sun, Jul 1, 2018 at 11:30 PM, Antti Seppälä  wrote:
>> On 30 June 2018 at 02:57, Doug Anderson  wrote:
>>> Hi,
>>>
>>> On Fri, Jun 29, 2018 at 11:29 AM, Antti Seppälä  wrote:
 Hi Doug, John and linux-usb.

 I'd like to report a regression in commit 3bc04e28a030 (usb: dwc2:
 host: Get aligned DMA in a more supported way)
>>>
>>> Seems unlikely, but any chance that
>>>  helps you?
>>>
>>
>> Thank you for your suggestion but unfortunately the patch does not
>> help and the crash remains.
>
> A few more shots in the dark in case they help:
>
> 1. For the kmalloc() in dwc2_alloc_dma_aligned_buffer(), change from:
>
> kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);
>
> to:
>
> kmalloc_ptr = kmalloc(kmalloc_size, mem_flags | GFP_DMA);
>
>
> The old code used to hardcode this, so maybe it somehow makes a difference?
>

I tried this but it did not have any effect on the crash.

> ---
>
> 2. Change DWC2_USB_DMA_ALIGN to a larger size.  Maybe 32 or 64?
>

I tried these values too but they did not help. It seems to me that
these change the alignment of the start of the struct
dma_aligned_buffer but since the issue is with the alignment of ->data
inside that struct it is always moved to an unaligned location (offset
by the two pointers at the start of the struct).

There is even a mention of this requirement in Documentation/DMA-API.txt:

  Memory coherency operates at a granularity called the cache
  line width.  In order for memory mapped by this API to operate
  correctly, the mapped region must begin exactly on a cache line
  boundary and end exactly on one (to prevent two separately mapped
  regions from sharing a single cache line).  Since the cache line size
  may not be known at compile time, the API will not enforce this
  requirement.  Therefore, it is recommended that driver writers who
  don't take special care to determine the cache line size at run time
  only map virtual regions that begin and end on page boundaries (which
  are guaranteed also to be cache line boundaries).


> ---
>
> 3. Undo just the part of the patch that removed:
>
> /*
>  * Clip max_transfer_size to 65535. dwc2_hc_setup_align_buf() allocates
>  * coherent buffers with this size, and if it's too large we can
>  * exhaust the coherent DMA pool.
>  */
> if (hw->max_transfer_size > 65535)
>   hw->max_transfer_size = 65535;
>
> Maybe there's something on your platform where you have a problem with
> big transfers?
>

I tried adding back this one as well but unfortunately no help from that either.

Then I wrote a proof-of-concept patch that really solves the issue for me.

The patch allocates additional space at the end of that temp bounce
buffer and stores the original value of urb->transfer_buffer there for
keeping until it is needed again.

This way the bounce buffer for DMA is aligned on a page boundary that
is compliant from DMA-API.txt perspective.

Any thoughts or comments? If the patch doesn't look too crazy I can
send it properly with git.

-- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2606,34 +2606,28 @@

 #define DWC2_USB_DMA_ALIGN 4

-struct dma_aligned_buffer {
-void *kmalloc_ptr;
-void *old_xfer_buffer;
-u8 data[0];
-};
-
 static void dwc2_free_dma_aligned_buffer(struct urb *urb)
 {
-struct dma_aligned_buffer *temp;
+void *stored_xfer_buffer;

 if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER))
 return;

-temp = container_of(urb->transfer_buffer,
-struct dma_aligned_buffer, data);
+memcpy(&stored_xfer_buffer, urb->transfer_buffer +
+urb->transfer_buffer_length, sizeof(void *));

 if (usb_urb_dir_in(urb))
-memcpy(temp->old_xfer_buffer, temp->data,
+memcpy(stored_xfer_buffer, urb->transfer_buffer,
urb->transfer_buffer_length);
-urb->transfer_buffer = temp->old_xfer_buffer;
-kfree(temp->kmalloc_ptr);
+kfree(urb->transfer_buffer);
+urb->transfer_buffer = stored_xfer_buffer;

 urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER;
 }

 static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
 {
-struct dma_aligned_buffer *temp, *kmalloc_ptr;
+void *kmalloc_ptr;
 size_t kmalloc_size;

 if (urb->num_sgs || urb->sg ||
@@ -2641,22 +2635,21 @@
 !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
 return 0;

-/* Allocate a buffer with enough padding for alignment */
-kmalloc_size = urb->transfer_buffer_length +
-sizeof(struct dma_aligned_buffer) + DWC2_USB_DMA_ALIGN - 1;
+/* Allocate a buffer with enough padding for original pointer */
+kmalloc_size = urb->transfer_buffer_length + sizeof(void *);

 kmalloc_ptr = kmalloc(kmalloc_size, mem_

Re: usb: dwc2: crash regression in commit 3bc04e28a030 (bisected)

2018-07-02 Thread Doug Anderson
Hi,

On Sun, Jul 1, 2018 at 11:30 PM, Antti Seppälä  wrote:
> On 30 June 2018 at 02:57, Doug Anderson  wrote:
>> Hi,
>>
>> On Fri, Jun 29, 2018 at 11:29 AM, Antti Seppälä  wrote:
>>> Hi Doug, John and linux-usb.
>>>
>>> I'd like to report a regression in commit 3bc04e28a030 (usb: dwc2:
>>> host: Get aligned DMA in a more supported way)
>>
>> Seems unlikely, but any chance that
>>  helps you?
>>
>
> Thank you for your suggestion but unfortunately the patch does not
> help and the crash remains.

A few more shots in the dark in case they help:

1. For the kmalloc() in dwc2_alloc_dma_aligned_buffer(), change from:

kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);

to:

kmalloc_ptr = kmalloc(kmalloc_size, mem_flags | GFP_DMA);


The old code used to hardcode this, so maybe it somehow makes a difference?

---

2. Change DWC2_USB_DMA_ALIGN to a larger size.  Maybe 32 or 64?

---

3. Undo just the part of the patch that removed:

/*
 * Clip max_transfer_size to 65535. dwc2_hc_setup_align_buf() allocates
 * coherent buffers with this size, and if it's too large we can
 * exhaust the coherent DMA pool.
 */
if (hw->max_transfer_size > 65535)
  hw->max_transfer_size = 65535;

Maybe there's something on your platform where you have a problem with
big transfers?

===

It feels like there's something unique about your setup since I
haven't heard about this crash and that patch is 1.5 years old.
Certainly it could be the MIPS / Big Endian, but I suppose it could
also be the driver for the USB device you're plugging in.  Any chance
the new code is just tickling a problem in that driver?  Can you
reproduce similar crashes with any other USB devices on your platform?


-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: usb: dwc2: crash regression in commit 3bc04e28a030 (bisected)

2018-07-01 Thread Antti Seppälä
On 30 June 2018 at 02:57, Doug Anderson  wrote:
> Hi,
>
> On Fri, Jun 29, 2018 at 11:29 AM, Antti Seppälä  wrote:
>> Hi Doug, John and linux-usb.
>>
>> I'd like to report a regression in commit 3bc04e28a030 (usb: dwc2:
>> host: Get aligned DMA in a more supported way)
>
> Seems unlikely, but any chance that
>  helps you?
>

Thank you for your suggestion but unfortunately the patch does not
help and the crash remains.

>
>> Apparently the patch does something nasty that the Lantiq platform
>> really does not like as whenever I plug my usb 3g modem into the usb
>> port of my BT HomeHub v5 (Lantiq XRX200) I get a kernel
>> oops/crash/panic with usually quite a weird content that looks like
>> some sort of memory corruption.
>>
>> I've bisected the crash and reverting 3bc04e28a030 allows the 3g-modem
>> to be plugged and the kernel does not crash.
>>
>> Below is the console log when I plug the modem in. I used stable
>> vanilla kernel 4.9.109 from OpenWrt during my tests with dwc2 debug
>> prints enabled:
>>
>> root@lantiq:/# echo -n "module dwc2 +p" >
>> /sys/kernel/debug/dynamic_debug/control
>> [   92.563454] dwc2 1e101000.ifxhcd: gintsts=0521  gintmsk=f3000806
>> [   92.568762] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
>> port status changed
>> [   92.576447] dwc2 1e101000.ifxhcd:   port_connect_status_change: 1
>> [   92.582523] dwc2 1e101000.ifxhcd:   port_reset_change: 0
>> [   92.587830] dwc2 1e101000.ifxhcd:   port_enable_change: 0
>> [   92.593228] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
>> [   92.598710] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
>> [   92.607242] dwc2 1e101000.ifxhcd: ClearPortFeature 
>> USB_PORT_FEAT_C_CONNECTION
>> [   92.758240] dwc2 1e101000.ifxhcd: SetPortFeature
>> [   92.761535] dwc2 1e101000.ifxhcd: SetPortFeature - USB_PORT_FEAT_RESET
>> [   92.768063] dwc2 1e101000.ifxhcd: In host mode, hprt0=00021501
>> [   92.841013] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
>> [   92.905329] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_RESET
>> [   92.968536] usb 1-1: new high-speed USB device number 2 using dwc2
>> [   92.975029] dwc2 1e101000.ifxhcd: SetPortFeature
>> [   92.978358] dwc2 1e101000.ifxhcd: SetPortFeature - USB_PORT_FEAT_RESET
>> [   92.984837] dwc2 1e101000.ifxhcd: In host mode, hprt0=1101
>> [   92.990674] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
>> [   93.060349] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
>> port status changed
>> [   93.067020] dwc2 1e101000.ifxhcd:   port_connect_status_change: 0
>> [   93.073099] dwc2 1e101000.ifxhcd:   port_reset_change: 0
>> [   93.078375] dwc2 1e101000.ifxhcd:   port_enable_change: 1
>> [   93.083766] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
>> [   93.089252] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
>> [   93.096284] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
>> [   93.152672] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_RESET
>> [   93.216952] dwc2 1e101000.ifxhcd: DWC OTG HCD EP DISABLE:
>> bEndpointAddress=0x00, ep->hcpriv=86e4fa00
>> [   93.224792] dwc2 1e101000.ifxhcd: DWC OTG HCD EP DISABLE:
>> bEndpointAddress=0x00, ep->hcpriv=  (null)
>> [   93.233926] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
>> bEndpointAddress=0x00
>> [   93.268547] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
>> bEndpointAddress=0x81
>> [   93.274399] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
>> bEndpointAddress=0x01
>> [   93.307463] usb-storage 1-1:1.0: USB Mass Storage device detected
>> [   93.312238] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
>> port status changed
>> [   93.312256] dwc2 1e101000.ifxhcd:   port_connect_status_change: 0
>> [   93.312270] dwc2 1e101000.ifxhcd:   port_reset_change: 0
>> [   93.312329] dwc2 1e101000.ifxhcd:   port_enable_change: 1
>> [   93.312342] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
>> [   93.312356] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
>> [   93.408514] scsi host0: usb-storage 1-1:1.0
>> [   93.437010] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_ENABLE
>> [   94.152597] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
>> bEndpointAddress=0x01
>> [   94.166421] dwc2 1e101000.ifxhcd: gintsts=2529  gintmsk=f3000806
>> [   94.171336] dwc2 1e101000.ifxhcd: ++Disconnect Detected Interrupt++
>> (Host) a_host
>> [   94.180561] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
>> bEndpointAddress=0x01
>> [   94.186473] dwc2 1e101000.ifxhcd: Not connected
>> [   94.300415] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
>> port status changed
>> [   94.307074] dwc2 1e101000.ifxhcd:   port_connect_status_change: 1
>> [   94.313203] dwc2 1e101000.ifxhcd:   port_reset_change: 0
>> [   94.318467] dwc2 1e101000.ifxhcd:   port_enable_change: 1
>> [   94.323858] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
>> [   94.329344] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
>> [   94.336998] 

Re: usb: dwc2: crash regression in commit 3bc04e28a030 (bisected)

2018-06-29 Thread Doug Anderson
Hi,

On Fri, Jun 29, 2018 at 11:29 AM, Antti Seppälä  wrote:
> Hi Doug, John and linux-usb.
>
> I'd like to report a regression in commit 3bc04e28a030 (usb: dwc2:
> host: Get aligned DMA in a more supported way)

Seems unlikely, but any chance that
 helps you?


> Apparently the patch does something nasty that the Lantiq platform
> really does not like as whenever I plug my usb 3g modem into the usb
> port of my BT HomeHub v5 (Lantiq XRX200) I get a kernel
> oops/crash/panic with usually quite a weird content that looks like
> some sort of memory corruption.
>
> I've bisected the crash and reverting 3bc04e28a030 allows the 3g-modem
> to be plugged and the kernel does not crash.
>
> Below is the console log when I plug the modem in. I used stable
> vanilla kernel 4.9.109 from OpenWrt during my tests with dwc2 debug
> prints enabled:
>
> root@lantiq:/# echo -n "module dwc2 +p" >
> /sys/kernel/debug/dynamic_debug/control
> [   92.563454] dwc2 1e101000.ifxhcd: gintsts=0521  gintmsk=f3000806
> [   92.568762] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
> port status changed
> [   92.576447] dwc2 1e101000.ifxhcd:   port_connect_status_change: 1
> [   92.582523] dwc2 1e101000.ifxhcd:   port_reset_change: 0
> [   92.587830] dwc2 1e101000.ifxhcd:   port_enable_change: 0
> [   92.593228] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
> [   92.598710] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
> [   92.607242] dwc2 1e101000.ifxhcd: ClearPortFeature 
> USB_PORT_FEAT_C_CONNECTION
> [   92.758240] dwc2 1e101000.ifxhcd: SetPortFeature
> [   92.761535] dwc2 1e101000.ifxhcd: SetPortFeature - USB_PORT_FEAT_RESET
> [   92.768063] dwc2 1e101000.ifxhcd: In host mode, hprt0=00021501
> [   92.841013] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
> [   92.905329] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_RESET
> [   92.968536] usb 1-1: new high-speed USB device number 2 using dwc2
> [   92.975029] dwc2 1e101000.ifxhcd: SetPortFeature
> [   92.978358] dwc2 1e101000.ifxhcd: SetPortFeature - USB_PORT_FEAT_RESET
> [   92.984837] dwc2 1e101000.ifxhcd: In host mode, hprt0=1101
> [   92.990674] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
> [   93.060349] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
> port status changed
> [   93.067020] dwc2 1e101000.ifxhcd:   port_connect_status_change: 0
> [   93.073099] dwc2 1e101000.ifxhcd:   port_reset_change: 0
> [   93.078375] dwc2 1e101000.ifxhcd:   port_enable_change: 1
> [   93.083766] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
> [   93.089252] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
> [   93.096284] dwc2 1e101000.ifxhcd: gintsts=0529  gintmsk=f3000806
> [   93.152672] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_RESET
> [   93.216952] dwc2 1e101000.ifxhcd: DWC OTG HCD EP DISABLE:
> bEndpointAddress=0x00, ep->hcpriv=86e4fa00
> [   93.224792] dwc2 1e101000.ifxhcd: DWC OTG HCD EP DISABLE:
> bEndpointAddress=0x00, ep->hcpriv=  (null)
> [   93.233926] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
> bEndpointAddress=0x00
> [   93.268547] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
> bEndpointAddress=0x81
> [   93.274399] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
> bEndpointAddress=0x01
> [   93.307463] usb-storage 1-1:1.0: USB Mass Storage device detected
> [   93.312238] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
> port status changed
> [   93.312256] dwc2 1e101000.ifxhcd:   port_connect_status_change: 0
> [   93.312270] dwc2 1e101000.ifxhcd:   port_reset_change: 0
> [   93.312329] dwc2 1e101000.ifxhcd:   port_enable_change: 1
> [   93.312342] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
> [   93.312356] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
> [   93.408514] scsi host0: usb-storage 1-1:1.0
> [   93.437010] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_ENABLE
> [   94.152597] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
> bEndpointAddress=0x01
> [   94.166421] dwc2 1e101000.ifxhcd: gintsts=2529  gintmsk=f3000806
> [   94.171336] dwc2 1e101000.ifxhcd: ++Disconnect Detected Interrupt++
> (Host) a_host
> [   94.180561] dwc2 1e101000.ifxhcd: DWC OTG HCD EP RESET: 
> bEndpointAddress=0x01
> [   94.186473] dwc2 1e101000.ifxhcd: Not connected
> [   94.300415] dwc2 1e101000.ifxhcd: DWC OTG HCD HUB STATUS DATA: Root
> port status changed
> [   94.307074] dwc2 1e101000.ifxhcd:   port_connect_status_change: 1
> [   94.313203] dwc2 1e101000.ifxhcd:   port_reset_change: 0
> [   94.318467] dwc2 1e101000.ifxhcd:   port_enable_change: 1
> [   94.323858] dwc2 1e101000.ifxhcd:   port_suspend_change: 0
> [   94.329344] dwc2 1e101000.ifxhcd:   port_over_current_change: 0
> [   94.336998] dwc2 1e101000.ifxhcd: ClearPortFeature 
> USB_PORT_FEAT_C_CONNECTION
> [   94.343368] dwc2 1e101000.ifxhcd: ClearPortFeature USB_PORT_FEAT_C_ENABLE
> [   94.350145] usb 1-1: USB disconnect, device number 2
> [   94.365605] dwc2 1e101000.ifxh