Hi,
John Stultz wrote:
> On Fri, Jun 28, 2019 at 3:58 PM Sasha Levin <[email protected]> wrote:
>> On Fri, Jun 28, 2019 at 06:24:04PM +0000, John Stultz wrote:
>>> With recent changes in AOSP, adb is using asynchronous io, which
>>> causes the following crash usually on a reboot:
>>>
>>> [ 184.278302] BUG: scheduling while atomic: ksoftirqd/0/9/0x00000104
>>> [ 184.284617] Modules linked in: wl18xx wlcore snd_soc_hdmi_codec
>>> wlcore_sdio tcpci_rt1711h tcpci tcpm typec adv7511 cec dwc3 phy_hi3660_usb3
>>> snd_soc_simple_card snd_soc_a
>>> [ 184.316034] Preemption disabled at:
>>> [ 184.316072] [<ffffff8008081de4>] __do_softirq+0x64/0x398
>>> [ 184.324953] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G S
>>> 4.19.43-00669-g8e4970572c43-dirty #356
>>> [ 184.334963] Hardware name: HiKey960 (DT)
>>> [ 184.338892] Call trace:
>>> [ 184.341352] dump_backtrace+0x0/0x158
>>> [ 184.345025] show_stack+0x14/0x20
>>> [ 184.348355] dump_stack+0x80/0xa4
>>> [ 184.351685] __schedule_bug+0x6c/0xc0
>>> [ 184.355363] __schedule+0x64c/0x978
>>> [ 184.358863] schedule+0x2c/0x90
>>> [ 184.362053] dwc3_gadget_ep_dequeue+0x274/0x388 [dwc3]
>>> [ 184.367210] usb_ep_dequeue+0x24/0xf8
>>> [ 184.370884] ffs_aio_cancel+0x3c/0x80
>>> [ 184.374561] free_ioctx_users+0x40/0x148
>>> [ 184.378500] percpu_ref_switch_to_atomic_rcu+0x180/0x1c0
>>> [ 184.383830] rcu_process_callbacks+0x24c/0x5d8
>>> [ 184.388283] __do_softirq+0x13c/0x398
>>> [ 184.391959] run_ksoftirqd+0x3c/0x48
>>> [ 184.395549] smpboot_thread_fn+0x220/0x288
>>> [ 184.399660] kthread+0x12c/0x130
>>> [ 184.402901] ret_from_fork+0x10/0x1c
>>>
>>>
>>> This happens as usb_ep_dequeue can be called in interrupt
>>> context, and dwc3_gadget_ep_dequeue() then calls
>>> wait_event_lock_irq() which can sleep.
>>>
>>> Upstream kernels are not affected due to the change
>>> fec9095bdef4 ("dwc3: gadget: remove wait_end_transfer") which
>>> removes the wait_event_lock_irq code. Unfortunately that change
>>> has a number of dependencies, which I'm submitting here.
>>>
>>> Also, to match upstream, in this series I've reverted one
>>> change that was backported to -stable, to replace it with the
>>> cherry-picked upstream commit (as the dependencies are now
>>> there)
>>>
>>> This issue also affects 4.14, 4.9 and I believe 4.4 kernels,
>>> however I don't know how to best backport this functionality
>>> that far back. Help from the maintainers would be very much
>>> appreciated!
>>>
>>>
>>> New in v2:
>>> * Reordered the patchset to put the revert patch first, which
>>> avoids any bisection build issues. (Thanks to Jack Pham for
>>> the suggestion!)
>>>
>>>
>>> Feedback and comments would be welcome!
>> I've queued it up for 4.19.
>>
>> Is it the case that for older kernels the dependency list is too long?
> Yea. It gets ugly and I'm not enough of an expert on the driver to
> feel comfortable knowing if I'm doing the right thing reworking this
> stack onto an even older tree.
>
> But I do see crashes on reboot w/ 4.14 and 4.9 (and I suspect 4.4 as
> well), so I'll need to figure out something eventually.
>
>
If you're backporting this series, you also need to apply these two
fixes for it.

This fixes a race issue:
c5353b225df9 ("usb: dwc3: gadget: don't enable interrupt when disabling endpoint")

This fixes an incorrect TRB skip:
c7152763f02e ("usb: dwc3: Reset num_trbs after skipping")

BR,
Thinh