Hi Roger Quadros <rog...@ti.com> writes: >>>> For something that simple, we wouldn't even need to use OTG FSM layer >>>> because that brings no benefit for such a simple requirement. >>> >>> no no. I think you got it wrong. I'm not using the OTG FSM layer at all :). >> >> what are all the otg_fsm mentions then? Also you have: >> >> + struct usb_otg otg; >> + struct otg_fsm otg_fsm; >> > > I'll get rid of them. They aren't really needed. > Now I see why you thought I was using the otg_fsm layer. :)
fair enough >>>> Can you either confirm or refute the statement above? >>> >>> The real problem was that if host adapter was removed during a system >>> suspend >>> then while resuming XHCI was locking up like below. This is probably because >>> we're trying to remove/Halt the HCD in the otg_irq_handler before XHCI >>> driver has resumed? >>> >>> How can we ensure that we call dwc3_host_exit() only *after* XHCI driver >>> has resumed? >> >> Well, xHCI is our child, so driver model should *already* be >> guaranteeing that, no? Specially since you're calling this from >> ->complete() which happens after ->resume() has been called for the >> entire tree. It's true, however, that dwc3's ->complete() will be called >> before xhci's ->complete(). Is this the problem you're pointing at? Or >> do you mean xHCI is runtime suspended (or runtime resuming) while you >> call dwc3_host_exit()? If that's the case, then there's a bug in xHCI >> itself. > > Yes dwc3->complete() being called before xhci->complete(), and so HCD being > removed before xhci->complete() causes the lockup. > > It could be a bug in xHCI driver as well. I see... >>> We need a way to mask the OTG events without loosing them while they are >>> masked. >>> Do you know how that could be achieved? >> >> masking doesn't clear events. It just masks them. Look at gadget.c for >> how we do it. Basically, the code we have here is racy, really racy and >> will cause hard-to-debug lockups. Your code can only work with >> IRQF_ONESHOT, which we don't want to add back. >> >> In any case, to mask events, you set BIT 31 in GEVNTSIZ register. Just >> copy it from gadget.c ;-) > > Isn't GEVNTSIZ specific to device side? We're talking about the OTG block > here. that's true, sorry. > Are you sure that setting bit 31 of GEVNTSIZ will mask OTG_irq? I don't think > so. > > Note that OTG_IRQ is a separate IRQ line than PERIPHERAL_IRQ. man, there's really no way to mask OTG events. This can be bad :-) John, can you confirm if there's really no way of masking OTG events without loosing them? >>>>> + /* OEVTEN = 0 */ >>>>> + dwc3_otg_disable_events(dwc, DWC3_OTG_ALL_EVENTS); >>>>> + /* OEVTEN.ConIDStsChngEn = 1. Instead we enable all events */ >>>>> + dwc3_otg_enable_events(dwc, DWC3_OTG_ALL_EVENTS); >>>> >>>> oh, disable everything and enable everything right after. What gives? >>> >>> I did this following Fig 11.4. But there there don't enable all events, >>> so it was a good idea to be on a clean slate by disabling all events first >>> and then only enabling selected events. >>> >>> In any case I think it is good practice. i.e. clear before OR operation? >>> FYI. dwc3_otg_enable_events doesn't clear the events that are not enabled so >>> if some old event not part of enable_mask was enabled it will stay enabled. >> >> can't this result in IRQ triggering forever and us not handling it? ;-) > > Why should enabling events trigger IRQ? IRQ will trigger only when the > event actually happens no? heh, right :-) What I mean is that you might enable an interrupt event which you don't clear, because you don't support it or don't handle it, or whatever. Reserved bits might become non-reserved in the future and so on. >>>>> + /* start the xHCI host driver */ >>>>> + if (!skip) { >>>> >>>> when would skip be true? >>>> >>> >>> only during system resume. >> >> hmmm, is there a reason for that? I mean, could we live without it for >> the time being? Seems like all this achieves is avoiding reenumeration >> of some devices during resume. Do we care from a starting >> implementation? > > At least on AM43x, it was required. without that USB devices plugged in > before a system suspend were lost after resume. > > I agree on dropping this for now and adding it later. looks like we have another problem which needs to be investigated ;-) >>>>> + dwc3_writel(dwc->regs, DWC3_GCTL, reg); >>>>> + >>>>> + /* start the Peripheral driver */ >>>>> + if (dwc->gadget_driver) { >>>>> + __dwc3_gadget_start(dwc); >>>>> + if (dwc->gadget_pullup) >>>>> + dwc3_gadget_run_stop(dwc, true, false); >>>> >>>> why don't you add/remove the UDC just like you do for the HCD? (you >>>> wouldn't add/remove a device, but rather call >>>> usb_del_gadget_udc()/usb_add_gadget_udc() directly. Would that clean up >>>> some of this? >>> >>> It causes more problems than solving anything. >>> e.g. An already loaded gadget driver will have to be manually removed and >>> re-loaded to >>> work after a peripheral to host to peripheral mode switch. >> >> is that really still true? When we remove the UDC, the currently-loaded >> gadget driver will be moved to the pending list. Once a UDC is added >> back, udc-core will bind it again to the UDC. >> > > OK. I need to test this again. As you say, the issue might already have been > fixed. > Good to know that. okay >>>>> + irq_set_status_flags(dwc->otg_irq, IRQ_NOAUTOEN); >>>> >>>> why? >>> >>> I don't know how to fix this. I have to do this because dwc3_omap is doing >>> it >>> on the shared IRQ line and the flags must match if we need to share it. >> >> hmmm... Then why does dwc_omap IRQ have IRQ_NOAUTOEN and otg_irq doesn't? > > We're setting IRQ_NOAUTOEN for otg_irq above. > But the problem is that other platforms might not have this set so it will > break there. exactly :-) > Need to think of a better way how to tackle this. okay -- balbi