We seem to be getting closer to agreement... On Sat, 19 Feb 2005, David Brownell wrote:
> On Friday 18 February 2005 8:40 am, Alan Stern wrote: > > > > What if CONFIG_USB_SUSPEND isn't set? Yes, I know we're trying to move > > away from this possibility. But for the moment we still need to handle it > > correctly. > > Let me turn that around: how could just _removing_ states from the system > make those full-fledged USB_SUSPEND rules stop applying to the HCD? > > Remember: when USB_SUSPEND isn't set, that just means there's no (selective) > usb suspend. So HCDs would suspend only during system sleep state > transitions. > (Internal low power states don't count as "suspend", and by definition they > shouldn't be visible to usbcore ...) This is a very minor point as far as this thread is concerned. Let's ignore it for now. Maybe if we concentrate on getting everything to work right with CONFIG_USB_SUSPEND set, we'll find nobody needs to leave it unset! > > > And (e) is off the edge of the picture ... there's no active HC, so it > > > has no state known to usbcore. If usbcore has a handle on such a > > > data structure, that's a BUG(). > > > > Maybe there's no state, but there is a data structure. It hangs around > > until the refcount drops to 0, which might not be for some time after the > > bus is deregistered. The existence of pointers to such things isn't a > > BUG, and so long as the pointers exist they can be dereferenced. > > OK, I mis"spoke". If usbcore tries to _use_ such a pointer it's a bug. > Before the HCD deregisters its bus, it had to get rid of all internal > references ... which involved telling usbcore to kick all drivers off > all devices left on the bus. Clearly, the only possible place that > a pointer to the bus could still linger is in a misbehaving driver. > So as I said: If usbcore has such a pointer, it's a BUG(). > > If you want to extend the scope of what you were saying and include > drivers above usbcore, then I'd amend what I said slightly: such > misbehavior from drivers doesn't involve a BUG(), but a KERN_WARNING > message might be appropriate for all operations other than dropping > reference counts. That'd not be a case of usbcore keeping a handle > on such a bus though! It'd be a driver keeping a device pointer after > it returned from disconnect(), and trying to submit an URB through > that bus ... not a mortal sin, but a sin nonetheless. Basically I agree. The pointers will still exist, but it would be a bug if usbcore or a driver used one for anything. After all, drivers aren't supposed to do anything with a device once their disconnect routine has returned. > > > At one point, it was true that all the tests in usbcore could handle > > > the transitions locklessly, because of the nature of the transitions > > > and the fact that the HCDs had explicit tests on the submit path. > > > > Maybe we need to adopt that kind of strategy again. > > If we're not there already ... I don't have time to audit the code > lately. (Though I did just notice that UHCI is missing that test...) UHCI is still missing a lot of things... It's the next big project I want to address, but first I want to clarify (in my own mind if nowhere else) the interactions it has with the glue layer. The two main elements concerned are hcd->state and root hub polling (more on that below). However, if we make sure HCDs use their private spinlocks in the get_frame_number call, we should then be pretty close to having all the pathways protected at the HCD level. Which means usbcore wouldn't have to do its own protection and so could eliminate most uses of hcd->state. Ideally I'd like to see hcd->state removed entirely from usbcore, maybe even from hcd-pci.c. This may not be practical but I bet we can come close. > I think I'm still thinking that this definition of "root hub suspended" > is troublesome. If it's getting any kind of URB, it's not suspended in > the sense that any other USB device would be suspended. That would seem > to be the root of the confusion. Yes. There's more confusion to be found when CONFIG_USB_SUSPEND isn't set, in which case we can have a situation where the root hub is suspended but not "usbcore-suspended", using the terminology we settled on earlier. Things get a lot simpler if we assume CONFIG_USB_SUSPEND is always set. Making that assumption, there's no need for hcd_submit_urb to test hcd->state. Any problem would already be detected higher up. Or if you prefer, the test could be changed into an assertion. > That, and the fact that the root hub timer (possibly needed _only_ for > UHCI at this point) is serving a dual role in that case: both detecting > port change events, and initiating some sort of wakeup. If we could stop > having usbcore muck around in those internal HCD state transitions, things > would probably get a lot simpler. The dual-role aspect won't turn out to be so bad, I think. By the way, I wanted to ask: Do you know which of the HCDs other than UHCI need root hub polling? Can they all rely on interrupts for detecting port status changes? > > Furthermore, what good does it do for the HCD to hold a spinlock while > > setting hcd->state if usbcore doesn't hold the same spinlock while testing > > it? I'm concerned about making usbcore work right -- my assumption is > > that the HCDs are already in good shape. > > I think the HCDs are mostly in good shape, but remember that these > issues are all along the interface between HCDs and usbcore ... and > that what we're discussing here is changes to that interface. > > The good it does for the HCD to hold a spinlock is in conjunction > with that policy I sketched before: the state transitions being > set up so that usbcore can do this locklessly. Should anything > come down from usbcore before the HCD is marked as dead, that'd > just need cleanup. But given the lock, the HCD will know to reject > new requests from other contexts, starting right then; that places > a ceiling on the cleanup that'll be needed. > > That should sound fine except for a certain gaping hole, about > the shape of a "but usbcore doesn't clean them up yet". ;) I agree. The cleanup part can be taken care of. And you seem to be confirming my point that since the HCDs will do all the necessary tests while holding the appropriate locks, usbcore doesn't need to test hcd->state. > That said, I agree that the "gone to heaven" transition after "died" > needs work. Right now it doesn't really work; things stay in limbo. Two features need to be added to the hub driver: a facility for cleaning up after root hubs that have died (remove all the children, deactivate the root hub, and never reactivate it), and a facility for resuming root hubs when needed. With those two things in place we should be okay. > > Well, if CONFIG_USB_SUSPEND isn't set then we certainly _can_ have HCs and > > root hubs suspended without any of their children being suspended. > > So how do we move more quickly to having USB_SUSPEND always be available > when PM is configured? The best way will be to clean up the HCDs (especially UHCI!), the glue layer, and their interaction. I think the last two should be done first, and that's what I'm trying to accomplish here. Adding suspend/resume support to the individual USB device drivers should turn out not to be very hard. Or at least, not so hard that people will need to turn off CONFIG_USB_SUSPEND. > Not that I think we should remove it, but I don't think get_frame_number > gets used outside of debugging ... Linux isn't being particularly paranoid > about synchronizing isochronous transfers with external clocks. (Yet!) > > And in any case, I think usb_get_current_frame_number() should be > checking the state of the device it's passed. The HCD glue checks > the HC state, so it should be mostly OK except for a small SMP race. > I agree there's something to fix in this area though. Simply have the HCD check the state while holding its private lock. That solves the race and means the glue layer doesn't need to check anything. Or again, you could change the usbcore test into an assertion -- but it hardly seems worthwhile since get_frame_number is used so little. > In general I'd be happier to have usbcore out of the business of > adding a stack frame to IRQ handlers; but meanwhile, there's still > the issue of how to make it clean up after dead HCs. That scheme outlined above for the hub driver should take care of the cleanup part. The other side of the coin is initiating the cleanup; there's no reason the HCDs can't do that at the time they realize the HC has died. Then the glue layer's IRQ handler wouldn't have to worry about it. You could probably remove that handler entirely. > > It isn't like the "Unlink after no-IRQ" warning, but it is like #7. > > So now that we're both clear on what's meant ... are you proposing > to remove these integrity assertions? I'm proposing that we try to find a way to do them without using hcd->state. Maybe it's not possible. But if we can get things down to the point where hcd->state is used for nothing but a few assertions, I'll be a lot happier. > Re submitting URBs -- HCDs should be preventing that already, I'd call it > a bug in the HCD if it isn't. Absolutely. So usbcore shouldn't need to make the test. > Re timer polling -- yes, that's a mess, and I seem to recall you were > looking at a way to let HCDs (like OHCI and EHCI) say they don't need > a root hub timer. (Hmm, maybe just add a hc_driver.flags bit to say if > the HCD wants the timer, and some way for HCDs with IRQ-driven root hubs > to tell khubd to check their root hub. It might be more efficient to > have me take a whack at that first, letting you focus on any issues > that turns up with UHCI.) I already wrote a patch for it a couple of months ago. It wasn't complete and it needs to be updated. Here are my current ideas: Add a new spinlock to hcd.c, for locking root hub URBs. Kind of like an HCD's private spinlock, but protecting all root hubs instead of all devices on a particular bus. Not really necessary, but it's cleaner than using hcd_data_lock for that purpose as well as protecting the URB lists. Add to struct usb_hcd: unsigned poll_rh:1; struct urb *status_urb; The timer routine does nothing but test hcd->poll_rh and call usb_hcd_poll_rh_status if the flag is set. HCDs can also call usb_hcd_poll_rh_status directly when their IRQ handler detects a port status change has occurred. usb_hcd_poll_rh_status calls the HCD's status routine, even if no status URB is queued. (If one is queued and the routine returns a positive result, the URB is completed.) Then if hcd->poll_rh is set, the timer is re-enabled for 250 ms later. (As an unrelated side note, when the HCD's status routine sees a port status change and the root hub is suspended, it should then be able to ask the hub driver to resume the root hub.) The timer is initially set when registering the root hub, if poll_rh is on. An HCD can simply clear the flag in its start routine to prevent polling. And it can synchronously delete the timer to prevent polling while the HC is suspended. (It can also clear poll_rh to prevent polling at other times, if it doesn't mind possibly getting one extraneous poll.) The only difficult part of this is knowing which HCDs need polling, and getting them to disable it while they are suspended. (Although since most of the non-PCI HCDs don't yet implement suspend/resume, maybe that's not so hard after all!) > Preventing get_frame_number() ... right now there's a small race that > should appear only on SMP, and that API is hardly ever used, but this > spot could stand some more work. My preferred fix would be making > USB_SUSPEND be the only mode for usbcore when PM is enabled. Making the HCDs use their private lock and check the state will be sufficient for get_frame_number. Alan Stern ------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel