David Brownell wrote: > On Wednesday 07 March 2007 1:10 pm, [EMAIL PROTECTED] wrote: >> >> OK, I've made an attempt to solve this issue without using such a >> large hammer. Instead of shutting off the entire periodic schedule >> during cpufreq changes, this patch will just inactivate (using the >> "I" >> bit in >> qh->hw_info1, currently unused in the EHCI driver) any queue heads >> qh->that >> are to full/low speed interrupt endpoints, during the cpufreq >> changes. > > Makes sense ... although I don't trust your safe_to_inactivate() > logic, that's inherently racey. You could check to see if the > transaction has started, and decide it didn't (and thus set the I > bit) just when the controller starts that transaction. The race > seems unavoidable > > I had way too much ... "fun" isn't quite the word ... chasing issues > like that with fault handling in the async ring. Such issues might > be less common with the periodic schedule; maybe. When the driver > loses that race, this comes up as an MMF error right? >
I think I am avoiding a race with the controller here, by not checking QTD_STS_STS until the second uframe after the controller should have started the split transaction. If it HAD started the split transaction, the controller should have written QTD_STS_STS back at the end of the uframe in which it started it. Originally I was checking QTD_STS_STS the uframe immediately after it should have started the transaction, but there was a tiny race there, though it usually took >12 hours running on my system to show up. I'm also not setting the "I" bit in any uframes that might already be cached, or the uframe before that (i_thresh is set to the number of cached uframes + 2). So as long as the code can set the "I" bit within one uframe after deciding that it's safe to do so, I don't think there should be a race. And, since interrupts are masked, and the CPU isn't changing frequency, I would expect any system with an EHCI controller to be able to do some minor math and set a bit within 125us. > >> I have found that the ICH7 works without this patch, because it >> caches 7 uframes of the periodic schedule, so it never has to read >> main >> memory in order to complete a split transaction in time. The >> Broadcom/Serverworks HT1000 only caches 1 uframe, so it fails >> without this patch. > > I've always wondered how that caching is supposed to work. I suspect > that you could make the ICH7 fail too, if the periodic schedule were > busy enough ... it's not like they'll allocate enough silicon to > cache every possible QH in SRAM inside that controller! That could > mean a LOT of memory for caching. Right? > > Not to poke too much at all your good work here, but how would ICH7 > fail in that case, given the logic in this patch? > Well, right now it would fail just like 2.6.20, because the patch won't even register the cpufreq notifier if the EHCI controller says it caches 7 or more uframes! :) That is, it would get an MMF error and return -EPROTO. But, if we did register the cpufreq notifier even for the ICH7, I would expect it not to fail regardless of whether it was caching or not, since any split interrupt transactions would be inactivated before any CPU frequency changes. When the controller is caching 7 uframes, there's a smaller window during which safe_to_inactivate() will return 1, but there is still a window, unless you have a device with a polling period of 1 frame (in which case the code will just leave the transaction active, and we'll probably still get an MMF error). > >> I have tested this code on both the HT1000 and ICH7 (I tested the >> patch on the ICH7 with the cpufreq notifier registered, too, even >> though the patch won't even register it, just to make sure it would >> work correctly). > > How heavily were your periodic schedules loaded? The > /sys/class/usb_host debug file labeled "periodic" should provide a > textual display. It would be good to test both hardware configs with > a "lot" of USB mice hooked up. > > Not that the current EHCI periodic scheduler handles very many, but > ISTR that three full/low speed devices per TT was realistic. And > high speed hubs with multiple TTs are easy enough to find, given that > Cypress chip. At Dell, I expect you have one or two extra USB mice > and keyboards lying around that could be used to try abusing your > patch. :) > Yes, I'll load these systems up with lots of USB keyboards and mice, and re-run. I was able to see failures pretty often with no patch, or with a patch that wasn't quite working, though, with the configs I have--I've been testing with 3-4 keyboards/mice behind hub(s). > >> This patch adds more code, but it avoids a couple of the issues that >> were brought up with the previous patch: >> >> 1--This patch doesn't mess with isochronous transfers (or high speed >> interrupt transfers) at all. > > That seems to be a good resolution in both cases. ISO transfers are > "best effort" in any case, and you hadn't identified issues except > with full/low speed interrupt transfers. > > >> 2--The notifier will take very little time unless the cpufreq change >> happens while a split transaction is in progress (or is already >> cached >> in the controller). > > Nice! > > >> I can repost it with a more concise summarization of the problem/fix >> and a signed-off-by if it looks ok. > > It basically looks OK. Please start adopting the "func(parms)" style > in that code, vs "func (parms)" ... IMO it's less readable, but it is > what Documentation/CodingStyle says we now prefer. Also, don't > forget to have a brief comment in the source code saying what the > cpufreq notifier is trying to avoid. (ISTR: MMF failures caused by > DMA not being able to access memory while it's being reclocked.) > > - Dave I was wondering about that, but used "func (parms)" because I was trying to match the rest of the code in the ehci driver. I can change that. ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel