Ingo Molnar wrote:
> * George Spelvin <[email protected]> wrote:
> As a side note: so VMs often want to skip the whole calibration business,
> because they are running on a well-calibrated host.
> 1,000 msecs is also an eternity: consider for example the KVM + tools/kvm
> based "Clear Containers" feature from Arjan:
> ... which boots up a generic Linux kernel to generic Linux user-space in 32
> milliseconds, i.e. it boots in 0.03 seconds (!).

Agreed, if you're paravirtualized, you can just pass this stuff in from
the host. But there's plenty of hardware virtualization that boots
a generic Linux.

I pulled generous numbers out of my ass because I didn't want to
over-reach in the argument that it's taking too long. The shorter the
boot time, the stronger the point.

>> With a total of 0.84 us of read uncertainty (1/12 of quick_pit_calibrate
>> currently), we can get within 500 ppm within 1.75 us. Or do better
>> within 5 or 10.

> (msec you mean I suspect?)

Yes, typo; that should be 1.75 ms.

>> The loop I'd write would start the PIC (and the RTC, if we want to)
>> and then go round-robin reading all the time sources and associated
>> TSC values.

> I'd just start with the PIT to have as few balls in flight as possible.

Once I get the loop structured properly, additional timers really aren't
a problem.

The biggest PITA is the PM_TMR and all its brokenness (do I have a PIIX
machine in the closet somewhere?), but the quick_pit_calibrate patch I
already posted to LKML shows how to handle that. I set up a small
circular buffer of captured values, and when I'm (say) three captures
past the "interesting" one, go back and see if the reads look good.

> Could you please structure it the following way:
>
> - first a patch that fixes bogus comments about the current code. It has
>   bitrotten and if we change it significantly we better have a well
>   documented starting point that is easier to compare against.
>
> - then a patch that introduces your more accurate calibration method and
>   uses it as the first method to calibrate. If it fails (and it should have a
>   notion of failing) then it should fall back to the other two methods.
>
> - possibly add a boot option to skip your new calibration method -
>   i.e. to make the kernel behave in the old way. This would be useful
>   for tracking down any regressions in this.
>
> - then maybe add a patch for the RTC method, but as a .config driven opt-in
>   initially.

Sounds good, but when do we get to the decruftification? I'd prefer to
prepare the final patch (if nothing else, so Linus will be reassured by
the diffstat), although I can see holding it back for a few releases.

> Please also add calibration tracing code (.config driven and default-off),
> so that the statistical properties of calibration can be debugged and
> validated without patching the kernel.

Definitely desired, but I have to be careful here. Obviously I can't
print during the timing loop, so it will take either a lot of memory,
or add significant computation to the loop.

I also don't want to flood the kernel log before syslog is started.
Do you have any specific suggestions? Should I just capture everything
into a permanently-allocated buffer and export it via debugfs?

>> I realize this is a far bigger overhaul than Adrian proposed, but do other
>> people agree that some decruftification is warranted?

> Absolutely!

Thanks for the encouragement!

>> Any suggestions for a reasonable time/quality tradeoff? 500 ppm ASAP?
>> Best I can do in 10 ms? Wait until the PIT is 500 ppm and then use
>> the better result from a higher-resolution timer if available?

> So I'd suggest a minimum polling interval (at least 1 msecs?) plus a
> ppm target. Would 100ppm be too aggressive?

How about 122 ppm (1/8192) because I'm lazy? :-)

What I imagine is this:

- The code will loop until it reaches 122 ppm or 55 ms, whichever comes
  first. (There's also a minimum, before which 122 ppm isn't checked.)
- Initially, failure to reach 122 ppm will print a message and fall back.
- In the final cleanup patch, I'll accept anything up to 500 ppm and
  only fail (and disable TSC) if I can't reach that.
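
For illustration only, the stopping rule above could be sketched roughly
as follows. This is not code from any posted patch. It assumes the total
read uncertainty is known in nanoseconds and is the dominant error, so
the relative error is about uncertainty / elapsed. The names
(calib_check, CALIB_MIN_NS and so on) are hypothetical, and the 1 ms
minimum is borrowed from Ingo's suggestion rather than a settled value.
A 1/8192 target presumably appeals because it can be tested with a shift
instead of a division.

```c
#include <stdint.h>

/* Hypothetical names and values; only 55 ms and 1/8192 come from the thread. */
#define CALIB_MIN_NS     1000000ULL   /* assumed 1 ms minimum interval */
#define CALIB_MAX_NS    55000000ULL   /* give up after 55 ms */
#define CALIB_PPM_SHIFT 13            /* 1/8192 is about 122 ppm */

enum calib_state { CALIB_CONTINUE, CALIB_DONE, CALIB_TIMEOUT };

/*
 * Decide whether the calibration loop may stop.
 * elapsed_ns:     time since the first usable capture
 * uncertainty_ns: total read uncertainty (start plus end captures)
 */
static enum calib_state calib_check(uint64_t elapsed_ns,
                                    uint64_t uncertainty_ns)
{
	/* Precision is only tested once the minimum interval has passed. */
	if (elapsed_ns >= CALIB_MIN_NS &&
	    (uncertainty_ns << CALIB_PPM_SHIFT) <= elapsed_ns)
		return CALIB_DONE;

	/* Out of time: the caller prints a message and falls back. */
	if (elapsed_ns >= CALIB_MAX_NS)
		return CALIB_TIMEOUT;

	return CALIB_CONTINUE;
}
```

With 0.84 us of read uncertainty, this rule is satisfied after about
0.84 us * 8192 = 6.9 ms, and a 500 ppm target after about
0.84 us / 0.0005 = 1.7 ms, which matches the figures quoted earlier.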

