Hal Murray <hmur...@megapathdsl.net>:
> g...@rellim.com said:
> > Makes no sense to me. Adding randomness helps when you have hysteresis,
> > stiction, friction, lash and some other things, but none of those apply
> > to NTP.
>
> The NTP case is roughly stiction. Remember the age of this code. It was
> working long before CPUs had instructions to read a cycle counter. Back
> then, the system clock was updated on the scheduler interrupt. There was
> no interpolation between ticks.
*blink* I think I just achieved enlightenment. Gary, Hal, please review the
following carefully to ensure that I haven't updated my beliefs wrongly.

Stiction in this context = "adjacent clock reads could get back the same
value", is that right? Suddenly a whole bunch of things, like the
implications of only updating the clock on a scheduler interrupt, make
sense.

And now I think I get (a) why Mills fuzzed the clock, and (b) why the code
is so careful about checking for clock stepback. If your working assumption
is that the clock will only update on a scheduler tick, and your PLL
correction requires you to have a monotonically increasing clock, stiction
is *bad*. You have no choice but to fuzz the clock, and the
probabilistically least risky way to do it is by around half the tick
interval. But because random is random, when you have to read twice between
ticks you cannot guarantee that your second pseudosample will be greater
than your first. You need what the code calls a "Lamport violation" check
to throw out bad pseudosamples.

Therefore I *deduce* that the PLL correction (the one NTP does, not the
in-kernel one Hal tells us is associated with PPS) requires a monotonically
increasing clock. It's the simplest explanation for the way libntp/systime.c
works, and it explains *everything* that has puzzled me about that code.

I love this project - it makes me learn new things.

> Mark/Eric: Can you guarantee that we will never run on a system with a
> crappy clock? In this context, crappy means one that takes big steps.

OK, now that I think I understand this issue I'm going to say "Yes, we can
assume this".

All x86 machines back to the Pentium (1993) have a hardware cycle counter;
it's called the TSC. As an interesting detail, this was a 64-bit register
even when the primary word size was 32 bits. All ARM processors back to the
ARM6 (1992) have one as well.
A little web searching finds clear indications of cycle counters on the
UltraSPARC (SPARC V9), Alpha, MIPS, PowerPC, IA64 and PA-RISC.

I also hunted for information on dedicated smartphone processors. I found
clear indication of a cycle counter on the Qualcomm Snapdragon and clouded
ones for Apple A-series processors. The Nvidia Tegra, MediaTek, HiSilicon
and Samsung Exynos chips are all recent ARM variants and can therefore be
assumed to have the ARM cycle-count register.

Reading between the lines, it looks to me like this hardware feature became
ubiquitous in the early 1990s and that one of the drivers was
hardware-assisted crypto. It is therefore *highly* unlikely to be omitted
from any new design, even in low-power embedded. And if you have a TSC,
sampling it is a trivial handful of assembler instructions.

> I think that all Gary's test proved is that his system doesn't have a
> crappy clock.

Yes. Agreed.

> If we are serious about getting rid of that code, I'll put investigating
> that area higher on my list. I think we have more important things to do.

I think I can take it from here.
--
Eric S. Raymond <http://www.catb.org/~esr/>
_______________________________________________
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel