On 2013-12-05, mike cook <michael.c...@sfr.fr> wrote:
>>
>> The problem for ntp is that ntp takes a long time to recover from a bad
>> drift value.
>
> This seems to have been an issue since I started using ntp, more than 10
> years ago. I am surprised that it is not fixed.
Because David Mills has no interest at all in fixing it. He has a model for
the operation of ntpd, and that model does not include rapid correction of
errors. It is designed for long term stability first and foremost.

> A simple test on linux with a modern version of ntp: Here is the normal
> state of this R-PI
>
> Thu Dec 5 09:34:13 CET 2013
> mike@raspberrypi ~ $ sudo cat /var/lib/ntp/ntp.drift
> -36.772
> mike@raspberrypi ~ $ ls -l /var/lib/ntp/ntp.drift
> -rw-r--r-- 1 root root 8 Dec 5 08:51 /var/lib/ntp/ntp.drift
> mike@raspberrypi ~ $ ntpq -pn |grep \*
> mintc=3, offset=-0.169517, frequency=-37.191, sys_jitter=0.333279,
>
> offset with this server is fairly stable at 1-300 microseconds, sometimes
> better.
>
> So now stop ntpd , stick a silly value in the drift file and restart.
>
> root@raspberrypi:/home/mike# echo "-256.666" > /var/lib/ntp/ntp.drift
> root@raspberrypi:/home/mike# cat /var/lib/ntp/ntp.drift
> -256.666
> root@raspberrypi:/home/mike# /etc/init.d/ntp start
> Starting NTP server: ntpd.
> root@raspberrypi:/home/mike# ntpq -c rv
> associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
> version="ntpd 4.2.7p319@1.2483 Tue May 28 11:26:22 UTC 2013 (2)",
> processor="armv6l", system="Linux/3.2.27-pps", leap=00, stratum=2,
> precision=-19, rootdelay=14.258, rootdisp=202.121, refid=145.238.203.14,
> reftime=d64aba43.dfdbb690 Thu, Dec 5 2013 9:39:31.874,
> clock=d64aba54.494fd9c7 Thu, Dec 5 2013 9:39:48.286, peer=2675, tc=6,
> mintc=3, offset=3.357234, frequency=-256.666, sys_jitter=1.622350,
> clk_jitter=2.342, clk_wander=0.000
>
> So we have picked up the drift and are using it as is, no verification.
>
> root@raspberrypi:/home/mike# ntpq -pn
> ...
> *145.238.203.14 .TS-3. 1 u 44 64 1 14.258 3.357 1.622
> ...
> So iburst got us a reasonable start point.
> Now let's see how it evolves:
>
> root@raspberrypi:/home/mike# while true; do date; ntpq -pn |grep \*;ntpq -c rv |grep frequency; ls -l /var/lib/ntp/ntp.drift;cat /var/lib/ntp/ntp.drift; sleep 60; done
> Thu Dec 5 09:46:00 CET 2013
> *145.238.203.14 .TS-3. 1 u 62 64 77 14.258 3.357 11.413
> mintc=3, offset=16.974005, frequency=-256.666, sys_jitter=11.412631,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
>
> three samples later,
>
> Thu Dec 5 09:49:01 CET 2013
> *145.238.203.14 .TS-3. 1 u 39 64 377 14.270 13.392 25.459   the offset multiplies by three
> mintc=3, offset=16.974005, frequency=-256.666, sys_jitter=25.458613,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
> Thu Dec 5 09:50:02 CET 2013
> *145.238.203.14 .TS-3. 1 u 32 64 377 14.272 64.913 38.415   then more than 20 times
> mintc=3, offset=64.912586, frequency=-224.970, sys_jitter=38.415064,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
> Thu Dec 5 09:51:02 CET 2013
> *145.238.203.14 .TS-3. 1 u 25 64 377 14.272 64.913 34.083
> mintc=3, offset=64.912586, frequency=-224.970, sys_jitter=34.083058,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
> Thu Dec 5 09:52:02 CET 2013
> *145.238.203.14 .TS-3. 1 u 19 64 377 14.242 78.513 37.945   and it gets worse - note that we still think this is a good source
> mintc=3, offset=78.512782, frequency=-214.937, sys_jitter=37.944744,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
> Thu Dec 5 09:53:03 CET 2013
> *145.238.203.14 .TS-3. 1 u 10 64 377 14.242 78.513 30.074
> mintc=3, offset=78.512782, frequency=-214.937, sys_jitter=30.073729,
> -rw-r--r-- 1 root root 9 Dec 5 09:38 /var/lib/ntp/ntp.drift
> -256.666
>
> Our worst state is at 10:03, 30 minutes after the start up. The real time
> frequency value is decreasing but not reflected to the file.
> This is an issue as an admin blindly restarting ntp after noticing crappy
> offsets will hit the same wall again.
>
> The file gets updated after 1Hr, at
>
> Thu Dec 5 10:39:21 CET 2013
> *145.238.203.14 .TS-3. 1 u 25 64 377 12.834 37.836 8.350
> mintc=3, offset=37.835862, frequency=-88.963, sys_jitter=8.349705,
> -rw-r--r-- 1 root root 8 Dec 5 10:39 /var/lib/ntp/ntp.drift
> -88.963
>
> The rate of convergence is getting quicker but we don't get back to a good
> state until nearly 3Hrs:
>
> Thu Dec 5 12:20:02 CET 2013
> *145.238.203.14 .TS-3. 1 u 28 64 377 12.979 0.287 0.693
> mintc=3, offset=0.287015, frequency=-38.180, sys_jitter=0.693195,
> -rw-r--r-- 1 root root 8 Dec 5 11:39 /var/lib/ntp/ntp.drift
> -41.923
>
> And the "normal" drift is reached around 4hrs after the restart.
>
> Thu Dec 5 13:30:30 CET 2013
> *145.238.203.14 .TS-3. 1 u 60 64 377 12.882 0.134 0.058
> mintc=3, offset=0.134499, frequency=-37.307, sys_jitter=0.057602,
> -rw-r--r-- 1 root root 8 Dec 5 12:39 /var/lib/ntp/ntp.drift
> -37.928
>
> I am sure that a much faster convergence could be achieved with a little
> thought, even if it meant a little ringing.

Yes, chrony does it. But it uses a very different philosophy from ntpd.
David has said time and again that he is completely uninterested in fixing
the "issue" of ntpd's slow convergence, but also that he retains the right
to decide how ntpd should behave. The problem is not the ringing (that would
make it worse: the most rapid convergence for a simple feedback loop is at
critical damping, which ntpd roughly tries to achieve). But because it has
no memory, such a circuit is limited in what it can do. Chrony has a memory.
It uses the last 3-64 measurements to estimate what the correct time is, and
then tries to get there quickly and stably. All ntpd knows is the current
offset, and it has no idea whether a non-zero value is there because the
local clock is off or because noise has made the remote measurement bad.
It cannot examine the past history to try to decide between the two
possibilities, and thus must tread carefully, because it is stupid to
rapidly chase errors. And remember that the only tool ntpd has is to change
the rate. Now, you or I might well look at, say, the last 5 offsets, see the
trend and the noise in that trend, and say: Hey, the rate of my clock is way
off. But that is not what ntpd does.
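To make the "look at the last 5 offsets" idea concrete, here is a rough sketch of what a human (or a program with memory) might do: fit a straight line through the recent offsets and compare the drift along that line with the scatter around it. This is not ntpd's algorithm and not chrony's actual code; the function names, the factor of 3, and the sample numbers are all made up for illustration.

```python
from statistics import mean


def fit_trend(times_s, offsets_ms):
    """Least-squares line through (time, offset) samples.

    Returns (slope_ppm, rms_ms): the trend of the offset expressed in ppm
    (1 ms/s == 1000 ppm) and the RMS scatter of the samples around the line.
    """
    if len(times_s) != len(offsets_ms) or len(times_s) < 3:
        raise ValueError("need at least three paired samples")
    t_bar, y_bar = mean(times_s), mean(offsets_ms)
    sxx = sum((t - t_bar) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("samples must span a non-zero time interval")
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_s, offsets_ms))
    slope = sxy / sxx                      # ms of offset per second
    intercept = y_bar - slope * t_bar
    residuals = [y - (intercept + slope * t)
                 for t, y in zip(times_s, offsets_ms)]
    rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return slope * 1000.0, rms


def looks_like_rate_error(times_s, offsets_ms, ratio=3.0):
    """Hypothetical test: is the drift across the window clearly above the noise?

    The ratio of 3 is an arbitrary illustrative threshold, not a tuned value.
    """
    slope_ppm, rms = fit_trend(times_s, offsets_ms)
    span = max(times_s) - min(times_s)
    drift_ms = abs(slope_ppm / 1000.0) * span
    return drift_ms > ratio * rms


if __name__ == "__main__":
    # Invented samples, one per 64 s poll: a steady trend plus a little noise.
    times = [0, 64, 128, 192, 256]
    offsets = [3.1, 15.6, 28.9, 41.2, 54.3]
    slope_ppm, rms = fit_trend(times, offsets)
    print(f"trend {slope_ppm:.1f} ppm, scatter {rms:.2f} ms")
    print("rate is way off" if looks_like_rate_error(times, offsets)
          else "could just be noise")
```

With five samples showing a steady trend that dwarfs the scatter, the conclusion "the rate is off" is easy to draw. With only the single current offset, which is all a memoryless loop has, there is nothing to compare against.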