Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-22 03:41:05 -0800, Jeremy Chadwick free...@jdc.parodius.com wrote:
> ntpd under normal operation (not +/- 500ppm) figure out on its own
> the average amount of drift, which is what ntpd.drift is for, correct?

Yes. It takes a long time for ntpd to characterise the local system
clock. Once it has done so, it stores the calculated drift in
ntpd.drift and updates it every hour or so. This means that when ntpd
is restarted, it can immediately set its PLL to a reasonably close
value, rather than starting from scratch.

--
Peter Jeremy
Re: ntpd struggling to keep up - how to fix?
On Mon, 22 Feb 2010 07:17:42 +1100 Peter Jeremy peterjer...@acm.org wrote:
> On 2010-Feb-21 17:36:19 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> Over time (probably a couple of days from scratch), the poll rate
> should increase to 1024. If it doesn't, it may indicate that your

Like so:

r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 192.121.13.58    3 u  564 1024  377    0.163    3.018   3.196

--
Torfinn
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: ntpd struggling to keep up - how to fix?
Peter Jeremy peterjer...@acm.org wrote:
> ... Once ntpd decides to continuously step, something is broken.

Is there some reason why, as long as it is not yet synced, ntpd should
not do this sort of calculation and rate correction itself rather than
insist on having a human perform the calculation and enter the
adjustment?
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-22 01:02:54 -0800, per...@pluto.rain.com wrote:
> Peter Jeremy peterjer...@acm.org wrote:
> > ... Once ntpd decides to continuously step, something is broken.
>
> Is there some reason why, as long as it is not yet synced, ntpd
> should not do this sort of calculation and rate correction itself
> rather than insist on having a human perform the calculation and
> enter the adjustment?

ntpd _does_ do this sort of calculation but the NTP algorithms bound
the PLL adjustment to +/-500ppm. RFC1305 suggests that a reasonable
tolerance for board-mounted, uncompensated quartz-crystal oscillators
is 100ppm and therefore the +/-500ppm bound is reasonable (see the RFC
for the gory maths).

In this case, the OP's clock was ~2500ppm slow - well outside the NTP
tolerance. It was therefore necessary to change the nominal
timecounter frequency to bring it into lock range. I do not believe it
is reasonable for ntpd to do this by itself:
- It should very rarely be needed since NTP should be able to
  compensate for normal tolerances.
- The actual local clock source and how to alter the kernel's idea of
  its nominal frequency are outside the purview of NTP.
- Giving ntpd free rein over the timecounter frequency runs the real
  risk of ntpd rendering the system unusable if ntpd becomes confused
  (or is misled) about the time.

Note that FreeBSD/i386 and /amd64 include 4 different possible
timecounters, only 3 of which can be tweaked. Other FreeBSD
architectures will have different timecounters. Other OSs may have
completely different mechanisms for handling the local clock source.
Trying to embed knowledge of all these different clock sources into
ntpd would be unrealistic.

I look after over 100 assorted Unix hosts at home and work (HP
AlphaServers and ProLiants, various Sun servers, Dell and whitebox PCs
and various laptops) and the worst drift rates I have seen previously
are:
- Sun T-2000 servers have a design flaw in the clock spectrum
  spreading so the clock appears to be ~250ppm fast. Sun fixed this
  with a kernel patch that increases the nominal clock frequency.
- A Sun V20z is just over 100ppm out - I have tweaked the relevant
  timecounter to compensate for this (to avoid triggering my NTP
  frequency error alarms).
- 4 assorted Sun hosts that run 55-60ppm out.

At least based on my sample, the only hosts that were anywhere near
ntpd's tolerance limits were acknowledged to have a design problem and
the vendor provided a fix. IMO, this is a better approach than trying
to make ntpd omniscient.

--
Peter Jeremy
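The +/-500ppm bound discussed above can be illustrated with a few lines of Python. This is only a sketch: the function name and constant are made up for illustration (they are not part of ntpd), and the drift figures are the approximate ones quoted in the message.

```python
# Sketch only: names are hypothetical, not taken from ntpd.
NTP_MAX_SLEW_PPM = 500.0  # PLL adjustment bound quoted in the thread


def within_capture_range(drift_ppm: float, limit_ppm: float = NTP_MAX_SLEW_PPM) -> bool:
    """Return True if ntpd's PLL could compensate for this drift by slewing."""
    return abs(drift_ppm) <= limit_ppm


# Approximate drift rates mentioned in the message (sign: + fast, - slow).
observed = {
    "Sun T-2000": 250.0,
    "Sun V20z": 100.0,
    "other Sun hosts": 60.0,
    "OP's machine": -2500.0,
}

for host, ppm in observed.items():
    verdict = "ntpd can slew" if within_capture_range(ppm) else "outside lock range"
    print(f"{host:16s} {ppm:+8.1f} ppm  {verdict}")
```

Only the last entry falls outside the bound, which is why the timecounter frequency had to be changed by hand in that one case.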
Re: ntpd struggling to keep up - how to fix?
On Mon, Feb 22, 2010 at 10:18:10PM +1100, Peter Jeremy wrote:
> [...]
> - 4 assorted Sun hosts that run 55-60ppm out.
>
> At least based on my sample, the only hosts that were anywhere near
> ntpd's tolerance limits were acknowledged to have a design problem
> and the vendor provided a fix. IMO, this is a better approach than
> trying to make ntpd omniscient.

A question with regards to the latter systems you mentioned (though
I'm speaking generally and not specifically with regards to those H/W
models), as I want to make sure I understand correctly: ntpd under
normal operation (not +/- 500ppm) figures out on its own the average
amount of drift, which is what ntpd.drift is for, correct?

--
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-21 14:29:28 -0500, David Magda dma...@ee.ryerson.ca wrote:
> For future reference, how does the math work? How do you go from
> taking a timer number:
>
> $ sysctl machdep.acpi_timer_freq
> machdep.acpi_timer_freq: 3577045
>
> And the ntpd(8) time reset log entries to adjust the frequency? Or do
> you use the PPM output of the ntpdc(8) command? I'm not quite sure I
> understand what happened here. :)

I'm using a combination of the ACPI frequency, time reset logs and PLL
frequency reported by the OP:

On 2010-Feb-20 22:32:01 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> r...@kg-f2# sysctl machdep.acpi_timer_freq
> machdep.acpi_timer_freq: 3577045
> r...@kg-f2# tvlm
> Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
> Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
> Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
> Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
> Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
> Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
> Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
> Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
> Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
> Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s
...
> r...@kg-f2# ntpdc -c loopi -c sysi
> offset:               0.000000 s
> frequency:            500.000 ppm

Together with the assumptions that the system clock is stable (ie the
rate of drift is constant) and the syslog entries occurred at
precisely the times reported. If the former assumption isn't true
(which was a distinct possibility given the size of the error) then
ntpd isn't going to work. If the latter assumption is incorrect then
the calculated clock skew will be incorrect - but hopefully close
enough to bring it into ntpd's capture range to allow later tweaking.

If ntpd cannot slew the local clock sufficiently, it will step the
clock roughly every 900 seconds, hence the regular time reset
messages. Since we are assuming a stable clock, we can accumulate the
offsets in multiple reset messages to give a cumulative offset.

For the above figures, the clock drift (sum of time reset messages)
totals ~10.36 seconds over a period of 2:20:07 (the difference between
the kernel time sync and last time reset message). [Note that I
somehow mistranscribed both the offset and duration in my last mail -
apologies for the confusion this might have caused].

10.36s in 2:20:07 == 10.36/8407 ~= 1.233e-3 or 1233ppm. Thus ntpd is
reporting that the system clock is still 1233ppm slow, even with ntpd
pulling the system clock by its maximum of 500ppm. Adding these gives
a total clock error of 1733ppm.

The nominal clock frequency used by the timecounter is 3577045Hz. In
order to calculate the actual clock frequency, we need to subtract the
clock error (1733ppm) from this frequency:

3577045Hz * (1 - 1733e-6) = 3570846Hz

(I rounded the clock error differently previously and got 3570847Hz).

--
Peter Jeremy
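The calculation described above can be replayed in a few lines of Python. This is a sketch of the arithmetic only, under the same two assumptions stated in the message (constant drift, accurate syslog timestamps); the variable names are illustrative and the figures are copied from the quoted log.

```python
# Time steps logged by ntpd between 20:21:49 and 22:26:48, in seconds.
RESETS_S = [
    1.118880, 1.188538, 1.121903, 1.179924, 1.178490,
    1.110647, 1.177292, 1.172358, 1.114350,
]

# From "kernel time sync status change" (20:06:41) to the last reset (22:26:48).
ELAPSED_S = 2 * 3600 + 20 * 60 + 7

PLL_PPM = 500.0       # ntpd was already slewing at its maximum
NOMINAL_HZ = 3577045  # machdep.acpi_timer_freq at the time of the log

# Drift that ntpd could not slew away and had to correct by stepping.
residual_ppm = sum(RESETS_S) / ELAPSED_S * 1e6

# Total error = stepped part + the part hidden by the PLL.
total_ppm = residual_ppm + PLL_PPM

# The clock is slow, so the real oscillator frequency is below nominal.
actual_hz = NOMINAL_HZ * (1 - total_ppm * 1e-6)

print(f"cumulative step : {sum(RESETS_S):.2f} s over {ELAPSED_S} s")
print(f"residual drift  : {residual_ppm:.0f} ppm")
print(f"total drift     : {total_ppm:.0f} ppm")
print(f"suggested freq  : {actual_hz:.0f} Hz")
```

Carrying the unrounded ppm value through gives the 3570847Hz figure; rounding the error to 1733ppm first gives 3570846Hz, which is the one-count difference mentioned at the end of the message.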
Re: ntpd struggling to keep up - how to fix?
On 20/02/2010 23:42, Jeremy Chadwick wrote:
> For sake of example -- look at ntpq's delay column for each peer, and
> then look at the same column but for ntpdc. You'll see that for ntpdc
> they're divided by 1000 (presumably kern.hz rate):

No -- those are just times measured in milliseconds (for ntpq) or
seconds (for ntpdc). kern.hz doesn't come into it.

Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
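The unit relationship is easy to check against the offset columns posted elsewhere in this thread. The snippet below is a small sketch; the helper name is made up, and the value pairs are copied from the ntpq (milliseconds) and ntpdc (seconds) peer listings for the same servers.

```python
import math


def ms_to_s(milliseconds: float) -> float:
    """Convert an ntpq-style value (ms) to an ntpdc-style value (s)."""
    return milliseconds / 1000.0


# (ntpq offset in ms, ntpdc offset in s) for the same peers.
OFFSET_PAIRS = [
    (-0.852, -0.000852),  # clock-a.develoo
    (-8.110, -0.008110),  # tick.jrc.us
    (0.284, 0.000284),    # enigma.wiredgoa
]

for ntpq_ms, ntpdc_s in OFFSET_PAIRS:
    agrees = math.isclose(ms_to_s(ntpq_ms), ntpdc_s, abs_tol=1e-9)
    print(f"{ntpq_ms:+8.3f} ms -> {ms_to_s(ntpq_ms):+.6f} s  matches ntpdc: {agrees}")
```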
Re: ntpd struggling to keep up - how to fix?
On Sun, 21 Feb 2010 16:08:23 +1100 Peter Jeremy peterjer...@acm.org wrote:
> That's definitely not good - though it's marginally better than
> before. I have checked on a local machine and the timecounter
> frequency definitely needs to be adjusted in the opposite direction
> to the ntpd drift.
>
> I think I see the problem: I suggested 3579545Hz - 2500ppm, which
> gives an ACPI frequency of 3570596Hz. There was some miscommunication
> and you have set an ACPI frequency of 3577045Hz which is 2500Hz (or
> 698ppm) lower. The drift reported by the time resets has gone from
> +1930ppm (14.5s in 2:05:17) to +1233ppm (8.4s in 2:20:06) - which is
> 697ppm - fairly close to the change you made. (The PLL is running at
> +500ppm so the actual clock offset is 500ppm more than the time reset
> reports suggest.)

Very good info, it helps me understand more. Thanks!

> Having re-checked my maths, using both your time reset results, can
> you please try:
> sysctl machdep.acpi_timer_freq=3570847

Ok, trying that now:

r...@kg-f2# sysctl machdep.acpi_timer_freq=3570847
machdep.acpi_timer_freq: 3577045 -> 3570847
r...@kg-f2# /etc/rc.d/ntpd stop
Stopping ntpd.
r...@kg-f2# rm /var/db/ntpd.drift
r...@kg-f2# /etc/rc.d/ntpd start
Starting ntpd.

> That should result in a drift of close to zero (well within NTP's
> lock range of +/- 300ppm).

Good.

> No. Once ntpd decides to continuously step, something is broken.

Aha, very good to know.

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Sun, 21 Feb 2010 16:08:23 +1100 Peter Jeremy peterjer...@acm.org wrote:
> Having re-checked my maths, using both your time reset results, can
> you please try:
> sysctl machdep.acpi_timer_freq=3570847
> That should result in a drift of close to zero (well within NTP's
> lock range of +/- 300ppm).

And a few hours later, from /var/log/messages:

Feb 21 09:54:50 kg-f2 ntpd[55452]: ntpd 4.2.4p5-a (1)
Feb 21 09:59:10 kg-f2 ntpd[55453]: kernel time sync status change 2001

More info:

r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160
r...@kg-f2# ntpdc -c loopi -c sysi
offset:               -0.010253 s
frequency:            6.744 ppm
poll adjust:          -30
watchdog timer:       47 s
system peer:          kg-omni1.kg4.no
system peer mode:     client
leap indicator:       00
stratum:              4
precision:            -18
root distance:        0.02956 s
root dispersion:      0.06795 s
reference ID:         [10.1.10.1]
reference time:       cf2bdf36.f8820aef  Sun, Feb 21 2010 17:35:02.970
system flags:         auth monitor ntp kernel stats
jitter:               0.000153 s
stability:            0.000 ppm
broadcastdelay:       0.003998 s
authdelay:            0.000000 s

Problem solved. Thanks a lot.

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Feb 21, 2010, at 11:36, Torfinn Ingolfsen wrote:
> On Sun, 21 Feb 2010 16:08:23 +1100 Peter Jeremy peterjer...@acm.org wrote:
> > Having re-checked my maths, using both your time reset results, can
> > you please try:
> > sysctl machdep.acpi_timer_freq=3570847
> > That should result in a drift of close to zero (well within NTP's
> > lock range of +/- 300ppm).
>
> And a few hours later, from /var/log/messages:
> Feb 21 09:54:50 kg-f2 ntpd[55452]: ntpd 4.2.4p5-a (1)
> Feb 21 09:59:10 kg-f2 ntpd[55453]: kernel time sync status change 2001
>
> More info:
> r...@kg-f2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160
> r...@kg-f2# ntpdc -c loopi -c sysi
> offset:               -0.010253 s
> frequency:            6.744 ppm
> poll adjust:          -30
> watchdog timer:       47 s
> [...]

For future reference, how does the math work? How do you go from
taking a timer number:

$ sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045

And the ntpd(8) time reset log entries to adjust the frequency? Or do
you use the PPM output of the ntpdc(8) command?

I'm not quite sure I understand what happened here. :)
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-21 17:36:19 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> *kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160
> r...@kg-f2# ntpdc -c loopi -c sysi
> offset:               -0.010253 s
> frequency:            6.744 ppm

That looks much healthier though it doesn't explain why your system
clock is ~2500ppm out to start with. You may get one further small
(~128msec) time step as part of ntpd's PLL calibration.

Over time (probably a couple of days from scratch), the poll rate
should increase to 1024. If it doesn't, it may indicate that your
system clock stability isn't very good or you have excessive jitter in
your reference.

--
Peter Jeremy
Re: ntpd struggling to keep up - how to fix?
On Sat, 20 Feb 2010 12:53:51 +1100 Peter Jeremy peterjer...@acm.org wrote:
> Looks reasonable. Let us know the results. I'd be interested in the
> output from ntpdc -c loopi -c sysi.

Ok, here we go (the server panic'ed again last night):

r...@kg-f2# uptime
10:28PM up 2:26, 3 users, load averages: 0.00, 0.00, 0.00
r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045
r...@kg-f2# tvlm
Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 kg-omni1.kg4.no 129.240.64.3     3 u    8   64    7    0.176  133.306  77.731
r...@kg-f2# ntpdc -c loopi -c sysi
offset:               0.000000 s
frequency:            500.000 ppm
poll adjust:          4
watchdog timer:       194 s
system peer:          0.0.0.0
system peer mode:     unspec
leap indicator:       11
stratum:              16
precision:            -18
root distance:        0.00000 s
root dispersion:      0.00290 s
reference ID:         [83.84.69.80]
reference time:       00000000.00000000  Thu, Feb  7 2036  7:28:16.000
system flags:         auth monitor ntp kernel stats
jitter:               0.358109 s
stability:            0.000 ppm
broadcastdelay:       0.003998 s
authdelay:            0.000000 s

Not synced at all. Not good. :-/
Perhaps I should give it more time?

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Sat, 20 Feb 2010 22:32:01 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:

This output looks ... wrong ... somehow to my eyes:

r...@kg-f2# date
Sat Feb 20 22:51:24 CET 2010
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 129.240.64.3     3 u   62   64  377    0.244  597.314 360.123
r...@kg-f2# ntpdc -c loopi -c sysi
offset:               0.000000 s
frequency:            500.000 ppm
poll adjust:          4
watchdog timer:       549 s
system peer:          kg-omni1.kg4.no
system peer mode:     client
leap indicator:       11
stratum:              16
precision:            -18
root distance:        0.00000 s
root dispersion:      0.00822 s
reference ID:         [10.1.10.1]
reference time:       00000000.00000000  Thu, Feb  7 2036  7:28:16.000
system flags:         auth monitor ntp kernel stats
jitter:               0.360107 s
stability:            0.000 ppm
broadcastdelay:       0.003998 s
authdelay:            0.000000 s

Shouldn't ntpq and ntpdc be in agreement?

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Sat, Feb 20, 2010 at 10:55:21PM +0100, Torfinn Ingolfsen wrote:
> This output looks ... wrong ... somehow to my eyes:
>
> r...@kg-f2# date
> Sat Feb 20 22:51:24 CET 2010
> r...@kg-f2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *kg-omni1.kg4.no 129.240.64.3     3 u   62   64  377    0.244  597.314 360.123
> r...@kg-f2# ntpdc -c loopi -c sysi
> offset:               0.000000 s
> frequency:            500.000 ppm
> [...]
>
> Shouldn't ntpq and ntpdc be in agreement?

ntpq and ntpdc output data in slightly different formats, depending on
what arguments you give them. I'm not familiar with the loopi or sysi
commands; Peter should be able to help here.

For sake of example -- look at ntpq's delay column for each peer, and
then look at the same column but for ntpdc. You'll see that for ntpdc
they're divided by 1000 (presumably kern.hz rate):

$ ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+clock-a.develoo 204.123.2.72     2 u  476  512  377   25.287   -0.852   0.550
-enigma.wiredgoa 209.81.9.7       2 u  185  512  377   14.754    0.284   0.688
+mtnlion.com     139.78.135.14    2 u  208  512  377   30.788   -0.233   0.160
*ntp1.phoenixpub .LCL.            1 u  179  512  377   36.322   -0.552   0.522
-ntp-1.gw.uiuc.e 128.174.38.133   2 u  141  512  377   77.321   -5.381   0.328
-tick.jrc.us     172.21.0.14      2 u  149  512  377  112.424   -8.110   1.440

$ ntpdc -c peers
     remote           local      st poll reach  delay   offset    disp
=======================================================================
*mailserv1.phoen 192.168.1.51     1  512  377 0.03632 -0.000552 0.09666
=clock-a.develoo 192.168.1.51     2  512  377 0.02528 -0.000852 0.08611
=tick.jrc.us     192.168.1.51     2  512  377 0.11241 -0.008110 0.08615
=enigma.wiredgoa 192.168.1.51     2  512  377 0.01474  0.000284 0.11473
=mtnlion.com     192.168.1.51     2  512  377 0.03078 -0.000233 0.09665
=ntp-1.gw.uiuc.e 192.168.1.51     2  512  377 0.07732 -0.005381 0.10612

--
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-20 22:32:01 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> On Sat, 20 Feb 2010 12:53:51 +1100 Peter Jeremy peterjer...@acm.org wrote:
> > Looks reasonable. Let us know the results. I'd be interested in the
> > output from ntpdc -c loopi -c sysi.
>
> Ok, here we go (the server panic'ed again last night):
> r...@kg-f2# uptime
> 10:28PM up 2:26, 3 users, load averages: 0.00, 0.00, 0.00
> r...@kg-f2# sysctl machdep.acpi_timer_freq
> machdep.acpi_timer_freq: 3577045
> r...@kg-f2# tvlm
> Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
> Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
> Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
> Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
> Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
> Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
> Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
> Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
> Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
> Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s

That's definitely not good - though it's marginally better than
before. I have checked on a local machine and the timecounter
frequency definitely needs to be adjusted in the opposite direction to
the ntpd drift.

I think I see the problem: I suggested 3579545Hz - 2500ppm, which
gives an ACPI frequency of 3570596Hz. There was some miscommunication
and you have set an ACPI frequency of 3577045Hz which is 2500Hz (or
698ppm) lower. The drift reported by the time resets has gone from
+1930ppm (14.5s in 2:05:17) to +1233ppm (8.4s in 2:20:06) - which is
697ppm - fairly close to the change you made. (The PLL is running at
+500ppm so the actual clock offset is 500ppm more than the time reset
reports suggest.)

Having re-checked my maths, using both your time reset results, can
you please try:
sysctl machdep.acpi_timer_freq=3570847
That should result in a drift of close to zero (well within NTP's lock
range of +/- 300ppm).

> frequency:            500.000 ppm

And this is definitely not good.

> Not synced at all. Not good. :-/
> Perhaps I should give it more time?

No. Once ntpd decides to continuously step, something is broken.

I've done some double-checking and

On 2010-Feb-20 22:55:21 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> This output looks ... wrong ... somehow to my eyes:
...
> Shouldn't ntpq and ntpdc be in agreement?

I'm not sure which particular bits you are concerned about but ntpq
reports delay/offset/jitter in msec whilst ntpdc reports them in sec.

Note that I can't explain why the loopi offset is zero - ntpdc(8)
states that this is the last offset given to the loop filter by the
packet processing code. For me it's non-zero but doesn't quite match
the offset reported by 'ntpq -p'.

--
Peter Jeremy
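The mix-up between "2500Hz lower" and "2500ppm lower" described above can be checked numerically. The following is a minimal sketch; `ppm_difference` is an illustrative helper, not an existing tool, and the frequencies and drift figures are the ones quoted in the message.

```python
def ppm_difference(reference_hz: float, other_hz: float) -> float:
    """Relative difference of other_hz from reference_hz, in parts per million."""
    return (other_hz - reference_hz) / reference_hz * 1e6


ORIGINAL_HZ = 3579545   # standard ACPI timer frequency
WAS_SET_HZ = 3577045    # value actually set: 2500 Hz lower
INTENDED_HZ = 3570596   # value suggested: 2500 ppm lower

applied_ppm = ppm_difference(ORIGINAL_HZ, WAS_SET_HZ)
intended_ppm = ppm_difference(ORIGINAL_HZ, INTENDED_HZ)

# Change in stepped drift between the two logs, as quoted in the message.
observed_change_ppm = 1930 - 1233

print(f"applied change  : {applied_ppm:.0f} ppm")
print(f"intended change : {intended_ppm:.0f} ppm")
print(f"observed change : {observed_change_ppm} ppm")
```

The applied change of roughly 698ppm lines up with the observed 697ppm improvement, which supports the reading that the adjustment direction was right and only its size was too small.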
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-19 00:38:44 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> r...@kg-f2# sysctl machdep.acpi_timer_freq=3577045
> machdep.acpi_timer_freq: 3579545 -> 3577045

Looks reasonable. Let us know the results. I'd be interested in the
output from ntpdc -c loopi -c sysi.

--
Peter Jeremy
Re: ntpd struggling to keep up - how to fix?
On 2010-Feb-17 20:03:22 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> On Wed, 17 Feb 2010 19:49:27 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> > Unfortunately, it isn't enough to keep the machine in sync all the
> > time. But it is better than HPET so I'll keep it.

Did you delete /etc/ntp.drift between timecounter changes?

> This thread is interesting:
> http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html
> Is there a way in FreeBSD to perform adjustments like adjtimex?

There's ntptime(8) but it doesn't have a self-calibrate mode.

Based on the messages log you gave, and assuming the ntpd PLL is sane,
your ACPI-safe clock is about 2500ppm slow (the steps reflect about
2000ppm and the ntpd PLL should be compensating for a further 500ppm)
- this is really bad, even for consumer-grade stuff. Are you running
non-standard clock speeds or multipliers?

If there's nothing obvious, I'd follow John Hay's suggestion and force
set either your TSC or ACPI frequency in sysctl.conf (you can't
override the HPET frequency). Take either the TSC or ACPI frequency
reported by sysctl machdep, reduce it by 2500ppm and set that in
/etc/sysctl.conf. Assuming a standard (3.58MHz) ACPI, the latter would
look like:

machdep.acpi_timer_freq=3570596
kern.timecounter.hardware=ACPI-safe

Then stop ntpd, delete /var/db/ntpd.drift and either reboot or
manually set the above sysctls and restart ntpd.

[I think I've got the adjustment direction correct in the above; if
I've stuffed up, you need to adjust in the other direction]

--
Peter Jeremy
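"Reduce it by 2500ppm" means scaling the reported frequency by a factor, not subtracting a fixed number of Hz. A short sketch of that step follows; the helper name is hypothetical and the input is the standard 3579545Hz ACPI timer frequency assumed in the message.

```python
def scale_by_ppm(freq_hz: float, ppm: float) -> int:
    """Scale a timecounter frequency by a signed parts-per-million amount."""
    return round(freq_hz * (1 + ppm * 1e-6))


REPORTED_HZ = 3579545  # from: sysctl machdep.acpi_timer_freq
new_hz = scale_by_ppm(REPORTED_HZ, -2500)

# Lines in the form used by /etc/sysctl.conf.
print(f"machdep.acpi_timer_freq={new_hz}")
print("kern.timecounter.hardware=ACPI-safe")
```

A negative ppm value lowers the nominal frequency, which makes the kernel count more time per hardware tick and so speeds up a clock that is running slow.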
Re: ntpd struggling to keep up - how to fix?
On Fri, 19 Feb 2010 07:54:58 +1100 Peter Jeremy peterjer...@acm.org wrote:
> Did you delete /etc/ntp.drift between timecounter changes?

I sure did, I used the instructions given.

> There's ntptime(8) but it doesn't have a self-calibrate mode.

Ok, good to know.

> Based on the messages log you gave, and assuming the ntpd PLL is
> sane, your ACPI-safe clock is about 2500ppm slow (the steps reflect
> about 2000ppm and the ntpd PLL should be compensating for a further
> 500ppm) - this is really bad, even for consumer-grade stuff. Are you
> running non-standard clock speeds or multipliers?

No, everything at default values here (ie. I haven't changed anything
in either BIOS or FreeBSD), except for changing the timer from HPET to
ACPI-safe.

> If there's nothing obvious, I'd follow John Hay's suggestion and
> force set either your TSC or ACPI frequency in sysctl.conf (you can't
> override the HPET frequency). Take either the TSC or ACPI frequency
> reported by sysctl machdep, reduce it by 2500ppm and set that in
> /etc/sysctl.conf. Assuming a standard (3.58MHz) ACPI, the latter
> would look like:
> machdep.acpi_timer_freq=3570596

This one is

r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3579545

So I should change that to 3577045, right? Like so:

r...@kg-f2# sysctl machdep.acpi_timer_freq=3579545
machdep.acpi_timer_freq: 3579545 -> 3579545

and I put it into /etc/sysctl.conf as well (in case the machine
reboots again).

> kern.timecounter.hardware=ACPI-safe

Yes, this is already in /etc/sysctl.conf

> Then stop ntpd, delete /var/db/ntpd.drift and either reboot or
> manually set the above sysctls and restart ntpd.

Done. We'll see if it works or not.

> [I think I've got the adjustment direction correct in the above; if
> I've stuffed up, you need to adjust in the other direction]

Ok. Thanks to all for helping out.

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Thu, 18 Feb 2010 23:12:23 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> So I should change that to 3577045, right? Like so:
> r...@kg-f2# sysctl machdep.acpi_timer_freq=3579545
> machdep.acpi_timer_freq: 3579545 -> 3579545

Eh... I just realized that I did it wrong. Well, that's what cut and
paste will do to you, if you don't pay attention. ;)

Ok, I will try to do it right now:

r...@kg-f2# sysctl machdep.acpi_timer_freq=3577045
machdep.acpi_timer_freq: 3579545 -> 3577045
r...@kg-f2# /etc/rc.d/ntpd stop
Stopping ntpd.
r...@kg-f2# ll /var/db/ntp*
-rw-r--r--  1 root  wheel  8 Feb 13 20:27 /var/db/ntpd.drift
r...@kg-f2# rm /var/db/ntpd.drift
r...@kg-f2# ll /var/db/ntp*
ls: /var/db/ntp*: No such file or directory
r...@kg-f2# /etc/rc.d/ntpd start
Starting ntpd.

and fixing it in /etc/sysctl.conf too. Done.

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Fri, 12 Feb 2010 17:44:52 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> On Fri, 12 Feb 2010 05:11:17 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> > Please try doing this:
> > - stop ntpd
> > - rm /var/db/ntpd.drift
> > - sysctl kern.timecounter.hardware=ACPI-safe
> > - start ntpd
>
> Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

Well, using ACPI-safe only gives a small improvement. Here are the
lines from /var/log/messages:

Feb 17 17:16:47 kg-f2 ntpd[912]: time reset +1.785920 s
Feb 17 17:32:39 kg-f2 ntpd[912]: time reset +1.836376 s
Feb 17 17:48:18 kg-f2 ntpd[912]: time reset +1.811593 s
Feb 17 18:04:12 kg-f2 ntpd[912]: time reset +1.840545 s
Feb 17 18:19:19 kg-f2 ntpd[912]: time reset +1.751837 s
Feb 17 18:35:19 kg-f2 ntpd[912]: time reset +1.852328 s
Feb 17 18:51:18 kg-f2 ntpd[912]: time reset +1.850928 s
Feb 17 19:06:50 kg-f2 ntpd[912]: time reset +1.798706 s
Feb 17 19:22:35 kg-f2 ntpd[912]: time reset +1.823697 s
Feb 17 19:37:56 kg-f2 ntpd[912]: time reset +1.777376 s

Unfortunately, it isn't enough to keep the machine in sync all the
time. But it is better than HPET so I'll keep it.

--
Torfinn
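A log excerpt like the one above can be turned into a drift estimate mechanically. The script below is a sketch, not an existing tool: the regular expression, function names and the assumed year are illustrative, and it treats the first reset only as the start of the measurement window, since the log does not show when the preceding interval began.

```python
import re
from datetime import datetime

LOG = """\
Feb 17 17:16:47 kg-f2 ntpd[912]: time reset +1.785920 s
Feb 17 17:32:39 kg-f2 ntpd[912]: time reset +1.836376 s
Feb 17 17:48:18 kg-f2 ntpd[912]: time reset +1.811593 s
Feb 17 18:04:12 kg-f2 ntpd[912]: time reset +1.840545 s
Feb 17 18:19:19 kg-f2 ntpd[912]: time reset +1.751837 s
Feb 17 18:35:19 kg-f2 ntpd[912]: time reset +1.852328 s
Feb 17 18:51:18 kg-f2 ntpd[912]: time reset +1.850928 s
Feb 17 19:06:50 kg-f2 ntpd[912]: time reset +1.798706 s
Feb 17 19:22:35 kg-f2 ntpd[912]: time reset +1.823697 s
Feb 17 19:37:56 kg-f2 ntpd[912]: time reset +1.777376 s
"""

RESET_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ ntpd\[\d+\]: time reset ([+-]\d+\.\d+) s$"
)


def parse_resets(text: str, year: int = 2010):
    """Return (timestamp, step_seconds) pairs; syslog omits the year, so assume one."""
    events = []
    for line in text.splitlines():
        match = RESET_RE.match(line.strip())
        if match:
            when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, float(match.group(2))))
    return events


def stepped_drift_ppm(events) -> float:
    """Drift corrected by stepping, counting only steps after the first event."""
    elapsed = (events[-1][0] - events[0][0]).total_seconds()
    stepped = sum(step for _, step in events[1:])
    return stepped / elapsed * 1e6


events = parse_resets(LOG)
drift_ppm = stepped_drift_ppm(events)
print(f"{len(events)} resets, stepped drift about {drift_ppm:.0f} ppm")
```

The result comes out close to the +1930ppm figure quoted elsewhere in the thread for this log; the PLL's own 500ppm of slew has to be added on top to get the full clock error.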
Re: ntpd struggling to keep up - how to fix?
On Wed, 17 Feb 2010 19:49:27 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> Unfortunately, it isn't enough to keep the machine in sync all the
> time. But it is better than HPET so I'll keep it.

This thread is interesting:
http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html

Is there a way in FreeBSD to perform adjustments like adjtimex?
'apropos adjtime' only gives me a system call, and the man pages for
hz(9) and hardclock(9) don't exist on 8.0-stable (or on 7.2-stable).

--
Torfinn
Re: ntpd struggling to keep up - how to fix?
On Wed, Feb 17, 2010 at 08:03:22PM +0100, Torfinn Ingolfsen wrote:
> On Wed, 17 Feb 2010 19:49:27 +0100 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
> > Unfortunately, it isn't enough to keep the machine in sync all the time. But it is better than HPET so I'll keep it.
>
> This thread is interesting: http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html
>
> Is there a way in FreeBSD to perform adjustments like adjtimex? 'apropos adjtime' only gives me a system call, and the man pages for hz(9) and hardclock(9) don't exist on 8.0-stable (or on 7.2-stable).

You can set the timecounter frequency with sysctl. On my one time server I have these lines in /etc/sysctl.conf:

machdep.tsc_freq=132658584
kern.timecounter.hardware=TSC

John
--
John Hay -- j...@meraka.csir.co.za / j...@freebsd.org
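As a worked example of the arithmetic behind such a frequency tweak: if a clock is slow by some number of ppm, the counter is really ticking that much slower than the nominal frequency the kernel assumes, so the corrected nominal value is the old one scaled down by the same fraction. This is a sketch under assumptions: the function name is hypothetical, the 3579545 Hz figure is the ACPI-safe frequency reported earlier in this thread, and the 1900 ppm figure is a rough estimate from the reported steps of about 1.8 s every 16 minutes or so. Which sysctl actually holds the writable frequency depends on the timecounter in use; machdep.tsc_freq is the one shown above for the TSC.

```python
def corrected_frequency(nominal_hz, slow_ppm):
    """Nominal frequency to configure for a clock that runs slow_ppm slow.

    A positive slow_ppm means the system clock falls behind real time,
    so the counter ticks slower than the kernel currently believes.
    """
    return round(nominal_hz * (1 - slow_ppm / 1_000_000))


# ACPI-safe frequency as reported by sysctl kern.timecounter in this thread.
nominal = 3579545
# Rough drift estimate (an assumption): about 1.8 s lost every ~16 minutes.
print(corrected_frequency(nominal, 1900))
```

After a change like this the remaining error should be small enough for ntpd to handle by slewing, but the value would need to be checked against a few days of ntpd's own drift estimate.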
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 11:46:04AM -0800, Jeremy Chadwick wrote:
> Technical footnote: I wish I understood 1) the difference between ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the timecounters (the higher the value in parentheses, supposedly the more accurate/preferred it is). Xin, do you happen to know how this works?

1) When you read the ACPI timing register, you should get a sensible answer. However, on some (most?) hardware, you can read the register and catch it halfway through an update. When the kernel finds the ACPI timer, it tries reading it a few times in a row and checks that the results look good. If they do, you get ACPI-fast. If it catches a half-updated register, you get ACPI-safe, which reads the register multiple times in an effort to avoid the problem.

2) The ranking of timers is essentially hard-wired, though for some timers it is adjusted in some way. For example, the ranking of the TSC may be reduced if it looks like an SMP system. I believe the ranking was originally intended to be a measure of how fast the counter could be read, but things have turned out to be complicated by difficult hardware.

David.
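The "read the register multiple times" idea in point 1 can be sketched as follows. This is an illustration only, not the kernel's code: the function name, the retry limit, and the simulated register are all made up here, and counter wrap-around is ignored.

```python
def safe_read(read_raw, max_tries=100):
    """Re-read a counter until three consecutive samples are non-decreasing.

    read_raw is any zero-argument callable returning the raw counter value.
    A sample caught halfway through an update breaks the ordering and is
    discarded by sliding the three-sample window forward.
    """
    a = read_raw()
    b = read_raw()
    for _ in range(max_tries):
        c = read_raw()
        if a <= b <= c:
            return b
        a, b = b, c
    raise RuntimeError("counter never produced three consistent samples")


# Simulated register: the second read (999999) is a half-updated glitch.
samples = iter([100, 999999, 102, 103, 104])
print(safe_read(lambda: next(samples)))
```

The cost of this approach is the extra reads on every call, which is presumably why the kernel only falls back to it when the quick probe sees inconsistent values.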
Re: ntpd struggling to keep up - how to fix?
On Fri, 12 Feb 2010 11:46:04 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> override this though! :-) ), which -- assuming it works -- should solve your problem.

We'll see. The box rebooted again last night (see another thread on this mailing list), so now I have added kern.timecounter.hardware=ACPI-safe to /etc/sysctl.conf, just in case it reboots again.

> Technical footnote: I wish I understood 1) the difference between ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the

I'm still wondering why this machine doesn't have ACPI-fast:

o...@kg-f2# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100)

While my workstation does:

ti...@kg-v2$ sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)

and another machine:

r...@kg-quiet# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-100)

and another:

r...@kg-vm# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-100)

Probably a BIOS / ACPI problem.
--
Torfinn
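When comparing several machines like this, it can help to parse the kern.timecounter.choice line and see which counter carries the highest quality value. The sketch below uses hypothetical helper names and simply takes the maximum; the kernel's real selection logic may involve more than that.

```python
import re


def parse_choice(line):
    """Turn a 'kern.timecounter.choice: NAME(quality) ...' line into a dict."""
    _, _, rest = line.partition(":")
    return {name: int(quality)
            for name, quality in re.findall(r"(\S+)\((-?\d+)\)", rest)}


def highest_quality(line):
    """Name of the timecounter with the largest quality value."""
    choices = parse_choice(line)
    return max(choices, key=choices.get)


kg_f2 = "kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100)"
kg_v2 = "kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)"

for host, line in (("kg-f2", kg_f2), ("kg-v2", kg_v2)):
    print(host, highest_quality(line))
```

For the two lines above this picks HPET on kg-f2 and ACPI-fast on kg-v2, which matches the kern.timecounter.hardware value reported for kg-f2 elsewhere in the thread.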
Re: ntpd struggling to keep up - how to fix?
On Thu, 11 Feb 2010 10:25:59 -0800 Chuck Swiger cswi...@mac.com wrote:
> The rate at which this machine is losing time is probably exceeding the ~50 seconds per day that NTPd is willing to correct without extreme measures (ie, it has to step time rather than drift-correct). You might help it maintain a more sane idea of time by using at least 4 timeservers.

Hmm, OK, that is something I can try.

> You might take a look at 'vmstat -i' and look out for an interrupt storm, but it's possible your hardware's clock is simply busted.

AFAICT, vmstat -i looks ok:

r...@kg-f2# uptime
 1:23PM up 18:31, 3 users, load averages: 0.00, 0.00, 0.00
r...@kg-f2# vmstat -i
interrupt                          total       rate
irq1: atkbd0                          36          0
irq6: fdc0                             1          0
irq16: siis0 ohci0+                  408          0
irq22: atapci0                    856338         12
cpu0: timer                    133347678       1999
irq256: re0                       234087          3
cpu1: timer                    133337654       1999
Total                          267776202       4016
--
Regards, Torfinn
Re: ntpd struggling to keep up - how to fix?
On Thu, 11 Feb 2010 11:25:15 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel. Can you please provide the output of: sysctl kern.timecounter

Here it is:

r...@kg-f2# sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100)
kern.timecounter.hardware: HPET
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 52444
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-safe.mask: 4294967295
kern.timecounter.tc.ACPI-safe.counter: 3252982815
kern.timecounter.tc.ACPI-safe.frequency: 3579545
kern.timecounter.tc.ACPI-safe.quality: 850
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 3443625641
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 1276479615
kern.timecounter.tc.TSC.frequency: 2819782573
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

> Finally, was this OS installation used on different hardware in the past? Meaning: was the hard disk previously installed on another machine?

Nope. Brand new hw, hard drive, and FreeBSD 8.0-release install. Then I upgraded to 8.0-stable.

> Why I'm asking: /var/db/ntpd.drift could be from an old computer (the previous hardware), and the clock drift rate would be different than that of your newer[1] hardware.

No, /var/db/ntp.drift is created on this machine.
--
Regards, Torfinn
Re: ntpd struggling to keep up - how to fix?
On Thu, 11 Feb 2010 22:49:59 +0100 Stefan Krueger stadtki...@gmx.de wrote:
> I have the same problem on my machine (also AMD, running 8.0 + patches): after a while ntpd gives up sync'ing and then the time is off by minutes (roughly 80sec after.. say 10 hours) :(

FWIW, I have several other AMD systems, from Asus, MSI and Gigabyte. None of them have problems with ntp.

> I switched to openntpd (you can find it in ports) and the clock stays in sync now, so you might want to consider that, too

I will if that is the only solution. However, if there is something wrong with my setup / configuration of this machine, I'd rather fix that.
--
Regards, Torfinn
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote:
> On Thu, 11 Feb 2010 11:25:15 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> > Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel. Can you please provide the output of: sysctl kern.timecounter
>
> Here it is:
>
> r...@kg-f2# sysctl kern.timecounter
> kern.timecounter.tick: 1
> kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100)
> kern.timecounter.hardware: HPET
> kern.timecounter.stepwarnings: 0
> kern.timecounter.tc.i8254.mask: 65535
> kern.timecounter.tc.i8254.counter: 52444
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.i8254.quality: 0
> kern.timecounter.tc.ACPI-safe.mask: 4294967295
> kern.timecounter.tc.ACPI-safe.counter: 3252982815
> kern.timecounter.tc.ACPI-safe.frequency: 3579545
> kern.timecounter.tc.ACPI-safe.quality: 850
> kern.timecounter.tc.HPET.mask: 4294967295
> kern.timecounter.tc.HPET.counter: 3443625641
> kern.timecounter.tc.HPET.frequency: 14318180
> kern.timecounter.tc.HPET.quality: 900
> kern.timecounter.tc.TSC.mask: 4294967295
> kern.timecounter.tc.TSC.counter: 1276479615
> kern.timecounter.tc.TSC.frequency: 2819782573
> kern.timecounter.tc.TSC.quality: -100
> kern.timecounter.smp_tsc: 0
> kern.timecounter.invariant_tsc: 1

Please try doing this:

- stop ntpd
- rm /var/db/ntpd.drift
- sysctl kern.timecounter.hardware=ACPI-safe
- start ntpd

Then see if your clock drifts. If it stops, great -- you can put that sysctl assignment line in /etc/sysctl.conf and consider it a done deal. I highly recommend putting some comments around it though so in the future you don't go "What's this? Silly!" and delete it. ;-)

I'll also point out that it's common on FreeBSD[1] to see messages like the following (or at least it was circa 2006 -- I believe ntpd has been updated since then, but I've no indication said quirk was fixed/addressed):

Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001
Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001
Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001
repeat indefinitely

You can add the following to your ntp.conf to fix that problem:

# maxpoll 9 is used to work around PLL/FLL flipping, which happens at
# exactly 1024 seconds (the default maxpoll value). Another FreeBSD
# user recommended using 9 instead:
# http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html
#
server some.ntp.server maxpoll 9

I recommend using the iburst directive on one (and only one!) server line in your config; otherwise ntpd will usually 'settle' for about 10-15 minutes before bothering to try and update the clock the first time around. Example config:

server clock.develooper.com maxpoll 9 iburst
server ntp.nblug.org maxpoll 9
server tick.mtnlion.com maxpoll 9
server dewey.lib.ci.phoenix.az.us maxpoll 9
server ntp-1.cso.uiuc.edu maxpoll 9
server tick.jrc.us maxpoll 9

Finally, you should really consider adding some stratum 2 sources to your list, *in addition* to the stratum 3 server you're already using. ntpd can happily work with multiple servers, and will pick the best one as well as average them out vs. drift. It's pretty smart, honestly.

We can talk about stratum 3 vs. 2 vs. 1 and use ntpdc -c peers to find out what those NTP servers sync with, if you'd like.

[1]: http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: ntpd struggling to keep up - how to fix?
On Fri, 12 Feb 2010 05:11:17 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> Please try doing this:
> - stop ntpd
> - rm /var/db/ntpd.drift
> - sysctl kern.timecounter.hardware=ACPI-safe
> - start ntpd

Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

> Then see if your clock drifts. If it stops, great -- you can put that sysctl assignment line in /etc/sysctl.conf and consider it a done deal. I highly recommend putting some comments around it though so in the future you don't go "What's this? Silly!" and delete it. ;-)

Yes, I know. Learned the hard way that I need to document things for my own use. :-)

> I'll also point out that it's common on FreeBSD[1] to see messages like the following (or at least it was circa 2006 -- I believe ntpd

Yes, those messages are still there. At one time, I thought about fixing that (by using the config you present), but in the end I figured that these messages actually help me pinpoint the time of a crash in the (few) cases where one of my machines crashes or panics. So in the end I did nothing about it.

> Finally, you should really consider adding some stratum 2 sources to your list, *in addition* to the stratum 3 server you're already using.

Well, the stratum 3 server is my firewall. :) All the machines on my LAN use that one as the ntp server. My firewall is currently using three ntp servers as sources: one is a non-public stratum 2 server (yes, I asked for permission before I started using it), one is the no.pool.ntp.org pool (stratum 3), and I just found out that the third one has stopped responding, so I removed it. I have added the dk.pool.ntp.org and se.pool.ntp.org pools; we will see how that turns out.

> ntpd can happily work with multiple servers, and will pick the best one

Yes, I know. It is a few years since I set this up, but at that time I figured that if I used three ntp servers for my firewall, and just used the firewall for all my internal machines, that would be good enough for my uses. Perhaps I need to re-evaluate my needs.
--
Regards, Torfinn
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 05:11:17AM -0800 I heard the voice of Jeremy Chadwick, and lo! it spake thus:
> I highly recommend putting some comments around it though so in the future you don't go "What's this? Silly!" and delete it. ;-)

But do delete it every once in a while. My experience over the years is that sometimes a given OS build will do way worse than I'm used to, whereas a new build a few months later works just peachy.

--
Matthew Fuller (MF4839) | fulle...@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
Re: ntpd struggling to keep up - how to fix?
Date: Fri, 12 Feb 2010 05:11:17 -0800 From: Jeremy Chadwick free...@jdc.parodius.com Sender: owner-freebsd-sta...@freebsd.org On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote: On Thu, 11 Feb 2010 11:25:15 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote: Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel. Can you please provide the output of: sysctl kern.timecounter Here it is: r...@kg-f2# sysctl kern.timecounter kern.timecounter.tick: 1 kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100) kern.timecounter.hardware: HPET kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 65535 kern.timecounter.tc.i8254.counter: 52444 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.ACPI-safe.mask: 4294967295 kern.timecounter.tc.ACPI-safe.counter: 3252982815 kern.timecounter.tc.ACPI-safe.frequency: 3579545 kern.timecounter.tc.ACPI-safe.quality: 850 kern.timecounter.tc.HPET.mask: 4294967295 kern.timecounter.tc.HPET.counter: 3443625641 kern.timecounter.tc.HPET.frequency: 14318180 kern.timecounter.tc.HPET.quality: 900 kern.timecounter.tc.TSC.mask: 4294967295 kern.timecounter.tc.TSC.counter: 1276479615 kern.timecounter.tc.TSC.frequency: 2819782573 kern.timecounter.tc.TSC.quality: -100 kern.timecounter.smp_tsc: 0 kern.timecounter.invariant_tsc: 1 Please try doing this: - stop ntpd - rm /var/db/ntpd.drift - sysctl kern.timecounter.hardware=ACPI-safe - start ntpd Then see if your clock drifts. If it stops, great -- you can put that sysctl assignment line in /etc/sysctl.conf and consider it a done deal. I highly recommend putting some comments around it though so in the future you don't go What's this? Silly! and delete it. 
> ;-)
>
> I'll also point out that it's common on FreeBSD[1] to see messages like the following (or at least it was circa 2006 -- I believe ntpd has been updated since then, but I've no indication said quirk was fixed/addressed):
>
> Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001
> Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001
> Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001
> repeat indefinitely
>
> You can add the following to your ntp.conf to fix that problem:
>
> # maxpoll 9 is used to work around PLL/FLL flipping, which happens at
> # exactly 1024 seconds (the default maxpoll value). Another FreeBSD
> # user recommended using 9 instead:
> # http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html
> #
> server some.ntp.server maxpoll 9
>
> I recommend using the iburst directive on one (and only one!) server line in your config; otherwise ntpd will usually 'settle' for about 10-15 minutes before bothering to try and update the clock the first time around. Example config:

Why "(and only one!)"? I have never seen a problem with 'iburst' on all servers (assuming they are Internet connected). 'iburst' only makes a difference on the initial query and, if the server you have marked as 'iburst' is unreachable, it will really slow down synchronization. I am unaware of any issues with multiple servers being marked 'iburst' and typically configure 7 ntp servers for a system, all tagged as 'iburst'. I have never seen any issue with this.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 11:16:37AM -0800, Kevin Oberman wrote: Date: Fri, 12 Feb 2010 05:11:17 -0800 From: Jeremy Chadwick free...@jdc.parodius.com Sender: owner-freebsd-sta...@freebsd.org On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote: On Thu, 11 Feb 2010 11:25:15 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote: Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel. Can you please provide the output of: sysctl kern.timecounter Here it is: r...@kg-f2# sysctl kern.timecounter kern.timecounter.tick: 1 kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100) kern.timecounter.hardware: HPET kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 65535 kern.timecounter.tc.i8254.counter: 52444 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.ACPI-safe.mask: 4294967295 kern.timecounter.tc.ACPI-safe.counter: 3252982815 kern.timecounter.tc.ACPI-safe.frequency: 3579545 kern.timecounter.tc.ACPI-safe.quality: 850 kern.timecounter.tc.HPET.mask: 4294967295 kern.timecounter.tc.HPET.counter: 3443625641 kern.timecounter.tc.HPET.frequency: 14318180 kern.timecounter.tc.HPET.quality: 900 kern.timecounter.tc.TSC.mask: 4294967295 kern.timecounter.tc.TSC.counter: 1276479615 kern.timecounter.tc.TSC.frequency: 2819782573 kern.timecounter.tc.TSC.quality: -100 kern.timecounter.smp_tsc: 0 kern.timecounter.invariant_tsc: 1 Please try doing this: - stop ntpd - rm /var/db/ntpd.drift - sysctl kern.timecounter.hardware=ACPI-safe - start ntpd Then see if your clock drifts. If it stops, great -- you can put that sysctl assignment line in /etc/sysctl.conf and consider it a done deal. I highly recommend putting some comments around it though so in the future you don't go What's this? Silly! and delete it. 
;-) I'll also point out that it's common on FreeBSD[1] to see messages like the following (or at least it was circa 2006 -- I believe ntpd has been updated since then, but I've no indication said quirk was fixed/addressed): Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001 Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001 Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001 repeat indefinitely You can add the following to your ntp.conf to fix that problem: # maxpoll 9 is used to work around PLL/FLL flipping, which happens at # exactly 1024 seconds (the default maxpoll value). Another FreeBSD # user recommended using 9 instead: # http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html # server some.ntp.server maxpoll 9 I recommend using the iburst directive on one (and only one!) server lines in your config, otherwise ntpd will usually 'settle' for about 10-15 minutes before bothering to try and update the clock the first time around. Example config: Why (and only one!)? I have never seen a problem with 'iburst' on all servers (assuming they are Internet connected. 'iburst' only makes a difference on the initial query and, if the server you have marked as 'iburst' is unreachable, it will really slow down synchronization. I am unaware of any issues with multiple servers being marked 'iburst' and typically configure 7 ntp servers for a system, all tagged as 'iburst'. Never sen any issue with this. iburst sends 8 packets (vs. the default of 1) with an interval delay of 2 seconds (assuming calldelay isn't set). There's some sort of rule in the NTP community where more than 20 requests per hour is considered rude or worthy of blocking. This was discussed on the timekeepers list a while back. 
The original thread (AFAIR) was about rules/regulations for vendors (such as router manufacturers who pick defaults, etc.), but I'm willing to bet the concepts apply universally:

http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002299.html

Relevant discussion pieces are below; (*) marks those worth reading, and (!!!) marks those that are highly relevant:

http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002321.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002323.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002325.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002326.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002327.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002329.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002330.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002331.html (!!!)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002332.html (!!!)
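For a sense of scale, the figures mentioned in this exchange can be put side by side with a little arithmetic. The sketch below uses made-up function names; the 8-packet burst size and the 20-requests-per-hour guideline are taken from the message above rather than verified independently, and the poll interval is 2 raised to the minpoll/maxpoll exponent, in seconds.

```python
def poll_interval_seconds(poll_exponent):
    """ntpd poll settings are exponents of two: 9 means 512 s, 10 means 1024 s."""
    return 2 ** poll_exponent


def polls_per_hour(poll_exponent):
    """Steady-state requests per hour to one server at a given poll exponent."""
    return 3600 / poll_interval_seconds(poll_exponent)


def startup_burst_packets(servers_with_iburst, packets_per_burst=8):
    """Packets sent at startup, assuming 8 per iburst server as stated above."""
    return servers_with_iburst * packets_per_burst


for exponent in (6, 9, 10):
    print(f"poll exponent {exponent}: every {poll_interval_seconds(exponent)} s, "
          f"{polls_per_hour(exponent):.1f} requests/hour per server")

# Seven servers all marked iburst, as in the configuration described above.
print("startup packets with 7 iburst servers:", startup_burst_packets(7))
```

Note that each server only sees its own burst, so how these numbers relate to any per-server guideline is a question for the linked discussion rather than something this arithmetic settles.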
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 05:44:52PM +0100, Torfinn Ingolfsen wrote:
> On Fri, 12 Feb 2010 05:11:17 -0800 Jeremy Chadwick free...@jdc.parodius.com wrote:
> > Please try doing this:
> > - stop ntpd
> > - rm /var/db/ntpd.drift
> > - sysctl kern.timecounter.hardware=ACPI-safe
> > - start ntpd
>
> Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

Something else came to mind: some BIOSes let you disable/enable HPET. It is often labelled as High Performance Event Timer or Multimedia Timer. You could disable this option, then check kern.timecounter.choice to see if HPET is gone from the list. If it is, FreeBSD will very likely choose ACPI-safe as the default timecounter (again, check kern.timecounter.hardware to see what the kernel chose itself. Remember that your sysctl.conf entry will override this though! :-) ), which -- assuming it works -- should solve your problem.

Technical footnote: I wish I understood 1) the difference between ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the timecounters (the higher the value in parentheses, supposedly the more accurate/preferred it is). Xin, do you happen to know how this works?

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: ntpd struggling to keep up - how to fix?
On Fri, Feb 12, 2010 at 11:46:04AM -0800 I heard the voice of Jeremy Chadwick, and lo! it spake thus:
> Technical footnote: I wish I understood 1) the difference between ACPI-safe and ACPI-fast,

AIUI, they're nearly the same thing, and it has to do with some testing to determine how it can be reliably accessed. I've had systems that would sometimes come up with -fast, and other times -safe (I think one varied depending on cold vs. warm boot for instance).

> and 2) how the system or OS ranks the timecounters (the higher the value in parentheses, supposedly the more accurate/preferred it is).

That's easier; TTBOMK, they're hardcoded in the source based on developer SWAG about their relative expenses and reliabilities and precisions.

--
Matthew Fuller (MF4839) | fulle...@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
Re: ntpd struggling to keep up - how to fix?
Hi--

On Feb 11, 2010, at 10:06 AM, Torfinn Ingolfsen wrote:
> [ ... ]
> Feb 7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s
>
> and this goes on and on, forever. At any given time, no matter how long the machine has been up, ntpq can report this:
>
> r...@kg-f2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  kg-omni1.kg4.no 129.240.64.3     3 u   13   64   37    0.162  703.094 444.681

The rate at which this machine is losing time is probably exceeding the ~50 seconds per day that NTPd is willing to correct without extreme measures (ie, it has to step time rather than drift-correct). You might help it maintain a more sane idea of time by using at least 4 timeservers.

You might take a look at 'vmstat -i' and look out for an interrupt storm, but it's possible your hardware's clock is simply busted.

Regards,
--
-Chuck
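The "seconds per day" and "ppm" figures used in this thread are two views of the same rate, and converting between them is simple arithmetic. The sketch below uses hypothetical function names; the 500 ppm limit is the bound quoted elsewhere in the thread, and the roughly 15-minute spacing between steps is an approximation read off the log in the original post.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm):
    """Clock error accumulated over one day at a given drift rate."""
    return ppm / 1_000_000 * SECONDS_PER_DAY


def step_to_ppm(step_seconds, interval_seconds):
    """Drift rate implied by one ntpd step after a given interval."""
    return step_seconds / interval_seconds * 1_000_000


# The +/-500 ppm bound corresponds to the "~50 seconds per day" ballpark.
print(f"500 ppm is about {ppm_to_seconds_per_day(500):.1f} s/day")

# One step of about 2.37 s roughly every 15 minutes (900 s is an approximation).
observed = step_to_ppm(2.37, 900)
print(f"observed drift is about {observed:.0f} ppm, "
      f"or {ppm_to_seconds_per_day(observed):.0f} s/day")
```

On these assumptions the machine is losing several times more per day than ntpd will correct by slewing, which matches the continuous stepping seen in the log.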
Re: ntpd struggling to keep up - how to fix?
On Thu, Feb 11, 2010 at 07:06:52PM +0100, Torfinn Ingolfsen wrote:
> Hi,
>
> One of my machines, the fileserver-with-zfs-to-be[1], has trouble keeping correct time. Or rather, ntpd is struggling. In /var/log/messages I see this:
>
> Feb 7 12:05:54 kg-f2 ntpd[909]: ntpd 4.2.4p5-a (1)
> Feb 7 12:11:16 kg-f2 ntpd[910]: time reset +1.020413 s
> Feb 7 12:11:16 kg-f2 ntpd[910]: kernel time sync status change 2001
> Feb 7 12:26:26 kg-f2 ntpd[910]: time reset +2.277793 s
> Feb 7 12:41:29 kg-f2 ntpd[910]: time reset +2.260229 s
> Feb 7 12:57:02 kg-f2 ntpd[910]: time reset +2.332972 s
> Feb 7 13:21:24 kg-f2 ntpd[910]: time reset +3.659869 s
> Feb 7 13:37:01 kg-f2 ntpd[910]: time reset +2.343230 s
> Feb 7 13:52:24 kg-f2 ntpd[910]: time reset +2.310659 s
> Feb 7 14:07:29 kg-f2 ntpd[910]: time reset +2.265705 s
> Feb 7 14:23:03 kg-f2 ntpd[910]: time reset +2.335868 s
> Feb 7 14:39:06 kg-f2 ntpd[910]: time reset +2.46 s
> Feb 7 14:54:32 kg-f2 ntpd[910]: time reset +2.318222 s
> Feb 7 15:09:55 kg-f2 ntpd[910]: time reset +2.308120 s
> Feb 7 15:25:49 kg-f2 ntpd[910]: time reset +2.388391 s
> Feb 7 15:40:54 kg-f2 ntpd[910]: time reset +2.265464 s
> Feb 7 15:55:57 kg-f2 ntpd[910]: time reset +2.257952 s
> Feb 7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s
>
> and this goes on and on, forever. At any given time, no matter how long the machine has been up, ntpq can report this:
>
> r...@kg-f2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  kg-omni1.kg4.no 129.240.64.3     3 u   13   64   37    0.162  703.094 444.681
>
> Note: all machines on my LAN use my firewall as the ntp server. The ntp server runs FreeBSD, and none of the other machines have any trouble keeping time. My workstation for example:
>
> ti...@kg-v2$ ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *kg-omni1.kg4.no 129.240.64.3     3 u   44   64  377    0.138    4.018   0.338
>
> (my workstation also runs FreeBSD 8.0-stable / amd64)
>
> The machine runs FreeBSD 8.0-stable / amd64:
>
> r...@kg-f2# uname -a
> FreeBSD kg-f2.kg4.no 8.0-STABLE FreeBSD 8.0-STABLE #2: Sun Jan 31 18:39:17 CET 2010 r...@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC amd64
>
> So, how can I get the machine to keep time / get ntpd synchronised?
>
> References:
> 1) hw info: http://sites.google.com/site/tingox/ga-ma74gm-s2h
> 2) FreeBSD info: http://sites.google.com/site/tingox/ga-ma74gm-s2h_freebsd

Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel. Can you please provide the output of:

sysctl kern.timecounter

Finally, was this OS installation used on different hardware in the past? Meaning: was the hard disk previously installed on another machine?

Why I'm asking: /var/db/ntpd.drift could be from an old computer (the previous hardware), and the clock drift rate would be different than that of your newer[1] hardware. If that's the case, please stop ntpd, rm /var/db/ntpd.drift, and restart ntpd. Be aware it will take up to 72 hours for the clock drift to be calculated correctly.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: ntpd struggling to keep up - how to fix?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, On 2010/02/11 11:25, Jeremy Chadwick wrote: On Thu, Feb 11, 2010 at 07:06:52PM +0100, Torfinn Ingolfsen wrote: Hi, One of my machines, the fileserver-with-zfs-to-be[1] has trouble keeping correct time. Or rather, ntpd is struggling. In /var/lkog/messages I see this: Feb 7 12:05:54 kg-f2 ntpd[909]: ntpd 4.2.4p5-a (1) Feb 7 12:11:16 kg-f2 ntpd[910]: time reset +1.020413 s Feb 7 12:11:16 kg-f2 ntpd[910]: kernel time sync status change 2001 Feb 7 12:26:26 kg-f2 ntpd[910]: time reset +2.277793 s Feb 7 12:41:29 kg-f2 ntpd[910]: time reset +2.260229 s Feb 7 12:57:02 kg-f2 ntpd[910]: time reset +2.332972 s Feb 7 13:21:24 kg-f2 ntpd[910]: time reset +3.659869 s Feb 7 13:37:01 kg-f2 ntpd[910]: time reset +2.343230 s Feb 7 13:52:24 kg-f2 ntpd[910]: time reset +2.310659 s Feb 7 14:07:29 kg-f2 ntpd[910]: time reset +2.265705 s Feb 7 14:23:03 kg-f2 ntpd[910]: time reset +2.335868 s Feb 7 14:39:06 kg-f2 ntpd[910]: time reset +2.46 s Feb 7 14:54:32 kg-f2 ntpd[910]: time reset +2.318222 s Feb 7 15:09:55 kg-f2 ntpd[910]: time reset +2.308120 s Feb 7 15:25:49 kg-f2 ntpd[910]: time reset +2.388391 s Feb 7 15:40:54 kg-f2 ntpd[910]: time reset +2.265464 s Feb 7 15:55:57 kg-f2 ntpd[910]: time reset +2.257952 s Feb 7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s and this goes on an on, forever. At any give time, no matter how long the machine has been up, ntpq ca report this: r...@kg-f2# ntpq -p remote refid st t when poll reach delay offset jitter == kg-omni1.kg4.no 129.240.64.3 3 u 13 64 370.162 703.094 444.681 Note: all machines on my LAN use my firewall as the ntp server. The ntp server runs FreeBSD, none of the other machines have any trouble keeping time. 
My workstation for example:

ti...@kg-v2$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 129.240.64.3     3 u   44   64  377    0.138    4.018   0.338

(my workstation also runs FreeBSD 8.0-stable / amd64)

The machine runs FreeBSD 8.0-stable / amd64:

r...@kg-f2# uname -a
FreeBSD kg-f2.kg4.no 8.0-STABLE FreeBSD 8.0-STABLE #2: Sun Jan 31 18:39:17 CET 2010 r...@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC amd64

So, how can I get the machine to keep time / get ntpd synchronised?

References:
1) hw info: http://sites.google.com/site/tingox/ga-ma74gm-s2h
2) FreeBSD info: http://sites.google.com/site/tingox/ga-ma74gm-s2h_freebsd

Your machine has a rapidly drifting clock, usually an indicator of a hardware problem (crystal gone bad is a common one -- seen this at work quite a few times), or possibly a bad time counter source chosen by the kernel.

Can you please provide the output of:

  sysctl kern.timecounter

Finally, was this OS installation used on different hardware in the past? Meaning: was the hard disk previously installed on another machine?

Why I'm asking: /var/db/ntpd.drift could be from an old computer (the previous hardware), and the clock drift rate would be different than that of your newer[1] hardware. If that's the case, please stop ntpd, rm /var/db/ntpd.drift, and restart ntpd. Be aware it will take up to 72 hours for the clock drift to be calculated correctly.

I think this looks like the same problem I had with another AMD system, which may be related to some HPET stuff (I no longer have access to that system, though :(

Cheers,
-- 
Xin LI delp...@delphij.net    http://www.delphij.net/
FreeBSD - The Power to Serve!
Live free or die
Re: ntpd struggling to keep up - how to fix?
Hi,

One of my machines, the fileserver-with-zfs-to-be[1], has trouble keeping correct time. Or rather, ntpd is struggling. In /var/log/messages I see this:

Feb 7 12:05:54 kg-f2 ntpd[909]: ntpd 4.2.4p5-a (1)
Feb 7 12:11:16 kg-f2 ntpd[910]: time reset +1.020413 s
Feb 7 12:11:16 kg-f2 ntpd[910]: kernel time sync status change 2001
Feb 7 12:26:26 kg-f2 ntpd[910]: time reset +2.277793 s
Feb 7 12:41:29 kg-f2 ntpd[910]: time reset +2.260229 s
Feb 7 12:57:02 kg-f2 ntpd[910]: time reset +2.332972 s
Feb 7 13:21:24 kg-f2 ntpd[910]: time reset +3.659869 s
Feb 7 13:37:01 kg-f2 ntpd[910]: time reset +2.343230 s
Feb 7 13:52:24 kg-f2 ntpd[910]: time reset +2.310659 s
Feb 7 14:07:29 kg-f2 ntpd[910]: time reset +2.265705 s
Feb 7 14:23:03 kg-f2 ntpd[910]: time reset +2.335868 s
Feb 7 14:39:06 kg-f2 ntpd[910]: time reset +2.46 s
Feb 7 14:54:32 kg-f2 ntpd[910]: time reset +2.318222 s
Feb 7 15:09:55 kg-f2 ntpd[910]: time reset +2.308120 s
Feb 7 15:25:49 kg-f2 ntpd[910]: time reset +2.388391 s
Feb 7 15:40:54 kg-f2 ntpd[910]: time reset +2.265464 s
Feb 7 15:55:57 kg-f2 ntpd[910]: time reset +2.257952 s
Feb 7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s

[snip]

I think this looks like the same problem I had with another AMD system, which may be related to some HPET stuff (I no longer have access to that system, though :(

I have the same problem on my machine (also AMD, running 8.0 + patches): after a while ntpd gives up syncing and then the time is off by minutes (roughly 80 sec after, say, 10 hours) :(

I switched to openntpd (you can find it in ports) and the clock stays in sync now, so you might want to consider that, too.

PS: I had a spare disk so I tried Linux on the same machine, and ntpd is running fine for 2 days without any problems; so I guess it's not a hw fault

HTH
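The figures quoted above can be turned into a drift rate with simple arithmetic: error in seconds divided by elapsed seconds, times one million, gives parts per million. The sketch below assumes a hypothetical helper name, drift_ppm (it is not an ntpd tool), and takes its inputs from this thread: a +2.277793 s step after the 910 s between the 12:11:16 and 12:26:26 log entries, and the "roughly 80 sec after 10 hours" estimate.

```shell
#!/bin/sh
# drift_ppm OFFSET_SECONDS INTERVAL_SECONDS
# Prints the average clock error in parts per million, rounded to an integer.
# drift_ppm is a hypothetical helper name used for illustration only.
drift_ppm() {
    awk -v offset="$1" -v interval="$2" \
        'BEGIN { printf "%.0f\n", offset / interval * 1000000 }'
}

# A +2.277793 s step after the 910 s between 12:11:16 and 12:26:26 in the log:
drift_ppm 2.277793 910

# Roughly 80 s lost after about 10 hours (36000 s):
drift_ppm 80 36000
```

Both results are above 2000 ppm (about 2503 and 2222), far outside the +/-500 ppm adjustment range mentioned elsewhere in this thread, which would explain why ntpd keeps stepping the clock instead of settling.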
Re: ntpd struggling to keep up - how to fix?
--On Thursday, February 11, 2010 10:49 PM +0100 Stefan Krueger stadtki...@gmx.de wrote:

<snip>

PS: I had a spare disk so I tried Linux on the same machine, and ntpd is running fine for 2 days without any problems; so I guess it's not a hw fault

It is a HW fault. FreeBSD and Linux are just picking different time sources: Linux is guessing right, FreeBSD is guessing wrong. AMD actually has pretty widely known issues with this. I've had problems mostly in Solaris/OpenSolaris though, a few with FreeBSD, and only rarely with Linux. I don't know the details, just that at least the Opteron HPET apparently isn't reliable.
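If the kernel's choice of time source is the suspect, one way to see what else is available is to parse the value of the kern.timecounter.choice sysctl. The sketch below is an assumption-laden illustration: timecounter_alternatives is a hypothetical helper name, and the sample string is made up to show the format, it is not output from any machine in this thread.

```shell
#!/bin/sh
# timecounter_alternatives CHOICE_STRING CURRENT
# CHOICE_STRING: value of kern.timecounter.choice, e.g. "HPET(900) i8254(0)"
# CURRENT:       value of kern.timecounter.hardware
# Prints every timecounter other than CURRENT whose quality is not negative.
# Hypothetical helper for illustration only.
timecounter_alternatives() {
    for entry in $1; do              # unquoted on purpose: split on spaces
        name=${entry%%\(*}           # text before the "("
        quality=${entry##*\(}        # text after the "("
        quality=${quality%\)}        # drop the trailing ")"
        if [ "$name" != "$2" ] && [ "$quality" -ge 0 ]; then
            echo "$name"
        fi
    done
}

# Illustrative sample value, NOT taken from the poster's machine:
timecounter_alternatives \
    "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)" HPET

# On a real FreeBSD system the inputs would come from sysctl:
#   timecounter_alternatives "$(sysctl -n kern.timecounter.choice)" \
#                            "$(sysctl -n kern.timecounter.hardware)"
# and a different source could then be tried with, for example:
#   sysctl kern.timecounter.hardware=ACPI-fast
```

With the sample value this prints ACPI-fast and i8254. Whether switching away from HPET actually helps on the affected board is not established in this thread, so treat it as something to test, not as a confirmed fix.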