
Re: ntpd struggling to keep up - how to fix?

2010-02-23 Thread Peter Jeremy

On 2010-Feb-22 03:41:05 -0800, Jeremy Chadwick free...@jdc.parodius.com wrote:
ntpd under normal operation (not +/- 500ppm) figures out on its own the
average amount of drift, which is what ntpd.drift is for, correct?

Yes.  It takes a long time for ntpd to characterise the local system
clock.  Once it does so, it stores the calculated drift in ntp.drift
and updates it every hour or so.  This means that when ntpd is
restarted, it can immediately set its PLL to a reasonably close value,
rather than starting from scratch.

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-23 Thread Torfinn Ingolfsen

On Mon, 22 Feb 2010 07:17:42 +1100
Peter Jeremy peterjer...@acm.org wrote:

 On 2010-Feb-21 17:36:19 +0100, Torfinn Ingolfsen 
 torfinn.ingolf...@broadpark.no wrote:
 Over time (probably a couple of days from scratch), the poll rate
 should increase to 1024.  If it doesn't, it may indicate that your

Like so:
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 192.121.13.58    3 u  564 1024  377    0.163    3.018   3.196
-- 
Torfinn

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: ntpd struggling to keep up - how to fix?

2010-02-22 Thread perryh

Peter Jeremy peterjer...@acm.org wrote:

 ... Once ntpd decides to continuously step, something is broken.

Is there some reason why, as long as it is not yet synced, ntpd
should not do this sort of calculation and rate correction itself
rather than insist on having a human perform the calculation and
enter the adjustment?

Re: ntpd struggling to keep up - how to fix?

2010-02-22 Thread Peter Jeremy

On 2010-Feb-22 01:02:54 -0800, per...@pluto.rain.com wrote:
Peter Jeremy peterjer...@acm.org wrote:

 ... Once ntpd decides to continuously step, something is broken.

Is there some reason why, as long as it is not yet synced, ntpd
should not do this sort of calculation and rate correction itself
rather than insist on having a human perform the calculation and
enter the adjustment?

ntpd _does_ do this sort of calculation but the NTP algorithms
bound the PLL adjustment to +/-500ppm.  RFC1305 suggests that
a reasonable tolerance for board-mounted, uncompensated quartz-
crystal oscillators is 100ppm and therefore the +/-500ppm bound
is reasonable (see the RFC for the gory maths).
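
To put those ppm figures in perspective, here is a small sketch that converts a frequency error in ppm into seconds of drift per day. The function name is made up for illustration; the 2500ppm case is the error seen in this thread.

```python
SECONDS_PER_DAY = 86_400

def drift_per_day(ppm: float) -> float:
    """Seconds gained or lost per day by a clock that is `ppm` parts per million off."""
    return ppm * 1e-6 * SECONDS_PER_DAY

# Typical crystal tolerance, ntpd's PLL bound, and the error in this thread.
for ppm in (100, 500, 2500):
    print(f"{ppm:5d} ppm -> {drift_per_day(ppm):7.2f} s/day")
```

So the 500ppm bound corresponds to roughly 43 seconds per day, several times what an uncompensated crystal is expected to need.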

In this case, the op's clock was ~2500ppm slow - well outside
the NTP tolerance.  It was therefore necessary to change the
nominal timecounter frequency to bring it into lock range.  I
do not believe it is reasonable for ntpd to do this by itself:
- It should very rarely be needed since NTP should be able to
  compensate for normal tolerances.
- The actual local clock source and how to alter the kernel's
  idea of its nominal frequency is outside the purview of NTP.
- Giving ntpd free rein over the timecounter frequency runs
  the real risk of ntpd rendering the system unusable if ntpd
  becomes confused (or is misled) about the time.

Note that FreeBSD/i386 and /amd64 include 4 different possible
timecounters, only 3 of which can be tweaked.  Other FreeBSD
architectures will have different timecounters.  Other OSs may
have completely different mechanisms for handling the local
clock source.  Trying to embed knowledge of all these different
clock sources into ntpd would be unrealistic.

I look after over 100 assorted Unix hosts at home and work (HP
AlphaServers and Proliants, various Sun servers, Dell and whitebox PCs
and various laptops) and the worst drift rates I have seen previously
are:
- Sun T-2000 servers have a design flaw in the clock spectrum
  spreading so it appears to be ~250ppm fast.  Sun fixed this
  with a kernel patch that increases the nominal clock frequency.
- A Sun V20z is just over 100ppm out - I have tweaked the
  relevant timecounter to compensate for this (to avoid triggering
  my NTP frequency error alarms).
- 4 assorted Sun hosts that run 55-60ppm out.

At least based on my sample, the only hosts that were anywhere near
ntpd's tolerance limits were acknowledged to have a design problem
and the vendor provided a fix.  IMO, this is a better approach than
trying to make ntpd omniscient.

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-22 Thread Jeremy Chadwick

On Mon, Feb 22, 2010 at 10:18:10PM +1100, Peter Jeremy wrote:
 On 2010-Feb-22 01:02:54 -0800, per...@pluto.rain.com wrote:
 Peter Jeremy peterjer...@acm.org wrote:
 
  ... Once ntpd decides to continuously step, something is broken.
 
 Is there some reason why, as long as it is not yet synced, ntpd
 should not do this sort of calculation and rate correction itself
 rather than insist on having a human perform the calculation and
 enter the adjustment?
 
 ntpd _does_ do this sort of calculation but the NTP algorithms
 bound the PLL adjustment to +/-500ppm.  RFC1305 suggests that
 a reasonable tolerance for board-mounted, uncompensated quartz-
 crystal oscillators is 100ppm and therefore the +/-500ppm bound
 is reasonable (see the RFC for the gory maths).
 
 In this case, the op's clock was ~2500ppm slow - well outside
 the NTP tolerance.  It was therefore necessary to change the
 nominal timecounter frequency to bring it into lock range.  I
 do not believe it is reasonable for ntpd to do this by itself:
 - It should very rarely be needed since NTP should be able to
   compensate for normal tolerances.
 - The actual local clock source and how to alter the kernel's
   idea of its nominal frequency is outside the purview of NTP.
 - Giving ntpd free rein over the timecounter frequency runs
   the real risk of ntpd rendering the system unusable if ntpd
   becomes confused (or is misled) about the time.
 
 Note that FreeBSD/i386 and /amd64 include 4 different possible
 timecounters, only 3 of which can be tweaked.  Other FreeBSD
 architectures will have different timecounters.  Other OSs may
 have completely different mechanisms for handling the local
 clock source.  Trying to embed knowledge of all these different
 clock sources into ntpd would be unrealistic.
 
 I look after over 100 assorted Unix hosts at home and work (HP
 AlphaServers and Proliants, various Sun servers, Dell and whitebox PCs
 and various laptops) and the worst drift rates I have seen previously
 are:
 - Sun T-2000 servers have a design flaw in the clock spectrum
   spreading so it appears to be ~250ppm fast.  Sun fixed this
   with a kernel patch that increases the nominal clock frequency.
 - A Sun V20z is just over 100ppm out - I have tweaked the
   relevant timecounter to compensate for this (to avoid triggering
   my NTP frequency error alarms).
 - 4 assorted Sun hosts that run 55-60ppm out.
 
 At least based on my sample, the only hosts that were anywhere near
 ntpd's tolerance limits were acknowledged to have a design problem
 and the vendor provided a fix.  IMO, this is a better approach than
 trying to make ntpd omniscient.

A question with regards to the latter systems you mentioned (though I'm
speaking generally and not specifically with regards to those H/W
models), as I want to make sure I understand correctly:

ntpd under normal operation (not +/- 500ppm) figures out on its own the
average amount of drift, which is what ntpd.drift is for, correct?

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ntpd struggling to keep up - how to fix?

2010-02-22 Thread Peter Jeremy

On 2010-Feb-21 14:29:28 -0500, David Magda dma...@ee.ryerson.ca wrote:
For future reference, how does the math work? How do you go from  
taking a timer number:

   $ sysctl machdep.acpi_timer_freq
   machdep.acpi_timer_freq: 3577045

And the ntpd(8) time reset log entries to adjust the frequency? Or do  
you use the PPM output of the ntpdc(8) command?

I'm not quite sure I understand what happened here. :)

I'm using a combination of the ACPI frequency, time reset logs and
PLL frequency reported by the op:

On 2010-Feb-20 22:32:01 +0100, Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045
r...@kg-f2# tvlm
Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s
...
r...@kg-f2# ntpdc -c loopi -c sysi
offset:   0.00 s
frequency:500.000 ppm

Together with the assumptions that the system clock is stable (ie the
rate of drift is constant) and the syslog entries occurred at
precisely the times reported.  If the former assumption isn't true
(which was a distinct possibility given the size of error) then ntpd
isn't going to work.  If the latter assumption is incorrect then the
calculated clock skew will be incorrect - but hopefully enough to
bring it into ntpd capture range to allow later tweaking.

If ntpd cannot slew the local clock sufficiently, it will step the
clock roughly every 900 seconds, hence the regular time reset
messages.  Since we are assuming a stable clock, we can accumulate
the offsets in multiple reset messages to give a cumulative offset.
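
As a sanity check on that reasoning, the sketch below estimates how much offset builds up between steps for a given uncorrected (residual) drift. It uses the residual of about 1233ppm worked out later in this message and two of the intervals between consecutive log entries above (908 s and 964 s); the function name is only illustrative.

```python
def offset_accumulated(residual_ppm: float, interval_s: float) -> float:
    """Offset in seconds that builds up over `interval_s` at `residual_ppm` drift."""
    return residual_ppm * 1e-6 * interval_s

# 20:06:41 -> 20:21:49 is 908 s, 20:21:49 -> 20:37:53 is 964 s.
for interval in (908, 964):
    print(f"{interval} s at 1233 ppm -> {offset_accumulated(1233, interval):.3f} s")
```

The results land close to the +1.12 s and +1.19 s resets in the log, which fits the assumption of a constant drift rate.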

For the above figures, the clock drift (sum of time reset messages)
totals ~10.36 seconds over a period of 2:20:07 (the difference between
the kernel time sync and last time reset message).  [Note that I
somehow mistranscribed both the offset and duration in my last mail -
apologies for the confusion this might have caused].

10.36s in 2:20:07 == 10.36/8407 ~= 1.233e-3 or 1233ppm.  Thus ntpd is
reporting that the system clock is still 1233ppm slow, even with ntpd
pulling the system clock by its maximum of 500ppm.  Adding these gives
a total clock error of 1733ppm.

The nominal clock frequency used by the timecounter is 3577045Hz.
In order to calculate the actual clock frequency, we need to subtract the
clock error (1733ppm) from this frequency:
3577045Hz * (1 - 1733e-6) = 3570846Hz
(I rounded the clock error differently previously and got 3570847Hz).
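
The arithmetic above can be reproduced with a short script. This is only a sketch under the same assumptions (constant drift, accurate syslog timestamps, PLL pinned at +500ppm); the variable names are made up and the log lines are copied from the report quoted earlier in this message.

```python
import re
from datetime import datetime

LOG = """\
Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s
"""

RESET = re.compile(r"time reset ([+-]\d+\.\d+) s")

def stamp(line: str) -> datetime:
    # syslog timestamps carry no year; only the difference matters here
    return datetime.strptime(line[:15], "%b %d %H:%M:%S")

lines = LOG.splitlines()
elapsed_s = (stamp(lines[-1]) - stamp(lines[0])).total_seconds()
total_step_s = sum(float(m.group(1)) for m in map(RESET.search, lines) if m)

PLL_PPM = 500.0        # ntpd is already slewing at its maximum
NOMINAL_HZ = 3577045   # machdep.acpi_timer_freq at the time of the log

residual_ppm = total_step_s / elapsed_s * 1e6
error_ppm = residual_ppm + PLL_PPM
actual_hz = NOMINAL_HZ * (1 - error_ppm * 1e-6)

print(f"steps total {total_step_s:.2f} s over {elapsed_s:.0f} s")
print(f"residual {residual_ppm:.0f} ppm, total error {error_ppm:.0f} ppm")
print(f"suggested machdep.acpi_timer_freq: {actual_hz:.0f}")
```

Without rounding the intermediate ppm value, this comes out at 3570847, the figure given earlier in the thread.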

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-21 Thread Matthew Seaman


On 20/02/2010 23:42, Jeremy Chadwick wrote:
 For sake of example -- look at ntpq's delay column for each peer, and
 then look at the same column but for ntpdc.  You'll see that for ntpdc
 they're divided by 1000 (presumably kern.hz rate):

No -- those are just times measured in milliseconds (for ntpq) or
seconds (for ntpdc). kern.hz doesn't come into it.
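
A quick check of the unit difference, using the delay and offset of one peer from the ntpq and ntpdc listings quoted in this thread (36.322 ms and -0.552 ms in ntpq; 0.03632 s and -0.000552 s in ntpdc). The helper name is only for illustration.

```python
def ms_to_s(milliseconds: float) -> float:
    """Convert an ntpq-style value in milliseconds to ntpdc-style seconds."""
    return milliseconds / 1000.0

ntpq_delay_ms, ntpq_offset_ms = 36.322, -0.552

# ntpdc prints delay with 5 decimals and offset with 6 in the listing above
print(f"delay  {ms_to_s(ntpq_delay_ms):.5f} s")
print(f"offset {ms_to_s(ntpq_offset_ms):.6f} s")
```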

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
  Kent, CT11 9PW

Re: ntpd struggling to keep up - how to fix?

2010-02-21 Thread Torfinn Ingolfsen

On Sun, 21 Feb 2010 16:08:23 +1100
Peter Jeremy peterjer...@acm.org wrote:

 That's definitely not good - though it's marginally better than before.
 I have checked on a local machine and the timecounter frequency definitely
 needs to be adjusted in the opposite direction to the ntpd drift.
 
 I think I see the problem: I suggested 3579545Hz - 2500ppm, which
 gives an ACPI frequency of 3570596Hz.  There was some miscommunication
 and you have set an ACPI frequency of 3577045Hz which is 2500Hz (or
 698ppm) lower.  The drift reported by the time resets has gone from
 +1930ppm (14.5s in 2:05:17) to +1233ppm (8.4s in 2:20:06) - which is
 697ppm - fairly close to the change you made.  (The PLL is running
 at +500ppm so the actual clock offset is 500ppm more than the time
 reset reports suggest.)

Very good info, it helps me understand more. Thanks!

 Having re-checked my maths, using both your time reset results, can
 you please try:
   sysctl machdep.acpi_timer_freq=3570847

Ok, trying that now:
r...@kg-f2# sysctl machdep.acpi_timer_freq=3570847
machdep.acpi_timer_freq: 3577045 -> 3570847
r...@kg-f2# /etc/rc.d/ntpd stop
Stopping ntpd.
r...@kg-f2# rm /var/db/ntpd.drift
r...@kg-f2# /etc/rc.d/ntpd start
Starting ntpd.


 That should result in a drift of close to zero (well within NTP's
 lock range of +/- 300ppm).

Good.

 No.  Once ntpd decides to continuously step, something is broken.

Aha, very good to know.
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-21 Thread Torfinn Ingolfsen

On Sun, 21 Feb 2010 16:08:23 +1100
Peter Jeremy peterjer...@acm.org wrote:

 Having re-checked my maths, using both your time reset results, can
 you please try:
   sysctl machdep.acpi_timer_freq=3570847
 That should result in a drift of close to zero (well within NTP's
 lock range of +/- 300ppm).

And a few hours later: from /var/log/messages:
Feb 21 09:54:50 kg-f2 ntpd[55452]: ntpd 4.2.4p5-a (1)
Feb 21 09:59:10 kg-f2 ntpd[55453]: kernel time sync status change 2001

More info:
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160
r...@kg-f2# ntpdc -c loopi -c sysi
offset:   -0.010253 s
frequency:6.744 ppm
poll adjust:  -30
watchdog timer:   47 s
system peer:  kg-omni1.kg4.no
system peer mode: client
leap indicator:   00
stratum:  4
precision:-18
root distance:0.02956 s
root dispersion:  0.06795 s
reference ID: [10.1.10.1]
reference time:   cf2bdf36.f8820aef  Sun, Feb 21 2010 17:35:02.970
system flags: auth monitor ntp kernel stats 
jitter:   0.000153 s
stability:0.000 ppm
broadcastdelay:   0.003998 s
authdelay:0.00 s

Problem solved.
Thanks a lot.
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-21 Thread David Magda



On Feb 21, 2010, at 11:36, Torfinn Ingolfsen wrote:

On Sun, 21 Feb 2010 16:08:23 +1100 Peter Jeremy  
peterjer...@acm.org wrote:



Having re-checked my maths, using both your time reset results, can
you please try:
 sysctl machdep.acpi_timer_freq=3570847
That should result in a drift of close to zero (well within NTP's
lock range of +/- 300ppm).


And a few hours later: from /var/log/messages:
Feb 21 09:54:50 kg-f2 ntpd[55452]: ntpd 4.2.4p5-a (1)
Feb 21 09:59:10 kg-f2 ntpd[55453]: kernel time sync status change 2001

More info:
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160

r...@kg-f2# ntpdc -c loopi -c sysi
offset:   -0.010253 s
frequency:6.744 ppm
poll adjust:  -30
watchdog timer:   47 s

[...]

For future reference, how does the math work? How do you go from  
taking a timer number:


$ sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045

And the ntpd(8) time reset log entries to adjust the frequency? Or do  
you use the PPM output of the ntpdc(8) command?


I'm not quite sure I understand what happened here. :)


Re: ntpd struggling to keep up - how to fix?

2010-02-21 Thread Peter Jeremy

On 2010-Feb-21 17:36:19 +0100, Torfinn Ingolfsen 
torfinn.ingolf...@broadpark.no wrote:
*kg-omni1.kg4.no 78.157.115.4     3 u   31   64  377    0.174  -10.253   0.160
r...@kg-f2# ntpdc -c loopi -c sysi
offset:   -0.010253 s
frequency:6.744 ppm

That looks much healthier though it doesn't explain why your system
clock is ~2500ppm out to start with.  You may get one further small
(~128msec) time step as part of ntpd's PLL calibration.

Over time (probably a couple of days from scratch), the poll rate
should increase to 1024.  If it doesn't, it may indicate that your
system clock stability isn't very good or you have excessive jitter
in your reference.

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-20 Thread Torfinn Ingolfsen

On Sat, 20 Feb 2010 12:53:51 +1100
Peter Jeremy peterjer...@acm.org wrote:

 Looks reasonable.  Let us know the results.  I'd be interested in
 the output from ntpdc -c loopi -c sysi.

Ok, here we go (the server panic'ed again last night):
r...@kg-f2# uptime
10:28PM  up  2:26, 3 users, load averages: 0.00, 0.00, 0.00
r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045
r...@kg-f2# tvlm
Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 kg-omni1.kg4.no 129.240.64.3     3 u    8   64    7    0.176  133.306  77.731
r...@kg-f2# ntpdc -c loopi -c sysi
offset:   0.00 s
frequency:500.000 ppm
poll adjust:  4
watchdog timer:   194 s
system peer:  0.0.0.0
system peer mode: unspec
leap indicator:   11
stratum:  16
precision:-18
root distance:0.0 s
root dispersion:  0.00290 s
reference ID: [83.84.69.80]
reference time:   .  Thu, Feb  7 2036  7:28:16.000
system flags: auth monitor ntp kernel stats 
jitter:   0.358109 s
stability:0.000 ppm
broadcastdelay:   0.003998 s
authdelay:0.00 s

Not synced at all. Not good. :-/
Perhaps I should give it more time?
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-20 Thread Torfinn Ingolfsen

On Sat, 20 Feb 2010 22:32:01 +0100
Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:


This output looks ... wrong ... somehow to my eyes:
r...@kg-f2# date
Sat Feb 20 22:51:24 CET 2010
r...@kg-f2# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*kg-omni1.kg4.no 129.240.64.3     3 u   62   64  377    0.244  597.314 360.123
r...@kg-f2# ntpdc -c loopi -c sysi
offset:   0.00 s
frequency:500.000 ppm
poll adjust:  4
watchdog timer:   549 s
system peer:  kg-omni1.kg4.no
system peer mode: client
leap indicator:   11
stratum:  16
precision:-18
root distance:0.0 s
root dispersion:  0.00822 s
reference ID: [10.1.10.1]
reference time:   .  Thu, Feb  7 2036  7:28:16.000
system flags: auth monitor ntp kernel stats 
jitter:   0.360107 s
stability:0.000 ppm
broadcastdelay:   0.003998 s
authdelay:0.00 s

Shouldn't ntpq and ntpdc be in agreement?
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-20 Thread Jeremy Chadwick

On Sat, Feb 20, 2010 at 10:55:21PM +0100, Torfinn Ingolfsen wrote:
 On Sat, 20 Feb 2010 22:32:01 +0100
 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
 
 
 This output looks ... wrong ... somehow to my eyes:
 r...@kg-f2# date
 Sat Feb 20 22:51:24 CET 2010
 r...@kg-f2# ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 *kg-omni1.kg4.no 129.240.64.3     3 u   62   64  377    0.244  597.314 360.123
 r...@kg-f2# ntpdc -c loopi -c sysi
 offset:   0.00 s
 frequency:500.000 ppm
 poll adjust:  4
 watchdog timer:   549 s
 system peer:  kg-omni1.kg4.no
 system peer mode: client
 leap indicator:   11
 stratum:  16
 precision:-18
 root distance:0.0 s
 root dispersion:  0.00822 s
 reference ID: [10.1.10.1]
 reference time:   .  Thu, Feb  7 2036  7:28:16.000
 system flags: auth monitor ntp kernel stats 
 jitter:   0.360107 s
 stability:0.000 ppm
 broadcastdelay:   0.003998 s
 authdelay:0.00 s
 
 Shouldn't ntpq and ntpdc be in agreement?

ntpq and ntpdc output data in slightly different formats, depending on
what arguments you give them.  I'm not familiar with the loopi or sysi
commands; Peter should be able to help here.

For sake of example -- look at ntpq's delay column for each peer, and
then look at the same column but for ntpdc.  You'll see that for ntpdc
they're divided by 1000 (presumably kern.hz rate):

$ ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+clock-a.develoo 204.123.2.72     2 u  476  512  377   25.287   -0.852   0.550
-enigma.wiredgoa 209.81.9.7       2 u  185  512  377   14.754    0.284   0.688
+mtnlion.com     139.78.135.14    2 u  208  512  377   30.788   -0.233   0.160
*ntp1.phoenixpub .LCL.            1 u  179  512  377   36.322   -0.552   0.522
-ntp-1.gw.uiuc.e 128.174.38.133   2 u  141  512  377   77.321   -5.381   0.328
-tick.jrc.us     172.21.0.14      2 u  149  512  377  112.424   -8.110   1.440

$ ntpdc -c peers
     remote           local      st poll reach  delay   offset    disp
=======================================================================
*mailserv1.phoen 192.168.1.51     1  512  377 0.03632 -0.000552 0.09666
=clock-a.develoo 192.168.1.51     2  512  377 0.02528 -0.000852 0.08611
=tick.jrc.us     192.168.1.51     2  512  377 0.11241 -0.008110 0.08615
=enigma.wiredgoa 192.168.1.51     2  512  377 0.01474  0.000284 0.11473
=mtnlion.com     192.168.1.51     2  512  377 0.03078 -0.000233 0.09665
=ntp-1.gw.uiuc.e 192.168.1.51     2  512  377 0.07732 -0.005381 0.10612

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ntpd struggling to keep up - how to fix?

2010-02-20 Thread Peter Jeremy

On 2010-Feb-20 22:32:01 +0100, Torfinn Ingolfsen 
torfinn.ingolf...@broadpark.no wrote:
On Sat, 20 Feb 2010 12:53:51 +1100
Peter Jeremy peterjer...@acm.org wrote:

 Looks reasonable.  Let us know the results.  I'd be interested in
 the output from ntpdc -c loopi -c sysi.

Ok, here we go (the server panic'ed again last night):
r...@kg-f2# uptime
10:28PM  up  2:26, 3 users, load averages: 0.00, 0.00, 0.00
r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3577045
r...@kg-f2# tvlm
Feb 20 20:06:41 kg-f2 ntpd[942]: kernel time sync status change 2001
Feb 20 20:21:49 kg-f2 ntpd[942]: time reset +1.118880 s
Feb 20 20:37:53 kg-f2 ntpd[942]: time reset +1.188538 s
Feb 20 20:53:03 kg-f2 ntpd[942]: time reset +1.121903 s
Feb 20 21:09:00 kg-f2 ntpd[942]: time reset +1.179924 s
Feb 20 21:24:57 kg-f2 ntpd[942]: time reset +1.178490 s
Feb 20 21:39:58 kg-f2 ntpd[942]: time reset +1.110647 s
Feb 20 21:55:53 kg-f2 ntpd[942]: time reset +1.177292 s
Feb 20 22:11:44 kg-f2 ntpd[942]: time reset +1.172358 s
Feb 20 22:26:48 kg-f2 ntpd[942]: time reset +1.114350 s

That's definitely not good - though it's marginally better than before.
I have checked on a local machine and the timecounter frequency definitely
needs to be adjusted in the opposite direction to the ntpd drift.

I think I see the problem: I suggested 3579545Hz - 2500ppm, which
gives an ACPI frequency of 3570596Hz.  There was some miscommunication
and you have set an ACPI frequency of 3577045Hz which is 2500Hz (or
698ppm) lower.  The drift reported by the time resets has gone from
+1930ppm (14.5s in 2:05:17) to +1233ppm (8.4s in 2:20:06) - which is
697ppm - fairly close to the change you made.  (The PLL is running
at +500ppm so the actual clock offset is 500ppm more than the time
reset reports suggest.)
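
The cross-check in the paragraph above can be sketched as follows: express the 2500Hz change as ppm of the nominal ACPI frequency and compare it with the change in drift seen in the time reset logs. The numbers come from this message; the variable names are made up for illustration.

```python
NOMINAL_HZ = 3579545            # standard ACPI timer frequency
APPLIED_HZ = 3577045            # what was actually set (2500 Hz lower)

# The change that was applied, expressed in ppm of the nominal frequency.
applied_change_ppm = (NOMINAL_HZ - APPLIED_HZ) / NOMINAL_HZ * 1e6

# Drift inferred from the time resets before and after the change.
drift_before_ppm = 1930
drift_after_ppm = 1233
observed_change_ppm = drift_before_ppm - drift_after_ppm

print(f"applied change:  {applied_change_ppm:.0f} ppm")
print(f"observed change: {observed_change_ppm} ppm")
```

The two figures agree to within a ppm or so, which supports adjusting the timecounter frequency in this direction, just by a larger amount.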

Having re-checked my maths, using both your time reset results, can
you please try:
  sysctl machdep.acpi_timer_freq=3570847
That should result in a drift of close to zero (well within NTP's
lock range of +/- 300ppm).

frequency:500.000 ppm

And this is definitely not good.

Not synced at all. Not good. :-/
Perhaps I should give it more time?

No.  Once ntpd decides to continuously step, something is broken.

I've done some double-checking and 
On 2010-Feb-20 22:55:21 +0100, Torfinn Ingolfsen 
torfinn.ingolf...@broadpark.no wrote:
This output looks ... wrong ... somehow to my eyes:
...
Shouldn't ntpq and ntpdc be in agreement?

I'm not sure which particular bits you are concerned about but ntpq
reports delay/offset/jitter in msec whilst ntpdc reports them in sec.

Note that I can't explain why the loopi offset is zero - ntpdc(8)
states that this is the last offset given to the loop filter by the
packet processing code.  For me it's non-zero but doesn't quite
match the offset reported by 'ntpq -p'.

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-19 Thread Peter Jeremy

On 2010-Feb-19 00:38:44 +0100, Torfinn Ingolfsen 
torfinn.ingolf...@broadpark.no wrote:
r...@kg-f2# sysctl machdep.acpi_timer_freq=3577045
machdep.acpi_timer_freq: 3579545 -> 3577045

Looks reasonable.  Let us know the results.  I'd be interested in
the output from ntpdc -c loopi -c sysi.

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-18 Thread Peter Jeremy

On 2010-Feb-17 20:03:22 +0100, Torfinn Ingolfsen 
torfinn.ingolf...@broadpark.no wrote:
On Wed, 17 Feb 2010 19:49:27 +0100
Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:

 Unfortunately, it isn't enough to keep the machine in sync all the time.
 But it is better than HPET so I'll keep it.

Did you delete /etc/ntp.drift between timecounter changes?

This thread is interesting:
http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html

Is there a way in FreeBSD to perform adjustments like adjtimex?

There's ntptime(8) but it doesn't have a self-calibrate mode.

Based on the messages log you gave, and assuming the ntpd PLL is sane,
your acpi-safe clock is about 2500ppm slow (the steps reflect about
2000ppm and the ntpd PLL should be compensating for a further 500ppm)
- this is really bad, even for consumer-grade stuff.  Are you running
non-standard clock speeds or multipliers?

If there's nothing obvious, I'd follow John Hay's suggestion and
force set either your TSC or ACPI frequency in sysctl.conf (you
can't override the HPET frequency).

Take either the TSC or ACPI frequency reported by sysctl machdep,
reduce it by 2500ppm and set that in /etc/sysctl.conf.  Assuming
a standard (3.58MHz) ACPI, the latter would look like:

machdep.acpi_timer_freq=3570596
kern.timecounter.hardware=ACPI-safe

Then stop ntpd, delete /var/db/ntpd.drift and either reboot or
manually set the above sysctls and restart ntpd.

[I think I've got the adjustment direction correct in the above, if
I've stuffed up, you need to adjust in the other direction]
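
A minimal sketch of the calculation behind that sysctl.conf fragment, assuming the reported frequency is the standard 3579545Hz and the clock is about 2500ppm slow. The helper names are hypothetical; only the two sysctl names come from this message.

```python
def reduced_frequency(reported_hz: int, slow_ppm: float) -> int:
    """Reported timer frequency lowered by `slow_ppm` parts per million."""
    return round(reported_hz * (1 - slow_ppm * 1e-6))

def sysctl_conf_lines(reported_hz: int, slow_ppm: float) -> list[str]:
    """Lines to place in /etc/sysctl.conf for the ACPI timecounter."""
    return [
        f"machdep.acpi_timer_freq={reduced_frequency(reported_hz, slow_ppm)}",
        "kern.timecounter.hardware=ACPI-safe",
    ]

print("\n".join(sysctl_conf_lines(3579545, 2500)))
```

Note that the reduction is a fraction of the reported frequency (2500ppm of 3579545Hz is about 8949Hz), not a flat 2500Hz.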

-- 
Peter Jeremy



Re: ntpd struggling to keep up - how to fix?

2010-02-18 Thread Torfinn Ingolfsen

On Fri, 19 Feb 2010 07:54:58 +1100
Peter Jeremy peterjer...@acm.org wrote:

 On 2010-Feb-17 20:03:22 +0100, Torfinn Ingolfsen 
 torfinn.ingolf...@broadpark.no wrote:
 Did you delete /etc/ntp.drift between timecounter changes?

I sure did, I used the instructions given.

 There's ntptime(8) but it doesn't have a self-calibrate mode.

Ok, good to know.

 Based on the messages log you gave, and assuming the ntpd PLL is sane,
 your acpi-safe clock is about 2500ppm slow (the steps reflect about
 2000ppm and the ntpd PLL should be compensating for a further 500ppm)
 - this is really bad, even for consumer-grade stuff.  Are you running
 non-standard clock speeds or multipliers?

No, everything is at default values here (i.e. I haven't changed anything in
either BIOS or FreeBSD), apart from changing the timecounter from HPET to
ACPI-safe.

 If there's nothing obvious, I'd follow John Hay's suggestion and
 force set either your TSC or ACPI frequency in sysctl.conf (you
 can't override the HPET frequency).
 
 Take either the TSC or ACPI frequency reported by sysctl machdep,
 reduce it by 2500ppm and set that in /etc/sysctl.conf.  Assuming
 a standard (3.58MHz) ACPI, the latter would look like:
 
 machdep.acpi_timer_freq=3570596

This one is 
r...@kg-f2# sysctl machdep.acpi_timer_freq
machdep.acpi_timer_freq: 3579545

So I should change that to 3577045, right?
Like so:
r...@kg-f2# sysctl machdep.acpi_timer_freq=3579545
machdep.acpi_timer_freq: 3579545 -> 3579545

and I put it into /etc/sysctl.conf as well (in case the machine reboots again).

 kern.timecounter.hardware=ACPI-safe

Yes, this is already in /etc/sysctl.conf


 Then stop ntpd, delete /var/db/ntpd.drift and either reboot or
 manually set the above sysctls and restart ntpd.

Done. We'll see if it works or not.

 [I think I've got the adjustment direction correct in the above, if
 I've stuffed up, you need to adjust in the other direction]

Ok.

Thanks to all for helping out.
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-18 Thread Torfinn Ingolfsen

On Thu, 18 Feb 2010 23:12:23 +0100
Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:

 So I should change that to 3577045, right?
 Like so:
 r...@kg-f2# sysctl machdep.acpi_timer_freq=3579545
 machdep.acpi_timer_freq: 3579545 -> 3579545

Eh... I just realized that I did it wrong. Well, that's what cut and paste
will do to you, if you don't pay attention. ;)

Ok, I will try to do it right now:
r...@kg-f2# sysctl machdep.acpi_timer_freq=3577045
machdep.acpi_timer_freq: 3579545 -> 3577045

r...@kg-f2# /etc/rc.d/ntpd stop
Stopping ntpd.
r...@kg-f2# ll /var/db/ntp*
-rw-r--r--  1 root  wheel  8 Feb 13 20:27 /var/db/ntpd.drift
r...@kg-f2# rm /var/db/ntpd.drift
r...@kg-f2# ll /var/db/ntp*
ls: /var/db/ntp*: No such file or directory
r...@kg-f2# /etc/rc.d/ntpd start
Starting ntpd.

and fixing it in /etc/sysctl.conf too.
Done.
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-17 Thread Torfinn Ingolfsen

On Fri, 12 Feb 2010 17:44:52 +0100
Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:

 On Fri, 12 Feb 2010 05:11:17 -0800
 Jeremy Chadwick free...@jdc.parodius.com wrote:
 
  Please try doing this:
  
  - stop ntpd
  - rm /var/db/ntpd.drift
  - sysctl kern.timecounter.hardware=ACPI-safe
  - start ntpd
 
 Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

Well, using ACPI-safe only gives a small improvement;
here are the lines from /var/log/messages:
Feb 17 17:16:47 kg-f2 ntpd[912]: time reset +1.785920 s
Feb 17 17:32:39 kg-f2 ntpd[912]: time reset +1.836376 s
Feb 17 17:48:18 kg-f2 ntpd[912]: time reset +1.811593 s
Feb 17 18:04:12 kg-f2 ntpd[912]: time reset +1.840545 s
Feb 17 18:19:19 kg-f2 ntpd[912]: time reset +1.751837 s
Feb 17 18:35:19 kg-f2 ntpd[912]: time reset +1.852328 s
Feb 17 18:51:18 kg-f2 ntpd[912]: time reset +1.850928 s
Feb 17 19:06:50 kg-f2 ntpd[912]: time reset +1.798706 s
Feb 17 19:22:35 kg-f2 ntpd[912]: time reset +1.823697 s
Feb 17 19:37:56 kg-f2 ntpd[912]: time reset +1.777376 s

Unfortunately, it isn't enough to keep the machine in sync all the time.
But it is better than HPET so I'll keep it.
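
A rough drift rate can be read off log lines like these. The following Python sketch assumes the syslog format shown above, assumes the year 2010 (syslog timestamps carry none), and ignores whatever ntpd slews between steps, so the result is best treated as a lower bound. The names RESET_RE and drift_ppm are illustrative only.

```python
import re
from datetime import datetime

RESET_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ ntpd\[\d+\]: time reset ([+-]\d+\.\d+) s"
)

LOG = """\
Feb 17 17:16:47 kg-f2 ntpd[912]: time reset +1.785920 s
Feb 17 17:32:39 kg-f2 ntpd[912]: time reset +1.836376 s
Feb 17 17:48:18 kg-f2 ntpd[912]: time reset +1.811593 s
Feb 17 18:04:12 kg-f2 ntpd[912]: time reset +1.840545 s
Feb 17 18:19:19 kg-f2 ntpd[912]: time reset +1.751837 s
Feb 17 18:35:19 kg-f2 ntpd[912]: time reset +1.852328 s
Feb 17 18:51:18 kg-f2 ntpd[912]: time reset +1.850928 s
Feb 17 19:06:50 kg-f2 ntpd[912]: time reset +1.798706 s
Feb 17 19:22:35 kg-f2 ntpd[912]: time reset +1.823697 s
Feb 17 19:37:56 kg-f2 ntpd[912]: time reset +1.777376 s
"""


def drift_ppm(log_lines, year=2010):
    """Estimate clock drift in ppm from consecutive 'time reset' lines.

    Each step after the first corrects the error accumulated since the
    previous step, so drift ~= sum(steps after the first) / elapsed time.
    """
    events = []
    for line in log_lines:
        m = RESET_RE.match(line.strip())
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, float(m.group(2))))
    if len(events) < 2:
        raise ValueError("need at least two 'time reset' lines")
    elapsed = (events[-1][0] - events[0][0]).total_seconds()
    stepped = sum(step for _, step in events[1:])
    return stepped / elapsed * 1_000_000


if __name__ == "__main__":
    print(f"{drift_ppm(LOG.splitlines()):.0f} ppm")
```

For the ten lines above this comes out a little over 1900 ppm, well beyond the 500 ppm that ntpd can correct by slewing alone.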
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-17 Thread Torfinn Ingolfsen

On Wed, 17 Feb 2010 19:49:27 +0100
Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:

 Unfortunately, it isn't enough to keep the machine in sync all the time.
 But it is better than HPET so I'll keep it.

This thread is interesting:
http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html

Is there a way in FreeBSD to perform adjustments like adjtimex?
'apropos adjtime' only gives me a system call, and
the man pages for hz(9) and hardclock(9) don't exist on 8.0-stable
(or on 7.2-stable).

-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-17 Thread John Hay

On Wed, Feb 17, 2010 at 08:03:22PM +0100, Torfinn Ingolfsen wrote:
 On Wed, 17 Feb 2010 19:49:27 +0100
 Torfinn Ingolfsen torfinn.ingolf...@broadpark.no wrote:
 
  Unfortunately, it isn't enough to keep the machine in sync all the time.
  But it is better than HPET so I'll keep it.
 
 This thread is interesting:
 http://lkml.indiana.edu/hypermail/linux/kernel/0903.1/01356.html
 
 Is there a way in FreeBSD to perform adjustments like adjtimex?
 'apropos adjtime' only gives me a system call, and
 the man pages for hz(9) and hardclock(9) don't exist on 8.0-stable
 (or on 7.2-stable).

You can set the timecounter frequency with sysctl. On my one time
server I have these lines in /etc/sysctl.conf

machdep.tsc_freq=132658584
kern.timecounter.hardware=TSC

John
-- 
John Hay -- j...@meraka.csir.co.za / j...@freebsd.org

Re: ntpd struggling to keep up - how to fix?

2010-02-15 Thread David Malone

On Fri, Feb 12, 2010 at 11:46:04AM -0800, Jeremy Chadwick wrote:
 Technical footnote: I wish I understood 1) the difference between
 ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the
 timecounters (the higher the value in parenthesis, supposedly the more
 accurate/preferred it is).  Xin, do you happen to know how this works?

1) When you read the ACPI timing register, you should get a sensible
answer. However on some (most?) hardware, you can read the register
and get it half way through an update. When the kernel finds the
ACPI timer, it tries reading it a few times in a row, and checks
the results look good - if they do, you get ACPI-fast. If it catches
a half-updated register, then you get ACPI-safe, which reads the
register multiple times in an effort to avoid the problem.

2) The ranking of timers is essentially hard wired, though for some
timers it is adjusted in some way. For example, the ranking of the
TSC may be reduced if it looks like an SMP system. I believe the
ranking was originally intended to be a measure of how fast the
counter could be read, but things have turned out to be complicated
by difficult hardware.
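
The "read it several times" idea from point 1 can be illustrated with a small Python simulation. This is a sketch of the general technique only, not the kernel's code: the function name and the simulated register are assumptions, and counter wrap-around is ignored.

```python
from typing import Callable


def read_counter_safe(read_register: Callable[[], int]) -> int:
    """Read a counter that may occasionally return a torn value.

    Keeps a sliding window of three consecutive reads and accepts the
    middle one only once the three are in non-decreasing order.
    Counter wrap-around is deliberately ignored in this sketch.
    """
    u2 = read_register()
    u3 = read_register()
    while True:
        u1, u2, u3 = u2, u3, read_register()
        if u1 <= u2 <= u3:
            return u2


if __name__ == "__main__":
    # Simulated register: the third read (5) is a torn value.
    reads = iter([100, 101, 5, 103, 104, 105])
    print(read_counter_safe(lambda: next(reads)))
```

The cost of this approach is visible in the sketch: every call needs at least three register reads, which is presumably why the variant that skips the check is the "fast" one.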

David.

Re: ntpd struggling to keep up - how to fix?

2010-02-13 Thread Torfinn Ingolfsen

On Fri, 12 Feb 2010 11:46:04 -0800
Jeremy Chadwick free...@jdc.parodius.com wrote:

 override this though!  :-) ), which -- assuming it works -- should
 solve your problem.

We'll see. The box rebooted again last night (see another thread on this
mailing list), so now I have added kern.timecounter.hardware=ACPI-safe
to /etc/sysctl.conf, just in case it reboots again.

 Technical footnote: I wish I understood 1) the difference between
 ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the

I'm still wondering why this machine doesn't have ACPI-fast:
o...@kg-f2# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) 
dummy(-100)

While my workstation does:
ti...@kg-v2$ sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) 
dummy(-100)
and another machine:
r...@kg-quiet# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-100)
and another:
r...@kg-vm# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-100)

Probably a BIOS / acpi problem.
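
To compare several machines quickly, the kern.timecounter.choice string can be parsed and the highest-quality entry picked out. The Python sketch below assumes the "NAME(quality)" format shown above and assumes that the highest number wins, which matches what these machines selected; the function names are illustrative.

```python
import re


def parse_choices(choice: str) -> dict[str, int]:
    """Parse 'NAME(quality) ...' as printed by kern.timecounter.choice."""
    pairs = re.findall(r"([^\s(]+)\((-?\d+)\)", choice)
    return {name: int(quality) for name, quality in pairs}


def preferred(choice: str) -> str:
    """Return the timecounter name with the highest quality value."""
    qualities = parse_choices(choice)
    return max(qualities, key=qualities.get)


if __name__ == "__main__":
    kg_f2 = "TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-100)"
    kg_v2 = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)"
    print(preferred(kg_f2))
    print(preferred(kg_v2))
```

On kg-f2 this picks HPET, which is what kern.timecounter.hardware reported there before it was overridden.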
-- 
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Torfinn Ingolfsen

On Thu, 11 Feb 2010 10:25:59 -0800
Chuck Swiger cswi...@mac.com wrote:

 The rate at which this machine is losing time is probably exceeding the ~50 
 seconds per day that NTPd is willing to correct without extreme measures (ie, 
 it has to step time rather than drift-correct).   You might help it maintain 
 a more sane idea of time by using at least 4 timeservers.

Hmm, OK, that is something I can try.

 You might take a look at 'vmstat -i' and look out for an interrupt storm, but 
 it's possible your hardware's clock is simply busted.  

AFAICT, vmstat -i looks ok:
r...@kg-f2# uptime
 1:23PM  up 18:31, 3 users, load averages: 0.00, 0.00, 0.00
r...@kg-f2# vmstat -i
interrupt                          total       rate
irq1: atkbd0                          36          0
irq6: fdc0                             1          0
irq16: siis0 ohci0+                  408          0
irq22: atapci0                    856338         12
cpu0: timer                    133347678       1999
irq256: re0                       234087          3
cpu1: timer                        17654       1999
Total                          267776202       4016
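
Checking 'vmstat -i' output for a runaway interrupt source can be scripted. The Python sketch below assumes the three-column layout shown above; the 10000 interrupts/second threshold is an arbitrary assumption for illustration, not a FreeBSD definition of an interrupt storm.

```python
import re

ROW_RE = re.compile(r"^(?P<source>.+?)\s+(?P<total>\d+)\s+(?P<rate>\d+)$")


def busy_sources(vmstat_output: str, max_rate: int = 10_000) -> list[str]:
    """Return interrupt sources whose per-second rate exceeds max_rate."""
    busy = []
    for line in vmstat_output.splitlines():
        m = ROW_RE.match(line.strip())
        # The header line has no numeric columns and so never matches;
        # the "Total" summary row is skipped explicitly.
        if m and m["source"] != "Total" and int(m["rate"]) > max_rate:
            busy.append(m["source"])
    return busy


if __name__ == "__main__":
    sample = """\
interrupt                          total       rate
irq22: atapci0                    856338         12
cpu0: timer                    133347678       1999
Total                          134204016       2011
"""
    print(busy_sources(sample))
```

With rates like the ones above (timers at about 2000 per second, everything else near zero) nothing is flagged.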


-- 
Regards,
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Torfinn Ingolfsen

On Thu, 11 Feb 2010 11:25:15 -0800
Jeremy Chadwick free...@jdc.parodius.com wrote:

 
 Your machine has a rapidly drifting clock, usually an indicator of a
 hardware problem (crystal gone bad is a common one -- seen this at work
 quite a few times), or possibly a bad time counter source chosen by the
 kernel.  Can you please provide the output of:
 
 sysctl kern.timecounter

Here it is:
r...@kg-f2# sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) 
dummy(-100)
kern.timecounter.hardware: HPET
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 52444
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-safe.mask: 4294967295
kern.timecounter.tc.ACPI-safe.counter: 3252982815
kern.timecounter.tc.ACPI-safe.frequency: 3579545
kern.timecounter.tc.ACPI-safe.quality: 850
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 3443625641
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 1276479615
kern.timecounter.tc.TSC.frequency: 2819782573
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

 Finally, was this OS installation used on different hardware in the
 past?  Meaning: was the hard disk previously installed on another
 machine?

Nope. Brand new hw, hard drive, and FreeBSD 8.0-release install. Then I 
upgraded to 8.0-stable.

  Why I'm asking: /var/db/ntpd.drift could be from an old
 computer (the previous hardware), and the clock drift rate would be
 different than that of your newer[1] hardware. 

No, /var/db/ntpd.drift was created on this machine.
-- 
Regards,
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Torfinn Ingolfsen

On Thu, 11 Feb 2010 22:49:59 +0100
Stefan Krueger stadtki...@gmx.de wrote:

 I have the same problem on my machine (also AMD, running 8.0 +
 patches), after a while ntpd gives up sync'ing and then the time is off
 by minutes (roughly 80sec after.. say 10 hours) :(

FWIW, I have several other AMD systems, from Asus, MSI and
Gigabyte. None of them have problems with ntp.

 I switched to openntpd (you can find it in ports) and the clock stays
 in sync now, so you might want to consider that, too

I will if that is the only solution. However, if there is something
wrong with my setup / configuration of this machine, I'd rather fix
that.
-- 
Regards,
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Jeremy Chadwick

On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote:
 On Thu, 11 Feb 2010 11:25:15 -0800
 Jeremy Chadwick free...@jdc.parodius.com wrote:
 
  
  Your machine has a rapidly drifting clock, usually an indicator of a
  hardware problem (crystal gone bad is a common one -- seen this at work
  quite a few times), or possibly a bad time counter source chosen by the
  kernel.  Can you please provide the output of:
  
  sysctl kern.timecounter
 
 Here it is:
 r...@kg-f2# sysctl kern.timecounter
 kern.timecounter.tick: 1
 kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) 
 dummy(-100)
 kern.timecounter.hardware: HPET
 kern.timecounter.stepwarnings: 0
 kern.timecounter.tc.i8254.mask: 65535
 kern.timecounter.tc.i8254.counter: 52444
 kern.timecounter.tc.i8254.frequency: 1193182
 kern.timecounter.tc.i8254.quality: 0
 kern.timecounter.tc.ACPI-safe.mask: 4294967295
 kern.timecounter.tc.ACPI-safe.counter: 3252982815
 kern.timecounter.tc.ACPI-safe.frequency: 3579545
 kern.timecounter.tc.ACPI-safe.quality: 850
 kern.timecounter.tc.HPET.mask: 4294967295
 kern.timecounter.tc.HPET.counter: 3443625641
 kern.timecounter.tc.HPET.frequency: 14318180
 kern.timecounter.tc.HPET.quality: 900
 kern.timecounter.tc.TSC.mask: 4294967295
 kern.timecounter.tc.TSC.counter: 1276479615
 kern.timecounter.tc.TSC.frequency: 2819782573
 kern.timecounter.tc.TSC.quality: -100
 kern.timecounter.smp_tsc: 0
 kern.timecounter.invariant_tsc: 1

Please try doing this:

- stop ntpd
- rm /var/db/ntpd.drift
- sysctl kern.timecounter.hardware=ACPI-safe
- start ntpd

Then see if your clock drifts.  If it stops, great -- you can put that
sysctl assignment line in /etc/sysctl.conf and consider it a done deal.
I highly recommend putting some comments around it though so in the
future you don't go "What's this? Silly!" and delete it.  ;-)

I'll also point out that it's common on FreeBSD[1] to see messages
like the following (or at least it was circa 2006 -- I believe ntpd
has been updated since then, but I've no indication said quirk was
fixed/addressed):

Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001
Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001
Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001
repeat indefinitely

You can add the following to your ntp.conf to fix that problem:


# maxpoll 9 is used to work around PLL/FLL flipping, which happens at
# exactly 1024 seconds (the default maxpoll value).  Another FreeBSD
# user recommended using 9 instead:
# http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html
#
server some.ntp.server maxpoll 9


I recommend using the iburst directive on one (and only one!) server
line in your config, otherwise ntpd will usually 'settle' for about
10-15 minutes before bothering to try and update the clock the first
time around.  Example config:


server clock.develooper.com       maxpoll 9 iburst
server ntp.nblug.org              maxpoll 9
server tick.mtnlion.com           maxpoll 9
server dewey.lib.ci.phoenix.az.us maxpoll 9
server ntp-1.cso.uiuc.edu         maxpoll 9
server tick.jrc.us                maxpoll 9
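
Note that minpoll and maxpoll are exponents of two, not seconds. A tiny Python sketch of the conversion (the function name is made up for illustration):

```python
def poll_seconds(exponent: int) -> int:
    """Convert an ntp.conf minpoll/maxpoll exponent to seconds (2**n)."""
    return 2 ** exponent


for n in (6, 9, 10):
    print(f"poll exponent {n:2d} -> {poll_seconds(n):4d} s")
```

So "maxpoll 9" caps the poll interval at 512 seconds instead of the default 1024 (exponent 10), while the 64 seconds shown in the "poll" column of ntpq corresponds to exponent 6.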


Finally, you should really consider adding some stratum 2 sources to
your list, *in addition* to the stratum 3 server you're already using.
ntpd can happily work with multiple servers, and will pick the best one
as well as average them out vs. drift.  It's pretty smart, honestly.  We
can talk about stratum 3 vs. 2 vs. 1 and use ntpdc -c peers to find
out what those NTP servers sync with, if you'd like.

[1]: http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Torfinn Ingolfsen

On Fri, 12 Feb 2010 05:11:17 -0800
Jeremy Chadwick free...@jdc.parodius.com wrote:

 Please try doing this:
 
 - stop ntpd
 - rm /var/db/ntpd.drift
 - sysctl kern.timecounter.hardware=ACPI-safe
 - start ntpd

Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

 Then see if your clock drifts.  If it stops, great -- you can put that
 sysctl assignment line in /etc/sysctl.conf and consider it a done deal.
 I highly recommend putting some comments around it though so in the
 future you don't go "What's this? Silly!" and delete it.  ;-)

Yes, I know. Learned the hard way that I need to document things for my
own use. :-)

 I'll also point out that it's common on FreeBSD[1] to see messages
 like the following (or at least it was circa 2006 -- I believe ntpd

Yes, those messages are still there. At one time, I thought about
fixing that (by using the config you present), but in the end I figured
that these messages actually help me pinpoint the time of a crash in
the (few) cases where one of my machines crashes or panics. So I did
nothing about it.

 Finally, you should really consider adding some stratum 2 sources to
 your list, *in addition* to the stratum 3 server you're already using.

Well, the stratum 3 server is my firewall. :) All the machines on my
LAN use that one as the ntp server.
My firewall is currently using three ntp servers as sources: one is a
non-public stratum 2 server (yes, I asked for permission before I
started using it), one is the no.pool.ntp.org pool (stratum 3), and I
just found out that the third one has stopped responding, so I removed
it.

I have added the dk.pool.ntp.org and se.pool.ntp.org pools, we will see
how that turns out.

 ntpd can happily work with multiple servers, and will pick the best one

Yes, I know. It is a few years since I set this up, but at that time I
figured that if I use three ntp servers for my firewall, and just used
the firewall for all my internal machines, that would be good enough
for my uses. 
Perhaps I need to re-evaluate my needs.
-- 
Regards,
Torfinn


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Matthew D. Fuller

On Fri, Feb 12, 2010 at 05:11:17AM -0800 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:

 I highly recommend putting some comments around it though so in the
 future you don't go "What's this? Silly!" and delete it.  ;-)

But do delete it every once in a while.  My experience over the years
is that sometimes a given OS build will do way worse than I'm used to,
whereas a new build a few months later works just peachy.


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.

Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Kevin Oberman

 Date: Fri, 12 Feb 2010 05:11:17 -0800
 From: Jeremy Chadwick free...@jdc.parodius.com
 Sender: owner-freebsd-sta...@freebsd.org

 On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote:
  On Thu, 11 Feb 2010 11:25:15 -0800
  Jeremy Chadwick free...@jdc.parodius.com wrote:

   Your machine has a rapidly drifting clock, usually an indicator of a
   hardware problem (crystal gone bad is a common one -- seen this at work
   quite a few times), or possibly a bad time counter source chosen by the
   kernel.  Can you please provide the output of:

   sysctl kern.timecounter

  Here it is:
  r...@kg-f2# sysctl kern.timecounter
  kern.timecounter.tick: 1
  kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) 
  dummy(-100)
  kern.timecounter.hardware: HPET
  kern.timecounter.stepwarnings: 0
  kern.timecounter.tc.i8254.mask: 65535
  kern.timecounter.tc.i8254.counter: 52444
  kern.timecounter.tc.i8254.frequency: 1193182
  kern.timecounter.tc.i8254.quality: 0
  kern.timecounter.tc.ACPI-safe.mask: 4294967295
  kern.timecounter.tc.ACPI-safe.counter: 3252982815
  kern.timecounter.tc.ACPI-safe.frequency: 3579545
  kern.timecounter.tc.ACPI-safe.quality: 850
  kern.timecounter.tc.HPET.mask: 4294967295
  kern.timecounter.tc.HPET.counter: 3443625641
  kern.timecounter.tc.HPET.frequency: 14318180
  kern.timecounter.tc.HPET.quality: 900
  kern.timecounter.tc.TSC.mask: 4294967295
  kern.timecounter.tc.TSC.counter: 1276479615
  kern.timecounter.tc.TSC.frequency: 2819782573
  kern.timecounter.tc.TSC.quality: -100
  kern.timecounter.smp_tsc: 0
  kern.timecounter.invariant_tsc: 1

 Please try doing this:

 - stop ntpd
 - rm /var/db/ntpd.drift
 - sysctl kern.timecounter.hardware=ACPI-safe
 - start ntpd

 Then see if your clock drifts.  If it stops, great -- you can put that
 sysctl assignment line in /etc/sysctl.conf and consider it a done deal.
 I highly recommend putting some comments around it though so in the
 future you don't go "What's this? Silly!" and delete it.  ;-)

 I'll also point out that it's common on FreeBSD[1] to see messages
 like the following (or at least it was circa 2006 -- I believe ntpd
 has been updated since then, but I've no indication said quirk was
 fixed/addressed):

 Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001
 Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001
 Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001
 repeat indefinitely

 You can add the following to your ntp.conf to fix that problem:

 # maxpoll 9 is used to work around PLL/FLL flipping, which happens at
 # exactly 1024 seconds (the default maxpoll value).  Another FreeBSD
 # user recommended using 9 instead:
 # http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html
 #
 server some.ntp.server maxpoll 9

 I recommend using the iburst directive on one (and only one!) server
 lines in your config, otherwise ntpd will usually 'settle' for about
 10-15 minutes before bothering to try and update the clock the first
 time around.  Example config:

Why "(and only one!)"? I have never seen a problem with 'iburst' on all
servers (assuming they are Internet connected). 'iburst' only makes a
difference on the initial query and, if the one server you have marked as
'iburst' is unreachable, it will really slow down synchronization.

I am unaware of any issues with multiple servers being marked 'iburst'
and typically configure 7 ntp servers for a system, all tagged as
'iburst'. Never seen any issue with this.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751

Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Jeremy Chadwick

On Fri, Feb 12, 2010 at 11:16:37AM -0800, Kevin Oberman wrote:
  Date: Fri, 12 Feb 2010 05:11:17 -0800
  From: Jeremy Chadwick free...@jdc.parodius.com
  Sender: owner-freebsd-sta...@freebsd.org

  On Fri, Feb 12, 2010 at 01:29:47PM +0100, Torfinn Ingolfsen wrote:
   On Thu, 11 Feb 2010 11:25:15 -0800
   Jeremy Chadwick free...@jdc.parodius.com wrote:

Your machine has a rapidly drifting clock, usually an indicator of a
hardware problem (crystal gone bad is a common one -- seen this at work
quite a few times), or possibly a bad time counter source chosen by the
kernel.  Can you please provide the output of:

sysctl kern.timecounter

   Here it is:
   r...@kg-f2# sysctl kern.timecounter
   kern.timecounter.tick: 1
   kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) 
   dummy(-100)
   kern.timecounter.hardware: HPET
   kern.timecounter.stepwarnings: 0
   kern.timecounter.tc.i8254.mask: 65535
   kern.timecounter.tc.i8254.counter: 52444
   kern.timecounter.tc.i8254.frequency: 1193182
   kern.timecounter.tc.i8254.quality: 0
   kern.timecounter.tc.ACPI-safe.mask: 4294967295
   kern.timecounter.tc.ACPI-safe.counter: 3252982815
   kern.timecounter.tc.ACPI-safe.frequency: 3579545
   kern.timecounter.tc.ACPI-safe.quality: 850
   kern.timecounter.tc.HPET.mask: 4294967295
   kern.timecounter.tc.HPET.counter: 3443625641
   kern.timecounter.tc.HPET.frequency: 14318180
   kern.timecounter.tc.HPET.quality: 900
   kern.timecounter.tc.TSC.mask: 4294967295
   kern.timecounter.tc.TSC.counter: 1276479615
   kern.timecounter.tc.TSC.frequency: 2819782573
   kern.timecounter.tc.TSC.quality: -100
   kern.timecounter.smp_tsc: 0
   kern.timecounter.invariant_tsc: 1

  Please try doing this:

  - stop ntpd
  - rm /var/db/ntpd.drift
  - sysctl kern.timecounter.hardware=ACPI-safe
  - start ntpd

  Then see if your clock drifts.  If it stops, great -- you can put that
  sysctl assignment line in /etc/sysctl.conf and consider it a done deal.
  I highly recommend putting some comments around it though so in the
  future you don't go "What's this? Silly!" and delete it.  ;-)

  I'll also point out that it's common on FreeBSD[1] to see messages
  like the following (or at least it was circa 2006 -- I believe ntpd
  has been updated since then, but I've no indication said quirk was
  fixed/addressed):

  Dec 19 00:22:26 icarus ntpd[624]: kernel time sync enabled 2001
  Dec 19 01:47:48 icarus ntpd[624]: kernel time sync enabled 6001
  Dec 19 02:04:52 icarus ntpd[624]: kernel time sync enabled 2001
  repeat indefinitely

  You can add the following to your ntp.conf to fix that problem:

  # maxpoll 9 is used to work around PLL/FLL flipping, which happens at
  # exactly 1024 seconds (the default maxpoll value).  Another FreeBSD
  # user recommended using 9 instead:
  # http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html
  #
  server some.ntp.server maxpoll 9

  I recommend using the iburst directive on one (and only one!) server
  lines in your config, otherwise ntpd will usually 'settle' for about
  10-15 minutes before bothering to try and update the clock the first
  time around.  Example config:

 Why "(and only one!)"? I have never seen a problem with 'iburst' on all
 servers (assuming they are Internet connected). 'iburst' only makes a
 difference on the initial query and, if the one server you have marked as
 'iburst' is unreachable, it will really slow down synchronization.

 I am unaware of any issues with multiple servers being marked 'iburst'
 and typically configure 7 ntp servers for a system, all tagged as
 'iburst'. Never seen any issue with this.

iburst sends 8 packets (vs. the default of 1) with an interval delay of
2 seconds (assuming calldelay isn't set).

There's some sort of rule in the NTP community where more than 20
requests per hour is considered rude or worthy of blocking.  This was
discussed on the timekeepers list a while back.  The original thread
(AFAIR) was originally about rules/regulations for vendors (such as
router manufacturers who pick defaults, etc.), but I'm willing to bet
the concepts apply universally:

http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002299.html

Relevant discussion pieces below, (*) marked as worth reading, and (!!!)
are highly relevant:

http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002321.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002323.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002325.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002326.html
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002327.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002329.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002330.html (*)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002331.html (!!!)
http://fortytwo.ch/mailman/pipermail/timekeepers/2006/002332.html (!!!)

Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Jeremy Chadwick

On Fri, Feb 12, 2010 at 05:44:52PM +0100, Torfinn Ingolfsen wrote:
 On Fri, 12 Feb 2010 05:11:17 -0800
 Jeremy Chadwick free...@jdc.parodius.com wrote:
 
  Please try doing this:
  
  - stop ntpd
  - rm /var/db/ntpd.drift
  - sysctl kern.timecounter.hardware=ACPI-safe
  - start ntpd
 
 Thanks, I'm currently testing that. Results in 72 hours (or less) :-)

Something else came to mind: some BIOSes let you disable/enable HPET.
Often labelled as High Performance Event Timer or Multimedia Timer,
you could disable this option then check kern.timecounter.choice to see
if HPET is gone from the list.

If it is, FreeBSD will very likely choose ACPI-safe as the default
timecounter (again, check kern.timecounter.hardware to see what the
kernel chose itself.  Remember that your sysctl.conf entry will
override this though!  :-) ), which -- assuming it works -- should
solve your problem.

Technical footnote: I wish I understood 1) the difference between
ACPI-safe and ACPI-fast, and 2) how the system or OS ranks the
timecounters (the higher the value in parenthesis, supposedly the more
accurate/preferred it is).  Xin, do you happen to know how this works?

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ntpd struggling to keep up - how to fix?

2010-02-12 Thread Matthew D. Fuller

On Fri, Feb 12, 2010 at 11:46:04AM -0800 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
 
 Technical footnote: I wish I understood 1) the difference between
 ACPI-safe and ACPI-fast,

AIUI, they're nearly the same thing, and it has to do with some
testing to determine how it can be reliably accessed.  I've had
systems that would sometimes come up with -fast, and other times -safe
(I think one varied depending on cold vs. warm boot for instance).


 and 2) how the system or OS ranks the timecounters (the higher the
 value in parenthesis, supposedly the more accurate/preferred it is).

That's easier; TTBOMK, they're hardcoded in the source based on
developer SWAG about their relative expenses and reliabilities and
precisions.


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.

Re: ntpd struggling to keep up - how to fix?

2010-02-11 Thread Chuck Swiger

Hi--

On Feb 11, 2010, at 10:06 AM, Torfinn Ingolfsen wrote:
[ ... ]
 Feb  7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s
 
 and this goes on and on, forever. At any given time, no matter how long the 
 machine has been up, ntpq can report this:
 r...@kg-f2# ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  kg-omni1.kg4.no 129.240.64.3     3 u   13   64   37    0.162  703.094 444.681

The rate at which this machine is losing time is probably exceeding the ~50 
seconds per day that NTPd is willing to correct without extreme measures (ie, 
it has to step time rather than drift-correct).  You might help it maintain a 
more sane idea of time by using at least 4 timeservers.
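
The numbers can be checked with a few lines of Python. This sketch assumes ntpd's usual 500 ppm slew limit and takes one step from the log in the original post (+2.260229 s, logged 903 seconds after the previous step); the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def max_slew_per_day(max_ppm: float = 500.0) -> float:
    """Seconds per day that can be corrected by slewing at max_ppm."""
    return max_ppm / 1_000_000 * SECONDS_PER_DAY


def observed_ppm(step_seconds: float, interval_seconds: float) -> float:
    """Drift rate implied by one step of step_seconds after interval_seconds."""
    return step_seconds / interval_seconds * 1_000_000


if __name__ == "__main__":
    print(f"slew budget: {max_slew_per_day():.1f} s/day")
    ppm = observed_ppm(2.260229, 903)
    print(f"observed:    {ppm:.0f} ppm = {max_slew_per_day(ppm):.0f} s/day")
```

At 500 ppm the slew budget is 43.2 seconds per day, while steps of about 2.3 seconds every 15 minutes correspond to roughly 2500 ppm, several times more than slewing alone can absorb.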

You might take a look at 'vmstat -i' and look out for an interrupt storm, but 
it's possible your hardware's clock is simply busted.  

Regards,
-- 
-Chuck


Re: ntpd struggling to keep up - how to fix?

2010-02-11 Thread Jeremy Chadwick

On Thu, Feb 11, 2010 at 07:06:52PM +0100, Torfinn Ingolfsen wrote:
 Hi,
 
 One of my machines, the fileserver-with-zfs-to-be[1] has trouble
 keeping correct time. Or rather, ntpd is struggling.
 In /var/log/messages I see this:
 Feb  7 12:05:54 kg-f2 ntpd[909]: ntpd 4.2.4p5-a (1)
 Feb  7 12:11:16 kg-f2 ntpd[910]: time reset +1.020413 s
 Feb  7 12:11:16 kg-f2 ntpd[910]: kernel time sync status change 2001
 Feb  7 12:26:26 kg-f2 ntpd[910]: time reset +2.277793 s
 Feb  7 12:41:29 kg-f2 ntpd[910]: time reset +2.260229 s
 Feb  7 12:57:02 kg-f2 ntpd[910]: time reset +2.332972 s
 Feb  7 13:21:24 kg-f2 ntpd[910]: time reset +3.659869 s
 Feb  7 13:37:01 kg-f2 ntpd[910]: time reset +2.343230 s
 Feb  7 13:52:24 kg-f2 ntpd[910]: time reset +2.310659 s
 Feb  7 14:07:29 kg-f2 ntpd[910]: time reset +2.265705 s
 Feb  7 14:23:03 kg-f2 ntpd[910]: time reset +2.335868 s
 Feb  7 14:39:06 kg-f2 ntpd[910]: time reset +2.46 s
 Feb  7 14:54:32 kg-f2 ntpd[910]: time reset +2.318222 s
 Feb  7 15:09:55 kg-f2 ntpd[910]: time reset +2.308120 s
 Feb  7 15:25:49 kg-f2 ntpd[910]: time reset +2.388391 s
 Feb  7 15:40:54 kg-f2 ntpd[910]: time reset +2.265464 s
 Feb  7 15:55:57 kg-f2 ntpd[910]: time reset +2.257952 s
 Feb  7 16:11:45 kg-f2 ntpd[910]: time reset +2.373325 s
 
 and this goes on and on, forever. At any given time, no matter how long the 
 machine has been up, ntpq can report this:
 r...@kg-f2# ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  kg-omni1.kg4.no 129.240.64.3     3 u   13   64   37    0.162  703.094 444.681
 
 Note: all machines on my LAN use my firewall as the ntp server. 
 The ntp server runs FreeBSD, none of the other machines have any trouble 
 keeping time.
 My workstation for example:
 ti...@kg-v2$ ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 *kg-omni1.kg4.no 129.240.64.3     3 u   44   64  377    0.138    4.018   0.338
 (my workstation also runs FreeBSD 8.0-stable / amd64)
 
 The machine runs FreeBSD 8.0-stable / amd64:
 r...@kg-f2# uname -a
 FreeBSD kg-f2.kg4.no 8.0-STABLE FreeBSD 8.0-STABLE #2: Sun Jan 31 18:39:17 
 CET 2010 r...@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC  amd64
 
 So, how can I get the machine to keep time / get ntpd synchronised?
 
 References:
 1) hw info: http://sites.google.com/site/tingox/ga-ma74gm-s2h
 2) FreeBSD info: http://sites.google.com/site/tingox/ga-ma74gm-s2h_freebsd
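
A side note on reading the two ntpq outputs quoted above: the reach column
is an octal bitmask of the last eight polls.  377 on the workstation means
all eight polls were answered, while 37 on kg-f2 means only the five most
recent ones were, which would fit with ntpd starting over after each step.
A minimal Python sketch of that decoding (the function names are made up
for illustration):

```python
def decode_reach(reach_octal):
    """Decode ntpq's octal 'reach' value into eight poll results, oldest first."""
    bits = int(reach_octal, 8)
    # Bit 0 is the most recent poll, bit 7 the oldest of the last eight.
    return [bool((bits >> shift) & 1) for shift in range(7, -1, -1)]

def describe(reach_octal):
    """Render the register as a short string, 'x' for answered, '.' for missed."""
    polls = decode_reach(reach_octal)
    marks = "".join("x" if ok else "." for ok in polls)
    return "reach %s: %d of 8 polls answered [%s]" % (reach_octal, sum(polls), marks)

print(describe("377"))  # the workstation
print(describe("37"))   # kg-f2
```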

Your machine has a rapidly drifting clock, usually an indicator of a
hardware problem (crystal gone bad is a common one -- seen this at work
quite a few times), or possibly a bad time counter source chosen by the
kernel.  Can you please provide the output of:

sysctl kern.timecounter
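
For context, that output includes kern.timecounter.choice, which lists the
candidate counters with a quality score each, and kern.timecounter.hardware,
which names the one in use.  A rough Python sketch for ranking the
candidates from such a line; the sample string is made up for illustration
and is not output from kg-f2, and the helper names are hypothetical:

```python
import re

def parse_choices(choice_line):
    """Parse 'NAME(quality) NAME(quality) ...' into (name, quality) pairs, best first."""
    pairs = re.findall(r"(\S+?)\((-?\d+)\)", choice_line)
    return sorted(((name, int(q)) for name, q in pairs),
                  key=lambda pair: pair[1], reverse=True)

def best_excluding(choice_line, excluded=()):
    """Return the highest-quality counter not in 'excluded', or None."""
    for name, quality in parse_choices(choice_line):
        # Assumption: counters with negative quality are not worth selecting.
        if name not in excluded and quality >= 0:
            return name
    return None

# Hypothetical value of kern.timecounter.choice.
sample = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)"
print(parse_choices(sample))
print(best_excluding(sample, excluded={"ACPI-fast"}))
```

A different counter can then be tried at runtime by setting
kern.timecounter.hardware to one of the listed names, which is a quick way
to rule the time counter in or out.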

Finally, was this OS installation used on different hardware in the
past?  Meaning: was the hard disk previously installed on another
machine?  Why I'm asking: /var/db/ntpd.drift could be from an old
computer (the previous hardware), and the clock drift rate would be
different than that of your newer[1] hardware.  If that's the case,
please stop ntpd, rm /var/db/ntpd.drift, and restart ntpd.  Be aware it
will take up to 72 hours for the clock drift to be calculated correctly.
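
To put a number on "rapidly drifting": the quoted log shows steps of roughly
2.3 s about every 15 minutes.  A back-of-the-envelope Python sketch, using
the first four of those log lines, that turns the steps into a frequency
error in ppm.  The helper names are made up, the parser assumes all lines
are from the same day, and 500 ppm is the largest frequency error ntpd will
correct by slewing:

```python
import re

MAX_SLEW_PPM = 500.0  # ntpd's frequency correction limit

# First four 'time reset' lines from the log quoted above.
LOG = """\
Feb  7 12:11:16 kg-f2 ntpd[910]: time reset +1.020413 s
Feb  7 12:26:26 kg-f2 ntpd[910]: time reset +2.277793 s
Feb  7 12:41:29 kg-f2 ntpd[910]: time reset +2.260229 s
Feb  7 12:57:02 kg-f2 ntpd[910]: time reset +2.332972 s
"""

STEP_RE = re.compile(
    r"(\d\d):(\d\d):(\d\d) \S+ ntpd\[\d+\]: time reset ([+-]\d+\.\d+) s")

def parse_steps(log_text):
    """Return (seconds_since_midnight, step_in_seconds) for each 'time reset' line."""
    steps = []
    for match in STEP_RE.finditer(log_text):
        h, m, s, offset = match.groups()
        steps.append((int(h) * 3600 + int(m) * 60 + int(s), float(offset)))
    return steps

def drift_ppm(steps):
    """Frequency error implied by each step, given the time since the previous step."""
    rates = []
    for (t_prev, _), (t_now, offset) in zip(steps, steps[1:]):
        rates.append(offset / (t_now - t_prev) * 1e6)
    return rates

rates = drift_ppm(parse_steps(LOG))
for rate in rates:
    print("%.0f ppm" % rate)
print("beyond slew limit:", all(abs(r) > MAX_SLEW_PPM for r in rates))
```

Each interval works out to roughly 2500 ppm, i.e. the clock is losing about
2.5 ms every second.  That is about five times what ntpd can slew away,
which would explain why it keeps stepping instead.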

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ntpd struggling to keep up - how to fix?

2010-02-11 Thread Xin LI

Hi,

On 2010/02/11 11:25, Jeremy Chadwick wrote:
 Your machine has a rapidly drifting clock, usually an indicator of a
 hardware problem (crystal gone bad is a common one -- seen this at work
 quite a few times), or possibly a bad time counter source chosen by the
 kernel.  Can you please provide the output of:
 
 sysctl kern.timecounter
 
 Finally, was this OS installation used on different hardware in the
 past?  Meaning: was the hard disk previously installed on another
 machine?  Why I'm asking: /var/db/ntpd.drift could be from an old
 computer (the previous hardware), and the clock drift rate would be
 different than that of your newer[1] hardware.  If that's the case,
 please stop ntpd, rm /var/db/ntpd.drift, and restart ntpd.  Be aware it
 will take up to 72 hours for the clock drift to be calculated correctly.

I think this looks like the same problem I had with another AMD system,
which may be related to some HPET stuff (I no longer have access to that
system, though) :(

Cheers,
-- 
Xin LI delp...@delphij.net    http://www.delphij.net/
FreeBSD - The Power to Serve!  Live free or die

Re: ntpd struggling to keep up - how to fix?

2010-02-11 Thread Stefan Krueger

 [snip]
 
 I think this looks like the same problem I had with another AMD system,
 which may be related to some HPET stuff (I no longer have access to that
 system, though) :(

I have the same problem on my machine (also AMD, running 8.0 +
patches): after a while ntpd gives up syncing, and then the time is off
by minutes (roughly 80 sec after, say, 10 hours) :(

I switched to openntpd (you can find it in ports) and the clock stays
in sync now, so you might want to consider that, too.

PS: I had a spare disk so I tried Linux on the same machine, and
ntpd is running fine for 2 days without any problems; so I guess it's
not a hw fault

HTH

Re: ntpd struggling to keep up - how to fix?

2010-02-11 Thread Michael Loftis

--On Thursday, February 11, 2010 10:49 PM +0100 Stefan Krueger 
stadtki...@gmx.de wrote:

 [snip]

 PS: I had a spare disk so I tried Linux on the same machine, and
 ntpd is running fine for 2 days without any problems; so I guess it's
 not a hw fault

It is a HW fault.  FreeBSD and Linux are just picking different time 
sources: Linux is guessing right, FreeBSD is guessing wrong.  AMD actually 
has pretty widely known issues with this.  I've had problems mostly on 
Solaris/OpenSolaris, though, a few with FreeBSD, and only rarely with Linux. 
I don't know the details, just that at least the Opteron HPET apparently 
isn't reliable.

