Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Richard B. Gilbert
Spoon wrote:
> Hello everyone,
> 
> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
> 
> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
> 

Unless you have a custom version of ntpd, I believe that ten servers is 
the absolute maximum, and that ntpd will ignore the extras.

Four, five, and seven are the magic numbers to protect against the 
failure of one, two, and three servers.  Note that "failure" can mean 
either a server failing to respond to queries or a server with a 
blatantly incorrect time.  I have not tested this but I believe that 
ntpd will always select both the server with the "one true time" and an 
"advisory committee" of three servers if there are sufficient servers 
available to do so.

I'd suggest trimming your server list to the five or seven servers 
closest to you in net space, i.e. the ones with the lowest delay.

FWIW it's rare to be able to FIND ten servers that are close to you!
Ten GOOD servers ranks close to miraculous.  Again, FWIW, even most of 
the "bad" servers know what time it is but the network between you and 
the server has enough jitter to mangle the time.
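
For example, a trimmed ntp.conf along those lines might look like this 
(the hostnames are placeholders, not recommendations of specific servers):

```
# Five nearby, low-delay servers; iburst speeds up the initial sync
server ntp1.example.net iburst
server ntp2.example.net iburst
server ntp3.example.net iburst
server ntp4.example.net iburst
server ntp5.example.net iburst
```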


___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions


Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Spoon
Richard B. Gilbert wrote:

> Spoon wrote:
> 
>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
>> 
>> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
> 
> Unless you have a custom version of ntpd,

I didn't modify the source in any way.

> I believe that ten servers is the absolute maximum!
> I believe that ntpd will ignore the extras.

The documentation for ntpq does state:
( http://www.eecis.udel.edu/~mills/ntp/html/ntpq.html )

.    excess
     The peer is discarded as not among the first ten peers sorted by
     synchronization distance and so is probably a poor candidate for
     further consideration.

But I've tested a configuration with 225 servers, and none were 
considered excess. (While 16 were considered candidate.)

for TALLY in '*' '+' '-' '#' '.' ' ' 'x' ; do
   N=$(grep -c "^\\$TALLY" DUMP)
   echo "$TALLY : $N"
done

* : 1
+ : 16
- : 33
# : 114
. : 0
   : 57
x : 4
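
A variant of that count which sidesteps regex escaping entirely is to cut 
off the first column before counting. A self-contained sketch (the sample 
DUMP below is made up; in real use, save actual `ntpq -p` output to DUMP):

```shell
# Build a tiny sample "ntpq -p" billboard so the sketch runs on its own.
cat > DUMP <<'EOF'
*ntp1.example.net  .GPS.   1 u   64  377  0.512  0.013  0.021
+ntp2.example.net  1.2.3.4 2 u   64  377  1.483  0.102  0.047
+ntp3.example.net  1.2.3.4 2 u   64  377  1.790  0.121  0.038
-ntp4.example.net  1.2.3.4 2 u   64  377  9.912  1.004  0.911
EOF

# Taking the first character with cut avoids having to escape regex
# metacharacters such as '*' and '+' in a grep pattern.
cut -c1 DUMP | sort | uniq -c
```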

> Four, Five, and Seven are the magic numbers to protect against the 
> failure of one, two, and three servers.  Note that "failure" can mean 
> either a server failing to respond to queries or a server with a 
> blatantly incorrect time.  I have not tested this but I believe that 
> ntpd will always select both the server with the "one true time" and an 
> "advisory committee" of three servers if there are sufficient servers 
> available to do so.
> 
> I'd suggest trimming  your server list to the five or seven servers 
> closest to you in net space; e.g. the ones with the lowest value of delay.

I had never considered I could set up /too many/ servers. I had always 
thought ntpd would just pick the N best.

> FWIW it's rare to be able to FIND ten servers that are close to you!
> Ten GOOD servers ranks close to miraculous.  Again, FWIW, even most of 
> the "bad" servers know what time it is but the network between you and 
> the server has enough jitter to mangle the time.

I can "see" 40 servers for which the delay is less than 60 ms.
The jitter is less than 2 ms for most (90%) of them.
(I suppose there are other metrics to consider before calling
a server good or bad?)

Regards.



Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Richard B. Gilbert
Spoon wrote:
> Richard B. Gilbert wrote:
> 
>> Spoon wrote:
>>
>>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
>>>
>>> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
>>
>>
>> Unless you have a custom version of ntpd,
> 
> 
> I didn't modify the source in any way.
> 
>> I believe that ten servers is the absolute maximum!
>> I believe that ntpd will ignore the extras.
> 
> 
> The documentation for ntpq does state:
> ( http://www.eecis.udel.edu/~mills/ntp/html/ntpq.html )
> 
> .  excess
> 
> The peer is discarded as not among the first ten peers sorted by 
> synchronization distance and so is probably a poor candidate for further 
> consideration.
> 
> But I've tested a configuration with 225 servers, and none were 
> considered excess. (While 16 were considered candidate.)
> 
> for TALLY in '*' '+' '-' '#' '.' ' ' 'x' ; do
>   N=$(grep -c "^\\$TALLY" DUMP)
>   echo "$TALLY : $N"
> done
> 
> * : 1
> + : 16
> - : 33
> # : 114
> . : 0
>   : 57
> x : 4
> 
>> Four, Five, and Seven are the magic numbers to protect against the 
>> failure of one, two, and three servers.  Note that "failure" can mean 
>> either a server failing to respond to queries or a server with a 
>> blatantly incorrect time.  I have not tested this but I believe that 
>> ntpd will always select both the server with the "one true time" and 
>> an "advisory committee" of three servers if there are sufficient 
>> servers available to do so.
>>
>> I'd suggest trimming  your server list to the five or seven servers 
>> closest to you in net space; e.g. the ones with the lowest value of 
>> delay.
> 
> 
> I had never considered I could set up /too many/ servers. I had always 
> thought ntpd would just pick the N best.
> 
>> FWIW it's rare to be able to FIND ten servers that are close to you!
>> Ten GOOD servers ranks close to miraculous.  Again, FWIW, even most of 
>> the "bad" servers know what time it is but the network between you and 
>> the server has enough jitter to mangle the time.
> 
> 
> I can "see" 40 servers for which the delay is less than 60 ms.
> The jitter is less than 2 for most (90%) of them.
> (I suppose there are other metrics to consider before calling
> a server good or bad?)
> 
> Regards.

The error in transmitting the time from server to client is limited to 
one half the round trip time.  It's usually far less than that but 
cannot be greater.  So servers close to you (low delay) should generally 
provide better time.
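For context on that bound: the on-wire calculation derives offset and delay 
from four timestamps, and the offset error cannot exceed half the round-trip 
delay. A toy calculation with made-up timestamps (in seconds):

```shell
# t1 = client transmit, t2 = server receive,
# t3 = server transmit, t4 = client receive (hypothetical values)
awk 'BEGIN {
    t1 = 100.000; t2 = 100.021; t3 = 100.022; t4 = 100.040
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated clock offset
    delay  = (t4 - t1) - (t3 - t2)         # round-trip network delay
    printf "offset %.4f s, delay %.4f s, max error %.4f s\n",
           offset, delay, delay / 2
}'
```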

A good way to learn what's good and bad is to use a hardware reference 
clock such as a GPS receiver.  A timing receiver with PPS output will 
generally keep the leading edge of the PPS within 50 microseconds of the
"top of the second".  The very best internet servers will agree with the 
GPS within two or three milliseconds.  A very poorly chosen server might 
be out by 100 milliseconds or more.  By very poorly chosen, I mean 
something like someone in New York City configuring ntpd to use a server 
in Tokyo! A server with a GPS reference will open your eyes to the 
atrocities committed by typical internet connections!



Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Unruh
"Richard B. Gilbert" <[EMAIL PROTECTED]> writes:

...
>A good way to learn what's good and bad is to use a hardware reference 
>clock such as a GPS receiver.  A timing receiver with PPS output will 
>generally keep the leading edge of the PPS within 50 microseconds of the
>"top of the second".  The very best internet servers will agree with the 
>GPS within two or three milliseconds.  A very poorly chosen server might 
>be out by 100 milliseconds or more.  By very poorly chosen, I mean 
>something like someone in New York City configuring ntpd to use a server 
>in Tokyo! A server with a GPS reference will open your eyes to the 
>atrocities committed by typical internet connections!

A good GPS receiver should be good to 1 usec, not 50. And if you interrupt
drive the computer, the computer's timestamp should also be good to 2-3 usec
or so.
That is the kind of jitter I get in getting the gps time from a gps PPS
receiver. 
And yes, even for computers in the same building, the best I can get using
that ntp server as the source is about 50 usec, with 0.2 msec best delay
(but sometimes the delays are many times that). The biggest problem
seems to be the computer itself getting the packet out the network card
onto the net. 

Astonishingly, a connection across the country 1000 miles away has a round
trip of 40 ms, but the time offset is usually about 0.2 msec, i.e. the two
legs of the trip are amazingly repeatable. 




Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Richard B. Gilbert
Unruh wrote:
> "Richard B. Gilbert" <[EMAIL PROTECTED]> writes:
> 
> ...
> 
>>A good way to learn what's good and bad is to use a hardware reference 
>>clock such as a GPS receiver.  A timing receiver with PPS output will 
>>generally keep the leading edge of the PPS within 50 microseconds of the
>>"top of the second".  The very best internet servers will agree with the 
>>GPS within two or three milliseconds.  A very poorly chosen server might 
>>be out by 100 milliseconds or more.  By very poorly chosen, I mean 
>>something like someone in New York City configuring ntpd to use a server 
>>in Tokyo! A server with a GPS reference will open your eyes to the 
>>atrocities committed by typical internet connections!
> 
> 
> A good gps receiver should be good to 1usec, not 50. And if you interrupt

Sorry, I meant to write nanoseconds rather than microseconds.  My 
Motorola Oncore M12+T specifies the PPS to be within 50 ns.

> drive the computer, the computer's timestamp should also be good to 2-3 usec 
> or so. 
> That is the kind of jitter I get in getting the gps time from a gps PPS
> receiver. 

That sounds about right.

> And yes, even for computers in the same building, the best I can get using
> that ntp server as the source is about 50 usec, with .2 msec best delay (
> but sometimes the delays are many many times that). The biggest problem
> seems to be the computer itself getting the packet out the network card
> onto the net.

Once you get a LAN into the act, accuracy deteriorates rapidly.  50 
microseconds can easily degrade to 2-5 milliseconds depending on the 
network hardware.  That's still more than adequate for most 
applications.  Still, I don't think that the bottleneck is likely to be 
the computer's ability to get the packet on the wire.  Once the packet 
hits the wire it still has to go through a switch and, perhaps, even a 
router and another switch to reach its destination in a large building.
At my last job, we had a "core" switch in the data center, a switch on 
the first floor, another on the second floor, and yet another in the 
warehouse area.  A packet originating in the data center had to pass 
through two switches to get anywhere else in the building.  When we 
added VLANs to trim the sizes of our broadcast domains, the router had 
to be consulted to figure out how to get the packet to its destination.

> 
> Astonishingly a connection across the country 1000 miles away has a round
> trip of 40ms, but the time offset is usually about .2msec-- ie the two legs
> of the trip are amazingly repeatable. 
> 
> 

It sometimes happens that way.  In the general case, the to and from 
paths are not guaranteed to be the same.  Your query might go direct 
from New York to Los Angeles and the reply might travel via Dallas-Fort 
Worth!  The routers work together to move the traffic as quickly and as 
cheaply as possible but the route cannot be guaranteed unless the entire 
path is under your control.  I'm sure that some companies need, and can 
afford, a direct line from New York to Los Angeles but most of us must 
rely on the internet eventually getting things where they are going.

I've noticed that servers that appear dreadful from 8 AM to 10 PM local 
time can show amazing improvement when the net quiets down at night.




Re: [ntp:questions] Leap second bug?

2008-01-02 Thread Unruh
"Richard B. Gilbert" <[EMAIL PROTECTED]> writes:

>Unruh wrote:
>> "Richard B. Gilbert" <[EMAIL PROTECTED]> writes:
>> 
>> ...
>> 
>>>A good way to learn what's good and bad is to use a hardware reference 
>>>clock such as a GPS receiver.  A timing receiver with PPS output will 
>>>generally keep the leading edge of the PPS within 50 microseconds of the
>>>"top of the second".  The very best internet servers will agree with the 
>>>GPS within two or three milliseconds.  A very poorly chosen server might 
>>>be out by 100 milliseconds or more.  By very poorly chosen, I mean 
>>>something like someone in New York City configuring ntpd to use a server 
>>>in Tokyo! A server with a GPS reference will open your eyes to the 
>>>atrocities committed by typical internet connections!
>> 
>> 
>> A good gps receiver should be good to 1usec, not 50. And if you interrupt

>Sorry, I meant to write nanoseconds rather than microseconds.  My 
>Motorola Oncore M12+T specifies the PPS to be within 50 ns.

OK, although that is pretty optimistic. A wire from the receiver to the
computer delays it by about 2 ns/foot, and the interrupt service on the
computer plus the time required to timestamp that packet is about another
2 usec, with fluctuations depending on whether other interrupts are being
serviced.
 

>> drive the computer, the computer's timestamp should also be good to 2-3 usec 
>> or so. 
>> That is the kind of jitter I get in getting the gps time from a gps PPS
>> receiver. 

>That sounds about right.

>> And yes, even for computers in the same building, the best I can get using
>> that ntp server as the source is about 50 usec, with .2 msec best delay (
>> but sometimes the delays are many many times that). The biggest problem
>> seems to be the computer itself getting the packet out the network card
>> onto the net.

>Once you get a LAN into the act, accuracy deteriorates rapidly.  50 
>microseconds can easily degrade to 2-5 milliseconds depending on the 
>network hardware.  That's still more than adequate for most 

That is all over a LAN, through 2 switches. Typically it is about a 200
usec travel time, with the time itself fluctuating by about 30 usec,
with obvious popcorn spikes. 

>applications.  Still, I don't think that the bottle neck is likely to be 
>the computer's ability to get the packet on the wire.  Once the packet 

Actually, if I believe tcpdump timestamping compared with the packet
timestamping, it can at times take a few msec to get the packet out onto the
net. I.e., the network card is far worse than the network itself. 


>hits the wire it still has to go through a switch and, perhaps, even a 
>router and another switch to reach its destination in a large building.
>At my last job, we had a "core" switch in the data center, a switch on 
>the first floor, another on the second floor, and yet another in the 
>warehouse area.  A packet originating in the data center had to pass 
>through two switches to get anywhere else in the building.  When we 
>added VLANs to trim the sizes of our broadcast domains, the router had 
>to be consulted to figure out how to get the packet to its destination.

Most of my machines pass through two or three switches along the way (all
GBit switches by now).



>> 
>> Astonishingly a connection across the country 1000 miles away has a round
>> trip of 40ms, but the time offset is usually about .2msec-- ie the two legs
>> of the trip are amazingly repeatable. 
>> 
>> 

>It sometimes happens that way.  In the general case, the to and from 
>paths are not guaranteed to be the same.  Your query might go direct 
>from New York to Los Angeles and the reply might travel via Dallas-Fort 
>Worth!  The routers work together to move the traffic as quickly and as 
>cheaply as possible but the route cannot be guaranteed unless the entire 
>path is under your control.  I'm sure that some companies need, and can 
>afford, a direct line from New York to Los Angeles but most of us must 
>rely on the internet eventually getting things where they are going.

Fortunately Canada is very one-dimensional, so stuff tends to go the same
way there and back. Between Sask and BC the only alternatives are Calgary
or Edmonton, and I suspect that the backbone goes through Calgary.



>I've noticed that servers that appear dreadful from 8 AM to 10 PM local 
>time can show amazing improvement when the net quiets down at night.




Re: [ntp:questions] Leap second bug?

2008-01-03 Thread Maarten Wiltink
"Unruh" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
[...]
> Astonishingly a connection across the country 1000 miles away has a
> round trip of 40ms, but the time offset is usually about .2msec-- ie
> the two legs of the trip are amazingly repeatable.

And where they aren't, a simple averaging filter works quite well.

Groetjes,
Maarten Wiltink
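
Such a filter can be as small as a running mean over the measured offsets; 
asymmetric-path noise that is roughly zero-mean over time washes out of the 
estimate. A minimal sketch with made-up sample offsets (in seconds):

```shell
# Average a short series of measured clock offsets with awk.
printf '%s\n' 0.0002 0.0004 -0.0001 0.0003 |
awk '{ sum += $1; n++ }
     END { printf "mean offset %.5f s over %d samples\n", sum / n, n }'
```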




Re: [ntp:questions] Leap second bug?

2008-01-03 Thread David Woolley
In article <[EMAIL PROTECTED]>,
Unruh <[EMAIL PROTECTED]> wrote:

> and the time required to timestamp that packet is about another 2usec, with
> fluctuations depending on whether other interrupts are being serviced.

On both Linux and Windows, interrupt latencies of more than 4ms and
even more than 10ms are quite common.  The interrupt processing time
from the idle loop is not a good indication of the timing on a system
doing real work.  (The above figures are based on clock interrupts 
overrunning at 250 Hz and 100 Hz clock frequencies, typically when doing
IDE disk I/O.)

> Actually if I believe tcpdump timestamping compared with the packet
> timestamping, it can at times be a few msec to get the packet out onto the
> net. Ie, the network card is far worse than the network itself. 

Network contention delays, which these almost certainly are, are normally
considered part of the network delay, not part of the network card delay.

> Most of my machines pass through two or three switches along the way ( all
> GBit switches by now)

I think most switches these days are store and forward, so they will incur
contention delays at each stage, unless the network is seriously
over-dimensioned.

> Fortunately Canada is very one dimensional. So stuff tends to go the same
> way there and back. Between Sask and BC the only alternatives are Calgary
> or Edmonton, and I suspect that the backbone goes through calgary.

Most asymmetric delay problems are due to network contention on the link
to the ISP, because many internet users are net consumers and tend
to have peaks at certain times of day when people tend to do their net
accesses.  That tends to result in severe asymmetry for a few tens of 
minutes, with the excess delay being in the downlink direction.  Again,
this is because delays on a properly dimensioned network are predominantly
due to contention, rather than serialisation or speed of light factors.



Re: [ntp:questions] Leap second bug?

2008-01-04 Thread Spoon
Spoon wrote:

> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
> 
> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
> 
> Dec 31 23:25:39 offset 0.000329 sec freq -6.715 ppm error 0.000333 poll 8
> Dec 31 23:28:39 offset 0.000329 sec freq -6.715 ppm error 0.000340 poll 8
> Dec 31 23:31:39 offset 0.000329 sec freq -6.715 ppm error 0.000424 poll 8
> Dec 31 23:34:39 offset 0.000403 sec freq -6.714 ppm error 0.000493 poll 8
> Dec 31 23:37:39 offset 0.000270 sec freq -6.714 ppm error 0.000348 poll 8
> Dec 31 23:40:39 offset 0.000270 sec freq -6.714 ppm error 0.000337 poll 8
> Dec 31 23:43:39 offset 0.000268 sec freq -6.714 ppm error 0.000327 poll 8
> Dec 31 23:46:39 offset 0.000268 sec freq -6.714 ppm error 0.000381 poll 8
> Dec 31 23:49:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
> Dec 31 23:52:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
> Dec 31 23:55:39 offset 0.000268 sec freq -6.714 ppm error 0.000334 poll 8
> Dec 31 23:58:39 offset 0.000268 sec freq -6.714 ppm error 0.000317 poll 8
> Jan  1 00:01:38 offset 0.000268 sec freq -6.714 ppm error 0.000318 poll 8
> Jan  1 00:04:38 offset 0.000268 sec freq -6.714 ppm error 0.447285 poll 8
> Jan  1 00:06:47 synchronized to A, stratum 2
> Jan  1 00:07:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 poll 8
> Jan  1 00:10:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 poll 8
> Jan  1 00:13:38 offset -0.001068 sec freq -6.720 ppm error 0.774695 poll 8
> Jan  1 00:15:39 synchronized to H, stratum 1
> Jan  1 00:16:38 offset -0.001068 sec freq -6.720 ppm error 0.632382 poll 8
> +
> Jan  1 00:19:38 time reset +0.999402 s
> +
> Jan  1 00:19:38 system event 'event_clock_reset' (0x05) status 
> 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)
> Jan  1 00:19:38 system event 'event_peer/strat_chg' (0x04) status 
> 'sync_alarm, sync_unspec, 15 events, event_clock_reset' (0xc0f5)
> Jan  1 00:19:39 offset 0.00 sec freq -6.720 ppm error 0.447203 poll 4
> Jan  1 00:19:54 peer A event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:19:55 peer B event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:19:59 peer C event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:04 peer D event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:07 peer E event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:08 peer F event 'event_reach' (0x84) status 'unreach, conf, 
> 4 events, event_reach' (0x8044)
> Jan  1 00:20:14 peer G event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:18 peer H event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:24 peer I event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:26 peer J event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:28 peer K event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:39 peer L event 'event_reach' (0x84) status 'unreach, conf, 
> 4 events, event_reach' (0x8044)
> Jan  1 00:20:55 synchronized to A, stratum 2
> Jan  1 00:20:55 system event 'event_sync_chg' (0x03) status 'leap_none, 
> sync_ntp, 15 events, event_peer/strat_chg' (0x6f4)
> Jan  1 00:20:55 system event 'event_peer/strat_chg' (0x04) status 
> 'leap_none, sync_ntp, 15 events, event_sync_chg' (0x6f3)
> Jan  1 00:21:22 synchronized to H, stratum 1
> 
> I also noticed that, the day before, the STA_INS (insert leap second) had
> been set and reset several times.
> 
> Dec 31 00:14:30 kernel time sync status change 0011
> Dec 31 00:27:21 kernel time sync status change 0001
> Dec 31 03:19:46 kernel time sync status change 0011
> Dec 31 03:52:30 kernel time sync status change 0001
> Dec 31 04:09:33 kernel time sync status change 0011
> Dec 31 04:35:11 kernel time sync status change 0001
> Dec 31 07:26:03 kernel time sync status change 0011
> Dec 31 07:47:28 kernel time sync status change 0001
> Dec 31 10:00:51 kernel time sync status change 0011
> Dec 31 10:17:01 kernel time sync status change 0001
> 
> (Apparently, the bit was not set when 2007 ended.)
> 
> Could this be a leap year bug? or did I just lose connectivity at the wrong
> time and it's just a coincidence?
> 
> # ntpq -crv
> assID=0 status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
> version="ntpd [EMAIL PROTECTED] Fri Mar 16 10:45:43 UTC 2007 (1)",
> processor="i686", system="Linux/2.6.22.1-rt9", leap=00, stratum=3,
> precision=-20, rootdelay=30.293, rootdispersion=50.341, peer=39672,
> refid=145.238.203.10,
> reftime=cb262893.e5d244fd  Wed, Jan  2 2008 15:13:23.897, poll=8,
> clock=cb262c3b.dbe5d3de  Wed, Jan  2 2008 15:28:59.858, state=4,
> offset=0

Re: [ntp:questions] Leap second bug?

2008-01-04 Thread Martin Burnicki
Spoon wrote:
> Spoon wrote:
> 
>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
[...]
>> I also noticed that, the day before, the STA_INS (insert leap second) had
>> been set and reset several times.
[...]
>> Could this be a leap year bug? or did I just lose connectivity at the
>> wrong time and it's just a coincidence?

I don't think it's a leap year bug, since NTP and the UTC system clock only
deal with seconds after an epoch. The leap year thing comes into effect only
when those seconds are converted to a human-readable calendar date.
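
To illustrate (a sketch assuming GNU date): the system clock is just seconds
since the epoch, and the calendar, leap years included, only appears at
conversion time:

```shell
# 1199145600 is 2008-01-01 00:00:00 UTC expressed as seconds since 1970-01-01.
date -u -d @1199145600 '+%Y-%m-%d %H:%M:%S'   # start of 2008
date -u -d @1199145599 '+%Y-%m-%d %H:%M:%S'   # one second earlier: end of 2007
```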

Lost connectivity could not be the reason. Ntpd only passes a leap second
announcement on if it has received such an announcement
- from an upstream server
- from a reference clock
- from the NIST leap seconds file

Ntpd cannot receive a leap second announcement from an upstream server if
the upstream server is not reachable. 

A potential reason could be a bug in ntpd, in which case we would have to
look at the source code of the exact version of ntpd, which is
[EMAIL PROTECTED] according to the ntpq output below. Since the billboard
does not display a tai value, I assume a NIST leap second file is not
involved here.

>> # ntpq -crv
>> assID=0 status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
>> version="ntpd [EMAIL PROTECTED] Fri Mar 16 10:45:43 UTC 2007 (1)",
>> processor="i686", system="Linux/2.6.22.1-rt9", leap=00, stratum=3,
>> precision=-20, rootdelay=30.293, rootdispersion=50.341, peer=39672,
>> refid=145.238.203.10,
>> reftime=cb262893.e5d244fd  Wed, Jan  2 2008 15:13:23.897, poll=8,
>> clock=cb262c3b.dbe5d3de  Wed, Jan  2 2008 15:28:59.858, state=4,
>> offset=0.081, frequency=-6.758, jitter=0.525, noise=0.521,
>> stability=0.001
> 
> Would someone care to venture their best guess as to what caused ntpd
> to step the system clock forward in the above scenario?

This could also be due to a firmware bug in a GPS receiver. There have been
such occasions before (not with Meinberg receivers ;-).

Dave, wouldn't it be a good idea to implement a log message indicating by
which means a leap second announcement has been received? That way it could
be traced back to the originally faulty time source.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany



Re: [ntp:questions] Leap second bug?

2008-01-04 Thread Spoon
Martin Burnicki wrote:

> Spoon wrote:
>
>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
> [...]
>> I also noticed that, the day before, the STA_INS (insert leap second) had
>> been set and reset several times.
> [...]
>> Could this be a leap year bug? or did I just lose connectivity at the
>> wrong time and it's just a coincidence?
> 
> I don't think it's a leap year bug since NTP and the UTC system clock do
> only deal with seconds after an epoche. The leap year thing comes into
> effect only when those seconds are converted to human-readable calendar
> date.

Doh! I meant to write "leap second", not "leap year".



Re: [ntp:questions] Leap second bug?

2008-01-04 Thread David L. Mills
Spoon wrote:

> Spoon wrote:
> 
>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
>>
>> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
>>
>> [...]

Re: [ntp:questions] Leap second bug?

2008-01-04 Thread David L. Mills
Spoon,

Assuming your incident was at the beginning of this year, no leap was 
scheduled, nor should one have been advertised by any of your servers. The 
current code, which you might not be using, takes a vote of the leap 
indicators in all servers and requires a clear majority before 
scheduling a leap. Maybe some of your friends lied.

The intended behavior, if the servers do correctly signal a leap and the 
kernel is unaware of it, is that the step interval will be exceeded 
for about 15 minutes and then the time will be stepped. During that 
interval your clock will appear one second slow relative to the server 
that has correctly inserted a second. There will be no slew, only the 
step. The fact that your time showed otherwise suggests either the step 
has been disabled or something else comes unstuck. Our clocks here 
showed no such behavior as yours.
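
The majority-vote rule described above can be sketched as follows. This is a 
simplified illustration of the idea, not ntpd's actual code (the real 
survivor selection is more involved); `leap_consensus` and its argument are 
hypothetical names:

```python
def leap_consensus(leap_indicators):
    """Return the leap to schedule (1 = insert, 2 = delete) only if a
    clear majority of servers agree on one direction; otherwise 0.

    leap_indicators holds the 2-bit LI value from each server:
    0 = no warning, 1 = insert second, 2 = delete second,
    3 = unsynchronized.  Simplified sketch only.
    """
    votes = [li for li in leap_indicators if li in (1, 2)]
    if 2 * len(votes) > len(leap_indicators) and len(set(votes)) == 1:
        return votes[0]   # clear majority, all voting the same way
    return 0              # no leap scheduled
```

With twelve servers of which only a couple advertise a leap, this returns 0: 
a few lying friends cannot schedule a leap on their own.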

Dave

Spoon wrote:

> Spoon wrote:
> 
>> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
>>
>> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
>>
>> Dec 31 23:25:39 offset 0.000329 sec freq -6.715 ppm error 0.000333 poll 8
>> Dec 31 23:28:39 offset 0.000329 sec freq -6.715 ppm error 0.000340 poll 8
>> Dec 31 23:31:39 offset 0.000329 sec freq -6.715 ppm error 0.000424 poll 8
>> Dec 31 23:34:39 offset 0.000403 sec freq -6.714 ppm error 0.000493 poll 8
>> Dec 31 23:37:39 offset 0.000270 sec freq -6.714 ppm error 0.000348 poll 8
>> Dec 31 23:40:39 offset 0.000270 sec freq -6.714 ppm error 0.000337 poll 8
>> Dec 31 23:43:39 offset 0.000268 sec freq -6.714 ppm error 0.000327 poll 8
>> Dec 31 23:46:39 offset 0.000268 sec freq -6.714 ppm error 0.000381 poll 8
>> Dec 31 23:49:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
>> Dec 31 23:52:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
>> Dec 31 23:55:39 offset 0.000268 sec freq -6.714 ppm error 0.000334 poll 8
>> Dec 31 23:58:39 offset 0.000268 sec freq -6.714 ppm error 0.000317 poll 8
>> Jan  1 00:01:38 offset 0.000268 sec freq -6.714 ppm error 0.000318 poll 8
>> Jan  1 00:04:38 offset 0.000268 sec freq -6.714 ppm error 0.447285 poll 8
>> Jan  1 00:06:47 synchronized to A, stratum 2
>> Jan  1 00:07:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 
>> poll 8
>> Jan  1 00:10:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 
>> poll 8
>> Jan  1 00:13:38 offset -0.001068 sec freq -6.720 ppm error 0.774695 
>> poll 8
>> Jan  1 00:15:39 synchronized to H, stratum 1
>> Jan  1 00:16:38 offset -0.001068 sec freq -6.720 ppm error 0.632382 
>> poll 8
>> +
>> Jan  1 00:19:38 time reset +0.999402 s
>> +
>> Jan  1 00:19:38 system event 'event_clock_reset' (0x05) status 
>> 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)
>> Jan  1 00:19:38 system event 'event_peer/strat_chg' (0x04) status 
>> 'sync_alarm, sync_unspec, 15 events, event_clock_reset' (0xc0f5)
>> Jan  1 00:19:39 offset 0.00 sec freq -6.720 ppm error 0.447203 poll 4
>> Jan  1 00:19:54 peer A event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:19:55 peer B event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:19:59 peer C event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:04 peer D event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:07 peer E event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:08 peer F event 'event_reach' (0x84) status 'unreach, 
>> conf, 4 events, event_reach' (0x8044)
>> Jan  1 00:20:14 peer G event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:18 peer H event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:24 peer I event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:26 peer J event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:28 peer K event 'event_reach' (0x84) status 'unreach, 
>> conf, 2 events, event_reach' (0x8024)
>> Jan  1 00:20:39 peer L event 'event_reach' (0x84) status 'unreach, 
>> conf, 4 events, event_reach' (0x8044)
>> Jan  1 00:20:55 synchronized to A, stratum 2
>> Jan  1 00:20:55 system event 'event_sync_chg' (0x03) status 
>> 'leap_none, sync_ntp, 15 events, event_peer/strat_chg' (0x6f4)
>> Jan  1 00:20:55 system event 'event_peer/strat_chg' (0x04) status 
>> 'leap_none, sync_ntp, 15 events, event_sync_chg' (0x6f3)
>> Jan  1 00:21:22 synchronized to H, stratum 1
>>
>> I also noticed that, the day before, the STA_INS (insert leap second) had
>> been set and reset several times.
>>
>> Dec 31 00:14:30 kernel time sync status change 0011
>> Dec 31 00:27:21 kernel time sync status change 0001
>> Dec 31 03:19:46 kernel time sync status change 0011
>> Dec 31 03:52:30 kernel time sync status change 0

Re: [ntp:questions] Leap second bug?

2008-01-05 Thread David Malone
"David L. Mills" <[EMAIL PROTECTED]> writes:

>The intended behavior if the servers do correctly signal a leap and the 
>kernel is unaware of that, is that the step interval will be exceeded 
>for about 15 minutes and then the time will be stepped. During that 
>interval your clock will appear one second slow relative to the server 
>that has correctly inserted a second. There will be no slew, only the 
>step. The fact that your time showed otherwise suggests either the step 
>has been disabled or something else comes unstuck. Our clocks here 
>showed no such behavior as yours.

During the 2005 leap second, I did see some of our peers show an
offset of 0.5 seconds for reasons that I don't understand.  For
example, see the last graph on this page:

http://www.maths.tcd.ie/~dwmalone/time/leap2005_peers.html

It wasn't the only example - several other peers showed an offset
of near 0.5 seconds after the leap - you can find those through the
"more graphs of other peers" link at the bottom of the page.

David.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions


Re: [ntp:questions] Leap second bug?

2008-01-05 Thread Unruh
[EMAIL PROTECTED] (David Woolley) writes:

>In article <[EMAIL PROTECTED]>,
>Unruh <[EMAIL PROTECTED]> wrote:

>> and the time required to timestamp that packet is about another 2usec, with
>> fluctuations depending on whether other interrupts are being serviced.

>On both Linux and Windows, interrupt latencies of more than 4ms and
>even more than 10ms are quite common.  The interrupt processing time

I hope you mean 4 or 10usec, not msec.


>from the idle loop is not a good indication of the timing on a system
>doing real work.  (The above figures are based on clock interrupts 
>overrunning at 250 Hz and 100 Hz clock frequencies, typically when doing
>IDE disk I/O.)

Although you might not have meant that.
The figures I got were on an active machine, although one that is often not
used that much.
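
The few-microsecond timestamping cost and its fluctuation mentioned above can 
be probed crudely from user space. A rough, hypothetical sketch (the numbers 
vary with hardware and load, and this measures clock-read cost rather than 
interrupt latency):

```python
import time

def clock_read_jitter(samples=10000):
    """Take back-to-back clock readings and return (mean, stddev) of
    the deltas in nanoseconds -- a crude proxy for timestamping cost
    and its jitter under whatever load the machine is carrying."""
    deltas = []
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        deltas.append(t1 - t0)
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return mean, var ** 0.5
```

Running this while the machine is busy versus idle shows the same effect 
discussed in this thread: the spread grows when other interrupts are being 
serviced.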





Re: [ntp:questions] Leap second bug?

2008-01-06 Thread David Woolley
In article <[EMAIL PROTECTED]>,
Unruh <[EMAIL PROTECTED]> wrote:


> I hope you mean 4 or 10usec, not msec.

No.  I mean milliseconds, i.e. 1/HZ for HZ = 100 and 250.

I've personally had lost clock interrupts due to a disk driver, on Linux,
at HZ=100, but that was an obsolete high speed interface, on a relatively
slow machine.  People regularly get lost ticks on Linux at HZ=1000, when
using IDE's in non-DMA mode and I also believe they get them at HZ=250.
I believe there have been reports at HZ=100.

Windows users also report lost interrupts, although I'm not 100% sure
whether that applies with the normal HZ=~64 rate or with the higher
multimedia rate, which might be instigated by other software, although
ntpd now tends to instigate it itself, to avoid glitches when the rate
changes.

One of the problems is that modern operating system kernels tend to
be written in high-level languages, so coders don't cycle-count their
interrupt routines and proper use of priority interrupts can be difficult.
Short interrupt routines tend not to re-enable higher priorities at all,
although those won't have the sort of latency given above.
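
The milliseconds quoted above are just the tick interval 1/HZ: an interrupt 
held off longer than that loses at least one tick. A small arithmetic sketch 
(illustrative only; `lost_ticks` is a hypothetical helper):

```python
def tick_ms(hz):
    """Clock tick interval in milliseconds for a kernel running at hz."""
    return 1000.0 / hz

def lost_ticks(latency_ms, hz):
    """Whole clock ticks missed if interrupts are blocked for latency_ms."""
    return int(latency_ms / tick_ms(hz))

# HZ=100 -> 10 ms ticks, HZ=250 -> 4 ms, HZ=1000 -> 1 ms: a 10 ms
# latency therefore costs one tick at HZ=100 but ten at HZ=1000.
```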



Re: [ntp:questions] Leap second bug?

2008-01-06 Thread Hal Murray

>I've personally had lost clock interrupts due to a disk driver, on Linux,
>at HZ=100, but that was an obsolete high speed interface, on a relatively
>slow machine.  People regularly get lost ticks on Linux at HZ=1000, when
>using IDE's in non-DMA mode and I also believe they get them at HZ=250.
>I believe there have been reports at HZ=100.

It's easy to miss interrupts at HZ=100 in non-DMA mode.  All you have to
do is a lot of disk activity.  (I admit I haven't tried it with
a modern machine running a recent kernel.)

-- 
These are my opinions, not necessarily my employer's.  I hate spam.



Re: [ntp:questions] Leap second bug?

2008-01-06 Thread Unruh
[EMAIL PROTECTED] (David Woolley) writes:

>In article <[EMAIL PROTECTED]>,
>Unruh <[EMAIL PROTECTED]> wrote:


>> I hope you mean 4 or 10usec, not msec.

>No.  I mean milli-seconds, i.e. 1/HZ for HZ = 100 and 250.

>I've personally had lost clock interrupts due to a disk driver, on Linux,
>at HZ=100, but that was an obsolete high speed interface, on a relatively

Is the timer interrupt edge-triggered or level-triggered? I.e., does this
really mean that the interrupt was turned off for 4 or 10 ms, or just that
when the interrupt occurred, it was not serviced?
I have not seen anything like that on my system, but it is possible it is
not heavily enough used. I see offset fluctuations from the ps of 3 us
standard deviation.

 
>slow machine.  People regularly get lost ticks on Linux at HZ=1000, when
>using IDE's in non-DMA mode and I also believe they get them at HZ=250.
>I believe there have been reports at HZ=100.

>Windows users also report lost interrupts, although I'm not 100% sure
>whether that applies with the normal HZ=~64 rate or with the, higher,
>multimedia rate, which might be instigated by other software, although
>ntpd now tends to instigate it itself, to avoid glitches when the rate
>changes.

>One of the problems is that modern operating system kernels tend to
>be written in high level languages, so coders don't cycle count their
>interrupt routines and proper use of priority interrupts can be difficult.
>Short interrupt routines tend not to re-enable higher priorities at all,
>although those won't have the sort of latency given above.



Re: [ntp:questions] Leap second bug?

2008-01-09 Thread Spoon
Spoon wrote:

> ntpd kicked my clock forward one second on January 1 at 00:19:38 UTC.
> 
> (My ntp.conf lists 12 servers. Delays range from 28 to 48 ms.)
> 
> Dec 31 23:25:39 offset 0.000329 sec freq -6.715 ppm error 0.000333 poll 8
> Dec 31 23:28:39 offset 0.000329 sec freq -6.715 ppm error 0.000340 poll 8
> Dec 31 23:31:39 offset 0.000329 sec freq -6.715 ppm error 0.000424 poll 8
> Dec 31 23:34:39 offset 0.000403 sec freq -6.714 ppm error 0.000493 poll 8
> Dec 31 23:37:39 offset 0.000270 sec freq -6.714 ppm error 0.000348 poll 8
> Dec 31 23:40:39 offset 0.000270 sec freq -6.714 ppm error 0.000337 poll 8
> Dec 31 23:43:39 offset 0.000268 sec freq -6.714 ppm error 0.000327 poll 8
> Dec 31 23:46:39 offset 0.000268 sec freq -6.714 ppm error 0.000381 poll 8
> Dec 31 23:49:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
> Dec 31 23:52:39 offset 0.000268 sec freq -6.714 ppm error 0.000446 poll 8
> Dec 31 23:55:39 offset 0.000268 sec freq -6.714 ppm error 0.000334 poll 8
> Dec 31 23:58:39 offset 0.000268 sec freq -6.714 ppm error 0.000317 poll 8
> Jan  1 00:01:38 offset 0.000268 sec freq -6.714 ppm error 0.000318 poll 8
> Jan  1 00:04:38 offset 0.000268 sec freq -6.714 ppm error 0.447285 poll 8
> Jan  1 00:06:47 synchronized to A, stratum 2
> Jan  1 00:07:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 poll 8
> Jan  1 00:10:38 offset -0.001068 sec freq -6.720 ppm error 0.632509 poll 8
> Jan  1 00:13:38 offset -0.001068 sec freq -6.720 ppm error 0.774695 poll 8
> Jan  1 00:15:39 synchronized to H, stratum 1
> Jan  1 00:16:38 offset -0.001068 sec freq -6.720 ppm error 0.632382 poll 8
> +
> Jan  1 00:19:38 time reset +0.999402 s
> +
> Jan  1 00:19:38 system event 'event_clock_reset' (0x05) status 
> 'sync_alarm, sync_unspec, 15 events, event_peer/strat_chg' (0xc0f4)
> Jan  1 00:19:38 system event 'event_peer/strat_chg' (0x04) status 
> 'sync_alarm, sync_unspec, 15 events, event_clock_reset' (0xc0f5)
> Jan  1 00:19:39 offset 0.00 sec freq -6.720 ppm error 0.447203 poll 4
> Jan  1 00:19:54 peer A event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:19:55 peer B event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:19:59 peer C event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:04 peer D event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:07 peer E event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:08 peer F event 'event_reach' (0x84) status 'unreach, conf, 
> 4 events, event_reach' (0x8044)
> Jan  1 00:20:14 peer G event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:18 peer H event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:24 peer I event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:26 peer J event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:28 peer K event 'event_reach' (0x84) status 'unreach, conf, 
> 2 events, event_reach' (0x8024)
> Jan  1 00:20:39 peer L event 'event_reach' (0x84) status 'unreach, conf, 
> 4 events, event_reach' (0x8044)
> Jan  1 00:20:55 synchronized to A, stratum 2
> Jan  1 00:20:55 system event 'event_sync_chg' (0x03) status 'leap_none, 
> sync_ntp, 15 events, event_peer/strat_chg' (0x6f4)
> Jan  1 00:20:55 system event 'event_peer/strat_chg' (0x04) status 
> 'leap_none, sync_ntp, 15 events, event_sync_chg' (0x6f3)
> Jan  1 00:21:22 synchronized to H, stratum 1
> 
> I also noticed that, the day before, the STA_INS (insert leap second) had
> been set and reset several times.
> 
> Dec 31 00:14:30 kernel time sync status change 0011
> Dec 31 00:27:21 kernel time sync status change 0001
> Dec 31 03:19:46 kernel time sync status change 0011
> Dec 31 03:52:30 kernel time sync status change 0001
> Dec 31 04:09:33 kernel time sync status change 0011
> Dec 31 04:35:11 kernel time sync status change 0001
> Dec 31 07:26:03 kernel time sync status change 0011
> Dec 31 07:47:28 kernel time sync status change 0001
> Dec 31 10:00:51 kernel time sync status change 0011
> Dec 31 10:17:01 kernel time sync status change 0001
> 
> (Apparently, the bit was not set when 2007 ended.)
> 
> Could this be a leap year bug? or did I just lose connectivity at the wrong
> time and it's just a coincidence?
> 
> # ntpq -crv
> assID=0 status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
> version="ntpd [EMAIL PROTECTED] Fri Mar 16 10:45:43 UTC 2007 (1)",
> processor="i686", system="Linux/2.6.22.1-rt9", leap=00, stratum=3,
> precision=-20, rootdelay=30.293, rootdispersion=50.341, peer=39672,
> refid=145.238.203.10,
> reftime=cb262893.e5d244fd  Wed, Jan  2 2008 15:13:23.897, poll=8,
> clock=cb262c3b.dbe5d3de  Wed, Jan  2 2008 15:28:59.858, state=4,
> offset=0

Re: [ntp:questions] Leap second bug?

2008-01-10 Thread Martin Burnicki
Spoon wrote:
> I've just noticed the output of dmesg.
> Clock: inserting leap second 23:59:60 UTC
> (on two different systems)
>  
> The strange part is that, on one system, the line does not show up
> in kern.log, while on the second system, it does:
> 
> Jan  8 16:16:05 kernel: Clock: inserting leap second 23:59:60 UTC

That's normal. If ntpd has passed a leap second announcement to the kernel
then the kernel handles the leap second.

Of course whether the message appears or not depends on the implementation
of the kernel. 

Alternatively, the kernel of one of the machines may not have received a leap
second announcement. Ntpd also does some plausibility checks (e.g. a leap
second is only possible at the end of June/December) before it passes the
announcement to the kernel.

Also, the source code of ntpd has changed over time, so the exact
behaviour also depends on the version of ntpd ...
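
The end-of-June/December plausibility rule can be sketched like this (a 
hypothetical illustration of the rule, not ntpd's actual check):

```python
import datetime

def leap_date_plausible(d):
    """True if date d is one on which a leap second could legally
    occur: the last day of June or December, the two insertion
    points IERS actually uses.  Sketch of the rule only."""
    return (d.month, d.day) in ((6, 30), (12, 31))
```

By this rule an announcement still pending on Jan 1, like the STA_INS flaps 
reported earlier in this thread, would be rejected rather than passed to the 
kernel.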

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
