[ntp:questions] The libntp resumee...

2008-09-04 Thread Kay Hayen


Thanks to all who replied. Unfortunately the moderation bit made all of my 
last replies expire, and instead of reposting them I chose to sum things up 
in a single post.

The original question was answered. What I really wanted was a libntpq anyway, 
and Heiko has already contributed a libntpq.a as well as an NTP SNMP daemon, 
both of which we want. I understand it's part of ntp-dev, but I didn't have the 
chance to look at it (I was on a site visit this week). The plan certainly is 
to use that work, and I am greatly relieved that one apparently no longer has 
to fork the ntpq code base privately just to avoid the external process and 

But I have to thank you people for extra points made and things I have 

1. Mentioning the "25ms" as too long was a very bad idea on my part; it 
only served as a distraction. I wasn't interested at all in discussing whether 
that's a time frame that matters or not. In my world, every delay in 
accessing information needs a better justification than "not needed as fast", 
but that leads to distraction, as not everybody subscribes to that 
point of view. 

2. I understood that it will be way better to monitor the offset of "external" 
NTP servers via the small time query packets that ntpd instances use among 
themselves. We will be able to determine an offset and restrict NTP servers 
that appear to malfunction. Benefit: we could potentially disable such a 
server in some cases, before it ever has an impact on our ntpd servers. 

3. I was not clear enough that our NTP interest has two roles. One is as the 
maker of a specific system with a specific NTP setup in place, for which we 
have provided support. In that role I have come to learn about the necessity 
of NTP monitoring. The other is as the maker of a middleware, where we can't 
tell people to change their environment, but are to monitor it. Avoiding errors is 
interesting in the first setup, in the second it's not our option and 

4. I also learned about orphan mode. It wasn't available when our 
current setup was designed (more than 7 years ago, I think) and will make for 
a nice enhancement proposal for our system.

5. The NTP rules about how often to query a daemon. I have tried to read up 
on them, but only found general advice. Under the assumption that ntpd is 
not multithreaded, querying it at the time it should respond to other servers 
is slightly unfortunate. I was thinking that the query is so fast that it 
doesn't matter, but I will have to back that up with numbers. I presume we 
could query the local ntpd for its time in a loop and compare with the local 
current time to get an idea of whether extra libntpq queries degrade it or not.
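
The measurement loop from point 5 (and the offset monitoring from point 2) could be sketched roughly as below. This is only a minimal SNTP-style client in Python, not libntpq; the function names, the host default and the timeout are assumptions made for illustration. It sends a plain mode 3 time request and derives offset and round-trip delay from the four usual timestamps.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request(transmit_unix: float) -> bytes:
    """Build a 48-byte client request (LI=0, version 4, mode 3)."""
    secs = int(transmit_unix) + NTP_EPOCH_OFFSET
    frac = int((transmit_unix % 1) * 2**32)
    first_byte = struct.pack("!B", (0 << 6) | (4 << 3) | 3)
    # Bytes 1..39 stay zero; the transmit timestamp occupies bytes 40..47.
    return first_byte + bytes(39) + struct.pack("!II", secs & 0xFFFFFFFF, frac)


def ntp_to_unix(secs: int, frac: int) -> float:
    """Convert a 64-bit NTP timestamp (seconds, fraction) to Unix time."""
    return secs - NTP_EPOCH_OFFSET + frac / 2**32


def parse_server_times(packet: bytes):
    """Return the server's (receive, transmit) timestamps as Unix time."""
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", packet[32:48])
    return ntp_to_unix(rx_s, rx_f), ntp_to_unix(tx_s, tx_f)


def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """t1/t4: client send/receive time, t2/t3: server receive/send time."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query_once(host: str = "127.0.0.1", port: int = 123, timeout: float = 1.0):
    """Send one request and return (offset, delay) in seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(t1), (host, port))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    t2, t3 = parse_server_times(packet)
    return offset_and_delay(t1, t2, t3, t4)
```

Against the local ntpd the offset should stay near zero, so the delay is the more telling figure: calling `query_once()` in a loop with and without extra control queries running would show whether the response time changes.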

Best regards,
Kay Hayen
questions mailing list

Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Richard B. Gilbert
Kay Hayen wrote:
> Hello,
> thanks to all who replied. Unfortunately the moderation bit made all of my 
> last replies expire and instead of reposting them, I choose to sum things up 
> in a single post.

> 5. The NTP rules about how often to query a daemon. I have tried to read up 
> about it, but only found general advice. Under the assumption that ntpd is 
> not multithreaded querying it at the time it should respond to other servers 
> is slightly unfortunate. I was thinking that the query is so fast that it 
> doesn't matter. I will have to back it up with numbers. I presume we could 
> query the local ntpd for its time in a loop and compare with local current 
> time to get an idea if extra libntpq queries degrade it or not.

The "rules" about how often to query a daemon are not all that 
complicated.  The fact that there ARE rules is due to some history; 
google for "Netgear Wisconsin" for the sordid details.  For a "second 
opinion" google for "DLink PHK".

Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the 
"iburst" keyword in a server statement for fast startup.  You may use 
the "burst" keyword ONLY with the permission of the server's owner.
99.99% of NTP installations will work very well using these rules.  If 
yours does not, ask here for help!

"burst" is intended for systems that make a dialup telephone connection 
to a server three or four times a day.

"iburst" sends an initial burst of eight request packets at intervals of 
two seconds.  Thereafter, the server is polled at intervals between 64 
and 1024 seconds; ntpd adjusts the poll interval within this range as 


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Kay Hayen

Hello Richard,

you wrote:

> The "rules" about how often to query a daemon are not all that
> complicated.  The fact that there ARE rules is due to some history;
> google for "Netgear Wisconsin" for the sordid details.  For a "second
> opinion" google for "DLink PHK".

Fascinating reads indeed, thanks for the pointers. 

What worried me more was how often we can query the local ntpd before it will 
have an adverse effect. In the meantime I tried to convince myself that ntpq 
requests are served at a different priority (other socket) than ntpd requests 
are. I didn't find 2 sockets though.

> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
> "iburst" keyword in a server statement for fast startup.  You may use
> the "burst" keyword ONLY with the permission of the the server's owner.
> 99.99% of NTP installations will work very well using these rules".  If
> yours does not, ask here for help!

Now speaking about our system, not the middleware, with connections as follows:

External NTPs <-> 2 entry hosts <-> 8 other hosts.

And iburst and minpoll=maxpoll=5 to improve the results.

Currently we observe that both entry hosts can become restricted due to 
large offsets on other hosts, and that will make the software refuse to go 
on. Ideally that would not happen.

I will try to formulate questions:

When the other hosts synchronize to the entry hosts of our system, don't the 
other hosts' ntpd instances know when and by how much these entry hosts 
changed their time due to input? 

Would NTP be more robust if we configured routing on the entry 
hosts, so that they can all speak directly with the external NTPs on their own?

Is the use of ntpdate before starting ntpd recommended and/or does the iburst 
option replace it?

Best regards,
Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Richard B. Gilbert
Kay Hayen wrote:
> Hello Richard,
> you wrote:
>> The "rules" about how often to query a daemon are not all that
>> complicated.  The fact that there ARE rules is due to some history;
>> google for "Netgear Wisconsin" for the sordid details.  For a "second
>> opinion" google for "DLink PHK".
> Fascinating reads indeed, thanks for the pointers. 
> What worried me more was how often we can query the local ntpd before it will 
> have an adverse effect. Meantime I somehow I sought to convince me I should 
> be able to convince myself that ntpq requests are served at a different 
> priority (other socket) than ntpd requests are. I didn't find 2 sockets 
> though.
>> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
>> "iburst" keyword in a server statement for fast startup.  You may use
>> the "burst" keyword ONLY with the permission of the the server's owner.
>> 99.99% of NTP installations will work very well using these rules".  If
>> yours does not, ask here for help!
> Now speaking about our system, not the middleware, with connections as 
> follows: 
> External NTPs <-> 2 entry hosts <-> 8 other hosts.
What do you mean by "entry hosts"?

> And iburst and minpoll=maxpoll=5 to improve the results.

Use the default values of minpoll and maxpoll!  Ntpd will adjust the 
polling interval within those limits.  Ntpd is far smarter than you or 
I.  It will normally start by using minpoll and increase the interval 
after it has initial synchronization.  If network conditions deteriorate 
it will decrease the poll interval and increase it as conditions 
improve.  IOW it will use the optimum poll interval for the conditions 
then obtaining.  If you configured seven servers, you might observe ntpd 
using seven DIFFERENT poll intervals, one for each server because seven 
different servers will be reached by at least seven different network paths!
> Currently we observe that both entry hosts can both become restricted due to 
> large offsets on other hosts, so they become restricted and that will make 
> the software refuse to go on. Ideally that would not happen.
> I will try to formulate questions:
> When the other hosts synchronize to the entry hosts of our system, don't the 
> other hosts ntpd know when and how much these entry hosts changed their time 
> due to input? 
> Would NTP would be more robust if we would configure routing on the entry 
> hosts, so that they can all speak directly with the external NTPs on their 
> own?
> Is the use of ntpdate before starting ntpd recommended and/or does the iburst 
> option replace it?
> Best regards,
> Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Harlan Stenn

I think most of your questions are answered at:


I'd also be happy to discuss with you or anybody else at your company how
membership in the NTP Forum would be of benefit.
Harlan Stenn <[EMAIL PROTECTED]>
http://ntpforum.isc.org  - be a member!


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread David Woolley
Kay Hayen wrote:
> External NTPs <-> 2 entry hosts <-> 8 other hosts.
> And iburst and minpoll=maxpoll=5 to improve the results.

If these External NTPs really are external, i.e. not owned by you, do 
not do this without explicit permission from their owners.  There is a 
real risk of countermeasures if you don't.  These may result in poor 
time or no time.  Generally polling with anything less than the default 
MINPOLL and MAXPOLL can be considered abusive and polling with a MAXPOLL 
less than the default MINPOLL will trigger countermeasures in any system 
configured to apply them.
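
For reference, minpoll and maxpoll are exponents of two, in seconds, so the defaults of 6 and 10 give 64 s and 1024 s, and a value of 5 gives 32 s. A small Python sketch of the rule of thumb above follows; the function names and the wording of the verdicts are made up for illustration and are not part of ntpd.

```python
DEFAULT_MINPOLL = 6   # 2**6  = 64 s
DEFAULT_MAXPOLL = 10  # 2**10 = 1024 s


def poll_seconds(exponent: int) -> int:
    """Poll exponents are powers of two, in seconds."""
    return 2 ** exponent


def poll_policy(minpoll: int, maxpoll: int) -> str:
    """Classify a minpoll/maxpoll pair against the defaults."""
    if minpoll > maxpoll:
        raise ValueError("minpoll must not exceed maxpoll")
    if maxpoll < DEFAULT_MINPOLL:
        return "maxpoll below default minpoll: likely to trigger countermeasures"
    if minpoll < DEFAULT_MINPOLL or maxpoll < DEFAULT_MAXPOLL:
        return "faster than the defaults: ask the server owner first"
    return "within defaults"
```

With `minpoll=maxpoll=5` this reports the first case, which is only acceptable on servers you own or have permission to poll that way.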

> Currently we observe that both entry hosts can both become restricted due to 
> large offsets on other hosts, so they become restricted and that will make 
> the software refuse to go on. Ideally that would not happen.

I've never triggered countermeasures (kiss of death), but I have a 
feeling that that is what you will observe on an NTP client that is too 
old to recognize the warning it will get from the server.

If you are not subject to countermeasures, you have something very very 
broken if you reach the 1000s drop dead point.  You should be worried, 
but it can happen legitimately, if you exceed the 128ms step threshold.
> I will try to formulate questions:
> When the other hosts synchronize to the entry hosts of our system, don't the 
> other hosts ntpd know when and how much these entry hosts changed their time 
> due to input? 

You seem to be under the misapprehension that ntpd makes step changes on 
each measurement.  It actually makes slow adjustments to effective 
frequency and rate of change of frequency based on a significant number 
of preceding measurements (Unruh: I'm over-simplifying both the 8 step 
filter and the low pass loop filter here).

> Would NTP would be more robust if we would configure routing on the entry 
> hosts, so that they can all speak directly with the external NTPs on their 
> own?

Ask permission from the owners of the external hosts before doing this, 
as it increases the load you impose. Also, it is likely to result in 
larger offsets between machines.
> Is the use of ntpdate before starting ntpd recommended and/or does the iburst 
> option replace it?

ntpdate is deprecated.  -g is the nearest equivalent function in ntpd.


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Unruh
[EMAIL PROTECTED] (Kay Hayen) writes:

>Hello Richard,

>you wrote:

>> The "rules" about how often to query a daemon are not all that
>> complicated.  The fact that there ARE rules is due to some history;
>> google for "Netgear Wisconsin" for the sordid details.  For a "second
>> opinion" google for "DLink PHK".

>Fascinating reads indeed, thanks for the pointers. 

>What worried me more was how often we can query the local ntpd before it will 
>have an adverse effect. Meantime I somehow I sought to convince me I should 
>be able to convince myself that ntpq requests are served at a different 
>priority (other socket) than ntpd requests are. I didn't find 2 sockets 

Depends on the system, but thousands of times per second is not out of the
ballpark. I assume you are not planning anything that severe.
(Some servers bombarded by those idiotic people, I believe, managed those
kinds of rates.)

>> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
>> "iburst" keyword in a server statement for fast startup.  You may use
>> the "burst" keyword ONLY with the permission of the the server's owner.
>> 99.99% of NTP installations will work very well using these rules".  If
>> yours does not, ask here for help!

>Now speaking about our system, not the middleware, with connections as 

>External NTPs <-> 2 entry hosts <-> 8 other hosts.

>And iburst and minpoll=maxpoll=5 to improve the results.

On which? That should NOT be on the external NTPs unless you own them. That
will not necessarily improve results; it depends on whether you want short
term accuracy or long term (e.g. what happens if the connection with the
outside world goes down for 3 days? Do you want to make sure your systems
will keep good time during those three days? Are you willing to buy 25usec
rather than 50usec short term accuracy for 10 sec drift over that 3 days?)

>Currently we observe that both entry hosts can both become restricted due to 
>large offsets on other hosts, so they become restricted and that will make 
>the software refuse to go on. Ideally that would not happen.

>I will try to formulate questions:

>When the other hosts synchronize to the entry hosts of our system, don't the 
>other hosts ntpd know when and how much these entry hosts changed their time 
>due to input? 

Yes, and no. On one level no-- they trust their sources. However part of
the information they get is the dispersion. That gives some info about how
well those servers are tracking the outside world.

>Would NTP would be more robust if we would configure routing on the entry 
>hosts, so that they can all speak directly with the external NTPs on their 

Multiple paths are always more robust than one path. 

>Is the use of ntpdate before starting ntpd recommended and/or does the iburst 
>option replace it?

Not recommended. 


Re: [ntp:questions] The libntp resumee...

2008-09-05 Thread Unruh
"Richard B. Gilbert" <[EMAIL PROTECTED]> writes:

>Kay Hayen wrote:
>> Hello Richard,
>> you wrote:
>>> The "rules" about how often to query a daemon are not all that
>>> complicated.  The fact that there ARE rules is due to some history;
>>> google for "Netgear Wisconsin" for the sordid details.  For a "second
>>> opinion" google for "DLink PHK".
>> Fascinating reads indeed, thanks for the pointers. 
>> What worried me more was how often we can query the local ntpd before it 
>> will 
>> have an adverse effect. Meantime I somehow I sought to convince me I should 
>> be able to convince myself that ntpq requests are served at a different 
>> priority (other socket) than ntpd requests are. I didn't find 2 sockets 
>> though.
>>> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
>>> "iburst" keyword in a server statement for fast startup.  You may use
>>> the "burst" keyword ONLY with the permission of the the server's owner.
>>> 99.99% of NTP installations will work very well using these rules".  If
>>> yours does not, ask here for help!
>> Now speaking about our system, not the middleware, with connections as 
>> follows: 
>> External NTPs <-> 2 entry hosts <-> 8 other hosts.
>What do you mean by "entry hosts"?

>> And iburst and minpoll=maxpoll=5 to improve the results.

>Use the default values of minpoll and maxpoll!  Ntpd will adjust the 
>polling interval within those limits.  Ntpd is far smarter than you or 

Well, you have too much faith in ntp. It is a whole series of compromises,
many set up in the days when one second network delays were not unknown. 
And one of the key design criteria in that minpoll/maxpoll is to relieve
congestion on the servers. If he is using his own servers (not outside
servers) then he can decrease the minpoll/maxpoll pairs (after all, the
refclocks run at minpoll=maxpoll=4). But there is a tradeoff. Because of the
design of ntp, if you choose a low maxpoll, you will keep the phase errors
smaller, but at the expense of larger drift errors (it basically averages
over a time interval a few times longer than the maxpoll interval). A longer
timebase means a longer lever arm for determining the drift, but at the
expense of not having as much data to beat down the statistical errors in
the offset. 

Thus, with ntp, if you want an accurate determination of the clock drift,
use a longer poll (e.g. if there is a chance of your system losing
connectivity for a few days). If you want lower phase noise while connected,
use a shorter poll. But remember that servers out there will get extremely
upset if you query them too often. 
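
The "lever arm" argument can be put into rough numbers. The Python sketch below is a back-of-the-envelope model, not ntpd's algorithm: it assumes the drift is estimated from just two offset measurements with independent noise, taken one baseline apart, and it ignores the clock's own wander. The function names are illustrative.

```python
import math


def two_point_drift_error_ppm(offset_noise_s: float, baseline_s: float) -> float:
    """Drift uncertainty when the rate is taken from two noisy offsets.

    The difference of two independent measurements has sqrt(2) times the
    noise of one, and dividing by the baseline turns it into a rate.
    """
    return math.sqrt(2) * offset_noise_s / baseline_s * 1e6


def holdover_error_s(drift_error_ppm: float, outage_s: float) -> float:
    """Time error accumulated while free-running with that drift error."""
    return drift_error_ppm * 1e-6 * outage_s
```

With 50 usec of offset noise, a 32 s baseline gives a drift uncertainty of about 2.2 ppm, or roughly 0.6 s over a three day outage, while a 1024 s baseline gives about 0.07 ppm, or under 20 ms. The real picture is less one-sided, because over long baselines the oscillator itself wanders with temperature.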

Essentially you want to be working at the Allan minimum to get rid of both
short and long term errors. But NTP does not determine where that is. It simply
assumes a value. That assumption is not necessarily very good. 
(With close-by clock servers and heavily used machines, which have lots of
temp fluctuations, the optimum point is much shorter than the assumption,
i.e. statistical errors are much smaller than clock drift errors.)

>I.  It will normally start by using minpoll and increase the interval 
>after it has initial synchronization.  If network conditions deteriorate 
>it will decrease the poll interval and increase it as conditions 
>improve.  IOW it will use the optimum poll interval for the conditions 
>then obtaining.  If you configured seven servers, you might observe ntpd 
>using seven DIFFERENT poll intervals, one for each server because seven 
>different servers will be reached by at least seven different network paths!
>> Currently we observe that both entry hosts can both become restricted due to 
>> large offsets on other hosts, so they become restricted and that will make 
>> the software refuse to go on. Ideally that would not happen.
>> I will try to formulate questions:
>> When the other hosts synchronize to the entry hosts of our system, don't the 
>> other hosts ntpd know when and how much these entry hosts changed their time 
>> due to input? 
>> Would NTP would be more robust if we would configure routing on the entry 
>> hosts, so that they can all speak directly with the external NTPs on their 
>> own?
>> Is the use of ntpdate before starting ntpd recommended and/or does the 
>> iburst 
>> option replace it?
>> Best regards,
>> Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-06 Thread David Woolley
Kay Hayen wrote:

> thanks to all who replied. Unfortunately the moderation bit made all of my 
> last replies expire and instead of reposting them, I choose to sum things up 
> in a single post.

Expiration is not a moderation thing, it is something done by your 
USENET service provider to manage disk space.  I think mine still has 
the whole thread, and Google groups certainly will, except for any that 
people have told it not to store.

The other thing that can have this effect, is that many Usenet readers, 
by default, hide messages you have already read.

On the other hand, it appears that you are submitting via the mail to 
news gateway.  If your ISP is expiring items in your mailbox so soon, you 
need a new ISP.


Re: [ntp:questions] The libntp resumee...

2008-09-06 Thread Kay Hayen

> > Now speaking about our system, not the middleware, with connections as
> > follows:
> >
> > External NTPs <-> 2 entry hosts <-> 8 other hosts.
> What do you mean by "entry hosts"?

Of our 10 machines, only 2 have a connection to the "external" NTP servers. 
These are the "entry hosts", and they act as servers for the "other" 8.

> > And iburst and minpoll=maxpoll=5 to improve the results.
> Use the default values of minpoll and maxpoll!  Ntpd will adjust the
> polling interval within those limits.  Ntpd is far smarter than you or
> I.  It will normally start by using minpoll and increase the interval
> after it has initial synchronization.  If network conditions deteriorate
> it will decrease the poll interval and increase it as conditions
> improve.  IOW it will use the optimum poll interval for the conditions
> then obtaining.  If you configured seven servers, you might observe ntpd
> using seven DIFFERENT poll intervals, one for each server because seven
> different servers will be reached by at least seven different network
> paths!

Well, to my knowledge we did it because we observed improved convergence 
behaviour on the 8 "other hosts", and particularly because it was not 
working before. At the time they do an "iburst", none of the entry machines 
may be running an ntpd yet, nor may it have completed its own iburst yet. 

They all boot at the same time, so that would be why the low poll value is 
used. As our system runs in isolated environments where people have full 
control, polling this frequently (still only every 32 seconds) is not a big 

We have requirements to be able to run the software in x seconds after reboot 
or else our customers' acceptance tests fail. The requirement makes sense as 
we are talking here about availability of service or not. Obviously the time 
should be as small as possible.

For the servers behind the "entry hosts" I don't see how we could let ntpd 
have its way when it's too slow. 

Our requirements are abnormal, admittedly. We require "equal" time on all 10 
machines, and that very fast. 

I somehow think we should have something with ntpdate before ntpd is run. It 
would wait for reachability of "ntpd" on the entry hosts and do an ntpdate 
before running the local ntpd with an iburst that will then have less work to 
do. (We shouldn't use a drift file in that case I presume, but due to issues 
with the old middleware NTP supervision, we can't anyway.)

Then we could be faster and be robust against boot order variations. 

Best regards,
Kay Hayen

Re: [ntp:questions] The libntp resumee...

2008-09-06 Thread Kay Hayen

Hello Mr. Unruh,

> >What worried me more was how often we can query the local ntpd before it
> > will have an adverse effect. Meantime I somehow I sought to convince me I
> > should be able to convince myself that ntpq requests are served at a
> > different priority (other socket) than ntpd requests are. I didn't find 2
> > sockets though.
> Depends on the system but thousands of times per second is not out of the
> ballpark. I assume you are not planning anything that severe.
> (Some servers bombarded by those idiotic people I believed managed those
> kinds of rates.)

No, not at all. We will only be targeting our local ntpd with ntpq requests 
and then we will likely be able to use low rates.

As we are now going to monitor the offsets on our own by contacting the 
external ntpd at some rate, we will only need to know when it's going to 
contact an ntpd, and then possibly restrict via another ntpq request.

For all of that it is no longer critical to be fast. Thank you for pounding 
on me with that. :-)

> >> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
> >> "iburst" keyword in a server statement for fast startup.  You may use
> >> the "burst" keyword ONLY with the permission of the the server's owner.
> >> 99.99% of NTP installations will work very well using these rules".  If
> >> yours does not, ask here for help!
> >
> >Now speaking about our system, not the middleware, with connections as
> >follows:
> >
> >External NTPs <-> 2 entry hosts <-> 8 other hosts.
> >
> >And iburst and minpoll=maxpoll=5 to improve the results.
> On which? That should NOT be on the external NTPs unless you own them. That
> will not necessarily improve results-- depends on whether you want short
> term accuracy or long term (eg what happens if the connection with the
> outside world goes down for 3 days. Do you want to make sure your systems
> will keep good time during those three days? Are you willing to buy 25usec
> rather thahn 50usec short term accuracy for 10 sec drift over that 3 days?

If the NTP connections fail, we can accept a slow drift very well. But see my 
last response to Richard B. Gilbert about why this is needed. We want the 8 
other hosts to synchronize fast. 

When they "iburst" none of the entry hosts may already have completed its own 
startup, so they need to poll quickly even after the "iburst" or else 
synchronization after reboot will take too long.

> >Currently we observe that both entry hosts can both become restricted due
> > to large offsets on other hosts, so they become restricted and that will
> > make the software refuse to go on. Ideally that would not happen.
> >
> >
> >I will try to formulate questions:
> >
> >When the other hosts synchronize to the entry hosts of our system, don't
> > the other hosts ntpd know when and how much these entry hosts changed
> > their time due to input?
> Yes, and no. On one level no-- they trust their sources. However part of
> the information they get is the dispersion. That gives some info about how
> well those servers are tracking the outside world.

But that would be more of "no". All the increased dispersion on "entry hosts" 
due to required time shifting is going to give us is a slow down in the 
synchronization of the "other hosts".

> >Is the use of ntpdate before starting ntpd recommended and/or does the
> > iburst option replace it?
> Not recommended.

I sort of think that we can build something for the "other" hosts that makes 
them wait for the "entry" hosts to be synchronized. See that response to 
Richard B. Gilbert again. 

Alternatively, we could change ntpd in a way that the iburst lasts 
until a sufficient synchronization is achieved. But it appears to be 
simpler to delay the iburst by delaying the ntpd start until sufficient 
conditions are met.

For the startup of our system that could be a solution that removes the need 
for permanently low poll intervals, since they are only needed initially.

Best regards,
Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-06 Thread Kay Hayen

Hello David,

Am Freitag, 5. September 2008 23:50:39 schrieb David Woolley:
> Kay Hayen wrote:
> > External NTPs <-> 2 entry hosts <-> 8 other hosts.
> >
> > And iburst and minpoll=maxpoll=5 to improve the results.
> If these External NTPs really are external, i.e. not owned by you, do
> not do this without explicit permission from their owners.  There is a
> real risk of countermeasures if you don't.  These may result in poor
> time or no time.  Generally polling with anything less than the default
> MINPOLL and MAXPOLL can be considered abusive and polling with a MAXPOLL
> less than the default MINPOLL will trigger countermeasures in any system
> configure to apply them.

They are owned by the same people who then own installations of our system, so 
that wouldn't be an issue.

When I say "restrict", it is our own system that decides that a ">x ms" 
offset is too bad and prevents ntpd from talking to that server any further 
with a "restrict" command. If both servers of an "other host" are 
"restricted", it will crash the software.
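
In Python terms, the supervision logic just described amounts to something like the sketch below. It is only an illustration; the function names are invented here, and the offset limit is passed in as a parameter because it differs per installation.

```python
from typing import Dict, Set


def servers_to_restrict(offsets_ms: Dict[str, float], limit_ms: float) -> Set[str]:
    """Servers whose absolute offset exceeds the configured limit."""
    return {name for name, off in offsets_ms.items() if abs(off) > limit_ms}


def can_continue(offsets_ms: Dict[str, float], limit_ms: float) -> bool:
    """The application may run only while at least one server is usable."""
    return len(servers_to_restrict(offsets_ms, limit_ms)) < len(offsets_ms)
```

An "other host" would call this with the offsets of its two entry hosts; the failure we see after a simultaneous reboot is the case where both exceed the limit at once.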

All of that is of our own making and under our control.

Regarding the poll values: I am not sure why we do it for the external NTPs 
as well. It could be that the dispersion can be brought down quicker this way 
on "entry hosts", allowing the "other hosts" to synchronize faster with them, 
or it could be that we never considered it worthwhile to optimize it away.

> > I will try to formulate questions:
> >
> > When the other hosts synchronize to the entry hosts of our system, don't
> > the other hosts ntpd know when and how much these entry hosts changed
> > their time due to input?
> You seem to be under the misapprehension that ntpd makes step changes on
> each measurement.  It actually makes slow adjustments to effective
> frequency and rate of change of frequency based on s signficant number
> of preceding measurements (Unruh: I'm over-simplifying both the 8 step
> filter and the low pass loop filter here).

Well yes, but between 2 queries from the same client the ntpd will have made a 
certain adjustment. If the client gets to know this value, it will have to 
blame its own clock for that extra difference and assign it dispersion that 
it doesn't deserve.

So, what I don't get is probably more like: How much will a stratum 3 server 
be able to use the stratum 2 server only as an indirection of the stratum 1 

In my mind the stratum 2 server was only trying to be accurate about how old 
its stratum 1 time data is, so that the stratum 3 would be enabled to try and 
converge towards the stratum 1 clock.

In other words I could say: I was expecting the ntpd answer to contain 
upstream ntpd answers as well. And I was expecting the processing ntpd trying 
to guess the stratum 1 time. Instead it seems, it is "only" trying to guess 
the next upstream time and based on dispersions (its own and upstream) it's 
following the direction of the guessed time more or less closely.

That's a different model and I think Mr. Unruh already clarified to me that 
it's not the model that NTP uses. I think "my" model has little experience or 
qualification behind it. Current NTP on the other hand is proven.

But I guess it explains why I have had such a hard time understanding why 
the "entry hosts" are so bad at forwarding the time after reboot. Obviously 
NTP is not and need not be optimized towards simultaneous initialization.

Best regards,
Kay Hayen

Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread Harlan Stenn

If you use iburst in your config files and have a "good" value in the
ntp.drift file, ntpd should sync up and be ready to go in about 11 seconds.

Please see:




Harlan Stenn <[EMAIL PROTECTED]>
http://ntpforum.isc.org  - be a member!


Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread Kay Hayen

Hello Harlan,

you wrote:

> If you use iburst in your config files and have a "good" value in the
> ntp.drift file, ntpd should sync up and be ready to go in about 11 seconds.
> Please see:
>  https://support.ntp.org/bin/view/Support/ConfiguringNTP
> and
>  https://support.ntp.org/bin/view/Support/StartingNTP

I wasn't aware of "ntp-wait" yet. Seems to do (almost) what we might want:

Quote :

If you have services (like some database servers) that require that the time 
is never "stepped" backwards, run: 
   ntp-wait -v

as late as possible in the boot sequence, before starting these time-sensitive 

In effect that is what we want to do before the start of our application. But 
it doesn't solve the problem fully for us. We would want on our "other hosts" 
to have it check remote ntpd. That way we would have:

External NTPs <-> Entry Hosts <-> Other Hosts

The "entry hosts" would simply do a local ntp-wait before starting the 
application, but otherwise behave as normal. They only need to iburst and 
then use default poll values.

The "other hosts" would do 2 ntp-waits on the 2 entry hosts. Only once either 
of them finishes is ntpd started and the boot sequence continued.

Et voila, our simultaneous initialization problems would be gone. Checking the 
man page of ntp-wait on my Debian Testing here (4.2.4p4) it seems we would 
have to enable the query of remote hosts first, but that sounds like a rather 
simple patch.

The fundamental issue is that the "iburst" of the "other" hosts gets done 
before it is entirely useful (the entry hosts are only just synchronizing at 
best) and a remote ntp-wait could solve that.
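
The remote wait described above can be sketched roughly as follows. This is only a sketch under assumptions, not the ntp-wait script itself: it assumes that `ntpq -c rv <host>` is permitted by the entry hosts (no `noquery` restriction for the caller) and that a synchronized ntpd reports `leap_none`, `leap_add_sec` or `leap_del_sec` in its status line, while an unsynchronized one reports `leap_alarm`. The function names, the retry count and the pause are made up for illustration.

```python
import subprocess
import time


def looks_synchronized(rv_output):
    """Return True if the ntpq output reports a usable leap status.

    Assumption: an unsynchronized ntpd shows leap_alarm, a synchronized
    one shows leap_none, leap_add_sec or leap_del_sec.
    """
    if "leap_alarm" in rv_output:
        return False
    return any(tag in rv_output
               for tag in ("leap_none", "leap_add_sec", "leap_del_sec"))


def query_host(host):
    """Ask a (possibly remote) ntpd for its system variables via ntpq."""
    try:
        result = subprocess.run(["ntpq", "-c", "rv", host],
                                capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return ""  # treat "no answer" like "not synchronized yet"
    return result.stdout


def wait_for_any(hosts, tries=100, pause=6.0,
                 query=query_host, sleep=time.sleep):
    """Poll all hosts until one is synchronized; return it, or None."""
    for attempt in range(tries):
        for host in hosts:
            if looks_synchronized(query(host)):
                return host
        if attempt + 1 < tries:
            sleep(pause)
    return None
```

An "other host" would call something like `wait_for_any(["entry1", "entry2"])` (hypothetical host names) before starting its own ntpd, and continue the boot sequence as soon as a host name is returned.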

Best regards,
Kay Hayen

PS: Addressing the support suggestion too: we will definitely consider it. So 
far only our customers have had such contracts for their operational use. But 
as we start to provide an NTP monitoring middleware as well, the situation 
will be entirely different. There we don't control the NTP setup at all, but 
only monitor it and raise alarms that will frequently result in support 
questions to us. We would like to have a partner for these, I presume. I will 
raise the issue in a meeting next week.


Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread David Woolley
Kay Hayen wrote:

> We could alternatively want to change ntpd in a way that the iburst lasts 
> until a sufficient synchronization was achieved. But it appears to be 
> simpler to delay the iburst by delaying the ntpd start until sufficient 
> conditions are met.

That's not going to be desirable.  Although you might only use it on 
your internal servers, it will soon get round on the grapevine that it is 
a good thing to do, which will result in servers that are down or denied 
to the client, or the networks of ex-servers getting bombarded with 
large numbers of requests, whereas I believe the standard behaviour is 
to back off under those circumstances.


Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread David Woolley
Harlan Stenn wrote:
> If you use iburst in your config files and have a "good" value in the
> ntp.drift file, ntpd should sync up and be ready to go in about 11 seconds.

It may be in error by up to 128ms under these circumstances, which will 
take an hour or so to correct, during which there will be significant 
frequency excursions.  Note that this isn't a consequence of iburst, 
iburst simply means that it will accept the current local time faster.


Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread Kay Hayen
Hello David,

Am Sonntag, 7. September 2008 10:36:35 schrieb David Woolley:
> Kay Hayen wrote:
> > We could alternatively want to change ntpd in a way that the iburst lasts
> > until a sufficient synchronization was achieved. But it appears to be
> > simpler to delay the iburst by delaying the ntpd start until
> > sufficient conditions are met.
> That's not going to be desirable.  Although you might only use it on
> your internal servers, it will soon get round on the grapevine that it is
> a good thing to do, which will result in servers that are down or denied
> to the client, or the networks of ex-servers getting bombarded with
> large numbers of requests, whereas I believe the standard behaviour is
> to back off under those circumstances.

Which is why I assume that the project won't accept patches that make ntp-wait 
work with remote hosts. Anybody correct me if I am wrong, because we would 
gladly contribute such patches. In the meantime, we can use the new libntpq 
to achieve the same effect.

The use of ntp-wait on external servers would be pointless and harmful. It 
only makes sense to us, because we sort of _know_ that our startup process is 
in a race of all machines at the same time, because of the joint reboot, 
joint power up after (simulated) power failure.

I think the 11 seconds that Harlan mentioned are something we definitely want 
to have, but we fail to meet the preconditions on the "other" hosts, due to the 
lack of a waiting step. But if we have measures in place to make sure that 
our "entry" servers are themselves synchronized before we burst on 
them with our "other" hosts, then it will all be graceful, I guess.

If done correctly (async to other boot tasks), the delay in starting our 
application could become difficult to notice. And using the new-born libntpq 
with remote hosts, an implementation of ntp-wait for remote servers will be 
rather trivial to make.

Well, to sum it up: I think I have a plan now.

Best regards,
Kay Hayen

Re: [ntp:questions] The libntp resumee...

2008-09-07 Thread Harlan Stenn
>>> In article <[EMAIL PROTECTED]>, David Woolley <[EMAIL PROTECTED]> writes:

David> Harlan Stenn wrote:
>> If you use iburst in your config files and have a "good" value in the
>> ntp.drift file, ntpd should sync up and be ready to go in about 11
>> seconds.

David> It may be in error by up to 128ms under these circumstances, which
David> will take an hour or so to correct, during which there will be
David> significant frequency excursions.  Note that this isn't a consequence
David> of iburst, iburst simply means that it will accept the current local
David> time faster.

Please see (and add useful content to):


Harlan Stenn <[EMAIL PROTECTED]>
http://ntpforum.isc.org  - be a member!


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread David Woolley
Kay Hayen wrote:
> They are owned by the same people who then own installations of our system, 
> so that wouldn't be an issue.

You will still have to ensure that they do not enable kiss of death on 
those servers.  Also you should make sure that they don't try to use 
w32time, especially older versions, and if your timing requirements are 
tighter than a few tens of milliseconds, that there are no Windows 
machines involved.

> When I say "restrict" it is our own system that decides that ">x ms" offset 
> is too bad and prevents ntpd from talking to it any further with a "restrict" 
> command. If all 2 servers of an "other host" are "restricted", it will crash 
> the software.

You are overriding NTP's selection algorithms.  Effectively you are no 
longer running NTP.

> All of that is of our own making and control.
> Regarding the poll values. I am not sure why we do it the external NTPs as 
> well. Could be that the dispersion can be brought down quicker this way 

You are misusing "dispersion".  Dispersion is an estimate of worst case 
drift and reading resolution errors.
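
To put a number on that definition: ntpd 4.x and the NTPv4 specification (RFC 5905, published after this thread) let the dispersion of a sample start from the reading precision of the two clocks and then grow with the sample's age at an assumed worst-case drift rate of 15 ppm. The following is a simplified sketch of that per-sample error budget only; the function name and the precision values are illustrative assumptions, and the clock filter that combines several samples is not modelled.

```python
PHI = 15e-6  # assumed worst-case frequency tolerance, seconds per second


def sample_dispersion(server_precision, client_precision, elapsed):
    """Error budget for one sample: reading errors plus PHI times its age."""
    return server_precision + client_precision + PHI * elapsed


# Precisions of 2**-20 s on both ends are purely illustrative.
fresh = sample_dispersion(2.0 ** -20, 2.0 ** -20, 0.0)
after_poll = sample_dispersion(2.0 ** -20, 2.0 ** -20, 64.0)
growth = after_poll - fresh  # PHI * 64 s, i.e. just under 1 ms
```

Seen this way, the dispersion of a sample grows with the time since it was taken rather than shrinking with the number of samples collected.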

> on "entry hosts" and allow the "other hosts" to synchronize faster with them, 
> or could be that we never considered it worthwhile to optimize it away.
> Well yes, but between 2 queries from the same client the ntpd will have made 
> a certain adjustment. If the client gets to know this value, it will have to 

ntpd is making adjustments at least every 4 seconds (old versions) and 
as often as every clock tick.  It does this by adjusting frequency not 
by directly adjusting time.

> blame its own clock for that extra difference and assign it dispersion that 
> it doesn't deserve.
> That's a different model and I think Mr. Unruh already clarified to me that 
> it's not the model that NTP uses. I think "my" model has little experience or 
> qualification behind it. Current NTP on the other hand is proven.

Firstly, I don't know any time synchronisation software that doesn't 
have a large step by step element.

More importantly, if you are going to micro-manage ntpd, you need a deep 
understanding of how it works to know what the statistics really mean 
and know what are realistic expectations.


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Martin Burnicki

David Woolley wrote:
> Kay Hayen wrote:
>> thanks to all who replied. Unfortunately the moderation bit made all of
>> my last replies expire and instead of reposting them, I choose to sum
>> things up in a single post.
> Expiration is not a moderation thing, it is something done by your
> USENET service provider to manage disk space.  I think mine still has
> the whole thread, and Google groups certainly will, except for any that
> people have told it not to store.
> The other thing that can have this effect, is that many Usenet readers,
> by default, hide messages you have already read.
> On the other hand, it appears that you are submitting via the mail to
> news gateway.  If your ISP is expiring items in your mailbox so soon, you
> need a new ISP.

If I understand Kay correctly then the problem is that he responded via the
questions@ list and the moderation bit was set there, which prevented some
of his articles from being gatewayed to the usenet.

In Kay's original thread Steve Kostecke mentioned he had removed the
moderation bit for Kay, but obviously that did not fully help.

Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Kay Hayen

Hello David,

> > When I say "restrict" it is our own system that decides that ">x ms"
> > offset is too bad and prevents ntpd from talking to it any further with a
> > "restrict" command. If all 2 servers of an "other host" are "restricted",
> > it will crash the software.
> You are overriding NTP's selection algorithms.  Effectively you are no
> longer running NTP.

How would it be different from using the restrict command manually? 

And why would it not be NTP?

> > All of that is of our own making and control.
> >
> > Regarding the poll values. I am not sure why we do it the external NTPs
> > as well. Could be that the dispersion can be brought down quicker this
> > way
> You are misusing "dispersion".  Dispersion is an estimate of worst case
> drift and reading resolution errors.

Well, dispersion is going down only with more samples to base estimation on, 
isn't it? And we need that quick, if we want the server to influence the 
hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
have dedicated LANs for NTP).

> > on "entry hosts" and allow the "other hosts" to synchronize faster with
> > them, or could be that we never considered it worthwhile to optimize it
> > away. Well yes, but between 2 queries from the same client the ntpd will
> > have made a certain adjustment. If the client gets to know this value, it
> > will have to
> ntpd is making adjustments at least every 4 seconds (old versions) and
> as often as every clock tick.  It does this by adjusting frequency not
> by directly adjusting time.

I was not concerned with how the kernel makes the adjustments, but rather that 
a fixed time change over the period is known. The slew rate is known, 
isn't it?

Let me use a car analogy, these things work. :-)

Let's assume a three-lane highway with 3 cars that try to drive at the same 
speed. The car to the left is driving at (near) constant speed. The driver in 
the middle accelerates and brakes according to his motor's behaviour as well as 
the observed difference in speed between him and the other one. Now what 
should the driver to the right do?

In my view, he could take the acceleration of his neighbour into account when 
making estimates of his own error.

Best regards,
Kay Hayen

Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Unruh
[EMAIL PROTECTED] (Kay Hayen) writes:

>Hello David,

>> > When I say "restrict" it is our own system that decides that ">x ms"
>> > offset is too bad and prevents ntpd from talking to it any further with a
>> > "restrict" command. If all 2 servers of an "other host" are "restricted",
>> > it will crash the software.
>> You are overriding NTP's selection algorithms.  Effectively you are no
>> longer running NTP.

>How would it be different from using the restrict command manually? 

>And why would it not be NTP?

>> > All of that is of our own making and control.
>> >
>> > Regarding the poll values. I am not sure why we do it the external NTPs
>> > as well. Could be that the dispersion can be brought down quicker this
>> > way
>> You are misusing "dispersion".  Dispersion is an estimate of worst case
>> drift and reading resolution errors.

>Well, dispersion is going down only with more samples to base estimation on, 
>isn't it? And we need that quick, if we want the server to influence the 
>hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
>have dedicated LANs for NTP).

>> > on "entry hosts" and allow the "other hosts" to synchronize faster with
>> > them, or could be that we never considered it worthwhile to optimize it
>> > away. Well yes, but between 2 queries from the same client the ntpd will
>> > have made a certain adjustment. If the client gets to know this value, it
>> > will have to
>> ntpd is making adjustments at least every 4 seconds (old versions) and
>> as often as every clock tick.  It does this by adjusting frequency not
>> by directly adjusting time.

>I was not concerned with how the kernel makes the adjustments, but rather that 
>a fixed time change over the period is known. The slew rate is known, 
>isn't it?

>Let me use a car analogy, these things work. :-)

>Let's assume a three-lane highway with 3 cars that try to drive at the same 
>speed. The car to the left is driving at (near) constant speed. The driver in 
>the middle accelerates and brakes according to his motor's behaviour as well as 
>the observed difference in speed between him and the other one. Now what 
>should the driver to the right do?

The cars have the road as a reference. However, without the road, how does
car 3 know that car 2 is accelerating and decelerating and that it is not
his own car that is misbehaving?  He does not. All he
can do is collect more cars and use the average behaviour to determine who
is behaving badly. 

With two other cars only as a reference there is no way of deciding which
is weird. 

And if he has the road as a reference, then use the road, not either of the
other cars ( ie buy yourself a GPS receiver with PPS and then you will not
have to worry about what other cars are doing).

>In my view, he could take the acceleration of his neighbour into account when 
>making estimates of his own error.

>Best regards,
>Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Richard B. Gilbert
Unruh wrote:
> [EMAIL PROTECTED] (Kay Hayen) writes:
>> Hello David,
>>>> When I say "restrict" it is our own system that decides that ">x ms"
>>>> offset is too bad and prevents ntpd from talking to it any further with a
>>>> "restrict" command. If all 2 servers of an "other host" are "restricted",
>>>> it will crash the software.
>>> You are overriding NTP's selection algorithms.  Effectively you are no
>>> longer running NTP.
>> How would it be different from using the restrict command manually? 
>> And why would it not be NTP?
>>>> All of that is of our own making and control.
>>>>
>>>> Regarding the poll values. I am not sure why we do it the external NTPs
>>>> as well. Could be that the dispersion can be brought down quicker this
>>>> way
>>> You are misusing "dispersion".  Dispersion is an estimate of worst case
>>> drift and reading resolution errors.
>> Well, dispersion is going down only with more samples to base estimation on, 
>> isn't it? And we need that quick, if we want the server to influence the 
>> hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
>> have dedicated LANs for NTP).
>>>> on "entry hosts" and allow the "other hosts" to synchronize faster with
>>>> them, or could be that we never considered it worthwhile to optimize it
>>>> away. Well yes, but between 2 queries from the same client the ntpd will
>>>> have made a certain adjustment. If the client gets to know this value, it
>>>> will have to
>>> ntpd is making adjustments at least every 4 seconds (old versions) and
>>> as often as every clock tick.  It does this by adjusting frequency not
>>> by directly adjusting time.
>> I was not concerned with how the kernel makes the adjustments, but rather 
>> that a fixed time change over the period is known. The slew rate is known, 
>> isn't it?
>> Let me use a car analogy, these things work. :-)
>> Let's assume a three-lane highway with 3 cars that try to drive at the same 
>> speed. The car to the left is driving at (near) constant speed. The driver 
>> in the middle accelerates and brakes according to his motor's behaviour as 
>> well as the observed difference in speed between him and the other one. Now 
>> what should the driver to the right do?
> The cars have the road as a reference. However, without the road, how does
> car 3 know that car 2 is accelerating and decelerating and that it is not
> his own car that is misbehaving?  He does not. All he
> can do is collect more cars and use the average behaviour to determine who
> is behaving badly.

Car 3 has a speedometer!

> With two other cars only as a reference there is no way of deciding which
> is weird. 
> And if he has the road as a reference, then use the road, not either of the
> other cars ( ie buy yourself a GPS receiver with PPS and then you will not
> have to worry about what other cars are doing).
>> In my view, he could take the acceleration of his neighbour into account 
>> when making estimates of his own error.
>> Best regards,
>> Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Richard B. Gilbert
Kay Hayen wrote:
> Hello David,
>>> When I say "restrict" it is our own system that decides that ">x ms"
>>> offset is too bad and prevents ntpd from talking to it any further with a
>>> "restrict" command. If all 2 servers of an "other host" are "restricted",
>>> it will crash the software.
>> You are overriding NTP's selection algorithms.  Effectively you are no
>> longer running NTP.
> How would it be different from using the restrict command manually? 
> And why would it not be NTP?
>>> All of that is of our own making and control.
>>> Regarding the poll values. I am not sure why we do it the external NTPs
>>> as well. Could be that the dispersion can be brought down quicker this
>>> way
>> You are misusing "dispersion".  Dispersion is an estimate of worst case
>> drift and reading resolution errors.
> Well, dispersion is going down only with more samples to base estimation on, 
> isn't it? And we need that quick, if we want the server to influence the 
> hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
> have dedicated LANs for NTP).
>>> on "entry hosts" and allow the "other hosts" to synchronize faster with
>>> them, or could be that we never considered it worthwhile to optimize it
>>> away. Well yes, but between 2 queries from the same client the ntpd will
>>> have made a certain adjustment. If the client gets to know this value, it
>>> will have to
>> ntpd is making adjustments at least every 4 seconds (old versions) and
>> as often as every clock tick.  It does this by adjusting frequency not
>> by directly adjusting time.
> I was not concerned with how the kernel makes the adjustments, but rather 
> that a fixed time change over the period is known. The slew rate is known, 
> isn't it?
> Let me use a car analogy, these things work. :-)
> Let's assume a three-lane highway with 3 cars that try to drive at the same 
> speed. The car to the left is driving at (near) constant speed. The driver in 
> the middle accelerates and brakes according to his motor's behaviour as well as 
> the observed difference in speed between him and the other one. Now what 
> should the driver to the right do?
> In my view, he could take the acceleration of his neighbour into account when 
> making estimates of his own error.

Why should the driver in the right lane not ignore the driver in the 
middle and try to match his speed to the leftmost driver?  It seems to 
me that this is analogous to preferring the stratum one server to the 
stratum two server!


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Steve Kostecke
On 2008-09-08, Martin Burnicki <[EMAIL PROTECTED]> wrote:

> If I understand Kay correctly then the problem is that he responded
> via the questions@ list and the moderation bit was set there, which
> prevented some of his articles from being gatewayed to the usenet.

When messages are held for moderation they are not sent to the
mailing-list. _None_ of the list subscribers (which includes the
gateway) see those messages until they are released.

> In Kay's original thread Steve Kostecke mentioned he had removed the
> moderation bit for Kay, but obviously that did not fully help.

I stated that I released Kay's messages but that I left the moderation
bit alone and deferred to the list-master.

Steve Kostecke <[EMAIL PROTECTED]>
NTP Public Services Project - http://support.ntp.org/


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread David Woolley
Kay Hayen wrote:
> How would it be different from using the restrict command manually? 

Because manual use would normally be based on significant thought and 
measurements over an extended period.

> And why would it not be NTP?

Because a key part of NTP is the algorithm used to identify and reject 
unreliable sources of time.  These actually work better if you have many 

> Well, dispersion is going down only with more samples to base estimation on, 

The calculation initially assumes that the source jitter might be very 
large until it has evidence to the contrary.

> isn't it? And we need that quick, if we want the server to influence the 
> hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
> have dedicated LANs for NTP).

iburst covers that.

> I was not concerned with how the kernel makes the adjustments, but rather 
> that a fixed time change over the period is known. The slew rate is known, 
> isn't it?

The actual change in time in any period should be zero, within 
statistical error.  The real excess slew rate should also be zero within 
statistical error.  The assumed length of a tick, which is probably the 
reciprocal of what you mean by the slew rate, is continuously varying. 
You would need to integrate this to get the excess number of ticks over 
a period, which is, I think, your concept of error.

(The big argument between chrony and ntpd is about whether ntpd really 
gives the best estimate of true time for real inputs.)

> Let me use a car analogy, these things work. :-)
> Let's assume a three-lane highway with 3 cars that try to drive at the same 
> speed. The car to the left is driving at (near) constant speed. The driver in 
> the middle accelerates and brakes according to his motor's behaviour as well as 
> the observed difference in speed between him and the other one. Now what 
> should the driver to the right do?
> In my view, he could take the acceleration of his neighbour into account when 
> making estimates of his own error.

Analogies are always unsafe in fora, but the second car doesn't actually 
know its acceleration (remember, if they could actually see the road, 
they would use that as reference).  All it knows is how hard its driver 
is pushing on the accelerator.

Moreover, the drivers are looking at each other through mirrors that are 
vibrating violently and unpredictably, such that the apparent position 
of the neighbours is varying much more than their true relative 
position.  To a significant extent the mirrors are moving independently 
of each other (this probably requires that the third driver actually be 
the middle one, to make the physical model sensible).


Re: [ntp:questions] The libntp resumee...

2008-09-08 Thread Unruh
"Richard B. Gilbert" <[EMAIL PROTECTED]> writes:

>Unruh wrote:
>> [EMAIL PROTECTED] (Kay Hayen) writes:
>>> Hello David,
>>>>> When I say "restrict" it is our own system that decides that ">x ms"
>>>>> offset is too bad and prevents ntpd from talking to it any further with a
>>>>> "restrict" command. If all 2 servers of an "other host" are "restricted",
>>>>> it will crash the software.
>>>> You are overriding NTP's selection algorithms.  Effectively you are no
>>>> longer running NTP.
>>> How would it be different from using the restrict command manually? 
>>> And why would it not be NTP?
>>>>> All of that is of our own making and control.
>>>>> Regarding the poll values. I am not sure why we do it the external NTPs
>>>>> as well. Could be that the dispersion can be brought down quicker this
>>>>> way
>>>> You are misusing "dispersion".  Dispersion is an estimate of worst case
>>>> drift and reading resolution errors.
>>> Well, dispersion is going down only with more samples to base estimation 
>>> on, isn't it? And we need that quick, if we want the server to influence the 
>>> hosts behind it quickly, say after a "NTP LAN" failure ended (some people 
>>> have dedicated LANs for NTP).
>>>>> on "entry hosts" and allow the "other hosts" to synchronize faster with
>>>>> them, or could be that we never considered it worthwhile to optimize it
>>>>> away. Well yes, but between 2 queries from the same client the ntpd will
>>>>> have made a certain adjustment. If the client gets to know this value, it
>>>>> will have to
>>>> ntpd is making adjustments at least every 4 seconds (old versions) and
>>>> as often as every clock tick.  It does this by adjusting frequency not
>>>> by directly adjusting time.
>>> I was not concerned with how the kernel makes the adjustments, but rather 
>>> that a fixed time change over the period is known. The slew rate is known, 
>>> isn't it?
>>> Let me use a car analogy, these things work. :-)
>>> Let's assume a three-lane highway with 3 cars that try to drive at the same 
>>> speed. The car to the left is driving at (near) constant speed. The driver 
>>> in the middle accelerates and brakes according to his motor's behaviour as 
>>> well as the observed difference in speed between him and the other one. Now 
>>> what should the driver to the right do?
>> The cars have the road as a reference. However, without the road, how does
>> car 3 know that car 2 is accelerating and decelerating and that it is not
>> his own car that is misbehaving?  He does not. All he
>> can do is collect more cars and use the average behaviour to determine who
>> is behaving badly.

>Car 3 has a speedometer!

Yes, that is with reference to the road. Car three should thus completely
ignore the other two cars and use his speedometer. 

Ie, put up a GPS receiver with a PPS and use that as your time source, and
ignore all the other ntp time sources, except perhaps as sanity checks (eg
if your speedometer breaks you should get to know about it by occasionally
looking at the other cars)

>> With two other cars only as a reference there is no way of deciding which
>> is weird. 
>> And if he has the road as a reference, then use the road, not either of the
>> other cars ( ie buy yourself a GPS receiver with PPS and then you will not
>> have to worry about what other cars are doing).
>>> In my view, he could take the acceleration of his neighbour into account 
>>> when making estimates of his own error.
>>> Best regards,
>>> Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Martin Burnicki

Steve Kostecke wrote:
> On 2008-09-08, Martin Burnicki <[EMAIL PROTECTED]> wrote:
>> In Kay's original thread Steve Kostecke mentioned he had removed the
>> moderation bit for Kay, but obviously that did not fully help.
> I stated that I released Kay's messages but that I left the moderation
> bit alone and deferred to the list-master.

Sorry, I mis-remembered this.

Has the moderation bit for Kay been set because he posted to the questions@
list without having subscribed to the list?

Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont


Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Hal Murray

>The "rules" about how often to query a daemon are not all that 
>complicated.  The fact that there ARE rules is due to some history; 
>google for "Netgear Wisconsin" for the sordid details.  For a "second 
>opinion" google for "DLink PHK".

There is a good summary at:

  NTP server misuse and abuse

These are my opinions, not necessarily my employer's.  I hate spam.


Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Uwe Klein
Unruh wrote:
> Yes, that is with reference to the road. Car three should thus completely
> ignore the other two cars and use his speedometer. 
> Ie, put up a GPS receiver with a PPS and use that as your time source, and
> ignore all the other ntp time sources, except perhaps as sanity checks (eg
> if your speedometer breaks you should get to know about it by occasionally
> looking at the other cars)

A)One GPS to each box or
B) a single GPS with PPS line to all boxes?

Doesn't that impact reliability?

You add the failure probability of a GPS-unit to each Box
where one failure will make the whole system fail.

What about doing startup of all involved boxes from the (outside)
upstream timeserver?

a question in this context:

could I use something like this for a group of boxes to sync:

server $external_upstream_host

foreach box $neighbours
peer $box
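
Spelled out, the pseudo-config above would produce one `server` line and one `peer` line per neighbour on each box. A small sketch that generates such a file; the function name and host names are hypothetical, and it only writes out what the pseudo-config describes without saying anything about whether such a mesh is advisable.

```python
def build_ntp_conf(upstream, neighbours, this_box):
    """Return ntp.conf text: one upstream server, peers for the other boxes."""
    lines = ["server {}".format(upstream)]
    for box in neighbours:
        if box != this_box:  # a box should not peer with itself
            lines.append("peer {}".format(box))
    return "\n".join(lines) + "\n"


# Hypothetical group of three boxes; this is the file for box2.
conf_for_box2 = build_ntp_conf("upstream.example.net",
                               ["box1", "box2", "box3"], "box2")
```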



Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Steve Kostecke
On 2008-09-09, Martin Burnicki <[EMAIL PROTECTED]> wrote:

> Has the moderation bit for Kay been set because he posted to the questions@
> list without having subscribed to the list?

Kay did the right thing and subscribed to the list before posting to it.

Posts from all new subscribers are held for moderation (i.e. "their
moderation bit is set") until they have demonstrated that they are not
attempting to use the list in an abusive manner. This policy keeps out
the "drive-by" spammers.

Steve Kostecke <[EMAIL PROTECTED]>
NTP Public Services Project - http://support.ntp.org/


Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Unruh
Uwe Klein <[EMAIL PROTECTED]> writes:

>Unruh wrote:
>> Yes, that is with reference to the road. Car three should thus completely
>> ignore the other two cars and use his speedometer. 
>> Ie, put up a GPS receiver with a PPS and use that as your time source, and
>> ignore all the other ntp time sources, except perhaps as sanity checks (eg
>> if your speedometer breaks you should get to know about it by occasionally
>> looking at the other cars)

>A)One GPS to each box or
>B) a single GPS with PPS line to all boxes?

Whichever you want. Up to you.

>Doesn't that impact reliability?

>You add the failure probability of a GPS-unit to each Box
>where one failure will make the whole system fail.

So, that is why ntp has backup servers. You have a single failure point
anyway-- the network. It goes down, and nothing can get the time. 

>What about doing startup of all involved boxes from the (outside)
>upstream timeserver?


>a question in this context:

>could I use something like this for a group of boxes to sync:

>server $external_upstream_host

>foreach box $neighbours
>   peer $box




Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Uwe Klein
Unruh wrote:
> Uwe Klein <[EMAIL PROTECTED]> writes:
>>Unruh wrote:
>>>Yes, that is with reference to the road. Car three should thus completely
>>>ignore the other two cars and use his speedometer. 
>>>Ie, put up a GPS receiver with a PPS and use that as your time source, and
>>>ignore all the other ntp time sources, except perhaps as sanity checks (eg
>>>if your speedometer breaks you should get to know about it by occasionally
>>>looking at the other cars)
>>A)One GPS to each box or
>>B) a single GPS with PPS line to all boxes?
> Whichever you want. Up to you.
>>Doesn't that impact reliability?
>>You add the failure probability of a GPS-unit to each Box
>>where one failure will make the whole system fail.
> So, that is why ntp has backup servers. You have a single failure point
> anyway-- the network. It goes down, and nothing can get the time.

That actually is _three_ different scenarios.

time over the network:

network fails
1: time
2: the system as a whole

failure of network infrastructure
thus does not add to the probability of the complete system failing.

time over PPS/GPS 1 unit with signaling to each box:

network fails
2: the system as a whole

GPS fails
1: time
-> the system as a whole

This adds up to a higher failure rate/probability.

time over PPS/GPS unit per box:

network fails
2: the system as a whole

GPS fails
1: time
-> the system as a whole

This adds up to a higher failure rate/probability.

With the added disadvantage that GPS failure overall
is single failure times number of boxes.
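
The three scenarios can be put into numbers with a small Python sketch. It assumes the premise of this message: failures are independent, the system needs the network anyway, and the loss of any single GPS unit takes the whole system down. The probabilities and the box count are made-up illustration values, not measurements.

```python
def system_failure_probability(p_network, p_gps=0.0, gps_units=0):
    """Probability that the system is down, assuming independent failures
    and that the network AND every GPS unit must work at the same time."""
    p_all_ok = (1.0 - p_network) * (1.0 - p_gps) ** gps_units
    return 1.0 - p_all_ok


if __name__ == "__main__":
    p_net, p_gps, boxes = 0.01, 0.02, 10  # hypothetical values
    print("time over network:", system_failure_probability(p_net))
    print("one shared GPS:   ", system_failure_probability(p_net, p_gps, 1))
    print("one GPS per box:  ", system_failure_probability(p_net, p_gps, boxes))
```

With ntpd falling back to other sources when a GPS unit fails, as Unruh suggests, the "every GPS unit must work" premise no longer holds and the GPS term drops out of the product.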



Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Richard B. Gilbert
Unruh wrote:
> Uwe Klein <[EMAIL PROTECTED]> writes:
>> Unruh wrote:
>>> Yes, that is with reference to the road. Car three should thus completely
>>> ignore the other two cars and use his speedometer. 
>>> Ie, put up a GPS receiver with a PPS and use that as your time source, and
>>> ignore all the other ntp time sources, except perhaps as sanity checks (eg
>>> if your speedometer breaks you should get to know about it by occasionally
>>> looking at the other cars)
>> A)One GPS to each box or
>> B) a single GPS with PPS line to all boxes?
> Whichever you want. Up to you.
>> A:
>> Doesn't that impact reliability?
>> You add the failure probability of a GPS-unit to each Box
>> where one failure will make the whole system fail.
> So, that is why ntp has backup servers. You have a single failure point
> anyway-- the network. It goes down, and nothing can get the time. 

If the possibility of failure of your network or your internet 
connection worries you, you can use a modem and a telephone line as a 
backup!  Or you can get a GPS receiver, WWV/WWVH/WWVB receiver or an 
atomic clock of your very own.  Most sites don't bother because their 
requirements are not that tight.  FWIW, a system that has been 
synchronized by NTP will tend to stay close to the correct time for a 
reasonable period of time as long as the environment does not change 
significantly.  If the network fails AND the air conditioning fails you 
are in trouble!


Re: [ntp:questions] The libntp resumee...

2008-09-09 Thread Unruh
Uwe Klein <[EMAIL PROTECTED]> writes:

>Unruh wrote:
>> Uwe Klein <[EMAIL PROTECTED]> writes:
>>>Unruh wrote:
>>>>Yes, that is with reference to the road. Car three should thus completely
>>>>ignore the other two cars and use his speedometer.
>>>>Ie, put up a GPS receiver with a PPS and use that as your time source, and
>>>>ignore all the other ntp time sources, except perhaps as sanity checks (eg
>>>>if your speedometer breaks you should get to know about it by occasionally
>>>>looking at the other cars)
>>>A)One GPS to each box or
>>>B) a single GPS with PPS line to all boxes?
>> Whichever you want. Up to you.
>>>Doesn't that impact reliability?
>>>You add the failure probability of a GPS-unit to each Box
>>>where one failure will make the whole system fail.
>> So, that is why ntp has backup servers. You have a single failure point
>> anyway-- the network. It goes down, and nothing can get the time.

>That actually is _three_ different scenarios.

>time over the network:

>network fails
>   1: time
>   2: the system as a whole

>   failure of network infrastructure
>   thus does not add to the probability of the complete system failing.

>time over PPS/GPS 1 unit with signaling to each box:

>network fails
>   2: the system as a whole

>GPS fails
>   1: time
>   -> the system as a whole

>   This adds up to a higher failure rate/probability.

>time over PPS/GPS unit per box:

>network fails
>   2: the system as a whole

>GPS fails
>   1: time
>   -> the system as a whole

>   This adds up to a higher failure rate/probability.

>   With the added disadvantage that GPS failure overall
>   is single failure times number of boxes.

So, put a GPS connected to each box. That will be a stratum 0 source and
will be selected by ntp. If that fails, have each of the other machines as
a backup. They will be stratum 1 sources. Then have the system go out onto
the world wide net to pool.ntp. Those will be stratum 2 or lower. Each
backs up the other. Thus each machine will get its time from GPS (usec
precision). If that fails, they get it from the local machines (10s of usec
precision). If that all fails they get it from the net (ms precision). If
that all fails, you are SOOL. You probably have other worries anyway.

How many belts and braces you want is entirely up to you. 

I would have one GPS on one machine. Everything gets its time from that,
unless it fails, in which case pool.ntp would act as a backup. But it is
entirely up to you.



Re: [ntp:questions] The libntp resumee...

2008-10-12 Thread Danny Mayer
Kay Hayen wrote:
> What worried me more was how often we can query the local ntpd before it will 
> have an adverse effect. Meanwhile I somehow sought to convince myself that 
> ntpq requests are served at a different priority (other socket) than ntpd 
> requests are. I didn't find 2 sockets though.

They aren't. It's the same socket and each packet is responded to in
turn irrespective of the content. It's also not possible to create a
separate socket unless we have a separate command channel and that does
not currently exist and is nowhere defined in the protocol.

>> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
>> "iburst" keyword in a server statement for fast startup.  You may use
>> the "burst" keyword ONLY with the permission of the server's owner.
>> 99.99% of NTP installations will work very well using these rules.  If
>> yours does not, ask here for help!
> Now speaking about our system, not the middleware, with connections as 
> follows: 
> External NTPs <-> 2 entry hosts <-> 8 other hosts.
> And iburst and minpoll=maxpoll=5 to improve the results.

This indicates that you don't understand NTP. You should never ever
change the minpoll and maxpoll values unless you understand the NTP
algorithms in detail and understand the consequences of changing them.
The default values were very carefully chosen to provide a balance
between various conflicting requirements to provide the most stable
clock discipline over a wide range of environments. You are
undersampling at the start of NTP and then oversampling as it starts to
stabilize the discipline loop.

> Currently we observe that both entry hosts can become restricted due to 
> large offsets on other hosts, and that will make the software refuse to 
> go on. Ideally that would not happen.

If the servers that it uses become divergent it will be unable to pick
the "best" one and it will become unsynchronized.

> I will try to formulate questions:
> When the other hosts synchronize to the entry hosts of our system, don't the 
> other hosts ntpd know when and how much these entry hosts changed their time 
> due to input? 
> Would NTP be more robust if we configured routing on the entry 
> hosts, so that they can all speak directly with the external NTPs on their 
> own?

Yes since the stratum will be lower so that the error budget will also
be lower.

> Is the use of ntpdate before starting ntpd recommended and/or does the iburst 
> option replace it?

ntpdate is deprecated and is not normally needed. Make sure you start
ntpd with the -g option to step the clock initially to close to the
correct tick.

> Best regards,
> Kay Hayen

Re: [ntp:questions] The libntp resumee...

2008-10-13 Thread Ryan Malayter
On Tue, Sep 9, 2008 at 2:52 PM, Richard B. Gilbert wrote:
> FWIW, a system that has been
> synchronized by NTP will tend to stay close to the correct time for a
> reasonable period of time as long as the environment does not change
> significantly.  If the network fails AND the air conditioning fails you
> are in trouble!

That is, of course, precisely what happens in many long-term power
outages. Typical UPS battery run times in a datacenter are in minutes,
not hours. And UPSes rarely back up the cooling system. If you don't have
a working generator on standby with plenty of fuel, you're up the
proverbial creek.

Even if you have the generators, you have to be careful. A colocation
provider recently had an outage that was interesting. A truck ran into
their (exterior) transformers, cutting utility power. No problem, they
have generators, right? Well, their water chillers could not restart
fast enough after the generators came on line, so the rapidly
increasing temperature caused about 1/3 of the servers in their
datacenter to shut down. All told, their SLA credits amounted to millions of

Focusing on extreme redundancy for one piece of your infrastructure
(time) is sort of pointless if you don't have fully tested redundancy
in the lower layers of the system (physical plant, power, cooling,
network, etc.)

Re: [ntp:questions] The libntp resumee...

2008-10-13 Thread David Woolley
Danny Mayer wrote:
> Kay Hayen wrote:

>> And iburst and minpoll=maxpoll=5 to improve the results.
> This indicates that you don't understand NTP. You should never ever
> change the minpoll and maxpoll values unless you understand the NTP
> algorithms in detail and understand the consequences of changing them.
> The default values were very carefully chosen to provide a balance
> between various conflicting requirements to provide the most stable

Those conflicting requirements make assumptions about the environment in 
which NTP is operating.  Those assumptions probably aren't valid when 
the servers are on the same high speed, low traffic, network.  Having 
said that, one shouldn't just set minpoll and maxpoll low, but should 
actually measure the results and find optimum values for the actual 
conditions.

> clock discipline over a wide range of environments. You are
> undersampling at the start of NTP and then oversampling as it starts to
> stabilize the discipline loop.

ntpd always oversamples.  Changing the limits restricts the range of filter 
time constants used.  Setting them low improves convergence on startup, 
and re-convergence after a temperature change, which is why there is so 
much use of it - ntpd is failing to meet a market demand, and setting 
both these low has become the urban folklore solution.  It also tends to 
minimise the value of "offset" at other times, but that is not 
necessarily good, as offset is not the same thing as error, and, 
ideally, would be uncorrelated with it.

(ntpd starts to back off the time constant long before the startup 
transient is complete, so keeping it artificially low helps there.  For 
temperature changes, it takes time for the time constant to ramp down, 
which is avoided by keeping it low.)

The reasons for not doing it are that it makes ntpd try to follow short 
term variations in offset, which are likely to be due to network 
conditions, rather than true time errors, and it makes the frequency 
less stable, which means that short durations are measured less 
accurately and time will diverge more quickly if the connection to the 
servers is lost.  It also imposes an unnecessary load on the servers.

> ntpdate is deprecated and is not normally needed. Make sure you start
> ntpd with the -g option to step the clock initially to close to the
> correct tick.

-g doesn't step the clock, it simply allows the clock to be stepped by 
more than 1000s, the first time.  Clock stepping is still subject to the 
128ms minimum offset.  Both numbers are configurable, although changing 
them may disable some functions.
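
A rough Python sketch of the rule described above. The 128 ms and 1000 s figures are the ones quoted in this message; the function name, the return strings and the `first_adjustment` flag are invented for illustration, and the real ntpd logic has more states than this simplification shows.

```python
STEP_THRESHOLD_S = 0.128    # below this offset the clock is slewed, not stepped
PANIC_THRESHOLD_S = 1000.0  # above this offset ntpd normally gives up


def clock_action(offset_s, g_option=False, first_adjustment=False):
    """Simplified decision: 'slew', 'step' or 'exit' for a given offset."""
    magnitude = abs(offset_s)
    # -g only lifts the 1000 s limit, and only for the first adjustment.
    if magnitude > PANIC_THRESHOLD_S and not (g_option and first_adjustment):
        return "exit"
    # Stepping still requires the offset to exceed the step threshold.
    if magnitude > STEP_THRESHOLD_S:
        return "step"
    return "slew"


if __name__ == "__main__":
    for offset, g, first in [(0.05, True, True), (5.0, False, False),
                             (5000.0, False, True), (5000.0, True, True)]:
        print(offset, g, first, "->", clock_action(offset, g, first))
```

Note that a small offset is slewed even with `-g`, which is the point being made: the option permits a large first step, it does not cause one.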


Re: [ntp:questions] The libntp resumee...

2008-10-13 Thread Unruh
[EMAIL PROTECTED] (Danny Mayer) writes:


>>> Briefly, you use the defaults for MINPOLL and MAXPOLL.  You may use the
>>> "iburst" keyword in a server statement for fast startup.  You may use
>>> the "burst" keyword ONLY with the permission of the server's owner.
>>> 99.99% of NTP installations will work very well using these rules.  If
>>> yours does not, ask here for help!
>> Now speaking about our system, not the middleware, with connections as 
>> follows: 
>> External NTPs <-> 2 entry hosts <-> 8 other hosts.
>> And iburst and minpoll=maxpoll=5 to improve the results.

>This indicates that you don't understand NTP. You should never ever
>change the minpoll and maxpoll values unless you understand the NTP
>algorithms in detail and understand the consequences of changing them.
>The default values were very carefully chosen to provide a balance
>between various conflicting requirements to provide the most stable
>clock discipline over a wide range of environments. You are
>undersampling at the start of NTP and then oversampling as it starts to
>stabilize the discipline loop.

The lower value on startup is to try to make ntp responsive at the
beginning, because it is so slow to correct errors. The longer value while
running serves two purposes: to reduce the network demands on servers
(probably the most important) and to increase the baseline for drift
determination (because of NTP's memoryless design). The former is important
if you are using public servers. The latter is important if you lose network
connectivity for days at a time. If you use your own server, and your network
is stable, a shorter maxpoll is better: better control and faster response to
computer clock changes.


>> Currently we observe that both entry hosts can become restricted due to 
>> large offsets on other hosts, and that will make the software refuse to 
>> go on. Ideally that would not happen.

>If the servers that it uses become divergent it will be unable to pick
>the "best" one and it will become unsynchronized.

>> I will try to formulate questions:
>> When the other hosts synchronize to the entry hosts of our system, don't the 
>> other hosts ntpd know when and how much these entry hosts changed their time 
>> due to input? 

No idea what this means. All a client gets is the offset of its clock with
respect to the server clock, and an estimate of the dispersion of the
server's clock. I do not know what "how much these entry hosts changed
their time due to input" means, but my guess is that the answer is "No, the
clients do not get any information about the internal workings of the
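
For reference, the offset a client computes comes from four timestamps per exchange. Below is a minimal Python sketch of the standard NTP on-wire calculation; the function name and the sample timestamps are illustrative assumptions, and the filtering ntpd applies afterwards is left out.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0  # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)           # round trip minus server hold time
    return offset, delay


if __name__ == "__main__":
    # Hypothetical exchange: client is 0.5 s behind, 10 ms each way,
    # 2 ms spent inside the server.
    print(offset_and_delay(100.0, 100.51, 100.512, 100.022))
```

Nothing in these four numbers tells the client why or when the server's own clock was adjusted, only where the server's clock stands at the moment of the exchange.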

>> Would NTP be more robust if we configured routing on the entry 
>> hosts, so that they can all speak directly with the external NTPs on their 
>> own?

>Yes since the stratum will be lower so that the error budget will also
>be lower.

That depends on your routers. If you have routers with bad
latency/dispersion, it may be worse due to network delays/variability.

>> Is the use of ntpdate before starting ntpd recommended and/or does the 
>> iburst option replace it?

>ntpdate is deprecated and is not normally needed. Make sure you start
>ntpd with the -g option to step the clock initially to close to the
>correct tick.

>> Best regards,
>> Kay Hayen


Re: [ntp:questions] The libntp resumee...

2008-10-13 Thread David L. Mills

The ntpd parameter constellation is indeed tuned for a necessarily wide 
range of scenarios and may not be optimal for any particular case. From 
an engineering point of view the solution for the minpoll/maxpoll issue 
is obvious. Determine the Allan intercept as described in several 
places, my papers and my book. The poll interval is carefully set at 
1/32 the time constant, which should be at the intercept. So set minpoll 
and maxpoll to the log2 of that value. Yes, the loop is purposely 
oversampled with respect to the time constant, but not with respect to 
the Allan intercept.

The Allan deviation characteristic displayed in the briefings on the NTP 
project page should give a hint how the intercept varies with different 
operating systems and network links. Indeed, if you have a fast LAN, 
PCnet NIC and 3-GHz machine, the optimum poll interval is probably more 
like 4 (16 s), but probably not 3 (8 s), as that invites increased 
vulnerability to frequency surges.
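
The arithmetic can be sketched in Python as follows. This is one reading of the rule above (poll interval = Allan intercept / 32, with minpoll and maxpoll given as the log2 of the interval in seconds); the function names and the sample intercept values are assumptions for illustration, and the intercept itself has to be measured for the actual system.

```python
import math


def poll_interval_seconds(poll_exponent):
    """minpoll/maxpoll are exponents: the interval is 2**exponent seconds."""
    return 2 ** poll_exponent


def poll_exponent_for_intercept(allan_intercept_s):
    """Poll exponent whose interval is closest (in log2 terms) to
    1/32 of the given Allan intercept."""
    return round(math.log2(allan_intercept_s / 32.0))


if __name__ == "__main__":
    for intercept in (512.0, 2048.0):  # hypothetical intercepts, in seconds
        exponent = poll_exponent_for_intercept(intercept)
        print(intercept, "->", exponent, "->", poll_interval_seconds(exponent), "s")
```

Under this reading, the minpoll=maxpoll=5 setting discussed earlier in the thread corresponds to a 32 s poll interval, and so to an intercept of about 1024 s.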

The poll adjust algorithm does not do what you expect. See line 644 et 
seq in ntp_loopfilter.c and the commentary there. This algorithm is the 
result of literally 25 years of experiment and refinement. It is not 
necessarily designed for rapid initial convergence; it is designed to be 
sensitive to frequency surges once convergence has stabilized. The 
frequency file avoids initial convergence if restarted after that.


David Woolley wrote:
> Danny Mayer wrote:
>> Kay Hayen wrote:
>>> And iburst and minpoll=maxpoll=5 to improve the results.
>> This indicates that you don't understand NTP. You should never ever
>> change the minpoll and maxpoll values unless you understand the NTP
>> algorithms in detail and understand the consequences of changing them.
>> The default values were very carefully chosen to provide a balance
>> between various conflicting requirements to provide the most stable
> Those conflicting requirements make assumptions about the environment in 
> which NTP is operating.  Those assumptions probably aren't valid when 
> the servers are on the same high speed, low traffic, network.  Having 
> said that, one shouldn't just set minpoll and maxpoll low, but should 
> actually measure the results and find optimum values for the actual 
> conditions.
>> clock discipline over a wide range of environments. You are
>> undersampling at the start of NTP and then oversampling as it starts to
>> stabilize the discipline loop.
> ntpd always oversamples.  Changing the limits limits the range of filter 
> time constants  used.  Setting it low, improves convergence on startup, 
> and re-convergence after a temperature change, which is why there is so 
> much use of it - ntpd is failing to meet a market demand, and setting 
> both these low has become the urban folklore solution.  It also tends to 
> minimise the value of "offset" at other times, but that is not 
> necessarily good, as offset is not the same thing as error, and, 
> ideally, would be uncorrelated with it.
> (ntpd starts to back off the time constant long before the startup 
> transient is complete, so keeping it artificially low helps there.  For 
> temperature changes, it takes time for the time constant to ramp down, 
> which is avoided by keeping it low.)
> The reasons for not doing it are that it makes ntpd try to follow short 
> term variations in offset, which are likely to be due to network 
> conditions, rather than true time errors, and it makes the frequency 
> less stable, which means that short durations are measured less 
> accurately and time will diverge more quickly if connections to the 
> servers is lost.  It also imposes an unnecessary load on the servers.
>> ntpdate is deprecated and is not normally needed. Make sure you start
>> ntpd with the -g option to step the clock initially to close to the
>> correct tick.
> -g doesn't step the clock, it simply allows the clock to be stepped by 
> more than 1000s, the first time.  Clock stepping is still subject to the 
> 128ms minimum offset.  Both numbers are configurable, although changing 
> them may disable some functions.
