Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-03-03 Thread Martin Burnicki
Hello Ulrich,

Ulrich Windl wrote:
 Is that too simple?
   msyslog(LOG_ERR, "authentication key %lu unknown",
   (unsigned long)sys_authkey);

Oooh, of course that's the best fix.

I have already prepared a patch but I must have been blind that I didn't see
the obvious solution. Since the patch has not yet been committed I'll
update it once more.

Thanks,

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions


Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-03-01 Thread Danny Mayer
Ulrich Windl wrote:
 Unruh [EMAIL PROTECTED] writes:
 
 In ntpdate.c around line 542 (4.2.4p4) is the sequence
 if (!authistrusted(sys_authkey)) {
  char buf[10];

  (void) sprintf(buf, "%lu", (unsigned long)sys_authkey);
  msyslog(LOG_ERR, "authentication key %s unknown", buf);
 
 Is that too simple?
   msyslog(LOG_ERR, "authentication key %lu unknown",
   (unsigned long)sys_authkey);
 

In this case it's the right solution. There's no need for an 
intermediate buffer here.

Danny


Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-28 Thread Ulrich Windl
Unruh [EMAIL PROTECTED] writes:

 In ntpdate.c around line 542 (4.2.4p4) is the sequence
 if (!authistrusted(sys_authkey)) {
  char buf[10];

  (void) sprintf(buf, "%lu", (unsigned long)sys_authkey);
  msyslog(LOG_ERR, "authentication key %s unknown", buf);

Is that too simple?
  msyslog(LOG_ERR, "authentication key %lu unknown",
  (unsigned long)sys_authkey);


  exit(1);
 }

 Since unsigned long does not have a definite length on all machines, and with 
 the trailing
 zero certainly is potentially longer than 10 bytes, that buf is ripe for
 buffer overflow. 
 It should be something like
char buf[(sizeof(unsigned long)*12/5+2)];
 And/or the sprintf should be an snprintf.
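
A minimal sketch of the sizing argument above, independent of the rest of ntpdate.c. The macro name ULONG_DEC_BUFSIZE and the helper format_key() are made up for illustration; only the buffer-size expression and the use of snprintf come from the post.

```c
#include <stddef.h>
#include <stdio.h>

/* Bytes needed to print any unsigned long in decimal, including the
 * terminating NUL. Each byte contributes at most log10(256) ~ 2.41
 * digits; 12/5 = 2.4 plus the +2 slack is enough for the usual widths
 * (2, 4 and 8 bytes give 6, 11 and 21). */
#define ULONG_DEC_BUFSIZE (sizeof(unsigned long) * 12 / 5 + 2)

/* Hypothetical helper: format key into buf without ever writing past
 * size bytes. Returns the number of characters written, or -1 if the
 * value did not fit (snprintf reports the length it wanted). */
int format_key(char *buf, size_t size, unsigned long key)
{
    int n = snprintf(buf, size, "%lu", key);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

With char buf[10], a key of 4294967295 needs 11 bytes and is rejected here rather than overflowing; with char buf[ULONG_DEC_BUFSIZE] it fits. Passing the value straight to msyslog with %lu, as suggested above, avoids the buffer altogether.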



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-14 Thread Maarten Wiltink
David L. Mills [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]

 Is there also a random backoff after an increase of the polling
 interval?

 No. However, there is a small dither of a few percent at all poll
 intervals to resist self-synchronization.

 The natural behavior of a bunch of oscillators near the same frequency
 is to become one giant phase-locked oscillator. Adding a bit of random
 fuzz at each poll turns each oscillator into a mini random-walk which
 breaks up that tendency. The fuzz is not a lot, like 10 percent.

Do you mean the dither alluded to above is cumulative?

I was never much good with statistics and remember only that the
expectation of the offset after N steps in a random walk is sqrt(N)
times the average step size. Not a clue what the distribution might
be. Intuitively, I would be aiming for uniform, and randomly adding
half a polling interval delay when doubling it seemed to me like it
would do that.
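
For the simplest random walk, steps of +1 or -1 with equal probability, the sqrt(N) rule can be checked exactly by enumerating every path. Strictly, it is the root-mean-square offset that equals sqrt(N) times the step size; the mean absolute offset is somewhat smaller. The sketch below is only an illustration of that rule, not anything taken from ntpd, and the function name is made up.

```c
#include <stdint.h>

/* Mean of the squared end position over all 2^n walks of n unit steps.
 * Bit i of p selects the direction of step i. Intended for small n
 * (say n <= 20), since the work grows as n * 2^n. */
double mean_square_displacement(unsigned n)
{
    uint64_t paths = (uint64_t)1 << n;
    uint64_t sum_sq = 0;

    for (uint64_t p = 0; p < paths; p++) {
        int64_t pos = 0;

        for (unsigned i = 0; i < n; i++)
            pos += ((p >> i) & 1u) ? 1 : -1;
        sum_sq += (uint64_t)(pos * pos);
    }
    return (double)sum_sq / (double)paths;
}
```

The result equals n, so the RMS offset is sqrt(n) steps. The end positions follow a binomial distribution, which approaches a normal distribution as n grows; it is not uniform.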

Groetjes,
Maarten Wiltink




Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-13 Thread David Woolley
Hal Murray wrote:

 20 ms sounds like a typical DSL link.  That 1ms accuracy goes out
 the window if you are doing a big download.  (At least on my DSL
 link.)
 

People don't generally do big downloads during the boot of a machine! 
On a big network, the most likely reason for rebooting a timeserver in 
prime time is a power failure.  In which case the whole network is 
likely to be down.

At worst, using ntpdate -b, you only get something like the current ntpd 
behaviour.  Typically you end up within a millisecond.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-13 Thread Maarten Wiltink
David L. Mills [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]

 No, there is no random delay at startup. Each association starts one
 second after the previous one. The random backoff occurs only after
 a step.

 Is there also a random backoff after an increase of the polling
 interval?

 No. However, there is a small dither of a few percent at all poll
 intervals to resist self-synchronization.

Wouldn't that be a nice feature to add? If it's currently polling a
server on, say second 100 (reckoned externally) of 256, to go to
either 100 _or 356_ of 512.

I understand that there are already some random waits in the client
code and Internet servers are well protected by random noise. But
for large numbers of clients in a uniform environment that were all
started at about the same time, is there any way they tend to
naturally disperse across the final 1024s polling interval?
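
The proposal above can be sketched as follows. This is a hypothetical helper, not ntpd code: each time the interval doubles, one coin flip decides whether the client keeps its phase or moves into the second half of the new interval. The caller is assumed to supply the random bits in coins, lowest bit first.

```c
/* Phase (seconds into the polling interval) after the interval has been
 * doubled repeatedly from interval up to final_interval. */
unsigned spread_phase(unsigned phase, unsigned interval,
                      unsigned final_interval, unsigned coins)
{
    while (interval < final_interval) {
        if (coins & 1u)
            phase += interval;  /* second half of the doubled interval */
        coins >>= 1;
        interval *= 2;
    }
    return phase;
}
```

For the example in the text, spread_phase(100, 256, 512, coin) gives 100 or 356. Going from 64 s to 1024 s uses four coin flips, so clients that started together end up on 16 distinct phases spaced 64 s apart, which is spread out but not uniform over the full 1024 s.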

Groetjes,
Maarten Wiltink




Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-13 Thread David L. Mills
Maarten,

The natural behavior of a bunch of oscillators near the same frequency 
is to become one giant phase-locked oscillator. Adding a bit of random 
fuzz at each poll turns each oscillator into a mini random-walk which 
breaks up that tendency. The fuzz is not a lot, like 10 percent.
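
A rough sketch of what such a fuzz could look like, assuming a symmetric dither around the nominal interval. The function name, the symmetric shape and the caller-supplied random number r are all assumptions for illustration; this is not the code in ntpd.

```c
/* Nominal poll interval base (seconds) perturbed by up to
 * +/- fuzz_fraction. r is a uniform random number in [0, 1) supplied
 * by the caller; r = 0.5 leaves the interval unchanged. */
double dithered_interval(double base, double fuzz_fraction, double r)
{
    return base * (1.0 + fuzz_fraction * (2.0 * r - 1.0));
}
```

With base = 1024 and fuzz_fraction = 0.10 the next poll falls somewhere between about 922 s and 1126 s. Because a fresh r is drawn at every poll, the deviations accumulate, which is the random-walk effect described above.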

Dave

Maarten Wiltink wrote:
 David L. Mills [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
 
No, there is no random delay at startup. Each association starts one
second after the previous one. The random backoff occurs only after
a step.

Is there also a random backoff after an increase of the polling
interval?
 
 
No. However, there is a small dither of a few percent at all poll
intervals to resist self-synchronization.
 
 
 Wouldn't that be a nice feature to add? If it's currently polling a
 server on, say second 100 (reckoned externally) of 256, to go to
 either 100 _or 356_ of 512.
 
 I understand that there are already some random waits in the client
 code and Internet servers are well protected by random noise. But
 for large numbers of clients in a uniform environment that were all
 started at about the same time, is there any way they tend to
 naturally disperse across the final 1024s polling interval?
 
 Groetjes,
 Maarten Wiltink
 
 



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Martin Burnicki
Dave,

David L. Mills wrote:
 Serge,
 
 The behavior after a step is deliberate. The iburst volley after a step
   is delayed a random fraction of the poll interval to avoid implosion
 at a busy server. An additional delay may be enforced to avoid violating
 the headway restrictions. This is not to protect your applications; it
 is to protect the server.

Is it really necessary to insert a random delay after a step? There has
already been a random delay immediately after startup, before the client
has decided that a step was required.

So even if a bunch of clients started up at the same time and had to step,
they wouldn't step at the same time, and thus wouldn't do the next iburst
volley at the same time anyway.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David Woolley
Harlan Stenn wrote:
 
 For the general use case (LAN and/or WAN and/or jerky path) ntpd behaves
 well.

We are talking typical rather than general cases.  In the typical case, 
1ms after 1 second is a reasonable expectation on a WAN, especially when 
a site is restarting, e.g. after a power failure, or a home system 
switching on, and, therefore, the network load is low.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Martin Burnicki
Dave,

David L. Mills wrote:
[...]
 The ntpd time constant is purposely set somewhat large at 2000 s, which
 results in a risetime of about 3000 s. This is a compromise for stable
 acquisition for herky-jerky Internet paths and speed of convergence for
 LANs. For typical Internet paths the Allan intercept is about 2000 s.
 For fast LANs with nanosecond clock resolution, the Allan intercept can
 be as low as 250s, which is what the kernel PPS loop is designed for.

Wouldn't it make sense to adjust the time constant depending on the time
after startup, and/or the quality of the responses from the upstream
servers?

I.e. the time constant could be smaller after startup to get a fast initial
correction, and then increase depending on the requirements.

The packet delay and jitter should also give a good indication whether an
upstream server is on the local LAN, or on the internet. So the settings
used to make ntpd work well for the worst cases could be used if those
cases apply, but the limitations could be reduced in non-worst cases.

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David Woolley
Unruh wrote:
 David Woolley [EMAIL PROTECTED] writes:
 
 Harlan Stenn wrote:
 For the general use case (LAN and/or WAN and/or jerky path) ntpd behaves
 well.
 
 We are talking typical rather than general cases.  In the typical case, 
 1ms after 1 second is a reasonable expectation on a WAN, especially when 
 a site is restarting, e.g. after a power failure, or a home system 
 switching on, and, therefore, the network load is low.
 
 I think you got your units mixed up. Computer A goes down for three days
 due to an avalanche cutting the power. It takes a lot longer than one second
 to resync that computer. A few hours is more like it. 

I was talking about what people could expect from software that behaved 
well; I think you are describing what ntpd actually does here.  My point 
was that ntpd's ability to tolerate really rotten links is irrelevant 
for most users, who are only about 20ms away from their ISP's time 
server, and can expect to read it to about 1ms accuracy.

 If you mean, I shut down ntp and restart it immediately , then 1ms in 1
 minute is reasonable ( you cannot have made enough measurements in 1 sec to
 even know if it is accurate.)
 



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Harlan Stenn
Bill,

 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh Why not? The power comes on on your computer farm of 2000 machines,
Unruh all the clients are the same type so the bootup sequence is
Unruh identical. They all start ntp at the same time, to within a second or
Unruh so. And suddenly the poor server is flooded.

2000 packets hitting ntpd all at once should not be a problem for an ntp
server in that environment.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David L. Mills
Maarten,

No. However, there is a small dither of a few percent at all poll 
intervals to resist self-synchronization.

Dave

Maarten Wiltink wrote:
 David L. Mills [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
 
No, there is no random delay at startup. Each association starts one
second after the previous one. The random backoff occurs only after a
step.
 
 
 Is there also a random backoff after an increase of the polling interval?
 
 Groetjes,
 Maarten Wiltink
 
 



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David L. Mills
Unruh,

Depends who the clients are. An ntpd client will not come up in the 
first second, although successive associations will come up at 2-s 
intervals. I would not expect 2000 clients to come up at the same exact 
time anyway due to ordinary latency variations in the boot process. I would 
be more worried about a broadcast server coming up with 2000 corporate 
broadcast clients, but in that case the initial client response is 
randomized over the poll interval.

Dave

Unruh wrote:

 Martin Burnicki [EMAIL PROTECTED] writes:
 
 
Dave,
 
 
David L. Mills wrote:

Serge,

The behavior after a step is deliberate. The iburst volley after a step
  is delayed a random fraction of the poll interval to avoid implosion
at a busy server. An additional delay may be enforced to avoid violating
the headway restrictions. This is not to protect your applications; it
is to protect the server.
 
 
Is it really necessary to insert a random delay after a step? There has
already been a random delay immediately after startup, before the client
has decided that a step was required.
 
 
So even if a bunch of clients started up at the same time and had to step,
they wouldn't step at the same time, and thus wouldn't do the next iburst
volley at the same time anyway.
 
 
 Why not? The power comes on on your computer farm of 2000 machines, all the 
 clients are the same type so the
 bootup sequence is identical. They all start ntp at the same time, to
 within a second or so. And suddenly the poor server is flooded. 
 
 
Martin
-- 
Martin Burnicki
 
 
Meinberg Funkuhren
Bad Pyrmont
Germany



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Hal Murray

I was talking about what people could expect from software that behaved 
well; I think you are describing what ntpd actually does here.  My point 
was that ntpd's ability to tolerate really rotten links is irrelevant 
for most users, who are only about 20ms away from their ISP's time 
server, and can expect to read it to about 1ms accuracy.

20 ms sounds like a typical DSL link.  That 1ms accuracy goes out
the window if you are doing a big download.  (At least on my DSL
link.)

-- 
These are my opinions, not necessarily my employer's.  I hate spam.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Serge Bets
Hello David,

 On Tuesday, February 12, 2008 at 15:04:45 +, David L. Mills wrote:

 Serge Bets wrote:
 ntpd -q can make use of the driftfile to set the kernel frequency
 That was removed as a significant security hazard.

Why exactly?


 If you want to explicitly set the frequency, use ntptime -f.

Sure: I can preset the frequency by hand. But not setting the frequency
is not a sensible option: it's required for good ntpd -q operations,
otherwise slews don't end on the zero.


 This scheme is designed so you can run ntpd until the kernel frequency
 has stabilized, then kill ntpd and run SNTP client at regular
 intervals.

There is no obstacle to that. When ntpd quits, the kernel runs on the
last computed frequency. Without driftfile, ntpd -q runs above this
frequency. With a driftfile, ntpd -q could even run above this frequency
after a reboot.

The obstacle if one existed would be a frequency reset to zero at
startup, like done by loop_config(LOOP_DRIFTINIT). Fortunately this
doesn't happen in mode_ntpdate (the -q flag).


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David Woolley
Martin Burnicki wrote:

 Wouldn't it make sense to adjust the time constant depending on the time
 after startup, and/or the quality of the responses from the upstream
 servers?

It does get adjusted.  We are talking about the minimum value!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Unruh
Martin Burnicki [EMAIL PROTECTED] writes:

Dave,

David L. Mills wrote:
 Serge,
 
 The behavior after a step is deliberate. The iburst volley after a step
   is delayed a random fraction of the poll interval to avoid implosion
 at a busy server. An additional delay may be enforced to avoid violating
 the headway restrictions. This is not to protect your applications; it
 is to protect the server.

Is it really necessary to insert a random delay after a step? There has
already been a random delay immediately after startup, before the client
has decided that a step was required.

So even if a bunch of clients started up at the same time and had to step,
they wouldn't step at the same time, and thus wouldn't do the next iburst
volley at the same time anyway.

Why not? The power comes on on your computer farm of 2000 machines, all the 
clients are the same type so the
bootup sequence is identical. They all start ntp at the same time, to
within a second or so. And suddenly the poor server is flooded. 

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David L. Mills
Serge,

That was removed as a significant security hazard. If you want to 
explicitly set the frequency, use ntptime -f. This scheme is designed so 
you can run ntpd until the kernel frequency has stabilized, then kill 
ntpd and run SNTP client at regular intervals. I surely wouldn't 
recommend that, but folks have their biases.

Dave

Serge Bets wrote:

 Hello David,
 
  On Tuesday, February 12, 2008 at 2:43:06 +, David L. Mills wrote:
 
 
Just for clarity, neither the daemon nor kernel frequency is adjusted
in any way with ntpd -q.
 
 
 ntpd -q can make use of the driftfile to set the kernel frequency:
 
 | # ntpd -q -d | grep frequency
 | addto_syslog: frequency initialized -1.752 PPM from /var/lib/ntp/ntp.drift
 
 Note that this is plain necessary for the correct operations of ntpd -q.
 If the kernel frequency was not initialised, then a slew would not end
 right on the zero offset.
 
 
 Serge.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Serge Bets
Hello David,

 On Tuesday, February 12, 2008 at 3:03:37 +, David L. Mills wrote:

 The behavior after a step is deliberate. The iburst volley after a
 step is delayed a random fraction of the poll interval to avoid
 implosion at a busy server.

Ah OK, I understand now! Thank you.

This makes me wonder: When starting ntpd -gq doing a step and quitting,
then immediately starting ntpd daemon, this sequence sends 2 iburst
volleys, over around 14 seconds, without the said random delay in
between. Is that not rude to servers? The slew_sleeping script should be
modified to sleep some time after a step. How much? 16 to 64 s?

| /^ntpd: time set .*s$/ {
|   sleep = 16 + int(rand() * 49)
|   success = 1
| }
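
For readers who do not speak awk, the same computation in C, as a sketch only. The function name is made up, and r stands for a uniform random number in [0, 1) supplied by the caller.

```c
/* Sleep time in whole seconds after a step: 16 plus a random
 * 0..48 s, i.e. 16 to 64 s inclusive. */
int post_step_sleep(double r)
{
    return 16 + (int)(r * 49.0);
}
```

One caveat that applies to the awk version as well: in several awk implementations rand() returns the same sequence on every run unless srand() has been called first, in which case every host would pick the same "random" sleep.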


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Serge Bets
Hello David,

 On Tuesday, February 12, 2008 at 2:43:06 +, David L. Mills wrote:

 Just for clarity, neither the daemon nor kernel frequency is adjusted
 in any way with ntpd -q.

ntpd -q can make use of the driftfile to set the kernel frequency:

| # ntpd -q -d | grep frequency
| addto_syslog: frequency initialized -1.752 PPM from /var/lib/ntp/ntp.drift

Note that this is plain necessary for the correct operations of ntpd -q.
If the kernel frequency was not initialised, then a slew would not end
right on the zero offset.


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread Serge Bets
Hello Harlan,

 On Tuesday, February 12, 2008 at 3:22:59 +, Harlan Stenn wrote:

 Interesting script - thanks.  Would you like me to put it in the
 distribution?

Excellent idea! As contrib example, or installed in bindir along with
ntp-wait?


 what benefit do we get by using the script to delay things while we
 are waiting for a slew to finish while in state 4?

I don't understand the reasoning above your questions, but can reply at
first degree to this one: If we didn't delay after ntpd -q, then the
daemon would be started while the slew is still in progress during some
minutes. This is not a sane situation: The daemon gathers biased data.


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-12 Thread David L. Mills
Martin,

No, there is no random delay at startup. Each association starts one 
second after the previous one. The random backoff occurs only after a 
step. The fact that the initial backoff is small means that the client 
population is crudely synchronized and could well gang up after a step.

There have been incremental changes over the years to randomize and even 
out the load for busy servers, some of which made folks sad. Originally, 
the code did randomize at startup, but folks hated that since it 
resulted in an initial delay averaging 30 s. Now the backoff occurs only 
when stepped, which is by every measure a rare event. I don't think a 
step has ever happened with our production servers, unless after 
extensive downtime for repair.

You can easily modify the peer_clear() routine in ntp_proto.c to remove 
the backoff. If so, you will not be able to use any server running the 
reference implementation, as the rate violation will result in a dropped 
packet and, if configured, a KoD.

Dave

Martin Burnicki wrote:
 Dave,
 
 David L. Mills wrote:
 
Serge,

The behavior after a step is deliberate. The iburst volley after a step
  is delayed a random fraction of the poll interval to avoid implosion
at a busy server. An additional delay may be enforced to avoid violating
the headway restrictions. This is not to protect your applications; it
is to protect the server.
 
 
 Is it really necessary to insert a random delay after a step? There has
 already been a random delay immediately after startup, before the client
 has decided that a step was required.
 
 So even if a bunch of clients started up at the same time and had to step,
 they wouldn't step at the same time, and thus wouldn't do the next iburst
 volley at the same time anyway.
 
 Martin



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Tom Smith
ntpdate -b steps the clock. That's the function under discussion.
The one that's used nearly universally in boot sequences.

-Tom

David L. Mills wrote:
 Guys,
 
 There seems to be some misinformation here.
 
 Both ntpdate and ntpd -q set the offset with adjtime() and then exit. 
 After that, stock Unix adjtime() slews the clock at rate 500 PPM, which 
 indeed could take 256 s for an initial offset of 128 ms. A prudent 
 response would be to measure the initial offset and compute the time to 
 wait. The ntp-wait script waits for ntpd to enter state 4, which could 
 happen with an initial offset as high as 128 ms.
 
 The ntpd time constant is purposely set somewhat large at 2000 s, which 
 results in a risetime of about 3000 s. This is a compromise for stable 
 acquisition for herky-jerky Internet paths and speed of convergence for 
 LANs. For typical Internet paths the Allan intercept is about 2000 s. 
 For fast LANs with nanosecond clock resolution, the Allan intercept can 
 be as low as 250s, which is what the kernel PPS loop is designed for.
 
 Both the daemon and kernel loops are engineered so that the time 
 constant is directly proportional to the poll interval and the risetime 
 scales directly. If the poll exponent is set to the minimum 4 (16 s) the 
 risetime is 500 s. While not admitted in public, the latest snapshot 
 can set the poll interval to 3 (8 s), so the risetime is 250 s. This 
 works just fine on a LAN, but I would never do this on an outside circuit.
 
 Dave
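
The scaling described in the quoted message can be written down as a small sketch. It uses only the two data points given there (poll exponent 4, i.e. 16 s, gives a risetime of 500 s; exponent 3, i.e. 8 s, gives 250 s) and assumes the proportionality holds between and beyond them. The function names are made up; this is not code from ntpd.

```c
/* Poll interval in seconds for a given poll exponent: 2^exponent. */
unsigned poll_interval_s(unsigned exponent)
{
    return 1u << exponent;
}

/* Risetime in seconds, scaled linearly from the quoted reference point
 * of 500 s at a 16 s poll interval. */
double risetime_s(unsigned exponent)
{
    return 500.0 * (double)poll_interval_s(exponent) / 16.0;
}
```

The 2000 s time constant and 3000 s risetime mentioned earlier in the message are a separate pair of figures and are not derived from this scaling.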
 
 

Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Unruh
Harlan Stenn [EMAIL PROTECTED] writes:

 In article [EMAIL PROTECTED], David Woolley [EMAIL PROTECTED] writes:

David Harlan Stenn wrote:
 Why would ntpd be exiting during a warm start?

David Because we are discussing using it with the -q option.  If you just
David use -g, it will take a lot longer to converge within a few
David milliseconds, as it will not slew at the maximum rate.  If you use
David -q, you need to force a step if you want fast convergence.

I still maintain you are barking up the wrong tree.

In terms of the behavior model of ntp, state 4 is as good as it gets.  You
are in the right ballpark.

And as has been commented on numerous times, ntp in state 4 is very slow to
converge to the best possible time control. This was a deliberate design
decision, as I understand it, so that in steady state the time is averaged
over a large number of samples ( not helped by the fact that 85% of samples
are thrown away), to reduce the statistical error in the clock control.
Note that at poll 7 the number of actual samples averaged over in the time
scale of the ntp feedback loop is only about 3, so the statistical
averaging even with such a long time constant, is not very good.


If you want something else, something you consider better than state 4,
please make a case for this and lobby for it.

I think many people have lobbied for faster response. In the discussion of
the chrony/ntp comparison, chrony is much faster to correct errors, and at
least on a local network, better at disciplining the clock as well ( in
part I think because on such a minimal round trip network, the frequency
fluctuations dominate over the offset measurement errors-- i.e., the Allan
intercept is much lower than the assumed 1500 sec. in that kind of
situation-- also the drift model on real systems is not well modeled by 1/f
noise.) So, what I think the point is that using ntpdate, one can rapidly
bring the clock into a few msec of the correct time, rather than waiting
for the feedback loop to finally eliminate that last 128msec of offset.

 For the case I'm describing the startup script sequence is to fire up
 'ntpd -g' early.  If there are applications that need the system clock to
 be on-track stable (even if a wiggle is being dealt with), that's 'state
 4', and running 'ntp-wait' before starting those services is, to the best
 of my knowledge, all that is required.

David State 4 means within 128ms and using the normal control loop, which
David has a time constant of around an hour.

OK, and so what?

Is State 4 insufficient for your needs, or are you just splitting hairs?

David For a cold start, it won't reach state 4 for a further 900 seconds
David after first priming the clock filter.

 If the system has a good drift file, I disagree with you.

David The definition of cold start is that there is no drift file.

OK, now I know what the definitions are.

I don't recall offhand the expected time to hit state 4 without a drift
file.

1) This should not be the ordinary case
2) How does this have any bearing on the ntpdate -b discussion?

 And what is the big deal with using different config files?  The config
 file mechanism has include capability so it is trivial to easily
 maintain common 'base' configuration with customizations for separate
 start/run phases.

David You are now talking about using -q.  The difficulty is that people
David have enough trouble getting the run phase config file right.

I mention it because it's what you seem to be insisting on talking about.

I was providing a way to address the problems you describe with the (IMO
bad) mechanism (-q) under discussion.

 But the bigger problem is why are you insisting on separate start/run
 phases?  This has not been best practice for quite a while, and if you
 insist on using this method you will be running in to the exact problems
 you are describing.

 No, the best advice is to understand why you have been using ntpdate -b
 so far and understand the pros/cons of the new choices.

David We are talking about system managers and package creators, neither of
David which have much time to study the details.

Blessed are those who get what they deserve.

These are the same folks who must get ssh configurations and various other
network configurations working.

If the stock things work well enough for folks, great.

If folks have suggestions for improvements I welcome them.

If folks want something different I invite them to make a case for it.
Please remember the scope and complexity of the problem case.  It's much
easier to have a simpler solution if one is prepared to ignore certain
problems.  Another case in this point is Maildir.

If somebody is in the situation where they know they have specific
requirements for time, they are in the situation where they have enough
altitude on their requirements to know the costs/benefits of what is
involved in getting there.

Well, I disagree. The sign of a good piece of software is that it does what
it needs to do 

Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Serge Bets
Hello David,

 On Monday, February 11, 2008 at 19:03:36 +, David L. Mills wrote:

 Both ntpdate and ntpd -q set the offset with adjtime() and then exit.
 After that, stock Unix adjtime() slews the clock at rate 500 PPM,
 which indeed could take 256 s for an initial offset of 128 ms.

And on some systems, adjtime() calls adjtimex(ADJ_OFFSET_SINGLESHOT) to
do the job.

Note that ntpdate does not stop slewing when it reaches the zero offset,
but voluntarily overshoots by 50%. That's why ntpdate -b (forced step)
or ntpd -q (exact slew until zero) are so much better.


 A prudent response would be to measure the initial offset and compute
 the time to wait.

Thanks! That's exactly what the slew_sleeping script does:


#!/bin/sh

slew_sleeping() {
  awk '
    { print }
    /^ntpd: time slew .*s$/ {
      sleep = $4 * 2000                # 500 PPM slew: 2000 s of waiting per second of offset
      if (sleep < 0)
        sleep = -sleep
      sleep = int(sleep + 0.99)        # rounded by excess
      success = 1
    }
    /^ntpd: time set .*s$/ {
      success = 1
    }
    END {
      if (sleep) {
        printf "wait for the end of time slew, sleeping %d seconds\n", sleep
        system("sleep " sleep)
      }
      exit success                     # exit status 1 on success ends the while loop below
    }
  '
}

# echo "ntpd: time slew -0.003000s" | slew_sleeping; exit

while ntpd -gq | slew_sleeping; do :; done; ntpd



Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David L. Mills [EMAIL PROTECTED] writes:

David Serge, I didn't believe what you said until I checked the code and it
David does increase the correction by 50%, but limits the overshoot to 50
David ms. Why in the world would it overshoot at all?

Dave, this is one of the many problems with ntpdate and why we wanted to
kill it off since nobody was maintaining it.

As I recall, somebody said "For folks who want to run ntpdate out of cron,
we should do a bit of overshoot so we can home in on the right adjustment."

As I recall, the thought was "If we start off with no overshoot and make our
adjustment, the next time we run ntpdate we will make the same adjustment
that we just did.  So let's overshoot so next time we will be a bit closer."

I didn't say the idea makes a lot of sense, but hey.

-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Harlan Stenn
 In article [EMAIL PROTECTED], Tom Smith [EMAIL PROTECTED] writes:

Tom ntpdate -b steps the clock. That's the function under discussion.  The
Tom one that's used nearly universally in boot sequences.

Then change the boot sequence.

Using ntpdate -b to step the clock before starting ntpd is no longer best
common practice, and it hasn't been for a decent hunk of time.

-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread David L. Mills
Serge,

I didn't believe what you said until I checked the code and it does 
increase the correction by 50%, but limits the overshoot to 50 ms. Why 
in the world would it overshoot at all?
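
To make the rule concrete, here is a minimal sketch based only on the description in this thread (a correction increased by 50%, with the overshoot limited to 50 ms). It is not taken from ntpdate.c; the helper name overshoot_correction and the use of seconds as the unit are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: given a measured offset in seconds, print the
# correction that would be requested if the offset is increased by 50%
# and the extra part is capped at 50 ms (as described in the thread).
overshoot_correction() {
  awk -v offset="$1" 'BEGIN {
    extra = offset * 0.5                 # 50% on top of the measured offset
    if (extra > 0.050)  extra = 0.050    # cap the overshoot at +50 ms
    if (extra < -0.050) extra = -0.050   # and at -50 ms
    printf "%.6f\n", offset + extra
  }'
}

overshoot_correction 0.020     # prints 0.030000 (20 ms offset, 10 ms overshoot)
overshoot_correction 0.128     # prints 0.178000 (overshoot capped at 50 ms)
overshoot_correction -0.200    # prints -0.250000
```

Under these assumptions a 128 ms offset is deliberately over-corrected to 178 ms, which is the behaviour Serge objects to.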

Dave

Serge Bets wrote:

 Hello David,
 
  On Monday, February 11, 2008 at 19:03:36 +, David L. Mills wrote:
 
 
Both ntpdate and ntpd -q set the offset with adjtime() and then exit.
After that, stock Unix adjtime() slews the clock at rate 500 PPM,
which indeed could take 256 s for an initial offset of 128 ms.
 
 
 And on some systems, adjtime() calls adjtimex(ADJ_OFFSET_SINGLESHOT) to
 do the job.
 
 Note that ntpdate does not stop slewing when it reaches the zero offset,
 but voluntarily overshoots by 50%. That's why ntpdate -b (forced step)
 or ntpd -q (exact slew until zero) are so much better.
 
 
 
A prudent response would be to measure the initial offset and compute
the time to wait.
 
 
 Thanks! That's exactly what does the slew_sleeping script:
 
 
 #!/bin/sh
 
 slew_sleeping() {
   awk '
     { print }
     /^ntpd: time slew .*s$/ {
       sleep = $4 * 2000                # 500 PPM slew: 2000 s of waiting per second of offset
       if (sleep < 0)
         sleep = -sleep
       sleep = int(sleep + 0.99)        # rounded by excess
       success = 1
     }
     /^ntpd: time set .*s$/ {
       success = 1
     }
     END {
       if (sleep) {
         printf "wait for the end of time slew, sleeping %d seconds\n", sleep
         system("sleep " sleep)
       }
       exit success                     # exit status 1 on success ends the while loop below
     }
   '
 }
 
 # echo "ntpd: time slew -0.003000s" | slew_sleeping; exit
 
 while ntpd -gq | slew_sleeping; do :; done; ntpd
 
 
 
 Serge.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Harlan Stenn
Serge,

Interesting script - thanks.  Would you like me to put it in the
distribution?

This brings up an underlying question.  It is possible for events to unfold
in such a way that, even while in state 4, there will be future wiggles.
Some of them may even take us out of state 4.

Agreed?

If so, what benefit do we get by using the script to delay things while we
are waiting for a slew to finish while in state 4?

What difference does it make if the system in question is a client as
opposed to a server?

-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread David L. Mills
Guys,

Just for clarity, neither the daemon nor kernel frequency is adjusted in 
any way with ntpd -q.

Serge Bets wrote:

  On Monday, February 11, 2008 at 7:38:53 +, David Woolley wrote:
 
 
Serge Bets wrote:

the kind of slew (singleshot) initiated by ntpd -q comes *above* the
usual frequency correction

That assumes the use of the kernel time discipline
 
 
 Indeed: I sometimes forget this can lack or be disabled, sorry.
 
 
 Serge.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Serge Bets
Hello Harlan,

 On Monday, February 11, 2008 at 0:33:36 +, Harlan Stenn wrote:

 1) what are you trying to accomplish by the sequence:

  ntpd -gq ; wait a bit; ntpd

 that you do not get with:

  ntpd -g ; ntp-wait

Let's compare. I used a few-weeks-old ntp-dev 4.2.5p95, because the
latest p113 seems to behave strangely (clearing STA_UNSYNC long before
the clock is really synced). The driftfile exists and has a correct
value. ntp.conf declares one reachable LAN server with iburst. There are
4 main cases: initial phase offset bigger than 128 ms, or below, and
your startup method, or my method.

 -1) Initial phase offset over 128 ms, ntp-wait method:

| 0:00 # ntpd -g; ntp-wait; time_critical_apps
| 0:07 time step == the clock is very near 0 offset (less than a ms),
|  stratum 16, refid .STEP., state 4
| 0:12 ntp-wait terminates == time critical apps can be started
| 1:20 *synchronized, stratum x == ntpd starts serving good time

Timings are in minutes:seconds, relative to startup. Note this last
*sync stage, when ntpd takes a non-16 stratum, comes at a seemingly
random moment, sometimes as early as 0:40.


 -2) Initial phase offset over 128 ms, my slew_sleeping script:

| 0:00 # ntpd -gq | slew_sleeping; ntpd
| 0:07 time step, no sleep == near 0 offset (time critical apps can be
|  started)
| 0:14 *synchronized == ntpd starts serving good time


 -3) Initial phase offset below 128 ms, ntp-wait method (worst case):

| 0:00 # ntpd -g; ntp-wait; time_critical_apps
| 0:07 *synchronized == ntpd starts serving time, a still bad time,
|  because the 128 ms offset is not yet slewed
| 0:12 ntp-wait terminates == time critical apps are started
| 7:30 offset crosses the zero line for the first time, and begins an
|  excursion on the other side (up to maybe 40 ms). The initial good
|  frequency has been modified to slew the phase offset, and is now
|  wildly bad (by perhaps 50 or 70 ppm). The chaos begins, and will
|  stabilize some hours later.


 -4) Initial phase offset below 128 ms, slew_sleeping script:

| 0:00 ntpd -gq | slew_sleeping; ntpd
| 0:07 begin max rate slew, sleeping all the necessary time (max 256
|  seconds)
| 4:23 wake-up == near 0 offset, time critical apps can be started
| 4:30 *synchronized == ntpd starts serving good time
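
The split between cases 2 and 4 can be sketched as a small decision helper. This is only an illustration under the assumptions stated in this thread: offsets above 128 ms are stepped, smaller ones are slewed by adjtime() at 500 PPM. The name plan_startup is hypothetical, and the rounding mirrors the slew_sleeping script.

```shell
#!/bin/sh
# Hypothetical helper: given the initial offset in seconds, report whether
# "ntpd -gq" would be expected to step or slew, and how long a slew at
# 500 PPM would need (2000 s of wall time per second of offset).
plan_startup() {
  awk -v offset="$1" 'BEGIN {
    mag = (offset < 0) ? -offset : offset
    if (mag > 0.128) {
      print "step: no wait needed"
    } else {
      wait = int(mag * 2000 + 0.99)    # round up, as slew_sleeping does
      printf "slew: wait %d seconds\n", wait
    }
  }'
}

plan_startup 0.5       # prints: step: no wait needed
plan_startup -0.003    # prints: slew: wait 6 seconds
plan_startup 0.100     # prints: slew: wait 200 seconds
```

The worst slewed case, an offset of 128 ms, gives the 256 seconds quoted in case 4.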


Summary: The ntp-wait method is good at protecting apps against steps,
but not against large offsets (tens of ms, or even a hundred). The daemon
itself can start serving such less-than-good time. Startup takes more
time to reach a near 0 offset, and can wreck the frequency.

The ntpd -gq method does also avoid steps to applications, if all works
well. But it's not a 100% protection, not the goal. It also protects
apps against large offsets, never serves bad time, and never squashes
the driftfile. It makes a much saner daemon startup, more stable, where
the chaos situation described above (case #3) doesn't happen. It
starts up faster, except in the cases where ntp-wait cheats by
tolerating offsets that are not yet good.


If necessary, slew_sleeping and ntp-wait can be combined, for a better
level of protection. What about the following, that should survive even
a server temporarily unavailable during startup, further delaying time
critical apps:

| # ntpd -gq | slew_sleeping; ntpd -g; ntp-wait; time_critical_apps

One could also imagine looping ntpd -gq until it works, then sleep, then
ntpd and time_critical_apps (the slew_sleeping script has to be
modified to return a success code):

| # while ntpd -gq | slew_sleeping; do :; done; ntpd; time_critical_apps


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread David L. Mills
Tom,

With tinker step .001 in the configuration file, ntpd -q will step the 
clock, unless the residual offset is less than .001 s. This is probably 
more complexity than you can stand. Just keep using ntpdate and be happy.
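
For anyone who does want to try it, a minimal sketch of such a one-shot configuration follows. The server name ntp.example.net, the temporary file and the helper name write_step_conf are placeholders rather than anything from this thread; only the "tinker step .001" line comes from the message above.

```shell
#!/bin/sh
# Hypothetical helper: write a small config for a one-shot "ntpd -q" run
# that steps the clock whenever the offset exceeds 1 ms.
write_step_conf() {
  cat > "$1" <<'EOF'
server ntp.example.net iburst
tinker step 0.001
EOF
}

conf=$(mktemp)
write_step_conf "$conf"
cat "$conf"
# A boot script could then run something like:  ntpd -q -c "$conf"
rm -f "$conf"
```

Keeping this in a separate file from the run-phase ntp.conf means the lowered step threshold only applies to the one-shot run.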

Dave

Tom Smith wrote:

 ntpdate -b steps the clock. That's the function under discussion.
 The one that's used nearly universally in boot sequences.
 
 -Tom
 
 David L. Mills wrote:
 
 Guys,

 There seems to be some misinformation here.

 Both ntpdate and ntpd -q set the offset with adjtime() and then exit. 
 After that, stock Unix adjtime() slews the clock at rate 500 PPM, 
 which indeed could take 256 s for an initial offset of 128 ms. A 
 prudent response would be to measure the initial offset and compute 
 the time to wait. The ntp-wait script waits for ntpd to enter state 4, 
 which could happen with an initial offset as high as 128 ms.

 The ntpd time constant is purposely set somewhat large at 2000 s, 
 which results in a risetime of about 3000 s. This is a compromise for 
 stable acquisition for herky-jerky Internet paths and speed of 
 convergence for LANs. For typical Internet paths the Allan intercept 
 is about 2000 s. For fast LANs with nanosecond clock resolution, the 
 Allan intercept can be as low as 250s, which is what the kernel PPS 
 loop is designed for.

 Both the daemon and kernel loops are engineered so that the time 
 constant is directly proportional to the poll interval and the 
 risetime scales directly. If the poll exponent is set to the minimum 4 
 (16 s) the risetime is 500 s. While not admitted in public, the 
 latest snapshot can set the poll interval to 3 (8 s), so the risetime 
 is 250 s. This works just fine on a LAN, but I would never do this on 
 an outside circuit.

 Dave
snip



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Unruh
Harlan Stenn [EMAIL PROTECTED] writes:

 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Harlan In terms of the behavior model of ntp, state 4 is as good as it
Harlan gets.  You are in the right ballpark.

Unruh And as has been commented on numerous times, ntp in state 4 is very
Unruh slow to converge to the best possible time control. This was a
Unruh deliberate design decision, as I understand it, so that in steady
Unruh state the time is averaged over a large number of samples ( not
Unruh helped by the fact that 85% of samples are thrown away), to reduce
Unruh the statistical error in the clock control.  Note that at poll 7 the
Unruh number of actual samples averaged over in the time scale of the ntp
Unruh feedback loop is only about 3, so the statistical averaging even with
Unruh such a long time constant, is not very good.

OK, and please don't take this the wrong way, but So What?

For the general use case (LAN and/or WAN and/or jerky path) ntpd behaves
well.

The question is not does it work well, but does it work the best it can.


As Dave recently replied, if you are only interested in LAN performance
there are tweaks that can be made that will improve the performance.

No, I am interested in the behaviour in general. That is why I am trying to
test it on an ADSL link as well.


The current setup will Just Work regardless of the network environment.

This, to me, is the sign of a good piece of software.

If somebody with extra knowledge can make a local optimization based on
tighter specs, great.

The question is whether or not it can be made better in general.



I suspect that if Dave can be shown that whatever chrony is doing will
behave in the wider space that NTP covers, he will be OK making changes to
use those algorithms.

There may even be a way to choose different algorithms based on the behavior
in evidence.

But you seem to be talking about how improvements can be made and I thought
this original thread was about how there was a *problem*.

This original thread was about how ntpdate had too small a buffer for a
given use-- a very easily fixable problem. It then wandered to whether or
not ntpdate should be axed or not. And then as an aside I mentioned further
experiments I was doing on the comparison of chrony and ntp-- mentioned
because one of the reasons ntpdate is used is the slow convergence of ntp
to the true time. I mentioned that chrony has much faster convergence. 
So as sometimes happens in threads, they wander, and in this case I was at
least partially responsible for part of the wander.




 If you want something else, something you consider better than state 4,
 please make a case for this and lobby for it.

Unruh I think many people have lobbied for faster response. In the
Unruh discussion of the chrony/ntp comparison, chrony is much faster to
Unruh correct errors, and at least on a local network, better at
Unruh disciplining the clock as well ( in part I think because on such a
Unruh minimal round trip network, the frequency fluctuations dominate over
Unruh the offset measurement errors, i.e., the Allan intercept is much lower
Unruh than the assumed 1500 sec. in that kind of situation-- also the drift
Unruh model on real systems is not well modeled by 1/f noise.) So, what I
Unruh think the point is that using ntpdate, one can rapidly bring the
Unruh clock into a few msec of the correct time, rather than waiting for
Unruh the feedback loop to finally eliminate that last 128msec of offset.

OK, and again, I'm seeing you lobby for an enhancement/improvement here (and
I'm all for that).

David (I think) was talking about a *problem*.

I agree with you that we can do better.

I am trying to see if there is also a problem.

Too many potential problems. I am confused about which one. 


Harlan If folks have suggestions for improvements I welcome them.

Harlan If folks want something different I invite them to make a case for
Harlan it.  Please remember the scope and complexity of the problem case.
Harlan It's much easier to have a simpler solution if one is prepared to
Harlan ignore certain problems.  Another case in this point is Maildir.

Harlan If somebody is in the situation where they know they have specific
Harlan requirements for time, they are in the situation where they have
Harlan enough altitude on their requirements to know the costs/benefits of
Harlan what is involved in getting there.

Unruh Well, I disagree. The sign of a good piece of software is that it
Unruh does what it needs to do despite the user having a bad idea of how to
Unruh accomplish the task.

Sounds like NTP.  Folks often have pretty bad ideas about what they need
to do or what problems they think they are solving by doing strange things
and the code works anyway.

Mine was a specific response to the comment that Harlan made. 


But more to the point, what is the *problem* you are trying to solve?  You
are still communicating to me that we can do *better* and I agree with you.
You 

Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Jason Rabel
I've tried to keep quiet and bite my tongue at this whole ntp vs chrony
thing... But something has been nagging me in the back of my head that I
just have to know the answer to...

How are you measuring your results? From what I've skimmed over you are
simply using each program's own generated statistics... Wouldn't a more
correct way be to use an external (and calibrated) device to measure /
compare to ensure the results are actually valid? Otherwise you are in
essence comparing apples to oranges...



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread Hal Murray

So, no, I am comparing apples to apples ( the offsets as determined from
the ntp packet exchange mechanism which both use and both report). 

Another approach is to setup a 3rd machine to watch both.

-- 
These are my opinions, not necessarily my employer's.  I hate spam.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-11 Thread David L. Mills
Guys,

There seems to be some misinformation here.

Both ntpdate and ntpd -q set the offset with adjtime() and then exit. 
After that, stock Unix adjtime() slews the clock at rate 500 PPM, which 
indeed could take 256 s for an initial offset of 128 ms. A prudent 
response would be to measure the initial offset and compute the time to 
wait. The ntp-wait script waits for ntpd to enter state 4, which could 
happen with an initial offset as high as 128 ms.

The ntpd time constant is purposely set somewhat large at 2000 s, which 
results in a risetime of about 3000 s. This is a compromise for stable 
acquisition for herky-jerky Internet paths and speed of convergence for 
LANs. For typical Internet paths the Allan intercept is about 2000 s. 
For fast LANs with nanosecond clock resolution, the Allan intercept can 
be as low as 250s, which is what the kernel PPS loop is designed for.

Both the daemon and kernel loops are engineered so that the time 
constant is directly proportional to the poll interval and the risetime 
scales directly. If the poll exponent is set to the minimum 4 (16 s) the 
risetime is 500 s. While not admitted in public, the latest snapshot 
can set the poll interval to 3 (8 s), so the risetime is 250 s. This 
works just fine on a LAN, but I would never do this on an outside circuit.
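
The scaling in the last paragraph can be written down as a one-line calculation. This sketch only reproduces the two figures given there (500 s at a 16 s poll interval, 250 s at 8 s) by assuming the risetime is strictly proportional to the poll interval; the helper name risetime_seconds is hypothetical, and the formula should not be read as a statement about other poll values.

```shell
#!/bin/sh
# Hypothetical helper: risetime in seconds for a given poll exponent,
# assuming linear scaling anchored at 500 s for a 16 s poll interval.
risetime_seconds() {
  poll_exp=$1
  interval=$(( 1 << poll_exp ))      # poll interval in seconds (2^exponent)
  echo $(( 500 * interval / 16 ))    # 500 s at 16 s, scaled linearly
}

risetime_seconds 4    # prints 500
risetime_seconds 3    # prints 250
```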

Dave

Unruh wrote:
 Harlan Stenn [EMAIL PROTECTED] writes:
 
 
In article [EMAIL PROTECTED], David Woolley [EMAIL PROTECTED] writes:
 
 
David Harlan Stenn wrote:

Why would ntpd be exiting during a warm start?
 
 
David Because we are discussing using it with the -q option.  If you just
David use -g, it will take a lot longer to converge within a few
David milliseconds, as it will not slew at the maximum rate.  If you use
David -q, you need to force a step if you want fast convergence.
 
 
I still maintain you are barking up the wrong tree.
 
 
In terms of the behavior model of ntp, state 4 is as good as it gets.  You
are in the right ballpark.
 
 
 And as has been commented on numerous times, ntp in state 4 is very slow to
 converge to the best possible time control. This was a deliberate design
 decision, as I understand it, so that in steady state the time is averaged
 over a large number of samples ( not helped by the fact that 85% of samples
 are thrown away), to reduce the statistical error in the clock control.
 Note that at poll 7 the number of actual samples averaged over in the time
 scale of the ntp feedback loop is only about 3, so the statistical
 averaging even with such a long time constant, is not very good.
 
 
 
If you want something else, something you consider better than state 4,
please make a case for this and lobby for it.
 
 
 I think many people have lobbied for faster response. In the discussion of
 the chrony/ntp comparison, chrony is much faster to correct errors, and at
 least on a local network, better at disciplining the clock as well ( in
 part I think because on such a minimal round trip network, the frequency
 fluctuations dominate over the offset measurement errors, i.e., the Allan
 intercept is much lower than the assumed 1500 sec. in that kind of
 situation-- also the drift model on real systems is not well modeled by 1/f
 noise.) So, what I think the point is that using ntpdate, one can rapidly
 bring the clock into a few msec of the correct time, rather than waiting
 for the feedback loop to finally eliminate that last 128msec of offset.
 
 
For the case I'm describing the startup script sequence is to fire up
'ntpd -g' early.  If there are applications that need the system clock to
be on-track stable (even if a wiggle is being dealt with), that's 'state
4', and running 'ntp-wait' before starting those services is, to the best
of my knowledge, all that is required.
 
 
David State 4 means within 128ms and using the normal control loop, which
David has a time constant of around an hour.
 
 
OK, and so what?
 
 
Is State 4 insufficient for your needs, or are you just splitting hairs?
 
 
David For a cold start, it won't reach state 4 for a further 900 seconds
David after first priming the clock filter.
 
 
If the system has a good drift file, I disagree with you.
 
 
David The definition of cold start is that there is no drift file.
 
 
OK, now I know what the definitions are.
 
 
I don't recall offhand the expected time to hit state 4 without a drift
file.
 
 
1) This should not be the ordinary case
2) How does this have any bearing on the ntpdate -b discussion?
 
 
And what is the big deal with using different config files?  The config
file mechanism has include capability so it is trivial to easily
maintain common 'base' configuration with customizations for separate
start/run phases.
 
 
David You are now talking about using -q.  The difficulty is that people
David have enough trouble getting the run phase config file right.
 
 
I mention it because it's what you seem to be insisting on talking about.
 
 
I was providing a way to address the problems you describe with 

Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-10 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David Woolley [EMAIL PROTECTED] writes:

David Harlan Stenn wrote:
 Why would ntpd be exiting during a warm start?

David Because we are discussing using it with the -q option.  If you just
David use -g, it will take a lot longer to converge within a few
David milliseconds, as it will not slew at the maximum rate.  If you use
David -q, you need to force a step if you want fast convergence.

I still maintain you are barking up the wrong tree.

In terms of the behavior model of ntp, state 4 is as good as it gets.  You
are in the right ballpark.

If you want something else, something you consider better than state 4,
please make a case for this and lobby for it.

 For the case I'm describing the startup script sequence is to fire up
 'ntpd -g' early.  If there are applications that need the system clock to
 be on-track stable (even if a wiggle is being dealt with), that's 'state
 4', and running 'ntp-wait' before starting those services is, to the best
 of my knowledge, all that is required.

David State 4 means within 128ms and using the normal control loop, which
David has a time constant of around an hour.

OK, and so what?

Is State 4 insufficient for your needs, or are you just splitting hairs?

David For a cold start, it won't reach state 4 for a further 900 seconds
David after first priming the clock filter.

 If the system has a good drift file, I disagree with you.

David The definition of cold start is that there is no drift file.

OK, now I know what the definitions are.

I don't recall offhand the expected time to hit state 4 without a drift
file.

1) This should not be the ordinary case
2) How does this have any bearing on the ntpdate -b discussion?

 And what is the big deal with using different config files?  The config
 file mechanism has include capability so it is trivial to easily
 maintain common 'base' configuration with customizations for separate
 start/run phases.

David You are now talking about using -q.  The difficulty is that people
David have enough trouble getting the run phase config file right.

I mention it because it's what you seem to be insisting on talking about.

I was providing a way to address the problems you describe with the (IMO
bad) mechanism (-q) under discussion.

 But the bigger problem is why are you insisting on separate start/run
 phases?  This has not been best practice for quite a while, and if you
 insist on using this method you will be running in to the exact problems
 you are describing.

 No, the best advice is to understand why you have been using ntpdate -b
 so far and understand the pros/cons of the new choices.

David We are talking about system managers and package creators, neither of
David which have much time to study the details.

Blessed are those who get what they deserve.

These are the same folks who must get ssh configurations and various other
network configurations working.

If the stock things work well enough for folks, great.

If folks have suggestions for improvements I welcome them.

If folks want something different I invite them to make a case for it.
Please remember the scope and complexity of the problem case.  It's much
easier to have a simpler solution if one is prepared to ignore certain
problems.  Another case in this point is Maildir.

If somebody is in the situation where they know they have specific
requirements for time, they are in the situation where they have enough
altitude on their requirements to know the costs/benefits of what is
involved in getting there.

-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-10 Thread Serge Bets
Hello David,

 On Sunday, February 10, 2008 at 10:55:29 +, David Woolley wrote:

 However, if it wasn't stepped, because it was already within 128ms, it
 will be slewing at maximum rate. Allowing 100ppm for motherboard
 tolerances, that means that it can take up to a further 320 seconds to
 reach the low milliseconds.

Only 256 seconds maximum, because the kind of slew (singleshot)
initiated by ntpd -q comes *above* the usual frequency correction
already annihilating the motherboard error.
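
The two figures differ only in the net slew rate assumed. As a rough worked example (the helper name slew_seconds is hypothetical): 320 seconds corresponds to a 500 PPM slew reduced by an uncorrected 100 PPM frequency error, and 256 seconds to the full 500 PPM when the frequency error is already compensated.

```shell
#!/bin/sh
# Hypothetical helper: seconds needed to slew away an offset (in ms)
# at a given net rate (in PPM).
slew_seconds() {
  awk -v offset_ms="$1" -v rate_ppm="$2" 'BEGIN {
    # (ms / 1000) / (ppm / 1e6) = ms * 1000 / ppm, rounded to the nearest second
    printf "%d\n", offset_ms * 1000 / rate_ppm + 0.5
  }'
}

slew_seconds 128 500    # prints 256 (frequency error already corrected)
slew_seconds 128 400    # prints 320 (500 PPM slew minus 100 PPM error)
```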


 I don't believe it would be safe to start ntpd in normal mode within
 that period.

Indeed: the daemon then behaves strangely, not sane at all. Last year
I published here an awk script calling ntpd -gq and then sleeping
until any resulting slew is finished. After that, normal mode ntpd can be
started safely. And of course the daemon really appreciates starting up
with a near-zero initial phase offset.


Serge.
-- 
Serge point Bets arobase laposte point net



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-10 Thread Harlan Stenn
 In article [EMAIL PROTECTED], Serge Bets [EMAIL PROTECTED] writes:

David I don't believe it would be safe to start ntpd in normal mode within
David that period.

Serge Indeed: the daemon then behaves strangely, not sane at all. Last year
Serge I published here an awk script calling ntpd -gq and then sleeping
Serge until any resulting slew is finished. After that, normal mode ntpd can
Serge be started safely. And of course the daemon really appreciates
Serge starting up with a near-zero initial phase offset.

1) what are you trying to accomplish by the sequence:

 ntpd -gq ; wait a bit; ntpd

that you do not get with:

 ntpd -g ; ntp-wait

2) there have been recent changes to the initial frequency/offset situation
with ntp-dev.  Have you tried the latest code to see how it behaves?

-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-10 Thread David Woolley
Serge Bets wrote:

 
 Only 256 seconds maximum, because the kind of slew (singleshot)
 initiated by ntpd -q comes *above* the usual frequency correction
 already annihilating the motherboard error.

That assumes the use of the kernel time discipline, although if you 
don't have that, it is even more important to use ntpdate -b, if you 
want fast phase convergence, as the time won't drift much between the 
initial set and the start of ntpd.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread David Woolley
David L. Mills wrote:
 Harlan,
 
 You make some good points. However, if folks want SNTP from here I think 
 they would prefer it in its own distribution rather than bundle it with 
 the huge NTP distribution. You can make a strong argument to host here 

I don't think you are ever going to get rid of ntpdate from the 
distribution (as supplied by packagers and vendors) until ntpd offers a 
mode which sets the time within about one second of being started.  I'm 
not convinced that SNTP will displace ntpdate for this purpose.  People 
don't want to delay boot sequences, but they also don't want to start 
applications until the time has been set.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Richard B. Gilbert
David Woolley wrote:
 David L. Mills wrote:
 
 Harlan,

 You make some good points. However, if folks want SNTP from here I 
 think they would prefer it in its own distribution rather than bundle 
 it with the huge NTP distribution. You can make a strong argument to 
 host here 
 
 
 I don't think you are ever going to get rid of ntpdate from the 
 distribution (as supplied by packagers and vendors) until ntpd offers a 
 mode which sets the time within about one second of being started.  I'm 
 not convinced that SNTP will displace ntpdate for this purpose.  People 
 don't want to delay boot sequences, but they also don't want to start 
 applications until the time has been set.

How long does ntpd -g take to set the time?  As I understand it, it's 
supposed to query the configured servers, make a best guess as to what 
time it is, set that, and then go to normal operation.

That should put you within a second or so.  If you need better, either 
wait for it, or keep your server alive 24x7x365.  I think most data 
centers do run 24x7x365.  If you're talking about a data center that 
lives under the boss's desk, consider buying a UPS and hope that the 
power doesn't fail for longer than the run time.




Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Tom Smith
Richard B. Gilbert wrote:
 David Woolley wrote:
 David L. Mills wrote:

 Harlan,

 You make some good points. However, if folks want SNTP from here I 
 think they would prefer it in its own distribution rather than bundle 
 it with the huge NTP distribution. You can make a strong argument to 
 host here 


 I don't think you are ever going to get rid of ntpdate from the 
 distribution (as supplied by packagers and vendors) until ntpd offers 
 a mode which sets the time within about one second of being started.  
 I'm not convinced that SNTP will displace ntpdate for this purpose.  
 People don't want to delay boot sequences, but they also don't want to 
 start applications until the time has been set.
 
 How long does ntpd -g take to set the time?  As I understand it, it's 
 supposed to query the configured servers, make a best guess as to what 
 time it is, set that, and then go to normal operation.
 
 That should put you within a second or so.  If you need better, either 
 wait for it, or keep your server alive 24x7x365.  I think most data 
 centers do run 24x7x365.  If you're talking about a data center that 
 lives under the boss's desk, consider buying a UPS and hope that the 
 power doesn't fail for longer than the run time.

David is right.

He means be done with it, including hard-setting the clock, within a second.
The accuracy expected, based on ntpdate -b as the benchmark you are trying to
replace, is within a small number of milliseconds of the specified servers.

Sorry, ntpd -q doesn't meet the requirements.

-Tom



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Steve Kostecke
On 2008-02-09, Tom Smith [EMAIL PROTECTED] wrote:

 He means be done with it, including hard-setting the clock, within a
 second. The accuracy expected, based on ntpdate -b as the benchmark
 you are trying to replace, is within a small number of milliseconds of
 the specified servers.

 Sorry, ntpd -q doesn't meet the requirements.

You need to be realistic about your requirements.

In the case of systems which run time sensitive services, or are rarely
rebooted, an ~11 second pause, which _is_ about the amount of time it
takes for 'ntpd -gq' to do a quick sanity check on your configured time
servers and set the clock, is not unreasonable.

In the case of systems which do not run time critical services there
is no reason why ntpd can not be started with -g and be allowed to set
the clock as the boot progresses. In most cases the clock will be set
before, or very shortly after, the boot sequence is completed.

The big issue in the ntpdate vs ntpd -gq debate is the fact that the
former may be used over unprivileged ports while the latter can not.
This gives ntpdate the advantage in situations where a firewall is
blocking port 123/UDP.

That's what you should be complaining about, not some trivial 11 second
delay.
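
To make the port distinction concrete, here is a rough sketch, assuming a
POSIX sockets environment, of how a client can send from an unprivileged
source port rather than from 123/UDP. The names build_client_request and
open_unprivileged_udp are hypothetical and are not taken from ntpdate or
ntpd; the packet is a bare 48-byte client-mode request with only the first
octet filled in.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define NTP_PKT_LEN 48

/* Hypothetical helper: fill in a minimal client request.
 * First octet = LI (2 bits) | VN (3 bits) | Mode (3 bits). */
static void build_client_request(unsigned char pkt[NTP_PKT_LEN])
{
    assert(pkt != NULL);
    memset(pkt, 0, NTP_PKT_LEN);
    pkt[0] = (unsigned char)((0 << 6) | (4 << 3) | 3); /* LI=0, VN=4, Mode=3 (client) */
}

/* Hypothetical helper: open a UDP socket bound to a kernel-chosen
 * ephemeral port instead of the privileged port 123.
 * Returns the descriptor, or -1 on failure. */
static int open_unprivileged_udp(void)
{
    struct sockaddr_in local;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(0);   /* 0 asks the kernel for an ephemeral port */

    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would then sendto() the packet to port 123 on the server; the reply
comes back to whatever ephemeral port the kernel picked, which is what lets
this style of client work where the source port 123 is filtered.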

-- 
Steve Kostecke [EMAIL PROTECTED]
NTP Public Services Project - http://support.ntp.org/



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David Woolley [EMAIL PROTECTED] writes:

David I don't think you are ever going to get rid of ntpdate from the
David distribution (as supplied by packagers and vendors) until ntpd offers
David a mode which sets the time within about one second of being started.

The current sntp code can do this now.

David I'm not convinced that SNTP will displace ntpdate for this purpose.

Why not?

David People don't want to delay boot sequences, but they also don't want
David to start applications until the time has been set.

Then I submit you are focusing a bit too deeply on the details and invite
you to take a step back.

I believe the current set of tools can be used in a variety of combinations
that will handle the various cases to the best that we know how to do them.

If you want to get the time set *now* and then start, regardless of how well
the system can maintain that time, we can do that (sntp/ntpdate+ntpd).

If you want to set the time ASAP and have stable system time before starting
your apps, in the usual case you are talking about 11 seconds for this to
happen (ntpd -g, with iburst, early in the boot sequence, using ntp-wait
later in the boot sequence, just before starting time-critical services).

Near as I can recall, any other cases have looser constraints so they're not
particularly interesting for this conversation.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David L. Mills [EMAIL PROTECTED] writes:

David Harlan, You make some good points. However, if folks want SNTP from
David here I think they would prefer it in its own distribution rather than
David bundle it with the huge NTP distribution.

That's not the feedback I have received, but I will note it would be
possible to have an ntp+sntp distribution and a separate sntp
distribution.  It would take a couple of days' time to do this, and I have
much hotter fires to put out first.  Additionally, there will be significant
changes in the code layout as the sntp code is overhauled, so I'd prefer to
wait on this additional distribution tarball until that effort is completed.

David You can make a strong
David argument to host here if the claim that both NTP and SNTP are
David strictly specification conformant. That's why I rewrote the SNTP
David documentation to take out all mention that it could be used as a
David server.

OK.

David The three of us that wrote rfc 2030 had just come down from a massive
David clogging situation at UWisc and NIST and were frantic to get across
David the need for polite client behavior. This has to do with DNS lookups,
David poll intervals and behavior when no response is received. Even so,
David there remain at least three violators of those principles right now
David on two of our public servers. Therefore, if an SNTP product leaves
David here, it really and surely should comply with the on-wire protocol
David in the NTPv4 spec and these best practices.

We're on the same page.

David As an aside, I should reveal my biases. At the moment, to configure the
David current software on a Sun Ultra 5 takes 12 minutes, 6 minutes for
David NTP and 6 minutes for SNTP. But, it takes only 8 minutes to compile
David and link all programs, including both NTP and SNTP. It is not now
David possible to build either separately.

I'm not sure what you mean about building separately.

We *used* to be able to build:

- ntp + sntp:
  configure ; make

- ntp only:
  configure --without-sntp ; make

- sntp only:
  cd sntp ; configure ; make

About a year and a half ago we got the SNTP code to the point where it would
build on Unix (nobody has done the work for Windows, but apparently nobody
is asking for it there either - http://bugs.ntp.org/500 has the details).

Since we've been announcing that ntpdate will be deprecated because its
functionality can be replaced by various combinations of ntpd and sntp, we
made sntp a 'required' part of the NTP build.

David As I have said privately before, the NTP daemon can be operated in
David SNTP mode which does everything NTP does, but terminates just after
David the clock has been set for the first time. Yes, it has a rather large
David footprint, but it lasts only about 11 seconds. The downside is that
David it requires a configuration file containing a list of servers. If
David this were done on the command line, NTP in SNTP mode would be
David indistinguishable from SNTP other than a command line option.

You have provided a mechanism for doing this.  It will be an acceptable
choice for a good number of people.  But there is a significant group of
people for whom this particular mechanism will not work.

They require any or all of the following:

- a small footprint
- set the time with the smallest possible delay

While we might be able to achieve the smallest delay with ntpd, I don't
currently see how we can do that while also offering full NTP support from a
single binary and achieve the small footprint.

David So, the ideal solution would seem to include a list of links on the
David NTP home page to external sites and in addition internal links to the
David NTP and SNTP distributions along with a statement that both are
David strictly specification conformant. That might inspire other wannabees
David to make and enforce similar claims.

We already have internal and external links on the ntp.org site.

And if somebody wants additional or different information there, contact
information is also listed in what should be obvious places.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Richard B. Gilbert
Harlan Stenn wrote:
 Guys,
 
 This is all discussed pretty well at:
 
  http://support.ntp.org/bin/view/Dev/DeprecatingNtpdate
 
 So far everything I have seen in this thread has already been covered on
 that page.


I just followed the above link.  I see ONE feature missing!

ntpdate -Du  (I think it's -D) does NOT set the clock, it simply tells 
you what it would have done had it been permitted to do so.  I suppose 
this feature is not essential but I've used it a time or two to find out
how my time agreed, or disagreed, with some other server.



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-09 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David Woolley [EMAIL PROTECTED] writes:

David Harlan Stenn wrote:
 In article [EMAIL PROTECTED], David Woolley
 [EMAIL PROTECTED] writes:


David I'm not convinced that SNTP will displace ntpdate for this purpose.
 Why not?

David Because ntpdate is fixed in the popular culture and, for the ordinary
David user, SNTP doesn't offer any obvious advantages.

Well, The Plan is to remove ntpdate.  So unless somebody writes a
contributed script, the fact that ntpdate (with its known bugs) is going
away and a documented set of functional equivalents will be available will
probably be all the convincing that is needed.

 If you want to get the time set *now* and then start, regardless of how
 well the system can maintain that time, we can do that
 (sntp/ntpdate+ntpd).

David Not in Dave Mills future of ntpd, as you don't get ntpdate or SNTP.

That would be true if Dave controlled the contents of the distribution.

There is a set of required functionality out there that will be met by the
distribution I control.  There may be distributions I roll that have
subset functionality, and Dave may choose to offer other distributions.

I see no benefit and many problems in forcing this issue too soon, so at
the moment it is a topic for discussion and the situation seems to be on
track right now.

This is, by no means, the most important thing we're all working on right
now.

Getting the sntp code up to spec is far more important, IMO.

 If you want to set the time ASAP and have stable system time before
 starting your apps, in the usual case you are talking about 11 seconds
 for this to happen (ntpd -g, with iburst, early in the boot sequence,
 using ntp-wait later in the boot sequence, just before starting
 time-critical services).

David I suspect that only sets the time to the nearest 128ms, unless it
David does something that ntpd doesn't normally do.

I suspect you are mistaken, and what I describe is correct.

In the case I describe, at the end of that O(11 second) period the clock is
Real Close (ie, the offset is low enough), the frequency drift is known
and compensated for, and ntpd is in state 4.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-08 Thread Unruh
Harlan Stenn [EMAIL PROTECTED] writes:

 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh Harlan Stenn [EMAIL PROTECTED] writes:
 Bill,

 ntpdate is being deprecated.

Unruh Maybe, but it should still not have bugs if it is actually still part
Unruh of the distro.

I mostly agree with you.  And one reason there are a bunch of outstanding
bugs in ntpdate is that nobody has stepped forward to maintain it,
especially after the last round of bugs where we decided that the best thing
to do for ntpdate was kill it off and replace it with sntp.

Speaking of which, I need to ping the folks who volunteered to work on the
SNTP code and see what the status is.

 And it is *much* better to file reports like this using bugs.ntp.org as
 otherwise they tend to get lost in the wind.

Unruh OK. Will do.

I saw that bug get filed - thanks a bunch!

Where it met with the same reaction: ntpdate is deprecated, so why fix the
bug? Do you want to bet that ntpdate will still be there in 2010?



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-08 Thread Harlan Stenn
 In article [EMAIL PROTECTED], David L. Mills [EMAIL PROTECTED] writes:

David Harlan, My position on ntpdate and sntp has always been clear. Remove
David them both from the distribution and let other folks contribute sntp
David products.

Yes, your position has been clear and your opinion has been noted.

David The standards labs in various countries do not recommend the
David NTP reference implementation, they recommend other shrinkwrap
David products.

I'd appreciate references on this point.  And how is it germane to this
discussion?

David There is no need for folks to download the reference
David implementation only to bring up an sntp product.

Yes, which is why the sntp code can be trivially bundled separately.

The feedback I have received is that the majority of folks want the
distribution to contain both ntp and sntp.

David The matter of concern is an sntp product that strictly conforms to
David the NTPv4 specification as it applies to sntp. There is at least one
David contributor testing the kiss-o'-death rate limit and has apparently
David actually read rfc 2030. On the other hand, there are numerous
David examples of clients that casually violate the rate rules both at
David servers we operate here and at the national labs.

Yup.

David What we should be
David doing is supporting those products that play by the rules and that
David are maintained by other players.

This depends first on the definition of we, and then on the definition of
supporting.

The people who talk to me want an SNTP implementation from the NTP Project.

Nobody is expecting you to ride herd over any SNTP code that may or may not
be part of the same tarball that includes NTP.  I am mulling over different
ideas in this regard.

Two obvious ways to go on NTP/SNTP are to have shared code, or completely
separate codebases.  There is some middle ground regarding support
libraries.

I see difficult tradeoffs with either approach.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-08 Thread David L. Mills
Harlan,

You make some good points. However, if folks want SNTP from here I think 
they would prefer it in its own distribution rather than bundle it with 
the huge NTP distribution. You can make a strong argument to host here 
if the claim that both NTP and SNTP are strictly specification 
conformant. That's why I rewrote the SNTP documentation to take out all 
mention that it could be used as a server.

The three of us that wrote rfc 2030 had just come down from a massive 
clogging situation at UWisc and NIST and were frantic to get across the 
need for polite client behavior. This has to do with DNS lookups, poll 
intervals and behavior when no response is received. Even so, there 
remain at least three violators of those principles right now on two of 
our public servers. Therefore, if an SNTP product leaves here, it really 
and surely should comply with the on-wire protocol in the NTPv4 spec 
and these best practices.

As an aside, I should reveal my biases. At the moment, to configure the 
current software on a Sun Ultra 5 takes 12 minutes, 6 minutes for NTP 
and 6 minutes for SNTP. But, it takes only 8 minutes to compile and link 
all programs, including both NTP and SNTP. It is not now possible to 
build either separately.

As I have said privately before, the NTP daemon can be operated in SNTP 
mode which does everything NTP does, but terminates just after the clock 
has been set for the first time. Yes, it has a rather large footprint, 
but it lasts only about 11 seconds. The downside is that it requires a 
configuration file containing a list of servers. If this were done on 
the command line, NTP in SNTP mode would be indistinguishable from SNTP 
other than a command line option.

So, the ideal solution would seem to include a list of links on the NTP 
home page to external sites and in addition internal links to the NTP 
and SNTP distributions along with a statement that both are strictly 
specification conformant. That might inspire other wannabees to make and 
enforce similar claims.

Dave

Harlan Stenn wrote:
In article [EMAIL PROTECTED], David L. Mills [EMAIL PROTECTED] writes:
 
 
 David Harlan, My position on ntpdate and sntp has always been clear. Remove
 David them both from the distribution and let other folks contribute sntp
 David products.
 
 Yes, your position has been clear and your opinion has been noted.
 
 David The standards labs in various countries do not recommend the
 David NTP reference implementation, they recommend other shrinkwrap
 David products.
 
 I'd appreciate references on this point.  And how is it germane to this
 discussion?
 
 David There is no need for folks to download the reference
 David implementation only to bring up an sntp product.
 
 Yes, which is why the sntp code can be trivially bundled separately.
 
 The feedback I have received is that the majority of folks want the
 distribution to contain both ntp and sntp.
 
 David The matter of concern is an sntp product that strictly conforms to
 David the NTPv4 specification as it applies to sntp. There is at least one
 David contributor testing the kiss-o'-death rate limit and has apparently
 David actually read rfc 2030. On the other hand, there are numerous
 David examples of clients that casually violate the rate rules both at
 David servers we operate here and at the national labs.
 
 Yup.
 
 David What we should be
 David doing is supporting those products that play by the rules and that
 David are maintained by other players.
 
 This depends first on the definition of we, and then on the definition of
 supporting.
 
 The people who talk to me want an SNTP implementation from the NTP Project.
 
 Nobody is expecting you to ride herd over any SNTP code that may or may not
 be part of the same tarball that includes NTP.  I am mulling over different
 ideas in this regard.
 
 Two obvious ways to go on NTP/SNTP are to have shared code, or completely
 separate codebases.  There is some middle ground regarding support
 libraries.
 
 I see difficult tradeoffs with either approach.



[ntp:questions] ntpdate.c unsafe buffer write

2008-02-07 Thread Unruh
In ntpdate.c around line 542 (4.2.4p4) is the sequence

if (!authistrusted(sys_authkey)) {
        char buf[10];

        (void) sprintf(buf, "%lu", (unsigned long)sys_authkey);
        msyslog(LOG_ERR, "authentication key %s unknown", buf);
        exit(1);
}

Since unsigned long does not have the same size on all machines, its decimal
representation plus the terminating NUL can easily be longer than 10 bytes,
so that buf is ripe for buffer overflow.
It should be something like
   char buf[(sizeof(unsigned long)*12/5+2)];
And/or the sprintf should be an snprintf.
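
For illustration, a minimal sketch of the sizing and snprintf suggestion
above. The helper name format_keyid and the macro ULONG_DEC_BUFSZ are made
up for this example and are not part of ntpdate; msyslog is left out so the
fragment compiles on its own. With a 32-bit unsigned long the largest value,
4294967295, already needs 11 bytes including the NUL, and a 64-bit one needs
up to 21.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Upper bound on the bytes needed to print an unsigned long in decimal,
 * including the terminating NUL. Each byte contributes about
 * log10(256) ~= 2.41 digits; 12/5 = 2.4 approximates that, and the +2
 * covers the integer-division truncation plus the NUL for the common
 * 2-, 4-, 8- and 16-byte sizes. */
#define ULONG_DEC_BUFSZ (sizeof(unsigned long) * 12 / 5 + 2)

/* Hypothetical helper (not in ntpdate): format keyid into buf.
 * Returns 0 on success, -1 if the text would not fit in buflen bytes.
 * snprintf never writes past buflen, so a short buffer is truncated
 * instead of overflowed. */
static int format_keyid(char *buf, size_t buflen, unsigned long keyid)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "%lu", keyid);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Declaring the buffer as char buf[ULONG_DEC_BUFSZ] and checking the return
value covers both halves of the suggestion: the buffer is large enough, and
a mistake in the size cannot scribble over the stack.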




Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-07 Thread Harlan Stenn
Bill,

ntpdate is being deprecated.

And it is *much* better to file reports like this using bugs.ntp.org as
otherwise they tend to get lost in the wind.

H
--
 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh In ntpdate.c around line 542 (4.2.4p4) is the sequence if
Unruh (!authistrusted(sys_authkey)) { char buf[10];

Unruh  (void) sprintf(buf, "%lu", (unsigned long)sys_authkey);
Unruh msyslog(LOG_ERR, "authentication key %s unknown", buf); exit(1);
Unruh }

Unruh Since unsigned long does not have a definite length on all machines,
Unruh and with the trailing zero certainly is potentially longer than 10
Unruh bytes, that buf is ripe for buffer overflow.  It should be something
Unruh like char buf[(sizeof(unsigned long)*12/5+2)]; And/or the sprintf
Unruh should be an snprintf.



-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-07 Thread Unruh
Harlan Stenn [EMAIL PROTECTED] writes:

Bill,

ntpdate is being deprecated.

Maybe, but it should still not have bugs if it is actually still part of
the distro.

And it is *much* better to file reports like this using bugs.ntp.org as
otherwise they tend to get lost in the wind.

OK. Will do.


H
--
 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh In ntpdate.c around line 542 (4.2.4p4) is the sequence if
Unruh (!authistrusted(sys_authkey)) { char buf[10];

Unruh  (void) sprintf(buf, "%lu", (unsigned long)sys_authkey);
Unruh msyslog(LOG_ERR, "authentication key %s unknown", buf); exit(1);
Unruh }

Unruh Since unsigned long does not have a definite length on all machines,
Unruh and with the trailing zero certainly is potentially longer than 10
Unruh bytes, that buf is ripe for buffer overflow.  It should be something
Unruh like char buf[(sizeof(unsigned long)*12/5+2)]; And/or the sprintf
Unruh should be an snprintf.



-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!



Re: [ntp:questions] ntpdate.c unsafe buffer write

2008-02-07 Thread Harlan Stenn
 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh Harlan Stenn [EMAIL PROTECTED] writes:
 Bill,

 ntpdate is being deprecated.

Unruh Maybe, but it should still not have bugs if it is actually still part
Unruh of the distro.

I mostly agree with you.  And one reason there are a bunch of outstanding
bugs in ntpdate is that nobody has stepped forward to maintain it,
especially after the last round of bugs where we decided that the best thing
to do for ntpdate was kill it off and replace it with sntp.

Speaking of which, I need to ping the folks who volunteered to work on the
SNTP code and see what the status is.

 And it is *much* better to file reports like this using bugs.ntp.org as
 otherwise they tend to get lost in the wind.

Unruh OK. Will do.

I saw that bug get filed - thanks a bunch!
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!
