Re: [ntp:questions] Time slew doesn't seem to work

2008-04-14 Thread jkvbe
On Apr 9, 4:40 pm, [EMAIL PROTECTED] (Andy) wrote:


 Two things:
 (1)  Try running with time stepping enabled on that system (i.e. don't
 use the '-x' flag) to see how well the system keeps time.  What kind of
 offset do you have after 1 or 2 hours of operation?
 (2)  Check your drift value when running with time stepping disabled
 (also check it with time stepping enabled).  You can do this with 'ntpq
 -crv' where 'frequency' is the drift value or you can dump the drift
 file (probably /var/lib/ntp/drift).  Note that the drift file is only
 updated once every hour or so.

 I encountered a problem on Linux 2.6.18 in which disabling time
 stepping (using either '-x' or 'tinker step 0') caused the drift value
 to run at or near +/-500ppm and subsequently caused a time offset of
 several milliseconds.  If I allow time steps on that same system, it
 runs with a drift below 100ppm and maintains an offset below 1ms.  I am using an
 IRIG time source, so I expect high accuracy.  In my system, a time step
 is never needed (i.e. the offset never grows larger than 128ms),
 regardless of whether time stepping is enabled or disabled.  This
 doesn't change the fact that it runs like crap with time stepping disabled.

 Andy

Note that my set-up was an experiment to check how time slew
worked. What I still don't get is why it works when using ntpdate.
Also strange is the fact that ntpq actually reports the node as
synched with an offset of  1.6s.

Jan

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions


Re: [ntp:questions] Time slew doesn't seem to work

2008-04-12 Thread Unruh
[EMAIL PROTECTED] (Hal Murray) writes:


OK, so there was no magic in that 500PPM limit. Is there a difference
between the tick size adjustment and the frequency adjustment 
(CPU-counter-to-time conversion factor)?

Limiting the slew rate to something like that means that
software that is timing things with code like:
  grab time, do something, grab time, subtract
gets a sane answer if it happens to be running while somebody
adjusts the time.

Do you know any code that cares if that is wrong by 10% (which would be
100,000PPM)? Ie, is 10% error insane?

Is 1% (10,000PPM)?
Ie, .05% seems a bit extreme for that. 
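
To put numbers on the percentages above: a clock that is being slewed stretches or shrinks any interval measured across the slew by the slew rate itself. A minimal Python sketch, assuming the slew rate stays constant for the whole measurement (the function names are made up for illustration):

```python
def percent_to_ppm(percent):
    """Convert a fractional error given in percent to parts per million."""
    return percent * 10_000  # 1% = 10,000 PPM


def measured_interval(true_seconds, slew_ppm):
    """Interval reported by 'grab time, do something, grab time, subtract'
    when the clock runs fast by slew_ppm during the whole measurement."""
    return true_seconds * (1 + slew_ppm / 1_000_000)


for pct in (0.05, 1, 10):
    ppm = percent_to_ppm(pct)
    print(f"{pct}% = {ppm:.0f} PPM; a true 10 s interval reads "
          f"{measured_interval(10, ppm):.4f} s")
```

So .05% is the 500PPM limit, while 1% and 10% are 20 and 200 times larger.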



-- 
These are my opinions, not necessarily my employer's.  I hate spam.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-12 Thread Hal Murray
Do you know any code that cares if that is wrong by 10% (which would be
100,000PPM)? Ie, is 10% error insane?

Is 1% (10,000PPM)?
Ie, .05% seems a bit extreme for that. 

I used to do a lot of performance measurements.

For the stuff I was doing, 10% is easy to spot.  1% is borderline.

-- 
These are my opinions, not necessarily my employer's.  I hate spam.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-11 Thread David L. Mills
David,

The original model implemented in the Alpha kernel does not step the 
clock backward unless the step is greater than two seconds. Rather, it 
stops the clock and advances one microsecond at each read. This applies 
whether NTP slews or steps. Various ports of that code have broken this 
model in every possible way.
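
As a rough illustration of the "stop the clock and advance one microsecond at each read" behaviour described above, here is a minimal Python sketch. It is not the Alpha kernel code: the class name and the injected raw clock are hypothetical, and the two-second threshold for genuine backward steps is left out.

```python
class MonotonicClock:
    """Wraps a raw microsecond clock so that successive reads never go backward."""

    def __init__(self, raw_clock_us):
        self._raw_clock_us = raw_clock_us  # callable returning integer microseconds
        self._last_us = None

    def read_us(self):
        now = self._raw_clock_us()
        if self._last_us is not None and now <= self._last_us:
            # The raw clock was set back (or has not moved): hold time still
            # and creep forward by 1 us per read until the raw clock catches up.
            now = self._last_us + 1
        self._last_us = now
        return now


# Simulated raw readings: the clock is set back by 600 us after the second read.
raw_readings = iter([1000, 1500, 900, 900, 1600])
clock = MonotonicClock(lambda: next(raw_readings))
readings = [clock.read_us() for _ in range(5)]
print(readings)
```

Callers that subtract two reads therefore never see a negative interval, whatever the discipline does underneath.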

The 500-PPM slew once was common in the ubiquitous Unix kernel. The 
value was chosen as a compromise between short slew time for relatively 
small adjustments and moderate resolution during the slew interval. This 
works out to 5 microseconds per tick with a 100-Hz clock and a 5-us 
jitter. In truth this could be changed to anything you want, as long as 
the value is fixed.

Some kernelmongers, including SGI and Linux, have put up fancy code 
designed to reduce the slew time for large adjustments. This inserts an 
additional pole in the clock discipline impulse response which results 
in unstable behavior for adjustments over half a second or so.

The default step threshold is 128 ms; the -x command line option sets it 
to 600 s and does nothing else. The 600-s value was chosen as the 
expected accuracy with eyeball and wristwatch. If the extra pole is not 
there, the original response is preserved over that range and largely 
independent of the slew value itself.

Say you change from 5 us per tick to 1 ms per tick or 100 ms/s. This 
would amortize a 600-s adjustment in almost two hours and reduce the 
resolution to 1 ms. If your extended network requires synchronization to 
better than one second, in all but the last second of that slew the 
network would not be synchronized.
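
The arithmetic behind these figures can be checked with a short Python sketch. The function names are illustrative only, and a 100-Hz tick is assumed as in the text:

```python
def adjustment_per_tick_us(slew_ppm, tick_hz):
    """Microseconds added or removed per clock tick at a given slew rate."""
    tick_us = 1_000_000 / tick_hz
    return tick_us * slew_ppm / 1_000_000


def slew_duration_s(offset_s, slew_ppm):
    """Seconds needed to amortize offset_s at a constant slew rate."""
    return offset_s / (slew_ppm / 1_000_000)


# 500 PPM with a 100-Hz clock: microseconds per 10-ms tick.
print(adjustment_per_tick_us(500, 100))

# 1 ms per 10-ms tick is 100 ms/s, i.e. 100,000 PPM.
print(adjustment_per_tick_us(100_000, 100))

# Time to amortize a 600-s adjustment at each rate.
print(f"{slew_duration_s(600, 500) / 86_400:.1f} days at 500 PPM")
print(f"{slew_duration_s(600, 100_000) / 3_600:.2f} hours at 100,000 PPM")
```

At 500 PPM the same 600-s adjustment would take roughly two weeks, which is why the faster rate only looks attractive for large offsets.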

Dave

David Woolley wrote:
 Unruh wrote:
 

 Not at 500PPM limit but if you use the tick adjustment, it is more than
 enough time. (The tick adjust limits out at 100,000PPM)

 I believe ntpd assumes that it is constant.  Having a large tickadj 
 causes poor resolution when using the user space discipline.
 
 I suspect that Dr Mills would say that a high slew rate also compromises 
 the system behaviour when you cascade multiple strata.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-11 Thread Unruh
David L. Mills [EMAIL PROTECTED] writes:

David,

The original model implemented in the Alpha kernel does not step the 
clock backward unless the step is greater than two seconds. Rather, it 
stops the clock and advances one microsecond at each read. This applies 
whether NTP slews or steps. Various ports of that code have broken this 
model in every possible way.

The 500-PPM slew once was common in the ubiquitous Unix kernel. The 
value was chosen as a compromise between short slew time for relatively 
small adjustments and moderate resolution during the slew interval. This 
works out to 5 microseconds per tick with a 100-Hz clock and a 5-us 
jitter. In truth this could be changed to anything you want, as long as 
the value is fixed.

OK, so there was no magic in that 500PPM limit. Is there a difference
between the tick size adjustment and the frequency adjustment 
(CPU-counter-to-time conversion factor)?

Some kernelmongers, including SGI and Linux, have put up fancy code 
designed to reduce the slew time for large adjustments. This inserts an 
additional pole in the clock discipline impulse response which results 
in unstable behavior for adjustments over half a second or so.

That is of course not good. I am a bit uncertain why that instability would
depend on amplitude. Is the response a non-linear response? For linear
responses the amplitude should not matter. But from your words it sounds
like the response is amplitude dependent which would of course be
non-linear. And if it is non-linear, once it gets near (1 sec?) the right
time, that linear stable response should take over. 


By the way, do you happen to know how the Linux kernel's implementation of
the adjtime system call (adjtimex ADJ_OFFSET_SINGLESHOT) does its slewing?





The default step threshold is 128 ms; the -x command line option sets it 
to 600 s and does nothing else. The 600-s value was chosen as the 
expected accuracy with eyeball and wristwatch. If the extra pole is not 
there, the original response is preserved over that range and largely 
independent of the slew value itself.

Say you change from 5 us per tick to 1 ms per tick or 100 ms/s. This 
would amortize a 600-s adjustment in almost two hours and reduce the 
resolution to 1 ms. If your extended network requires synchronization to 
better than one second, in all but the last second of that slew the 
network would not be synchronized.

This paragraph confuses me. If the clock is 600s out, it is way out no
matter how you slew it back. What do you mean reduce the resolution to
1ms? The resolution is still  1usec and the accuracy is a few hundreds of
seconds. And if the clock is out by 600s the network will not be
synchronised to true time. Or perhaps you mean something other than that
by "network would not be synchronized".

Thanks
 



Dave

David Woolley wrote:
 Unruh wrote:
 

 Not at 500PPM limit but if you use the tick adjustment, it is more than
 enough time. (The tick adjust limits out at 100,000PPM)

 I believe ntpd assumes that it is constant.  Having a large tickadj 
 causes poor resolution when using the user space discipline.
 
 I suspect that Dr Mills would say that a high slew rate also compromises 
 the system behaviour when you cascade multiple strata.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-11 Thread David Woolley
David L. Mills wrote:

 The default step threshold is 128 ms; the -x command line option sets it 
 to 600 s and does nothing else. The 600-s value was chosen as the 

Oops, I thought the units there were ms, which invalidates some of what 
I said.  I was probably partly confused by the disabling of the kernel 
discipline at 0.5 seconds.


 expected accuracy with eyeball and wristwatch. If the extra pole is not 

With a radio controlled wrist watch, 200ms is easily possible.  I'd 
suggest that 10 minutes is about the 95 percentile for when the person 
setting the time doesn't care about the time.  In that context, my 
practical experience is that most times are set within 5 minutes.  As 
someone who regularly uses public transport, I would be disappointed if 
I couldn't get the time to within 30 seconds by reading my wrist watch.

600 seconds is 60% of the drop dead value, so I'm not clear why the two 
weren't made the same.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-11 Thread Andy
jkvbe wrote:
 Hi,

 I've started ntpd with the -x option and defined at run-time (using ntpdc) 3
 servers. The client machine has an offset of +/- 2s with the ntp servers.
 In the NTP log file I find the following statements (extracted out of a
 total of 98):

 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s

 The times don't seem to converge.

 When I shut down the ntp daemon and try to slew the time using ntpdate with
 the -B option it does work. The time difference with the ntp servers
 gradually declines.

 We use Suse SLES10 (kernel version: 2.6.16).

 Does anybody have an idea on what's going wrong?

 Thanks,
 Jan
Two things:
(1)  Try running with time stepping enabled on that system (i.e. don't
use the '-x' flag) to see how well the system keeps time.  What kind of
offset do you have after 1 or 2 hours of operation?
(2)  Check your drift value when running with time stepping disabled
(also check it with time stepping enabled).  You can do this with 'ntpq
-crv' where 'frequency' is the drift value or you can dump the drift
file (probably /var/lib/ntp/drift).  Note that the drift file is only
updated once every hour or so.

I encountered a problem on Linux 2.6.18 in which disabling time
stepping (using either '-x' or 'tinker step 0') caused the drift value
to run at or near +/-500ppm and subsequently caused a time offset of
several milliseconds.  If I allow time steps on that same system, it
runs with a drift below 100ppm and maintains an offset below 1ms.  I am using an
IRIG time source, so I expect high accuracy.  In my system, a time step
is never needed (i.e. the offset never grows larger than 128ms),
regardless of whether time stepping is enabled or disabled.  This
doesn't change the fact that it runs like crap with time stepping disabled.
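
For the drift-file check in (2), a minimal Python sketch follows. It assumes the drift file holds the frequency offset in PPM as its first token, uses the "probably" path mentioned above as a default, and the sample values and the 5-PPM margin are invented for illustration rather than taken from any real system.

```python
from pathlib import Path

FREQ_LIMIT_PPM = 500.0  # the +/-500 PPM limit discussed in this thread


def parse_drift_ppm(text):
    """Return the frequency offset in PPM from drift-file contents."""
    return float(text.split()[0])


def read_drift_ppm(path="/var/lib/ntp/drift"):
    """Read the drift file; the default path is an assumption and varies by system."""
    return parse_drift_ppm(Path(path).read_text())


def is_pegged(drift_ppm, margin_ppm=5.0):
    """True when the drift sits at or near the frequency limit.

    margin_ppm is an arbitrary illustrative tolerance, not an ntpd setting.
    """
    return abs(drift_ppm) >= FREQ_LIMIT_PPM - margin_ppm


# Invented sample contents: one pegged at the limit, one healthy.
for sample in ("-499.987\n", "87.512\n"):
    ppm = parse_drift_ppm(sample)
    print(f"{ppm:+.3f} PPM pegged={is_pegged(ppm)}")
```

On a live system, `is_pegged(read_drift_ppm())` would give the same check against the actual file, bearing in mind that the file is only rewritten about once an hour.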

Andy




Re: [ntp:questions] Time slew doesn't seem to work

2008-04-10 Thread David Woolley
jkvbe wrote:
 
 I've started ntpd with the -x option and defined at run-time (using ntpdc) 3

Are you sure that you are using an unmodified recent version?  I seem to 
remember that even -x will step if the offset is more than 500ms.  If 
not you should address the question to SUSE.

There isn't enough time for the clock to slew 2000ms in 15 minutes.


 servers. The client machine has an offset of +/- 2s with the ntp servers.
 In the NTP log file I find the following statements (extracted out of a
 total of 98):

 We use Suse SLES10 (kernel version: 2.6.16).

Even if this is unmodified code, which version is it?



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-10 Thread Unruh
David Woolley [EMAIL PROTECTED] writes:

jkvbe wrote:
 
 I've started ntpd with the -x option and defined at run-time (using ntpdc) 3

Are you sure that you are using an unmodified recent version?  I seem to 
remember that even -x will step if the offset is more than 500ms.  If 
not you should address the question to SUSE.

There isn't enough time for the clock to slew 2000ms in 15 minutes.

Not at 500PPM limit but if you use the tick adjustment, it is more than
enough time. (The tick adjust limits out at 100,000PPM)



 servers. The client machine has an offset of +/- 2s with the ntp servers.
 In the NTP log file I find the following statements (extracted out of a
 total of 98):

 We use Suse SLES10 (kernel version: 2.6.16).

Even if this is unmodified code, which version is it?



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-10 Thread David Woolley
Unruh wrote:
 
 Not at 500PPM limit but if you use the tick adjustment, it is more than
 enough time. (The tick adjust limits out at 100,000PPM)
 
I believe ntpd assumes that it is constant.  Having a large tickadj 
causes poor resolution when using the user space discipline.

I suspect that Dr Mills would say that a high slew rate also compromises 
the system behaviour when you cascade multiple strata.



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-10 Thread Unruh
David Woolley [EMAIL PROTECTED] writes:

Unruh wrote:
 
 Not at 500PPM limit but if you use the tick adjustment, it is more than
 enough time. (The tick adjust limits out at 100,000PPM)
 
I believe ntpd assumes that it is constant.  Having a large tickadj 
causes poor resolution when using the user space discipline.

Yes, ntpd does limit all slews to 500PPM (the limit on the freq adjust
parameter in adjtimex on linux as well). I was just saying that a slew of 2
sec in 15min IS possible, although ntpd will not do it. 
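
A quick Python sketch of the numbers behind this exchange (function names are illustrative; the 500PPM and 100,000PPM limits are the ones quoted in this thread):

```python
def required_slew_ppm(offset_s, window_s):
    """Constant slew rate needed to remove offset_s within window_s."""
    return offset_s / window_s * 1_000_000


def max_correction_s(slew_ppm, window_s):
    """Largest offset that can be removed in window_s at a constant slew rate."""
    return slew_ppm / 1_000_000 * window_s


WINDOW_S = 15 * 60  # the 15-minute spacing seen in the log

print(f"rate needed for 2 s: {required_slew_ppm(2.0, WINDOW_S):.0f} PPM")
print(f"max at 500 PPM:      {max_correction_s(500, WINDOW_S):.2f} s")
print(f"max at 100,000 PPM:  {max_correction_s(100_000, WINDOW_S):.0f} s")
```

The rate needed is well above 500PPM but far below the tick-adjustment limit, which is the point both posters are making.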


I suspect that Dr Mills would say that a high slew rate also compromises 
the system behaviour when you cascade multiple strata.

Not sure why, but maybe. 



[ntp:questions] Time slew doesn't seem to work

2008-04-09 Thread jkvbe

Hi,

I've started ntpd with the -x option and defined at run-time (using ntpdc) 3
servers. The client machine has an offset of +/- 2s with the ntp servers.
In the NTP log file I find the following statements (extracted out of a
total of 98):

9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s

The times don't seem to converge.
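
The lack of convergence can be quantified from the log excerpt itself. The following Python sketch parses the nine lines above and prints the spacing between corrections and the spread of the reported offsets; the regular expression and function name are ad hoc, and it assumes all entries fall on the same day.

```python
import re

LOG = """\
9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
"""

PATTERN = re.compile(
    r"(\d+):(\d+):(\d+) ntpd\[\d+\]: time slew ([-+]?\d+\.\d+) s"
)


def parse_slews(log_text):
    """Return (seconds_since_midnight, offset_s) for each 'time slew' line."""
    entries = []
    for match in PATTERN.finditer(log_text):
        hours, minutes, seconds, offset = match.groups()
        when = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        entries.append((when, float(offset)))
    return entries


entries = parse_slews(LOG)
intervals = [later[0] - earlier[0] for earlier, later in zip(entries, entries[1:])]
offsets = [offset for _, offset in entries]

print("intervals between corrections (s):", intervals)
print("offset spread (s):", round(max(offsets) - min(offsets), 6))
```

Every interval is a little over 900 s, and the reported offset moves by only a few milliseconds over two hours.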

When I shut down the ntp daemon and try to slew the time using ntpdate with
the -B option it does work. The time difference with the ntp servers
gradually declines.

We use Suse SLES10 (kernel version: 2.6.16).

Does anybody have an idea on what's going wrong?

Thanks,
Jan




Re: [ntp:questions] Time slew doesn't seem to work

2008-04-09 Thread Richard B. Gilbert
jkvbe wrote:
 Hi,
 
 I've started ntpd with the -x option and defined at run-time (using ntpdc) 3
 servers. The client machine has an offset of +/- 2s with the ntp servers.
 In the NTP log file I find the following statements (extracted out of a
 total of 98):
 
 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
 
 The times don't seem to converge.
 
 When I shut down the ntp daemon and try to slew the time using ntpdate with
 the -B option it does work. The time difference with the ntp servers
 gradually declines.
 
 We use Suse SLES10 (kernel version: 2.6.16).
 
 Does anybody have an idea on what's going wrong?
 
 Thanks,
 Jan
 
 

Something is VERY wrong there.  It looks as if NTPD is making a massive 
correction every fifteen minutes or so!

If you reboot without running NTPD, and set the time manually, how badly 
does it drift?  If it gains or loses more than something like 43 seconds 
per day, NTPD will not work until you get your hardware fixed.  Gaining 
or losing 1 or 2 seconds per day without NTPD is the expected level of 
performance for a typical computer clock.  (You get the finest hardware 
that $2 US can buy!)




Re: [ntp:questions] Time slew doesn't seem to work

2008-04-09 Thread Evandro Menezes
The 15-minute interval between corrections comes from the default
stepout setting (900 s).  In my experience, the underlying problem is
either another piece of software trying to discipline the clock or a
bad drift file; in the latter case, erasing it and restarting NTP
should help.

HTH



Re: [ntp:questions] Time slew doesn't seem to work

2008-04-09 Thread Unruh
Richard B. Gilbert [EMAIL PROTECTED] writes:

jkvbe wrote:
 Hi,
 
 I've started ntpd with the -x option and defined at run-time (using ntpdc) 3
 servers. The client machine has an offset of +/- 2s with the ntp servers.
 In the NTP log file I find the following statements (extracted out of a
 total of 98):
 
 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
 
 The times don't seem to converge.
 
 When I shut down the ntp daemon and try to slew the time using ntpdate with
 the -B option it does work. The time difference with the ntp servers
 gradually declines.
 
 We use Suse SLES10 (kernel version: 2.6.16).
 
 Does anybody have an idea on what's going wrong?
 
 Thanks,
 Jan
 
 

Something is VERY wrong there.  It looks as if NTPD is making a massive 
correction every fifteen minutes or so!

If you reboot without running NTPD, and set the time manually, how badly 
does it drift?  If it gains or loses more than something like 43 seconds 
per day, NTPD will not work until you get your hardware fixed.  Gaining 
or losing 1 or 2 seconds per day without NTPD is the expected level of 
performance for a typical computer clock.  (You get the finest hardware 
that $2 US can buy!)

Well, no. 1 or 2 sec is 10-20PPM which is on the good side. 43 sec per day
is like 500PPM which is definitely on the high side. 5-10sec per day is
more typical. Note that chrony (on Linux) will fix 43s/day. (It will use the fast
slew-- ie changing the tick size-- as well as the slow slew.) ntp as a
design decision decided that 500PPM was the max it would ever do.  Not that I
advise a computer with 500PPM freq error. Something is wrong and is liable
to be wrong in more places than just the clock. 
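
For anyone wanting to redo the seconds-per-day versus PPM conversions used above, a small Python sketch (function names are illustrative):

```python
SECONDS_PER_DAY = 86_400


def seconds_per_day_to_ppm(drift_s_per_day):
    """Convert a clock drift in seconds per day to parts per million."""
    return drift_s_per_day / SECONDS_PER_DAY * 1_000_000


def ppm_to_seconds_per_day(ppm):
    """Convert a frequency error in PPM to seconds gained or lost per day."""
    return ppm / 1_000_000 * SECONDS_PER_DAY


for drift in (1, 2, 5, 10, 43):
    print(f"{drift:>2} s/day = {seconds_per_day_to_ppm(drift):6.1f} PPM")

print(f"500 PPM = {ppm_to_seconds_per_day(500):.1f} s/day")
```

This gives roughly 12 to 23 PPM for 1 to 2 s/day and just under 500 PPM for 43 s/day, in line with the rounded figures quoted.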







Re: [ntp:questions] Time slew doesn't seem to work

2008-04-09 Thread Hal Murray

Well, no. 1 or 2 sec is 10-20PPM which is on the good side. 43 sec per day
is like 500PPM which is definitely on the high side. 5-10sec per day is
more typical. Note that chrony (on Linux) will fix 43s/day. (It will use the fast
slew-- ie changing the tick size-- as well as the slow slew.) ntp as a
design decision decided that 500PPM was the max it would ever do.  Not that I
advise a computer with 500PPM freq error. Something is wrong and is liable
to be wrong in more places than just the clock. 

Don't overlook software when looking for things that can go wrong.

-- 
These are my opinions, not necessarily my employer's.  I hate spam.
