Re: [ntp:questions] Time slew doesn't seem to work
On Apr 9, 4:40 pm, [EMAIL PROTECTED] (Andy) wrote:
> Two things:
>
> (1) Try running with time stepping enabled on that system (i.e. don't use the '-x' flag) to see how well the system keeps time. What kind of offset do you have after 1 or 2 hours of operation?
>
> (2) Check your drift value when running with time stepping disabled (also check it with time stepping enabled). You can do this with 'ntpq -crv', where 'frequency' is the drift value, or you can dump the drift file (probably /var/lib/ntp/drift). Note that the drift file is only updated once every hour or so.
>
> I encountered a problem on linux 2.6.18 in which disabling time stepping (using either '-x' or 'tinker step 0') caused the drift value to run at or near +/-500ppm and subsequently caused a time offset of several milliseconds. If I allow time steps on that same system, it runs with a drift <100ppm and maintains an offset <1ms. I am using an IRIG time source, so I expect high accuracy. In my system, a time step is never needed (i.e. the offset never grows larger than 128ms), regardless of whether time stepping is enabled or disabled. This doesn't change the fact that it runs like crap with time stepping disabled.
>
> Andy

Note that my set-up was an experiment to check how time slew worked. What I still don't get is why it works when using ntpdate. Also strange is the fact that ntpq actually reports the node as synched with an offset of 1.6s.

Jan
___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] Time slew doesn't seem to work
[EMAIL PROTECTED] (Hal Murray) writes:
>> OK, so there was no magic in that 500PPM limit. Is there a difference between the tick size adjustment and the frequency adjustment (CPU-counter-to-time conversion factor)?
>
> Limiting the slew rate to something like that means that software that is timing things with code like "grab time, do something, grab time, subtract" gets a sane answer if it happens to be running while somebody adjusts the time.

Do you know any code that cares if that is wrong by 10% (which would be 100,000PPM)? I.e., is a 10% error insane? Is 1% (10,000PPM)? I.e., .05% (500PPM) seems a bit extreme for that.

> --
> These are my opinions, not necessarily my employer's. I hate spam.
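As a side note on the units in this exchange: 1% is 10,000 PPM, so the 500PPM limit corresponds to 0.05%. The following is only a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of ntpd or any kernel interface.

```python
def percent_to_ppm(percent: float) -> float:
    """Convert a relative error in percent to parts per million (1% = 10,000 PPM)."""
    return percent * 10_000.0


def interval_error_s(interval_s: float, rate_ppm: float) -> float:
    """Error picked up by a 'grab time, work, grab time, subtract' measurement
    of length interval_s while the clock is being slewed at rate_ppm."""
    return interval_s * rate_ppm / 1_000_000.0


if __name__ == "__main__":
    for pct in (10.0, 1.0, 0.05):
        ppm = percent_to_ppm(pct)
        # Error on a hypothetical 1-second measurement at this slew rate.
        err_ms = interval_error_s(1.0, ppm) * 1e3
        print(f"{pct}% = {ppm:,.0f} PPM, {err_ms:.3f} ms error per second timed")
```

At 500PPM, a one-second measurement is off by half a millisecond, which is the scale of error being debated here.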
Re: [ntp:questions] Time slew doesn't seem to work
> Do you know any code that cares if that is wrong by 10% (which would be 100,000PPM)? I.e., is a 10% error insane? Is 1% (10,000PPM)? I.e., .05% (500PPM) seems a bit extreme for that.

I used to do a lot of performance measurements. For the stuff I was doing, 10% is easy to spot. 1% is borderline.

--
These are my opinions, not necessarily my employer's. I hate spam.
Re: [ntp:questions] Time slew doesn't seem to work
David,

The original model implemented in the Alpha kernel does not step the clock backward unless the step is greater than two seconds. Rather, it stops the clock and advances one microsecond at each read. This applies whether NTP slews or steps. Various ports of that code have broken this model in every possible way.

The 500-PPM slew once was common in the ubiquitous Unix kernel. The value was chosen as a compromise between short slew time for relatively small adjustments and moderate resolution during the slew interval. This works out to 5 microseconds per tick with a 100-Hz clock and a 5-us jitter. In truth this could be changed to anything you want, as long as the value is fixed.

Some kernelmongers, including SGI and Linux, have put up fancy code designed to reduce the slew time for large adjustments. This inserts an additional pole in the clock discipline impulse response, which results in unstable behavior for adjustments over half a second or so.

The default step threshold is 128 ms; the -x command line option sets it to 600 s and does nothing else. The 600-s value was chosen as the expected accuracy with eyeball and wristwatch.

If the extra pole is not there, the original response is preserved over that range and largely independent of the slew value itself. Say you change from 5 us per tick to 1 ms per tick, or 100 ms/s. This would amortize a 600-s adjustment in almost two hours and reduce the resolution to 1 ms. If your extended network requires synchronization to better than one second, in all but the last second of that slew the network would not be synchronized.

Dave

David Woolley wrote:
> Unruh wrote:
>> Not at 500PPM limit but if you use the tick adjustment, it is more than enough time. (The tick adjust limits out at 100,000PPM)
>
> I believe ntpd assumes that it is constant. Having a large tickadj causes poor resolution when using the user space discipline.
>
> I suspect that Dr Mills would say that a high slew rate also compromises the system behaviour when you cascade multiple strata.
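The per-tick figures in the message above can be checked with a few lines of arithmetic. This is only a sketch with illustrative function names; it assumes a constant slew rate and ignores the clock discipline loop entirely.

```python
def slew_per_tick_us(rate_ppm: float, hz: int) -> float:
    """Microseconds of correction applied on each tick for a given slew rate."""
    tick_us = 1_000_000.0 / hz           # nominal tick length in microseconds
    return tick_us * rate_ppm / 1_000_000.0


def per_tick_to_ppm(slew_us_per_tick: float, hz: int) -> float:
    """Slew rate in PPM; microseconds of correction per second is the same number as PPM."""
    return slew_us_per_tick * hz


def amortize_s(offset_s: float, rate_ppm: float) -> float:
    """Seconds needed to slew away offset_s at a constant rate_ppm."""
    return offset_s * 1_000_000.0 / rate_ppm


if __name__ == "__main__":
    print(slew_per_tick_us(500, 100))            # 500 PPM on a 100-Hz clock
    fast_ppm = per_tick_to_ppm(1000.0, 100)      # 1 ms per tick on a 100-Hz clock
    print(fast_ppm, amortize_s(600.0, fast_ppm) / 60.0, "minutes")
    print(amortize_s(600.0, 500.0) / 86_400.0, "days at 500 PPM")
```

With these assumptions, 500 PPM at 100 Hz is 5 us per tick, 1 ms per tick is 100,000 PPM (100 ms/s), and a 600-s offset takes 6000 s (100 minutes) at that rate, compared with roughly two weeks at 500 PPM.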
Re: [ntp:questions] Time slew doesn't seem to work
David L. Mills [EMAIL PROTECTED] writes:
> David,
>
> The original model implemented in the Alpha kernel does not step the clock backward unless the step is greater than two seconds. Rather, it stops the clock and advances one microsecond at each read. This applies whether NTP slews or steps. Various ports of that code have broken this model in every possible way.
>
> The 500-PPM slew once was common in the ubiquitous Unix kernel. The value was chosen as a compromise between short slew time for relatively small adjustments and moderate resolution during the slew interval. This works out to 5 microseconds per tick with a 100-Hz clock and a 5-us jitter. In truth this could be changed to anything you want, as long as the value is fixed.

OK, so there was no magic in that 500PPM limit. Is there a difference between the tick size adjustment and the frequency adjustment (CPU-counter-to-time conversion factor)?

> Some kernelmongers, including SGI and Linux, have put up fancy code designed to reduce the slew time for large adjustments. This inserts an additional pole in the clock discipline impulse response, which results in unstable behavior for adjustments over half a second or so.

That is of course not good. I am a bit uncertain why that instability would depend on amplitude. Is the response non-linear? For linear responses the amplitude should not matter. But from your words it sounds like the response is amplitude dependent, which would of course be non-linear. And if it is non-linear, once it gets near (1 sec?) the right time, that linear stable response should take over.

By the way, do you happen to know how the Linux kernel's implementation of the adjtime system call (adjtimex ADJ_OFFSET_SINGLESHOT) does its slewing?

> The default step threshold is 128 ms; the -x command line option sets it to 600 s and does nothing else. The 600-s value was chosen as the expected accuracy with eyeball and wristwatch.
>
> If the extra pole is not there, the original response is preserved over that range and largely independent of the slew value itself. Say you change from 5 us per tick to 1 ms per tick, or 100 ms/s. This would amortize a 600-s adjustment in almost two hours and reduce the resolution to 1 ms. If your extended network requires synchronization to better than one second, in all but the last second of that slew the network would not be synchronized.

This paragraph confuses me. If the clock is 600s out, it is way out no matter how you slew it back. What do you mean by "reduce the resolution to 1 ms"? The resolution is still 1usec and the accuracy is a few hundreds of seconds. And if the clock is out by 600s, the network will not be synchronised to true time. Or perhaps you mean something else by "the network would not be synchronized".

Thanks

> Dave
>
> David Woolley wrote:
>> Unruh wrote:
>>> Not at 500PPM limit but if you use the tick adjustment, it is more than enough time. (The tick adjust limits out at 100,000PPM)
>>
>> I believe ntpd assumes that it is constant. Having a large tickadj causes poor resolution when using the user space discipline.
>>
>> I suspect that Dr Mills would say that a high slew rate also compromises the system behaviour when you cascade multiple strata.
Re: [ntp:questions] Time slew doesn't seem to work
David L. Mills wrote:
> The default step threshold is 128 ms; the -x command line option sets it to 600 s and does nothing else. The 600-s value was chosen as the

Oops, I thought the units there were ms, which invalidates some of what I said. I was probably partly confused by the disabling of the kernel discipline at 0.5 seconds.

> expected accuracy with eyeball and wristwatch. If the extra pole is not

With a radio controlled wrist watch, 200ms is easily possible. I'd suggest that 10 minutes is about the 95th percentile for when the person setting the time doesn't care about the time. In that context, my practical experience is that most times are set within 5 minutes. As someone who regularly uses public transport, I would be disappointed if I couldn't get the time to within 30 seconds by reading my wrist watch.

600 seconds is 60% of the drop dead value, so I'm not clear why the two weren't made the same.
Re: [ntp:questions] Time slew doesn't seem to work
jkvbe wrote:
> Hi,
>
> I've started ntpd with the -x option and defined at run-time (using ntpdc) 3 servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):
>
> 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
> 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
> 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
> 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
> 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
> 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
> 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
> 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
> 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
>
> The times don't seem to converge. When I shut down the ntp daemon and try to slew the time using ntpdate with the -B option it does work. The time difference with the ntp servers gradually declines.
>
> We use Suse SLES10 (kernel version: 2.6.16). Does anybody have an idea on what's going wrong?
>
> Thanks,
> Jan

Two things:

(1) Try running with time stepping enabled on that system (i.e. don't use the '-x' flag) to see how well the system keeps time. What kind of offset do you have after 1 or 2 hours of operation?

(2) Check your drift value when running with time stepping disabled (also check it with time stepping enabled). You can do this with 'ntpq -crv', where 'frequency' is the drift value, or you can dump the drift file (probably /var/lib/ntp/drift). Note that the drift file is only updated once every hour or so.

I encountered a problem on linux 2.6.18 in which disabling time stepping (using either '-x' or 'tinker step 0') caused the drift value to run at or near +/-500ppm and subsequently caused a time offset of several milliseconds. If I allow time steps on that same system, it runs with a drift <100ppm and maintains an offset <1ms. I am using an IRIG time source, so I expect high accuracy. In my system, a time step is never needed (i.e. the offset never grows larger than 128ms), regardless of whether time stepping is enabled or disabled. This doesn't change the fact that it runs like crap with time stepping disabled.

Andy
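For anyone wanting to script the drift check suggested above, here is a rough sketch that pulls the 'frequency' value out of text captured from 'ntpq -crv' and flags values pinned near the +/-500ppm limit. The sample string is invented for illustration and is not real output from any system; the function names and the 10ppm margin are likewise assumptions.

```python
import re
from typing import Optional


def frequency_ppm(rv_text: str) -> Optional[float]:
    """Return the 'frequency' variable (drift, in PPM) from captured 'ntpq -crv' text,
    or None if the field is absent. Assumes the usual name=value layout."""
    match = re.search(r"\bfrequency=([-+]?\d+(?:\.\d+)?)", rv_text)
    return float(match.group(1)) if match else None


def near_limit(ppm: float, limit_ppm: float = 500.0, margin_ppm: float = 10.0) -> bool:
    """True if the drift sits within margin_ppm of the +/-limit_ppm clamp."""
    return abs(ppm) >= limit_ppm - margin_ppm


if __name__ == "__main__":
    # Hypothetical sample text, NOT taken from the thread or from a real host.
    sample = "offset=1.602, frequency=-499.870, sys_jitter=0.244"
    drift = frequency_ppm(sample)
    if drift is not None:
        print(drift, "pinned near limit" if near_limit(drift) else "within normal range")
```

A drift stuck at the limit, as described for the 2.6.18 system, would show up as "pinned near limit"; a healthy value well under 100ppm would not.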
Re: [ntp:questions] Time slew doesn't seem to work
jkvbe wrote:
> I've started ntpd with the -x option and defined at run-time (using ntpdc) 3

Are you sure that you are using an unmodified recent version? I seem to remember that even -x will step if the offset is more than 500ms. If not, you should address the question to SUSE. There isn't enough time for the clock to slew 2000ms in 15 minutes.

> servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):

> We use Suse SLES10 (kernel version: 2.6.16).

Even if this is unmodified code, which version is it?
Re: [ntp:questions] Time slew doesn't seem to work
David Woolley [EMAIL PROTECTED] writes:
> jkvbe wrote:
>> I've started ntpd with the -x option and defined at run-time (using ntpdc) 3
>
> Are you sure that you are using an unmodified recent version? I seem to remember that even -x will step if the offset is more than 500ms. If not, you should address the question to SUSE. There isn't enough time for the clock to slew 2000ms in 15 minutes.

Not at the 500PPM limit, but if you use the tick adjustment, it is more than enough time. (The tick adjust limits out at 100,000PPM.)

>> servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):
>
>> We use Suse SLES10 (kernel version: 2.6.16).
>
> Even if this is unmodified code, which version is it?
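The two claims in this exchange (2 s cannot be slewed in 15 minutes at 500PPM, but easily can at 100,000PPM) follow from simple rate arithmetic. A small sketch, assuming a constant slew rate and using made-up function names:

```python
def slew_time_s(offset_s: float, rate_ppm: float) -> float:
    """Seconds needed to remove offset_s when slewing at a constant rate_ppm."""
    return offset_s * 1_000_000.0 / rate_ppm


def max_correction_s(interval_s: float, rate_ppm: float) -> float:
    """Largest offset that can be removed in interval_s at rate_ppm."""
    return interval_s * rate_ppm / 1_000_000.0


if __name__ == "__main__":
    fifteen_min = 15 * 60
    for rate in (500.0, 100_000.0):
        needed = slew_time_s(2.0, rate)
        print(f"{rate:,.0f} PPM: 2 s takes {needed:,.0f} s; "
              f"15 min removes at most {max_correction_s(fifteen_min, rate):.2f} s")
```

At 500PPM the 2 s offset needs 4000 s (about 67 minutes) and only 0.45 s can be removed in 15 minutes; at 100,000PPM the same offset takes 20 s.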
Re: [ntp:questions] Time slew doesn't seem to work
Unruh wrote:
> Not at 500PPM limit but if you use the tick adjustment, it is more than enough time. (The tick adjust limits out at 100,000PPM)

I believe ntpd assumes that it is constant. Having a large tickadj causes poor resolution when using the user space discipline.

I suspect that Dr Mills would say that a high slew rate also compromises the system behaviour when you cascade multiple strata.
Re: [ntp:questions] Time slew doesn't seem to work
David Woolley [EMAIL PROTECTED] writes:
> Unruh wrote:
>> Not at 500PPM limit but if you use the tick adjustment, it is more than enough time. (The tick adjust limits out at 100,000PPM)
>
> I believe ntpd assumes that it is constant. Having a large tickadj causes poor resolution when using the user space discipline.

Yes, ntpd does limit all slews to 500PPM (which is also the limit on the freq adjust parameter in adjtimex on linux). I was just saying that a slew of 2 sec in 15min IS possible, although ntpd will not do it.

> I suspect that Dr Mills would say that a high slew rate also compromises the system behaviour when you cascade multiple strata.

Not sure why, but maybe.
[ntp:questions] Time slew doesn't seem to work
Hi,

I've started ntpd with the -x option and defined at run-time (using ntpdc) 3 servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):

9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s

The times don't seem to converge. When I shut down the ntp daemon and try to slew the time using ntpdate with the -B option it does work. The time difference with the ntp servers gradually declines.

We use Suse SLES10 (kernel version: 2.6.16). Does anybody have an idea on what's going wrong?

Thanks,
Jan
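To put a number on "the times don't seem to converge", the log excerpt can be parsed and compared with what a 500PPM slew would have achieved over the same period. This is only a sketch: the regular expression assumes exactly the line layout shown in the post, and all entries are assumed to fall on the same day.

```python
import re

LOG = """\
9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
"""

LINE = re.compile(r"(\d\d):(\d\d):(\d\d) ntpd\[\d+\]: time slew ([-+]?\d+\.\d+) s")


def parse(log: str) -> list[tuple[int, float]]:
    """Return (seconds since midnight, reported offset in s) for each 'time slew' line."""
    samples = []
    for line in log.splitlines():
        m = LINE.search(line)
        if m:
            h, mi, s = (int(m.group(i)) for i in (1, 2, 3))
            samples.append((h * 3600 + mi * 60 + s, float(m.group(4))))
    return samples


samples = parse(LOG)
elapsed_s = samples[-1][0] - samples[0][0]         # time span covered by the excerpt
net_change_s = samples[0][1] - samples[-1][1]      # how much the offset actually shrank
expected_s = elapsed_s * 500 / 1_000_000           # what a steady 500 PPM slew could remove

if __name__ == "__main__":
    print(f"elapsed {elapsed_s} s, offset shrank {net_change_s * 1e3:.3f} ms, "
          f"500 PPM could have removed {expected_s:.2f} s")
```

Over the roughly two hours covered, the reported offset shrinks by under 3 ms, while a sustained 500PPM slew would have been enough to remove the entire 1.78 s.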
Re: [ntp:questions] Time slew doesn't seem to work
jkvbe wrote:
> Hi,
>
> I've started ntpd with the -x option and defined at run-time (using ntpdc) 3 servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):
>
> 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
> 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
> 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
> 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
> 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
> 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
> 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
> 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
> 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
>
> The times don't seem to converge. When I shut down the ntp daemon and try to slew the time using ntpdate with the -B option it does work. The time difference with the ntp servers gradually declines.
>
> We use Suse SLES10 (kernel version: 2.6.16). Does anybody have an idea on what's going wrong?
>
> Thanks,
> Jan

Something is VERY wrong there. It looks as if NTPD is making a massive correction every fifteen minutes or so!

If you reboot without running NTPD, and set the time manually, how badly does it drift? If it gains or loses more than something like 43 seconds per day, NTPD will not work until you get your hardware fixed.

Gaining or losing 1 or 2 seconds per day without NTPD is the expected level of performance for a typical computer clock. (You get the finest hardware that $2 US can buy!)
Re: [ntp:questions] Time slew doesn't seem to work
The 15-minute correction is due to the default configuration for stepout. In my experience, this behaviour is caused either by another piece of software also disciplining the clock or by a bad drift file; in the latter case, erasing the drift file and restarting NTP should help.

HTH
Re: [ntp:questions] Time slew doesn't seem to work
Richard B. Gilbert [EMAIL PROTECTED] writes:
> jkvbe wrote:
>> Hi,
>>
>> I've started ntpd with the -x option and defined at run-time (using ntpdc) 3 servers. The client machine has an offset of +/- 2s with the ntp servers. In the NTP log file I find the following statements (extracted out of a total of 98):
>>
>> 9 Apr 07:46:13 ntpd[19257]: time slew 1.781571 s
>> 9 Apr 08:01:16 ntpd[19257]: time slew 1.781200 s
>> 9 Apr 08:17:21 ntpd[19257]: time slew 1.781085 s
>> 9 Apr 08:32:33 ntpd[19257]: time slew 1.781807 s
>> 9 Apr 08:48:37 ntpd[19257]: time slew 1.782273 s
>> 9 Apr 09:04:38 ntpd[19257]: time slew 1.781004 s
>> 9 Apr 09:19:42 ntpd[19257]: time slew 1.781344 s
>> 9 Apr 09:34:46 ntpd[19257]: time slew 1.780407 s
>> 9 Apr 09:49:50 ntpd[19257]: time slew 1.778824 s
>>
>> The times don't seem to converge. When I shut down the ntp daemon and try to slew the time using ntpdate with the -B option it does work. The time difference with the ntp servers gradually declines.
>>
>> We use Suse SLES10 (kernel version: 2.6.16). Does anybody have an idea on what's going wrong?
>>
>> Thanks,
>> Jan
>
> Something is VERY wrong there. It looks as if NTPD is making a massive correction every fifteen minutes or so!
>
> If you reboot without running NTPD, and set the time manually, how badly does it drift? If it gains or loses more than something like 43 seconds per day, NTPD will not work until you get your hardware fixed.
>
> Gaining or losing 1 or 2 seconds per day without NTPD is the expected level of performance for a typical computer clock. (You get the finest hardware that $2 US can buy!)

Well, no. 1 or 2 sec per day is 10-20PPM, which is on the good side. 43 sec per day is about 500PPM, which is definitely on the high side. 5-10 sec per day is more typical.

Note that chrony (on linux) will fix 43s/day. (It will use the fast slew, i.e. changing the tick size, as well as the slow slew.) ntp, as a design decision, made 500PPM the maximum it would ever do. Not that I advise using a computer with a 500PPM frequency error; something is wrong, and it is liable to be wrong in more places than just the clock.
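The conversions between seconds per day and PPM used in this reply are easy to verify. A minimal sketch, with illustrative function names:

```python
SECONDS_PER_DAY = 86_400


def ppm_to_s_per_day(ppm: float) -> float:
    """Seconds gained or lost per day by a clock running off by ppm."""
    return ppm * SECONDS_PER_DAY / 1_000_000.0


def s_per_day_to_ppm(s_per_day: float) -> float:
    """Frequency error in PPM for a clock gaining or losing s_per_day seconds per day."""
    return s_per_day * 1_000_000.0 / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"500 PPM = {ppm_to_s_per_day(500.0):.1f} s/day")
    for s in (1, 2, 5, 10, 43):
        print(f"{s} s/day = {s_per_day_to_ppm(s):.1f} PPM")
```

This gives 43.2 s/day for 500 PPM, about 12 to 23 PPM for 1 to 2 s/day, and about 58 to 116 PPM for 5 to 10 s/day, in line with the rounded figures quoted above.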
Re: [ntp:questions] Time slew doesn't seem to work
> Well, no. 1 or 2 sec per day is 10-20PPM, which is on the good side. 43 sec per day is about 500PPM, which is definitely on the high side. 5-10 sec per day is more typical.
>
> Note that chrony (on linux) will fix 43s/day. (It will use the fast slew, i.e. changing the tick size, as well as the slow slew.) ntp, as a design decision, made 500PPM the maximum it would ever do. Not that I advise using a computer with a 500PPM frequency error; something is wrong, and it is liable to be wrong in more places than just the clock.

Don't overlook software when looking for things that can go wrong.

--
These are my opinions, not necessarily my employer's. I hate spam.