In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (Zembower, Kevin) wrote:
> Dec 8 09:03:06 cn2 ntpd[16955]: time reset +3.120367 s
> Dec 8 09:23:33 cn2 ntpd[16955]: time reset +3.503628 s

You have a serious problem with your machine running slow. On Linux this is often due to lost clock interrupts as a result of using a higher HZ figure in the kernel than the disk driver can support. It could also mean a broken motherboard clock, the effects of power management, a wrong value having been calculated for the CPU frequency, etc.

The fact that you report high but intermediate offsets tends to rule out the possibility that you have conflicting clock synchronisation software.

> *ntp1.usno.navy. .USNO.          1 u   60   64  177    8.567  827.174 551.616

Do you meet the rules of engagement conditions for using a stratum one server (although this one tends to be overloaded and not particularly good as a result)?

In any case, note that the offset has already reached 827ms.

> +trane.wu-wien.a 195.13.1.153    3 u   57   64  177  125.292  841.188 548.251
> +221-15-178-69.g 140.142.16.34   2 u   50   64  177  107.300  1212.00 395.490

These two servers are too far away to be useful, given that you can achieve single figure delays to other servers.

> I notice the problem here, and if I run 'watch ntpq -p.' Seldom is my
> reachability 377, and it frequently and inexplicitly drops to 1 as I'm

This is because the offset becomes unacceptably high, and a step is initiated, before the reachability register gets to 377. Whenever the clock is stepped (which is never desirable after the initial synchronisation) the states of the servers are discarded and ntpd starts over (but with updated frequency and offset estimates).

> Is this normal behavior for NTP, to frequently lose the ability to reach
> a timeserver? If not, how can I troubleshoot it further?

What's probably happening here is that each server is rejected in turn. Server hopping does happen, but not like this.

> These time resets seem rather large to me. Is this normal, too?

No. This is the fundamental symptom.
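To put a rough number on it, the two resets in your log can be turned into a loss rate. This is only a sketch: it assumes the clock was correct immediately after the 09:03:06 step and that the whole 3.503628 s error had built up by 09:23:33. The function name is made up for illustration, and the 500ppm constant is the limit of ntpd's frequency correction.

```python
from datetime import datetime

# Limit of ntpd's frequency correction, in parts per million.
NTPD_MAX_FREQ_PPM = 500.0


def residual_drift_ppm(first_step, second_step, second_offset_s):
    """Estimate the loss rate between two ntpd steps.

    Assumes (hypothetically) that the clock was exact right after the
    first step, so the second step's offset is the error accumulated
    over the interval between the two log timestamps (same day).
    """
    fmt = "%H:%M:%S"
    elapsed_s = (
        datetime.strptime(second_step, fmt) - datetime.strptime(first_step, fmt)
    ).total_seconds()
    return second_offset_s / elapsed_s * 1e6


# Timestamps and offset taken from the two "time reset" log lines.
ppm = residual_drift_ppm("09:03:06", "09:23:33", 3.503628)
print(f"residual loss rate: {ppm:.0f} ppm (ntpd limit {NTPD_MAX_FREQ_PPM:.0f} ppm)")
```

That works out to roughly 2855ppm over 1227 seconds, and that is on top of whatever frequency correction ntpd was already applying, so it is far beyond anything ntpd can discipline.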
> Are there any other diagnostics that I could run to help identify any
> problem?

Check if the rate of loss correlates with any form of system activity (particularly IDE disks).

Disable any power management features.

Make sure that HZ=100 or rebuild the kernel to make it so.

Check the clock behaviour running MS-DOS or the oldest available Windows (basically to avoid all device activity and use quite large ticks). If it loses at more than 450ppm, get it working in that environment before running the normal system. (Actually, you can correct pure frequency errors of more than this, but a good machine should be within about 20ppm and the worst I've seen is about 300ppm, so this large an error probably indicates a system that is too unreliable for the job.)

Check the frequency correction. If it is not on the 500ppm end stop, it may indicate that your time loss is intermittent.

If you meet the conditions for using stratum one public servers, it would probably be a good idea to dedicate a machine to being the site stratum two server. This can be relatively low specification (well, actually very low), which means that it is much less likely to suffer from the more technical causes of this sort of problem.

Read the recent thread that concluded that a power management related parameter can sometimes avoid a problem.

_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.isc.org/mailman/listinfo/questions
