vwvr6vw wrote:
> I am using AIX 5.4 and NTP version 3.4 from what I can tell, which
> came with the operating system. ntpq, "v" option returns 3.4y.
>
> This is not a startup condition. xntpd has been running for a while
> before the steps.
>
> I realize that I have other issues causing the need for steps. I am
> investigating this problem as well. I only have one time server, so I
> have network issues or something that is causing the need for the
> steps. Underneath the one server, we have several of our processors
> acting as servers to the rest of our system. If one of our servers is
> having interrupt delays from disk activity or some other issue,
> perhaps that is causing the discrepancy. Does anyone know the best way
> to debug such a situation?
>
> But I still need to understand why NTP is stepping when the
> documentation I have says that it should not with the "-x" option. We
> have a distributed processing system with many processors. It is
> imperative that time does not step on any of our processors or our
> software will detect heartbeat problems. That is what is currently
> happening, so I know that real steps are occurring and not just steps
> in the ntp.log file.

If you have only one server and it is unstable, that would cause stepping on its clients. As Richard suggested, the output of "ntpq -p" would be helpful. In this case, the output of "ntpq -p [your server]" would also be helpful. You can work back through the chain of servers to find out where things are going south. Another helpful data point would be "(x)ntpdc -c loopinfo", which shows the frequency/drift rate. If it is consistently large, it indicates a problem.

"-x" can often make things worse, causing large corrections to be necessary that otherwise might not be, and it should never be used until a system has first been run without it for several days to stabilize on a characteristic drift rate. Under normal operation, (x)ntpd will only step if the offset is larger than 128 msec. A well-configured, well-behaved NTP network should never run into that. However, if the system is far off the "correct" time when it starts, "-x" can prevent it from getting to the correct time, causing repeated attempts to step. For this reason, most operating systems run ntpdate at boot to initially set the time to within a few milliseconds. If that part of your boot is not configured correctly, that could cause attempts to step some time after xntpd starts running.

Another common cause is having servers configured, including your local undisciplined clock, that do not agree on the time (or one or more of your servers itself having that problem).

If, as you say, this occurs after things have been stable for some time, you may have a problem with suddenly increased latency in one direction that causes xntpd to calculate an offset larger than actually exists. xntpd adds 1/2 the RTT to the time reported by the server to determine the time that should be applied to the client.
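
To make the half-RTT assumption concrete, here is a minimal sketch of the standard NTP on-wire offset and delay calculation from the four exchange timestamps. The function names and the sample delays are hypothetical, chosen only for illustration; the 128 msec figure is the step threshold mentioned above. It is not a copy of the xntpd source, just the arithmetic it relies on.

```python
STEP_THRESHOLD = 0.128  # seconds; the 128 msec step threshold mentioned above


def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0  # assumes equal delay each way
    delay = (t4 - t1) - (t3 - t2)           # round-trip time on the wire
    return offset, delay


def simulate_exchange(true_offset, delay_out, delay_back, t1=0.0):
    """Hypothetical helper: build the four timestamps for one exchange.

    true_offset is (server clock - client clock); delays are one-way, in seconds.
    The server is assumed to reply instantly (t3 == t2).
    """
    t2 = t1 + delay_out + true_offset
    t3 = t2
    t4 = t3 - true_offset + delay_back
    return t1, t2, t3, t4


if __name__ == "__main__":
    # Made-up numbers: the clocks really differ by 10 ms, but the path to the
    # server has become slow (300 ms) while the return path is fast (20 ms).
    stamps = simulate_exchange(true_offset=0.010, delay_out=0.300, delay_back=0.020)
    offset, delay = ntp_offset_and_delay(*stamps)
    print(f"measured offset: {offset * 1000:.1f} ms, round trip: {delay * 1000:.1f} ms")
    print("would exceed step threshold:", abs(offset) > STEP_THRESHOLD)
```

With these made-up delays the measured offset comes out at about 150 ms even though the clocks are only 10 ms apart, which is enough to cross the step threshold. With equal delays in both directions the same calculation recovers the true 10 ms offset.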

If the network delay is longer in one direction than in the other, the calculated offset will differ from the real difference between the two clocks by 1/2 the difference between the delays in the two directions. Over long distances or on heavily loaded systems or networks, this can be significant.

-Tom
