[EMAIL PROTECTED] wrote:
> I'm trying to configure NTP for a cluster. The cluster has a "master"
> node and all nodes know which node is master. My NTP configuration is
> to have the master node the only node that contacts an external NTP
> server, with the other nodes using the master as their time server.
>
> The master node's ntp.conf:
>
> server 10.54.141.76 # A time server on another subnet
> server 127.127.1.1
> fudge 127.127.1.1 stratum 10
> driftfile /etc/ntp.drift
>
> The non-master nodes' ntp.conf:
>
> server 192.168.140.40 # The master node
> server 127.127.1.1
> fudge 127.127.1.1 stratum 15
> driftfile /etc/ntp.drift
>
> So, the master node uses an external time server, and can also use
> its internal clock as a stratum 10 time server.
>
> The non-master nodes use the master node as their time server and can
> use their internal clock as a stratum 15 time server.
>
> By setting the master node's system clock to stratum 10, and the
> non-master nodes' system clocks to stratum 15, I would expect that the
> master node would always be a lower stratum time server than the other
> nodes no matter if the master node is able to maintain a connection to
> its external time server or not.
>
> This worked as expected most of the time. I used a simple two node
> cluster for testing, with both nodes in the same subnet, and on the
> same switch.
>
> However, on occasion during the first few hours ntpd was running on
> the nodes, the non-master node would use its system clock as the
> system peer even though it had a higher stratum level than the master
> node. During the course of an hour the non-master node would switch
> between using the master node or its system clock as the system peer.
>
> Here's some output from ntpq run on the non-master node (node-2):
>
> chil43-2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *node-1          10.54.141.76     3 u   92  128  377    0.089  472.785 109.277
>  LOCAL(1)        LOCAL(1)        15 l   18   64  377    0.000    0.000   0.002
>
> ... And a few minutes later ...
>
> chil43-2# ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  node-1          10.54.141.76     3 u   75  256   17    0.085  368.154  35.584
> *LOCAL(1)        LOCAL(1)        15 l   11   64   77    0.000    0.000   0.002
>
> Eventually, things would settle down and the master node would remain
> the system peer for the non-master node continuously.
>
> A few questions:
> 1) Why would ntpd choose to use its higher stratum system clock rather
> than a lower stratum server and why would ntpd cycle between them?

Because stratum is a relatively low priority criterion. Note, for
example, the far lower jitter on your local clock, the smaller delay,
smaller offset, etc.

>
> 2) In the non-master node's ntp.conf, should I just remove the listing
> of its system clock as a server? Is it completely unnecessary as the
> non-master node is only a client?

Yes.

>
> 3) In general, is the way I'm setting up the ntp.confs for the nodes
> on the cluster reasonable?

No.

>
> (Note: there's been no master node change during the testing. node-1
> has always remained master. And I have scripts that will reconfigure
> ntp.conf on each node if the master changes.)
>
> Many thanks.
>
> DD
>

Your problem occurs because you have only 2 servers and one of them is
phoney. How would ntpd know which one to believe: a correct local clock
and a misbehaving remote one, or a correct remote one and a wrong local
one? Either get rid of the local undisciplined clock or add at least 2
more real ones (which is recommended practice anyway).

-Tom
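
As a minimal sketch of the first option (dropping the undisciplined
local clock from the non-master nodes), the non-master ntp.conf from the
original post would reduce to something like the following. The address
and drift file path are the ones quoted in the question; this only
illustrates the suggestion and is not a tested configuration.

```
# Non-master node: the master is the only time source.
# The "server 127.127.1.1" and "fudge 127.127.1.1 stratum 15" lines
# from the original post are removed.
server 192.168.140.40    # The master node
driftfile /etc/ntp.drift
```

For the second option, each additional real server would get its own
server line in the same way. Which servers to add depends on what is
reachable from the cluster, so none are listed here.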
