[ntp:questions] Issue with peering and orphan mode

Conner, Matthew Thu, 06 Oct 2011 13:13:47 -0700

We are experiencing an issue using orphan mode and peering in our ntpd 4.2.6p4 
setup. When our stratum 1 time hosts are lost, the stratum 2 servers do not 
properly choose a primary time provider. Below is the ntp.conf used on all 4 of 
the stratum 2 servers:


                tinker step .010 stepout 60 panic 0
                server tfds1 prefer minpoll 4 maxpoll 5 burst iburst
                server tfds2 minpoll 4 maxpoll 5 burst iburst
                server tfds3 minpoll 4 maxpoll 5 burst iburst

                peer timehost1 minpoll 4 maxpoll 5 burst iburst
                peer timehost2 minpoll 4 maxpoll 5 burst iburst
                peer timehost3 minpoll 4 maxpoll 5 burst iburst
                peer timehost4 minpoll 4 maxpoll 5 burst iburst

                tos orphan 4

                driftfile /etc/ntp/drift

When the stratum 1 hosts (tfds[1-3]) are lost, the stratum 2 hosts 
(timehost[1-4]) attempt to peer with one another. However, instead of all 
staying at stratum 4, as we saw with ntpd 4.2.4p7 (we have other issues with 
4.2.4p7 and need to update), each peer drops to 1 stratum below the peer it 
locks to. Since they are peering with one another, the timehosts slowly drop in 
stratum as each attempts to stay 1 stratum below the host it is locked to. They 
continue to drop until they reach stratum 16. Once a host hits stratum 16, all 
other hosts disconnect from it, and the peers previously locked to it unlock 
and jump back to stratum 4. Once at least 1 peer jumps back to 4, the others 
begin jumping to stratum 4-5. This process repeats until the stratum 1 hosts 
are reconnected or the timehosts choose a primary. We have only once seen it 
stabilize with all 4 hosts, and it took almost a full 24 hours to do so. With 
only 3 timehosts running, they stabilize within minutes.
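
To make the cycle above concrete, here is a toy model of what we think we are 
seeing. It is only a sketch under our own assumptions, not ntpd's actual 
selection algorithm: each host simply takes the stratum of the host it is 
locked to plus 1, capped at 16. The lock patterns (`ring`, `star`) and the 
function names are hypothetical and chosen for illustration; "dropping" in 
stratum here means the stratum number going up.

```python
ORPHAN_STRATUM = 4    # from "tos orphan 4" in our ntp.conf
UNSYNC_STRATUM = 16   # stratum at which the other hosts disconnect


def step(strata, sync_to):
    """One poll round: each host takes its locked-to peer's stratum + 1.

    A host whose entry in sync_to is None follows its own clock and
    stays at the orphan stratum.
    """
    new = {}
    for host, peer in sync_to.items():
        if peer is None:
            new[host] = ORPHAN_STRATUM
        else:
            new[host] = min(strata[peer] + 1, UNSYNC_STRATUM)
    return new


def simulate(sync_to, max_rounds=50):
    """Run rounds until the strata stop changing or a host reaches 16."""
    strata = {host: ORPHAN_STRATUM for host in sync_to}
    for round_no in range(1, max_rounds + 1):
        new = step(strata, sync_to)
        if new == strata:
            return round_no - 1, strata  # stable, nothing changed
        strata = new
        if max(strata.values()) >= UNSYNC_STRATUM:
            return round_no, strata
    return max_rounds, strata


# Hypothetical lock pattern 1: the hosts chase one another in a ring.
ring = {
    "timehost1": "timehost2",
    "timehost2": "timehost3",
    "timehost3": "timehost4",
    "timehost4": "timehost1",
}

# Hypothetical lock pattern 2: timehost1 acts as the primary.
star = {
    "timehost1": None,
    "timehost2": "timehost1",
    "timehost3": "timehost1",
    "timehost4": "timehost1",
}

if __name__ == "__main__":
    print("ring:", simulate(ring))
    print("star:", simulate(star))
```

In this model the ring never settles and every host climbs to stratum 16, 
while the star pattern settles as soon as one host follows its own clock. 
Whether ntpd 4.2.6p4 is really doing something like the ring case is exactly 
what we are unsure about.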

From what we are able to tell, a primary peer is chosen when 3 of the 4 
timehosts lock to the same peer.  When the 4th peer sees that the others are 
all connected to it, it syncs to its internal clock and remains a stratum 4. 
Is this correct, or is something else going on here?

Further questions:
Are the peers intentionally dropping below the orphan mode set stratum, or is 
that a bug?
Are we missing anything in ntp.conf to make orphan mode work properly?
Is this possibly just a limitation on the number of peers?
If this is working as intended, is there a way to force a primary peer more quickly?

Note: We have tested without burst/iburst on the peer declarations as well as 
the removal of the timehost declaration of the host itself. None of these 
modifications had an impact.

Thanks,


Matt

