On Jul 2, 10:07 am, [EMAIL PROTECTED] wrote:
> On Jul 1, 10:57 pm, David Woolley <[EMAIL PROTECTED]> wrote:
> > [EMAIL PROTECTED] wrote:
>
> > > The topology looks like this:
>
> > > Ext.NTP Server A
> > >         |
> > >         |
> > > Ext.NTP Server B                        Ext.NTP Server C
> > >         |                                       |
> > >         |---------------------------------------|
> > >                             |
> > >                             |
> > >                       NTP Server D
> > >                             |
> > >                             |
> > >                       NTP Client E
>
> > > The problem is that my NTP client E rejected its selected NTP server
> > > D, which led to not syncing, leading to offset drifting on NTP Client
> > > E. I think I have located the lack of sync to a too large "root
> > > dispersion" value sent from the NTP server D. Its value is 1991 as
> > > seen below:
>
> > > # ntpq -c"rv 51316"
> > > status=9014 reach, conf, 1 event, event_reach,
> > > srcadr=cliente, srcport=123, dstadr=169.254.5.34, dstport=123,
> > > leap=00, stratum=2, precision=-16, rootdelay=1.785,
> > > rootdispersion=1991.028, refid=10.112.1.14, reach=377, unreach=0,
>
> > Yup. rootdispersion is high enough for rejection.
>
> > > hmode=3, pmode=4, hpoll=6, ppoll=6, flash=00 ok, keyid=0,
> > > offset=3466396.411, delay=0.567, dispersion=0.956, jitter=37.305,
> > > reftime=cc0328d1.feabf9bf Wed, Jun 18 2008 9:25:21.994,
> > > org=cc0329cb.5b962c81 Wed, Jun 18 2008 9:29:31.357,
> > > rec=cc031c40.f62d86e1 Wed, Jun 18 2008 8:31:44.961,
> > > xmt=cc031c40.f5f9b77c Wed, Jun 18 2008 8:31:44.960,
> > > filtdelay= 0.57 0.53 0.57 0.52 0.56 0.68 0.52 1.11,
> > > filtoffset= 3466396 3466359 3466320 3466282 3466235 3466198 3466160
> > > 3466123,
>
> > This exceeds the panic threshold, so, unless this is the first time and
> > you have -g, NTP will abort if it accepts this offset.
>
> > > filtdisp= 0.03 0.98 1.95 2.93 3.92 4.86 5.81 6.77
>
> > > Upon looking at ntpq -c "as" command on the Client E, the server is in
> > > condition reject, most likely due to the high root dispersion.
> > > Correct?
>
> > > # ntpq -c"as"
>
> > > ind assID status  conf reach auth condition  last_event cnt
> > > ===========================================================
> > >   1 51316  9014   yes   yes  none    reject   reachable  1
>
> > > The problem exists when having the NTP server D to sync with an
> > > external NTP server C (stratum 1) having its own system clock as
> > > reference.
>
> > > On NTP Server D:
>
> > > # ntpq -c "as"
> > > ind assID status  conf reach auth condition  last_event cnt
> > > ===========================================================
> > >   1 62852  9414   yes   yes  none  candidat   reachable  1
> > >   2 62853  9614   yes   yes  none  sys.peer   reachable  1
>
> > > Upon looking in more detail at the two associations above:
>
> > > # ntpq -c "rv 62853"
> > > status=9614 reach, conf, sel_sys.peer, 1 event, event_reach,
> > > srcadr=10.112.1.14, srcport=123, dstadr=10.112.2.90, dstport=123,
> > > leap=00, stratum=1, precision=-17, rootdelay=0.000,
> > > rootdispersion=10.284, refid=LCL, reach=377, unreach=0, hmode=3,
> > > pmode=4, hpoll=10, ppoll=10, flash=00 ok, keyid=0, offset=-1128.193,
> > > delay=1.226, dispersion=14.849, jitter=224.514,
> > > reftime=cc12fb96.0a522000 Mon, Jun 30 2008 9:28:38.040,
> > > org=cc12fbad.30179000 Mon, Jun 30 2008 9:29:01.187,
> > > rec=cc12fbae.5110fdd4 Mon, Jun 30 2008 9:29:02.316,
> > > xmt=cc12fbae.50bd8b10 Mon, Jun 30 2008 9:29:02.315,
> > > filtdelay= 1.23 1.40 1.68 1.50 1.19 1.28 1.10 1.27,
> > > filtoffset= -1128.1 -903.68 -1144.7 -1133.5 -814.17 -1125.2 -1125.2
> > > -921.92,
> > > filtdisp= 0.04 15.38 30.73 46.10 61.46 76.82 92.21 107.59
>
> > > # ntpq -c "rv 62852"
> > > status=9414 reach, conf, sel_candidat, 1 event, event_reach,
> > > srcadr=10.112.1.13, srcport=123, dstadr=10.112.2.90, dstport=123,
> > > leap=00, stratum=2, precision=-17, rootdelay=6.454,
> > > rootdispersion=15.533, refid=10.109.1.164, reach=377, unreach=0,
> > > hmode=3, pmode=4, hpoll=10, ppoll=10, flash=00 ok, keyid=0,
> > > offset=1147.347, delay=1.298,
> > > dispersion=14.874, jitter=0.641,
> > > reftime=cc12f9fa.ed579000 Mon, Jun 30 2008 9:21:46.927,
> > > org=cc12fbd3.785bc000 Mon, Jun 30 2008 9:29:39.470,
> > > rec=cc12fbd2.52cdc1fb Mon, Jun 30 2008 9:29:38.323,
> > > xmt=cc12fbd2.52726f6f Mon, Jun 30 2008 9:29:38.322,
> > > filtdelay= 1.30 1.15 1.47 1.24 1.29 2.20 1.54 1.45,
> > > filtoffset= 1147.35 1147.99 1371.63 1132.04 1143.24 1460.54 1150.79
> > > 1150.61,
>
> > Note that the two servers differ by more than two seconds. I'm not sure
> > why they aren't both rejected as false tickers (in systems with LCL
> > clocks, it is important to be able to outvote the local clock with
> > enough real clocks, and one is far too few to do that!)
>
> > I think rv 0 on D would be instructive, but it looks to me as though D
> > is either rejecting both C and B, or it is trying to jump between them
> > and the resulting huge jitter is causing the root dispersion to go
> > through the roof. (Rather than jumping, it may be using one and
> > rejecting the other in its popcorn filter.)
>
> > > filtdisp= 0.04 15.41 30.79 46.18 61.57 76.91 92.26 107.63
>
> > > ...I can see that the one selected (NTP server C, i.e. AssId: 62853)
> > > has a ref.id of LCL (meaning it is syncing to its local system clock?)
>
> > LCL is local clock, which means that any reference clock it actually has
> > is broken.
>
> > Both are selected. The one with the lowest stratum gets to donate its
> > stratum and quality data, but they are both survivors, and both will be
> > used to calculate the time.
>
> > I would consider a server claiming to sync to LCL and having stratum 1
> > to be badly misconfigured. Undisciplined local clocks should always
> > have the highest stratum that just works, so that they are last choice
> > and don't propagate too far. The default for LCL is maybe OK if the
> > machine is accurately synchronised by some non-NTP means and steps are
> > taken to disable NTP if that source fails.
> > Going lower than the default really is a bad idea, and the fact that
> > it is lower than your non-LCL server is why you have the anomaly here.
>
> > > while the other one, the candidate (NTP server B, stratum 2) is having
> > > NTP server A as ref.id, meaning it syncs to NTP server A.
>
> > > Again, when having NTP server D to primarily sync with NTP server C,
> > > the "root dispersion" apparently gets too high, while having the NTP
> > > server D to sync with NTP server B is fixing the problem.
>
> > > My question is why the root dispersion becomes too high upon syncing
> > > to an external server having its own local system clock as reference
> > > (i.e. NTP server C)?
>
> > Because C and B are not getting times traceable to the same source and
> > there isn't an X and Y synchronised to the same source as B, to outvote C.
>
> Ok, thanks for the quick reply!
>
> Just to clarify even more. Please correct me if I'm wrong:
>
> Because B and C are not getting their times traceable to the same
> source, NTP on D has difficulty choosing between these two time
> sources (as seen, B and C differ by more than 2 secs). They are both
> survivors and both are used in time calculation, due to lack of reason
> to outvote C.
>
> The one with the lowest stratum (i.e. C) gets to donate its quality
> data, including a huge jitter, resulting in root dispersion going
> through the roof. And a high root dispersion value gets NTP on E to
> reject NTP on D.
>
> Correct?
>
> BR,
> Martin
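
A rough sketch of the arithmetic behind the summary above, in Python. This
is not ntpd's code. root_distance() is a simplified form of the root
distance (half the delays plus the dispersions; the jitter and clock-ageing
terms are left out), the 1 s limit is MAXDIST from RFC 5905 and may differ
in a given ntpd build, and the function names are made up for illustration.
All values are in milliseconds and are copied from the rv output quoted
above.

```python
# Assumed distance limit: MAXDIST = 1 s in RFC 5905 (a given ntpd may differ).
MAXDIST_MS = 1000.0
# ntpd's default panic threshold is 1000 s.
PANIC_MS = 1000.0 * 1000.0


def root_distance(rootdelay, rootdisp, delay, disp):
    """Simplified root distance in ms (jitter and ageing terms omitted)."""
    return (rootdelay + delay) / 2.0 + rootdisp + disp


def interval(offset, distance):
    """Correctness interval [offset - distance, offset + distance]."""
    return (offset - distance, offset + distance)


def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Client E looking at server D (association 51316).
dist_d = root_distance(rootdelay=1.785, rootdisp=1991.028,
                       delay=0.567, disp=0.956)
d_too_distant = dist_d > MAXDIST_MS
d_offset_panics = abs(3466396.411) > PANIC_MS

# Server D looking at server C (62853) and server B (62852).
int_c = interval(-1128.193, root_distance(0.000, 10.284, 1.226, 14.849))
int_b = interval(1147.347, root_distance(6.454, 15.533, 1.298, 14.874))
b_and_c_agree = overlaps(int_b, int_c)

if __name__ == "__main__":
    print(f"D seen from E: distance {dist_d:.3f} ms, "
          f"too distant: {d_too_distant}, offset panics: {d_offset_panics}")
    print(f"C interval: {int_c[0]:.3f} .. {int_c[1]:.3f} ms")
    print(f"B interval: {int_b[0]:.3f} .. {int_b[1]:.3f} ms")
    print(f"B and C intervals overlap: {b_and_c_agree}")
```

Under these assumptions D's distance as seen from E comes out at roughly
1993 ms, well above the assumed 1000 ms limit, and the intervals for B and C
do not overlap, which fits the point that B and C are not traceable to the
same source.
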

Additional question:

As seen in the logs, server B has quite low jitter (0.641) while server C
has huge jitter (224.514). Why is that? Is it because of a shaky local
clock on server C, or is it because server C lacks a reliable source?

Thanks in advance!

BR,
Martin

_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.org/mailman/listinfo/questions
