Re: Stratum one autonomy and assumptions about GPS
Gary E. Miller:
> > 1. GPS outage length and frequencies are decreasing
>
> Don't care. If you need your NTP to work, you need to know it is working.
> Otherwise failures are not noticed.

OK, the test for "know it is working" is: you have lock, or you had lock
less than x seconds ago, where x is the worst case of your drift model to
whatever confidence interval you want to fix.

> > 3. There's a lower bound below which outages don't matter; we may be
> > there.
>
> I don't agree. I monitor all my services 24x7, and I do get NTP
> problems in my logs.

And you also said in recent mail that you don't work with the kind of
hardware a serious autonomy-seeker would use. So *your* NTP problems
are not determinative, though they could be useful input data for
improving error-estimation techniques.

> > Any given fixed accuracy target for deviation from UTC, combined with
> > a maximum crystal drift rate, defines a longest tolerable GPS outage.
>
> Not the majority failure mode.

That's an interesting statement. What *is*, in your experience, the
dominant failure mode?

> > We may already be at a technological place where GPS outages don't
> > bust the tolerable-error budget, even with cheap hardware. If we
> > aren't, we'll probably be there soon.
>
> We can't define a single tolerable error budget. We can provide some
> ranges of options for the user.

And that's exactly what I've been pushing towards - to develop some
statistical modeling on the basis of which we can make estimates to
whatever confidence bound the user wants to set as a parameter.
--
Eric S. Raymond (http://www.catb.org/~esr/)

___
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel
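To make the arithmetic in this exchange concrete, here is a minimal sketch of the "lock, or lock less than x seconds ago" test. It assumes the simplest possible drift model: a constant worst-case drift rate in ppm and zero error at the moment lock is lost. The function names and parameters are hypothetical, chosen for illustration; this is not NTPsec code, and the statistical model discussed in the thread would replace the linear bound.

```python
def longest_tolerable_outage(error_budget_s: float, max_drift_ppm: float) -> float:
    """Seconds of holdover before worst-case linear drift uses up the budget.

    Assumes (hypothetically) zero offset when lock is lost and a constant
    worst-case drift rate; real oscillators also age and vary with temperature.
    """
    if error_budget_s < 0 or max_drift_ppm <= 0:
        raise ValueError("budget must be >= 0 and drift rate must be > 0")
    # 1 ppm of drift accumulates 1 microsecond of error per second.
    return error_budget_s * 1e6 / max_drift_ppm


def time_is_trustworthy(has_lock: bool, seconds_since_lock: float,
                        error_budget_s: float, max_drift_ppm: float) -> bool:
    """True if we have lock, or lost it less than x seconds ago."""
    if has_lock:
        return True
    x = longest_tolerable_outage(error_budget_s, max_drift_ppm)
    return seconds_since_lock < x
```

Under these assumptions, a 1 ms error budget with a 10 ppm worst-case drift allows roughly 100 seconds of outage. Exposing the budget as a parameter matches the "ranges of options for the user" point made above.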
Re: Stratum one autonomy and assumptions about GPS
Yo Hal!

On Thu, 25 Aug 2016 15:30:25 -0700 Hal Murray wrote:

> e...@thyrsus.com said:
> > I have a USB thermometer on order, they're cheap. Might I suggest
> > you get one and repeat this experiment, actually plotting your
> > temperature variation?
>
> Most CPU chips include a temperature sensor.

Which I have found does not correlate with any of my NTP data.

I now have a lot of data, just need to finish up the ntpviz temp module.

RGDS
GARY
---
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
g...@rellim.com Tel:+1 541 382 8588
Re: Stratum one autonomy and assumptions about GPS
Yo Eric!

On Thu, 25 Aug 2016 00:19:46 -0400 "Eric S. Raymond" wrote:

> This was going to be a note to just Hal originally, but it will do the
> rest of the team no harm to know more about the scenarios and
> assumptions driving some of my design choices.
>
> Hal objected (off list) to me drawing a conclusion from today's
> offset multiplot that check servers aren't necessary when you have
> a local GPS - a Stratum 1 really can run autonomously. He said,
> correctly of course, that the check servers aren't there to improve
> time accuracy when the GPS has sat lock, but to backstop the GPS when
> it flakes out.
>
> I shall now discuss three interlocking reasons this possibility does
> not loom as large in my mind as it does in Hal's.
>
> 1. GPS outage length and frequencies are decreasing

Don't care. If you need your NTP to work, you need to know it is working.
Otherwise failures are not noticed.

> 2. The autonomy scenarios I think about are not hobbyist-budget
> productions

Yeah, and the big boys REALLY need to know their NTP is right.

> 3. There's a lower bound below which outages don't matter; we may be
> there.

I don't agree. I monitor all my services 24x7, and I do get NTP
problems in my logs.

> Any given fixed accuracy target for deviation from UTC, combined with
> a maximum crystal drift rate, defines a longest tolerable GPS outage.

Not the majority failure mode.

> We may already be at a technological place where GPS outages don't
> bust the tolerable-error budget, even with cheap hardware. If we
> aren't, we'll probably be there soon.

We can't define a single tolerable error budget. We can provide some
ranges of options for the user.

RGDS
GARY
---
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
g...@rellim.com Tel:+1 541 382 8588
Re: Use of pool servers reveals unacceptable crash rate in async DNS
Processing old mail...

Hal Murray:
> > I believe you're right that these platforms don't have it. The question is,
> > how important is that fact? Is the performance hit from synchronous DNS
> > really a showstopper? I don't know the answer.
>
> There are two cases I know of where ntpd does a DNS lookup after it gets
> started.
>
> One is the try again when DNS for the normal server case doesn't work during
> initialization. It will try again occasionally until it gets an answer.
> (which might be negative)
>
> The main one is the pool code trying for a new server. I think we should be
> extending this rather than dropping it. There are several possibilities in
> this area. The main one would be to verify that a server you are using is
> still in the pool. (There isn't a way to do that yet - the pool doesn't have
> any DNS support for that.) The other would be to try replacing the poorest
> server rather than only replacing dead servers.
>
> DNS lookups can take a LONG time. I think I've seen 40 seconds on a failing
> case.
>
> If we get the recv time stamp from the OS, I think the DNS delays won't
> introduce any lies on the normal path. We could test that by putting a sleep
> in the main loop. (There is a filter to reject packets that take too long,
> but I think that's time-in-flight and excludes time sitting on the server.)
>
> There are two cases I can think of where a pause in ntpd would cause
> troubles. One is that it would mess up refclocks. The other is that packets
> will get dropped if too many of them arrive.
>
> I think that means we could use the pool command on a system without
> refclocks. That covers end nodes and maybe lightly loaded servers.
>
> ---
>
> It's worth checking out the input buffering side of things. There may be
> some code there that we don't need. I think there is a pool of buffers.
> Where can a buffer sit other than on the free queue? Why do we need a pool?

The project has more important priorities than chasing this down.
But: I have edited this text, adding a few details I have learned since,
into a new section for the internals tour (devel/tour.txt). That will give
somebody a better-than-nothing place to start if we ever again try
something like the cAres replacement.
--
Eric S. Raymond (http://www.catb.org/~esr/)
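For readers new to the problem Hal describes, the following is a minimal sketch of keeping a slow DNS lookup off the main loop, so a 40-second failing lookup does not stall refclock handling or let packets pile up. It is an illustration in Python using a worker thread, not ntpd's actual C implementation. The names `start_lookup` and `main_loop`, the tick interval, and the injectable `resolver` parameter are all assumptions made for the example.

```python
import socket
import time
from concurrent.futures import Future, ThreadPoolExecutor


def start_lookup(executor: ThreadPoolExecutor, hostname: str,
                 resolver=socket.getaddrinfo) -> Future:
    """Hand a blocking DNS lookup to a worker thread and return at once."""
    # Port 123 is NTP; SOCK_DGRAM restricts the answers to UDP.
    return executor.submit(resolver, hostname, 123, type=socket.SOCK_DGRAM)


def main_loop(future: Future, tick_s: float = 0.01, max_ticks: int = 1000):
    """Keep ticking while the lookup is pending.

    Returns (ticks, result), where result is None if the lookup failed
    or had not finished within max_ticks.
    """
    ticks = 0
    while not future.done() and ticks < max_ticks:
        time.sleep(tick_s)  # stand-in for servicing refclocks and packets
        ticks += 1
    if not future.done():
        return ticks, None  # still pending; caller may check again later
    try:
        return ticks, future.result()
    except OSError:  # socket.gaierror is a subclass: a negative answer
        return ticks, None
```

A caller would create one `ThreadPoolExecutor(max_workers=1)`, call `start_lookup` when the pool code wants a new server, and poll the returned future from its regular loop. The injectable resolver exists only so the sketch can be exercised without a network.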