We have moved from the meaning of status code 9514 to the more general issue of how NTP shall be supported, so I've collected the relevant threads below.
===================================================

> At 11:19 PM -0400 3/15/09, Danny Mayer wrote:
> Joseph Gwinn wrote:
>
> >>> The FAQ has to be the place for such explanations.
> >> I'm not sure if this qualifies as an FAQ as I don't recall that it has
> >> come up before. FAQ stands for Frequently Asked Questions.
> >
> > RAQ then? Rarely Asked Questions
> >
> > Seriously, I can't believe that I'm the only person in history to be
> > perplexed by these status codes, and those little three-word summaries
> > are a bit telegraphic.
> >
> > Joe Gwinn
> >
>
> You aren't the only one. These questions have been asked before by a
> number of people. In fact I had to look at this at one point when I was
> getting these codes. Of course I just looked at the source code and
> never looked for documentation.
>
> I will tell you that this is a combination of bits so it's not just a
> number. Each bit represents a test code that failed so you have quite a
> bit to look at.

I do know how the status code is structured, and wrote a Mathematica
program to automate the decoding. (I use Mathematica to generate the
co-plots of loopstats and peerstats data, collect statistics, etc.)

What I didn't know was that the definitions of the code bits had changed
between v3 and v4. I'll have to dig into the old documentation and see
if this code was affected.

There is little chance that I will have the time to read enough NTP
source code to make sense of it and come to reliable conclusions. I'm a
system engineer, and time is one issue of many in a system.

More generally, it's hopeless to expect the world's sysadmins to read
NTP code (or any other kind of code). They just don't have the time, and
are responsible for far too many different kinds of box for it to be
practical.

But a major part of making something reliable in practice is making it
possible for a harried sysadmin to nonetheless get it right. (I'm not a
sysadmin, but work with many sysadmins.
They spend lots of time fighting fires, and are of necessity jacks of
all trades, masters of none.)

Silently mutating code definitions sounds like a blunder to me. NTP is
used on tens to hundreds of millions of computers worldwide. There will
never be a pure v4 world. In fact there will still be v3 around when v5
is being introduced.

So, if new kinds of status are needed, invent new codes to suit, but do
not change the meanings of the codes that are already widely used. In
other words, do not undermine your existing base.

The Internet folk had the same issue with IPv6, and they concluded that
IPv4 was too deeply embedded to ever eliminate, and that there was never
going to be a "flag day" when a worldwide changeover would happen. Thus,
IPv4 and IPv6 had to coexist and interoperate forever, and so IPv6 was
designed to support this.

==========================================================

> To: ma...@ntp.org
> From: Joe Gwinn <joegw...@comcast.net>
> Subject: Re: [ntp:questions] What exactly does "Maximum Distance Exceded"
> mean?
> Cc: questions@lists.ntp.org
> Bcc: gw...@raytheon.com
> X-Attachments:
>
> Status code values fixed.
>
> At 10:47 PM -0400 3/15/09, Danny Mayer wrote:
> Joseph Gwinn wrote:
> > Hmm. OK, but I think that we've kind of run off the rails. Let me
> > summarize:
> >
> > 1. Sun Microsystems' current behavior is not the issue, as I'm loading
> > old software from an old CD onto old computer hardware, hardware that
> > cannot support a newer version of Solaris than v9.
> >
> > One of these old Solaris boxes did work with NTPv3 running an even older
> > version of Solaris, with no 9514 codes, deepening the mystery.
> >
>
> The trouble here is that those codes are *very likely* to have
> changed between V3 and V4 since there was a large rewrite between the
> two. That's why looking at the source code is necessary to get you the
> help you need.

As discussed in my other reply, mutating codes is a blunder.

It's a good-news bad-news thing.
The good news is that NTP has succeeded on an unimagined scale. The bad
news is that because of that scale, one must be *very* respectful of
NTP's existing base, and it *can* be constraining.

> > The fact that this obsolete system can most likely support NTPv4 is
> > worth investigation, though.
> >
> > 2. I think that what's happening is that I'm doing something dumb, and
> > I bet that there is no real difference in how NTPv3 or NTPv4 would react
> > to this faux pas, whatever it turns out to be. Nor is source code
> > research needed or requested.
> >
> > 3. The original question was how to interpret a specific status code,
> > 9514. I read the explanation in the documentation, but became no wiser
> > for it. Thus my question.
>
> Which is why you need to look at the source code. Documentation isn't
> always clear or definitive but the source code will tell you.

Reading source code simply cannot be a requirement for getting the
definitions of status codes, even if the documentation has to give one
definition per NTP version.

NTP is used on hundreds of millions of computers. Are we expecting that
every time someone gets an unexpected code they either have to read the
source code, or pay someone to read it for them? I'm sorry, but that
cannot work.

> > If there isn't an NTP FAQ entry on this, there probably should be. Our
> > sysadmins were flummoxed by the cloud of 9514 codes, and they are far
> > too busy to undertake a research project. (The deeper problem is that
> > some managers believe that NTP is plug and play, which isn't quite true.)
> >
>
> Mostly it is, but there are always mysteries like this.

Yes.

Joe
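
For readers following the "combination of bits" point in the thread, here is a minimal Python sketch of splitting such a word into its fields. It is not taken from the NTP sources or from the Mathematica program mentioned above. It assumes that 9514 is a hexadecimal 16-bit peer (association) status word, and that the fields are laid out as 5 status bits, a 3-bit selection value, a 4-bit event counter and a 4-bit event code, most significant bits first, as in RFC 1305 Appendix B. The names `PeerStatus` and `split_peer_status` are made up for illustration.

```python
from typing import NamedTuple


class PeerStatus(NamedTuple):
    """Fields of a 16-bit peer status word (assumed RFC 1305 layout)."""
    status_bits: int   # upper 5 bits: per-association flag bits
    select: int        # next 3 bits: selection state of the association
    event_count: int   # next 4 bits: event counter
    event_code: int    # low 4 bits: code of the most recent event


def split_peer_status(word: int) -> PeerStatus:
    """Split a 16-bit status word into its four fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("status word must fit in 16 bits")
    return PeerStatus(
        status_bits=(word >> 11) & 0x1F,
        select=(word >> 8) & 0x07,
        event_count=(word >> 4) & 0x0F,
        event_code=word & 0x0F,
    )


if __name__ == "__main__":
    # Assumption: the code is reported in hexadecimal.
    fields = split_peer_status(int("9514", 16))
    print(fields)
    print(f"status bits: {fields.status_bits:05b}")
```

The sketch deliberately stops at the numeric fields. As the thread discusses, the meaning attached to each value may differ between NTPv3 and NTPv4, so the names for the selection value and event code should come from the documentation of the ntpd version that produced the word.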
_______________________________________________
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions