Re: [ntp:questions] Using NTP to calibrate sound app

2013-01-26 Thread Joseph Gwinn
In article 51033f49.215309...@news.eternal-september.org,
 no-...@no-place.org wrote:

 I am an app developer who has a precision audio frequency app for
 iPhone and Android devices.  For my app the nominal crystal oscillator
 accuracy in these devices is not sufficient.  Up until now I have been
 providing frequency calibration in my app by instructing the user to
 call the telephone feed of WWV (NIST) audio (using a separate landline
 phone) and let my app listen to the 500 Hz and 600 Hz tones.  By
 analyzing the audio I can correct for the device's audio system clock
 deviation.  Normally they only need to do this once after the app is
 installed because the stability of these devices is OK once I memorize
 the offset.
 
 Now I am considering an alternate means of performing this calibration
 using NTP.  The iPhone and Android devices deliver audio to my app in
 small packets.  A calibration run would consist of an initial NTP
 synchronization with an audio packet, followed by a period of some
 number of minutes during which I will just count audio packets,
 followed by a final NTP synchronization with the last audio packet.
 By knowing the time difference over some number of audio packets I
 hope to calculate the actual audio clock frequency for that device.
 
 My question is about the NTP procedure I should follow to do this.  I
 obviously don't want to hard-code for a specific time server because
 things could change after the user gets my app and it is unfair to
 send a whole block of users to the same server.  The Server Pool looks
 promising.  Does pool.ntp.org just behave like a Stratum 2 server so I
 could hard-code that URL into my implementation of NTP in my app?  I
 would appreciate any observations on the promise of this approach.

How accurately do you need to calibrate?  What will people use this app 
for?

And how long can this calibration take before users revolt?
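
As a rough illustration of why accuracy and run length are linked, the arithmetic of the proposed calibration can be sketched in Python.  This is only a sketch under stated assumptions, not the poster's implementation: the function names, the 1024-sample packet size, the 44100 Hz nominal rate and the 10 ms timestamp uncertainty are all hypothetical.

```python
def estimate_sample_rate(t_start, t_end, packets, samples_per_packet):
    """Return the measured audio sample rate in Hz.

    t_start and t_end are NTP-derived times (seconds) of the first and
    last counted packet boundaries; packets is the number of packets
    delivered between them.
    """
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return packets * samples_per_packet / elapsed


def fractional_uncertainty(timestamp_error, elapsed):
    """Worst-case fractional rate error when each of the two timestamps
    may be wrong by up to timestamp_error seconds."""
    return 2.0 * timestamp_error / elapsed


# Hypothetical run: 1024-sample packets counted over ten minutes.
rate = estimate_sample_rate(0.0, 600.0, 25840, 1024)
ppm_off = (rate / 44100.0 - 1.0) * 1e6          # deviation from nominal, in ppm
limit_ppm = fractional_uncertainty(0.010, 600.0) * 1e6
print(f"measured rate {rate:.3f} Hz, {ppm_off:+.1f} ppm from nominal")
print(f"timestamp-limited resolution about {limit_ppm:.1f} ppm")
```

With timestamps good to about 10 ms at each end, a ten-minute run resolves the rate only to a few tens of ppm, so the required accuracy largely dictates how long the run must be.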

Joe Gwinn

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP vs RADclock?

2012-06-03 Thread Joseph Gwinn
In article jqgakc$420$1...@dont-email.me,
 David Woolley david@ex.djwhome.demon.invalid wrote:

 skillz...@gmail.com wrote:
  Has there been any independent comparison of NTP vs RADclock [1]?
  Information on the RADclock site seems to indicate it performs pretty
  well, but I haven't seen any analysis except from the RADclock
  authors.
 
 This is the first I've heard of it, so I assume that it has never 
 appeared on this newsgroup, and its authors are not active here.
 
 What really annoys me, though, is that it fails to describe the essence 
 of the algorithm on the first page. That's par for the course for 
 commercial software, but this says that it is open source.  Can you 
 point me to where this information is provided (I think ntpd has a 
 similar problem, though).
 
  
  I'm using NTP today for synchronization between devices on a LAN to
  each other of an internal clock separate from the system's normal wall
  clock (which uses the system's NTP). So I have flexibility with update
  intervals (1-3 seconds acceptable in my case) and other parameters.
  
  [1] http://www.synclab.org/radclock/

I looked at one of their demonstrations, TIM_2008_camera.pdg, where they 
showed 10 microsecond sync over a LAN, with very bad NTP performance.  
But I've done 7 microseconds with NTP in a quiet testbed on 1996-era 
hardware, so I don't know what the problem with NTP was.  I'll have to 
read the paper more carefully.

Joe Gwinn


Re: [ntp:questions] local refclock and orphan mode...

2011-11-23 Thread Joseph Gwinn
In article e1rt3rk-0005bs...@stenn.ntp.org,
 Harlan Stenn st...@ntp.org wrote:

 Doug,
 
  On 11/21/2011 01:51 AM, Harlan Stenn wrote:
   I asked this on hackers@ and think a wider audience would be good.
   
   In the old days, we had the local refclock. 

   Now we have orphan mode. 

   Can anybody think of a (good) use-case where one would want *both* the 
   local refclock *and* orphan mode configured for an instance of ntpd?
   
  
  Harlan,
  
  I think you are confused about how things operate around here. The way
  this works is that *we* ask you the questions and then *you* give us
  answers;)
 
 Mostly, that's true :)   Mostly...  In this case I'm trying to make sure
 we don't implement a change that would catch folks by surprise.
 
  The first thing that comes to mind is environments with mixed ntpd
  versions. I realize that pre-4.2.2 was 5 years ago but sometimes change
  management policies read more like change resistance policies.
 
 Sure, but in the old days there was no orphan mode.  And there have been
 some other changes to things that would require updating ntp.conf files.
 
 So while you mention good points, I'm still not hearing anybody say "We
 use local refclocks *and* orphan mode and the reason for that is X and
 here's how we expect it to behave."
 
  What about the infamous interstellar/interplanetary ntp network?
 
 If they are up for upgrading their ntpd instances, they can easily
 upgrade the config files at the same time.
 
 If such networks have what they think is a valid use case for
 simultaneous use of both local refclocks and orphan mode, it would be
 Good to hear what that case is.

I may not be understanding the question correctly, but one 
similar-sounding real-world application comes to mind:

There is a planned radar system where a GPS receiver distributes GPS 
System Time via IRIG-B.  There are a number of Intel x86-64 servers 
running RHEL (with MRG) supporting realtime applications software.  It 
is necessary that this applications code be synchronized to GPS System 
Time, so applications can stay in synch with the radar hardware command 
pipeline.  

One way to achieve this is to provide an IRIG receiver card, the Linux 
I/O driver needed to access the IRIG card, and a compatible reference 
clock driver compiled into the NTPv4 daemon.  (Symmetricom offers such a 
triplet; there may be others as well.)  This allows application code to 
use ordinary kernel-provided timers to trigger software to make the 
donuts exactly when needed.

An alternative approach, successfully used on prior radars, is to have 
the IRIG card (or custom equivalent hardware) generate hardware timing 
interrupts, which interrupts are turned into UNIX signals sent to the 
application code.


Joe Gwinn


Re: [ntp:questions] local refclock and orphan mode...

2011-11-23 Thread Joseph Gwinn
In article e1rtlis-0007ax...@stenn.ntp.org,
 Harlan Stenn st...@ntp.org wrote:

 Joe wrote:
  Harlan wrote:
   If such networks have what they think is a valid use case for
   simultaneous use of both local refclocks and orphan mode, it would be
   Good to hear what that case is.
  
  I may not be understanding the question correctly, but one 
  similar-sounding real-world application comes to mind:
  
  There is a planned radar system where a GPS receiver distributes GPS 
  System Time via IRIG-B.  There are a number of Intel x86-64 servers 
  running RHEL (with MRG) supporting realtime applications software.  It 
  is necessary that this applications code be synchronized to GPS System 
  Time, so applications can stay in synch with the radar hardware command 
  pipeline.  
  
  One way to achieve this is to provide an IRIG receiver card, the Linux 
  I/O driver needed to access the IRIG card, and a compatible reference 
  clock driver compiled into the NTPv4 daemon.  (Symmetricom offers such a 
  triplet; there may be others as well.)  This allows application code to 
  use ordinary kernel-provided timers to trigger software to make the 
  donuts exactly when needed.
  
  An alternative approach, successfully used on prior radars, is to have 
  the IRIG card (or custom equivalent hardware) generate hardware timing 
  interrupts, which interrupts are turned into UNIX signals sent to the 
  application code.
 
 If there are multiple machines with these refclocks then just mesh
 them together, and in this case I don't see why either orphan mode or
 the local refclock is needed.

There are two machines with IRIG connections and refclock drivers et al, 
and a factor more servers with only NTP exchange via ethernet.  

By "mesh them together", what do you mean?

 
 If I'm missing something, I can see that in the old days a local
 refclock would have been good to keep the machines sync'd together, and
 now orphan mode would do the same thing.
 
 I do not see that in this case that *both* the local refclock driver
 *and* orphan mode would be useful.

That was my suspicion, but of course this entire network is isolated 
from the outside world, with only GPS to keep time by.  Maybe I'm 
unclear on the precise definition of orphan mode?


Joe Gwinn


Re: [ntp:questions] How to keep fake time in past/future?

2011-05-01 Thread Joseph Gwinn
In article slrnirobu0.8be.un...@wormhole.physics.ubc.ca,
 unruh un...@wormhole.physics.ubc.ca wrote:

 On 2011-04-30, Joseph Gwinn joegw...@comcast.net wrote:
  In article slrnirllcv.g22.un...@wormhole.physics.ubc.ca,
   unruh un...@wormhole.physics.ubc.ca wrote:
 
  On 2011-04-29, Cristian Seres cristia...@contrasec.fi wrote:
   Hi!
  
   How would you implement an NTP server which would need to offer a time 
   set deliberately in past/future, say 365*86400 seconds, or even better - 
   first set the freely chosen date on NTP server and then keep the hours, 
   minutes and seconds in sync with the real time?
  
  
  Perhaps you could tell us why in the world you would want to do that?
 
  It's very common, actually.  For instance, in Air Traffic Control they 
  record everything, and later play prior events back, to figure out what 
  happened.  
 
  Alternately, very large and complex training scenarios are written, the 
  scenario happening at some time different from the present, perhaps 
  past, perhaps future.
 
 And why do we not hear from Seres who could tell us why, instead of us
 all guessing here. Any of the guesses I have seen have been trivially
 satisfied by simply killing ntpd and resetting the computer's clock. The
 OP wanted the milliseconds to be right, but the days or years wrong. 
 Or if you want set up one computer with LOCAL clock as only source, and
 set the inappropriate time, (without any other source of time) reset
 the clock and have the others use it as their server. But of course the
 milliseconds will not be right. But why are we not hearing from the
 person who presumably knows why they want their computer clock
 mistreated thus?

Well, I don't know the details of that system, but for instance if one 
wished to test tolerance for leap seconds, it's necessary to abuse those 
hapless clocks.  I've done this by manually resetting the clocks.  This 
can really upset radar trackers if they aren't designed to tolerate the 
occasional one-second step discontinuity.  Negative leaps are worse than 
positive leaps.

As for simulation, probably the most common approach is to leave GPS, 
NTP, and the local computer clocks alone, instead interposing a software 
layer that can inject the needed constant offset.  But this only works 
if one is able to install such a layer.
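
A minimal sketch of such an interposed layer, in Python, assuming the application can be made to read time through a wrapper rather than from the system clock directly.  The class name `OffsetClock` and its interface are hypothetical.

```python
import time


class OffsetClock:
    """Report the system time shifted by a constant offset in seconds."""

    def __init__(self, offset_seconds, source=time.time):
        self.offset_seconds = offset_seconds
        self._source = source   # underlying disciplined clock, left untouched

    def now(self):
        """Return scenario time as seconds since the epoch."""
        return self._source() + self.offset_seconds


# Hypothetical scenario clock running exactly 365 days in the past.
scenario = OffsetClock(-365 * 86400)
print(scenario.now() - time.time())
```

Because the offset is a whole number of days, the hours, minutes, seconds and fractions of a second reported by the wrapper track the NTP-disciplined system clock, while the date is whatever the scenario requires.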

Joe Gwinn


Re: [ntp:questions] How to keep fake time in past/future?

2011-04-30 Thread Joseph Gwinn
In article slrnirllcv.g22.un...@wormhole.physics.ubc.ca,
 unruh un...@wormhole.physics.ubc.ca wrote:

 On 2011-04-29, Cristian Seres cristia...@contrasec.fi wrote:
  Hi!
 
  How would you implement an NTP server which would need to offer a time 
  set deliberately in past/future, say 365*86400 seconds, or even better - 
  first set the freely chosen date on NTP server and then keep the hours, 
  minutes and seconds in sync with the real time?
 
 
 Perhaps you could tell us why in the world you would want to do that?

It's very common, actually.  For instance, in Air Traffic Control they 
record everything, and later play prior events back, to figure out what 
happened.  

Alternately, very large and complex training scenarios are written, the 
scenario happening at some time different from the present, perhaps 
past, perhaps future.

Joe Gwinn


Re: [ntp:questions] UK - GPS Jamming update

2011-04-01 Thread Joseph Gwinn
In article 
AANLkTi=jmmuyufzkdj6pdfhgpadyhkeqkq-vqycjq...@mail.gmail.com,
 Dave Hart daveh...@gmail.com wrote:

 On Thu, Mar 31, 2011 at 3:27 PM, unruh un...@wormhole.physics.ubc.ca wrote:
 
  I think this is a really bad idea. It conveys the impression that it is
  OK to jam GPS-- Even the government thinks it is OK to jam GPS. This
  removes the (admittedly possibly small) moral argument that it is bad to
  jam GPS because of the harm it could do.
  This says Taking GPS off the air for 18 hours is fine.
 
 If you're within a few miles of one of the 40,000 planned LightSquared
 base stations, you'll likely lose GPS for good, if the FCC sticks to
 its current approval to repurpose a GPS-adjacent band allocated for
 space-based use for terrestrial broadband.  Search for LightSquared or
 see for example:
 
 http://freegeographytools.com/2011/update-on-lightsquareds-gps-jamming-proposal
 
 These LightSquared guys apparently know their way around the DC
 beltway, managing to schedule an accelerated approval to coincide with
 Thanksgiving.  I can only suppose they bribed the right people.  Did I
 say bribed?  I'm sorry, expressed their first amendment rights via
 unlimited political donations!  Silly me.

The DoD has begun to push the FCC to protect GPS:

Pentagon Raises Concerns About Lightsquared Wireless, Wall Street 
Journal, 31 March 2011.

Joe Gwinn


Re: [ntp:questions] new driver development

2011-03-20 Thread Joseph Gwinn
In article j_mdnrcpita18hjqnz2dnuvz_tudn...@megapath.net,
 hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal Murray) wrote:

 In article 2wlgp.34776$d46.31...@newsfe07.iad,
  Bruce Lilly bruce.li...@gmail.com writes:
 
  o POSIX mutex for synchronized access to shared memory for updates
-- obviates mode 0 / mode 1 / OLDWAY
 
 I'm far from a POSIX wizard.  When I google for POSIX mutex I get
 a bunch of hits that all are part of pthreads.

Pthreads moved into POSIX, so no surprise.


 Does that stuff work across processes rather than threads?
 
 The mutex needs to be in shared memory so both processes can get at it.
 Right?  Who initializes it?

It would be best to read the actual standard.  Mutexes spanning shared 
memory are supported, for exactly the reasons you list.

POSIX.1: http://pubs.opengroup.org/onlinepubs/9699919799/.



And the general history: http://en.wikipedia.org/wiki/POSIX


Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-22 Thread Joseph Gwinn
In article i9r6cb$62...@news.eternal-september.org,
 David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:

 Joseph Gwinn joegw...@comcast.net wrote in message 
 news:joegwinn-ee48fd.22434621102...@news.giganews.com...
  In article i9pkvb$dc...@news.eternal-september.org,
  David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:
 []
  You might consider providing a local, more precise NTP server with
  something like a small, fan-less Intel Atom system running FreeBSD and
  synched across the network to your GPS time server.  You might be able to
  keep a small box like that in a more temperature controlled environment,
  but even without it might provide a way of smoothing out any jitter due to
  your remote connection to the GPS server.
 
  I'm not convinced that this would help.  NTP reports a round trip time
  of slightly more than 2 ms, which is very close to the two milliseconds
  that ping sees, so it seems unlikely that the time server or intervening
  network is the root cause.
 []
  Joe Gwinn
 
 No, I wasn't convinced either - hence it was just a suggestion.  On the 
 systems here, though, the NTP delay shows around 0.25-0.75 msec to the LAN 
 servers.

I must say that I don't know why ping sees 2 milliseconds, which did 
seem high, but I also don't know the physical location of the 
timeserver, or how many hops (and firewalls) it takes to get there.  
I'll have to explore it with traceroute.

But the 2 ms RT time explains only a millisecond or so of timesync 
error, leaving much error to be explained.  Research continues.
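
The reasoning can be made concrete with the standard NTP on-wire calculation: the offset estimate assumes the outbound and return paths take equal time, so path asymmetry can corrupt it by at most half the round-trip delay.  The Python below is a sketch only; the timestamps are hypothetical, chosen to give a 2 ms round trip.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time   (client clock)
    t2: server receive time    (server clock)
    t3: server transmit time   (server clock)
    t4: client receive time    (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def max_asymmetry_error(delay):
    """Largest offset error that path asymmetry alone can cause."""
    return delay / 2.0


# Hypothetical exchange: 1 ms each way, client clock 5 ms behind the server.
offset, delay = ntp_offset_delay(0.000, 0.006, 0.006, 0.002)
print(f"offset {offset * 1e3:.3f} ms, delay {delay * 1e3:.3f} ms")
print(f"asymmetry bound {max_asymmetry_error(delay) * 1e3:.3f} ms")
```

A 2 ms round trip therefore bounds the network's contribution at 1 ms, which is why offsets of 5 to 10 ms have to come from somewhere else.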


Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-21 Thread Joseph Gwinn
In article i9njj8$ea...@news.eternal-september.org,
 David Woolley da...@ex.djwhome.demon.invalid wrote:

 Joseph Gwinn wrote:
  I have a small network of Windows XP (64 bit) running simulations, with 
  NTPv4 running on all the boxes and using a GPS-based timeserver on the 
  company network.  The ping time to the server is 2 milliseconds from my 
  desk, but I'm seeing random time errors of order plus/minus 5 to 10 
  milliseconds, based on loopstats data.
 
 Loopstats data cannot give you an accurate measure of error when ntpd is 
 locked on.  It will give a value that is distinctly pessimistic.  If 
 ntpd could measure the actual error, it ought to be possible for it to 
 remove that error.

This is certainly true, but loopstats and peerstats data is nonetheless 
useful as a proxy for the unmeasured actual clock offset.  In other 
words, if loopstats data shows adequate stability for my application, 
true offset error will also be adequate.   Unless the transport delay is 
asymmetric, which is not the case here.

Nor do I have the hardware to make better offset measurements than those 
provided by loopstats and peerstats.
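
For anyone wanting to reduce loopstats data the same way, a short Python sketch follows.  It relies on the documented loopstats record layout (modified Julian day, seconds past midnight UTC, clock offset in seconds, then frequency, jitter, wander and time constant); the function names and the sample records are made up for illustration.

```python
def parse_loopstats(lines):
    """Yield (mjd, seconds_past_midnight, offset_seconds) per record."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue            # skip blank or truncated lines
        yield int(fields[0]), float(fields[1]), float(fields[2])


def offset_summary(lines):
    """Return (peak absolute offset, RMS offset), both in seconds."""
    offsets = [offset for _, _, offset in parse_loopstats(lines)]
    if not offsets:
        raise ValueError("no loopstats records found")
    peak = max(abs(offset) for offset in offsets)
    rms = (sum(offset * offset for offset in offsets) / len(offsets)) ** 0.5
    return peak, rms


# Made-up records in loopstats layout, for illustration only.
sample = [
    "55490 100.000 0.003000 12.500 0.001200 0.010000 6",
    "55490 164.000 -0.008000 12.510 0.001500 0.011000 6",
    "55490 228.000 0.005000 12.505 0.001300 0.010500 6",
]
peak, rms = offset_summary(sample)
print(f"peak {peak * 1e3:.1f} ms, rms {rms * 1e3:.1f} ms")
```

In practice the lines would come from the loopstats file itself, for example via `open(path)`.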


Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-21 Thread Joseph Gwinn
In article i9oo4r$5a...@news.eternal-september.org,
 David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:

 David Woolley da...@ex.djwhome.demon.invalid wrote in message 
 news:i9omts$7s...@news.eternal-september.org...
  Richard B. Gilbert wrote:
 
  Also note that Windows' clock ticks every 17 milliseconds.
 
 
  Only when not running ntpd.  ntpd forces the use of multimedia timers.
 
 .. and not when running Windows-7 and possibly Vista, when it's just under 
 1 millisecond, and NTP uses the native timers.

The platforms in question are running Windows XP, not Vista or Windows 
7.  How does this change the answer?

By the way, the hardware is a collection of 8-core HP Z800 workstations 
connected together by copper gigabit ethernet links and local hub, with 
one link going via the company network to the GPS network time server.


Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-21 Thread Joseph Gwinn
In article i9ooo9$9i...@news.eternal-september.org,
 David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:

 Joseph Gwinn joegw...@comcast.net wrote in message 
 news:joegwinn-da4b7b.23340420102...@news.giganews.com...
  In article i9mqek$tr...@news.eternal-september.org,
  David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:
 
   I have a small network of Windows XP (64 bit) running simulations, with
   NTPv4 running on all the boxes and using a GPS-based timeserver on the
   company network.  The ping time to the server is 2 milliseconds from my
   desk, but I'm seeing random time errors of order plus/minus 5 to 10
   milliseconds, based on loopstats data.
  
   This level of timesynch error is OK for the simulation, but still that's
   a lot of error.  I get far better on big UNIX boxes.
  
   The question is if this level of error is reasonable, given the setup.
   I know that timekeeping under Windows is not optimum, but cannot change
   the OS, so the question is if I have gotten things as good as they can
   be, or should I dig deeper.  One thing that comes to mind is to raise
   the priority of the NTP daemon to exceed that of the simulation
   software.
  
   Thanks in advance,
  
   Joe Gwinn
 
  Joe,
 
  This is the performance I see:
 
http://www.satsignal.eu/mrtg/performance_ntp.php
 
  The XP systems are:
 
Feenix: GPS-synched
Narvik: LAN-synced to Pixie (FreeBSD with GPS source)
 
  These are all over the place.  Both hardware and OS seem to matter, by a
  lot.
 
 Hardly all over the place!  Feenix is well within a millisecond, and 
 Narvik just within a millisecond, and programs on that OS can only read 
 the system time with ~16ms precision.

By "all over the place" I mean that while some combinations are very 
good, yielding peak offsets well less than a millisecond, some 
combinations yield peak offsets of 25 milliseconds.  In my application, 
only peak offsets matter.


  I can't add a GPS source, and I can't really control temperature.
 
 So you need to keep the polling interval short.

We tried 16 seconds, with no variation allowed, and it didn't make much 
difference.  Currently, NTP is being allowed to choose its own polling 
period.  I don't recall what periods it chose, but I'll look.

What other periods would you suggest, and why?
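
For reference when comparing periods: ntpd's minpoll and maxpoll values are exponents of two, in seconds, so a fixed 16-second period presumably corresponds to setting both to 4.  A small Python sketch of the conversion, with hypothetical function names:

```python
import math


def poll_seconds(exponent):
    """Polling interval in seconds for an ntpd poll exponent."""
    return 2 ** exponent


def poll_exponent(seconds):
    """Smallest poll exponent whose interval is at least `seconds`."""
    return math.ceil(math.log2(seconds))


for exponent in (4, 5, 6, 10):
    print(f"poll exponent {exponent:2d} -> {poll_seconds(exponent):4d} s")
```

The ntpd defaults are minpoll 6 (64 s) and maxpoll 10 (1024 s), so a daemon left to choose its own period will settle somewhere in that range.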


  I don't think that iburst is the issue, because the randomness persists
  for at least a week, long after the iburst transients will have died
  down.
 
 I never said iburst was an issue, just that the systems will need to be on 
 for several hours before best accuracy is achieved.  It's a pity that NTP 
 doesn't have a faster initial convergence.
 
  My experience is the same, for average behaviour.  But for use in
  realtime, running the daemon at high realtime priority greatly reduces
  the tails of the probability distribution of response times and/or clock
  offsets.
 
  Joe Gwinn
 
 Yes, if the CPU loading is heavy I can quite believe that.

That's the usual cause.  My usual solution is to ensure that the NTP 
daemon has a high realtime priority that well exceeds that of the 
realtime application code.  NTP can be run at the highest realtime 
priority available without difficulty on every system I have tried this 
on.  

Another, more subtle cause is Network File System (NFS) access being 
used to read or write the local disk from afar.  This completely 
distracts the local OS kernel, at an implied priority that exceeds all 
processes and threads, including NTP running at the highest realtime 
priority.  And yet there may be no record of the activity in syslog. Nor 
is it clear that NFS activity is always counted in the I/O read and 
write statistics kept by the kernel. Diagnosis may require network 
tools, unless one can figure out where the NFS access must be coming 
from and stop it at the source.  


I should explain what I mean by the term realtime priority.  There are 
two related but independent things going on here, a numerical priority 
and a scheduling policy.

A realtime scheduling policy is typically winner take all, where the 
process (and/or thread) having the highest priority can use as much of 
the processor as desired, even if all other processes and threads are 
squeezed out completely.  In other words, realtime scheduling policies 
are completely unfair.  It is the human system designers' responsibility 
to ensure that there is enough computer that nothing critical is unduly 
stalled.

A non-realtime scheduling policy attempts some notion of fairness, where 
all processes and threads make progress at an average rate that is 
determined by their respective numerical priorities.  In such a scheme, 
nobody is completely squeezed out, and no direct human intervention is 
required to ensure this outcome.
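
The distinction between numerical priority and scheduling policy can be seen directly on a POSIX-style system.  The Python sketch below assumes Linux, where the `os` module exposes the scheduler calls; it does not apply to the Windows XP machines in this thread.  The function names are hypothetical, and `request_realtime` normally needs root or an equivalent privilege.

```python
import os

# Names for the common Linux policies; other values are reported numerically.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER (fair time-sharing)",
    os.SCHED_FIFO: "SCHED_FIFO (realtime, first-in first-out)",
    os.SCHED_RR: "SCHED_RR (realtime, round-robin)",
}


def describe_scheduling(pid=0):
    """Return (policy name, numerical priority) for a process (0 = self)."""
    policy = os.sched_getscheduler(pid)
    priority = os.sched_getparam(pid).sched_priority
    return POLICY_NAMES.get(policy, f"policy {policy}"), priority


def request_realtime(priority=None, pid=0):
    """Try to move a process to SCHED_FIFO; return True on success."""
    if priority is None:
        priority = os.sched_get_priority_max(os.SCHED_FIFO)
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False            # not privileged to use a realtime policy


name, priority = describe_scheduling()
low = os.sched_get_priority_min(os.SCHED_FIFO)
high = os.sched_get_priority_max(os.SCHED_FIFO)
print(f"current: {name}, priority {priority}")
print(f"SCHED_FIFO priority range: {low}..{high}")
```

The policy decides whether the process competes fairly or winner-take-all; the number only ranks it against other processes under the same kind of policy.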


Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-21 Thread Joseph Gwinn
In article i9pkif$at...@news.eternal-september.org,
 David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:

 Evandro Menezes evan...@mailinator.com wrote in message 
 news:a376dc23-cb31-441c-9b35-b10a9758c...@a36g2000yqc.googlegroups.com...
 []
  Indeed, since Windows allows a process to be starved from running,
  depending on the load, a higher priority process may block NTP from
  running.  Therefore, although raising the priority for NTP doesn't
  mean that it cannot be starved, it does decrease the likelihood of
  that happening.
 
  Linux, on the other hand, favors fair process scheduling and strives
  to not starve any from running at least for a little while.
 
  HTH
 
 Windows can run NTP at real-time priority, if you give the NTP user that 
 right.  Normal user processes will not then pre-empt NTP.

We are trying this, but given that the error level and pattern showed no 
diurnal variation, I don't really expect changing priority to help.  The 
simulations are run only during the day, so one would expect a diurnal 
variation if CPU load were the issue.
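
One way to check for a diurnal pattern is to bin the offsets by hour of day and compare working hours against the rest.  The Python below is a sketch with hypothetical function names and made-up data; it assumes (seconds past midnight, offset) pairs such as those found in loopstats, whose times are UTC, so the `day_hours` range would need shifting to match local working hours.

```python
from collections import defaultdict


def peak_offset_by_hour(records):
    """Map hour of day to the peak absolute offset seen in that hour.

    records: iterable of (seconds_past_midnight, offset_seconds).
    """
    peaks = defaultdict(float)
    for seconds, offset in records:
        hour = int(seconds // 3600) % 24
        peaks[hour] = max(peaks[hour], abs(offset))
    return dict(peaks)


def day_night_ratio(peaks, day_hours=range(8, 18)):
    """Ratio of mean hourly peak in day_hours to that in all other hours."""
    day = [value for hour, value in peaks.items() if hour in day_hours]
    night = [value for hour, value in peaks.items() if hour not in day_hours]
    if not day or not night:
        raise ValueError("need data both inside and outside day_hours")
    return (sum(day) / len(day)) / (sum(night) / len(night))


# Made-up data: two night-time samples and one at midday.
records = [(3 * 3600, 0.002), (3 * 3600 + 64, -0.004), (12 * 3600, 0.008)]
peaks = peak_offset_by_hour(records)
print(peaks, day_night_ratio(peaks))
```

A ratio near 1 over several days of data would support the conclusion that daytime CPU load is not the cause.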

We are also verifying that the Intel power-saving feature, which slows 
the CPU clock et al, is disabled.

Joe Gwinn


[ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-20 Thread Joseph Gwinn
I have a small network of Windows XP (64 bit) running simulations, with 
NTPv4 running on all the boxes and using a GPS-based timeserver on the 
company network.  The ping time to the server is 2 milliseconds from my 
desk, but I'm seeing random time errors of order plus/minus 5 to 10 
milliseconds, based on loopstats data.

This level of timesynch error is OK for the simulation, but still that's 
a lot of error.  I get far better on big UNIX boxes.

The question is if this level of error is reasonable, given the setup.  
I know that timekeeping under Windows is not optimum, but cannot change 
the OS, so the question is if I have gotten things as good as they can 
be, or should I dig deeper.  One thing that comes to mind is to raise 
the priority of the NTP daemon to exceed that of the simulation software.

Thanks in advance,

Joe Gwinn


Re: [ntp:questions] What level of timesynch error is typical on Win XP?

2010-10-20 Thread Joseph Gwinn
In article i9mqek$tr...@news.eternal-september.org,
 David J Taylor david-tay...@blueyonder.co.uk.invalid wrote:

  I have a small network of Windows XP (64 bit) running simulations, with
  NTPv4 running on all the boxes and using a GPS-based timeserver on the
  company network.  The ping time to the server is 2 milliseconds from my
  desk, but I'm seeing random time errors of order plus/minus 5 to 10
  milliseconds, based on loopstats data.
 
  This level of timesynch error is OK for the simulation, but still that's
  a lot of error.  I get far better on big UNIX boxes.
 
  The question is if this level of error is reasonable, given the setup.
  I know that timekeeping under Windows is not optimum, but cannot change
  the OS, so the question is if I have gotten things as good as they can
  be, or should I dig deeper.  One thing that comes to mind is to raise
  the priority of the NTP daemon to exceed that of the simulation 
  software.
 
  Thanks in advance,
 
  Joe Gwinn
 
 Joe,
 
 This is the performance I see:
 
   http://www.satsignal.eu/mrtg/performance_ntp.php
 
 The XP systems are:
 
   Feenix: GPS-synched
   Narvik: LAN-synced to Pixie (FreeBSD with GPS source)

These are all over the place.  Both hardware and OS seem to matter, by a 
lot.

   
 Your best bet would be to add a GPS source to your Windows PC, when you 
 might expect errors of less than 250 microseconds under stable running 
 (i.e. leave the PC on 24 x 7).  If you can't do that, PC Narvik suggests 
 you might get within +/- 1.5ms.  That's with a configuration file like:
 
 server A  iburst  maxpoll 5
 server B  iburst  maxpoll 5
 server C  iburst  maxpoll 5
 
 where A, B and C all have a GPS source.  All PCs on the same switch, so a 
 much better ping than 2ms.  You could reduce the maxpoll further to 4 (if 
 the server operator agrees) and get somewhat better performance, and 
 keeping the PCs in a stable temperature environment would also be likely 
 to help.  The bumps at 05:00 are when the heating comes on.

I can't add a GPS source, and I can't really control temperature.

I don't think that iburst is the issue, because the randomness persists 
for at least a week, long after the iburst transients will have died 
down.

 
 In my experience, changing the priority of NTP doesn't help a lot, but 
 most of my PCs are not CPU-bound.  But I have given the account the rights 
 to do that.

My experience is the same, for average behaviour.  But for use in 
realtime, running the daemon at high realtime priority greatly reduces 
the tails of the probability distribution of response times and/or clock 
offsets.

Joe Gwinn


Re: [ntp:questions] Allan deviation survey

2010-09-14 Thread Joseph Gwinn
In article 1f42m7-7gr2@ntp.tmsw.no,
 Terje Mathisen terje.mathisen at tmsw.no wrote:

 Joseph Gwinn wrote:
  The address did not look munged to me either.  It makes perfect sense for a
  physicist to name servers after physics objects.
 
  The standard approach is to put the demunging instructions in your sig.  
  Like:
  Please remove reference to the entrance to a worm's burrow from email 
  address.
 
 No, no!
 
 Please remove interstellar gateway from my address ?
 
  Something too hard for a computer to figure out, but easy for a human.
 :-)

I like it.  But will unruh?

Joe Gwinn


Re: [ntp:questions] Allan deviation survey

2010-09-14 Thread Joseph Gwinn
In article slrni8sddh.n3i.un...@wormhole.physics.ubc.ca,
 unruh un...@wormhole.physics.ubc.ca wrote:

 On 2010-09-13, Joseph Gwinn joegw...@comcast.net wrote:
  Unruh,
 
  In article slrni8ru62.i6p.un...@wormhole.physics.ubc.ca,
   unruh un...@wormhole.physics.ubc.ca wrote:
 
  On 2010-09-13, David L. Mills mi...@udel.edu wrote:
  
  [snip]
  
   ... And, by the way, mail sent to your alleged mail address is 
   returned to sender as undeliverable.
  
  Yes, I am sorry about that but it is done in order to slightly reduce
  the spam I get. It should be clear how to alter it, but I realise that
  that makes more work for the responder. For a long time I did not munge
  my address, and as a result am on a number of spam lists.
 
  The address did not look munged to me either.  It makes perfect sense for a 
  physicist to name servers after physics objects.  
 
 Ah, I finally looked at it. I used to use the nn news reader which munged
 my email address. I recently (well a year ago) switched to slrn, and
 just assumed that the same would occur there. Your comments caused me to
 actually read one of my posts as it appeared on the newsgroup, and sure
 enough it is the address of the machine running slrn ( which does not receive 
 mail) instead of the munged address. Sorry about the wrong explanation. If you
 really want to email me you can remove the wormhole. But answering on
 the list is probably better anyway. 

Ah.  I wasn't trying to email anyone today, but I could see the problem should
I try.  Anyway, now that it's understood, it can be fixed.

Joe Gwinn



Re: [ntp:questions] Allan deviation survey

2010-09-13 Thread Joseph Gwinn
Unruh,

In article slrni8ru62.i6p.un...@wormhole.physics.ubc.ca,
 unruh un...@wormhole.physics.ubc.ca wrote:

 On 2010-09-13, David L. Mills mi...@udel.edu wrote:
 
[snip]
 
  ... And, by the way, mail sent to your alleged mail address is 
  returned to sender as undeliverable.
 
 Yes, I am sorry about that but it is done in order to slightly reduce
 the spam I get. It should be clear how to alter it, but I realise that
 that makes more work for the responder. For a long time I did not munge
 my address, and as a result am on a number of spam lists.

The address did not look munged to me either.  It makes perfect sense for a 
physicist to name servers after physics objects.  

The standard approach is to put the demunging instructions in your sig.  Like: 
Please remove reference to the entrance to a worm's burrow from email 
address.  
Something too hard for a computer to figure out, but easy for a human.

Joe Gwinn



Re: [ntp:questions] DATUM TymServe 2000 Op/Sv Manual request

2010-08-08 Thread Joseph Gwinn
In article eov7o.197415$9f6.393...@twister1.libero.it,
 mauri maremovebeforereplyuri...@libero.it wrote:

 In Symmetricom.com this model isn't available.

Call Symmetricom up and ask them.


Joe Gwinn




 E-Mail Sent to this address will be added to the BlackLists 
 n...@blacklist.anitech-systems.invalid wrote in message 
 news:i3hvbr$f2...@news.eternal-september.org...
  mauri wrote:
  I`m looking for DATUM TymServe 2000 (TS2000-GPS) Network
   Time Server Operating/Service manual -
 
  symmetricom.com ? Symmetricom and Datum merged in 2002.
 
  -- 
  E-Mail Sent to this address blackl...@anitech-systems.com
   will be added to the BlackLists.



Re: [ntp:questions] IA approved COTS NTP servers question

2010-06-10 Thread Joseph Gwinn
In article 62b84ad9-7d4c-4074-960e-aae4ef826...@u7g2000yqm.googlegroups.com,
 Fran fran.ho...@jhuapl.edu wrote:

 On Jun 4, 3:13 pm, Greg Hennessy greg.henne...@cox.net wrote:
  On 2010-06-04, Fran fran.ho...@jhuapl.edu wrote:
 
   On Jun 3, 4:49 pm, Greg Hennessy greg.henne...@cox.net wrote:
    Do you know of any DISA IA approved COTS NTP servers ?
 
   Why not use tick.usno.navy.mil or tock.usno.navy.mil? Only half a
   smiley.
 
   That's a funny one Greg, thanks!
 
  On the serious side, if you are worried about having to follow DISA
  STIGS, then it seems safe to assume you are on NIPR or SIPR nets, in
  which case it is probably easier to use the USNO supplied time service
  rather than recreating your own. If for redundancy you wish to run
  your own NTP servers (which you should point to USNO since USNO is
  what all DoD sources are *SUPPOSED* to be using), I'm not aware of any
  COTS NTP servers that are DISA IA approved out of the box.
 
 Greg, thanks again for your help.
 
 We are running on a private net inside a lab, no connections outside
 of the lab. We'll run the NTP server either with a LOCAL reference
 clock driver, IRIG-B, or with GPS.

GPS would be the simplest solution, and there are many classified networks with 
GPS timeservers, so there is ample precedent.  For IA, the key is that a GPS 
receiver does not connect in any way to the internet, so there is no way for 
someone to hack in via the GPS receiver.  The fact that GPS is a DoD system 
doesn't hurt either.


 A short email with Symmetricom said in essence: although there is no
 'IA-mode' to put the NTP servers in, the NTP server already runs only
 a limited number of services, and there are controls to further disable
 services and ports. Therefore it seems likely to me the NTP server
 could be configured as required.
 
 The devil is in the details however. So I would need to get funded for
 time to get smart on the applicable IA requirements, get a suitable
 COTS NTP server, configure and test it. It's likely we can get what we
 want, but it's not going to be a simple button push like the managers
 would like to hear it is.

Lots of things on networks lack anything resembling IA mode (whatever that 
is), and yet life goes on.

Joe Gwinn


Re: [ntp:questions] IA approved COTS NTP servers question

2010-06-04 Thread Joseph Gwinn
In article 9ead1ef5-7000-445b-b7d1-ac1083874...@q8g2000vbm.googlegroups.com,
 Fran fran.ho...@jhuapl.edu wrote:

 Do you know of any DISA IA approved COTS NTP servers ? Didn't see any
 in the approved products lists at http://iase.disa.mil/common/index.html
 
 Or, have you configured/tested a COTS NTP server to pass STIG tests ?
 
 Thanks,
 
 Fran
 
 STIG: http://en.wikipedia.org/wiki/Security_Technical_Implementation_Guide

I don't recall that there are any STIGs for NTP timeservers, which are based on 
small dedicated-mode computers running the NTP daemon under some kind of RTOS 
kernel.  

Most timeservers support at least DAC (username and password), but I don't know 
of any that have been evaluated to a protection profile.

Which specific 8500.2 IA Controls (other than those that call out STIGs and 
SRGs) are you responding to?  What is the threat?  

Joe Gwinn


Re: [ntp:questions] Network latency questions

2010-05-31 Thread Joseph Gwinn
In article 20100536.48397.fmgrotep...@yahoo.co.uk,
 Frans Grotepass fmgrotep...@yahoo.co.uk wrote:

 Hi all,  Thanks for the responses.
 
 On Thursday 27 May 2010 15:13:50 Joseph Gwinn wrote:
  In article 201005271035.46352.fmgrotep...@yahoo.co.uk,
  
   Frans Grotepass fmgrotep...@yahoo.co.uk wrote:
   Hi all,
  
   Sorry for abusing my membership to this forum for this question.
  
   We are busy with building an embedded application that must retrieve data
   very fast.
  
  Please define "very fast" in numbers.  For example, 95% of responses must
   be fully received within 1,000 microseconds, and 100% within 10
   milliseconds, or the planet will explode.
 
 This matches the specs.

Blind luck wins again.  This is pretty fast for access to a remote database to 
work well.


  What does the embedded application do?
  
 SMS-routing

How many subscribers?  How big is the database?


   The choice is to either have the data locally or go to a central
   server(pool) that contains the data.
  
  Well, locally is always faster and more predictable than remotely, so why
   even consider remotely?
 
 The problem is that the remote db is already available and this will mean 
 replicating the remote db locally. The remote db has all the data in memory. 
 The local response time must be so fast that one needs a db solution with all 
 the data in memory, otherwise the disk seek time will kill us.

How about local caching of the database data, as needed?  What fraction of the 
database is used locally at a worst-case instant in time?

Caching can be done in local memory, using some kind of hash code access method.

Local caching may be useful even if one has a local replica of the database.
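
The in-memory caching idea above can be sketched in a few lines of Python. This is only an illustration: `fetch_remote` is a hypothetical stand-in for the query to the remote database, and the class name, the capacity, and the least-recently-used eviction policy are assumptions, not anything specified in this thread.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Hash-keyed in-memory cache with least-recently-used eviction."""

    def __init__(self, fetch_remote, capacity=100_000):
        self._fetch_remote = fetch_remote  # hypothetical remote lookup
        self._capacity = capacity
        self._entries = OrderedDict()      # hashed access, remembers order
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._fetch_remote(key)        # slow path: one remote query
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

Watching the ratio of `hits` to `misses` under realistic traffic is one way to estimate what fraction of the database is actually needed locally.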

By the way, unless you really do need fast access to unpredictable queries, 
there are better designs than relational et al.  Purpose-built databases often 
outperform general purpose databases by a factor of a hundred in speed and in 
footprint.


   In evaluating the network option, I thought that the people here could
   possibly help me with the expected network latency for a Gb network via a
   switch. My gut feeling says that with increased load, the switch will
   bundle the traffic to the different nodes more and this will result in
   higher latency.
  
  Big switches can have transit latencies of a few tens of microseconds, but
   there is far more to it than that.  And if there is a choke point
   somewhere, the observed latency will vary wildly depending on perhaps
   unrelated traffic and loading, making it appear that the latency varies
   randomly.  The farther the commands and resulting data travel, the more
   vulnerable one is to these effects.
 
 These delays (even with local network) will make the solution impossible. 
 
 Sorry again for posing the questions here. I know this is a blatant off topic 
 post, but getting the details from the internet is a little more difficult 
 and there are so many people here with the knowledge at hand. Thanks for the 
 help.

You're welcome.

I gather you have decided that local databases are required.

Joe Gwinn



Re: [ntp:questions] Network latency questions

2010-05-27 Thread Joseph Gwinn
In article 201005271035.46352.fmgrotep...@yahoo.co.uk,
 Frans Grotepass fmgrotep...@yahoo.co.uk wrote:

 Hi all,
 
 Sorry for abusing my membership to this forum for this question.
 
 We are busy with building an embedded application that must retrieve data 
 very fast. 

Please define "very fast" in numbers.  For example, 95% of responses must be
fully received within 1,000 microseconds, and 100% within 10 milliseconds, or
the planet will explode.
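
A requirement stated this way can be checked directly against measured response times. The following is a minimal Python sketch, assuming latency samples in microseconds and a nearest-rank percentile; the function names and default limits are illustrative and simply mirror the example figures above.

```python
import math


def percentile(samples_us, pct):
    """Nearest-rank percentile of a list of latencies in microseconds."""
    ordered = sorted(samples_us)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_spec(samples_us, p95_limit_us=1_000, max_limit_us=10_000):
    """True if 95% of samples are within p95_limit_us and all are within max_limit_us."""
    return (percentile(samples_us, 95) <= p95_limit_us
            and max(samples_us) <= max_limit_us)
```

Note that the 100% clause is set by the single worst sample, so it is the tail of the distribution, not the average, that decides whether a remote design is viable.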

What does the embedded application do?


  The choice is to either have the data locally or go to a central 
 server(pool) that contains the data. 

Well, locally is always faster and more predictable than remotely, so why even 
consider remotely?


 In evaluating the network option, I thought that the people here could 
 possibly help me with the expected network latency for a Gb network via a 
 switch. My gut feeling says that with increased load, the switch will bundle 
 the traffic to the different nodes more and this will result in higher 
 latency. 

Big switches can have transit latencies of a few tens of microseconds, but
there is far more to it than that.  And if there is a choke point somewhere, the
observed latency will vary wildly depending on perhaps unrelated traffic and
loading, making it appear that the latency varies randomly.  The farther the
commands and resulting data travel, the more vulnerable one is to these effects.

Joe Gwinn



Re: [ntp:questions] Quick sync between two computers not connected to the internet

2010-03-24 Thread Joseph Gwinn
In article 7c492e13-48c4-4b06-84ea-81e4e6596...@mac.com,
 Chuck Swiger cswi...@mac.com wrote:
 
 In most cases, it is easier to solve the problem of sync'ing all computers to
 a correct timesource (and thus all be mutually in sync) than it is to set up
 a bunch of truly, completely isolated machines which happen to stay in sync.
 If I really had to solve the latter problem, I would likely connect the
 machines to a valid NTP timesource long enough to calibrate each machine's
 intrinsic drift from realtime, and then run time in standalone mode against
 their local clock.
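
The calibration step described above amounts to measuring how fast the local clock gains or loses against a reference. A rough Python sketch of the arithmetic, assuming two offset readings (local time minus reference time, in seconds) taken a known interval apart; both function names are made up for illustration.

```python
def drift_ppm(offset_start_s, offset_end_s, elapsed_s):
    """Frequency error of the local clock in parts per million.

    Offsets are (local time - reference time) in seconds, measured at the
    start and end of an interval of elapsed_s seconds of reference time.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (offset_end_s - offset_start_s) / elapsed_s * 1e6


def accumulated_error_s(ppm, freewheel_s):
    """Time error that builds up if a drift of ppm goes uncorrected."""
    return ppm * 1e-6 * freewheel_s
```

For scale, an uncorrected drift of 10 ppm accumulates to roughly 0.86 seconds per day, and the measured figure only holds while conditions such as temperature stay the same.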

How good does the timekeeping need to be?  Was the max error ever stated?

Anyway, what I have done in such situations is to anoint a freewheeling
ordinary workstation as the NTP Timeserver (trumpet fanfare please), and have
everybody else synch to it.  Synch to external time is by eyeball and
wristwatch.  This approach does keep them all together, but their sense of time
is a good indicator of the local temperature wherever that anointed workstation
is installed.

That said, it is admittedly an approach taken only in desperation, and
GPS-based NTP timeservers are cheap - just buy the box, install it on the
network, and point everybody to it.

Joe Gwinn



Re: [ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-19 Thread Joseph Gwinn
Dave,

In article 4ba2c1ff.3060...@udel.edu, David Mills mi...@udel.edu wrote:

 Joe,
 
 You and Dave are working way too hard. The bits and pieces are 
 documented on the ntpq page and on the Event Messages and Status Codes page.

This would be http://www.eecis.udel.edu/~mills/ntp/html/decode.html#peer, 
which I didn't know about, but is exactly what I seek.  And it wasn't a secret 
after all.

But I have a question, a homework example, and a suggestion.

First the question:  The Code field of the Peer Status Word is 4 bits wide, and
yet codes are defined for values from 1 to 10 hex (decimal 16), which doesn't
quite map.  How does the code value fit into the field?  Wraparound, so 10
(TAI) becomes zero?


The homework example:  The PSW word that started this exercise is 963a.  If I 
understand, this word decodes as follows:

Status field - host_reachable plus persistent_association

Select field - system_peer (gets the star)

Count field - 3

Code field - become system peer (assuming code values are truncated to 4 bits, 
so hex 10 becomes 0)  

And 9614 decodes to host_reachable plus persistent_association, system_peer 
(gets the star), count=1, and server_reachable.
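
The hand decoding above can be checked mechanically. What follows is a minimal Python sketch, assuming the field layout given in RFC 1305 section B.2.2 (5 status bits, 3 select bits, a 4-bit event count, and a 4-bit event code, most significant first). The function name and dictionary are illustrative, and the event names are copied from the PEVNT_ comments in ntp.h quoted elsewhere in this thread, not from any official decoder.

```python
# Event names as given in the PEVNT_ comments quoted in this thread.
EVENT_NAMES = {
    1: "mobilize", 2: "demobilize", 3: "unreachable", 4: "reachable",
    5: "restart", 6: "no reply", 7: "rate exceeded", 8: "access denied",
    9: "leap armed", 10: "sys peer", 11: "clock event", 12: "bad auth",
    13: "popcorn", 14: "interleave mode", 15: "interleave error",
}


def decode_peer_status(word):
    """Split a 16-bit peer status word into its four subfields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("peer status word must fit in 16 bits")
    code = word & 0xF
    return {
        "status": (word >> 11) & 0x1F,  # 5 bits of status flags
        "select": (word >> 8) & 0x7,    # 3 bits, selection state
        "count": (word >> 4) & 0xF,     # 4 bits, event counter
        "code": code,                   # 4 bits, most recent event
        "event": EVENT_NAMES.get(code, "unspecified"),
    }
```

With this layout, 963a gives status 0b10010, select 6, count 3, and code 10 (sys peer); 9614 gives the same status and select with count 1 and code 4 (reachable). Because the field is only 4 bits wide, a code of 16 would mask to 0 and could not be distinguished from a code of 0.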


And the suggestion:  I was misled by some of the NTPv4 documentation, 
specifically the NTPv4 peerstats file documentation in 
http://www.eecis.udel.edu/~mills/ntp/html/monopt.html.  

The note under the table defining peerstats record fields reads "The status
field is encoded in hex format as described in Appendix B of the NTP
specification RFC 1305."  This is no longer really true, as you discuss below.
In particular, codes exceeding 5 are not defined in 1305, and some of the
definitions appear to have changed (or at least have been clarified), so it
would be helpful to add a pointer to
http://www.eecis.udel.edu/~mills/ntp/html/decode.html#peer to monopt.html.


 RFC-1305  was written in 1992. It's been 18 years since then, so you 
 should expect changes from time to time. Changes are not done lightly; 
 they reflect updates in the algorithms and interpretation of the 
 statistics and state variables. If the interpretation  has not changed, 
 the name and code have not changed. If it has been changed or has become 
 obsolete, the name is not reused.

This is good.  There is far too much existing base to do it any other way.

Thanks,

Joe Gwinn



Re: [ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-19 Thread Joseph Gwinn
Dave,

In article 81ed5f77-97a2-474d-8c1a-346b2192c...@v34g2000prm.googlegroups.com,
 Dave Hart daveh...@gmail.com wrote:

 On Mar 18, 13:49 UTC, Joseph Gwinn joegw...@comcast.net wrote:
   Dave Hart daveh...@gmail.com wrote:
   If you want to be able to decode these bits for ntpd versions from
   before and after the change correctly, you need to query the version
   string of ntpd, sadly, such as with:
 
   ntpq -c "rv 0 version"
 
  So that's how you get the NTP version (rather than the ntpq version)!
 
  When our sysadmins first installed NTPv4, they used the version command of
  ntpq, which said 4.  Check!
 
  I came by a few days later to look at the purported NTPv4 loopstats and
  peerstats files, and (ever suspicious) checked to see what version of NTP
  had in fact generated them.  Still NTPv3.  The sysadmins had been snookered
  by ntpq, which failed to make unambiguous whose version it was reporting
  upon.
 
  This had also happened to me back in the days of NTPv3, but I was saved
  because I knew that 4 could not be the answer.  But I never did figure out
  how to get ntpq to tell me the version of the ntp daemon.
 
 C:\NTPb\bin>ntpq --version
 ntpq - standard NTP query program - Ver. 4.2.7p20
 
 C:\NTPb\bin>ntpq -c version
 ntpq 4.2.7...@1.2137-o Mar 18 15:04:17.18 (UTC-00:00) 2010  (4)
 
 C:\NTPb\bin>ntpq -c "rv 0 version"
 version=ntpd 4.2.7...@1.2137-o Mar 14 8:23:33.64 (UTC-00:00) 2010 (9)
 
 C:\NTPb\bin>
 
 The first two commands above are both reporting on the ntpq version,
 in slightly different form.  The third reports on the local ntpd
 version.  Tack on a hostname or IP address, and it'll tell you about a
 remote ntpd version, if you're allowed to use ntpq with the server in
 question.

This is a very useful summary.  I'll pass it on to the sysadmins.


  Is there available a written discussion of which changes were made and why?
  This could be worth reading.
 
 If there is, it would be in the archives of committers@, hackers@, or
 questions@lists.ntp.org (all browsable via http://lists.ntp.org/) from
 around May 13, 2008.  I was not active on the lists at that time.

I'll poke around.  It will no doubt help in understanding the genesis of the
status codes and their descriptions in
http://www.eecis.udel.edu/~mills/ntp/html/decode.html#peer.

 
  Looking at the code you suggested, I also see that the variable names are
  the same as in NTPv3 (and the names imply the original NTPv3 meanings), but
  the new NTPv4 comments on those variables seem to contradict the meanings
  implied by the names.  Not knowing the history makes it difficult to figure
  out just what is now meant.
 
 I believe the 2008 changes were part of overall cleanup to bring the
 reference implementation in-line with the draft NTP v4 specification.
 The RFC form of that document has just been approved by the IESG and
 should be a proposed standard RFC before too many more weeks.
 Please refer to that document in your search for meaning:
 
 ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-ietf-ntp-ntpv4-proto-13.txt
 
 Which is derived from the less ASCII-hamstrung:
 
 http://www.eecis.udel.edu/~mills/database/reports/ntp4/ntp4.pdf

I have read ntp4.pdf (dated June 2006), and it says nothing of status codes and
the like.  I assume that this is intentional, and that one is expected to
consult the online documentation for such information.  Perhaps the text ties
to the codes in decode.html#peer.


Thanks,

Joe


Re: [ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-19 Thread Joseph Gwinn
In article slrnhq850s.183.un...@wormhole.physics.ubc.ca,
 unruh un...@wormhole.physics.ubc.ca wrote:

 On 2010-03-19, David Mills mi...@udel.edu wrote:
  Joe,
 
  That's a typo; event 16 does not exist. Glad you caught that.
 
 Pretty elaborate typo. Did they mean to give it a number other than 16,
 or were 50 letters somehow mistyped?

Ahh, be nice.  We all know perfectly well how such things happen.


Joe Gwinn



Re: [ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-18 Thread Joseph Gwinn
In article 46f5ae0a-93d6-44ea-812f-e4da2ae2c...@a16g2000pre.googlegroups.com,
 Dave Hart daveh...@gmail.com wrote:

 There were backward-incompatible changes on May 13, 2008 for ntp-dev
 4.2.5p114:
 
 http://ntp.bkbits.net:8080/ntp-dev/?PAGE=csetREV=48295cccnu3e5cmGhOzAS7hA-pVG3A
 
 Once again statestr.c is your friend:
 
 http://ntp.bkbits.net:8080/ntp-dev/libntp/statestr.c?PAGE=diffsREV=48295137L4-SOuAy6YZauDbZtW6DRg
 
 If you want to be able to decode these bits for ntpd versions from
 before and after the change correctly, you need to query the version
 string of ntpd, sadly, such as with:
 
 ntpq -c "rv 0 version"

So that's how you get the NTP version (rather than the ntpq version)!

When our sysadmins first installed NTPv4, they used the version command of
ntpq, which said 4.  Check!

I came by a few days later to look at the purported NTPv4 loopstats and
peerstats files, and (ever suspicious) checked to see what version of NTP had
in fact generated them.  Still NTPv3.  The sysadmins had been snookered by ntpq,
which failed to make unambiguous whose version it was reporting upon.

This had also happened to me back in the days of NTPv3, but I was saved because
I knew that 4 could not be the answer.  But I never did figure out how to get
ntpq to tell me the version of the ntp daemon.


 and then parse for 4.2.5p114 or later.  The format for the version
 string can include an optional -RC# suffix, and before long, there may
 be releases with a -beta# suffix in the -stable branch, such as
 4.2.6p2-beta1 as a prelude to 4.2.6p2-RC1.
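
The parsing step mentioned above might look like the following Python sketch. It assumes the version text has the general shape shown in this thread (major.minor.point, an optional pNNN patch level, and an optional -RC# or -beta# suffix); the function names and sample strings are hypothetical, and the suffix is ignored when comparing against 4.2.5p114.

```python
import re

# major.minor.point, optional pNNN patch level, optional -RC# or -beta# suffix
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:p(\d+))?(?:-(RC|beta)(\d+))?")

STATUS_CHANGE = (4, 2, 5, 114)  # 4.2.5p114, per the message above


def parse_ntpd_version(text):
    """Return (major, minor, point, patch) from an ntpd version string."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    major, minor, point, patch = (int(g) if g else 0
                                  for g in match.groups()[:4])
    return (major, minor, point, patch)


def uses_new_status_codes(text):
    """True if the version is 4.2.5p114 or later (suffix ignored)."""
    return parse_ntpd_version(text) >= STATUS_CHANGE
```

Comparing tuples keeps the ordering numeric, so p114 sorts after p20, which a plain string comparison would get wrong.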

Still evolving, rapidly.  OK.  I will have to find out exactly which version I
have.  I have no need to decode status from prior versions.  I need only to
understand the status codes from what I am running, to understand what is and
is not working in my system.  Fixes have included giving NTP and related traffic
its own dedicated LAN and LAN ports on the hosts, to reduce buffeting of NTP
packets and/or the daemon by unrelated but heavy packet traffic.  The buffeting
causes what appear to be large, random, and often asymmetric transport delays.

Is there available a written discussion of which changes were made and why?  
This could be worth reading.

More generally, these backward-incompatible changes will cause great confusion
and difficulty in transitioning to NTPv4 unless ntpq is kept up to date, and
the descriptions of what the various status codes mean are both complete and
correct - telegraphic summaries are not usually enough for non-developers to
understand.

Looking at the code you suggested, I also see that the variable names are the
same as in NTPv3 (and the names imply the original NTPv3 meanings), but the new
NTPv4 comments on those variables seem to contradict the meanings implied by
the names.  Not knowing the history makes it difficult to figure out just what
is now meant.

Thanks,

Joe Gwinn



Re: [ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-17 Thread Joseph Gwinn
In article joegwinn-1c741d.09115717032...@news.giganews.com,
 Joseph Gwinn joegw...@comcast.net wrote:

 Dave,
 
 In article 
 8c5b8d60-8780-4bf0-80da-6b1d19410...@k24g2000pro.googlegroups.com,
  Dave Hart daveh...@gmail.com wrote:
 
  On Mar 17, 03:30 UTC, Joseph Gwinn wrote:
   Looking in section B.2.2 of RFC 1305 yields that the Peer Status Field
   has four subfields, the last (rightmost) one of which being the 4-bit Peer
   Event Code (page 57), which is defined for values between 0 and 5, and is
   reserved for values 6 to 15.
  
   Well, I have been seeing two values of Peer Status, 9614 and 963a, both
   hexadecimal.  I understand 9614, but 963A is a mystery, as it implies a
   Peer Event Code of 10 (the A in the rightmost digit), which is undefined
   and reserved in RFC 1305.
  
  Scan for PEVNT_ in ntp.h:
  
  http://ntp.bkbits.net:8080/ntp-stable/include/ntp.h?PAGE=annoREV=4af5f8cfDBBhNWjyJ4XiD74vlioxeg
  
  #define PEVNT_MOBIL     (1 | PEER_EVENT)  /* mobilize */
  #define PEVNT_DEMOBIL   (2 | PEER_EVENT)  /* demobilize */
  #define PEVNT_UNREACH   (3 | PEER_EVENT)  /* unreachable */
  #define PEVNT_REACH     (4 | PEER_EVENT)  /* reachable */
  #define PEVNT_RESTART   (5 | PEER_EVENT)  /* restart */
  #define PEVNT_REPLY     (6 | PEER_EVENT)  /* no reply */
  #define PEVNT_RATE      (7 | PEER_EVENT)  /* rate exceeded */
  #define PEVNT_DENY      (8 | PEER_EVENT)  /* access denied */
  #define PEVNT_ARMED     (9 | PEER_EVENT)  /* leap armed */
  #define PEVNT_NEWPEER   (10 | PEER_EVENT) /* sys peer */
  #define PEVNT_CLOCK     (11 | PEER_EVENT) /* clock event */
  #define PEVNT_AUTH      (12 | PEER_EVENT) /* bad auth */
  #define PEVNT_POPCORN   (13 | PEER_EVENT) /* popcorn */
  #define PEVNT_XLEAVE    (14 | PEER_EVENT) /* interleave mode */
  #define PEVNT_XERR      (15 | PEER_EVENT) /* interleave error */
  #define PEVNT_TAI       (16 | PEER_EVENT) /* TAI */
 
 This almost answers the immediate question, but what exactly is a sys peer 
 event?
 
 More generally, the comments on many event codes lack verbs, defeating the 
 reader.  Perhaps these codes are expanded in ntpq; I'll look ...
 
  
  To match the literal text output by ntpd/ntpq when decoding, see also
  libntp/statestr.c:
  
  http://ntp.bkbits.net:8080/ntp-stable/libntp/statestr.c?PAGE=annoREV=4ac6e036jH41_maMfVXyf2VeiFknzQ
 
 ... by following this breadcrumb trail.
 
 
  There may be an easier way, but looking at the source comes naturally
  to me.
 
 There isn't an easier way for most people until the NTPv4 documentation is 
 updated, which is essential.  Right now, the NTPv4 documentation points users 
 and sysadmins to an authoritative but incomplete answer.
 
 I knew that the answer had to be in the ~70,000 lines of NTP source code, but
 wouldn't really know which rock to look under.  Very few people have the time
 to know this much source code well enough to find the correct answer, and to
 know that the found answer is in fact correct.  Which is why the NTPv4
 documentation needs to be revised to reflect the as-built NTPv4 code.  NTP
 has many millions of users, but at most a few hundred developers (where a
 developer is someone who knows his way around the source code).

Well, I started writing the decoder for NTPv4 Peer Status, and soon fetched up
on the rocks.  It appears from the names that the definitions of the status
bits have changed.  I recall someone saying on comp.protocols.time.ntp that
this was the case, and now I see what looks like confirmation.  Things have
changed, and yet one can convince oneself that these old and new variables are
actually the same.  Maybe it's really true.

Another issue is that the other fields of the Peer Status Word may have
changed.  Which C structures correspond to which tables in the online
documentation?  I will need enough structure that I can hand-decode an
arbitrarily built but compliant Peer Status Word.
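
For the event-code part of that hand decoding, the comments in the ntp.h
excerpt quoted above can be turned into a lookup table.  The Python sketch
below is only that, a sketch: the names are copied from the excerpt's
comments, the function and table names are made up for illustration, and it
assumes the event code still sits in the rightmost four bits of the word, as
in RFC 1305.  PEVNT_TAI (16) is left out because 16 does not fit in a
four-bit field.

```python
# Event names copied from the comments in the ntp.h excerpt quoted above.
# The table and function names are hypothetical, not part of NTP.
NTPV4_PEER_EVENTS = {
    1: "mobilize",
    2: "demobilize",
    3: "unreachable",
    4: "reachable",
    5: "restart",
    6: "no reply",
    7: "rate exceeded",
    8: "access denied",
    9: "leap armed",
    10: "sys peer",
    11: "clock event",
    12: "bad auth",
    13: "popcorn",
    14: "interleave mode",
    15: "interleave error",
}


def peer_event_name(status_word: int) -> str:
    """Name the event held in the low four bits of a peer status word.

    ASSUMPTION: the event code occupies the rightmost four bits,
    as it does in RFC 1305.
    """
    code = status_word & 0xF
    return NTPV4_PEER_EVENTS.get(code, f"unknown ({code})")
```

Under those assumptions, peer_event_name(0x963A) gives "sys peer" and
peer_event_name(0x9614) gives "reachable".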


Joe Gwinn

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions

[ntp:questions] NTPv4 Peer Event Codes - secret decoder ring sought

2010-03-16 Thread Joseph Gwinn
Well, we just brought NTPv4 up on some IBM AIX 5.3 machines.  Had to compile 
from source code on the target machines to get a daemon that didn't crash upon 
launch.  Anyway, the daemon appears to be happily working, and is happily 
generating loopstats and peerstats files.  So far so good.

The peerstats files contain a Peer Status Word field.  NTPv4 is documented in 
http://www.eecis.udel.edu/~mills/ntp/html/index.html, and NTPv4 peerstats 
files are documented in 
http://www.eecis.udel.edu/~mills/ntp/html/monopt.html.  
The note under the table defining peerstats record fields reads "The status
field is encoded in hex format as described in Appendix B of the NTP
specification RFC 1305."  (The draft RFC for NTPv4 is innocent of all such
status information.)

Looking in section B.2.2 of RFC 1305 shows that the Peer Status Field has four
subfields, the last (rightmost) of which is the 4-bit Peer Event Code
(page 57), which is defined for values 0 to 5 and reserved for
values 6 to 15.


Well, I have been seeing two values of Peer Status, 9614 and 963A, both
hexadecimal.  I understand 9614, but 963A is a mystery, as it implies a Peer
Event Code of 10 (the A in the rightmost digit), which is undefined and
reserved in RFC 1305.
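
A minimal Python sketch of this hand decoding may make the arithmetic
concrete.  The field widths used here (5-bit peer status, 3-bit selection,
4-bit event counter, 4-bit event code, left to right) are the usual reading of
RFC 1305 Appendix B.2.2 and should be checked against the RFC; the names
PeerStatus, decode_peer_status and is_reserved_in_rfc1305 are hypothetical.

```python
from typing import NamedTuple


class PeerStatus(NamedTuple):
    status: int  # 5-bit peer status flags (leftmost field)
    select: int  # 3-bit peer selection
    count: int   # 4-bit peer event counter
    code: int    # 4-bit peer event code (rightmost field)


def decode_peer_status(hex_word: str) -> PeerStatus:
    """Split a 16-bit status word, given as hex text, into four fields.

    ASSUMPTION: field widths 5/3/4/4 as in RFC 1305 Appendix B.2.2.
    """
    word = int(hex_word, 16)
    return PeerStatus(
        status=(word >> 11) & 0x1F,
        select=(word >> 8) & 0x07,
        count=(word >> 4) & 0x0F,
        code=word & 0x0F,
    )


def is_reserved_in_rfc1305(code: int) -> bool:
    """RFC 1305 defines event codes 0 to 5; 6 to 15 are reserved."""
    return code > 5
```

With this layout, 9614 and 963A differ only in the event counter (1 versus 3)
and the event code (4 versus 10); the status and selection fields are the
same in both.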

I would guess that NTPv4 has used some of the codes that were held in reserve
in RFC 1305, but where are these new codes formally defined?  This is most
likely a general question, and I would hazard that this isn't the only place
where NTPv4 has outrun its documentation.

Thanks,

Joe Gwinn



Re: [ntp:questions] National time standard differences

2010-02-10 Thread Joseph Gwinn
In article c2bcn.37983$ym4.23...@text.news.virginmedia.com,
 David J Taylor 
 david-tay...@blueyonder.delete-this-bit.and-this-part.co.uk.invalid 
 wrote:

  I've set up an NTP server in the South East Asia region, synchronising
  with regional NTP servers as well as a couple of servers I am
  responsible for in the US.
 
  remote         st t when poll reach    delay   offset  jitter
  =============================================================
  +Japan          1 u  454 1024   377   87.402   -0.560   0.056
  *Japan          1 u  476 1024   377   87.277   -0.810   1.542
  -NorthAmerica   2 u  427 1024   377  285.387  -17.741   5.084
  -NorthAmerica   2 u  429 1024   377  307.208  -17.061   0.083
 
  Is the above delta seen between Japanese and North American time
  sources purely the delta between the outbound / return network path or
  something else?
 
  Chris
 
 Chris,
 
 Most likely asymmetrical paths, yes.  

The wide area networks used for international connections are typically 
SONET rings, which are inherently asymmetrical.  

As one works around a ring, the poll response time is constant (being 
the perimeter of the ring), while the out and back times vary with 
position on the ring with respect to the server one is polling.
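
To see how an asymmetric path turns into an apparent offset, here is a small
Python sketch using the standard NTP on-wire formulas for offset and delay.
The function names and the example delays are illustrative assumptions, and
server processing time is taken as zero for simplicity.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def apparent_offset(outbound_ms, return_ms, true_offset_ms=0.0):
    """Offset and delay a client would compute over an asymmetric path.

    ASSUMPTION: zero server processing time, so t3 == t2.
    """
    t1 = 0.0
    t2 = t1 + outbound_ms + true_offset_ms  # stamped by the server clock
    t3 = t2
    t4 = t1 + outbound_ms + return_ms       # stamped by the client clock
    return ntp_offset_and_delay(t1, t2, t3, t4)
```

With perfectly synchronized clocks, apparent_offset(125.0, 160.0) returns an
offset of -17.5 ms and a delay of 285.0 ms, numbers of the same order as the
NorthAmerica rows above.  The apparent offset is half the difference between
the outbound and return delays, and NTP cannot tell it apart from a real
clock error.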

Joe Gwinn



Re: [ntp:questions] 500ppm - is it too small?

2009-11-13 Thread Joseph Gwinn
In article lak0t6-g6n@klein-habertwedt.de,
 Uwe Klein uwe_klein_habertw...@t-online.de wrote:

 Joseph Gwinn wrote:
  The prototypical example of an orthogonal instruction set was the 
  PDP-11.  The Motorola 68000 family was an outgrowth.
 
 68k   - CISC and still very much alive
 88k   - RISC drifting belly up in the pond.
 
 in ~1990 Motorola wanted to push RISC 88k so much
 you could buy a MVME187 plus SysVR3 unix System
 for half the price of a MVME167. Afaik it was a flop.

I remember the 88K.  Sort of.


 PowerPC later took off in an acceptable way.

Yes.  IBM uses the instruction set to this day in their large servers.

 
 Another side is a bit more interesting:
 Motorola had a perfect design for the 68k
 processor development path at the time
 they released the initial MC68000.
 Things just worked.

Yes.  I did a fair bit of 680x0 programming over the years.  The 680x0 
still lives, having become a tiny little embedded computer chip for use 
in appliance controllers and the like.

Joe Gwinn



Re: [ntp:questions] 500ppm - is it too small?

2009-11-12 Thread Joseph Gwinn
In article 87r5s3syxz@pc9454.klinik.uni-regensburg.de,
 Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:

 nemo_outis a...@xyz.com writes:
 
  Richard B. Gilbert rgilber...@comcast.net wrote in
  news:tlsdnq2e26bblbnxnz2dnuvz_sydn...@giganews.com: 
 
  nemo_outis wrote:
  ...
  I fail to see the value or relevance of 500ppm satisfies 98% of
  computer clocks if some other number, perhaps 5000 ppm, could
  satisfy yet even more than 98% of computer clocks with no downside -
  as indeed seems to be the case!  Chrony, whatever its other merits
  and demerits, is an existence proof for this proposition.
 
 
  I can't follow Dave's math but I'm reasonably sure that there is a
  good reason for the 500 PPM limit.  Since almost all computer clocks
  can meet this criterion I'm not going to worry about it.
 
  Hmm, faith-based ntp?  Not for me.  If there is a good reason I'd 
  like to hear it - 500 ppm has the smell of arbitrariness about it.
 
 As arbitrary as there are 8 bits in a byte.

No, 8 bits isn't arbitrary.  

Computer hardware is simplified if the various word lengths are all 
powers of two.

Eight bits was the smallest power-of-two size that allowed the full  
Roman alphabet including punctuation and control characters to be coded.
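
The arithmetic behind that claim can be sketched in a few lines of Python.
The function name is hypothetical, and the 128-symbol figure is simply the
size of the seven-bit ASCII set.

```python
def smallest_power_of_two_width(symbols: int) -> int:
    """Smallest power-of-two bit width that can encode `symbols` codes."""
    # Bits strictly needed to number the symbols 0 .. symbols-1.
    bits_needed = max(1, (symbols - 1).bit_length())
    width = 1
    while width < bits_needed:
        width *= 2
    return width
```

A 128-character set needs 7 bits, and the next power of two is 8, so
smallest_power_of_two_width(128) is 8.  A 16-symbol set would fit in 4 bits,
which is far too small for letters, digits, punctuation and controls.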

There are 5, 6, and 7 bit codes, all now obsolete:

Five-bit: Baudot, used in teletypes.

Six-bit:  Fieldata (Univac and Control Data, and others I assume.)

Seven-bit:  ASCII without parity bit.

Eight bit:  ASCII with parity bit, and EBCDIC 
(http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/xlf/html/lr425.HTM)

ASCII came from AT&T, while EBCDIC came from IBM.


And now sixteen bits and more: Unicode.
(http://unicode.org/standard/WhatIsUnicode.html)


Joe Gwinn



Re: [ntp:questions] .1 Microsecond Synchronization

2009-06-06 Thread Joseph Gwinn
In article 2pmdnygdgrswzbrxnz2dnuvz_uidn...@giganews.com,
 ScottyG sco...@pepex.net wrote:

 Hello.
 
 The company I am working for needs to be able to record timestamps in a
 trading system's logs to .1 microsecond accuracy.
 
 We will have servers located in London, New York and Chicago. There will be a 
 dedicated resilient link between London and New York.
 
 Searches on the web have made claims that NTP can achieve this accuracy.
 Unfortunately the sales rep for the NTP server we looked at told me that the
 best I could expect is 2-5 ms synchronization across servers.
 
 Has anyone had any experience doing this? Can anyone suggest how to achieve 
 this accuracy?
 
 We do have some budget for this, but if I need to spend a whole lot on it I
 need to get in front of my management with the reasons.
 
 Thank you in advance for any help or suggestions you can give me.

It's not obvious that what the company wants to do is even physically 
possible.  

As many others have mentioned, getting 100 nanosecond synch everywhere 
in a network 10,000 km in diameter isn't easy, and isn't possible 
without expensive specialized hardware.  
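
To put rough numbers on the scale of the problem, the Python sketch below
compares 100 nanoseconds with signal travel times.  The function names are
illustrative, and the 0.67 velocity factor for optical fibre is an assumed
typical value, not a figure from this thread.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
FIBER_VELOCITY_FACTOR = 0.67  # ASSUMPTION: typical for optical fibre


def one_way_delay_s(distance_m, velocity_factor=1.0):
    """Time for a signal to cover distance_m at a fraction of c."""
    return distance_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)


def distance_for_time_m(seconds, velocity_factor=1.0):
    """Distance a signal covers in the given time at a fraction of c."""
    return seconds * SPEED_OF_LIGHT_M_PER_S * velocity_factor
```

In vacuum, 100 ns corresponds to about 30 m of travel, while crossing
10,000 km takes about 33 ms, so the target accuracy is several hundred
thousand times smaller than the one-way delay it must be measured through.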

However, even with the special hardware, there are very deep problems 
with the whole idea.  

The classic reference is:

Time, clocks, and the ordering of events in a distributed system, 
Leslie Lamport, Communications of the ACM 21,7 (July 1978) pages 558-565.

Paper 26 in 
http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html.


Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-27 Thread Joseph Gwinn
In article ywn94oxs3u9t@ntp1.isc.org,
 Harlan Stenn st...@ntp.org wrote:

  In article joegwinn-9c92e9.21153416032...@news.giganews.com, Joseph 
  Gwinn joegw...@comcast.net writes:
 
 Joseph In article ywn9tz5tflz6@ntp1.isc.org,
 Joseph  Harlan Stenn st...@ntp.org wrote:
 
   In article joegwinn-6fd03a.17481615032...@news.giganews.com, Joseph
   Gwinn joegw...@comcast.net writes:
  
   I think you are talking about one of my pet peeves:
   
   http://support.ntp.org/bin/view/Dev/NtpVariablesAndNtpq
 
 Joseph I don't think that I have inconsistent versions of ntpd and ntpq,
 Joseph because both came off the same CD from Sun Microsystems.
 
   It's still the same beast.  The bottom line is we currently have opaque
  data being presented to the user, and that is either being offered
  directly to the user (in your case) or is being potentially mis-converted
  by ntpq.
 
 Joseph I have a lot of trouble believing that Sun put inconsistent versions
 Joseph on their Solaris install CDs.
 
 I was not talking about the inconsistent version problem.  I'm talking about
 opaque data.
 
 Joseph Nor am I using NTPQ for decoding.  I decode these codes myself,
 Joseph following Appendix B of RFC-1305.  It turns out that NTPv4 uses the
 Joseph same definitions.  See
 Joseph http://www.eecis.udel.edu/~mills/ntp/html/monopt.html.
 
 Then I may have misunderstood.
 
 My point is that while it's fine for ntpd to send encoded data, I think we
 need to have a way for that data to *also* be sent decoded, or provide
 enough information so programs like ntpq can decode the result, regardless
 of which version of ntpd they are talking to.

It isn't quite bulletproof, but my decoder code also tells NTPv3 and 
NTPv4 loopstats and peerstats records apart, keyed on (loopstats?) 
record length.  
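
A sketch of what keying on record length could look like, in Python rather
than Mathematica.  This is not the decoder described above.  The field counts
in the table are ASSUMPTIONS recalled from typical stats files and must be
checked against real records from each ntpd version; the function and table
names are hypothetical.

```python
# ASSUMPTION: whitespace-separated field counts per record type and version.
# Verify these against your own loopstats and peerstats files.
FIELD_COUNTS = {
    ("loopstats", 5): "NTPv3",
    ("loopstats", 7): "NTPv4",
    ("peerstats", 7): "NTPv3",
    ("peerstats", 8): "NTPv4",
}


def guess_ntp_version(kind: str, record: str) -> str:
    """Guess the NTP version from the number of fields in one record.

    kind is "loopstats" or "peerstats"; record is one line of the file.
    Returns "unknown" when the field count is not in the table.
    """
    field_count = len(record.split())
    return FIELD_COUNTS.get((kind, field_count), "unknown")
```

As noted, this is not bulletproof: it looks only at the number of fields, so
a truncated line or a future format with the same count would be
misclassified.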

One can only hope that NTPv5 et seq. will ensure that an
unsupervised and simple decoder program is able to tell record formats
from various NTP versions apart.

As for a decoded format that is pure ASCII, that would certainly be the
Unix way, and would impose negligible load on all but the most skeletal
of embedded systems.

Joe Gwinn



[ntp:questions] NTP Support (Was 'What does Max Distance Exceeded...')

2009-03-16 Thread Joseph Gwinn
We have moved from the meaning of status code 9514 to the more general 
issue of how NTP shall be supported, so I've collected the relevant 
threads below.

===

 At 11:19 PM -0400 3/15/09, Danny Mayer wrote:
 Joseph Gwinn wrote:
 
  The FAQ has to be the place for such explanations.
  I'm not sure if this qualifies as an FAQ as I don't recall that it has
  come up before.  FAQ stands for Frequently Asked Questions.
 
  RAQ then?  Rarely Asked Questions
 
  Seriously, I can't believe that I'm the only person in history to be
  perplexed by these status codes, and those little three-word summaries
  are a bit telegraphic.
 
  Joe Gwinn
 
 
 You aren't the only one. These questions have been asked before by a
 number of people. In fact I had to look at this at one point when I was
 getting these codes. Of course I just looked at the source code and
 never looked for documentation.
 
 I will tell you that this is a combination of bits so it's not just a
 number. Each bit represents a test code that failed so you have quite a
 bit to look at.

I do know how the status code is structured, and wrote a Mathematica 
program to automate the decoding. (I use Mathematica to generate the 
co-plots of loopstats and peerstats data, collect statistics, et al.) 
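
For readers without Mathematica, the selection part of such a decoder can be
sketched in Python.  The table is transcribed from memory of RFC 1305
Appendix B.2.2 and should be verified against the RFC (and, as discussed
here, may not match what NTPv4 actually emits); the 3-bit field position is
likewise an assumption taken from the RFC, and the names are hypothetical.

```python
# ASSUMPTION: peer selection values as listed in RFC 1305 Appendix B.2.2.
RFC1305_PEER_SELECTION = {
    0: "rejected",
    1: "passed sanity checks",
    2: "passed correctness checks",
    3: "passed candidate checks",
    4: "passed outlyer checks",
    5: "current synchronization source; max distance exceeded",
    6: "current synchronization source; max distance okay",
    7: "reserved",
}


def peer_selection(status_word: int) -> int:
    """Extract the 3-bit selection field (bits 8 to 10, counting from 0)."""
    return (status_word >> 8) & 0x7
```

On this reading, peer_selection(0x9514) is 5, the "max distance exceeded"
entry, which is where the summary text for status code 9514 comes from.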

What I didn't know was that the definitions of the code bits had changed 
between v3 and v4.  I'll have to dig into the old documentation and see 
if this code was affected.

There is little chance that I will have the time to read enough NTP 
source code to make sense of it, sufficient to be able to come to 
reliable conclusions.   I'm a system engineer, and time is one issue of 
many in a system.

More generally, it's hopeless to expect the world's sysadmins to read 
NTP code (or any other kind of code).  They just don't have the time, 
and are responsible for far too many different kinds of box for it to be 
practical.  But a major part of making something reliable in practice is 
making it possible for a harried sysadmin to nonetheless get it right.  
(I'm not a sysadmin, but work with many sysadmins.  They spend lots of 
time fighting fires, and are of necessity jacks of all trades, masters 
of none.)


Silently mutating code definitions sounds like a blunder to me.  NTP is 
used on tens to hundreds of millions of computers worldwide.  There will 
never be a pure v4 world.  In fact there will still be v3 around when
v5 is being introduced.  So, if new kinds of status are needed, invent
new codes to suit, but do not change the meanings of the codes that are 
already widely used.  In other words, do not undermine your existing 
base.

The Internet folk had the same issue with IPv6, and they concluded that 
IPv4 was too deeply embedded to ever eliminate, and that there was never 
going to be a flag day when a worldwide changeover would happen.  
Thus, IPv4 and IPv6 had to coexist and interoperate forever, and so IPv6 
was designed to support this.

==

 To: ma...@ntp.org
 From: Joe Gwinn joegw...@comcast.net
 Subject: Re: [ntp:questions] What exactly does Maximum Distance Exceded  
 mean?
 Cc: questions@lists.ntp.org
 Bcc: gw...@raytheon.com
 X-Attachments: 
 
 Status code values fixed.
 
 At 10:47 PM -0400 3/15/09, Danny Mayer wrote:
 Joseph Gwinn wrote:
  Hmm.  OK, but I think that we've kind of run off the rails.  Let me
  summarize: 
 
  1.  Sun Microsystems' current behavior is not the issue, as I'm loading
  old software from an old CD onto old computer hardware, hardware that
  cannot support a newer version of Solaris than v9. 
 
  One of these old Solaris boxes did work with NTPv3 running an even older
  version of Solaris, with no 9514 codes, deepening the mystery.
 
 
 The trouble here is that those codes are *very likely* to have
 changed between V3 and V4 since there was a large rewrite between the
 two. That's why looking at the source code is necessary to get you the
 help you need.

As discussed in my other reply, mutating codes is a blunder.   It's a 
good-news bad-news thing.  The good news is that NTP has succeeded on an 
unimagined scale.  The bad news is that because of that scale, one must 
be *very* respectful of NTP's existing base, and it *can* be 
constraining.


  The fact that this obsolete system can most likely support NTPv4 is
  worth investigation, though.
 
  2.  I think that what's happening is that I'm doing something dumb, and
  I bet that there is no real difference in how NTPv3 or NTPv4 would react
  to this faux pas, whatever it turns out to be.  Nor is source code
  research needed or requested. 
 
  3.  The original question was how to interpret a specific status code,
  9514.  I read the explanation in the documentation, but became no wiser
  for it.  Thus my question. 
 
 Which is why you need to look at the source code. Documentation isn't
 always clear or definitive but the source code will tell you

Re: [ntp:questions] NTP Support (Was 'What does Max Distance Exceeded...')

2009-03-16 Thread Joseph Gwinn
In article 49becf09$0$507$5a6ae...@news.aaisp.net.uk,
 David Woolley da...@ex.djwhome.demon.co.uk.invalid wrote:

 Joseph Gwinn wrote:
  We have moved from the meaning of status code 9514 to the more general 
 
 But you should have kept the thread, even if the subject changed.

Opinion varies on this, but why?  It really was a case of topic drift.

 
  issue of how NTP shall be supported, so I've collected the relevant 
  threads below.
 
  
  More generally, it's hopeless to expect the world's sysadmins to read 
  NTP code (or any other kind of code).  They just don't have the time, 
 
 Generally, you only need to read a small bit of code to answer this sort 
 of question, but if you haven't got the time you should pay someone who 
 does have the time.

It's simply not going to happen, especially for random small questions.  
They will just muddle through, and blame NTP.

Hiring outside help isn't just a money problem, it also requires much 
jumping through bureaucratic hoops, which is very time consuming, so it 
never makes sense for non-major issues.

 
 Historically, open source software was written for use by people who had 
 the ability to support it themselves.  Recently, the relationship has 
 become asymmetric with a lot of people wanting free software and free 
 support.  Whilst some open source software developers may consider it a 
 valuable loss leader to produce a naive user product and support it, may 
 even consider it part of their mission, most open source developers are 
 not that interested in donating that level of free support.

What price success?  Most open-source software would be happy to achieve 
1% of what NTP has achieved.  Only Linux is even in the running, but NTP 
far exceeds Linux.  Given the present scale, for which NTP and its 
community were never designed, what to do?  The single most effective 
thing the community can do is to write good documentation.  Yes, it's 
work, but it's by far the most effective thing one can do, and it's 
really the only practical approach given the immense size of the NTP 
user base.

By the way, when I started digging into the archived NTPv3 
documentation, it said that the peerstats status codes were defined in 
Appendix B of RFC-1305.  OK, I actually knew that.  Then, I started
looking for the corresponding NTPv4 documentation.  The NTPv4 RFC-to-be
is innocent of the word "peerstats" and its status field, but the
current online documentation (which is for NTPv4) at 
http://www.eecis.udel.edu/~mills/ntp/html/monopt.html also points to 
the same place in RFC-1305, so it appears that the status code 
definitions did not change.  Unless the documentation is in error.  

If so, the simplest global fix is to update the documentation.  As I 
said in the prior posting, it's a non-starter to try to require a 
hundred million people to either read the source code or pay someone to 
do it for them every time NTP throws an uncommon status code.  It just 
won't happen.  On scale alone, the NTP community would be overwhelmed 
with repeated trivial questions.

I'm reminded of the librarians in the town library when I was in high 
school.  There were lots of hand-written cards in the card file, placed 
there by the librarians when they got tired of hearing the same simple 
or silly question.


 Actually a lot of commercial software, these days, is dumbed down, 
 supported, open source material.

True.  But it can be amusing to figure out which one it is, and start to 
ask uncomfortable questions about how can they be better than free, 
especially if they reduced the usefulness of the code.

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 20090316063319.f2...@bulldog.localhost.org,
 hun...@comcast.net (Rob Neal) wrote:

 On Sun, 15 Mar 2009, Joseph Gwinn wrote:
 
  In article qo6dnzljdv8z4cdunz2dnuvz_hwwn...@giganews.com,
  Richard B. Gilbert rgilber...@comcast.net wrote:
 
[snip]
 
  The FAQ has to be the place for such explanations.
 
  I'm not sure if this qualifies as an FAQ as I don't recall that it has
  come up before.  FAQ stands for Frequently Asked Questions.
 
  RAQ then?  Rarely Asked Questions
 
  Seriously, I can't believe that I'm the only person in history to be
  perplexed by these status codes, and those little three-word summaries
  are a bit telegraphic.
   You have lots of company, sadly. A decoder function would
   be 'Good'. There are obstacles to this, as remarked by
   others, and a serious lack of volunteer time to code a
   solution. No one will stop you, if you wish to contribute.
   In the current build I find the TEST status codes in
   ntp.h. They have changed, from release to release, so
   consult your source for particulars.
   There has been considerable improvement in NTP from
   V3 to V4. You should consider upgrading, it really
   is better.

I already have a decoder function, coded in Mathematica, based on 
Appendix B of RFC-1305, which ought to work for NTPv3.  The problem is 
one of documentation, as RFC-1305 is pretty telegraphic, although it 
does point one to the descriptions of the relevant tests.

I gather that the actual NTPv3 code does not really follow RFC-1305.  
That seems to be the bottom line.

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 1237210896.237...@news1nwk,
 Brian Utterback brian.utterb...@sun.com wrote:

 Joseph Gwinn wrote:
  Also good to know, so I'll know better than to use the Sun compiler (not 
  that it's bad, but that it isn't what's been worked through with NTP).
 
 Not to worry, I make sure that the current -dev branch will always 
 build with the Sun Studio compilers.

Ahh.  Good to know. 

Thanks,

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article ywn9y6v5fm70@ntp1.isc.org,
 Harlan Stenn st...@ntp.org wrote:

  In article joegwinn-4a6c5f.16502115032...@news.giganews.com, Joseph 
  Gwinn joegw...@comcast.net writes:
 
 Joseph What AIX version and Technology Level (~=patch level) have been used
 Joseph to build NTPv4?
 
 powerpc-ibm-aix4.3.3.0
 powerpc-ibm-aix5.1.0.0
 powerpc-ibm-aix5.2.0.0
 powerpc-ibm-aix5.3.0.0

Thanks.  We are using AIX 5.3 TL5 and TL6.  

Although there is loose talk of going to AIX 6.1, who knows if this will 
soon happen.

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article zcsdncxjv-jn2spunz2dnuvz_gawn...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  In article qo6dnzljdv8z4cdunz2dnuvz_hwwn...@giganews.com,
   Richard B. Gilbert rgilber...@comcast.net wrote:
  
  Joseph Gwinn wrote:
  In article nbwdnvxq_p-y_idunz2dnuvz_thin...@giganews.com,
   Richard B. Gilbert rgilber...@comcast.net wrote:
 
  Joseph Gwinn wrote:
  In article 49bd3907.1080...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
 
  [snip]
  3.  The original question was how to interpret a specific status code, 
  9514.  I read the explanation in the documentation, but became no wiser 
  for it.  Thus my question.  
 
  If there isn't a NTP FAQ entry on this, there probably should be.  Our 
  sysadmins were flummoxed by the cloud of 5914 codes, and they are far 
  too busy to undertake a research project.  (The deeper problem is that 
  some managers believe that NTP is plug and play, which isn't quite 
  true.)
 
 
  The various answers and questions I've gotten have been quite useful, 
  as 
  they give me a list of things to think about and investigate, things I 
  might not have thought of, or soon thought of.
 
  Joe Gwinn
  Joe,
 
  You need to proofread your message text a little more carefully!!
 
  Which error are you ACTUALLY getting?  You say 9514 and then 5914! 
  Which is it?
  You're right, but it wouldn't help, for an odd reason.
 
  The status code is 9514.
 
  But I have a Clausing 5914 lathe.
 
  Inherent dyslexia inducer.
 
 
  Also, you might try Google with the FULL and EXACT text of the error 
  message!
  It's 9514, pulled from a field in peerstats records.  Think I'll get 
  many false hits?  Qualifying 9514 with peerstats brought me back to this 
  thread.
 
  So, tried Maximum Distance Exceded, got led back to this exact news 
  thread.
 
  But let's say I did find some relevant hits.  This is the Internet.  How 
  would I know which hits to believe?
 
  I would be influenced by who wrote it and who disagreed with him!
  
  But what if I listen to the loudest one?
  
  
  The FAQ has to be the place for such explanations.
  I'm not sure if this qualifies as an FAQ as I don't recall that it has 
  come up before.  FAQ stands for Frequently Asked Questions.
  
  RAQ then?  Rarely Asked Questions
  
  Seriously, I can't believe that I'm the only person in history to be 
  perplexed by these status codes, and those little three-word summaries 
  are a bit telegraphic.
  
  Joe Gwinn
 
 I don't see any of the usual suspects stepping up to take 
 responsibility.  You may have to reverse engineer the code in order to 
 satisfy your curiosity.

Actually, in the earlier part of this thread, better explanations were 
given, before we drifted off into NTP support models.

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article ywn9tz5tflz6@ntp1.isc.org,
 Harlan Stenn st...@ntp.org wrote:

  In article joegwinn-6fd03a.17481615032...@news.giganews.com, Joseph 
  Gwinn joegw...@comcast.net writes:
 
   I think you are talking about one of my pet peeves:
  
  http://support.ntp.org/bin/view/Dev/NtpVariablesAndNtpq
 
 Joseph I don't think that I have inconsistent versions of ntpd and ntpq,
 Joseph because both came off the same CD from Sun Microsystems.
 
 It's still the same beast.  The bottom line is we currently have opaque data
 being presented to the user, and that is either being offered directly to
 the user (in your case) or is being potentially mis-converted by ntpq.

I have a lot of trouble believing that Sun put inconsistent versions on 
their Solaris install CDs.

Nor am I using NTPQ for decoding.  I decode these codes myself, 
following Appendix B of RFC-1305.  It turns out that NTPv4 uses the same 
definitions.  See 
http://www.eecis.udel.edu/~mills/ntp/html/monopt.html.

Joe



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 49be52ab.1000...@ntp.org, ma...@ntp.org (Danny Mayer) 
wrote:

 Joe Gwinn wrote:
  Status code values fixed.
  
  At 10:47 PM -0400 3/15/09, Danny Mayer wrote:
  Joseph Gwinn wrote:
   Hmm.  OK, but I think that we've kind of run off the rails.  Let me
   summarize:
   1.  Sun Microsystems' current behavior is not the issue, as I'm loading
   old software from an old CD onto old computer hardware, hardware that
   cannot support a newer version of Solaris than v9.
   One of these old Solaris boxes did work with NTPv3 running an even
   older version of Solaris, with no 9514 codes, deepening the mystery.
   
 
  The trouble here is that those codes are *very likely* to have
  changed between V3 and V4 since there was a large rewrite between the
  two. That's why looking at the source code is necessary to get you the
  help you need.
  
  As discussed in my other reply, mutating codes is a blunder.   It's a
  good news bad news thing.  The good news is that NTP has succeeded on an
  unimagined scale.  The bad news is that because of that scale, one must
  be *very* respectful of NTP's existing base, and it can be constraining.
  
 
 You won't get any argument from us. However, Dave Mills is responsible
 for these codes and we haven't been able to get him to agree to not
 change the test code numbers and to use new ones if he needs more and
 just not reuse the old ones. He has good reasons for changing the tests
 but changing the meaning of the same code is harder to fathom. His view
 is that these are internal tests but when you are trying to track down a
 problem with your ntp daemon, it's important to know what they mean.

These codes are *not* internal only.  They are documented in RFC-1305, 
Appendix B, which is also pointed to by the NTPv4 documentation 
http://www.eecis.udel.edu/~mills/ntp/html/monopt.html.  These codes 
are quite public.


   The fact that this obsolete system can most likely support NTPv4 is
   worth investigation, though.
 
   2.  I think that what's happening is that I'm doing something dumb, and
   I bet that there is no real difference in how NTPv3 or NTPv4 would react
   to this faux pas, whatever it turns out to be.  Nor is source code
   research needed or requested.
   3.  The original question was how to interpret a specific status code,
   9514.  I read the explanation in the documentation, but became no wiser
   for it.  Thus my question. 
 
  Which is why you need to look at the source code. Documentation isn't
  always clear or definitive but the source code will tell you.
  
  It simply cannot be required to read source code to get the definitions
  of status codes, even if the documentation has to give one definition
  per NTP version.  NTP is used on hundreds of millions of computers.  Are
  we expecting that every time someone gets an unexpected code they either
  have to read the source code, or pay someone to read it for them?  I'm
  sorry, but that cannot work.
  
 
 I agree, but I'm not the person you need to persuade. In V4 the flash
 codes are listed in libntp/statestr.c. I don't know about V3.

While given the pointer I may well look, the fundamental issue remains.


 You may also be amused by this sync code:
   { CTL_SST_TS_WRSTWTCH,  sync_wristwatch },

Heh.  Hairy wrist required.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 49bdbbbe.4030...@ntp.org, ma...@ntp.org (Danny Mayer) 
wrote:

 Joseph Gwinn wrote:
  
  What's the story for IBM's AIX?
  
 
 It builds on AIX too. It builds on most Unix systems though maybe not on
 some of the oldest O/S versions.

Including the AIX we use, as mentioned in another posting.

Joe Gwinn


PS:  I'll be offline for the next two weeks, starting tomorrow morning.



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 
89b3231f-d79a-4fb1-b449-fe574bca8...@j38g2000yqa.googlegroups.com,
 paul.cro...@softwareag.com wrote:

 Joseph,
 
 If you're not willing to get the source code for NTP and compile it,
 you can download a binary from http://www.sunfreeware.com/.
 It's probably configured with a 'standard' set of
 refclock drivers and, as a consequence, may be larger
 than a custom-configured version.

We will most likely do just that.  Size isn't much of a problem, 
especially for an experiment.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article 49bdc52b.20...@ntp.org, ma...@ntp.org (Danny Mayer) wrote:

 Joseph Gwinn wrote:
 
  The FAQ has to be the place for such explanations.
  I'm not sure if this qualifies as an FAQ as I don't recall that it has 
  come up before.  FAQ stands for Frequently Asked Questions.
  
  RAQ then?  Rarely Asked Questions
  
  Seriously, I can't believe that I'm the only person in history to be 
  perplexed by these status codes, and those little three-word summaries 
  are a bit telegraphic.
  
  Joe Gwinn
  
 
 You aren't the only one. These questions have been asked before by a
 number of people. In fact I had to look at this at one point when I was
 getting these codes. Of course I just looked at the source code and
 never looked for documentation.

My fundamental point is that expecting a significant part of the NTP 
user base to read the code simply does not scale, for a host of reasons.

 
 I will tell you that this is a combination of bits so it's not just a
 number. Each bit represents a test code that failed so you have quite a
 bit to look at.
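
The point that the code is a combination of bits can be illustrated with a minimal sketch. It only lists which bit positions are set in a word; it deliberately does not name the tests, because the mapping from bit to test differs between NTP versions and has to come from the documentation or source for the version in use. The example value is hypothetical.

```python
def failed_test_bits(flash):
    """Return the positions (0 = least significant) of the bits set in a flash word."""
    return [bit for bit in range(flash.bit_length()) if flash & (1 << bit)]


# Hypothetical flash value, chosen only for illustration.
example = 0x0410
print(f"{example:#06x} has bits set at positions {failed_test_bits(example)}")
```

Each position returned would then be looked up in the version-specific table of test names.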

Just for curiosity, how many semicolons are there in the NTPv3 and NTPv4 
codebases?  My impression is that each is about 20,000 or 30,000, but I 
don't know why or where I got the number.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-16 Thread Joseph Gwinn
In article rzidna3pv6cenclunz2dnuvz_o8la...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  In article 49bdbbbe.4030...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
  
  Joseph Gwinn wrote:
  What's the story for IBM's AIX?
 
  It builds on AIX too. It builds on most Unix systems though maybe not on
  some of the oldest O/S versions.
  
  Including the AIX we use, as mentioned in another posting.
  
  Joe Gwinn
  
  
  PS:  I'll be offline for the next two weeks, starting tomorrow morning.
 
 If every post here results in an automated "I'm out of the office" 
 message from you, we will never speak to you again!

I did think of that ... but thought better of it.

Nor did I trust Notes not to get into a fight with the reflector.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article 49bc631c.1060...@ntp.org, ma...@ntp.org (Danny Mayer) 
wrote:

 Ronan Flood wrote:
  On Thu, 12 Mar 2009 23:31:11 -0500,
  Joseph Gwinn joegw...@comcast.net wrote:
  
  NTP version 3 is running.  I've been trying to find the command to give 
  me the full version, including dot (like 3.4y), and I get answers, but 
  don't know which one to believe, and if the version given is that of the 
  NTP daemon itself, or of ntpq, or of ntpdate.
  
  It might get logged to /var/adm/messages or somewhere when xntpd starts,
  but try
  
   ntpq -c "rv 0 daemon_version"
  
 
 However we are not supporting ntp version 3, at least not without
 funding. Is there some reason why you are not running the latest version
 of ntpd?

It's what came with that old version of Solaris, the most modern Solaris 
that will run on the old Sun boxes in question.

Is NTP v4 proven to run on Solaris 9 (SunOS 5.9 Generic May 2002)?

The suspicion is that we have not set something up correctly, not that 
NTP v3 has failed, or that NTP v4 would fare better or worse.  Don't 
understand the comment about funding.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article lbgdns8bwol7mydunz2dnuvz_ukwn...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  In article 49bc631c.1060...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
  
  Ronan Flood wrote:
  On Thu, 12 Mar 2009 23:31:11 -0500,
  Joseph Gwinn joegw...@comcast.net wrote:
 
  NTP version 3 is running.  I've been trying to find the command to give 
  me the full version, including dot (like 3.4y), and I get answers, but 
  don't know which one to believe, and if the version given is that of the 
  NTP daemon itself, or of ntpq, or of ntpdate.
  It might get logged to /var/adm/messages or somewhere when xntpd starts,
  but try
 
   ntpq -c "rv 0 daemon_version"
 
  However we are not supporting ntp version 3, at least not without
  funding. Is there some reason why you are not running the latest version
  of ntpd?
  
  It's what came with that old version of Solaris, the most modern Solaris 
  that will run on the old Sun boxes in question.
  
  Is NTP v4 proven to run on Solaris 9 (SunOS 5.9 Generic May 2002)?
 
 Proven?  Please define that!  It works for me.  YMMV!

OK, demonstrated.  As opposed to "it should work" (but nobody has 
actually done it).

So, you have done it, which is encouraging.


  The suspicion is that we have not set something up correctly, not that 
  NTP v3 has failed, or that NTP v4 would fare better or worse.  Don't 
  understand the comment about funding.
  
 
 Simple enough.  The people who maintain NTPD have to earn a living 
 somehow.  The more unpaid time they volunteer the more difficult it 
 becomes to pay the bills!

Open Source is doomed?  Bill's prayers are answered at last.

 
 Version 3 is AT LEAST six years old.  

So is the OS version being used: 2009-2002= 7 years.  So, they're 
siblings.


  V4.something is current.  Vendors 
 are still shipping V3 because the RFC for V4 has not yet been formally 
 adopted.  Or, if it has been adopted, the adoption is extremely recent.

You forgot to mention laziness and pernicious inertia.

If I recall, on a prior project, the version of NTP that came with the 
Solaris boxes of the day wasn't quite good enough (don't recall why), so 
the software folk downloaded and installed the then latest version of 
NTP (v3 I think) and that worked.  The project team is still around.  I 
think I'll chase the details down.


 If you have a C compiler, you can download the source and build it 
 yourself.  If you don't have a C compiler you can download GCC for free.

Way too much work.  It's never just compile and go.  Significant futzing 
always seems necessary.  Nor should this be necessary for Sun 
anything.  

But nagging at me is a half memory that on that prior project they may 
have had to compile NTP, for some possibly irrelevant reason.  Like 
wanting to use the same toolchain for all code.  Another reason to chase 
the details down.


Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article 49bd3907.1080...@ntp.org, ma...@ntp.org (Danny Mayer) 
wrote:

 Joseph Gwinn wrote:
  In article 49bc631c.1060...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
  
  Ronan Flood wrote:
  On Thu, 12 Mar 2009 23:31:11 -0500,
  Joseph Gwinn joegw...@comcast.net wrote:
 
  NTP version 3 is running.  I've been trying to find the command to give 
  me the full version, including dot (like 3.4y), and I get answers, but 
  don't know which one to believe, and if the version given is that of the 
  NTP daemon itself, or of ntpq, or of ntpdate.
  It might get logged to /var/adm/messages or somewhere when xntpd starts,
  but try
 
   ntpq -c "rv 0 daemon_version"
 
  However we are not supporting ntp version 3, at least not without
  funding. Is there some reason why you are not running the latest version
  of ntpd?
  
  It's what came with that old version of Solaris, the most modern Solaris 
  that will run on the old Sun boxes in question.
  
  Is NTP v4 proven to run on Solaris 9 (SunOS 5.9 Generic May 2002)?
  
  The suspicion is that we have not set something up correctly, not that 
  NTP v3 has failed, or that NTP v4 would fare better or worse.  Don't 
  understand the comment about funding.
  
 
 Let me try and clarify my remark. NTP v3 was last released about 10
 years ago. Since then all work and experience has been done on V4 at
 least in these forums. Sun has continued to ship V3 even though it's
 rather obsolete and old. One person within Sun is trying to change that
 but in the meantime users like you are left trying to deal with issues
 with that version. Very few of us in the forum have experience with V3
 and can accurately answer questions about it. All of us have knowledge
 of V4 and what's true of V4 may very well be false for V3. Since
 everyone is a volunteer in this forum, unless they have a lot of spare
 time on their hands no one is going to look at the V3 sources and provide
 you with correct responses for that version. Sun has support people whose
 job it is to answer such questions, but you of course pay for Sun support.
 
 If we had funding to do it, we would be able to provide answers that you
 could rely on. Otherwise we are just guessing that things didn't change
 between V3 and V4 and there were fundamental changes between the two
 major versions.
 
 That was the reason for my remark.

Hmm.  OK, but I think that we've kind of run off the rails.  Let me 
summarize:  

1.  Sun Microsystems' current behavior is not the issue, as I'm loading 
old software from an old CD onto old computer hardware, hardware that 
cannot support a newer version of Solaris than v9.  

One of these old Solaris boxes did work with NTPv3 running an even older 
version of Solaris, with no 5914 codes, deepening the mystery.

The fact that this obsolete system can most likely support NTPv4 is 
worth investigation, though.

2.  I think that what's happening is that I'm doing something dumb, and 
I bet that there is no real difference in how NTPv3 or NTPv4 would react 
to this faux pas, whatever it turns out to be.  Nor is source code 
research needed or requested.  

3.  The original question was how to interpret a specific status code, 
9514.  I read the explanation in the documentation, but became no wiser 
for it.  Thus my question.  

If there isn't an NTP FAQ entry on this, there probably should be.  Our 
sysadmins were flummoxed by the cloud of 5914 codes, and they are far 
too busy to undertake a research project.  (The deeper problem is that 
some managers believe that NTP is plug and play, which isn't quite true.)


The various answers and questions I've gotten have been quite useful, as 
they give me a list of things to think about and investigate, things I 
might not have thought of, or soon thought of.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article ywn9r60yinft@ntp1.isc.org,
 Harlan Stenn st...@ntp.org wrote:

  In article joegwinn-795fe1.14394915032...@news.giganews.com, Joseph 
  Gwinn joegw...@comcast.net writes:
 
 Joseph In article 49bd3a1e.2020...@ntp.org, ma...@ntp.org (Danny Mayer)
 Joseph wrote:
 
 Danny NTP v4 builds on most versions of Solaris. If for some reason it does
 Danny not build, Harlan can help with that since he's the build master. We
 Danny have Solaris boxes in the build farm.
 
 I have access to a number of older versions of Solaris for build testing.  I
 have no idea how many people are running on these older OSes, but if there
 was a problem I believe I would have heard about it.

I imagine so.


 Joseph What's the story for IBM's AIX?
 
 I have access to some AIX boxes for build testing.  I generally do not have
 access to these boxes for runtime testing.

No runtime testing?  That's not good.  But you'll no doubt hear when 
there's trouble.

What AIX version and Technology Level (~=patch level) have been used to 
build NTPv4?

 
 One of the goals of the NTP Forum is to produce a diverse operational ntp
 build and test farm.  This goal will be realized if a sufficient number of
 organizations join the NTP Forum.
 
   Version 3 is AT LEAST six years old.  V4.something is current.  Vendors
   are still shipping V3 because the RFC for V4 has not yet been formally
   adopted.  Or, if it has been adopted, the adoption is extremely recent.
 
 NTPv3 was an RFC, but never a Standard.
 
 NTPv4 is, I believe, close to becoming a Standard.
 
 Joseph I have read at least one draft.  The fact that it isn't yet a RFC is
 Joseph not a problem.
 
 Good.
 
   If you have a C compiler, you can download the source and build it 
  yourself.  If you don't have a C compiler you can download GCC for free.
  
  We use gcc for our Solaris builds.
 
 Joseph Also good to know, so I'll know better than to use the Sun compiler
 Joseph (not that it's bad, but that it isn't what's been worked through
 Joseph with NTP).
 
 I have platforms where I only use native compilers and not gcc.
 
 Again, these are build platforms, not operational test platforms.

Yes.  The context of the question was if I had to compile NTP from 
source, for whatever reason.  Taming another toolchain is not something 
I would look forward to.


 And again, if there was a problem I believe I would have heard about it.

Oh yes.  The bullseye is invisible, but always there.


Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article nbwdnvxq_p-y_idunz2dnuvz_thin...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  In article 49bd3907.1080...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
  
[snip]
  
  3.  The original question was how to interpret a specific status code, 
  9514.  I read the explanation in the documentation, but became no wiser 
  for it.  Thus my question.  
  
  If there isn't a NTP FAQ entry on this, there probably should be.  Our 
  sysadmins were flummoxed by the cloud of 5914 codes, and they are far 
  too busy to undertake a research project.  (The deeper problem is that 
  some managers believe that NTP is plug and play, which isn't quite true.)
  
  
  The various answers and questions I've gotten have been quite useful, as 
  they give me a list of things to think about and investigate, things I 
  might not have thought of, or soon thought of.
  
  Joe Gwinn
 
 Joe,
 
 You need to proofread your message text a little more carefully!!
 
 Which error are you ACTUALLY getting?  You say 9514 and then 5914! 
 Which is it?

You're right, but it wouldn't help, for an odd reason.

The status code is 9514.

But I have a Clausing 5914 lathe.

Inherent dyslexia inducer.


 Also, you might try Google with the FULL and EXACT text of the error 
 message!

It's 9514, pulled from a field in peerstats records.  Think I'll get 
many false hits?  Qualifying 9514 with peerstats brought me back to this 
thread.

So, tried "Maximum Distance Exceded", got led back to this exact news 
thread.

But let's say I did find some relevant hits.  This is the Internet.  How 
would I know which hits to believe?

The FAQ has to be the place for such explanations.


Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
Status codes also fixed below.

In article ywn9mybmil8b@ntp1.isc.org,
 Harlan Stenn st...@ntp.org wrote:

  In article joegwinn-a341a8.15084015032...@news.giganews.com, Joseph 
  Gwinn joegw...@comcast.net writes:
 
 Joseph Let me summarize:
 
 Joseph 1.  Sun Microsystems' current behavior is not the issue, as I'm
 Joseph loading old software from an old CD onto old computer hardware,
 Joseph hardware that cannot support a newer version of Solaris than v9.
 
 Joseph One of these old Solaris boxes did work with NTPv3 running an even
 Joseph older version of Solaris, with no 9514 codes, deepening the mystery.
 
 Joseph The fact that this obsolete system can most likely support NTPv4 is
 Joseph worth investigation, though.
 
 I believe ntp4 will work there, and if it does not and somebody opens a bug
 report on it, I expect it will be fixed.

Yes. 


 Joseph 2.  I think that what's happening is that I'm doing something dumb,
 Joseph and I bet that there is no real difference in how NTPv3 or NTPv4
 Joseph would react to this faux pas, whatever it turns out to be.  Nor is
 Joseph source code research needed or requested.
 
 I mostly agree with you here, more in a bit...
 
 Joseph 3.  The original question was how to interpret a specific status
 Joseph code, 9514.  I read the explanation in the documentation, but became
 Joseph no wiser for it.  Thus my question.
 
 Joseph If there isn't a NTP FAQ entry on this, there probably should be.
 Joseph Our sysadmins were flummoxed by the cloud of 9514 codes, and they
 Joseph are far too busy to undertake a research project.  (The deeper
 Joseph problem is that some managers believe that NTP is plug and play,
 Joseph which isn't quite true.)
 
 I think you are talking about one of my pet peeves:
 
  http://support.ntp.org/bin/view/Dev/NtpVariablesAndNtpq

I don't think that I have inconsistent versions of ntpd and ntpq, 
because both came off the same CD from Sun Microsystems.


 I strongly believe that we should implement something like this.
 
 It will need to be implemented in a way that Dave can tolerate.
 
 The odds of this being implemented are directly proportional to somebody
 doing the work.
 
 Near as I can tell, the best way to have somebody do the work is to have
 money available to pay for the work to be done.
 
 The best way I know to get money to pay for this work to be done is to show
 organizations that by joining the NTP Forum they will be spending money to
 get significant value in return.
 
 I believe projects like this one are one example of that significant
 value.

It would be nice for sure, but I cannot see companies not selling 
time-related equipment having a sufficient business case to fund the NTP 
Forum.  Being a user (versus maker) of such equipment is not generally 
sufficient.  This analysis is not restricted to time-related stuff.  

I spent many years working on POSIX standards.  The only reason my 
employer was willing to support this effort (and the associated travel 
expenses) was that our customers demanded conformance to such standards, 
and also that we help develop those standards.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-15 Thread Joseph Gwinn
In article qo6dnzljdv8z4cdunz2dnuvz_hwwn...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  In article nbwdnvxq_p-y_idunz2dnuvz_thin...@giganews.com,
   Richard B. Gilbert rgilber...@comcast.net wrote:
  
  Joseph Gwinn wrote:
  In article 49bd3907.1080...@ntp.org, ma...@ntp.org (Danny Mayer) 
  wrote:
 
  [snip]
  3.  The original question was how to interpret a specific status code, 
  9514.  I read the explanation in the documentation, but became no wiser 
  for it.  Thus my question.  
 
  If there isn't a NTP FAQ entry on this, there probably should be.  Our 
  sysadmins were flummoxed by the cloud of 5914 codes, and they are far 
  too busy to undertake a research project.  (The deeper problem is that 
  some managers believe that NTP is plug and play, which isn't quite true.)
 
 
  The various answers and questions I've gotten have been quite useful, as 
  they give me a list of things to think about and investigate, things I 
  might not have thought of, or soon thought of.
 
  Joe Gwinn
  Joe,
 
  You need to proofread your message text a little more carefully!!
 
  Which error are you ACTUALLY getting?  You say 9514 and then 5914! 
  Which is it?
  
  You're right, but it wouldn't help, for an odd reason.
  
  The status code is 9514.
  
  But I have a Clausing 5914 lathe.
  
  Inherent dyslexia inducer.
  
  
  Also, you might try Google with the FULL and EXACT text of the error 
  message!
  
  It's 9514, pulled from a field in peerstats records.  Think I'll get 
  many false hits?  Qualifying 9514 with peerstats brought me back to this 
  thread.
  
  So, tried Maximum Distance Exceded, got led back to this exact news 
  thread.
  
  But let's say I did find some relevant hits.  This is the Internet.  How 
  would I know which hits to believe?
  
 
 I would be influenced by who wrote it and who disagreed with him!

But what if I listen to the loudest one?


  The FAQ has to be the place for such explanations.
 
 I'm not sure if this qualifies as an FAQ as I don't recall that it has 
 come up before.  FAQ stands for Frequently Asked Questions.

RAQ then?  Rarely Asked Questions

Seriously, I can't believe that I'm the only person in history to be 
perplexed by these status codes, and those little three-word summaries 
are a bit telegraphic.

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-14 Thread Joseph Gwinn
In article 49bb8860$0$507$5a6ae...@news.aaisp.net.uk,
 David Woolley da...@ex.djwhome.demon.co.uk.invalid wrote:

 Joseph Gwinn wrote:
  In article 49bae109$0$505$5a6ae...@news.aaisp.net.uk,
   Da
  ntpd doesn't care about what the drift is in determining root distance. 
  It simply takes the position that the actual local clock will be 
  somewhere within +/- 15ppm of the value which would achieve perfect 
  phase lock with true time.
  
  This part seems to conflict with the max +/- 500 ppm steering authority 
  of NTP.  How are these two limits related?
 
 They are not.  The 500ppm is the range of correction that can be 
 applied.  The 15ppm is a pessimistic estimate of the error in setting 
 that correction.  I.E. if there is a valid time source, ntpd may decide 
 it needs a correction of 300ppm.  In the absence of that time source, it 
 assumes that the correction it really needed was between 285ppm and 
 315ppm, with the uncertainty being due to measurement error and changes 
 in the local clock frequency.  That uncertainty in frequency causes an 
 uncertainty in time which grows with time, until it, when combined with 
 other uncertainties, exceeds 1 second.  The client compares the 
 uncertainty for each server with one second, and when that is exceeded 
 starts ignoring the server.
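
The arithmetic in the explanation above can be sketched as follows. This is a minimal illustration, assuming only the figures quoted in the thread (15 ppm assumed frequency error, 1 second limit); the function names are hypothetical and this is not ntpd's actual code.

```python
PHI_PPM = 15.0        # assumed worst-case frequency error, as quoted above
MAX_DISTANCE_S = 1.0  # limit beyond which the server is ignored, as quoted above


def frequency_bracket(correction_ppm, phi_ppm=PHI_PPM):
    """Range in which the truly needed frequency correction is assumed to lie."""
    return (correction_ppm - phi_ppm, correction_ppm + phi_ppm)


def time_uncertainty(elapsed_s, phi_ppm=PHI_PPM):
    """Time uncertainty (seconds) accumulated after elapsed_s without valid time."""
    return phi_ppm * 1e-6 * elapsed_s


def seconds_until_exceeded(other_uncertainty_s=0.0,
                           phi_ppm=PHI_PPM,
                           max_distance_s=MAX_DISTANCE_S):
    """How long the frequency term alone takes to use up what is left of the limit."""
    remaining = max_distance_s - other_uncertainty_s
    if remaining <= 0:
        return 0.0
    return remaining / (phi_ppm * 1e-6)


print(frequency_bracket(300.0))                    # the 285..315 ppm example above
print(f"{time_uncertainty(3600.0):.3f} s after one hour")
print(f"{seconds_until_exceeded() / 3600.0:.1f} h until the limit is reached")
```

With nothing else contributing, 1 second divided by 15 ppm is about 66,667 seconds, or roughly 18.5 hours, so a server that has been without valid time for less than a day can already cross the limit.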

I understand this, but what perplexes me is that both timeservers, of 
different make and model, are showing the same behavior, so I have to 
believe that the client is somehow not right.  My theory is that we have 
not succeeded in getting a clean and correct startup.  

By the way, the datasets span at least 24 hours, no startup transient is 
seen, and the behavior does not appear to change over the run.  So there 
may also be a configuration error, but it's hard to imagine what could 
do this.


  The assumed maximum reasonable error therefore grows at 15 microseconds 
  per second.
  
  One suspicion I have is that the drifts file has data from some other 
  test still in it.  We will try deleting the drifts file.
 
 Root distance exceeded is a problem with the server, not the client.

Usually, but the fact that both servers are rejected makes me wonder, as 
discussed above.


  Another suspicion is that the computer's sense of time is too far away 
  from that provided by the timeserver.  I would have thought this would 
  cause the daemon to balk and complain, but perhaps there is a window 
  where it will not balk but will struggle mightily.  We will use ntpdate 
  as part of the startup process and see if it matters.
 
 This will cause ntpd to terminate.  I repeat, it is your servers that 
 are being rejected.

Yes.  The question is why.


For the record: 

In this test setup, at any given time, each client (daemon) sees at most 
one timeserver.  

All GPS receivers are fed from a single roof antenna by a splitter.

The mapping between client and server changes only between test runs.  

Each run lasts at least 24 hours, and may run over a weekend.


 Generally, in this sort of case, it is helpful to have output from 
 ntpq's peers, assoc and rv commands, with the latter for each 
 association number from assoc and for 0, i.e. the machine itself.  That 
 should tell us exactly when the servers had valid time, etc.
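
A minimal sketch of collecting the output requested above, assuming ntpq is on the PATH and that the association numbers have already been read off the associations listing. The helper names and the association ID in the example are hypothetical; "peers", "associations" and "rv" are the ntpq commands referred to in the request.

```python
import subprocess


def ntpq_commands(assoc_ids):
    """Build the ntpq invocations: peers, associations, then rv for 0 and each ID."""
    commands = [["ntpq", "-c", "peers"], ["ntpq", "-c", "associations"]]
    for assoc_id in [0, *assoc_ids]:
        # "rv 0" reads the variables of the machine itself.
        commands.append(["ntpq", "-c", f"rv {assoc_id}"])
    return commands


def collect(assoc_ids):
    """Run each command and return its captured text output, keyed by command line."""
    results = {}
    for argv in ntpq_commands(assoc_ids):
        done = subprocess.run(argv, capture_output=True, text=True, check=False)
        results[" ".join(argv)] = done.stdout
    return results


# Show what would be run for one hypothetical association ID; nothing is executed here.
for argv in ntpq_commands([40001]):
    print(" ".join(argv))
```

Calling collect() on the client at intervals during a run would give a record of when the server reported valid time.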

Yes.  I'll collect this data on Monday.

I have been suspicious of the old Symmetricom ET6010 GPS receiver in the 
lab before.  Elsewhere in the large system we have observed that 
sometimes one must reset the ET6010 to get good time fed to the TS2100 
timeserver; no idea why.  But the fact that the brand new Spectracom 
9383 timeserver cum GPS receiver does the same thing caused focus to 
shift elsewhere.  By the way, all GPS receivers discussed have Rubidium 
local oscillators.


Joe Gwinn



[ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-13 Thread Joseph Gwinn
I have been debugging some system problems.  The main system is too 
complicated, with too many people doing too many things, so I sought 
quiet refuge in an isolated test system consisting of an NTP timeserver 
connected by a point-to-point ethernet cable to a computer running NTP, 
which generates peerstats and loopstats data.  This test system is 
air-gap isolated from the rest of everything.  Only one timeserver is 
available to a given computer at a time.

The timeserver can be either a Symmetricom ET6010 GPS receiver feeding 
an IRIG-B002 time signal to a Symmetricom TS2100 Network Time Server, or 
a Spectracom 9383 NTP timeserver with built-in GPS receiver.  The GPS 
receivers are driven from a common antenna via a splitter.

The computer can be a Sun Ultra 10 or a Sun Ultra 60, in both cases 
running Solaris 9.  Solid boxes, but old.  The OS version reply is SunOS 
5.9 Generic May 2002.  This was clean installed from CD a week ago, so 
has not had time to collect too many barnacles.

NTP version 3 is running.  I've been trying to find the command to give 
me the full version, including dot (like 3.4y), and I get answers, but 
don't know which one to believe, and if the version given is that of the 
NTP daemon itself, or of ntpq, or of ntpdate.

The full grid of four tests, being two timeservers by two computers, has 
been run.  Many odd things are seen, but the question for today is about 
status codes in peerstats file records.

Most of the replies that NTP is using to update the time have a status 
code of 9514, which translates to the following:

Configured, reachability OK; Current sync source - max distance 
exceeded; Count is 1; Peer now reachable.
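
The translation above can be reproduced with a short sketch. It assumes the NTPv3 peer status word layout as I understand it from RFC 1305 Appendix B: the upper byte holds five status flag bits followed by a three-bit selection code, and the lower byte holds a four-bit event count and a four-bit event code. The tables are my reading of that appendix, not taken from the daemon's source, and NTPv4 assigns these fields differently.

```python
# Flag bits in the upper byte (assumed RFC 1305 layout, most significant first).
PEER_STATUS_BITS = [
    (0x80, "configured"),
    (0x40, "authentication enabled"),
    (0x20, "authentication okay"),
    (0x10, "reachability okay"),
]

SELECTION = {
    0: "rejected",
    1: "passed sanity checks",
    2: "passed correctness checks",
    3: "passed candidate checks",
    4: "passed outlyer checks",
    5: "current sync source; max distance exceeded",
    6: "current sync source; max distance okay",
    7: "reserved",
}

EVENT = {
    0: "unspecified",
    1: "peer IP error",
    2: "peer authentication failure",
    3: "peer unreachable",
    4: "peer reachable",
    5: "peer clock exception",
}


def decode_peer_status(word):
    """Split a 16-bit peer status word into flags, selection, count and event."""
    high = (word >> 8) & 0xFF
    return {
        "flags": [name for mask, name in PEER_STATUS_BITS if high & mask],
        "selection": SELECTION[high & 0x07],
        "event_count": (word >> 4) & 0x0F,
        "event": EVENT.get(word & 0x0F, "unknown"),
    }


for word in (0x9014, 0x9514, 0x9614):
    print(f"{word:04x}", decode_peer_status(word))
```

Under these assumptions 9514 differs from 9614 only in the selection field, 5 against 6, which is the "max distance exceeded" versus "max distance okay" distinction.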

The part that has me most perplexed is the max distance exceeded part, 
as this is a direct wired connection, with zero hops, zero delay, and no 
interfering traffic.  Obviously, they are not talking about physical 
distance or hops or the like, so the distance has to have units of 
time.

Although most received replies have status 9514, they are nonetheless 
used to update the loop filter and so appear in the loopstats file.  
When I co-plot loopstats and peerstats, the loopstats dots land on top 
of the peerstats dots.
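
For anyone wanting to pull the status field out of peerstats themselves, here is a minimal sketch. It assumes each record is whitespace-separated in the order day (MJD), seconds past midnight, peer address, status word in hex, offset, delay, dispersion; check that against the files your daemon version actually writes. The sample records and the address are hypothetical.

```python
from collections import Counter


def parse_peerstats_line(line):
    """Parse one peerstats record into a dict, or return None if it is too short."""
    fields = line.split()
    if len(fields) < 7:
        return None
    return {
        "mjd": int(fields[0]),
        "seconds": float(fields[1]),
        "peer": fields[2],
        "status": int(fields[3], 16),  # status word is written in hex
        "offset": float(fields[4]),
        "delay": float(fields[5]),
        "dispersion": float(fields[6]),
    }


def tally_status(lines):
    """Count how often each status word appears."""
    counts = Counter()
    for line in lines:
        record = parse_peerstats_line(line)
        if record is not None:
            counts[f"{record['status']:04x}"] += 1
    return counts


# Hypothetical records, for illustration only.
sample = [
    "54903 1000.000 192.168.0.1 9514 0.000123 0.000450 0.001000",
    "54903 1064.000 192.168.0.1 9514 0.000118 0.000440 0.001100",
    "54903 1128.000 192.168.0.1 9614 0.000120 0.000445 0.000900",
]
print(tally_status(sample))
```

The same parsed records provide the (time, offset) pairs for plotting, with the status word selecting the dot color.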

What is this error likely telling me?  What are the possibilities?  What 
tests will tell the tale?

Thanks to all,

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-13 Thread Joseph Gwinn
In article 
8dca286d-ba66-4caf-8090-a3467064a...@s20g2000yqh.googlegroups.com,
 Mike K Smith mks-use...@dsl.pipex.com wrote:

 On 13 Mar, 04:31, Joseph Gwinn joegw...@comcast.net wrote:
  The timeserver can be either a Symmetricom ET6010 GPS receiver feeding
  an IRIG-B002 time signal to a Symmetricom TS2100 Network Time Server, or
  a Spectracom 9383 NTP timeserver with built-in GPS receiver.  The GPS
  receivers are driven from a common antenna via a splitter.
 I'm familiar with the Spectracom 9383, but not the other equipment.

The Symmetricom units are quite old.


 What does the NTP status page show?
 
 What does the GPS Signal Status show?
 
 How many satellites are you seeing?
 
 Is the device reporting 'Position Hold'?

I'll check.


  Most of the replies that NTP is using to update the time have a status
  code of 9514, which translates to the following:
 
  Configured, reachability OK; Current sync source - max distance
  exceeded; Count is 1; Peer now reachable.
 
  The part that has me most perplexed is the max distance exceeded part,
  as this is a direct wired connection, with zero hops, zero delay, and no
  interfering traffic.  Obviously, they are not talking about physical
  distance or hops or the like, so the distance has to have units of
  time.
 
  Although most received replies have status 9514, they are nonetheless
  used to update the loop filter and so appear in the loopstats file.  

 You say most are 9514, are there any 96xx values?

The status codes seen are 9014 (red), 9514 (orange), and 9614 (green).  

The colors are those of the plotted dots.  Loopstats dots are dark blue 
and smaller, so when co-plotted one sees little bullseyes.

I see 9014 when changing cables or timeservers, and in most other tests 
I see mostly 9614 and a few 9014, and very rarely 9514 until now.

 
  When I co-plot loopstats and peerstats, the loopstats dots land on top
  of the peerstats dots.

 I've never seen a device come up as status 5, but since the RFC1305
 text treats it as "current synchronization source; max distance
 exceeded (if limit check implemented)" then I guess it makes sense
 that it will use it as the sync source and will update loopstats
 appropriately.

I had not seen 5 before either, but NTP is clearly using these replies 
to update the time.


Joe Gwinn


Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-13 Thread Joseph Gwinn
In article 49ba0e33$0$505$5a6ae...@news.aaisp.net.uk,
 David Woolley da...@ex.djwhome.demon.co.uk.invalid wrote:

 Joseph Gwinn wrote:
 
  What is this error likely telling me?  What are the possibilities?  What 
  tests will tell the tale?
 
 Your timeservers are unsynchronised, but for some reason not setting 
 their stratum to 16.
 
 Distance exceeded means that the combination of worst case round trip 
 time induced error and an assumed drift of 15ppm since the last valid 
 time on the root server (plus a few minor components) has exceeded 1 second.
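
A minimal sketch of the check described above, under stated assumptions: the round-trip contribution is taken as half the round-trip time, the drift term is 15 ppm multiplied by the time since the root server last had valid time, and the "few minor components" are lumped into one optional argument. The names are hypothetical and this is a simplification, not ntpd's actual calculation.

```python
PHI = 15e-6           # assumed drift rate, 15 ppm, as described above
MAX_DISTANCE_S = 1.0  # limit, as described above


def root_distance(round_trip_s, seconds_since_valid_time, minor_terms_s=0.0):
    """Simplified distance: half the round trip plus accumulated assumed drift."""
    return round_trip_s / 2.0 + PHI * seconds_since_valid_time + minor_terms_s


def distance_exceeded(round_trip_s, seconds_since_valid_time, minor_terms_s=0.0):
    """True when the simplified distance is over the 1 second limit."""
    distance = root_distance(round_trip_s, seconds_since_valid_time, minor_terms_s)
    return distance > MAX_DISTANCE_S


# A 1 ms round trip on a direct cable contributes almost nothing;
# the time since the server last had valid time dominates.
print(distance_exceeded(0.001, 3600.0))    # one hour since valid time
print(distance_exceeded(0.001, 100000.0))  # more than a day since valid time
```

This is why a zero-hop, zero-delay link does not protect against the condition: the drift term depends on the server's state, not on the path.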

How would it know of drift, in this isolated little island?

Perhaps the drift file is causing trouble.

Perhaps ntpdate was needed to get things started properly.  

I bet the engineer did not do any of these things.  He just plugged 
things together.  Some of the more recent loopstats and peerstats files 
have data from both timeservers, the transition being marked by a little 
cloud of 9014 status (red dots in my plots).

 
 It commonly happens with w32time servers that have been synchronized 
 once but left to drift.  It can also happen if the servers are orphan 
 mode, and haven't had a real time source for too long, and you are not 
 using the very latest orphan mode code.

This strengthens my impression that we don't have a clean startup, so 
clean startup will be the next thing to try.  

We were equally sloppy before, but didn't get the cloud of 9514 
complaints, probably because until now things weren't quiet enough for 
NTP to worry about the change.

Thanks,

Joe Gwinn



Re: [ntp:questions] What exactly does Maximum Distance Exceded mean?

2009-03-13 Thread Joseph Gwinn
In article enydnx61waj9gsbunz2dnuvz_tjin...@giganews.com,
 Richard B. Gilbert rgilber...@comcast.net wrote:

 Joseph Gwinn wrote:
  I have been debugging some system problems.  The main system is too 
  complicated, with too many people doing too many things, so I sought 
  quiet refuge in an isolated test system consisting of a NTP timeserver 
  connected by a point-to-point ethernet cable to a computer running NTP, 
  which generates peerstats and loopstats data.  This test system is 
  air-gap isolated from the rest of everything.  Only one timeserver is 
  available to a given computer at a time.
  
  The timeserver can be either a Symmetricom ET6010 GPS receiver feeding 
  an IRIG-B002 time signal to a Symmetricom TS2100 Network Time Server, or 
  a Spectracom 9383 NTP timeserver with built-in GPS receiver.  The GPS 
  receivers are driven from a common antenna via a splitter.
  
  The computer can be a Sun Ultra 10 or a Sun Ultra 60, in both cases 
  running Solaris 9.  Solid boxes, but old.  The OS version reply is SunOS 
  5.9 Generic May 2002.  This was clean installed from CD a week ago, so 
  has not had time to collect too many barnacles.
  
  NTP version 3 is running.  I've been trying to find the command to give 
  me the full version, including dot (like 3.4y), and I get answers, but 
  don't know which one to believe, and if the version given is that of the 
  NTP daemon itself, or of ntpq, or of ntpdate.
  
  The full grid of four tests, being two timeservers by two computers, has 
  been run.  Many odd things are seen, but the question for today is about 
  status codes in peerstats file records.
  
  Most of the replies that NTP is using to update the time have a status 
  code of 9514, which translates to the following:
  
  Configured, reachability OK; Current sync source - max distance 
  exceeded; Count is 1; Peer now reachable.
  
  The part that has me most perplexed is the max distance exceeded part, 
  as this is a direct wired connection, with zero hops, zero delay, and no 
  interfering traffic.  Obviously, they are not talking about physical 
  distance or hops or the like, so the distance has to have units of 
  time.
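
The translation of 9514 given above can be reproduced mechanically. The sketch below splits a peerstats status word into the fields of the RFC 1305 Appendix B peer status word (five flag bits, a three-bit selection code, a four-bit event count and a four-bit event code). The field descriptions are paraphrased from the RFC, and the function and dictionary names are made up for this example.

```python
# Sketch: split a peer status word (as logged in peerstats) into the
# fields laid out in RFC 1305 Appendix B. Descriptions are paraphrased.

SELECT = {
    0: "rejected",
    1: "passed sanity checks",
    2: "passed correctness checks",
    3: "passed candidate checks",
    4: "passed outlyer checks",
    5: "current sync source; max distance exceeded",
    6: "current sync source; max distance okay",
    7: "reserved",
}

EVENT = {
    0: "unspecified",
    1: "peer IP error",
    2: "peer authentication failure",
    3: "peer unreachable",
    4: "peer reachable",
    5: "peer clock exception",
}


def decode_peer_status(word):
    """Return the decoded fields of a 16-bit peer status word."""
    status = (word >> 11) & 0x1F          # the five flag bits
    return {
        "configured": bool(status & 0x10),
        "auth_enabled": bool(status & 0x08),
        "auth_ok": bool(status & 0x04),
        "reachable": bool(status & 0x02),
        "select": SELECT[(word >> 8) & 0x07],
        "event_count": (word >> 4) & 0x0F,
        "event": EVENT.get(word & 0x0F, "reserved"),
    }


for code in ("9014", "9514", "9614"):
    print(code, decode_peer_status(int(code, 16)))
```

Run over the three codes seen in these tests, the only field that differs is the selection code: 0 for 9014, 5 for 9514 and 6 for 9614.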
  
 I think that, perhaps, maximum distance refers to synchronization 
 distance q.v.  Once upon a time, I knew the definition but my memory 
 has failed me.

The other answers didn't use this exact term, but it sounds like the 
same idea.

Joe Gwinn



Re: [ntp:questions] NTP over redundant peer links, undetected loops

2009-02-15 Thread Joseph Gwinn
In article 
5d7f07420902151105m48a5e210s72e8e168e67d1...@mail.gmail.com,
 malay...@gmail.com (Ryan Malayter) wrote:

 On Sun, Feb 15, 2009 at 12:23 PM, Danny Mayer ma...@ntp.org wrote:
 
  Because I want to get away from the notion that these are meant to be IP
  addresses. In addition in an IPv6-only environment that wouldn't work
  either. Why create work when it's unnecessary just to find a valid IP
  address? In addition, with anycast, addresses are not globally unique. The
  chances that you will create a non-unique random number within a network
  is extremely low.
 
 It depends on the size of the network. The chances of a duplicate
 32-bit number on a network including 65000 hosts is about 40%. The NTP
 Pool network, which comprises at least 10^6 hosts, for example, would
 have collision probability very close to 1.

How did you compute that?  Given that 2^32= ~4*10^9, it's hard to see 
how 10^6 hosts spread at random in a 10^9 codespace could achieve 100% 
collision probability.
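
For reference, the quoted figures are consistent with the birthday problem, in which it is the number of host pairs, not the number of hosts, that gets compared against the codespace. A minimal sketch, assuming the identifiers are drawn uniformly at random from 2^32 values (the function name is invented here, and this may or may not be how the original numbers were obtained):

```python
import math


def collision_probability(hosts, bits=32):
    """Approximate chance that at least two of `hosts` random IDs match."""
    space = 2.0 ** bits
    # Birthday approximation: every pair of hosts is a chance to collide.
    pairs = hosts * (hosts - 1) / 2.0
    return -math.expm1(-pairs / space)


for n in (65000, 10**6):
    print(f"{n:>8} hosts: {collision_probability(n):.3f}")
```

With 65000 hosts there are about 2*10^9 pairs, roughly half the codespace, which is where a figure near 40% comes from; with 10^6 hosts there are about 5*10^11 pairs, over a hundred times the codespace.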

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-13 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Uwe Klein [EMAIL PROTECTED] wrote:

 Hal Murray wrote:
 I did get a look at the ntpd script today.  Turns out the answer on 
 where it gets the ntp.conf file is right there, near the top, in the 
 line ntpconf=/etc/ntp.conf, even though the ntp man page points us 
 deeper in the /etc hierarchy.  
  
  
 The sysadmin I was working with was real annoyed, as the misinformation 
 in the man page had sent him into circles.  We will add pointer comments 
 to all placebo ntp.conf files, to save future generations of sysadmins 
 from this fate.
  
  I still don't know which ntp.conf you are really using.
  
  I'm looking at a Fedora 6 box.
  
  If you look in /etc/init.d/ntpd, you will see that it mucks about
  with ntpconf (the one above) to find the servers.  Those servers
  get passed to ntpdate.  Mumble.  That's old crap.  There is now
  a command line switch that does the right thing.  I don't see
  where ntpconf gets passed to ntpd as a command line argument.
  
  If the man page says ntpd uses some other config file, it
  is probably right, or at least it seems to me that it would be
  more likely that the guy who changed the code also changed
  the man page but didn't fixup the init script.
  
 Does Red Hat write distribution specific manpages? I would be surprised if they did.

Well, there was a full man page for ntp on RHEL, one that's far longer 
than your example below, and someone wrote it.  Don't know who, but the 
RHEL box was bought from IBM, who are famous for their documentation, so 
I would venture that IBM augmented the man page over what Red Hat 
provides.  And the E in RHEL is Enterprise, and enterprises want full 
documentation delivered with the product.

Joe Gwinn


 
 uwe
 
 This is the recent SuSE Linux Manpage that all ntp related keywords point to:

 NTP(1)                                                              NTP(1)

 NAME
 NTP - Network Time Protocol

 SEE ALSO
 The NTP distribution does not include man pages. To learn more about the
 NTP protocol and this software, please install the xntp-doc package
 included in your SuSE Linux distribution.

 In /usr/share/doc/packages/xntp-doc you will find the complete set of
 documentation on building and configuring a NTP server or client. The
 documentation is in the form of HTML files suitable for browsing and
 contains links to additional documentation at various web sites.

 Also included: What about NTP? Understanding and using the Network Time
 Protocol. A first try on a non-technical Mini-HOWTO and FAQ on NTP. Edited
 by Ulrich Windl and David Dalton.

 Further information on NTP in the Internet can be found in the NTP web
 page at http://www.eecis.udel.edu/~ntp/



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-13 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (Hal Murray) wrote:

 I did get a look at the ntpd script today.  Turns out the answer on 
 where it gets the ntp.conf file is right there, near the top, in the 
 line ntpconf=/etc/ntp.conf, even though the ntp man page points us 
 deeper in the /etc hierarchy.  
 
 The sysadmin I was working with was real annoyed, as the misinformation 
 in the man page had sent him into circles.  We will add pointer comments 
 to all placebo ntp.conf files, to save future generations of sysadmins 
 from this fate.
 
 I still don't know which ntp.conf you are really using.
 
 I'm looking at a Fedora 6 box.
 
 If you look in /etc/init.d/ntpd, you will see that it mucks about
 with ntpconf (the one above) to find the servers.  Those servers
 get passed to ntpdate.  Mumble.  That's old crap.  There is now
 a command line switch that does the right thing.  I don't see
 where ntpconf gets passed to ntpd as a command line argument.

Next time I have a hands-on session, I'll put something in the purported 
correct ntp.conf file to tell if this file is in fact the Chosen One.

 
 If the man page says ntpd uses some other config file, it
 is probably right, or at least it seems to me that it would be
 more likely that the guy who changed the code also changed
 the man page but didn't fixup the init script.

RHEL via IBM and Fedora may or may not be identical, even though both 
ultimately came from Red Hat.  As discussed in another posting, a full 
NTP man page comes with RHEL via IBM, unlike Fedora.

The details of how NTP is managed and run may also have been improved.  
The intent of the service utility is to simplify the day-to-day 
activities of the sysadmins, and it probably succeeds.  The root problem 
is turning out to be with the documentation of service and ntp, not with 
service and ntp themselves.

It is pretty clear that the RHEL via IBM man page for NTP is incorrect.  
This would not be the first time in history that documentation got out 
of step with code.  

I'll smoke this out by filing a formal bug report with IBM against the 
ntp man page.  This cost us some real money, not just annoyance, and we 
cannot be alone.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-12 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Bill Unruh [EMAIL PROTECTED] wrote:

 Joseph Gwinn [EMAIL PROTECTED] writes:
 
 In article [EMAIL PROTECTED],
  [EMAIL PROTECTED] (Hal Murray) wrote:
 
  I'm not a sysadmin, but am digging into service.  I don't recall that 
  the service man page was that helpful, but will look again.
  
  service is mostly a shortcut to save typing.  If you think it is getting
  in your way, run /etc/init.d/ntpd whatever by hand.  (It also
  fixes up environment and cd-ed directory and whatever.)
 
 Yes.  This is what we did to prove that NTP really could generate 
 loopstats and peerstats.
 
 No I suspect you ran /usr/sbin/ntpd, not /etc/init.d/ntpd
 /etc/init.d/ntpd start should do EXACTLY the same thing as when the system
 runs it on bootup.

If I recall, the line that worked was /etc/init.d/ntpd -c filename of 
our ntp.conf file.  I don't recall that sbin was involved.


  The -x command to bash will print each line as it gets expanded
  and executed.  So you might try something like:
bash -x /etc/init.d/ntpd start
  to see what is really going on.
 
 Another good idea to try.
 
 It of course produces far more output but obviates the need to insert echo
 lines into /etc/init.d/ntpd

Yep.  But if it solves the problem, I won't mind the blather.


 Note, I am wondering what has happened to all these suggestions? Have
 you tried any of them yet? Have you discovered what it is actually using as
 its configuration file?

I have been collecting all the suggestions I have heard here, and will 
try them when the relevant sysadmin is able to spare the time.  This may 
be today (Friday).  Unless he is somehow deflected.


 You might want to post the config file (ntp.conf) here in case it is
 some error in that file which is causing your problems rather than that
 ntpd is using some other config file. 

We are happily collecting loopstats and peerstats data on RHEL using 
that ntp.conf file, once we started the daemon manually with an explicit 
filepath argument (as described above), so the ntp.conf file itself does 
not appear to be the problem.  I was suspicious of that file too, and so 
had cleaned it down to something like three lines, basically following 
the minimum ntp.conf example given in the online NTP documentation.


By the way, I doubt that it matters here, but this RHEL is running NTPv4.


Joe Gwinn



Re: [ntp:questions] What happens if ntp server unavailable at start up?

2008-09-12 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Unruh [EMAIL PROTECTED] wrote:

 Harlan Stenn [EMAIL PROTECTED] writes:
[snip]
 
 Unruh Did the dynamic keyword ever work? The web docs say that it is not
 Unruh yet implemented.
 
 I'm pretty sure it works - what documentation says it doesn't?
 
 Some document on ntp.org describing the options in the ntp.conf file. 
 I do not want to look for it again-- the docs are incredibly hard to
 search-- one of the  problems with making them into an infinite number of
 web pages. 

What I find useful is to use google with a site:udel.edu qualifier.

It would help if the entire docs tree were under ntp.org, though.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-10 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (Hal Murray) wrote:

 I'm not a sysadmin, but am digging into service.  I don't recall that 
 the service man page was that helpful, but will look again.
 
 service is mostly a shortcut to save typing.  If you think it is getting
 in your way, run /etc/init.d/ntpd whatever by hand.  (It also
 fixes up environment and cd-ed directory and whatever.)

Yes.  This is what we did to prove that NTP really could generate 
loopstats and peerstats.


 The -x command to bash will print each line as it gets expanded
 and executed.  So you might try something like:
   bash -x /etc/init.d/ntpd start
 to see what is really going on.

Another good idea to try.

Thanks,

Joe



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-09 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Unruh [EMAIL PROTECTED] wrote:

 David Woolley [EMAIL PROTECTED] writes:
 
 James Cloos wrote:
 
  I read through most of the replies so far, but one thing I haven't seen
  noted is that this isn't an ntp issue at all
 
 Did you mean service (8).
 
 Treating it as a black box is basically how Red Hat is marketed; it is 
 basically in the same market as Windows.  People who want a white box 
 Linux are more likely to choose something like Slackware.

Agree.

 
 service is a dead simple program. It runs its argument from the /etc/init.d
 directory. 
 
 Anyway, long ago we suggested that he looked in /etc/init.d/ntpd to see if
 there was anything in there that suggested which config file was being
 used. 

I found the script, and started reading it, and will return to it.


 Or insert an echo 
 where ... is the exact line that that script runs to start ntpd to see if
 there are any interesting arguments to ntpd. 

 Or put in an env in there to see exactly what the environment is that
 ntpd sees. 

All good ideas.  Direct, and free of excess assumptions that things are 
as things should be.


 It is just a damn shell script. It is not a black box. 

I've been reading it, and it does seem simple, but haven't really 
studied it yet.


 I think the OP has gone far beyond what the average RHEL administrator 
 is expected to do in terms of looking inside the box.

That's for sure, and is why I'm doing the debugging, even though I'm not 
a sysadmin.   The sysadmins really don't understand NTP.

 
 I find it extremely unlikely that RHEL uses anything but /etc/ntp.conf but
 if it does then it is up to Redhat to document it. I suspect either user
 error or some admin in the past of this organization has changed things and
 never documented it. 

Judging by the cruft accumulation in the trojan ntp.conf file, this is 
not a virgin install, so I'd bet on confused sysadmins.  They may have 
been trying to get it to work, not realizing that this ntp.conf file is 
only a placebo.


 We have not had a report back from him as to what the
 results were of all the suggestions we made. 

Because there is nothing to report yet, due to the press of other 
business.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-06 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Unruh [EMAIL PROTECTED] wrote:

 Joseph Gwinn [EMAIL PROTECTED] writes:
 
 In article [EMAIL PROTECTED],
  Steve Kostecke [EMAIL PROTECTED] wrote:
 
  On 2008-09-03, Joseph Gwinn [EMAIL PROTECTED] wrote:
  
   Read the service shell script.  It appears to get its file paths from 
   environment variables named after the thing being started and stopped 
   and accessible only in the root environment; this bit of RHEL-specific 
   structure is being chased down.  (Does anyone know where this is 
   documented?)
  
  On Linux OSes init scripts are typically found in /etc/init.d/ or
  /etc/rc.d/init.d/ Look for one named ntp (or something containing ntp).
 
 Yes, and that's where strace led me, where I found a script called ntpd. 
 How the service script interacts with this ntpd script isn't clear.  
 Environment variables seem to be implicated, but a listing of 
 environment variables is not helpful.  Next week I'll digest it all.
 
 service simply runs the program listed as its argument from the /etc/init.d
 directory. 
 
 Ie, service ntpd start is the same as 
 /etc/init.d/ntpd start

True, but there seems to be more to it than that.  Next week.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Unruh [EMAIL PROTECTED] wrote:

 Joseph Gwinn [EMAIL PROTECTED] writes:
 
 In article [EMAIL PROTECTED],
  Peter J. Cherny [EMAIL PROTECTED] wrote:
 
  Joseph Gwinn wrote:
  ...
   Which brings me to a question:  How does one get NTP to tell you exactly 
   where it is getting such things as the ntp.conf file from, all without 
   ...
  [EMAIL PROTECTED] ~]$ strings /usr/sbin/ntpd|grep ntp.conf
  /etc/ntp.conf
 
 In the RHEL case, this would find exactly the wrong copy of ntp.conf, 
 being the one we were changing to no avail, not the one that NTP was in 
 fact using.
 
 Which one was ntp in fact using?

Don't know yet.  Other than it wasn't the obvious one.

When we do figure it out, all pretenders to the throne will be summarily 
deleted, to prevent confusion.


  [EMAIL PROTECTED] ~]$ strace -f -o x /usr/sbin/ntpd -g
 
 I'll have to look into this.  It sounds like it might be general enough.
 
  
  [EMAIL PROTECTED] ~]# grep ntp.conf x
  3351  open(/etc/ntp.conf, O_RDONLY)   = 4
 
 Doesn't this assume that the correct ntp.conf file is called ntp.conf?  
 It may be common, the standard convention, but it is not required.
 
 The whole point is to find the correct file without making assumptions, 
 because on a strange computer strange things may have been done.
 
 yes, but then do strace as above and look through the file looking for
 something that might be a configuration file. If they call it /lib/libc.so
 then you are probably shit out of luck, but usually they will not do that. 

The strace gave a lot of data, mostly irrelevant, which I will plow 
through next week.
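
Plowing through the strace output can be partly automated without assuming the file is called ntp.conf. The sketch below keeps only the files that were opened successfully and drops some obvious noise. The list of noise prefixes is an arbitrary assumption, the function name is invented, and the sample lines are made up in the style of the strace line quoted above.

```python
import re

# Matches lines such as:  3351  open("/etc/ntp.conf", O_RDONLY) = 4
# (also openat, which newer strace versions report instead of open).
OPEN_RE = re.compile(
    r'^\s*(?:\d+\s+)?(?:open|openat)\('
    r'(?:AT_FDCWD,\s*)?"([^"]+)",\s*([A-Z_|]+)[^)]*\)\s*=\s*(-?\d+)'
)

# Path prefixes treated as noise for this purpose (an arbitrary choice).
NOISE = ("/lib", "/usr/lib", "/proc", "/dev", "/sys")


def files_opened(strace_lines):
    """Return paths that were opened successfully, in first-seen order."""
    seen = []
    for line in strace_lines:
        m = OPEN_RE.match(line)
        if not m:
            continue
        path, result = m.group(1), int(m.group(3))
        if result < 0 or path.startswith(NOISE) or path in seen:
            continue
        seen.append(path)
    return seen


sample = [
    '3351  open("/lib/libc.so.6", O_RDONLY) = 3',
    '3351  open("/etc/ntp.conf", O_RDONLY)   = 4',
    '3351  open("/etc/ntp/keys", O_RDONLY) = -1 ENOENT (No such file or directory)',
]
print(files_opened(sample))
```

Against the real capture this would be called as files_opened(open("x")), where x is the file written by strace -o x, and the short list that remains read by eye for anything that looks like a configuration file.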

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Steve Kostecke [EMAIL PROTECTED] wrote:

 On 2008-09-03, Joseph Gwinn [EMAIL PROTECTED] wrote:
 
  Read the service shell script.  It appears to get its file paths from 
  environment variables named after the thing being started and stopped 
  and accessible only in the root environment; this bit of RHEL-specific 
  structure is being chased down.  (Does anyone know where this is 
  documented?)
 
 On Linux OSes init scripts are typically found in /etc/init.d/ or
 /etc/rc.d/init.d/ Look for one named ntp (or something containing ntp).

Yes, and that's where strace led me, where I found a script called ntpd. 
How the service script interacts with this ntpd script isn't clear.  
Environment variables seem to be implicated, but a listing of 
environment variables is not helpful.  Next week I'll digest it all.

 
  Which brings me to a question:  How does one get NTP to tell you exactly 
  where it is getting such things as the ntp.conf file from, all without 
  being able to find or see the actual command line or lines that launched 
  the daemon?  I did not see a ntpq command that sounded plausible, 
  although ntpq would be an obvious choice.
 
  This would be very useful for debugging, as each and every platform type 
  seems to have a different approach to handling NTP.  
 
 Why not use the file location features built in to your OS to find all
 possible instances of ntp.conf?
 
 $ locate ntp.conf
 
 or 
 
 $ find / -name ntp.conf
 
 Pipe the output of either of those commands to 'xargs ls -l' to see the
 datestamps of the files.

We did this, but could not tell which one mattered.  Next week.

Nor is it *required* that the ntp configuration file be called ntp.config.
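
For completeness, the find and ls -l suggestion can be sketched in a form that also shows the last-access time, which may hint at which copy is actually being read. This assumes the filesystem records access times at all (many are mounted so that it does not), and it still assumes a file name; the function names are invented for this example.

```python
import os
import time


def find_candidates(root, name="ntp.conf"):
    """Yield (path, mtime, atime) for every file called `name` under `root`."""
    for dirpath, _dirs, files in os.walk(root, onerror=lambda err: None):
        if name not in files:
            continue
        path = os.path.join(dirpath, name)
        try:
            st = os.stat(path)
        except OSError:
            continue  # vanished or unreadable; skip it
        yield path, st.st_mtime, st.st_atime


def stamp(seconds):
    """Format a timestamp in local time."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(seconds))


def report(root="/etc"):
    """Print the candidates, most recently read first."""
    rows = sorted(find_candidates(root), key=lambda row: row[2], reverse=True)
    for path, mtime, atime in rows:
        print(f"modified {stamp(mtime)}  read {stamp(atime)}  {path}")


if __name__ == "__main__":
    report()
```

Restarting the daemon and then looking for the copy whose read time moved is one way to use this, with the caveat that anything else reading the file (an editor, a backup) moves it too.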


Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 David Woolley [EMAIL PROTECTED] wrote:

 Steve Kostecke wrote:
  On 2008-09-03, Joseph Gwinn [EMAIL PROTECTED] wrote:
  
  Read the service shell script.  It appears to get its file paths from 
  environment variables named after the thing being started and stopped 
  and accessible only in the root environment; this bit of RHEL-specific 
  structure is being chased down.  (Does anyone know where this is 
  documented?)
  
  On Linux OSes init scripts are typically found in /etc/init.d/ or
  /etc/rc.d/init.d/ Look for one named ntp (or something containing ntp).
  
 I believe service is just a front end to those scripts, so I presume 
 that, by service shell scripts he is referring to those scripts.  The 
 problem he is having is that they probably source (bash . command) 
 files containing shell variable definitions from the master 
 configuration directory, maintained by the, typically GUI, configuration 
 tools.  I suspect he hasn't realised that it is sourcing these files.

Ahh.  I had figured out the first part of this, but had not figured out 
where the data was kept.  Environment variables didn't have anything 
plausible.  But it has to come from *somewhere*.
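
One way to see where the data comes from is to list the files the init script sources. The sketch below scans a script for the "." and "source" commands. The sample text is only modelled loosely on a Red Hat style init script, so the paths in it are assumptions rather than the contents of the actual RHEL box, and the regular expression handles only the common one-line forms.

```python
import re

# A "." or "source" command at the start of a line or of a nested command.
SOURCE_RE = re.compile(
    r'(?:^|[;&|]|\bthen\b|\bdo\b)\s*(?:\.|source)\s+([^\s;]+)'
)


def sourced_files(script_text):
    """List the files a shell script pulls in with '.' or 'source'."""
    found = []
    for line in script_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # whole-line comment
        for m in SOURCE_RE.finditer(line):
            if m.group(1) not in found:
                found.append(m.group(1))
    return found


sample = """\
#!/bin/bash
# . /not/really/sourced
. /etc/rc.d/init.d/functions
[ -f /etc/sysconfig/ntpd ] && . /etc/sysconfig/ntpd
ntpconf=/etc/ntp.conf
"""
print(sourced_files(sample))
```

Each file it reports is a place where variables such as daemon options could be defined without ever showing up in the login environment.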

The sysadmins know nothing of all this, being AIX and Solaris guys.

 
 Note that not all Linux distributions use this style of startup script, 
 some are based on a more historical style of /etc/rc.

Natch.  That's why ntpq needs a bit more built-in debug support.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Richard B. Gilbert [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
  In article [EMAIL PROTECTED],
   Martin Burnicki [EMAIL PROTECTED] wrote:
  
  Joe,
 
  Joseph Gwinn wrote:
  In article [EMAIL PROTECTED],
   Peter J. Cherny [EMAIL PROTECTED] wrote:
 
  Joseph Gwinn wrote:
  ...
  Which brings me to a question:  How does one get NTP to tell you
  exactly where it is getting such things as the ntp.conf file from, all
  without
   ...
  [EMAIL PROTECTED] ~]$ strings /usr/sbin/ntpd|grep ntp.conf
  /etc/ntp.conf
  In the RHEL case, this would find exactly the wrong copy of ntp.conf,
  being the one we were changing to no avail, not the one that NTP was in
  fact using.
 
 
  [EMAIL PROTECTED] ~]$ strace -f -o x /usr/sbin/ntpd -g
  I'll have to look into this.  It sounds like it might be general enough.
 
   
  [EMAIL PROTECTED] ~]# grep ntp.conf x
  3351  open(/etc/ntp.conf, O_RDONLY)   = 4
  Doesn't this assume that the correct ntp.conf file is called ntp.conf?
  It may be common, the standard convention, but it is not required.
 
  The whole point is to find the correct file without making assumptions,
  because on a strange computer strange things may have been done.
  I fully agree.
 
  Ntpd generates a bunch of messages about what it has found in the config
  file, at least in debug mode.
 
  Maybe you should open an enhancement request on http://bugs.ntp.org to make
  ntpd also print the name of the config file it is using, maybe only in
  debug mode.
  
  I'm surprised that it doesn't already print the full filename of every 
  file it uses.
  
  Will debug mode do much if the binary wasn't compiled for debug?  I'm 
  trying to use the provided binary, whatever it might be, and recompiling 
  is usually far too much trouble to be practical.  Especially as the 
  effort is per platform type, and we have multiple types.
  
  I will file an enhancement request.  However, my feeling is that this 
  function would be most useful if added to ntpq, and yielded the full 
  filename including directories, as there may be multiple ntp.conf 
  files scattered about.  The key is to get NTP to tell us which file NTP 
  is using, without interference from our firmly held but sadly mistaken 
  assumptions about what NTP ought to be doing.
  
  Joe Gwinn
 
 Since the source to NTPD is available, it's a SMOP to modify it to print 
 out the desired file specification!

True enough, but far too much work.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Martin Burnicki [EMAIL PROTECTED] wrote:

 Richard B. Gilbert wrote:
  ISTR that ntpd looks in /etc/inet if it is not told to look elsewhere by
  the command that starts ntpd.  This should take care of Unix and
  Unix-like systems.  Windoze??  Ask someone who knows.
 
 AFAIK this is the default location under Solaris, but e.g. under Linux the
 location is just /etc. 
 
 Anyway, this is configured at compile time and may be overridden by a command
 line parameter, in which case it does not help to know the default.
 
 On some systems the command line parameters are displayed in the process
 list, so you can:
 
 1.) Look at the process list to see if a configuration file has been
 specified
 
 2.) If it has not, grep through the ntp binary to find the path of the
 default config file
 
 3.) see if that file exists
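
The three steps above can be sketched for a Linux box as follows. The sketch reads /proc/<pid>/cmdline, looks for ntpd's -c option, and otherwise falls back to a default that the caller supplies (for example the path found by grepping the binary). Only the -c short form is handled, the fallback is an assumption rather than something read from the binary, and the function names are invented.

```python
import glob
import os


def config_from_argv(argv):
    """Return the path given with -c, or None if none was given."""
    for i, arg in enumerate(argv[1:], start=1):
        if arg == "-c" and i + 1 < len(argv):
            return argv[i + 1]          # "-c /path/to/file"
        if arg.startswith("-c") and len(arg) > 2:
            return arg[2:]              # "-c/path/to/file"
    return None


def running_ntpd_argvs():
    """Yield the argv of each process whose program name is ntpd."""
    for path in glob.glob("/proc/[0-9]*/cmdline"):
        try:
            with open(path, "rb") as fh:
                raw = fh.read()
        except OSError:
            continue                    # process exited or not readable
        argv = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        if argv and os.path.basename(argv[0]) == "ntpd":
            yield argv


def report(default="/etc/ntp.conf"):
    """Print the config file each running ntpd appears to be using."""
    for argv in running_ntpd_argvs():
        explicit = config_from_argv(argv)
        path = explicit or default
        origin = "command line" if explicit else "assumed default"
        state = "exists" if os.path.exists(path) else "missing"
        print(f"{path} ({origin}, {state})")


if __name__ == "__main__":
    report()
```

This only helps on systems where the command line is visible in the process list, which is the same limitation noted above.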
 
 Please note that especially under Windows things may look different. The NTP
 service first tries to open %windir%\ntp.conf, and, if that file does not
 exist, %windir%\system32\drivers\etc\ntp.conf.
 
 The GUI installers provided by Meinberg override these settings with an etc\
 directory below the program installation path, by default \program
 files\ntp\etc. The configured setting can be retrieved from the ImagePath
 registry key of the NTP service registry entry.
 
 If you are upgrading an installation of NTP under Windows then there may
 still be old config files under the older paths, so you have to check
 explicitly which of the files is being read by the running NTP service.
 
 If ntpd would write a log message at startup then you could easily find out
 on every platform which config file has been read.

That would certainly work, and work in all cases.

My problem is to debug NTP problems in multiple systems that I have 
limited knowledge of, ones that may or may not follow the usual 
conventions, or the same conventions, and which may in fact have 
been hosed up by some sysadmin who knows nothing of NTP save where the 
big red start button is supposed to be.

To be useful in such an environment, debug tools must be platform 
independent and cannot make assumptions about conventions being followed.

I am not worried about the case where someone compiles their own munged 
version of NTP.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Peter J. Cherny [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
  I will file an enhancement request.  However, my feeling is that this 
  function would be most useful if added to ntpq, and yielded the full 
  filename including directories, as there may be multiple ntp.conf 
  files scattered about.  The key is to get NTP to tell us which file NTP 
  is using, without interference from our firmly held but sadly mistaken 
  assumptions about what NTP ought to be doing.
 
 And you'll file enhancement requests for every other daemon
 on the machine ???

Only the ones that sufficiently annoy me.


 In most OSs the man pages are definitive and mostly correct,
 with changes noted in the release notes.
 If you've paid your support fees, ask the vendor.
 In most of the Unix family, the source is available.

A support question is being placed with IBM.

But newsgroups often think of angles that support does not.

 
 flame
 Else, ask a SysAdmin/SysEngineer/SwEngineer who does this for a living.
 You do have other than junior staff ?
 /flame

Umm.  I'm not a sysadmin.  If we had such a person, I wouldn't be doing 
the debugging.  Few people even know that NTP exists, let alone how it 
works well enough to debug an installation.  When there is a time 
problem, the sysadmins come and grab me to help them.  So far, I've 
always been able to figure the problem out.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file (Joseph Gwinn)

2008-09-05 Thread Joseph Gwinn
In article 
[EMAIL PROTECTED],
 [EMAIL PROTECTED] (Breck Beatie) wrote:

 This isn't quite what you're asking for and it's certainly not ntp
 specific, but one technique that I have used in the past is to replace
 the binary I'm trying to debug with a script which dumps useful
 information and then forwards the exec to the real binary.

This should work, but is a bit of work.  I'll keep it in reserve.

 
 I usually have it dump its environment and the full set of command line
 arguments someplace safe and then exec the original binary.  You could
 certainly have it run the original binary with strace.
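
A minimal sketch of such a wrapper is given below. It assumes the original binary has been moved aside to a hypothetical path (REAL_BINARY) and that the log goes to a hypothetical file (LOG_FILE); neither name comes from this thread. It appends the command line and environment to the log and then execs the real daemon with the same arguments.

```python
import os
import sys
import time

REAL_BINARY = "/usr/sbin/ntpd.real"         # hypothetical: where the original was moved
LOG_FILE = "/var/tmp/ntpd-invocation.log"   # hypothetical: "someplace safe"


def describe_invocation(argv, env):
    """Format the command line and environment as text."""
    lines = [time.strftime("%Y-%m-%d %H:%M:%S"), "argv: " + repr(list(argv))]
    lines += [f"env: {key}={env[key]}" for key in sorted(env)]
    return "\n".join(lines) + "\n\n"


def main():
    with open(LOG_FILE, "a") as log:
        log.write(describe_invocation(sys.argv, os.environ))
    # Replace this process with the real daemon, passing the arguments through.
    os.execv(REAL_BINARY, [REAL_BINARY] + sys.argv[1:])


# Only act once the real binary has actually been moved into place.
if __name__ == "__main__" and os.path.exists(REAL_BINARY):
    main()
```

Because the wrapper sees exactly what the init machinery passes, a -c argument or an odd environment variable shows up in the log regardless of which script or tool supplied it.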

I'm going to grind through the strace output next week.

 
 I have friends who'll run the binary with gdbserver and then they
 connect with gdb and have their way with the binary.  I've never done that
 so I have no idea how you'd invoke gdbserver.

I don't know if we even have gdbserver.

Joe Gwinn


 Joe Gwinn writes:
  Which brings me to a question:  How does one get NTP to tell you exactly 
  where it is getting such things as the ntp.conf file from, all without 
  being able to find or see the actual command line or lines that launched 
  the daemon?  I did not see a ntpq command that sounded plausible, 
  although ntpq would be an obvious choice.
 
  This would be very useful for debugging, as each and every platform type 
  seems to have a different approach to handling NTP.



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-05 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Richard B. Gilbert [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
  In article [EMAIL PROTECTED],
   Steve Kostecke [EMAIL PROTECTED] wrote:
  
  On 2008-09-03, Joseph Gwinn [EMAIL PROTECTED] wrote:
 
  Read the service shell script.  It appears to get its file paths from 
  environment variables named after the thing being started and stopped 
  and accessible only in the root environment; this bit of RHEL-specific 
  structure is being chased down.  (Does anyone know where this is 
  documented?)
  On Linux OSes init scripts are typically found in /etc/init.d/ or
  /etc/rc.d/init.d/ Look for one named ntp (or something containing ntp).
  
  Yes, and that's where strace led me, where I found a script called ntpd. 
  How the service script interacts with this ntpd script isn't clear.  
  Environment variables seem to be implicated, but a listing of 
  environment variables is not helpful.  Next week I'll digest it all.
  
   
  Which brings me to a question:  How does one get NTP to tell you exactly 
  where it is getting such things as the ntp.conf file from, all without 
  being able to find or see the actual command line or lines that launched 
  the daemon?  I did not see a ntpq command that sounded plausible, 
  although ntpq would be an obvious choice.
 
  This would be very useful for debugging, as each and every platform type 
  seems to have a different approach to handling NTP.  
  Why not use the file location features built in to your OS to find all
  possible instances of ntp.conf?
 
  $ locate ntp.conf
 
  or 
 
  $ find / -name ntp.conf
 
  Pipe the output of either of those commands to 'xargs ls -l' to see the
  datestamps of the files.
  
  We did this, but could not tell which one mattered.  Next week.
  
  Nor is it *required* that the ntp configuration file be called ntp.config.
  
  
  Joe Gwinn
 
 There MIGHT, in rare cases, be good reason NOT to call the configuration 
 file ntp.conf (it's conf not config, unless someone changed it 
 recently).  IF so, both the new name and the reasons for it should be 
 documented!  In most cases it's best to stick with the de facto standard.

I agree completely.  But I didn't set the thing up.  But I do have to 
figure it out and fix it.  And document it.  It did flummox all our 
sysadmins, although as with sysadmins worldwide they are too busy.

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-04 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Peter J. Cherny [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
 ...
  Which brings me to a question:  How does one get NTP to tell you exactly 
  where it is getting such things as the ntp.conf file from, all without 
  ...
 [EMAIL PROTECTED] ~]$ strings /usr/sbin/ntpd|grep ntp.conf
 /etc/ntp.conf

In the RHEL case, this would find exactly the wrong copy of ntp.conf, 
being the one we were changing to no avail, not the one that NTP was in 
fact using.


 [EMAIL PROTECTED] ~]$ strace -f -o x /usr/sbin/ntpd -g

I'll have to look into this.  It sounds like it might be general enough.

 
 [EMAIL PROTECTED] ~]# grep ntp.conf x
 3351  open("/etc/ntp.conf", O_RDONLY)   = 4

Doesn't this assume that the correct ntp.conf file is called ntp.conf?  
It may be common, the standard convention, but it is not required.

The whole point is to find the correct file without making assumptions, 
because on a strange computer strange things may have been done.
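One way to use the strace output without assuming a file name is to list every file the daemon opened read-only and succeeded in opening, then look through that list for the configuration file.  What follows is only a rough sketch in Python: the function name and the sample paths are made up for illustration, and the pattern assumes strace lines of the usual open()/openat() form with the quotes intact.

```python
import re

# Matches strace lines such as:
#   3351  open("/etc/ntp.conf", O_RDONLY)   = 4
#   3351  openat(AT_FDCWD, "/etc/ntp.conf", O_RDONLY|O_CLOEXEC) = 4
OPEN_RE = re.compile(
    r'\b(?:open|openat)\((?:AT_FDCWD, )?"([^"]+)", ([A-Z_|]+)[^)]*\)\s*=\s*(-?\d+)'
)

def files_read(strace_lines):
    """Return paths opened read-only with success, in order, no duplicates."""
    found = []
    for line in strace_lines:
        m = OPEN_RE.search(line)
        if not m:
            continue
        path, flags, result = m.group(1), m.group(2), int(m.group(3))
        # A negative result means the open failed (for example ENOENT).
        if result >= 0 and "O_RDONLY" in flags.split("|") and path not in found:
            found.append(path)
    return found

# Hypothetical sample lines; real output would come from the file written
# by "strace -f -o x ...".
sample = [
    '3351  open("/etc/ntp.conf", O_RDONLY)   = 4',
    '3351  open("/etc/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '3351  openat(AT_FDCWD, "/opt/site/time.cfg", O_RDONLY|O_CLOEXEC) = 5',
    '3351  open("/var/log/ntpstats/loopstats", O_WRONLY|O_CREAT, 0644) = 6',
]
for path in files_read(sample):
    print(path)
```

The list will also contain shared libraries and the like, so a human still has to pick out the configuration file, but nothing here depends on it being called ntp.conf.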

Thanks,

Joe Gwinn



Re: [ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-04 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Richard B. Gilbert [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
  We had been struggling with NTP running on Red Hat Enterprise Linux 
  (RHEL) on IBM-built Intel boxes, specifically with getting NTP to 
  generate loopstats and peerstats files.  Basically, nothing worked, 
  despite many attempts.  
  
  Yesterday, I cracked it while working on the problem with one of the 
  sysadmins.  The ntp.conf file was very complex, and I suspected it of 
  being mostly flotsam and jetsam from prior uses, and probably in 
  conflict with itself, so we cleaned it down to maybe three lines, and 
  then stopped and started ntpd using the service utility.
  
  The daemon started, but complained that it was unable to synchronize.  
  The sysadmin mentioned that it very often did this, for unknown reasons.  
  Then I noticed that the timeserver IP address was not the same as 
  specified in the simplified ntp.conf file, and sure enough the address 
  that NTP was trying to use was not accessible to ping.
  
  Hmm.  Huh?  NTP cannot be using the ntp.conf file we thought it was.  
  Tried starting NTP manually with -c option and providing the full path 
  to our ntp.conf file.  Success!
  
  Read the service shell script.  It appears to get its file paths from 
  environment variables named after the thing being started and stopped 
  and accessible only in the root environment; this bit of RHEL-specific 
  structure is being chased down.  (Does anyone know where this is 
  documented?)
  
  Which brings me to a question:  How does one get NTP to tell you exactly 
  where it is getting such things as the ntp.conf file from, all without 
  being able to find or see the actual command line or lines that launched 
  the daemon?  I did not see a ntpq command that sounded plausible, 
  although ntpq would be an obvious choice.
  
  This would be very useful for debugging, as each and every platform type 
  seems to have a different approach to handling NTP.  
  
  Joe Gwinn
 
 I don't recall ever encountering such a facility.  Or ever needing one.

You are very fortunate.  I do need one.  

You have confirmed my suspicion that NTP has no such facility.

Use of strace has been suggested.  It is available on some but not all 
platforms at present.


 It seems to me that this is the sort of thing that the sysadmin should 
 be documenting.  And if he has not documented it, perhaps you should 
 wonder what's going to happen when he walks in front of a truck, or is 
 shot by an irate husband!

In this case, none of the sysadmins (who are too busy) had any idea what 
was going on, and they didn't know enough about NTP to realize what was 
going on.  The sysadmin had gotten the can't-sync error message many 
times, but didn't quite understand what it was saying.  So even if he is 
hit by an irate truck, his replacement won't necessarily be better or 
worse.

The problem I'm trying to solve is different.  We put NTP on lots of 
different kinds of computer, mostly Unix, but some Windows, and I'm 
looking for diagnosis tools that will tell me what's really going on, 
precisely so I can debug unfamiliar setups no matter how screwed up.

Thanks,

Joe Gwinn



[ntp:questions] Finding out where ntpd gets its ntp.conf file

2008-09-03 Thread Joseph Gwinn
We had been struggling with NTP running on Red Hat Enterprise Linux 
(RHEL) on IBM-built Intel boxes, specifically with getting NTP to 
generate loopstats and peerstats files.  Basically, nothing worked, 
despite many attempts.  

Yesterday, I cracked it while working on the problem with one of the 
sysadmins.  The ntp.conf file was very complex, and I suspected it of 
being mostly flotsam and jetsam from prior uses, and probably in 
conflict with itself, so we cleaned it down to maybe three lines, and 
then stopped and started ntpd using the service utility.

The daemon started, but complained that it was unable to synchronize.  
The sysadmin mentioned that it very often did this, for unknown reasons.  
Then I noticed that the timeserver IP address was not the same as 
specified in the simplified ntp.conf file, and sure enough the address 
that NTP was trying to use was not accessible to ping.

Hmm.  Huh?  NTP cannot be using the ntp.conf file we thought it was.  
Tried starting NTP manually with -c option and providing the full path 
to our ntp.conf file.  Success!

Read the service shell script.  It appears to get its file paths from 
environment variables named after the thing being started and stopped 
and accessible only in the root environment; this bit of RHEL-specific 
structure is being chased down.  (Does anyone know where this is 
documented?)

Which brings me to a question:  How does one get NTP to tell you exactly 
where it is getting such things as the ntp.conf file from, all without 
being able to find or see the actual command line or lines that launched 
the daemon?  I did not see a ntpq command that sounded plausible, 
although ntpq would be an obvious choice.

This would be very useful for debugging, as each and every platform type 
seems to have a different approach to handling NTP.  

Joe Gwinn



Re: [ntp:questions] NTP Drifts +ve and -ve

2008-08-20 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (Arul Murugan) wrote:

 Hi, We are using NTP4, when CPU is very busy some of the UDP packets [are] 
 dropped by the kernel, so the local clock drifts 60 milliseconds from the 
 time server. 

Dropped packets are quite unlikely to be the problem, even if most 
packets never arrive.

More likely is that the NTP daemon is being preempted between taking the 
send timestamp and the sent packet actually appearing on the wire, and 
between a received packet's actual arrival time and when the daemon is 
able to obtain the receipt timestamp.  These preemptions appear to the 
daemon as very large and random asymmetrical transport delays.  If 
sufficiently common, these bad observations will seep through the various 
filter steps in NTP, and corrupt the measurements of clock offset error 
used to update the servo.  

See http://www.eecis.udel.edu/~mills/stamp.html.
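To make the effect concrete, here is a small sketch in Python of the standard NTP on-wire calculation, where t1 and t4 are the client's send and receive timestamps and t2 and t3 are the server's receive and send timestamps.  The numbers are invented for illustration: a client whose clock is exactly right, a 1 ms path each way, and a made-up 60 ms preemption between taking the send timestamp and the packet reaching the wire.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation from the four timestamps (seconds)."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # apparent clock offset
    delay = (t4 - t1) - (t3 - t2)            # apparent round-trip delay
    return offset, delay

# Undisturbed exchange: 1 ms out, 1 ms of server processing, 1 ms back.
clean = offset_and_delay(0.000, 0.001, 0.002, 0.003)

# Same exchange, but the daemon is preempted for 60 ms after stamping t1,
# so everything after t1 happens 60 ms later.
preempted = offset_and_delay(0.000, 0.061, 0.062, 0.063)

print("clean:     offset %.3f s, delay %.3f s" % clean)
print("preempted: offset %.3f s, delay %.3f s" % preempted)
```

In this sketch the one-sided delay shows up as an apparent offset of half the preemption time, even though the clock has not moved at all.  That is the kind of bad observation the filters have to reject.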


What computer platform and operating system are you using?


One classic solution is to give the NTP daemon sufficient realtime 
priority to outrank whatever else the CPU is doing, thus sharply 
reducing the fraction of NTP polls that suffer preemption.  

This raised priority will not cause those other activities to be any 
slower because the NTP daemon is an insignificant consumer of CPU 
resources.


  From that point, NTP keeps drifting +ve and -ve for 2 to 3 days before it 
 becomes stable. The graph looks like a sine wave, oscillating and reaching 
 zero after 3 days. My questions are: 

  1. Why [is] NTP drifting +ve and -ve? 

Because the clock servo is being fed contaminated data, as explained 
above.


  2. Why should NTP [be] taking 3 days for correcting 60 milliseconds?  

Because it takes NTP days, rather than a few hours, to slog through all 
that bad data.


  3. Is this a problem or it is expected?

Both.  It is a problem for sure, but is to be expected under these 
circumstances.


Joe Gwinn



Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-16 Thread Joseph Gwinn
In article 
[EMAIL PROTECTED],
 jlevine [EMAIL PROTECTED] wrote:

 Hello,
 
  While it's unlikely that I will soon get to build such an instrument, I
  am quite interested in how they are built, if only to understand what
  can happen and why.  Can you suggest some articles and/or books and/or
  patents delving into both the theory and the practicalities of building
  DMTD instruments?
 
 We (the time and frequency division of NBS/NIST) designed and built
 a dual-mixer system in 1980 (more or less). This same system is the one
 that still runs the atomic clock ensemble in Boulder. You can get the
 publications that describe this instrument from the publications database
 on our web site. Go to tf.nist.gov and click on the publications menu.
 When the menu appears, look for author Glaze. The stuff was published in
 about 1983 or so. There were several papers as I recall with various
 combinations of the folks who built the system and the software drivers
 for it.

This is precisely the kind of pointer I was hoping for.  Thanks.


 The system we built was totally analog, but a modern system would probably
 be fully digital. Our system had a resolution of about 0.2 ps and a
 stability of about 3-4 ps. A digital system could do better, mostly because
 the temperature sensitive stuff could be confined to the analog front end
 whereas we had to worry about temperature pretty much everywhere in the
 system.

That isn't bad for 1980 analog electronics.  I think that the 5120 is 
the digital realization, as discussed in other postings.  That said, the 
5120 is temperature sensitive, and one had to allow many hours for 
temperatures to stabilize, but then the resolution appeared to be about 
0.01 ps.  I assume that the improvement from 0.2 ps was due to the fancy 
matched-mixers trick, combined with use of a very low noise oscillator.


 However, the job is not trivial, since even tiny impedance mismatches can
 cause problems at this sub-picosecond resolution. You should watch especially
 for the connectors and the cables. We typically use SMA connectors and
 rigid coax. The inputs are buffered with distribution amplifiers with
 a reverse isolation that is as good as we can make it. About -165 dB, I think,
 although I have not looked at that recently. (Note that the problems are not
 adequate digital computing power but plain old analog electronics.)

As I said, I don't think I will be building such an instrument.  But 
it's just this kind of nitty gritty detail I want to be aware of, for 
interest, and for self-protection in the lab.


 Even so, we have a detectable sensitivity to temperature at the
 level of ps. This noise level tends to be too small to affect the
 data from cesium standards, but it could be a problem if you were trying to
 calibrate the long-period performance of a device or a transmission system
 that had a small delay, since the residual diurnal temperature sensitivity
 could come to get you. 

What we were doing was to measure the temperature coefficient of 
electrical length of a temperature-stable 10 MHz distribution amplifier, 
the goal being a tempco not exceeding 1.0 ps per degree centigrade.  Some 
of the tested amps achieve ~0.5 ps/degree C, in a total delay of ~4.5 
nanoseconds, or ~111 ppm per degree C, call it 100 ppm.
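For anyone checking the arithmetic, the ppm figure is just the change in delay divided by the total delay, per degree.  A minimal sketch in Python, with a made-up function name:

```python
def tempco_ppm_per_deg_c(delta_delay_s, total_delay_s, delta_temp_c=1.0):
    """Fractional change in delay per degree C, expressed in ppm."""
    return (delta_delay_s / total_delay_s) / delta_temp_c * 1e6

# 0.5 ps change per degree C in a total delay of 4.5 ns.
ppm = tempco_ppm_per_deg_c(0.5e-12, 4.5e-9)
print("%.0f ppm per degree C" % ppm)
```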

The test consisted of measuring changes in total delay at three 
temperatures, 17, 24, and 31 degrees C.  The problem is that it took at 
least an hour for the amplifier to stabilize at each temperature, so 
instrument drift is a significant source of error.  The measured 
(RC-like) time constant of the delay of the amplifier in the chamber is 
14 minutes.

My solution was to compare the amplifier under test to a mechanical 
variable delay unit (Colby Instruments PDL-100A-625PS-5.0NS), using a 
fast sampling scope (200 femtosecond rms jitter(?), averaged down to ~50 
fs) as the null detector.  

The specific circuit is a low-noise oscillator (Symmetricom 1050A) 
driving the first splitter, one output driving the scope sync input, the 
other driving the input of the second splitter.  One output of the 
second splitter drives the reference path, which contains the variable 
delay unit.  The other output drives the device path, which contains the 
amplifier under test.  Both device and reference path cables pass 
through the environmental chamber, with the heated lengths held equal.  
The cables are low tempco as well (~1.5 ppm per degree C).  Everything 
was 50-ohm, at least nominally, but no attempt at precision matching or 
isolation was made, and the connectors and adapters were a mix of 
whatever could be scrounged up in the lab.

This setup yielded clean data, easily sufficient to the purpose.  The 
main limits to accuracy appear to be hysteresis in the amplifiers under 
test, and the cyclic temperature variation of the environmental chamber 
itself.


 If you are in this business then you need professional help.

Heh.  I've been told this before, but the issue 

Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-14 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Uwe Klein [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
 
  OK.  It sounds like what the 5120 does.  I bet that there are a lot of 
  details to get *exactly* right, though.
 Right.
 
 But with having a ten year old Cray in every laptop ...

Computational power must be harnessed to be useful.  I'm talking about 
the considerable human effort required for the harnessing.

Joe Gwinn



Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-14 Thread Joseph Gwinn
In article 
[EMAIL PROTECTED],
 jlevine [EMAIL PROTECTED] wrote:

  I may need a Dual Mixer Time Difference (DMTD) instrument, to measure
  picosecond changes in electrical length in a coax plus amplifier time
  reference signal distribution system with total delays in the hundreds
  of nanoseconds, currently operating at 10 MHz (sinewave), but with 100
  MHz likely at some future date.
 
  What DMTD instruments are commercially available?  A google search was
  not successful - all noise no detectable signal, probably because DMTD
  instruments are not that common, and many people build their own.
 
We use dual-mixer systems in our primary time scale and also to
 calibrate and evaluate oscillators and timing hardware. So far as I
 know, the only units that are commercially available are made by Timing
 Solutions, which was recently acquired by Symmetricom. There
 are a number of different configurations, depending on how many
 devices you want to measure, whether they all run at the same
 frequency, etc.

That's what I have been finding, and now it is confirmed.

I don't know why Symmetricom keeps the 5120 under their hat.  It's 
really a strange story - the only way to find out that the 5120 is a 
DMTD instrument (done up in all-digital DSP form) was by knowing that 
TSC used to make an analog DMTD instrument, and following TSC's (and 
specifically Dr Stein's) trail in the literature.


It is possible to build these devices on your own,  but it is not
 trivial to get pico-second resolution and stability. Almost everything
 is temperature sensitive at this level of resolution.

I think such instruments are also sensitive to user mood.


While it's unlikely that I will soon get to build such an instrument, I 
am quite interested in how they are built, if only to understand what 
can happen and why.  Can you suggest some articles and/or books and/or 
patents delving into both the theory and the practicalities of building 
DMTD instruments?

Thanks,

Joe Gwinn


Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-13 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 David L. Mills [EMAIL PROTECTED] wrote:

 Joseph,
 
 I took a look at the instrument instruction manual to see what is going 
 on. In typical todayspeak, Symmetricom doesn't say how the gadget works. 
 I make it what used to be called a Costas direct-conversion receiver. 
 The test signal is connected to two mixers; the reference oscillator is 
 connected to the other mixer inputs in quadrature. The mixer outputs are 
 digitized and filtered, the Q signal is shifted 90 degrees from the I 
 signal and combined. The result is a baseband SSB signal which is then 
 Fourier transformed for display. Is this what you have in mind?

Yes, but not quite the whole story.  Although impossible to discern from 
Symmetricom's 5120 datasheet and users guide, there is more to it than 
that.

I found this instrument by accident while researching the literature for 
DMTD information.  This search led me to Timing Solutions Corp (which 
was bought by Symmetricom in 2006) and "Direct-Digital Phase-Noise 
Measurement", J. Grove, J. Hein, J. Retta, P. Schweiger, W. Solbrig, 
and S.R. Stein, 2004 IEEE International Ultrasonics, Ferroelectrics, and 
Frequency Control Joint 50th Anniversary Conference, pages 287-291.  But 
if this is an advance in the technology, there could be a patent, and 
there was: "Two-Channel Digital Phase Detector", US Patent 7,227,346 to 
Wayne E. Solbrig.

I then approached Symmetricom, which led me to the 5120 (1 MHz to 30 
MHz) and the 5125 (future, 1 MHz to 400 MHz).  A section of the above 
article appears in the 5120 users guide.  

I have no idea why Symmetricom doesn't really mention that the 5120 can 
do these things, but I assume that the market for phase noise test sets 
vastly exceeds all other markets for a 5120-like instrument.  

I borrowed an early demo 5120 instrument, and in my somewhat slapdash 
lab setup, it was easily able to resolve 0.01 picosecond (eyeball rms 
width of the traces) changes in delay at 10 MHz while using a very quiet 
oscillator (a Symmetricom 1050A), after warming up overnight.

Joe Gwinn


 Dave
 
 Joseph Gwinn wrote:
 
  In article [EMAIL PROTECTED],
   Joseph Gwinn [EMAIL PROTECTED] wrote:
  
  
 I may need a Dual Mixer Time Difference (DMTD) instrument, to measure 
 picosecond changes in electrical length in a coax plus amplifier time 
 reference signal distribution system with total delays in the hundreds 
 of nanoseconds, currently operating at 10 MHz (sinewave), but with 100 
 MHz likely at some future date.
 
 What DMTD instruments are commercially available?  A google search was 
 not successful - all noise no detectable signal, probably because DMTD 
 instruments are not that common, and many people build their own.
  
  
  The silence, the silence.  I have not found too many commercial DMTD 
  units, but I have found one, although the maker does not market it as 
  such:
  
  The Symmetricom 5120 
  http://www.symmttm.com/products_pn_adev_test_sets_5120A.asp is at 
  heart a digital DMTD instrument, and will make all the usual DMTD 
  measurements, although it is marketed primarily as a phase noise test 
  set.
  
  What else is available?  
  
  
  Joe Gwinn



Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-13 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Uwe Klein [EMAIL PROTECTED] wrote:

 Joseph Gwinn wrote:
  I may need a Dual Mixer Time Difference (DMTD) instrument, to measure 
  picosecond changes in electrical length in a coax plus amplifier time 
  reference signal distribution system with total delays in the hundreds 
  of nanoseconds, currently operating at 10 MHz (sinewave), but with 100 
  MHz likely at some future date.
  
  What DMTD instruments are commercially available?  A google search was 
  not successful - all noise no detectable signal, probably because DMTD 
  instruments are not that common, and many people build their own.
  
  Thanks,
  
  Joe Gwinn
 
 Take one of the better GS DSO's that have high storage depth.
 Read the shots from the DSO and do all further processing in software?

I don't understand how this would work.  Could you expand the 
description?   And what is GS?

Thanks,

Joe Gwinn



Re: [ntp:questions] Dual Mixer Time Difference (DMTD) instruments sought

2008-05-13 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (John Ackermann N8UR) wrote:

 Joseph Gwinn said the following on 05/12/2008 10:38 PM:
 
  What DMTD instruments are commercially available?  A google search was 
  not successful - all noise no detectable signal, probably because DMTD 
  instruments are not that common, and many people build their own.
  
  The silence, the silence.  I have not found too many commercial DMTD 
  units, but I have found one, although the maker does not market it as 
  such:
  
  The Symmetricom 5120 
  http://www.symmttm.com/products_pn_adev_test_sets_5120A.asp is at 
  heart a digital DMTD instrument, and will make all the usual DMTD 
  measurements, although it is marketed primarily as a phase noise test 
  set.
  
  What else is available?  
 
 The 5120A is truly a wonderful box, but it's also not cheap (about
 $30K).  It's fully DSP based so all the interesting stuff is done in
 software.  One huge advantage is that the reference and
 device-under-test do not have to be at the same frequency.  There's an
 older version, the 5110A, that has been discontinued but should sell
 used for less than $10K if you can find one.  It's more of a pure DMTD
 box and doesn't do phase noise in a useful way.

The 5110A is analog, I think, although I never did get a users guide.


 I don't know of other commercially marketed products that provide a DMTD
 function.  However, there's been quite a bit of discussion about this
 over on the time-nuts list, and that's probably a better place for your
 question (https://www.febo.com/mailman/listinfo/time-nuts).

I joined, but will lurk for now.

 
 The single most critical piece of a DMTD system is the zero crossing
 detector.  Unless you have a way to increase the slew rate of the low
 frequency beat note by a million or so, trigger jitter in the counter
 will eat up almost all the advantages of the down-mix.  Again, there's
 been some discussion about this on time-nuts, and there are some folks
 there working on designing and building bits of the hardware (at least,
 a couple of months ago there was a fair bit of discussion on the point).

Yes.  And don't forget ground loops.  Noise at 1 Hz is very difficult to 
shield.

I bet one big advantage of the DSP approach is that math is cleaner than 
practical analog hardware.
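The "million or so" falls out of the down-mix arithmetic.  Mixing preserves the phase difference between the two signals, so a time difference measured on the beat note corresponds to a time difference at the carrier that is smaller by the ratio of carrier to beat frequency.  A rough sketch in Python, with the 10 Hz beat frequency chosen purely as an example (it is my assumption, not a figure from this thread):

```python
def magnification(carrier_hz, beat_hz):
    """Factor by which the down-mix stretches time differences."""
    return carrier_hz / beat_hz

def carrier_time_difference(beat_dt_s, carrier_hz, beat_hz):
    """Time difference at the carrier implied by one measured on the beat note."""
    return beat_dt_s / magnification(carrier_hz, beat_hz)

carrier = 10e6   # 10 MHz signals under test
beat = 10.0      # assumed beat note, Hz

print("magnification: %.0e" % magnification(carrier, beat))
# A 1 microsecond difference between beat-note zero crossings ...
dt = carrier_time_difference(1e-6, carrier, beat)
# ... corresponds to about 1 picosecond at the carrier.
print("carrier time difference: %.1e s" % dt)
```

The catch is the one already described: the beat note crosses zero a million times more slowly too, so the zero crossing detector has to win back that slew rate or the counter's trigger jitter eats the gain.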

Joe Gwinn



Re: [ntp:questions] no ntp synchronisation: 2s to 6s time shift !

2008-02-24 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Richard B. Gilbert [EMAIL PROTECTED] wrote:

 Hans Jørgen Jakobsen wrote:
  On Sat, 23 Feb 2008 23:38:09 GMT, Danny Mayer wrote:
  
 No need to look. I haven't had the bandwidth to get much done in the 
 last few months. Needless to say it will be much easier to implement on 
 Windows than on Unix machines as Windows uses threads so there's no 
 issue with the transfer of information between processes.
  
  
  Why are threads not used on UNIX?
  /hjj
 
 I've never used them but I believe that Solaris supports threads.

True.  IRIX (SGI) and AIX (IBM) also support threaded applications.

The Solaris kernel is also itself threaded.  I think this is true of 
IRIX and AIX.  The HP-UX kernel is not threaded, although threaded 
applications are supported.

These are the OSs I know something of.

Joe Gwinn


Re: [ntp:questions] Lep seconds

2008-01-14 Thread Joseph Gwinn
Dave,

In article [EMAIL PROTECTED],
 David L. Mills [EMAIL PROTECTED] wrote:

 Joseph,
 
 Conversely, if a client synchronizes to a server strictly running TAI 
 and never signals leaps, NTP will deliver TAI. NIST, USNO and I have 
 discussed this several times and concluded the lesser of two evils is 
 to continue with NTP on UTC.

Yep.  True enough.  But GPS emits TAI (plus an offset), so one can claim 
that configuring the NTP timeserver to emit GPS System Time (not UTC) is 
to generate what is essentially TAI.  This is widely done in the 
big-radar world.

Joe


 Dave
 
 Joseph Gwinn wrote:
 
  In article [EMAIL PROTECTED],
   [EMAIL PROTECTED] (David Woolley) wrote:
  
  
 In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] wrote:
 
 
 compliant.  Is there a similar mod for NTP?  I am
 hoping that there is a mod that will cause NTP to
 supply theoretical UTC (even if it is not ASCII).
 
 Both POSIX and NTP use UTC.  Your problem is that you are not
 using UTC, but, rather, using TAI.
  
  
  Actually, POSIX does *not* use UTC in the normal sense of the word, as 
  no leap seconds are applied.
  
  The fundamental POSIX timescale counts what amount to SI seconds from 
  the POSIX Epoch, 0h 0m 0s UTC 1 January 1970.  Every day contains 
  exactly 86,400 seconds.
  
  That said, if one drives a POSIX box via NTP from a GPS timeserver set 
  to emit UTC (versus GPS System Time), time on the POSIX box will be 
  pretty close to UTC.
  
  Joe Gwinn
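The every-day-is-86,400-seconds rule can be shown directly.  In this Python sketch (function name made up), the POSIX count at midnight is simply the number of calendar days since the Epoch times 86,400, with no term anywhere for the leap seconds that UTC has accumulated since 1970.

```python
import calendar
from datetime import date

def posix_seconds_at_midnight(year, month, day):
    """POSIX time at 00:00:00 on the given date: whole days times 86400."""
    days = (date(year, month, day) - date(1970, 1, 1)).days
    return days * 86400

# Compare with the library conversion, which follows the same POSIX rule.
by_hand = posix_seconds_at_midnight(2008, 1, 14)
by_library = calendar.timegm((2008, 1, 14, 0, 0, 0))
print(by_hand, by_library, by_hand == by_library)
```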



Re: [ntp:questions] SNMP support

2007-12-14 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 [EMAIL PROTECTED] (Svein Skogen) wrote:

 (I took the liberty of removing the lower half of this mail. See
 previous mails in this thread for complete history)
 David L. Mills wrote:
  Heiko,
  
  A couple of comments about this mission. First, last I looked SNMP had a 
  really hard time with floating point and the scaling issues are 
  dangerous. Second, as mentioned several times on the NTP hackers wire, 
  we would very much like to shoot ntpdc and its fascist (mode 7) 
  protocol. As of now, many configuration issues can be performed using 
  the mode-6 (ntpq) protocol. While many ntpdc related issues can be 
  easily moved to the mode-6 protocol, which is based on UDP, the monlist 
  function of ntpdc really needs TCP, as experience with monlist and UDP 
  demonstrates. This particular combination of UDP and TCP would not be 
  friendly to SNMP.
  
  I continue to speculate that an SNMP agent in an expert system would be 
  an ideal shotgun marriage between mode-6 and SNMP.
  
  Dave
  
 
 Disclaimer: I know parts of this have already been answered in the
 thread, and I know that a lot of the basis for my comments are made
 solely based on my memory of things, and memory (when you start to get
 my age) isn't a perfect match of things that were, but rather some
 guidelines to how things might have been. Thus I may be totally wrong, or
 answering the wrong question. (Now, I can get to the point. :) )
 
 One of the tricks I used in the old days for handling decimal numbers
 (which is why we need the floating point, isn't it?) was to use two
 variables, or to use a different (moving the decimal point) internal
 value, and dividing by 10^x for display.
 
 I'm guessing that what we need the floating point for, is the precision
 on our peers, and the precision of our drift. These values are (iirc)
 today a float number of milliseconds. And for all simplicity, they
 should remain that way for human presentation, to avoid unnecessary
 confusion.

An alternative that comes to mind is to use the scaled binary 
representation of the base two logarithm of the millisecond value in 
question.  Zero would need to be handled as a special case.  If the 
value is signed, then a sign field will also be needed.

For example, one could express 274591 milliseconds as 1000*Log2(274591)= 
18066.9248, or rounded to the integer 18067.
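As a sketch of how that might look in practice, here it is in Python.  The function names, the (sign, scaled) pair, and the use of sign 0 for the zero special case are all just my choices for illustration, not anything SNMP or NTP defines.

```python
import math

def encode_ms(value_ms, scale=1000):
    """Encode a millisecond value as (sign, round(scale * log2(|value|)))."""
    if value_ms == 0:
        return (0, 0)                     # zero handled as a special case
    sign = 1 if value_ms > 0 else -1
    return (sign, round(scale * math.log2(abs(value_ms))))

def decode_ms(sign, scaled, scale=1000):
    """Invert encode_ms, to within the rounding of the scaled logarithm."""
    if sign == 0:
        return 0.0
    return sign * 2.0 ** (scaled / scale)

sign, scaled = encode_ms(274591)
print(sign, scaled)                       # the scaled integer sent over SNMP
print(decode_ms(sign, scaled))            # close to, but not exactly, 274591
```

Two things to note.  Values smaller than 1 ms have negative logarithms, so the scaled field itself must be a signed integer.  And with a scale of 1000 the rounding limits the relative precision to a few parts in ten thousand, which may or may not be enough for offsets and drift.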

Joe Gwinn



Re: [ntp:questions] Any samples for NTP/SNTP client code?

2007-12-01 Thread Joseph Gwinn
In article 
[EMAIL PROTECTED],
 [EMAIL PROTECTED] wrote:

 Does anybody know of any *practical* samples on how to
 implement an NTP/SNTP client?  The goal is to provide accurate
 time for a program/client running on Windows Vista.
 
 Specifically, what values to include in the request message,
 how to process the reply message, etc.
 
 I am NOT asking how to send/receive UDP datagrams, or where
 to find comprehensive descriptions like RFC documents, or how
 to build or design user interfaces.
 
 Only a narrow description focused on NTP/SNTP request/reply
 datagrams for a simple PC client, preferably in C/C++ source
 code.

I've done this in an embedded realtime system.  (No, the source code is 
not available.)  

In Appendix A of RFC-1305 you will find the format of the NTPv3 
request/response packet.  Send this packet to port 123 of the NTP 
server, and read the reply packet.  It's pretty easy.  

The NTPv3 packet format will work with all timeservers of NTPv3 and 
above.
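Since my source is not available, here is a minimal sketch of the idea, in Python rather than C for brevity.  It builds the 48-byte request with only the first byte filled in (leap indicator 0, version 3, mode 3 for client), and pulls the server's transmit timestamp out of bytes 40 to 47 of the reply.  The function names are made up, the server name you pass in is up to you, and a careful client would use all four timestamps and sanity-check the reply rather than trusting the transmit timestamp alone.

```python
import socket
import struct

NTP_TO_UNIX = 2208988800   # seconds from 1900-01-01 (NTP era 0) to 1970-01-01

def build_request(version=3):
    """48-byte client request: LI=0, VN=version, Mode=3, everything else zero."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)

def transmit_time_unix(reply):
    """Server transmit timestamp (bytes 40-47) as seconds since the Unix epoch."""
    if len(reply) < 48:
        raise ValueError("short NTP reply")
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_TO_UNIX + fraction / 2.0 ** 32

def query(server, timeout=2.0):
    """Send one request to UDP port 123 of the server and return its time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_request(), (server, 123))
        reply, _ = s.recvfrom(512)
    return transmit_time_unix(reply)
```

Calling query() with the name of a timeserver you are allowed to use returns the server's idea of the time, uncorrected for network delay.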

Joe Gwinn



Re: [ntp:questions] NTP architecture recommendation

2007-08-19 Thread Joseph Gwinn
In article [EMAIL PROTECTED],
 Richard B. Gilbert [EMAIL PROTECTED] wrote:

 David J Taylor wrote:
  Richard B. Gilbert wrote:
  []
  
 I don't think that a 14 channel receiver would be useful!  There
 simply are not that many satellites!  The last I knew, there were 27
 NavStar (GPS) satellites in orbit.  Of these, about seven are usually
 above the horizon at any one time.
  
  
  Richard,
  
  There can be more than that - here's an example with 13 visible.
  
http://www.david-taylor.myby.co.uk/software/wxtrack-extras.htm
  
  According to this source, there are 31 active satellites:
  
http://celestrak.com/NORAD/elements/gps-ops.txt
  
  More in orbit, I expect, now dead.
  What do they do with decommissioned GPS satellites?
  
  Cheers,
  David 
  
  
 
 I suspect there's not much they can do with decommissioned GPS satellites!
 
 You could probably launch 27 GPS satellites for the cost of one manned 
 mission to retrieve/repair one.
 
 If you wait long enough, they will come down by themselves!  It may take 
 a few hundred years. . . .

If I recall, they have a deorbiting system, which is a rocket that fires 
against their orbital motion, causing them to fall out of orbit rather 
more quickly than that.  Of course, if the satellite has completely 
failed, the deorbit system won't be listening for commands, but usually 
things won't get that bad suddenly.

Joe Gwinn
