Re: [chrony-users] Isolated time domains
On Tue, 3 Dec 2013, Tomalak Geret'kal wrote:

> On 03/12/2013 01:29, Bill Unruh wrote:
>> [snip]
>
> I concede all of that. Though, once you have figured out what you want to happen, it's still worth testing.

Agreed. But be careful with your tests as well. The UBC cosmic microwave background group possibly lost out on a Nobel prize because the tests their rocket payload was subjected to were far harsher than they needed to be. The equipment had to be fixed after the test, which took about 6-8 months, and that allowed COBE to report its results for the CMB spectrum first. Subjecting your system to overly stringent tests can backfire on you. Test for expected conditions, not absurd conditions. Otherwise you waste time fixing problems that will never occur.

> Tom

--
William G. Unruh | Canadian Institute for | Tel: +1(604)822-3273
Physics | Advanced Research | Fax: +1(604)822-5324
UBC, Vancouver, BC | Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/

--
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email listmas...@chrony.tuxfamily.org.
RE: [chrony-users] kernel PPS troubleshooting
I think it is not an issue of actually losing PPS for long periods of time so much as chrony ignoring "valid" PPS pulses as faulty. Note: I'm calling the pulses "valid" since I can see them through ppstest, and the chrony debug output looks like it sees them with offsets below 5ms but ignores them.

We should be able to set up our data collection program to check for GPS lock and chrony's selected source, set "noselect" on the GPS once the PPS has locked on, and unset it again if we actually lose the PPS signal and need to reacquire.

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Monday, December 02, 2013 1:56 PM
To: chrony-users@chrony.tuxfamily.org
Subject: RE: [chrony-users] kernel PPS troubleshooting

The key purpose of the gps is to supply the seconds for the PPS. Once it has done that it is no longer needed. Thus you could have the gps run with pps for a while, and then do a noselect on it using chronyc. That way chrony would rely on the free running of the system clock to supply the seconds, and the pps to supply the microseconds.

However it is disturbing that you are losing pps for long periods of time. That might indicate that there is something wrong with your gps receiver. I had trouble with mine because the antenna was defective.

On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:

> Hi All,
> Sorry for the delayed response. I have collected 36 hours of data with the
> following sources:
>
> refclock PPS /dev/pps1 refid PPSi
> refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
> server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
>
> Since we will be running disconnected from real NTP servers in our
> application I had the 3 NTP servers as noselect so that I could track the GPS
> and PPS against them, but not actually use them in the selection algorithm.
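The monitoring idea at the top of this message (watch which source chrony has selected, then change the GPS source's options) could start from something like the sketch below. It only parses the plain-text output of `chronyc sources` to find the currently selected source, relying on the documented convention that the second column character is `*` for the selected source. The sample text is illustrative and was not captured from the system discussed in this thread. How the "noselect" change is then applied is deliberately left out, since the thread does not say which command would be used.

```python
# Minimal sketch: find the source chronyd currently has selected by parsing
# the text printed by `chronyc sources`. SAMPLE_OUTPUT is made up for
# illustration; the values are NOT from the thread.

SAMPLE_OUTPUT = """\
210 Number of sources = 2
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PPSi                          0   4   377    12   +245us[ +116us] +/-  226us
#- GPSi                          0   4   377    13    -71ms[  -71ms] +/-   22ms
"""


def selected_source(sources_text):
    """Return the name of the selected source, or None if there is none.

    The first character of a source line is the mode ('^' server, '=' peer,
    '#' reference clock); the second is the state, where '*' marks the
    source chronyd is currently synchronised to.
    """
    for line in sources_text.splitlines():
        if len(line) >= 2 and line[0] in "^=#" and line[1] == "*":
            fields = line[2:].split()
            if fields:
                return fields[0]
    return None


def pps_is_selected(sources_text, pps_refid="PPSi"):
    """True when the PPS refclock (refid assumed to be 'PPSi') is selected."""
    return selected_source(sources_text) == pps_refid


if __name__ == "__main__":
    print(selected_source(SAMPLE_OUTPUT))
    print(pps_is_selected(SAMPLE_OUTPUT))
```

In a real deployment the text would come from running `chronyc sources` (for example via `subprocess.run`), and the check would be repeated over several polls before acting, so that a single transient reselection does not trigger a change.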
RE: [chrony-users] kernel PPS troubleshooting
On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:

> I think it is not an issue of actually losing PPS for long periods of time so
> much as chrony ignoring "valid" PPS pulses as faulty. Note: I'm calling the
> pulses "valid" since I can see them through ppstest and the chrony debug
> output looks like it sees them with offsets below 5ms but ignores them.

It should not be ignoring them; that does sound like a bug. My only concern is that I have not seen my system ignore pps pulses (but then I do not use the kernel pps; I use my own driver, which feeds the pps through the shm).

> We should be able to set our data collection program up to check for GPS lock
> and chrony's selected source to set "noselect" on the GPS after the PPS has
> locked on, and then unset it if we actually lose a PPS signal and need to
> reacquire.

You would have to lose lock for a LONG time to need to reuse the gps to set the seconds. Typically pps will bring the system drift to much less than 1 PPM, which would take a month to produce a 1 second error. I.e., you would have to lose lock for a month before you would need to reuse GPS, in which case something far more serious than "lose lock" has happened. I.e., even with a sporadically working PPS, you should be able to get the computer time to within a second by, say, gps, and then forget about it.

> -----Original Message-----
> From: Bill Unruh [mailto:un...@physics.ubc.ca]
> Sent: Monday, December 02, 2013 1:56 PM
> To: chrony-users@chrony.tuxfamily.org
> Subject: RE: [chrony-users] kernel PPS troubleshooting
>
> The key purpose of the gps is to supply the seconds for the PPS. Once it has
> done that it is no longer needed. Thus you could have the gps run with pps
> for a while, and then do a noselect on it using chronyc. That way chrony
> would rely on the free running of the system clock to supply the seconds, and
> the pps to supply the microseconds. However it is disturbing that you are
> losing pps for long periods of time.
> That might indicate that there is something wrong with your gps receiver. I
> know I had trouble with mine that the antenna was defective.
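The drift estimate in the reply above can be checked with a few lines of arithmetic. The sketch below assumes a constant residual frequency error (a simplification, since real oscillators wander) and computes how long a free-running clock takes to accumulate a one-second error. At exactly 1 ppm that is about 11.6 days; the "month" figure corresponds to a residual error of roughly 0.4 ppm, which is consistent with "much less than 1 PPM". The function name is hypothetical.

```python
# Minimal sketch: time for a constant frequency error to build up a given
# time error. Assumes the drift stays constant while the clock free-runs.

SECONDS_PER_DAY = 86_400


def days_to_accumulate(error_s, drift_ppm):
    """Days of free running needed to accumulate error_s seconds of offset
    at a constant frequency error of drift_ppm parts per million."""
    if drift_ppm <= 0:
        raise ValueError("drift_ppm must be positive")
    seconds = error_s / (drift_ppm * 1e-6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for ppm in (1.0, 0.4, 0.1):
        days = days_to_accumulate(1.0, ppm)
        print(f"{ppm:>4} ppm -> {days:6.1f} days to drift 1 s")
```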
RE: [chrony-users] kernel PPS troubleshooting
Hi All,

Sorry for the delayed response. I have collected 36 hours of data with the following sources:

refclock PPS /dev/pps1 refid PPSi
refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect

Since we will be running disconnected from real NTP servers in our application I had the 3 NTP servers as noselect so that I could track the GPS and PPS against them, but not actually use them in the selection algorithm.

> Miroslav said:
> If a source disappears for 8 polling intervals, chronyd will select another
> source even if it's much worse. I agree that could be improved. With NMEA
> sources it's usually better to use the noselect option or don't configure it
> at all.

Since we will not have access to a network time source and will be relying on GPSD/NMEA to get us in the correct ballpark on system startup, is there another configuration option we can try to minimize the snapping back to GPS so quickly?

The three attached plots are:

4hr_offsets: Hours 0-4, offsets straight from statistics.log
4hr_offsets_PPSadjusted: Hours 0-4, adjusted offsets assuming PPS was always 0 and using the most recent PPS value to adjust the actual offset in statistics.log
Syncsource_PPSadjusted: Hours 2-4, same data as PPSadjusted but with background highlighted according to active sync source from tracking.log

Looking through the refclocks.log it seems as though even with both PPS and GPS present and having samples filtered, often after a GPS filtered entry in the log PPS samples would be dropped completely until one or more subsequent GPS filtered entries.
{14 GPSi samples and 14 PPSi samples}
2013-11-27 23:08:38.999883 PPSi 15 N 1 2.455370e-04 1.161940e-04 2.265e-04
2013-11-27 23:08:36.999489 PPSi - N - -5.107210e-04 1.854e-04
2013-11-27 23:08:39.600949 GPSi 15 N 0 -6.007421e-01 -7.094921e-02 2.206e-02
2013-11-27 23:08:33.249250 GPSi - N - - -1.925024e-02 6.892e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:08:55.532367 GPSi 15 N 0 -5.323654e-01 -2.367523e-03 2.179e-02
2013-11-27 23:08:46.365687 GPSi - N - - -3.568759e-02 7.070e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:09:43.590657 GPSi 15 N 0 -5.906571e-01 -6.065711e-02 2.146e-02
2013-11-27 23:09:37.901101 GPSi - N - - -7.110153e-02 6.716e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:10:00.489102 GPSi 15 N 0 -4.891029e-01 4.089708e-02 2.124e-02
2013-11-27 23:09:52.357123 GPSi - N - - -2.712306e-02 6.472e-03
2013-11-27 23:10:00.000461 PPSi 0 N 1 -5.952060e-04 -4.616970e-04 1.896e-04
2013-11-27 23:10:01.561675 GPSi 0 N 0 -5.618044e-01 -3.167506e-02 2.047e-02
{14 GPSi samples, 14 PPSi samples for 3 more rounds, before dropping PPS samples again}

I have the full console output as well with debugging enabled and am trying to figure out how best to parse and analyze it.
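One way to start analysing a log like the refclocks.log excerpt above is to tally raw and filtered entries per refid. The sketch below is a rough starting point, not a definitive parser. It assumes each entry has nine whitespace-separated fields (date, time, refid, sample number, leap status, pulse flag, raw offset, cooked offset, dispersion) and that filtered entries carry `-` in the sample-number column. The sample lines are taken from the excerpt with the column spacing restored, which is itself an assumption about the original layout.

```python
# Minimal sketch: count raw vs. filtered refclocks.log entries per refid.
# The nine-column layout and the meaning of '-' are assumptions; check them
# against the header line of your own log before relying on this.
from collections import Counter

SAMPLE_LINES = [
    "2013-11-27 23:08:38.999883 PPSi 15 N 1 2.455370e-04 1.161940e-04 2.265e-04",
    "2013-11-27 23:08:39.600949 GPSi 15 N 0 -6.007421e-01 -7.094921e-02 2.206e-02",
    "2013-11-27 23:08:33.249250 GPSi - N - - -1.925024e-02 6.892e-03",
    "2013-11-27 23:08:55.532367 GPSi 15 N 0 -5.323654e-01 -2.367523e-03 2.179e-02",
    "2013-11-27 23:08:46.365687 GPSi - N - - -3.568759e-02 7.070e-03",
    "2013-11-27 23:10:00.000461 PPSi 0 N 1 -5.952060e-04 -4.616970e-04 1.896e-04",
]


def parse_refclock_line(line):
    """Parse one log entry into a dict, or return None if it does not fit."""
    parts = line.split()
    if len(parts) != 9:
        return None  # header, separator, or damaged line
    date, time, refid, sample_no, _leap, _pulse, _raw, cooked, disp = parts
    try:
        return {
            "timestamp": f"{date} {time}",
            "refid": refid,
            "filtered": sample_no == "-",
            "cooked_offset": float(cooked),
            "dispersion": float(disp),
        }
    except ValueError:
        return None


def tally(lines):
    """Count entries keyed by (refid, 'raw' or 'filtered')."""
    counts = Counter()
    for line in lines:
        rec = parse_refclock_line(line)
        if rec is not None:
            kind = "filtered" if rec["filtered"] else "raw"
            counts[(rec["refid"], kind)] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(tally(SAMPLE_LINES).items()):
        print(key, n)
```

Run over the full log, a per-poll version of this tally (resetting the counters at each GPSi filtered entry) would show directly which polling rounds contain no PPSi samples.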
One thing I noticed in comparison to my previous run is that all of the ignored PPS samples are coming from line 465 in refclock.c:

refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored second=0.99657 sync=0 dist=1.5

and not line 440 like they were on the previous run:

refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Friday, November 29, 2013 11:48 AM
To: chrony-users@chrony.tuxfamily.org
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:

> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
>> On Fri, 29 Nov 2013, Bill Unruh wrote:
>> By the way, does the kernel PPS do median filtering before passing on
>> the times to chrony? (Ie, taking the median of say the past 16 inputs
>> and throwing away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the
> PPS driver fetches the latest PPS sample from the kernel once per
> second and the refclock poll (16 seconds by default) runs the median
> filter.

Ah. OK.

>
>> Anyway, it should not be switching sources unless the deviation of
>> the selected source exceeds the variance of the alternative (or
>> unless the source has disappeared for a suitable number of poll
>> intervals, probably related to how long one would expect to wait for
>> the drift rate variance to make the system clock deviate by more than
>> the second source's variance). Ie, you are far better off letting a
>> clock drift unconstrained for a while than to jump to a source which has a
>> huge (factors of 1000) worse variance.
>
> The selection algorithm prefers sources with shortest distance (with
> refclock that's the measured