Re: [chrony-users] kernel PPS troubleshooting
On Wed, Dec 11, 2013 at 10:11:48AM -0800, Bill Unruh wrote:
> On Wed, 11 Dec 2013, Miroslav Lichvar wrote:
> >When no source is selected, the PPS samples are ignored. If the SHM
> >source doesn't move to the acceptable range to overlap with the PPS
> >source in 8 polling intervals, the PPS source is marked as unreachable
> >and the SHM source is selected as the only available source.
>
> That sounds like a bug. PPS should always be part of the selection process. It
> is almost by definition the correct source. And certainly it could be argued
> that the PPS should be the selected source, not the nmea. Of course some
> people (me) use shm to deliver pps to chrony, so shm should not automatically
> be downgraded, but a kernel pps it seems certainly should not be downgraded.

I think it works as expected. When there is a PPS source and a SHM source
and they don't agree, what do you do? Pick the PPS source only because it's
from the PPS driver? The SHM source can be from a PPS signal too (as it is in
your case). If it was configured with the prefer flag, I'd probably agree.

> >The configured delay is included in the interval used in the source
> >selection algorithm, so increasing the value from 0.01 to 0.4 or
> >larger should fix the problem.
>
> A user should not have to do this or know this.

That would be nice, but I'm not sure how chrony should detect that the
source is a falseticker without comparing it to other sources. The
recommended configuration is to mark such sources with noselect and use them
only for PPS locking.

--
Miroslav Lichvar

--
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] kernel PPS troubleshooting
On Tue, Dec 10, 2013 at 08:29:25PM -0500, Battocchi, Scott L. wrote:
> I've attached the tracking, measurements, refclocks, and sources logs trimmed
> to start at the 2.35 hour mark (to coincide with the graph colored by sync
> source in my previous mail). I also moved the rolling header line for each
> log to the start of these trimmed ones and removed any subsequent headers
> from the remainder of the file. They each run about 16 minutes and through
> multiple sync source selections. I did not include any logs from the first
> two minutes where sync=1 and dist actually changed since that seemed to be a
> startup artifact and not related to the rest of the long run issues.

It seems the dropping of the PPS source is caused by the SHM source having
too small a configured delay. The long-term stability of the SHM source is
worse than the short-term jitter, so the measured dispersion (in one polling
interval) of the SHM source is sometimes smaller than the current offset,
which means it doesn't overlap with the PPS source in the source selection
algorithm and no source is selected with the "no majority" message.

When no source is selected, the PPS samples are ignored. If the SHM source
doesn't move to the acceptable range to overlap with the PPS source in 8
polling intervals, the PPS source is marked as unreachable and the SHM source
is selected as the only available source.

The configured delay is included in the interval used in the source
selection algorithm, so increasing the value from 0.01 to 0.4 or larger
should fix the problem.

--
Miroslav Lichvar
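The overlap argument above can be illustrated with a rough sketch. It assumes each source claims an interval of offset plus or minus (dispersion + configured delay), which is how the thread describes it; chronyd's actual computation may weight these terms differently. The function names are hypothetical, and the numbers are taken loosely from the refclocks.log excerpt posted elsewhere in this thread.

```python
def interval(offset, dispersion, delay):
    """Return the (low, high) bounds a source claims for the true offset.

    Assumption: half-width = dispersion + delay, as described in the thread.
    """
    half_width = dispersion + delay
    return offset - half_width, offset + half_width


def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# PPS: tiny offset and dispersion, no configured delay in this sketch.
pps = interval(offset=1.16e-4, dispersion=2.3e-4, delay=0.0)

# SHM/NMEA: same measurement, evaluated with the two delay settings discussed.
gps_small = interval(offset=-7.09e-2, dispersion=2.2e-2, delay=0.01)
gps_large = interval(offset=-7.09e-2, dispersion=2.2e-2, delay=0.4)

print("delay 0.01 overlaps PPS:", overlaps(pps, gps_small))
print("delay 0.4  overlaps PPS:", overlaps(pps, gps_large))
```

With the small delay the two intervals are disjoint, which matches the "no majority" situation described; with the larger delay the SHM interval is wide enough to contain the PPS interval.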
Re: [chrony-users] kernel PPS troubleshooting
On Wed, 11 Dec 2013, Miroslav Lichvar wrote:
> On Tue, Dec 10, 2013 at 08:29:25PM -0500, Battocchi, Scott L. wrote:
> > I've attached the tracking, measurements, refclocks, and sources logs
> > trimmed to start at the 2.35 hour mark (to coincide with the graph colored
> > by sync source in my previous mail). I also moved the rolling header line
> > for each log to the start of these trimmed ones and removed any subsequent
> > headers from the remainder of the file. They each run about 16 minutes and
> > through multiple sync source selections. I did not include any logs from
> > the first two minutes where sync=1 and dist actually changed since that
> > seemed to be a startup artifact and not related to the rest of the long
> > run issues.
>
> It seems the dropping of the PPS source is caused by SHM source having
> too small configured delay. The long-term stability of the SHM source is
> worse than the short-term jitter, so the measured dispersion (in one polling
> interval) of the SHM source is sometimes smaller than the current offset,
> which means it doesn't overlap with the PPS source in the source selection
> algorithm and no source is selected with the "no majority" message.
>
> When no source is selected, the PPS samples are ignored. If the SHM
> source doesn't move to the acceptable range to overlap with the PPS
> source in 8 polling intervals, the PPS source is marked as unreachable
> and the SHM source is selected as the only available source.

That sounds like a bug. PPS should always be part of the selection process.
It is almost by definition the correct source. And certainly it could be
argued that the PPS should be the selected source, not the nmea. Of course
some people (me) use shm to deliver pps to chrony, so shm should not
automatically be downgraded, but a kernel pps it seems certainly should not
be downgraded.

> The configured delay is included in the interval used in the source
> selection algorithm, so increasing the value from 0.01 to 0.4 or
> larger should fix the problem.

A user should not have to do this or know this.

--
William G. Unruh   | Canadian Institute for | Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research      | Fax: +1(604)822-5324
UBC, Vancouver,BC  | Program in Cosmology   | un...@physics.ubc.ca
Canada V6T 1Z1     | and Gravity            | www.theory.physics.ubc.ca/
RE: [chrony-users] kernel PPS troubleshooting
On Tuesday, December 03, 2013 9:18 AM Miroslav Lichvar wrote:
> On Mon, Dec 02, 2013 at 02:59:05PM -0500, Battocchi, Scott L. wrote:
>> Since we will not have access to a network time source and will be relying
>> on GPSD/NMEA to get us in the correct ballpark on system startup, is there
>> another configuration option we can try to minimize the snapping back to GPS
>> so quickly?
> You can mark the NMEA source as noselect and still lock the PPS source to it.
> The PPS samples will be ignored when NMEA is off by more than
> 0.2 seconds, but according to your graphs that shouldn't happen very often.

I will try that in the next couple of days.

>> The three attached plots are:
>> 4hr_offsets: Hours 0-4, offsets straight from statistics.log
>> 4hr_offsets_PPSadjusted: Hours 0-4, adjusted offsets assuming PPS was
>> always 0 and using the most recent PPS value to adjust the actual offset in
>> statistics.log
>> Syncsource_PPSadjusted: Hours 2-4, same data as PPSadjusted but with
>> background highlighted according to active sync source from tracking.log
> Nice graphs!

Thanks, figuring out the best way to convey all of the associated log
messages has consumed more brainpower than I'd like to admit...

>> I have the full console output as well with debugging enabled and am trying
>> to figure out how best to parse and analyze it. One thing I noticed in
>> comparison to my previous run is that all of the ignored PPS samples are
>> coming from line 465 in refclock.c:
>> refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored
>> second=0.99657 sync=0 dist=1.5
> Hm, that's weird. Do they all have the same sync and dist value? Could you
> please attach corresponding parts of the tracking, refclock and statistics
> logs around the time when the PPS source is dropped?

So a quick grep through the trace log shows that there were:
  99671 pulses ignored with sync=0, all of which had dist=1.5 (no other dist
        was reported with sync=0)
  96    pulses ignored with sync=1, which happened in 6 groupings, each
        starting with dist=7.x or 8.x and ending after 16 cycles (one second
        per cycle) at 22.x or 23.x
All 6 of these groupings came in the first 2.5 minutes after starting chrony.

I've attached the tracking, measurements, refclocks, and sources logs trimmed
to start at the 2.35 hour mark (to coincide with the graph colored by sync
source in my previous mail). I also moved the rolling header line for each
log to the start of these trimmed ones and removed any subsequent headers
from the remainder of the file. They each run about 16 minutes and through
multiple sync source selections. I did not include any logs from the first
two minutes where sync=1 and dist actually changed since that seemed to be a
startup artifact and not related to the rest of the long run issues. I also
have a trimmed console output at 810kB that I can send along if interested.

>> and not line 440 like they were on the previous run:
>> refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored
>> offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546
> They are ignored in a different place because the lock option wasn't used
> this time.

Ahh, that makes sense.

Thanks again for all the help,
Scott Battocchi

Attachments:
  tracking_2.35hrs_in.log
  measurements_2.35hrs_in.log
  refclocks_2.35hrs_in.log
  statistics_2.35hrs_in.log
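The grep tally described in this message could be reproduced along the following lines. This is only a sketch: it assumes the debug lines look exactly like the two quoted in the thread, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches only the "second=... sync=... dist=..." form of the ignored-pulse line.
PATTERN = re.compile(r"refclock pulse ignored second=\S+ sync=(\d+) dist=(\S+)")


def tally_ignored(lines):
    """Count ignored pulses per (sync, dist) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


# The two debug lines quoted in this thread; only the first has sync/dist fields.
sample = [
    "refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored "
    "second=0.99657 sync=0 dist=1.5",
    "refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored "
    "offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546",
]
counts = tally_ignored(sample)
for (sync, dist), n in sorted(counts.items()):
    print(f"sync={sync} dist={dist}: {n} pulses ignored")
```

Feeding the full console output through `tally_ignored` instead of `sample` would give the per-group counts; the line 440 form is deliberately left unmatched because it carries different fields.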
Re: [chrony-users] kernel PPS troubleshooting
On Mon, Dec 02, 2013 at 02:59:05PM -0500, Battocchi, Scott L. wrote:
> Since we will not have access to a network time source and will be relying on
> GPSD/NMEA to get us in the correct ballpark on system startup, is there
> another configuration option we can try to minimize the snapping back to GPS
> so quickly?

You can mark the NMEA source as noselect and still lock the PPS source to it.
The PPS samples will be ignored when NMEA is off by more than 0.2 seconds,
but according to your graphs that shouldn't happen very often.

> The three attached plots are:
> 4hr_offsets: Hours 0-4, offsets straight from statistics.log
> 4hr_offsets_PPSadjusted: Hours 0-4, adjusted offsets assuming PPS was always
> 0 and using the most recent PPS value to adjust the actual offset in
> statistics.log
> Syncsource_PPSadjusted: Hours 2-4, same data as PPSadjusted but with
> background highlighted according to active sync source from tracking.log

Nice graphs!

> I have the full console output as well with debugging enabled and am trying
> to figure out how best to parse and analyze it. One thing I noticed in
> comparison to my previous run is that all of the ignored PPS samples are
> coming from line 465 in refclock.c:
> refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored
> second=0.99657 sync=0 dist=1.5

Hm, that's weird. Do they all have the same sync and dist value? Could you
please attach corresponding parts of the tracking, refclock and statistics
logs around the time when the PPS source is dropped?

> and not line 440 like they were on the previous run:
> refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored
> offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546

They are ignored in a different place because the lock option wasn't used
this time.

--
Miroslav Lichvar
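As a rough illustration of the 0.2 second rule mentioned in this reply, the check can be reduced to a single threshold comparison. This is a simplification under stated assumptions: the real test in chronyd may also account for dispersion, and treating the logged `offdiff` value as the quantity being compared is a guess. The function and constant names are hypothetical.

```python
LOCK_LIMIT = 0.2  # seconds; the figure quoted in this thread


def pulse_accepted(offset_difference, limit=LOCK_LIMIT):
    """Simplified check: keep a PPS pulse only if the locked source is within the limit."""
    return abs(offset_difference) <= limit


# Cooked GPSi offsets from the refclocks.log excerpt posted in this thread.
gps_offsets = [-7.094921e-02, -2.367523e-03, -6.065711e-02,
               4.089708e-02, -3.167506e-02]
print([pulse_accepted(o) for o in gps_offsets])

# offdiff reported for an ignored pulse in the earlier run that used lock.
print(pulse_accepted(-0.313099609))
```

All of the sampled NMEA offsets fall inside the limit, which fits the remark that rejection "shouldn't happen very often", while the earlier `offdiff` of about -0.313 falls outside it.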
RE: [chrony-users] kernel PPS troubleshooting
I think it is not an issue of actually losing PPS for long periods of time so
much as chrony ignoring "valid" PPS pulses as faulty. Note: I'm calling the
pulses "valid" since I can see them through ppstest and the chrony debug
output looks like it sees them with offsets below 5ms but ignores them.

We should be able to set our data collection program up to check for GPS lock
and chrony's selected source to set "noselect" on the GPS after the PPS has
locked on, and then unset it if we actually lose a PPS signal and need to
reacquire.

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Monday, December 02, 2013 1:56 PM
To: chrony-users@chrony.tuxfamily.org
Subject: RE: [chrony-users] kernel PPS troubleshooting

The key purpose of the gps is to supply the seconds for the PPS. Once it has
done that it is no longer needed. Thus you could have the gps run with pps
for a while, and then do a noselect on it using chronyc. That way chrony
would rely on the free running of the system clock to supply the seconds, and
the pps to supply the microseconds.

However it is disturbing that you are losing pps for long periods of time.
That might indicate that there is something wrong with your gps receiver. I
know I had trouble with mine that the antenna was defective.

On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:
> Hi All,
> Sorry for the delayed response. I have collected 36 hours of data with the
> following sources:
> refclock PPS /dev/pps1 refid PPSi
> refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
> server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
>
> Since we will be running disconnected from real NTP servers in our
> application I had the 3 NTP servers as noselect so that I could track the GPS
> and PPS against them, but not actually use them in the selection algorithm.
RE: [chrony-users] kernel PPS troubleshooting
On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:
> I think it is not an issue of actually losing PPS for long periods of time
> so much as chrony ignoring "valid" PPS pulses as faulty. Note: I'm calling
> the pulses "valid" since I can see them through ppstest and the chrony debug
> output looks like it sees them with offsets below 5ms but ignores them.

It should not be ignoring them -- that does sound like a bug. My only concern
is that I have not seen my system ignore pps pulses (but then I do not use
the kernel pps -- I use my own driver which feeds the pps through the shm).

> We should be able to set our data collection program up to check for GPS
> lock and chrony's selected source to set "noselect" on the GPS after the PPS
> has locked on, and then unset it if we actually lose a PPS signal and need
> to reacquire.

You would have to lose lock for a LONG time to need to reuse the gps to set
the seconds. Typically pps will bring the system drift to much less than
1 PPM, which would take a month to produce a 1 second error. Ie you would
have to lose lock for a month before you would need to reuse GPS. In which
case something far more serious than "lose lock" has happened. Ie, even with
a sporadically working PPS, you should be able to get the computer time to
within a second by say gps, and then forget about it.

> -----Original Message-----
> From: Bill Unruh [mailto:un...@physics.ubc.ca]
> Sent: Monday, December 02, 2013 1:56 PM
> To: chrony-users@chrony.tuxfamily.org
> Subject: RE: [chrony-users] kernel PPS troubleshooting
>
> The key purpose of the gps is to supply the seconds for the PPS. Once it has
> done that it is no longer needed. Thus you could have the gps run with pps
> for a while, and then do a noselect on it using chronyc. That way chrony
> would rely on the free running of the system clock to supply the seconds,
> and the pps to supply the microseconds.
>
> However it is disturbing that you are losing pps for long periods of time.
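The holdover estimate in this reply (a sub-PPM drift taking on the order of a month to accumulate one second) is simple arithmetic, sketched here. It assumes a constant frequency error, which a real oscillator will not have; the function name is illustrative only.

```python
def seconds_to_drift(error_seconds, drift_ppm):
    """Time for a clock with a constant frequency error to accumulate error_seconds."""
    return error_seconds / (drift_ppm * 1e-6)


for ppm in (1.0, 0.4, 0.1):
    days = seconds_to_drift(1.0, ppm) / 86400
    print(f"{ppm} ppm -> {days:.1f} days to drift by 1 s")
```

At exactly 1 ppm the clock needs about 11.6 days to drift by one second, so "a month" corresponds to a residual drift of roughly 0.4 ppm, consistent with "much less than 1 PPM".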
RE: [chrony-users] kernel PPS troubleshooting
On Thursday, November 28, 2013 6:15 AM Miroslav Lichvar wrote:
>On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
>> I ran the GPS while connected to a handful of ntp servers and saw that my
>> gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for
>> the next two tests. I've attached plots of the offset as recorded in the
>> statistics.log file, if there are other metrics that would be useful I'm
>> happy to graph them and send them out.
>> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not
>> locked to anything, but is selectable)
>> gps.png is after the ntp test but back to just using the GPS and PPS, it
>> looks like sometimes GPS gets selected as the source forcing the PPS signal
>> to look like it is drifting relative to the system.
>That looks similar to what I see with a Garmin 18x LVC. This is a capture
>30 hours long I did some time ago (the NMEA source's offset value was set to
>0.5):
>http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
>Since gpsd has added support for kernel PPS, I think it's better to use the
>SHM 1 or SOCK source instead of PPS. Let it handle the HW details and pair the
>PPS and NMEA samples.

I could not see how to get GPSD to associate a kernel PPS source (our
/dev/pps1 is driven by the PPS-GPIO kernel module and does not come in
through the serial port's DCD line) with a NMEA source. Without a PPS signal
coming into GPSD I didn't seem to get any data into chrony through the SOCK
interface even though GPSD did see and successfully connect to it according
to the GPSD debug output.

Thanks,
Scott
RE: [chrony-users] kernel PPS troubleshooting
Hi All,

Sorry for the delayed response. I have collected 36 hours of data with the
following sources:

refclock PPS /dev/pps1 refid PPSi
refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect

Since we will be running disconnected from real NTP servers in our
application I had the 3 NTP servers as noselect so that I could track the GPS
and PPS against them, but not actually use them in the selection algorithm.

> Miroslav said:
>If a source disappears for 8 polling intervals, chronyd will select another
>source even if it's much worse. I agree that could be improved. With NMEA
>sources it's usually better to use the noselect option or don't configure it
>at all.

Since we will not have access to a network time source and will be relying on
GPSD/NMEA to get us in the correct ballpark on system startup, is there
another configuration option we can try to minimize the snapping back to GPS
so quickly?

The three attached plots are:
4hr_offsets: Hours 0-4, offsets straight from statistics.log
4hr_offsets_PPSadjusted: Hours 0-4, adjusted offsets assuming PPS was always
0 and using the most recent PPS value to adjust the actual offset in
statistics.log
Syncsource_PPSadjusted: Hours 2-4, same data as PPSadjusted but with
background highlighted according to active sync source from tracking.log

Looking through the refclocks.log it seems as though even with both PPS and
GPS present and having samples filtered, often after a GPS filtered entry in
the log PPS samples would be dropped completely until one or more subsequent
GPS filtered entries.
{14 GPSi samples and 14 PPSi samples}
2013-11-27 23:08:38.999883 PPSi 15 N 1 2.455370e-04 1.161940e-04 2.265e-04
2013-11-27 23:08:36.999489 PPSi - N - -5.107210e-04 1.854e-04
2013-11-27 23:08:39.600949 GPSi 15 N 0 -6.007421e-01 -7.094921e-02 2.206e-02
2013-11-27 23:08:33.249250 GPSi - N - - -1.925024e-02 6.892e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:08:55.532367 GPSi 15 N 0 -5.323654e-01 -2.367523e-03 2.179e-02
2013-11-27 23:08:46.365687 GPSi - N - - -3.568759e-02 7.070e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:09:43.590657 GPSi 15 N 0 -5.906571e-01 -6.065711e-02 2.146e-02
2013-11-27 23:09:37.901101 GPSi - N - - -7.110153e-02 6.716e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:10:00.489102 GPSi 15 N 0 -4.891029e-01 4.089708e-02 2.124e-02
2013-11-27 23:09:52.357123 GPSi - N - - -2.712306e-02 6.472e-03
2013-11-27 23:10:00.000461 PPSi 0 N 1 -5.952060e-04 -4.616970e-04 1.896e-04
2013-11-27 23:10:01.561675 GPSi 0 N 0 -5.618044e-01 -3.167506e-02 2.047e-02
{14 GPSi samples, 14 PPSi samples for 3 more rounds, before dropping PPS
samples again}

I have the full console output as well with debugging enabled and am trying
to figure out how best to parse and analyze it.
One thing I noticed in comparison to my previous run is that all of the
ignored PPS samples are coming from line 465 in refclock.c:
refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored second=0.99657 sync=0 dist=1.5
and not line 440 like they were on the previous run:
refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Friday, November 29, 2013 11:48 AM
To: chrony-users@chrony.tuxfamily.org
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:
> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
>> On Fri, 29 Nov 2013, Bill Unruh wrote:
>> By the way, does the kernel PPS do median filtering before passing on
>> the times to chrony? (Ie, taking the median of say the past 16 inputs
>> and throwing away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the
> PPS driver fetches the latest PPS sample from the kernel once per
> second and the refclock poll (16 seconds by default) runs the median
> filter.

Ah. OK.

>
>> Anyway, it should not be switching sources unless the deviation of
>> the selected source exceeds the variance of the alternative (or
>> unless the source has disappeared for a suitable number of poll
>> intervals, probably related to how long one would expect to wait for
>> the drift rate variance to make the system clock deviate by more than
>> the second source's variance. Ie, you are far better off letting a
>> clock drift unconstrained for a while than to jump to source which has a
>> huge (factor
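The dropped-sample pattern described in this message can be searched for mechanically. The sketch below assumes refclocks.log rows are whitespace-separated as date, time, refid, sample index and then the remaining fields, with "-" in the index column on filtered summary rows; the sample rows are copied from the excerpt with the fused columns re-spaced. The helper names and the 16 second threshold are choices made for illustration.

```python
from datetime import datetime


def parse_row(line):
    """Return (timestamp, refid, sample_index) for one refclocks.log row."""
    date, time, refid, index = line.split()[:4]
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return stamp, refid, index


def gaps(lines, refid, max_gap=16.0):
    """Yield (start, end, seconds) where raw samples of refid are more than max_gap apart."""
    previous = None
    for line in lines:
        stamp, rid, index = parse_row(line)
        if rid != refid or index == "-":
            continue  # other source, or a filtered summary row
        if previous is not None:
            seconds = (stamp - previous).total_seconds()
            if seconds > max_gap:
                yield previous, stamp, seconds
        previous = stamp


rows = [
    "2013-11-27 23:08:38.999883 PPSi 15 N 1 2.455370e-04 1.161940e-04 2.265e-04",
    "2013-11-27 23:08:39.600949 GPSi 15 N 0 -6.007421e-01 -7.094921e-02 2.206e-02",
    "2013-11-27 23:08:55.532367 GPSi 15 N 0 -5.323654e-01 -2.367523e-03 2.179e-02",
    "2013-11-27 23:10:00.000461 PPSi 0 N 1 -5.952060e-04 -4.616970e-04 1.896e-04",
]
found = list(gaps(rows, "PPSi"))
for start, end, seconds in found:
    print(f"no PPSi samples for {seconds:.1f} s ({start.time()} to {end.time()})")
```

On these rows the only gap reported is the roughly 81 second stretch between the last PPSi sample at 23:08:38 and the next one at 23:10:00.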
RE: [chrony-users] kernel PPS troubleshooting
The key purpose of the gps is to supply the seconds for the PPS. Once it has
done that it is no longer needed. Thus you could have the gps run with pps
for a while, and then do a noselect on it using chronyc. That way chrony
would rely on the free running of the system clock to supply the seconds, and
the pps to supply the microseconds.

However it is disturbing that you are losing pps for long periods of time.
That might indicate that there is something wrong with your gps receiver. I
know I had trouble with mine that the antenna was defective.

On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:
> Hi All,
> Sorry for the delayed response. I have collected 36 hours of data with the
> following sources:
> refclock PPS /dev/pps1 refid PPSi
> refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
> server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
> server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
>
> Since we will be running disconnected from real NTP servers in our
> application I had the 3 NTP servers as noselect so that I could track the GPS
> and PPS against them, but not actually use them in the selection algorithm.
>
>> Miroslav said:
>> If a source disappears for 8 polling intervals, chronyd will select another
>> source even if it's much worse. I agree that could be improved. With NMEA
>> sources it's usually better to use the noselect option or don't configure
>> it at all.
>
> Since we will not have access to a network time source and will be relying on
> GPSD/NMEA to get us in the correct ballpark on system startup, is there
> another configuration option we can try to minimize the snapping back to GPS
> so quickly?
Re: [chrony-users] kernel PPS troubleshooting
On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote: > On Fri, 29 Nov 2013, Bill Unruh wrote: > By the way, does the kernel PPS do median filtering before passing on the > times to chrony? (Ie, taking the median of say the past 16 inputs and throwing > away the 6 worst outliers and then retaking the median?) The kernel doesn't filter the PPS samples in any way. In chronyd the PPS driver fetches the latest PPS sample from the kernel once per second and the refclock poll (16 seconds by default) runs the median filter. > Anyway, it should not be switching sources unless the deviation of the > selected source exceeds the variance of the alternative (or unless the source > has disappeared for a suitable number of poll intervals, probably related to > how long one would expect to wait for the drift rate variance to make the > system clock deviate by more than the second source's variance. Ie, you are > far better off letting a clock drift unconstrained for a while than to jump to > source which has a huge (factors of a 1000) worse variance. The selection algorithm prefers sources with shortest distance (with refclock that's the measured dispersion + configured delay). If there are more sources with similar distance they will be combined together. If a source disappears for 8 polling intervals, chronyd will select another source even if it's much worse. I agree that could be improved. With NMEA sources it's usually better to use the noselect option or don't configure it at all. -- Miroslav Lichvar -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
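To make the filtering question concrete, here is a rough sketch of the scheme Bill describes: take the median of the last 16 samples, drop the 6 samples farthest from it, and take the median of what is left. It illustrates that description only and is not chronyd's actual filter; the function name and the sample values are invented.

```python
from statistics import mean, median


def trimmed_median(samples, discard=6):
    """Median after dropping the `discard` samples farthest from the first median."""
    if discard >= len(samples):
        raise ValueError("discard must be smaller than the number of samples")
    center = median(samples)
    # Sort by distance from the first median and keep the closest ones.
    kept = sorted(samples, key=lambda s: abs(s - center))[: len(samples) - discard]
    return median(kept)


# 16 hypothetical PPS offsets in seconds: 13 well-behaved samples and 3 outliers of 0.3 s.
samples = [1e-4 * i for i in range(-6, 7)] + [0.3, 0.3, 0.3]

print("plain mean:    ", mean(samples))
print("trimmed median:", trimmed_median(samples))
```

With a few large outliers in the window, the plain mean is pulled far off while the trimmed median stays near the well-behaved samples, which is the point of filtering at the refclock poll rather than passing raw pulses on.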
Re: [chrony-users] kernel PPS troubleshooting
On 29/11/2013 18:21, Miroslav Lichvar wrote: Anyway, it should not be switching sources unless the deviation of the selected source exceeds the variance of the alternative (or unless the source has disappeared for a suitable number of poll intervals, probably related to how long one would expect to wait for the drift rate variance to make the system clock deviate by more than the second source's variance. Ie, you are far better off letting a clock drift unconstrained for a while than to jump to source which has a huge (factors of a 1000) worse variance. The selection algorithm prefers sources with shortest distance (with refclock that's the measured dispersion + configured delay). If there are more sources with similar distance they will be combined together. If a source disappears for 8 polling intervals, chronyd will select another source even if it's much worse. I agree that could be improved. With NMEA sources it's usually better to use the noselect option or don't configure it at all. With PPS and NMEA sources, I found chrony bouncing between the two unless I marked the NMEA source as "noselect" (see thread from August 2012). It's still on my todo list to get more debugging information on this, as Bill indicated that it may have been a bug (21/08/2012 22:58). Possibly the same thing is happening here? Tom -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
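The bouncing between PPS and NMEA can be pictured with a toy version of the overlap test. The sketch below treats each source as an interval of offset plus or minus its distance, with the distance taken as measured dispersion plus configured delay as described above. The function names, the sample offsets, and the two delay values are assumptions for illustration, and chronyd's real selection code is more involved than this.

```python
def selection_interval(offset, dispersion, delay):
    """Return (low, high) bounds: offset +/- (dispersion + delay)."""
    distance = dispersion + delay
    return (offset - distance, offset + distance)


def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Illustrative values, loosely modelled on the refclocks.log excerpt in this thread.
pps = selection_interval(offset=0.0001, dispersion=0.0002, delay=0.0)
nmea_small_delay = selection_interval(offset=-0.07, dispersion=0.02, delay=0.01)
nmea_large_delay = selection_interval(offset=-0.07, dispersion=0.02, delay=0.4)

print("small delay overlaps PPS:", overlaps(pps, nmea_small_delay))
print("large delay overlaps PPS:", overlaps(pps, nmea_large_delay))
```

In this toy model the NMEA interval with the small delay sits entirely to one side of the PPS interval, so the two sources disagree, while the larger delay widens the NMEA interval enough to cover the PPS one. Marking the NMEA source noselect sidesteps the comparison altogether.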
Re: [chrony-users] kernel PPS troubleshooting
On Thu, Nov 28, 2013 at 11:11:18AM -0800, Bill Unruh wrote:
> On Thu, 28 Nov 2013, Miroslav Lichvar wrote:
> >That looks similar to what I see with a Garmin 18x LVC. This is a
> >capture 30 hours long I did some time ago (the NMEA source's offset
> >value was set to 0.5):
> >
> >http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
>
> Is this the nmea time or the PPS time? And is the vertical axis seconds or
> milliseconds?

That's the NMEA time (as provided by gpsd) when the clock was synchronized to PPS. It's unfortunately in seconds. I think it was with 115200 baud rate.

> The problem in his case is that the PPS signal is occasionally
> (but far too often) off by almost .3 sec. That is ridiculous. And it is only
> when the gps-nmea and the PPS are the only sources.

He said chronyd was switching between the PPS and GPS sources, so the 0.3s spike could be just the PPS-NMEA offset. The other graph with chronyd using NTP sources doesn't seem to have this problem.

--
Miroslav Lichvar
Re: [chrony-users] kernel PPS troubleshooting
On Fri, 29 Nov 2013, Miroslav Lichvar wrote: On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote: On Fri, 29 Nov 2013, Bill Unruh wrote: By the way, does the kernel PPS do median filtering before passing on the times to chrony? (Ie, taking the median of say the past 16 inputs and throwing away the 6 worst outliers and then retaking the median?) The kernel doesn't filter the PPS samples in any way. In chronyd the PPS driver fetches the latest PPS sample from the kernel once per second and the refclock poll (16 seconds by default) runs the median filter. Ah. OK. Anyway, it should not be switching sources unless the deviation of the selected source exceeds the variance of the alternative (or unless the source has disappeared for a suitable number of poll intervals, probably related to how long one would expect to wait for the drift rate variance to make the system clock deviate by more than the second source's variance. Ie, you are far better off letting a clock drift unconstrained for a while than to jump to source which has a huge (factors of a 1000) worse variance. The selection algorithm prefers sources with shortest distance (with refclock that's the measured dispersion + configured delay). If there are more sources with similar distance they will be combined together. If a source disappears for 8 polling intervals, chronyd will select another source even if it's much worse. I agree that could be improved. With NMEA sources it's usually better to use the noselect option or don't configure it at all. It looks in the source code as if it grabs a new source as soon as the source disappears, but that was really not a very good look I had at the code. 
If one only had nmea and pps, one needs the nmea at least at start up to get the time to within a half second or so, but thereafter of course it probably should not be used unless the PPS disappears for quite a while (in which case the nmea is liable to be not very good either).

Certainly it would be good to find out what was happening with his clock hopping.
Re: [chrony-users] kernel PPS troubleshooting
On Fri, 29 Nov 2013, Bill Unruh wrote: On Fri, 29 Nov 2013, Miroslav Lichvar wrote: > The problem in his case is that the PPS signal is occasionally > (but far too often) off by almost .3 sec. That is rediculous. And it is > only > when the gps-nmea and the PPS are the only sources. He said chronyd was switching between the PPS and GPS sources, so the 0.3s spike could be just the PPS-NMEA offset. The other graph with chronyd using NTP sources doesn't seem to have this problem. Hm, I guess that would do it. But why would it be switching like that? If it is doing so, then there is a problem with the chrony selection algorithm. Your solution of having gpsd handle it all is a possible one, but chrony itself should not be behaving that way. The nmea has a huge variance, while the PPS variance should be tiny, and it should be being selected. Or is the PPS exceeding its variance occasionally and chrony thinking it has gone rogue, selects the nmea? By this time I do not remember the selection algorithm sufficiently well to be able to say. By the way, does the kernel PPS do median filtering before passing on the times to chrony? (Ie, taking the median of say the past 16 inputs and throwing away the 6 worst outliers and then retaking the median?) Anyway, it should not be switching sources unless the deviation of the selected source exceeds the variance of the alternative (or unless the source has disappeared for a suitable number of poll intervals, probably related to how long one would expect to wait for the drift rate variance to make the system clock deviate by more than the second source's variance. Ie, you are far better off letting a clock drift unconstrained for a while than to jump to source which has a huge (factors of a 1000) worse variance. -- William G. 
Unruh | Canadian Institute for| Tel: +1(604)822-3273 Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324 UBC, Vancouver,BC | Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/
Re: [chrony-users] kernel PPS troubleshooting
On Fri, 29 Nov 2013, Miroslav Lichvar wrote:

>> The problem in his case is that the PPS signal is occasionally (but far too often) off by almost .3 sec. That is ridiculous. And it is only when the gps-nmea and the PPS are the only sources.
>
> He said chronyd was switching between the PPS and GPS sources, so the 0.3s spike could be just the PPS-NMEA offset. The other graph with chronyd using NTP sources doesn't seem to have this problem.

Hm, I guess that would do it. But why would it be switching like that? If it is doing so, then there is a problem with the chrony selection algorithm. Your solution of having gpsd handle it all is a possible one, but chrony itself should not be behaving that way. The nmea has a huge variance, while the PPS variance should be tiny, and it should be being selected. Or is the PPS exceeding its variance occasionally and chrony, thinking it has gone rogue, selects the nmea? By this time I do not remember the selection algorithm sufficiently well to be able to say.
Re: [chrony-users] kernel PPS troubleshooting
On 28/11/2013 20:54, Bill Unruh wrote:
> And on further thought, I also concede your point, since pps does not really give the fractions of a second either, but just gives the second mark. You do need an additional "clock" to actually tell you the fractions of a second.

Yes...

> Anyway, I hope I clarified what I meant.

... and yes. :)

Tom
Re: [chrony-users] kernel PPS troubleshooting
On 28/11/2013 20:05, Bill Unruh wrote:
> On Thu, 28 Nov 2013, Tomalak Geret'kal wrote:
>> On 28/11/2013 19:11, Bill Unruh wrote:
>>> Is this the nmea time or the PPS time?
>>
>> What is "PPS time"? PPS provides timing, not time.
>
> In my nomenclature, they are the same. PPS does supply time but just the fractional seconds part of it. (Just as ntp supplies time by only the fractional "centuries" part of it-- You probably would not argue that ntp does not supply time just the timing.)

It probably comes from the traditional notion of time as something useful to humans, i.e. something down to minutes or seconds at least but up to hours or days/months/years. In the commercial world of timing sync (e.g. telecoms networks) we say timing vs time to differentiate the two, and do not allow sub-second timing to count as any indication of absolute "time", but I concede your point.

Tom
Re: [chrony-users] kernel PPS troubleshooting
On 28/11/2013 19:11, Bill Unruh wrote:
> Is this the nmea time or the PPS time?

What is "PPS time"? PPS provides timing, not time.

Tom
Re: [chrony-users] kernel PPS troubleshooting
On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
> I ran the GPS while connected to a handful of ntp servers and saw that my gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next two tests. I've attached plots of the offset as recorded in the statistics.log file, if there are other metrics that would be useful I'm happy to graph them and send them out.
> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked to anything, but is selectable)
> gps.png is after the ntp test but back to just using the GPS and PPS, it looks like sometimes GPS gets selected as the source forcing the PPS signal to look like it is drifting relative to the system.

That looks similar to what I see with a Garmin 18x LVC. This is a capture 30 hours long I did some time ago (the NMEA source's offset value was set to 0.5):

http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

Since gpsd has added support for kernel PPS, I think it's better to use the SHM 1 or SOCK source instead of PPS. Let it handle the HW details and pair the PPS and NMEA samples.

> I think a portion of my original confusion was that the chronyc sources command was indicating that the pulse had never been seen, as opposed to it being seen and ignored. I need to compare the GPS logs with the chrony logs to see if the changing offset is a function of the number of satellites in view, otherwise I don't have a great explanation for the wander seen in the ntp plot.

From what I remember from other discussions about NMEA timing, it mainly depends on how the firmware is implemented, and the number of visible satellites may have nothing to do with it.

--
Miroslav Lichvar
Re: [chrony-users] kernel PPS troubleshooting
And on further thought, I also concede your point, since pps does not really give the fractions of a second either, but just gives the second mark. You do need an additional "clock" to actually tell you the fractions of a second. Anyway, I hope I clarified what I meant. On Thu, 28 Nov 2013, Tomalak Geret'kal wrote: On 28/11/2013 20:05, Bill Unruh wrote: On Thu, 28 Nov 2013, Tomalak Geret'kal wrote: > On 28/11/2013 19:11, Bill Unruh wrote: > > Is this the nmea time or the PPS time? > > What is "PPS time"? PPS provides timing, not time. In my nomenclature, they are the same. PPS does supply time but just the fractional seconds part of it. (Just as ntp supplies time by only the fractional "centuries" part of it-- You probably would not argue that ntp does not supply time just the timing.) > It probably comes from the traditional notion of time as something useful to humans, i.e. something down to minutes or seconds at least but up to hours or days/months/years. In the commercial world of timing sync (e.g. telecoms networks) we say timing vs time to differentiate the two, and do not allow sub-second timing to count as any indication of absolute "time", but I concede your point. Tom -- William G. Unruh | Canadian Institute for| Tel: +1(604)822-3273 Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324 UBC, Vancouver,BC | Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/ -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] kernel PPS troubleshooting
On Thu, 28 Nov 2013, Tomalak Geret'kal wrote: On 28/11/2013 19:11, Bill Unruh wrote: Is this the nmea time or the PPS time? What is "PPS time"? PPS provides timing, not time. In my nomenclature, they are the same. PPS does supply time but just the fractional seconds part of it. (Just as ntp supplies time by only the fractional "centuries" part of it-- You probably would not argue that ntp does not supply time just the timing.) Tom -- William G. Unruh | Canadian Institute for| Tel: +1(604)822-3273 Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324 UBC, Vancouver,BC | Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/ -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] kernel PPS troubleshooting
On Thu, 28 Nov 2013, Miroslav Lichvar wrote:

> On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
>> I ran the GPS while connected to a handful of ntp servers and saw that my gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next two tests. I've attached plots of the offset as recorded in the statistics.log file, if there are other metrics that would be useful I'm happy to graph them and send them out.
>> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked to anything, but is selectable)
>> gps.png is after the ntp test but back to just using the GPS and PPS, it looks like sometimes GPS gets selected as the source forcing the PPS signal to look like it is drifting relative to the system.
>
> That looks similar to what I see with a Garmin 18x LVC. This is a capture 30 hours long I did some time ago (the NMEA source's offset value was set to 0.5):
>
> http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

Is this the nmea time or the PPS time? And is the vertical axis seconds or milliseconds?

The problem in his case is that the PPS signal is occasionally (but far too often) off by almost .3 sec. That is ridiculous. And it is only when the gps-nmea and the PPS are the only sources. I see nothing like that with my Sure gps with PPS driving chrony. On the other hand I use a "self rolled" pps interrupt driver on the parallel port, not the Linux supplied serial port driver. But the graph where he runs the PPS together with the external ntp sources shows no sign of that kind of absurd jumps in the PPS time, so that would suggest that the interrupt handler is OK.

> Since gpsd has added support for kernel PPS, I think it's better to use the SHM 1 or SOCK source instead of PPS. Let it handle the HW details and pair the PPS and NMEA samples.
>
>> I think a portion of my original confusion was that the chronyc sources command was indicating that the pulse had never been seen, as opposed to it being seen and ignored. I need to compare the GPS logs with the chrony logs to see if the changing offset is a function of the number of satellites in view, otherwise I don't have a great explanation for the wander seen in the ntp plot.
>
> From what I remember from other discussions about NMEA timing, it mainly depends on how the firmware is implemented, and the number of visible satellites may have nothing to do with it.

--
William G. Unruh | Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324
UBC, Vancouver,BC | Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/
RE: [chrony-users] kernel PPS troubleshooting
The reason for the additional GPS strings is this system will actually be moving around and we also need to get the position and fix quality information through gpsd. I ran the GPS while connected to a handful of ntp servers and saw that my gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next two tests. I've attached plots of the offset as recorded in the statistics.log file, if there are other metrics that would be useful I'm happy to graph them and send them out. ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked to anything, but is selectable) gps.png is after the ntp test but back to just using the GPS and PPS, it looks like sometimes GPS gets selected as the source forcing the PPS signal to look like it is drifting relative to the system. I think a portion of my original confusion was that the chronyc sources command was indicating that the pulse had never been seen, as opposed to it being seen and ignored. I need to compare the GPS logs with the chrony logs to see if the changing offset is a function of the number of satellites in view, otherwise I don't have a great explanation for the wander seen in the ntp plot. Thanks, Scott -Original Message- From: Miroslav Lichvar [mailto:mlich...@redhat.com] Sent: Wednesday, November 27, 2013 1:44 AM To: chrony-users@chrony.tuxfamily.org Subject: Re: [chrony-users] kernel PPS troubleshooting On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote: > Bill, > Thanks for taking an initial look. I've added my system to our network to > compare our GPS time with the general NTP pool and it looks like our GPS > could be right on the edge of that 0.4s window. I'm going to let it run for > a bit like this and report back after trying a larger offset for our SHM > refclock. The receiver I am using is an MTK3339 if anyone else has a > standard offset they use (default speed and strings (9600 8n1 with GGA GSA > RMC VTG and VSG enabled). 
Yes, from the log it looks like the SHM and PPS sources are too far from each other (large offdiff value). Also, the GPS source might be too jittery to be used reliably as the locking reference for PPS. One way to find out which one is wrong is to add a good NTP source as the reference, add the noselect option to the GPS and PPS sources (without any locking) and observe the offset values in the refclocks log or chronyc sourcestats output. -- Miroslav Lichvar -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
RE: [chrony-users] kernel PPS troubleshooting
On Wed, 27 Nov 2013, Battocchi, Scott L. wrote:

> The reason for the additional GPS strings is this system will actually be moving around and we also need to get the position and fix quality information through gpsd. I ran the GPS while connected to a handful of ntp servers and saw that my gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next two tests. I've attached plots of the offset as recorded in the statistics.log file, if there are other metrics that would be useful I'm happy to graph them and send them out.
> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked to anything, but is selectable)

Hard to read. The vertical axis is what? seconds? The tick marks all lie on top of each other, so it is hard to figure out what is going on. It looks like using the remote ntp sources disciplines the clock to within better than 10ms (it should be good to about 50 micro, not milli, seconds unless you have a really bad network.)

I agree on the second graph, the behaviour of the pps is bizarre. It is really not clear why the pps should suddenly have .3 sec offset. The time on the computer should coast far far better than that. In fact, the gps should only really be necessary in order to get the system time to within a few hundred ms, and after that the PPS should be able to discipline the clock all on its own (in 16 sec the system clock should not get to more than a ms away from the true time even free running). For pps to suddenly indicate .3ms offset would imply that your clock drifted at 2PPM which is absurd.
So either there is something really seriously wrong with your system clock (eg some other program is coming in and altering the clock behind chrony's back) or there is a severe bug in chrony (but I run chrony with a PPS-- but my own driver and I see offsets of 10 micro seconds, not 300 milliseconds) or with the PPS driver (but then why is the first graph where the PPS does not discipline the clock showing none of those absurd jumps.) Note that I also find it weird that your gps time fluctuates by almost 1 second peak to peak. It really really should be much better than that. What do the refclock, measurement and statistics logs show for those times when the PPS offset jumps so much? gps.png is after the ntp test but back to just using the GPS and PPS, it looks like sometimes GPS gets selected as the source forcing the PPS signal to look like it is drifting relative to the system. I think a portion of my original confusion was that the chronyc sources command was indicating that the pulse had never been seen, as opposed to it being seen and ignored. I need to compare the GPS logs with the chrony logs to see if the changing offset is a function of the number of satellites in view, otherwise I don't have a great explanation for the wander seen in the ntp plot. Thanks, Scott -Original Message- From: Miroslav Lichvar [mailto:mlich...@redhat.com] Sent: Wednesday, November 27, 2013 1:44 AM To: chrony-users@chrony.tuxfamily.org Subject: Re: [chrony-users] kernel PPS troubleshooting On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote: Bill, Thanks for taking an initial look. I've added my system to our network to compare our GPS time with the general NTP pool and it looks like our GPS could be right on the edge of that 0.4s window. I'm going to let it run for a bit like this and report back after trying a larger offset for our SHM refclock. 
The receiver I am using is an MTK3339 if anyone else has a standard offset they use (default speed and strings (9600 8n1 with GGA GSA RMC VTG and VSG enabled). Yes, from the log it looks like the SHM and PPS sources are too far from each other (large offdiff value). Also, the GPS source might be too jittery to be used reliably as the locking reference for PPS. One way to find out which one is wrong is to add a good NTP source as the reference, add the noselect option to the GPS and PPS sources (without any locking) and observe the offset values in the refclocks log or chronyc sourcestats output. -- Miroslav Lichvar -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- William G. Unruh | Canadian Institute for| Tel: +1(604)822-3273 Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324 UBC, Vancouver,BC | Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/ -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
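Bill's coasting argument can be checked with quick arithmetic: the frequency error needed to accumulate a given offset change within one 16 second refclock poll. The sketch below is just that division; the function name is invented, and the 0.3 s and 16 s inputs are the figures mentioned in this thread.

```python
def required_drift_ppm(offset_change_s, interval_s):
    """Frequency error (in ppm) needed to accumulate offset_change_s over interval_s."""
    return offset_change_s / interval_s * 1e6


# A 0.3 s apparent jump within one 16 s poll interval.
print(required_drift_ppm(0.3, 16))

# For comparison, a 1 ms change over the same interval.
print(required_drift_ppm(0.001, 16))
```

A 0.3 s change across a single 16 s poll would need a rate error of about 1.9 percent, far beyond what an ordinary crystal oscillator does, which supports the conclusion that the jumps come from source switching or sample pairing and not from the system clock coasting.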
Re: [chrony-users] kernel PPS troubleshooting
On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote: > Bill, > Thanks for taking an initial look. I've added my system to our network to > compare our GPS time with the general NTP pool and it looks like our GPS > could be right on the edge of that 0.4s window. I'm going to let it run for > a bit like this and report back after trying a larger offset for our SHM > refclock. The receiver I am using is an MTK3339 if anyone else has a > standard offset they use (default speed and strings (9600 8n1 with GGA GSA > RMC VTG and VSG enabled). Yes, from the log it looks like the SHM and PPS sources are too far from each other (large offdiff value). Also, the GPS source might be too jittery to be used reliably as the locking reference for PPS. One way to find out which one is wrong is to add a good NTP source as the reference, add the noselect option to the GPS and PPS sources (without any locking) and observe the offset values in the refclocks log or chronyc sourcestats output. -- Miroslav Lichvar -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
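A rough model of why a large offdiff gets the pulse rejected: the PPS sample only gives the sub-second part, so it is shifted by whole seconds to land nearest the lock reference (the SHM/NMEA sample), and is then accepted only if the remaining difference plus both dispersions stays under a limit. The names offdiff, refdisp and disp are borrowed from the trace output in this thread; the 0.2 s limit, the function names and the sample numbers are assumptions for illustration and have not been checked against refclock.c.

```python
def align_pulse(pulse_offset, ref_offset, rate=1.0):
    """Shift the pulse offset by whole pulse periods so it lands nearest ref_offset."""
    shift = round((ref_offset - pulse_offset) * rate) / rate
    return pulse_offset + shift


def check_pulse(pulse_offset, ref_offset, refdisp, disp, rate=1.0, limit=0.2):
    """Return (accepted, aligned_offset, offdiff) for one pulse.

    limit is a hypothetical threshold in seconds, not a confirmed chrony constant.
    """
    aligned = align_pulse(pulse_offset, ref_offset, rate)
    offdiff = ref_offset - aligned
    accepted = abs(offdiff) + refdisp + disp < limit / rate
    return accepted, aligned, offdiff


# Hypothetical case 1: the NMEA reference is about 60 ms from the pulse edge.
print(check_pulse(pulse_offset=0.0005, ref_offset=1.06, refdisp=0.03, disp=0.02))

# Hypothetical case 2: the NMEA reference is about 460 ms away from the pulse edge.
print(check_pulse(pulse_offset=0.0005, ref_offset=1.46, refdisp=0.03, disp=0.02))
```

In this model the first pulse is accepted and the second is rejected, which matches the general pattern in the trace where pulses are ignored whenever the NMEA reference wanders several hundred milliseconds from the pulse edge. Correcting the offset configured for the SHM source moves ref_offset back toward the pulse edge and brings offdiff down.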
RE: [chrony-users] kernel PPS troubleshooting
On Tue, 26 Nov 2013, Battocchi, Scott L. wrote:

> Bill,
> Thanks for taking an initial look. I've added my system to our network to compare our GPS time with the general NTP pool and it looks like our GPS could be right on the edge of that 0.4s window. I'm going to let it run for a bit like this and report back after trying a larger offset for our SHM refclock. The receiver I am using is an MTK3339 if anyone else has a standard offset they use (default speed and strings (9600 8n1 with GGA GSA RMC VTG and VSG enabled).

Why do you have all of those sentences? You only need one for timing.

> Thanks,
> Scott
>
> -----Original Message-----
> From: Bill Unruh [mailto:un...@physics.ubc.ca]
> Sent: Tuesday, November 26, 2013 1:43 PM
> To: chrony-users@chrony.tuxfamily.org
> Subject: RE: [chrony-users] kernel PPS troubleshooting
>
> The pps can only give you when the second turnover occurs, it cannot tell you which second that is. That MUST be given by some other time source, which could be the nmea sentences from the gps or by some other source. The problem with the nmea is that it is usually late. Late by something like .5 to 1 sec. But chrony must be confident that the system time is within less than .5 sec of the real time before it will trust the PPS source. Now, it looks to me on a very quick look that this is not happening for some reason, and so that pps data is being rejected. I have not looked at Miroslav's code to figure out exactly what is being reported, so am not at all confident I am reading it properly.
>
> On Tue, 26 Nov 2013, Battocchi, Scott L. wrote:
>> Miroslav,
>> Thanks for the modified source, I've recompiled it with --enable-trace and do indeed get a lot more information.
>> I modified the original chrony.conf to drop the external gps (GPSe/PPSe) since they were generating a lot of sample ignored trace messages (no valid fix and no updating pps), so the reports below are with only the GPSi/PPSi sources active in the configuration.
I've tried to copy key portions of the run below to avoid attaching the 5MB trace log, I'm open to other methods of sharing the whole log if there is interest. I have attached the tracking.log since it shows when PPS was available compared to the long periods where GPS was active (and the PPS was coming into /dev/pps1). It looks like the pulse is ignored when offdiff is relatively large (>0.2?), but that the offdiff steps very quickly between valid and invalid. It is also possible I'm interpreting the pulse handling completely incorrectly.

Starting the modified chrony all appears well for a while but within a couple of minutes most of the PPS pulses are ignored. PPS goes in and out of being ignored for the next ~5 minutes before disappearing for another 90 minutes. After that brief recovery, it is ignored for the rest of the run:

:~/chronytrace# ./chronyd -d
main.c:355:(main)[26-18:00:28] chronyd version DEVELOPMENT starting
sys_linux.c:1022:(get_version_specific_details)[26-18:00:28] Linux kernel major=3 minor=3 patch=0
sys_linux.c:1080:(get_version_specific_details)[26-18:00:28] hz=100 shift_hz=7 freq_scale=1. nominal_tick=1 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
local.c:565:(lcl_RegisterSystemDrivers)[26-18:00:28] Local freq=297.043ppm
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock PPS added poll=4 dpoll=0 filter=16
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock SHM added poll=4 dpoll=0 filter=16
reference.c:194:(REF_Initialise)[26-18:00:28] Initial frequency 297.043 ppm
sources.c:331:(SRC_SetSelectable)[26-18:00:28] PPSi
sources.c:331:(SRC_SetSelectable)[26-18:00:28] GPSi
refclock.c:416:(RCL_AddPulse)[26-18:00:28] refclock pulse ignored no ref sample
refclock.c:687:(filter_add_sample)[26-18:00:28] filter sample 0 t=Tue 11/26/13 18:00:28.080062 offset=1.059937008 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:29] refclock pulse ignored offdiff=-0.459021095 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:29] filter sample 1 t=Tue 11/26/13 18:00:29.060753 offset=1.079246196 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:30] refclock pulse ignored offdiff=-0.440026443 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:30] filter sample 2 t=Tue 11/26/13 18:00:30.076668 offset=1.063331638 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:31] refclock pulse ignored offdiff=-0.456248500 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:31] filter sample 3 t=Tue 11/26/13 18:00:31.121044 offset=1.018955039 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:32] refclock pulse ignored offdiff=0.499073485 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:32] filter sample 4 t=Tue 11/26/13 18:00:32.121227 offset=1.018772016 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:33] refclock pul
RE: [chrony-users] kernel PPS troubleshooting
Bill,

Thanks for taking an initial look. I've added my system to our network to compare our GPS time with the general NTP pool and it looks like our GPS could be right on the edge of that 0.4s window. I'm going to let it run for a bit like this and report back after trying a larger offset for our SHM refclock. The receiver I am using is an MTK3339, in case anyone else has a standard offset they use (default speed and strings: 9600 8n1 with GGA, GSA, RMC, VTG and VSG enabled).

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Tuesday, November 26, 2013 1:43 PM
To: chrony-users@chrony.tuxfamily.org
Subject: RE: [chrony-users] kernel PPS troubleshooting

The pps can only tell you when the second turnover occurs; it cannot tell you which second that is. That MUST be given by some other time source, which could be the nmea sentences from the gps or some other source. The problem with the nmea is that it is usually late, by something like .5 to 1 sec. But chrony must be confident that the system time is within less than .5 sec of the real time before it will trust the PPS source. Now, it looks to me on a very quick look that this is not happening for some reason, and so the pps data is being rejected. I have not looked at Miroslav's code to figure out exactly what is being reported, so I am not at all confident I am reading it properly.

On Tue, 26 Nov 2013, Battocchi, Scott L. wrote:

> Miroslav,
>
> Thanks for the modified source. I've recompiled it with --enable-trace and
> do indeed get a lot more information.
>
> I modified the original chrony.conf to drop the external gps (GPSe/PPSe),
> since they were generating a lot of "sample ignored" trace messages (no valid
> fix and no updating pps), so the reports below are with only the GPSi/PPSi
> sources active in the configuration.
>
> I've tried to copy key portions of the run below to avoid attaching the 5MB
> trace log. I'm open to other methods of sharing the whole log if there is
> interest.
RE: [chrony-users] kernel PPS troubleshooting
Miroslav,

Thanks for the modified source. I've recompiled it with --enable-trace and do indeed get a lot more information.

I modified the original chrony.conf to drop the external gps (GPSe/PPSe), since they were generating a lot of "sample ignored" trace messages (no valid fix and no updating pps), so the reports below are with only the GPSi/PPSi sources active in the configuration.

I've tried to copy key portions of the run below to avoid attaching the 5MB trace log. I'm open to other methods of sharing the whole log if there is interest. I have attached the tracking.log since it shows when PPS was available compared to the long periods where GPS was active (and the PPS was coming into /dev/pps1). It looks like the pulse is ignored when offdiff is relatively large (>0.2?), but the offdiff steps very quickly between valid and invalid. It is also possible I'm interpreting the pulse handling completely incorrectly.

After starting the modified chrony, all appears well for a while, but within a couple of minutes most of the PPS pulses are ignored. PPS goes in and out of being ignored for the next ~5 minutes before disappearing for another 90 minutes. After that brief recovery, it is ignored for the rest of the run:

:~/chronytrace# ./chronyd -d
main.c:355:(main)[26-18:00:28] chronyd version DEVELOPMENT starting
sys_linux.c:1022:(get_version_specific_details)[26-18:00:28] Linux kernel major=3 minor=3 patch=0
sys_linux.c:1080:(get_version_specific_details)[26-18:00:28] hz=100 shift_hz=7 freq_scale=1. nominal_tick=1 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
local.c:565:(lcl_RegisterSystemDrivers)[26-18:00:28] Local freq=297.043ppm
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock PPS added poll=4 dpoll=0 filter=16
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock SHM added poll=4 dpoll=0 filter=16
reference.c:194:(REF_Initialise)[26-18:00:28] Initial frequency 297.043 ppm
sources.c:331:(SRC_SetSelectable)[26-18:00:28] PPSi
sources.c:331:(SRC_SetSelectable)[26-18:00:28] GPSi
refclock.c:416:(RCL_AddPulse)[26-18:00:28] refclock pulse ignored no ref sample
refclock.c:687:(filter_add_sample)[26-18:00:28] filter sample 0 t=Tue 11/26/13 18:00:28.080062 offset=1.059937008 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:29] refclock pulse ignored offdiff=-0.459021095 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:29] filter sample 1 t=Tue 11/26/13 18:00:29.060753 offset=1.079246196 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:30] refclock pulse ignored offdiff=-0.440026443 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:30] filter sample 2 t=Tue 11/26/13 18:00:30.076668 offset=1.063331638 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:31] refclock pulse ignored offdiff=-0.456248500 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:31] filter sample 3 t=Tue 11/26/13 18:00:31.121044 offset=1.018955039 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:32] refclock pulse ignored offdiff=0.499073485 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:32] filter sample 4 t=Tue 11/26/13 18:00:32.121227 offset=1.018772016 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:33] refclock pulse ignored offdiff=0.498576090 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:33] filter sample 5 t=Tue 11/26/13 18:00:32.702642 offset=1.437357065 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:34] refclock pulse second=0.479480414 offset=1.520519586 offdiff=-0.083162521 samplediff=0.776838000
refclock.c:687:(filter_add_sample)[26-18:00:34] filter sample 0 t=Tue 11/26/13 18:00:33.479480 offset=1.520519586 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:34] filter sample 6 t=Tue 11/26/13 18:00:33.704596 offset=1.435403234 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:35] refclock pulse second=0.479162252 offset=1.520837748 offdiff=-0.085434514 samplediff=0.774566000
refclock.c:687:(filter_add_sample)[26-18:00:35] filter sample 1 t=Tue 11/26/13 18:00:34.479162 offset=1.520837748 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:35] filter sample 7 t=Tue 11/26/13 18:00:34.692316 offset=1.447683341 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:36] refclock pulse second=0.478844465 offset=1.521155535 offdiff=-0.073472194 samplediff=0.786528000
refclock.c:687:(filter_add_sample)[26-18:00:36] filter sample 2 t=Tue 11/26/13 18:00:35.478844 offset=1.521155535 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:36] filter sample 8 t=Tue 11/26/13 18:00:35.708774 offset=1.431225071 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:37] refclock pulse second=0.478551879 offset=1.521448121 offdiff=-0.090223050 samplediff=0.769777000
refclock.c:687:(fil
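For anyone who wants to tally accepted versus ignored pulses without reading the whole 5MB log, here is a minimal Python sketch. It assumes the RCL_AddPulse messages are worded exactly as in the excerpt above (they come from a development build and may differ in other versions); the name `summarize_pulses` is only illustrative.

```python
import re

# Patterns copied from the trace excerpt above; other chrony builds may
# word these messages differently.
IGNORED_RE = re.compile(r"refclock pulse ignored offdiff=(-?\d+\.\d+)")
NO_REF_RE = re.compile(r"refclock pulse ignored no ref sample")
ACCEPTED_RE = re.compile(
    r"refclock pulse second=\S+ offset=\S+ offdiff=(-?\d+\.\d+)"
)


def summarize_pulses(trace_text):
    """Collect offdiff values for ignored and accepted PPS pulses."""
    # Collapse line wraps and mail quote markers so wrapped entries still match.
    flat = " ".join(trace_text.replace(">", " ").split())
    return {
        "ignored": [float(v) for v in IGNORED_RE.findall(flat)],
        "accepted": [float(v) for v in ACCEPTED_RE.findall(flat)],
        "no_ref": len(NO_REF_RE.findall(flat)),
    }
```

Feeding it the text of a saved trace gives two lists of offdiff values, which should make it easier to see where the boundary between accepted and ignored pulses lies.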
RE: [chrony-users] kernel PPS troubleshooting
The pps can only tell you when the second turnover occurs; it cannot tell you which second that is. That MUST be given by some other time source, which could be the nmea sentences from the gps or some other source. The problem with the nmea is that it is usually late, by something like .5 to 1 sec. But chrony must be confident that the system time is within less than .5 sec of the real time before it will trust the PPS source. Now, it looks to me on a very quick look that this is not happening for some reason, and so the pps data is being rejected. I have not looked at Miroslav's code to figure out exactly what is being reported, so I am not at all confident I am reading it properly.

On Tue, 26 Nov 2013, Battocchi, Scott L. wrote:

> Miroslav,
>
> Thanks for the modified source. I've recompiled it with --enable-trace and do indeed get a lot more information.
>
> I modified the original chrony.conf to drop the external gps (GPSe/PPSe), since they were generating a lot of "sample ignored" trace messages (no valid fix and no updating pps), so the reports below are with only the GPSi/PPSi sources active in the configuration.
>
> I've tried to copy key portions of the run below to avoid attaching the 5MB trace log. I'm open to other methods of sharing the whole log if there is interest. I have attached the tracking.log since it shows when PPS was available compared to the long periods where GPS was active (and the PPS was coming into /dev/pps1). It looks like the pulse is ignored when offdiff is relatively large (>0.2?), but the offdiff steps very quickly between valid and invalid. It is also possible I'm interpreting the pulse handling completely incorrectly.
>
> After starting the modified chrony, all appears well for a while, but within a couple of minutes most of the PPS pulses are ignored. PPS goes in and out of being ignored for the next ~5 minutes before disappearing for another 90 minutes.
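The point that a PPS edge marks only the second boundary, and needs a coarse source to say which second, can be illustrated with a small Python sketch. This is not chrony's code: `resolve_pulse` is a made-up name and the 0.2 s limit is an assumption (it is Scott's guess from the trace). Only the arithmetic is taken from the thread, since it reproduces the offset and offdiff values in the logged trace.

```python
def resolve_pulse(pulse_fraction, ref_offset, max_offdiff=0.2):
    """Label a PPS edge with the whole second nearest to a coarse reference.

    pulse_fraction: sub-second part of the system time at the pulse edge
    ref_offset: offset reported by the coarse source (e.g. NMEA via SHM)
    max_offdiff: hypothetical acceptance limit, not chrony's actual rule
    """
    # The pulse marks a true second boundary, so the clock offset is
    # -pulse_fraction plus an unknown whole number of seconds.
    base = -pulse_fraction
    # Choose the whole second that brings the PPS offset closest to the
    # reference offset; this is the step that needs the other source.
    seconds = round(ref_offset - base)
    offset = base + seconds
    offdiff = ref_offset - offset
    return offset, offdiff, abs(offdiff) < max_offdiff
```

With the values logged at 18:00:34 (second=0.479480414, reference offset 1.437357065) this gives offset 1.520519586 and offdiff -0.083162521, matching the trace. When the reference is off by about half a second, offdiff lands near +/-0.5 and the pulse can no longer be assigned to a second with any confidence.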
Re: [chrony-users] kernel PPS troubleshooting
On Mon, Nov 25, 2013 at 06:50:01PM -0500, Battocchi, Scott L. wrote:
> I've recently cross-compiled chrony-0308330 to run on our armv5 platform and
> it seems to silently/selectively ignore our PPS source even when it is
> present. Currently all testing is being done with our cheap receiver
> (GPSi/PPSi below). After ~hours I get a handful of entries into the
> refclocks.log for the PPSi source, but no mention on the console that the
> source is or is not present. Right this instant we are getting updates to
> /sys/class/pps/pps1/assert every second but chronyc sources shows the LastRX
> as 26 minutes ago.
>
> Is there a way to enable more verbose debugging of the chrony source
> selection/rejection process so that I can see why it is rejecting what look
> to be good PPS updates? I'm happy to provide more information, logs, or
> compile options as necessary.

The configuration looks good. A similar setup works fine here (although I have only one GPS). There are a number of places where the PPS sample can be dropped. I've added some new trace messages to help us see what's going on. Can you please pull from git, run configure with --enable-trace, recompile, and see what refclock messages you get?

--
Miroslav Lichvar
Re: [chrony-users] kernel PPS troubleshooting
On Mon, 25 Nov 2013, Battocchi, Scott L. wrote:

> Hi,
>
> I'm very new to chrony and am trying to do something that I believe to be supported. I am trying to sync our system to two local GPS receivers depending on whether one or both of them have a fix, with a preference for the higher priced receiver. As we are on an embedded platform, I'm using the PPS-GPIO kernel module to get our PPS signal in. I'm working off of the chrony development branch for the PHC support, because we will eventually want to tie our PTP capable PHY into chrony and the rest of the system.
>
> I've recently cross-compiled chrony-0308330 to run on our armv5 platform and it seems to silently/selectively ignore our PPS source even when it is present. Currently all testing is being done with our cheap receiver (GPSi/PPSi below). After ~hours I get a handful of entries into the refclocks.log for the PPSi source, but no mention on the console that the source is or is not present. Right this instant we are getting updates to /sys/class/pps/pps1/assert every second but chronyc sources shows the LastRX as 26 minutes ago.
>
> Is there a way to enable more verbose debugging of the chrony source selection/rejection process so that I can see why it is rejecting what look to be good PPS updates? I'm happy to provide more information, logs, or compile options as necessary.
>
> Thanks in advance!
> Scott
>
> Our system has the following sources currently configured:
> /dev/ttyS0 is our high priced GPS
> /dev/eser2 is our cheap GPS
> /dev/pps0 is the pps signal from our expensive GPS
> /dev/pps1 is the pps signal from our cheap GPS

So are you getting stuff into /dev/pps{0,1}? I assume that the ttyS0 and eser2 are NMEA type data, not PPS data. Have you looked at /var/log/chrony/refclock? It should tell you if chrony is seeing the inputs and rejecting them for some reason. Also look at /proc/interrupts to see if the interrupts are coming in. (Note that I do not know the PPS-GPIO module, so I do not know what it reports.)

> We are using gpsd (3.10) to read in the GPSs as follows:
> gpsd -bn /dev/ttyS0 /dev/eser2
>
> The PPS-GPIO module is configured to look for rising edges on the two gpios and connect them to pps0 and pps1. This works, as ppstest captures consecutive reads from /dev/pps1 while I was running the chrony testing.
>
> chrony.conf (commented out the socket interface to gpsd, a question for another post):
> refclock PPS /dev/pps0 lock GPSe refid PPSe
> refclock PPS /dev/pps1 lock GPSi refid PPSi
> refclock SHM 0 offset 0.001 delay 0.0001 refid GPSe
> refclock SHM 2 offset 0.140 delay 0.01 refid GPSi
> #refclock SOCK /var/run/chrony.ttyS0.sock refid GPSe
> #refclock SOCK /var/run/chrony.eser2.sock offset 0.140 delay 0.01 refid GPSi
> logdir /var/log/chrony
> log measurements statistics tracking refclocks
>
> The following is the console output from chronyd -d:
> :~# ./chronyd -d
> main.c:355:(main)[25-22:50:52] chronyd version DEVELOPMENT starting
> sys_linux.c:1022:(get_version_specific_details)[25-22:50:53] Linux kernel major=3 minor=3 patch=0
> sys_linux.c:1080:(get_version_specific_details)[25-22:50:53] hz=100 shift_hz=7 freq_scale=1. nominal_tick=1 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
> sources.c:913:(SRC_SelectSource)[25-22:51:56] Selected source GPSi

--
William G. Unruh   | Canadian Institute for | Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research      | Fax: +1(604)822-5324
UBC, Vancouver,BC  | Program in Cosmology   | un...@physics.ubc.ca
Canada V6T 1Z1     | and Gravity            | www.theory.physics.ubc.ca/
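One way to confirm from a script that the assert timestamps really advance once per second is to read the sysfs file twice and compare the readings. The following is a minimal Python sketch, assuming the `seconds.nanoseconds#sequence` layout that the Linux PPS sysfs interface uses for the assert file; `parse_pps_assert` and `pulses_between` are illustrative names, not part of any existing tool.

```python
def parse_pps_assert(text):
    """Split a PPS sysfs 'assert' string into (seconds, nanoseconds, sequence)."""
    stamp, _, sequence = text.strip().partition("#")
    seconds, _, nanoseconds = stamp.partition(".")
    return int(seconds), int(nanoseconds), int(sequence)


def pulses_between(first, second):
    """Return (new pulse count, elapsed seconds) between two assert readings."""
    s1, n1, q1 = parse_pps_assert(first)
    s2, n2, q2 = parse_pps_assert(second)
    elapsed = (s2 - s1) + (n2 - n1) / 1e9
    return q2 - q1, elapsed
```

Reading /sys/class/pps/pps1/assert, waiting about ten seconds and reading it again should then report roughly ten new pulses over roughly ten seconds if the kernel side is healthy, which narrows the problem down to what chronyd does with the samples.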