Re: [chrony-users] kernel PPS troubleshooting

2013-12-12 Thread Miroslav Lichvar
On Wed, Dec 11, 2013 at 10:11:48AM -0800, Bill Unruh wrote:
> On Wed, 11 Dec 2013, Miroslav Lichvar wrote:
> >When no source is selected, the PPS samples are ignored. If the SHM
> >source doesn't move to the acceptable range to overlap with the PPS
> >source in 8 polling intervals, the PPS source is marked as unreachable
> >and the SHM source is selected as the only available source.
> 
> That sounds like a bug. PPS should always be part of the selection process. It
> is almost by definition the correct source. And certainly it could be argued
> that the PPS should be the selected source, not the nmea. Of course some
> people (me) use shm to deliver pps to chrony, so shm should not automatically
> be downgraded, but a kernel pps it seems certainly should not be downgraded.

I think it works as expected. When there is a PPS source and a SHM
source and they don't agree, what do you do? Pick the PPS source only
because it's from the PPS driver? The SHM source can be from a PPS
signal too (as it is in your case). If the PPS source were configured with
the prefer flag, I'd probably agree.

> >The configured delay is included in the interval used in the source
> >selection algorithm, so increasing the value from 0.01 to 0.4 or
> >larger should fix the problem.
> 
> A user should not have to do this or know this.

That would be nice, but I'm not sure how chrony should detect that the
source is a falseticker without comparing it to other sources.

The recommended configuration is to mark such sources with noselect
and use them only for PPS locking.

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-users] kernel PPS troubleshooting

2013-12-11 Thread Miroslav Lichvar
On Tue, Dec 10, 2013 at 08:29:25PM -0500, Battocchi, Scott L. wrote:
> I've attached the tracking, measurements, refclocks, and sources logs trimmed 
> to start at the 2.35 hour mark (to coincide with the graph colored by sync 
> source in my previous mail).  I also moved the rolling header line for each 
> log to the start of these trimmed ones and removed any subsequent headers 
> from the remainder of the file.  They each run about 16 minutes and through 
> multiple sync source selections.  I did not include any logs from the first 
> two  minutes where sync=1 and dist actually changed since that seemed to be a 
> startup artifact and not related to the rest of the long run issues.

It seems the dropping of the PPS source is caused by the SHM source having
too small a configured delay. The long-term stability of the SHM source
is worse than the short-term jitter, so the measured dispersion (in
one polling interval) of the SHM source is sometimes smaller than the
current offset, which means it doesn't overlap with the PPS source in
the source selection algorithm and no source is selected with the "no
majority" message.

When no source is selected, the PPS samples are ignored. If the SHM
source doesn't move to the acceptable range to overlap with the PPS
source in 8 polling intervals, the PPS source is marked as unreachable
and the SHM source is selected as the only available source.

The configured delay is included in the interval used in the source
selection algorithm, so increasing the value from 0.01 to 0.4 or
larger should fix the problem.
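
To illustrate the overlap argument above, here is a minimal Python sketch. It
assumes (this is not taken from the chrony source) that each source's selection
interval is its offset plus or minus half the configured delay plus its
dispersion. The function names are made up for the example, the PPS delay is
taken as zero, and the numbers are one PPSi/GPSi pair from the refclocks.log
excerpt posted earlier in this thread.

```python
def interval(offset, dispersion, delay):
    """Return (low, high) bounds of a source's selection interval.

    Assumption for illustration: half-width = delay / 2 + dispersion.
    """
    half_width = delay / 2.0 + dispersion
    return (offset - half_width, offset + half_width)


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # PPSi: cooked offset 1.161940e-04, dispersion 2.265e-04, delay assumed 0
    pps = interval(1.161940e-04, 2.265e-04, 0.0)
    for delay in (0.01, 0.4):
        # GPSi: cooked offset -7.094921e-02, dispersion 2.206e-02
        gps = interval(-7.094921e-02, 2.206e-02, delay)
        print(f"delay={delay}: GPSi interval {gps}, overlaps PPSi: {overlaps(pps, gps)}")
```

With delay 0.01 the GPSi interval ends well below zero and misses the PPSi
interval; with delay 0.4 it is wide enough to contain it.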

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-12-11 Thread Bill Unruh

On Wed, 11 Dec 2013, Miroslav Lichvar wrote:
> On Tue, Dec 10, 2013 at 08:29:25PM -0500, Battocchi, Scott L. wrote:
> > I've attached the tracking, measurements, refclocks, and sources logs trimmed
> > to start at the 2.35 hour mark (to coincide with the graph colored by sync
> > source in my previous mail).  I also moved the rolling header line for each
> > log to the start of these trimmed ones and removed any subsequent headers
> > from the remainder of the file.  They each run about 16 minutes and through
> > multiple sync source selections.  I did not include any logs from the first
> > two minutes where sync=1 and dist actually changed since that seemed to be a
> > startup artifact and not related to the rest of the long run issues.
>
> It seems the dropping of the PPS source is caused by SHM source having
> too small configured delay. The long-term stability of the SHM source
> is worse than the short-term jitter, so the measured dispersion (in
> one polling interval) of the SHM source is sometimes smaller than the
> current offset, which means it doesn't overlap with the PPS source in
> the source selection algorithm and no source is selected with the "no
> majority" message.
>
> When no source is selected, the PPS samples are ignored. If the SHM
> source doesn't move to the acceptable range to overlap with the PPS
> source in 8 polling intervals, the PPS source is marked as unreachable
> and the SHM source is selected as the only available source.

That sounds like a bug. PPS should always be part of the selection process. It
is almost by definition the correct source. And certainly it could be argued
that the PPS should be the selected source, not the nmea. Of course some
people (me) use shm to deliver pps to chrony, so shm should not automatically
be downgraded, but a kernel pps it seems certainly should not be downgraded.

> The configured delay is included in the interval used in the source
> selection algorithm, so increasing the value from 0.01 to 0.4 or
> larger should fix the problem.

A user should not have to do this or know this.




--
William G. Unruh   |  Canadian Institute for  |  Tel: +1(604)822-3273
Physics&Astronomy  |     Advanced Research    |  Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology   |  un...@physics.ubc.ca
Canada V6T 1Z1     |      and Gravity         |  www.theory.physics.ubc.ca/




RE: [chrony-users] kernel PPS troubleshooting

2013-12-10 Thread Battocchi, Scott L.
On  Tuesday, December 03, 2013 9:18 AM Miroslav Lichvar wrote: 
> On Mon, Dec 02, 2013 at 02:59:05PM -0500, Battocchi, Scott L. wrote:

>> Since we will not have access to a network time source and will be relying 
>> on GPSD/NMEA to get us in the correct ballpark on system startup, is there 
>> another configuration option we can try to minimize the snapping back to GPS 
>> so quickly?
> You can mark the NMEA source as noselect and still lock the PPS source to it. 
> The PPS samples will be ignored when NMEA is off by more than
> 0.2 seconds, but according to your graphs that shouldn't happen very often.

I will try that in the next couple of days

>> The three attached plots are:
>> 4hr_offsets:  Hours 0-4, offsets straight from statistics.log
>> 4hr_offsets_PPSadjusted:  Hours 0-4, adjusted offsets assuming PPS was 
>> always 0 and using the most recent PPS value to adjust the actual offset in 
>> statistics.log
>> Syncsource_PPSadjusted:   Hours 2-4, same data as PPSadjusted but with 
>> background highlighted according to active sync source from tracking.log
> Nice graphs!

Thanks, figuring out the best way to convey all of the associated log messages 
has consumed more brainpower than I'd like to admit...

>> I have the full console output as well with debugging enabled and am trying 
>> to figure out how best to parse and analyze it.  One thing I noticed in 
>> comparison to my previous run is that all of the ignored PPS samples are 
>> coming from line 465 in refclock.c:
>> refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored 
>> second=0.99657 sync=0 dist=1.5
> Hm, that's weird. Do they all have the same sync and dist value? Could you 
> please attach corresponding parts of the tracking, refclock and statistics 
> logs around the time when the PPS source is dropped?

So a quick grep through the trace log shows that there were
99671 pulses ignored with sync=0, all of which had dist=1.5 (no other dist was
reported with sync=0), and
96 pulses ignored with sync=1, which happened in 6 groupings, each starting with
dist=7.x or 8.x and ending after 16 cycles (one second per cycle) at 22.x
or 23.x.
All 6 of these groupings came in the first 2.5 minutes after starting chrony.
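
For anyone wanting to repeat this tally without grep, a small Python sketch
follows. It assumes each debug message sits on a single line in the trace log
in the form quoted in this thread ("refclock pulse ignored second=... sync=...
dist=..."); the function name is made up for the example.

```python
import re
from collections import Counter

# Matches only the "second= sync= dist=" variant of the ignored-pulse message.
PULSE_RE = re.compile(
    r"refclock pulse ignored\s+second=(?P<second>[\d.]+)"
    r"\s+sync=(?P<sync>\d+)\s+dist=(?P<dist>[\d.]+)"
)


def count_ignored(lines):
    """Count ignored pulses per (sync, dist) pair."""
    counts = Counter()
    for line in lines:
        match = PULSE_RE.search(line)
        if match:
            counts[(int(match["sync"]), float(match["dist"]))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored "
        "second=0.99657 sync=0 dist=1.5",
    ]
    for (sync, dist), n in sorted(count_ignored(sample).items()):
        print(f"sync={sync} dist={dist}: {n}")
```

Lines of the other form (offdiff=... refdisp=... disp=...) are deliberately
not matched, so the two rejection paths can be counted separately.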

I've attached the tracking, measurements, refclocks, and sources logs trimmed 
to start at the 2.35 hour mark (to coincide with the graph colored by sync 
source in my previous mail).  I also moved the rolling header line for each log 
to the start of these trimmed ones and removed any subsequent headers from the 
remainder of the file.  They each run about 16 minutes and through multiple 
sync source selections.  I did not include any logs from the first two  minutes 
where sync=1 and dist actually changed since that seemed to be a startup 
artifact and not related to the rest of the long run issues.
I also have a trimmed console output at 810kB that I can send along if 
interested.

>> and not line 440 like they were on the previous run:
>> refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored 
>> offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546
> They are ignored in a different place because the lock option wasn't used 
> this time.

Ahh, that makes sense.

Thanks again for all the help,
Scott Battocchi


Attachments:
tracking_2.35hrs_in.log
measurements_2.35hrs_in.log
refclocks_2.35hrs_in.log
statistics_2.35hrs_in.log


Re: [chrony-users] kernel PPS troubleshooting

2013-12-03 Thread Miroslav Lichvar
On Mon, Dec 02, 2013 at 02:59:05PM -0500, Battocchi, Scott L. wrote:
> Since we will not have access to a network time source and will be relying on 
> GPSD/NMEA to get us in the correct ballpark on system startup, is there 
> another configuration option we can try to minimize the snapping back to GPS 
> so quickly?

You can mark the NMEA source as noselect and still lock the PPS source
to it. The PPS samples will be ignored when NMEA is off by more than
0.2 seconds, but according to your graphs that shouldn't happen very
often.
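
As a rough illustration of that 0.2 second limit, here is a Python sketch. It
assumes the test is simply whether the absolute offset difference between the
pulse and the lock source is below 0.2 s; the real check in refclock.c may also
take the dispersions into account, and the function name is hypothetical.

```python
LOCK_LIMIT = 0.2  # seconds, the figure quoted above


def pulse_accepted(offdiff, limit=LOCK_LIMIT):
    """Return True if a pulse would pass the assumed |offdiff| < limit test."""
    return abs(offdiff) < limit


if __name__ == "__main__":
    # offdiff from the "line 440" debug message reported for the earlier run
    print(pulse_accepted(-0.313099609))
    # a GPSi cooked offset from the refclocks.log excerpt, for comparison
    print(pulse_accepted(-7.094921e-02))
```

Under this assumption the offdiff from the earlier run falls outside the limit,
while the typical GPSi offsets in the log excerpt fall inside it.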

> The three attached plots are:
> 4hr_offsets:  Hours 0-4, offsets straight from statistics.log
> 4hr_offsets_PPSadjusted:  Hours 0-4, adjusted offsets assuming PPS was always 
> 0 and using the most recent PPS value to adjust the actual offset in 
> statistics.log
> Syncsource_PPSadjusted:   Hours 2-4, same data as PPSadjusted but with 
> background highlighted according to active sync source from tracking.log

Nice graphs!

> I have the full console output as well with debugging enabled and am trying 
> to figure out how best to parse and analyze it.  One thing I noticed in 
> comparison to my previous run is that all of the ignored PPS samples are 
> coming from line 465 in refclock.c:
> refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored 
> second=0.99657 sync=0 dist=1.5

Hm, that's weird. Do they all have the same sync and dist value? Could
you please attach corresponding parts of the tracking, refclock and
statistics logs around the time when the PPS source is dropped?

> and not line 440 like they were on the previous run:
> refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored 
> offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546

They are ignored in a different place because the lock option
wasn't used this time.

-- 
Miroslav Lichvar




RE: [chrony-users] kernel PPS troubleshooting

2013-12-02 Thread Battocchi, Scott L.
I think it is not an issue of actually losing PPS for long periods of time so 
much as chrony ignoring "valid" PPS pulses as faulty.  Note: I'm calling the 
pulses "valid" since I can see them through ppstest and the chrony debug output 
looks like it sees them with offsets below 5ms but ignores them.

We should be able to set our data collection program up to check for GPS lock 
and chrony's selected source to set "noselect" on the GPS after the PPS has 
locked on, and then unset it if we actually lose a PPS signal and need to 
reacquire.

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca] 
Sent: Monday, December 02, 2013 1:56 PM
To: chrony-users@chrony.tuxfamily.org
Subject: RE: [chrony-users] kernel PPS troubleshooting

The key purpose of the gps is to supply the seconds for the PPS. Once it has 
done that it is no longer needed. Thus you could have the gps run with pps for 
a while, and then do a noselect on it using chronyc. That way chrony would rely 
on the free running of the system clock to supply the seconds, and the pps to 
supply the microseconds.

However it is disturbing that you are losing pps for long periods of time.
That might indicate that there is something wrong with your gps receiver. I 
know I had trouble with mine that the antenna was defective.



RE: [chrony-users] kernel PPS troubleshooting

2013-12-02 Thread Bill Unruh

On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:
> I think it is not an issue of actually losing PPS for long periods of time so
> much as chrony ignoring "valid" PPS pulses as faulty.  Note: I'm calling the
> pulses "valid" since I can see them through ppstest and the chrony debug
> output looks like it sees them with offsets below 5ms but ignores them.

It should not be ignoring them-- that does sound like a bug. My only concern
is that I have not seen my system ignore pps pulses (but then I do not use the
kernel pps-- I use my own driver which feeds the pps through the shm).

> We should be able to set our data collection program up to check for GPS lock
> and chrony's selected source to set "noselect" on the GPS after the PPS has
> locked on, and then unset it if we actually lose a PPS signal and need to
> reacquire.

You would have to lose lock for a LONG time to need to reuse the gps to set
the seconds. Typically pps will bring the system drift to much less than 1
PPM, which would take a month to produce a 1 second error. Ie you would have
to lose lock for a month before you would need to reuse GPS. In which case
something far more serious than "lose lock" has happened.
Ie, even with a sporadically working PPS, you should be able to get the
computer time to within a second by say gps, and then forget about it.
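
The drift figures above can be checked with a little arithmetic. The Python
sketch below assumes a constant residual frequency error, which is a
simplification, and the function name is made up for the example.

```python
SECONDS_PER_DAY = 86400.0


def days_to_drift(error_s, drift_ppm):
    """Days for a clock running off by drift_ppm to accumulate error_s seconds."""
    return error_s / (drift_ppm * 1e-6) / SECONDS_PER_DAY


if __name__ == "__main__":
    for ppm in (1.0, 0.4, 0.1):
        print(f"{ppm} ppm: {days_to_drift(1.0, ppm):.1f} days to drift 1 s")
```

At exactly 1 PPM a one second error takes about 11.6 days; at roughly 0.4 PPM
it takes about a month, which matches "much less than 1 PPM".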






RE: [chrony-users] kernel PPS troubleshooting

2013-12-02 Thread Battocchi, Scott L.
On Thursday, November 28, 2013 6:15 AM  Miroslav Lichvar wrote:
>On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
>> I ran the GPS while connected to a handful of ntp servers and saw that my 
>> gps offset (originally 0.180) was too low, so I bumped it up to 0.530 for 
>> the next two tests.  I've attached plots of the offset as recorded in the 
>> statistics.log file, if there are other metrics that would be useful I'm 
>> happy to graph them and send them out.
>> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not 
>> locked to anything, but is selectable) gps.png is after the ntp test but 
>> back to just using the GPS and PPS, it looks like sometimes GPS gets 
>> selected as the source forcing the PPS signal to look like it is drifting 
>> relative to the system.

>That looks similar to what I see with a Garmin 18x LVC. This is a capture 
>30 hours long I did some time ago (the NMEA source's offset value was set to 
>0.5):

>http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

>Since gpsd has added support for kernel PPS, I think it's better to use the 
>SHM 1 or SOCK source instead of PPS. Let it handle the HW details and pair the 
>PPS and NMEA samples.

I could not see how to get GPSD to associate a kernel PPS source (our /dev/pps1 
is driven by the PPS-GPIO kernel module and does not come in through the serial 
port's DCD line) with a NMEA source.  Without a PPS signal coming into GPSD I 
didn't seem to get any data into chrony through the SOCK interface even though 
GPSD did see and successfully connect to it according to the GPSD debug output.

Thanks,
Scott




RE: [chrony-users] kernel PPS troubleshooting

2013-12-02 Thread Battocchi, Scott L.
Hi All,
Sorry for the delayed response.  I have collected 36 hours of data with the 
following sources:
refclock PPS /dev/pps1 refid PPSi
refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect

Since we will be running disconnected from real NTP servers in our application 
I had the 3 NTP servers as noselect so that I could track the GPS and PPS 
against them, but not actually use them in the selection algorithm.

> Miroslav said:
>If a source disappears for 8 polling intervals, chronyd will select another 
>source even if it's much worse. I agree that could be improved. With NMEA 
>sources it's usually better to use the noselect option or don't configure it 
>at all.
Since we will not have access to a network time source and will be relying on 
GPSD/NMEA to get us in the correct ballpark on system startup, is there another 
configuration option we can try to minimize the snapping back to GPS so quickly?

The three attached plots are:
4hr_offsets:  Hours 0-4, offsets straight from statistics.log
4hr_offsets_PPSadjusted:  Hours 0-4, adjusted offsets assuming PPS was always 0 
and using the most recent PPS value to adjust the actual offset in 
statistics.log
Syncsource_PPSadjusted:   Hours 2-4, same data as PPSadjusted but with 
background highlighted according to active sync source from tracking.log

Looking through the refclocks.log it seems as though even with both PPS and GPS 
present and having samples filtered, often after a GPS filtered entry in the 
log PPS samples would be dropped completely until one or more subsequent GPS 
filtered entries.
{14 GPSi samples and 14 PPSi samples}
2013-11-27 23:08:38.999883 PPSi   15 N 1  2.455370e-04  1.161940e-04  2.265e-04
2013-11-27 23:08:36.999489 PPSi    - N -   -5.107210e-04  1.854e-04
2013-11-27 23:08:39.600949 GPSi   15 N 0 -6.007421e-01 -7.094921e-02  2.206e-02
2013-11-27 23:08:33.249250 GPSi    - N -             - -1.925024e-02  6.892e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:08:55.532367 GPSi   15 N 0 -5.323654e-01 -2.367523e-03  2.179e-02
2013-11-27 23:08:46.365687 GPSi    - N -             - -3.568759e-02  7.070e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:09:43.590657 GPSi   15 N 0 -5.906571e-01 -6.065711e-02  2.146e-02
2013-11-27 23:09:37.901101 GPSi    - N -             - -7.110153e-02  6.716e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:10:00.489102 GPSi   15 N 0 -4.891029e-01  4.089708e-02  2.124e-02
2013-11-27 23:09:52.357123 GPSi    - N -             - -2.712306e-02  6.472e-03
2013-11-27 23:10:00.000461 PPSi    0 N 1 -5.952060e-04 -4.616970e-04  1.896e-04
2013-11-27 23:10:01.561675 GPSi    0 N 0 -5.618044e-01 -3.167506e-02  2.047e-02
{14 GPSi samples, 14 PPSi samples for 3 more rounds, before dropping PPS 
samples again}

I have the full console output as well with debugging enabled and am trying to 
figure out how best to parse and analyze it.  One thing I noticed in comparison 
to my previous run is that all of the ignored PPS samples are coming from line 
465 in refclock.c:
refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored 
second=0.99657 sync=0 dist=1.5
and not line 440 like they were on the previous run:
refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored 
offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca] 
Sent: Friday, November 29, 2013 11:48 AM
To: chrony-users@chrony.tuxfamily.org
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:

> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
>> On Fri, 29 Nov 2013, Bill Unruh wrote:
>> By the way, does the kernel PPS do median filtering before passing on 
>> the times to chrony? (Ie, taking the median of say the past 16 inputs 
>> and throwing away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the 
> PPS driver fetches the latest PPS sample from the kernel once per 
> second and the refclock poll (16 seconds by default) runs the median 
> filter.

Ah. OK.
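
For reference, the filtering procedure described in the question above (take
the median, throw away the worst outliers, take the median again) could be
sketched in Python as follows. This only illustrates that description, it is
not chronyd's actual median filter, and the function name and sample values
are hypothetical.

```python
from statistics import median


def trimmed_median(samples, n_discard=6):
    """Median of the samples left after dropping the n_discard samples
    that lie farthest from the median of the full set."""
    if not 0 <= n_discard < len(samples):
        raise ValueError("n_discard must be in [0, len(samples))")
    first = median(samples)
    # Sort by distance from the first median and keep the closest ones.
    by_distance = sorted(samples, key=lambda s: abs(s - first))
    kept = by_distance[: len(samples) - n_discard]
    return median(kept)


if __name__ == "__main__":
    # 16 hypothetical offsets in microseconds: 10 good ones and 6 wild ones
    offsets_us = list(range(-5, 5)) + [300000] * 6
    print(trimmed_median(offsets_us))
```

With 16 samples per 16 second poll this would leave 10 samples for the second
median, so up to 6 bad pulses per poll would not affect the result.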

>
>> Anyway, it should not be switching sources unless the deviation of 
>> the selected source exceeds the variance of the alternative (or 
>> unless the source has disappeared for a suitable number of poll 
>> intervals, probably related to how long one would expect to wait for 
>> the drift rate variance to make the system clock deviate by more than 
>> the second source's variance. Ie, you are far better off letting a 
>> clock drift unconstrained for a while than to jump to source which has a 
>> huge (factor

RE: [chrony-users] kernel PPS troubleshooting

2013-12-02 Thread Bill Unruh

The key purpose of the gps is to supply the seconds for the PPS. Once it has
done that it is no longer needed. Thus you could have the gps run with pps for
a while, and then do a noselect on it using chronyc. That way chrony would
rely on the free running of the system clock to supply the seconds, and the
pps to supply the microseconds.

However it is disturbing that you are losing pps for long periods of time.
That might indicate that there is something wrong with your gps receiver. I
know I had trouble with mine that the antenna was defective.


On Mon, 2 Dec 2013, Battocchi, Scott L. wrote:


Hi All,
Sorry for the delayed response.  I have collected 36 hours of data with the 
following sources:
refclock PPS /dev/pps1 refid PPSi
refclock SHM 2 offset 0.530 delay 0.01 refid GPSi
server 1.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 2.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect
server 3.us.pool.ntp.org minpoll 5 maxpoll 10 maxdelay 0.4 noselect

Since we will be running disconnected from real NTP servers in our application 
I had the 3 NTP servers as noselect so that I could track the GPS and PPS 
against them, but not actually use them in the selection algorithm.


Miroslav said:
If a source disappears for 8 polling intervals, chronyd will select another 
source even if it's much worse. I agree that could be improved. With NMEA 
sources it's usually better to use the noselect option or don't configure it at 
all.

Since we will not have access to a network time source and will be relying on 
GPSD/NMEA to get us in the correct ballpark on system startup, is there another 
configuration option we can try to minimize the snapping back to GPS so quickly?

The three attached plots are:
4hr_offsets:  Hours 0-4, offsets straight from statistics.log
4hr_offsets_PPSadjusted:  Hours 0-4, adjusted offsets assuming PPS was always 0 
and using the most recent PPS value to adjust the actual offset in 
statistics.log
Syncsource_PPSadjusted:   Hours 2-4, same data as PPSadjusted but with 
background highlighted according to active sync source from tracking.log

Looking through the refclocks.log it seems as though even with both PPS and GPS 
present and having samples filtered, often after a GPS filtered entry in the 
log PPS samples would be dropped completely until one or more subsequent GPS 
filtered entries.
{14 GPSi samples and 14 PPSi samples}
2013-11-27 23:08:38.999883 PPSi   15 N 1  2.455370e-04  1.161940e-04  2.265e-04
2013-11-27 23:08:36.999489 PPSi    - N -   -5.107210e-04  1.854e-04
2013-11-27 23:08:39.600949 GPSi   15 N 0 -6.007421e-01 -7.094921e-02  2.206e-02
2013-11-27 23:08:33.249250 GPSi    - N -   -   -1.925024e-02  6.892e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:08:55.532367 GPSi   15 N 0 -5.323654e-01 -2.367523e-03  2.179e-02
2013-11-27 23:08:46.365687 GPSi    - N -   -   -3.568759e-02  7.070e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:09:43.590657 GPSi   15 N 0 -5.906571e-01 -6.065711e-02  2.146e-02
2013-11-27 23:09:37.901101 GPSi    - N -   -   -7.110153e-02  6.716e-03
{14 GPSi samples, NO PPSi samples}
2013-11-27 23:10:00.489102 GPSi   15 N 0 -4.891029e-01  4.089708e-02  2.124e-02
2013-11-27 23:09:52.357123 GPSi    - N -   -   -2.712306e-02  6.472e-03
2013-11-27 23:10:00.000461 PPSi    0 N 1 -5.952060e-04 -4.616970e-04  1.896e-04
2013-11-27 23:10:01.561675 GPSi    0 N 0 -5.618044e-01 -3.167506e-02  2.047e-02
{14 GPSi samples, 14 PPSi samples for 3 more rounds, before dropping PPS samples again}

I have the full console output as well with debugging enabled and am trying to
figure out how best to parse and analyze it.  One thing I noticed in comparison
to my previous run is that all of the ignored PPS samples are coming from line
465 in refclock.c:
refclock.c:465:(RCL_AddPulse)[28-14:20:00] refclock pulse ignored second=0.99657 sync=0 dist=1.5

and not line 440 like they were on the previous run:
refclock.c:440:(RCL_AddPulse)[26-18:03:56] refclock pulse ignored offdiff=-0.313099609 refdisp=0.041061551 disp=0.022734546
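
One possible way to parse that console output is sketched below in Python. It assumes each trace message sits on a single line in the captured output (the mail client wrapped them here) and that other messages follow the same "file:line:(function)[timestamp] text key=value ..." layout as the two quoted above. The pattern and the helper names parse_trace and count_ignored_pulses are made up for this illustration; they are not part of chrony.

```python
import re
from collections import Counter

# Layout inferred from the two trace lines quoted in this message; other
# chrony trace messages may be formatted differently.
TRACE_RE = re.compile(
    r"^(?P<file>[\w.]+):(?P<line>\d+):\((?P<func>\w+)\)"
    r"\[(?P<stamp>[^\]]+)\]\s+(?P<message>.*)$"
)
# key=value pairs with a plain decimal value, e.g. offdiff=-0.313099609
FIELD_RE = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?)")


def parse_trace(line):
    """Return a dict describing one trace line, or None if it does not match."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])
    rec["fields"] = {k: float(v) for k, v in FIELD_RE.findall(rec["message"])}
    return rec


def count_ignored_pulses(lines):
    """Count 'refclock pulse ignored' messages per (source file, line number)."""
    counts = Counter()
    for raw in lines:
        rec = parse_trace(raw)
        if rec and "refclock pulse ignored" in rec["message"]:
            counts[(rec["file"], rec["line"])] += 1
    return counts
```

Feeding the whole console capture to count_ignored_pulses would show at a glance how many pulses were rejected at each place in refclock.c, and the "fields" dict gives the numeric values (offdiff, dist and so on) for plotting.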

Thanks,
Scott

-----Original Message-----
From: Bill Unruh [mailto:un...@physics.ubc.ca]
Sent: Friday, November 29, 2013 11:48 AM
To: chrony-users@chrony.tuxfamily.org
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:
> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
> > On Fri, 29 Nov 2013, Bill Unruh wrote:
> > By the way, does the kernel PPS do median filtering before passing on
> > the times to chrony? (Ie, taking the median of say the past 16 inputs
> > and throwing away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the
> PPS driver fetches the latest PPS sample from the kernel once per
> second and the refclock poll (16 seconds by default) runs the median
> filter.

Ah. OK.

> > Anyway, it should not be switching sources unless

Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Miroslav Lichvar
On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
> On Fri, 29 Nov 2013, Bill Unruh wrote:
> By the way, does the kernel PPS do median filtering before passing on the
> times to chrony? (Ie, taking the median of say the past 16 inputs and throwing
> away the 6 worst outliers and then retaking the median?)

The kernel doesn't filter the PPS samples in any way. In chronyd the
PPS driver fetches the latest PPS sample from the kernel once per
second and the refclock poll (16 seconds by default) runs the median
filter.
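
For illustration, here is a minimal Python sketch of the kind of filter described in the question: take the median of the samples collected during one poll, drop the worst outliers, then take the median again. This is not chronyd's code. The function name median_filter, the default of 6 discarded samples and the idea of applying it to 16 one-per-second offsets are assumptions taken from the wording above.

```python
from statistics import median


def median_filter(offsets, discard=6):
    """Median of the samples left after dropping the `discard` samples
    that lie farthest from the initial median."""
    if len(offsets) <= discard:
        raise ValueError("need more samples than the number discarded")
    centre = median(offsets)
    # Sort by distance from the first median and keep the closest ones.
    kept = sorted(offsets, key=lambda x: abs(x - centre))[: len(offsets) - discard]
    return median(kept)
```

With 16 samples per poll, a handful of wild pulses (up to the discard count) has no influence on the value passed on to the rest of the daemon.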

> Anyway, it should not be switching sources unless the deviation of the
> selected source exceeds the variance of the alternative (or unless the source
> has disappeared for a suitable number of poll intervals, probably related to
> how long one would expect to wait for the drift rate variance to make the
> system clock deviate by more than the second source's variance. Ie, you are
> far better off letting a clock drift unconstrained for a while than to jump to
> source which has a huge (factors of a 1000) worse variance.

The selection algorithm prefers sources with shortest distance (with
refclock that's the measured dispersion + configured delay). If there
are more sources with similar distance they will be combined together.

If a source disappears for 8 polling intervals, chronyd will select
another source even if it's much worse. I agree that could be
improved. With NMEA sources it's usually better to use the noselect
option or don't configure it at all.
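
A rough Python sketch of the behaviour described in the two paragraphs above, taking the description at face value: each source has a distance (measured dispersion plus configured delay), the shortest distance wins, a source that has missed 8 polls is dropped, and a noselect source is never chosen. It deliberately leaves out the combining of similar sources and the comparison between sources that detects falsetickers. The class, the field names and the numbers in the usage example are assumptions for illustration, not chronyd's data structures.

```python
from dataclasses import dataclass

REACH_LIMIT = 8  # polling intervals without a sample, as stated above


@dataclass
class Source:
    name: str
    dispersion: float       # measured dispersion, seconds
    delay: float            # configured delay, seconds
    missed_polls: int = 0   # consecutive polls without a sample
    noselect: bool = False  # never select, only monitor


def distance(src):
    """Distance as described above: measured dispersion + configured delay."""
    return src.dispersion + src.delay


def select(sources):
    """Pick the reachable, selectable source with the shortest distance."""
    candidates = [
        s for s in sources
        if not s.noselect and s.missed_polls < REACH_LIMIT
    ]
    return min(candidates, key=distance, default=None)
```

For example, with a hypothetical PPS source (dispersion 2.3e-4, delay 1e-6) and an NMEA source (dispersion 2.2e-2, delay 0.01), select() returns the PPS source until its missed_polls reaches 8, at which point the much worse NMEA source takes over unless it is marked noselect.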

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Tomalak Geret'kal

On 29/11/2013 18:21, Miroslav Lichvar wrote:
> > Anyway, it should not be switching sources unless the deviation of the
> > selected source exceeds the variance of the alternative (or unless the source
> > has disappeared for a suitable number of poll intervals, probably related to
> > how long one would expect to wait for the drift rate variance to make the
> > system clock deviate by more than the second source's variance. Ie, you are
> > far better off letting a clock drift unconstrained for a while than to jump to
> > source which has a huge (factors of a 1000) worse variance.
>
> The selection algorithm prefers sources with shortest distance (with
> refclock that's the measured dispersion + configured delay). If there
> are more sources with similar distance they will be combined together.
>
> If a source disappears for 8 polling intervals, chronyd will select
> another source even if it's much worse. I agree that could be
> improved. With NMEA sources it's usually better to use the noselect
> option or don't configure it at all.

With PPS and NMEA sources, I found chrony bouncing between 
the two unless I marked the NMEA source as "noselect" (see 
thread from August 2012).


It's still on my todo list to get more debugging information 
on this, as Bill indicated that it may have been a bug 
(21/08/2012 22:58). Possibly the same thing is happening here?


Tom




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Miroslav Lichvar
On Thu, Nov 28, 2013 at 11:11:18AM -0800, Bill Unruh wrote:
> On Thu, 28 Nov 2013, Miroslav Lichvar wrote:
> >That looks similar to what I see with a Garmin 18x LVC. This is a
> >capture 30 hours long I did some time ago (the NMEA source's offset
> >value was set to 0.5):
> >
> >http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
> 
> Is this the nmea time or the PPS time? And is the vertical axis seconds or
> milliseconds?

That's the NMEA time (as provided by gpsd) when the clock was
synchronized to PPS. It's unfortunately in seconds. I think it was
with 115200 baud rate.

> The problem in his case is that the PPS signal is occasionally
> (but far too often) off by almost .3 sec. That is ridiculous. And it is only
> when the gps-nmea and the PPS are the only sources.

He said chronyd was switching between the PPS and GPS sources, so the
0.3s spike could be just the PPS-NMEA offset. The other graph with
chronyd using NTP sources doesn't seem to have this problem.

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Bill Unruh

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:
> On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
> > On Fri, 29 Nov 2013, Bill Unruh wrote:
> > By the way, does the kernel PPS do median filtering before passing on the
> > times to chrony? (Ie, taking the median of say the past 16 inputs and throwing
> > away the 6 worst outliers and then retaking the median?)
>
> The kernel doesn't filter the PPS samples in any way. In chronyd the
> PPS driver fetches the latest PPS sample from the kernel once per
> second and the refclock poll (16 seconds by default) runs the median
> filter.

Ah. OK.

> > Anyway, it should not be switching sources unless the deviation of the
> > selected source exceeds the variance of the alternative (or unless the source
> > has disappeared for a suitable number of poll intervals, probably related to
> > how long one would expect to wait for the drift rate variance to make the
> > system clock deviate by more than the second source's variance. Ie, you are
> > far better off letting a clock drift unconstrained for a while than to jump to
> > source which has a huge (factors of a 1000) worse variance.
>
> The selection algorithm prefers sources with shortest distance (with
> refclock that's the measured dispersion + configured delay). If there
> are more sources with similar distance they will be combined together.
>
> If a source disappears for 8 polling intervals, chronyd will select
> another source even if it's much worse. I agree that could be
> improved. With NMEA sources it's usually better to use the noselect
> option or don't configure it at all.


It looks in the source code as if it grabs a new source as soon as the source
disappears, but that was really not a very good look I had at the code.

If one only had nmea and pps, one needs the nmea at least at start up to get
the time to within a half second or so, but thereafter of course it probably
should not be used unless the PPS disappears for quite a while (in which case
the nmea is liable to be not very good either).

Certainly it would be good to find out what was happening with his clock
hopping.










Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Bill Unruh

On Fri, 29 Nov 2013, Bill Unruh wrote:
> On Fri, 29 Nov 2013, Miroslav Lichvar wrote:
> > > The problem in his case is that the PPS signal is occasionally
> > > (but far too often) off by almost .3 sec. That is ridiculous. And it is
> > > only when the gps-nmea and the PPS are the only sources.
> >
> > He said chronyd was switching between the PPS and GPS sources, so the
> > 0.3s spike could be just the PPS-NMEA offset. The other graph with
> > chronyd using NTP sources doesn't seem to have this problem.
>
> Hm, I guess that would do it. But why would it be switching like that? If it
> is doing so, then there is a problem with the chrony selection algorithm. Your
> solution of having gpsd handle it all is a possible one, but chrony itself
> should not be behaving that way. The nmea has a huge variance, while the PPS
> variance should be tiny, and it should be being selected. Or is the PPS
> exceeding its variance occasionally and chrony, thinking it has gone rogue,
> selects the nmea? By this time I do not remember the selection algorithm
> sufficiently well to be able to say.


By the way, does the kernel PPS do median filtering before passing on the
times to chrony? (Ie, taking the median of say the past 16 inputs and throwing
away the 6 worst outliers and then retaking the median?)

Anyway, it should not be switching sources unless the deviation of the
selected source exceeds the variance of the alternative (or unless the source
has disappeared for a suitable number of poll intervals, probably related to
how long one would expect to wait for the drift rate variance to make the
system clock deviate by more than the second source's variance). Ie, you are
far better off letting a clock drift unconstrained for a while than jumping to
a source whose variance is hugely (by factors of 1000) worse.








--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Bill Unruh

On Fri, 29 Nov 2013, Miroslav Lichvar wrote:
> > The problem in his case is that the PPS signal is occasionally
> > (but far too often) off by almost .3 sec. That is ridiculous. And it is only
> > when the gps-nmea and the PPS are the only sources.
>
> He said chronyd was switching between the PPS and GPS sources, so the
> 0.3s spike could be just the PPS-NMEA offset. The other graph with
> chronyd using NTP sources doesn't seem to have this problem.

Hm, I guess that would do it. But why would it be switching like that? If it
is doing so, then there is a problem with the chrony selection algorithm. Your
solution of having gpsd handle it all is a possible one, but chrony itself
should not be behaving that way. The nmea has a huge variance, while the PPS
variance should be tiny, and it should be being selected. Or is the PPS
exceeding its variance occasionally and chrony, thinking it has gone rogue,
selects the nmea? By this time I do not remember the selection algorithm
sufficiently well to be able to say.





Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Tomalak Geret'kal

On 28/11/2013 20:54, Bill Unruh wrote:
> And on further thought, I also concede your point, since pps does not really
> give the fractions of a second either, but just gives the second mark. You do
> need an additional "clock" to actually tell you the fractions of a second.

Yes...

> Anyway, I hope I clarified what I meant.

... and yes. :)

Tom




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Tomalak Geret'kal

On 28/11/2013 20:05, Bill Unruh wrote:
> On Thu, 28 Nov 2013, Tomalak Geret'kal wrote:
> > On 28/11/2013 19:11, Bill Unruh wrote:
> > > Is this the nmea time or the PPS time?
> >
> > What is "PPS time"? PPS provides timing, not time.
>
> In my nomenclature, they are the same. PPS does supply time but just the
> fractional seconds part of it. (Just as ntp supplies time but only the
> fractional "centuries" part of it-- You probably would not argue that ntp
> does not supply time, just the timing.)

It probably comes from the traditional notion of time as something useful to
humans, i.e. something down to minutes or seconds at least but up to hours or
days/months/years.

In the commercial world of timing sync (e.g. telecoms networks) we say timing
vs time to differentiate the two, and do not allow sub-second timing to count
as any indication of absolute "time", but I concede your point.


Tom




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Tomalak Geret'kal

On 28/11/2013 19:11, Bill Unruh wrote:
> Is this the nmea time or the PPS time?

What is "PPS time"? PPS provides timing, not time.

Tom




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Miroslav Lichvar
On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
> I ran the GPS while connected to a handful of ntp servers and saw that my gps 
> offset (originally 0.180) was too low, so I bumped it up to 0.530 for the 
> next two tests.  I've attached plots of the offset as recorded in the 
> statistics.log file, if there are other metrics that would be useful I'm 
> happy to graph them and send them out.
> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked 
> to anything, but is selectable)
> gps.png is after the ntp test but back to just using the GPS and PPS, it 
> looks like sometimes GPS gets selected as the source forcing the PPS signal 
> to look like it is drifting relative to the system.

That looks similar to what I see with a Garmin 18x LVC. This is a
capture 30 hours long I did some time ago (the NMEA source's offset
value was set to 0.5):

http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

Since gpsd has added support for kernel PPS, I think it's better to
use the SHM 1 or SOCK source instead of PPS. Let it handle the HW
details and pair the PPS and NMEA samples.

> I think a portion of my original confusion was that the chronyc sources 
> command was indicating that the pulse had never been seen, as opposed to it 
> being seen and ignored.  I need to compare the GPS logs with the chrony logs 
> to see if the changing offset is a function of the number of satellites in 
> view, otherwise I don't have a great explanation for the wander seen in the 
> ntp plot.

From what I remember from other discussions about NMEA timing, it
mainly depends on how the firmware is implemented and the number of
visible satellites may have nothing to do with it.

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Bill Unruh

And on further thought, I also concede your point, since pps does not really
give the fractions of a second either, but just gives the second mark. You do
need an additional "clock" to actually tell you the fractions of a second.

Anyway, I hope I clarified what I meant.


On Thu, 28 Nov 2013, Tomalak Geret'kal wrote:
> On 28/11/2013 20:05, Bill Unruh wrote:
> > On Thu, 28 Nov 2013, Tomalak Geret'kal wrote:
> > > On 28/11/2013 19:11, Bill Unruh wrote:
> > > > Is this the nmea time or the PPS time?
> > >
> > > What is "PPS time"? PPS provides timing, not time.
> >
> > In my nomenclature, they are the same. PPS does supply time but just the
> > fractional seconds part of it. (Just as ntp supplies time but only the
> > fractional "centuries" part of it-- You probably would not argue that ntp
> > does not supply time, just the timing.)
>
> It probably comes from the traditional notion of time as something useful to
> humans, i.e. something down to minutes or seconds at least but up to hours or
> days/months/years.
>
> In the commercial world of timing sync (e.g. telecoms networks) we say timing
> vs time to differentiate the two, and do not allow sub-second timing to count
> as any indication of absolute "time", but I concede your point.
>
> Tom




--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Bill Unruh

On Thu, 28 Nov 2013, Tomalak Geret'kal wrote:
> On 28/11/2013 19:11, Bill Unruh wrote:
> > Is this the nmea time or the PPS time?
>
> What is "PPS time"? PPS provides timing, not time.

In my nomenclature, they are the same. PPS does supply time but just the
fractional seconds part of it. (Just as ntp supplies time but only the
fractional "centuries" part of it-- You probably would not argue that ntp
does not supply time, just the timing.)

> Tom




--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/




Re: [chrony-users] kernel PPS troubleshooting

2013-11-28 Thread Bill Unruh

On Thu, 28 Nov 2013, Miroslav Lichvar wrote:
> On Wed, Nov 27, 2013 at 04:06:58PM -0500, Battocchi, Scott L. wrote:
> > I ran the GPS while connected to a handful of ntp servers and saw that my gps
> > offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next
> > two tests.  I've attached plots of the offset as recorded in the statistics.log
> > file, if there are other metrics that would be useful I'm happy to graph them
> > and send them out.
> > ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked
> > to anything, but is selectable)
> > gps.png is after the ntp test but back to just using the GPS and PPS, it looks
> > like sometimes GPS gets selected as the source forcing the PPS signal to look
> > like it is drifting relative to the system.
>
> That looks similar to what I see with a Garmin 18x LVC. This is a
> capture 30 hours long I did some time ago (the NMEA source's offset
> value was set to 0.5):
>
> http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

Is this the nmea time or the PPS time? And is the vertical axis seconds or
milliseconds? The problem in his case is that the PPS signal is occasionally
(but far too often) off by almost .3 sec. That is ridiculous. And it is only
when the gps-nmea and the PPS are the only sources.

I see nothing like that with my Sure gps with PPS driving chrony. On the other
hand I use a "self rolled" pps interrupt driver on the parallel port, not the
Linux supplied serial port driver. But the graph where he runs the PPS together
with the external ntp sources shows no sign of that kind of absurd jumps in the
PPS time, so that would suggest that the interrupt handler is OK.

> Since gpsd has added support for kernel PPS, I think it's better to
> use the SHM 1 or SOCK source instead of PPS. Let it handle the HW
> details and pair the PPS and NMEA samples.
>
> > I think a portion of my original confusion was that the chronyc sources command
> > was indicating that the pulse had never been seen, as opposed to it being seen
> > and ignored.  I need to compare the GPS logs with the chrony logs to see if the
> > changing offset is a function of the number of satellites in view, otherwise I
> > don't have a great explanation for the wander seen in the ntp plot.
>
> From what I remember from other discussions about NMEA timing, it
> mainly depends on how the firmware is implemented and the number of
> visible satellites may have nothing to do with it.




--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/




RE: [chrony-users] kernel PPS troubleshooting

2013-11-27 Thread Battocchi, Scott L.
The reason for the additional GPS strings is this system will actually be 
moving around and we also need to get the position and fix quality information 
through gpsd.

I ran the GPS while connected to a handful of ntp servers and saw that my gps 
offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next 
two tests.  I've attached plots of the offset as recorded in the statistics.log 
file, if there are other metrics that would be useful I'm happy to graph them 
and send them out.
ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked 
to anything, but is selectable)
gps.png is after the ntp test but back to just using the GPS and PPS, it looks 
like sometimes GPS gets selected as the source forcing the PPS signal to look 
like it is drifting relative to the system.

I think a portion of my original confusion was that the chronyc sources command 
was indicating that the pulse had never been seen, as opposed to it being seen 
and ignored.  I need to compare the GPS logs with the chrony logs to see if the 
changing offset is a function of the number of satellites in view, otherwise I 
don't have a great explanation for the wander seen in the ntp plot.

Thanks,
Scott

-----Original Message-----
From: Miroslav Lichvar [mailto:mlich...@redhat.com] 
Sent: Wednesday, November 27, 2013 1:44 AM
To: chrony-users@chrony.tuxfamily.org
Subject: Re: [chrony-users] kernel PPS troubleshooting

On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote:
> Bill,
> Thanks for taking an initial look.  I've added my system to our network to 
> compare our GPS time with the general NTP pool and it looks like our GPS 
> could be right on the edge of that 0.4s window.  I'm going to let it run for 
> a bit like this and report back after trying a larger offset for our SHM 
> refclock.  The receiver I am using is an MTK3339 if anyone else has a 
> standard offset they use (default speed and strings (9600 8n1 with GGA GSA 
> RMC VTG and VSG enabled).

Yes, from the log it looks like the SHM and PPS sources are too far from each 
other (large offdiff value). Also, the GPS source might be too jittery to be 
used reliably as the locking reference for PPS.

One way to find out which one is wrong is to add a good NTP source as the 
reference, add the noselect option to the GPS and PPS sources (without any 
locking) and observe the offset values in the refclocks log or chronyc 
sourcestats output.

--
Miroslav Lichvar




RE: [chrony-users] kernel PPS troubleshooting

2013-11-27 Thread Bill Unruh

On Wed, 27 Nov 2013, Battocchi, Scott L. wrote:
> The reason for the additional GPS strings is this system will actually be
> moving around and we also need to get the position and fix quality information
> through gpsd.
>
> I ran the GPS while connected to a handful of ntp servers and saw that my gps
> offset (originally 0.180) was too low, so I bumped it up to 0.530 for the next
> two tests.  I've attached plots of the offset as recorded in the statistics.log
> file, if there are other metrics that would be useful I'm happy to graph them
> and send them out.
> ntp.png is with 5 pool servers and the GPS set to noselect (PPS is not locked
> to anything, but is selectable)


Hard to read. The vertical axis is what? seconds?
The tick marks all lie on top of each other, so it is hard to figure out what
is going on. It looks like using the remote ntp sources disciplines the clock
to within better than 10ms (it should be good to about 50 micro, not milli,
seconds unless you have a really bad network.)

I agree on the second graph, the behaviour of the pps is bizarre. It is really
not clear why the pps should suddenly have .3 sec offset. The time on the
computer should coast far far better than that. In fact, the gps should only
really be necessary in order to get the system time to within a few hundred
ms, and after that the PPS should be able to discipline the clock all on its
own (in 16 sec the system clock should not get to more than a ms away from the
true time even free running). For pps to suddenly indicate .3 s offset would
imply that your clock drifted at about 2 percent (.3 s in 16 s), which is
absurd. So either there is something really seriously wrong with your system
clock (eg some other program is coming in and altering the clock behind
chrony's back) or there is a severe bug in chrony (but I run chrony with a
PPS, albeit with my own driver, and I see offsets of 10 micro seconds, not 300
milliseconds) or with the PPS driver (but then why is the first graph where
the PPS does not discipline the clock showing none of those absurd jumps?)
Note that I also find it weird that your gps time fluctuates by almost 1
second peak to peak. It really really should be much better than that.
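
As a back-of-envelope check of the numbers in this argument, the short Python sketch below converts between an offset change over an interval and the frequency error needed to produce it. The function names are made up for this illustration, and the 16 s interval is simply the poll interval mentioned above.

```python
def implied_drift_ppm(offset_change_s, interval_s):
    """Frequency error (in ppm) needed to accumulate offset_change_s
    seconds of error over interval_s seconds."""
    return offset_change_s / interval_s * 1e6


def free_run_error_s(drift_ppm, interval_s):
    """Error (in seconds) accumulated by a clock that is off by drift_ppm
    and left free running for interval_s seconds."""
    return drift_ppm * 1e-6 * interval_s
```

A jump of 0.3 s within one 16 s poll works out to implied_drift_ppm(0.3, 16), about 18750 ppm or roughly 2 percent, far beyond what a crystal oscillator does. Going the other way, staying within 1 ms over 16 s only requires the free-running clock to be good to about 62.5 ppm.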

What do the refclock, measurement and statistics logs show for those times
when the PPS offset jumps so much?



> gps.png is after the ntp test but back to just using the GPS and PPS, it looks
> like sometimes GPS gets selected as the source forcing the PPS signal to look
> like it is drifting relative to the system.
>
> I think a portion of my original confusion was that the chronyc sources command
> was indicating that the pulse had never been seen, as opposed to it being seen
> and ignored.  I need to compare the GPS logs with the chrony logs to see if the
> changing offset is a function of the number of satellites in view, otherwise I
> don't have a great explanation for the wander seen in the ntp plot.
>
> Thanks,
> Scott
>
> -----Original Message-----
> From: Miroslav Lichvar [mailto:mlich...@redhat.com]
> Sent: Wednesday, November 27, 2013 1:44 AM
> To: chrony-users@chrony.tuxfamily.org
> Subject: Re: [chrony-users] kernel PPS troubleshooting
>
> On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote:
> > Bill,
> > Thanks for taking an initial look.  I've added my system to our network to
> > compare our GPS time with the general NTP pool and it looks like our GPS could
> > be right on the edge of that 0.4s window.  I'm going to let it run for a bit
> > like this and report back after trying a larger offset for our SHM refclock.
> > The receiver I am using is an MTK3339 if anyone else has a standard offset they
> > use (default speed and strings (9600 8n1 with GGA GSA RMC VTG and VSG enabled).
>
> Yes, from the log it looks like the SHM and PPS sources are too far from each
> other (large offdiff value). Also, the GPS source might be too jittery to be
> used reliably as the locking reference for PPS.
>
> One way to find out which one is wrong is to add a good NTP source as the
> reference, add the noselect option to the GPS and PPS sources (without any
> locking) and observe the offset values in the refclocks log or chronyc
> sourcestats output.
>
> --
> Miroslav Lichvar





--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/




Re: [chrony-users] kernel PPS troubleshooting

2013-11-27 Thread Miroslav Lichvar
On Tue, Nov 26, 2013 at 08:49:19PM -0500, Battocchi, Scott L. wrote:
> Bill,
> Thanks for taking an initial look.  I've added my system to our network to 
> compare our GPS time with the general NTP pool and it looks like our GPS 
> could be right on the edge of that 0.4s window.  I'm going to let it run for 
> a bit like this and report back after trying a larger offset for our SHM 
> refclock.  The receiver I am using is an MTK3339 if anyone else has a 
> standard offset they use (default speed and strings (9600 8n1 with GGA GSA 
> RMC VTG and VSG enabled).

Yes, from the log it looks like the SHM and PPS sources are too far
from each other (large offdiff value). Also, the GPS source might be
too jittery to be used reliably as the locking reference for PPS.

One way to find out which one is wrong is to add a good NTP source as
the reference, add the noselect option to the GPS and PPS sources
(without any locking) and observe the offset values in the refclocks
log or chronyc sourcestats output.
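
For anyone wanting to do that comparison offline, a small Python sketch for summarizing the refclocks log per source follows. It assumes whitespace-separated lines in the layout of the excerpts posted in this thread, with nine columns read as date, time, refid, three status columns, raw offset, cooked offset and dispersion; that reading of the columns is an assumption, as are the function names. Lines whose offset column holds "-" (the filtered-result lines) and any header lines are skipped.

```python
from collections import defaultdict
from statistics import mean, pstdev


def parse_sample(line):
    """Return (refid, raw, cooked, disp) for a per-sample line, else None."""
    parts = line.split()
    if len(parts) != 9:
        return None
    refid, raw, cooked, disp = parts[2], parts[6], parts[7], parts[8]
    try:
        return refid, float(raw), float(cooked), float(disp)
    except ValueError:
        return None  # "-" placeholders or header text


def summarize(lines):
    """Map each refid to (sample count, mean raw offset, std deviation)."""
    offsets = defaultdict(list)
    for line in lines:
        sample = parse_sample(line)
        if sample:
            offsets[sample[0]].append(sample[1])
    return {
        refid: (len(vals), mean(vals), pstdev(vals))
        for refid, vals in offsets.items()
    }
```

Run against a log captured while a good NTP source is the reference, the mean gives a candidate value for the refclock offset and the standard deviation shows how jittery each source is.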

-- 
Miroslav Lichvar




RE: [chrony-users] kernel PPS troubleshooting

2013-11-26 Thread Bill Unruh

On Tue, 26 Nov 2013, Battocchi, Scott L. wrote:

> Bill,
> Thanks for taking an initial look.  I've added my system to our network to 
> compare our GPS time with the general NTP pool and it looks like our GPS could 
> be right on the edge of that 0.4s window.  I'm going to let it run for a bit 
> like this and report back after trying a larger offset for our SHM refclock.  
> The receiver I am using is an MTK3339, if anyone else has a standard offset 
> they use (default speed and strings: 9600 8n1 with GGA, GSA, RMC, VTG and GSV 
> enabled).

Why do you have all of those sentences? You only need one for timing.

> Thanks,
> Scott


RE: [chrony-users] kernel PPS troubleshooting

2013-11-26 Thread Battocchi, Scott L.
Bill,
Thanks for taking an initial look.  I've added my system to our network to 
compare our GPS time with the general NTP pool, and it looks like our GPS could 
be right on the edge of that 0.4s window.  I'm going to let it run for a bit 
like this and report back after trying a larger offset for our SHM refclock.  
The receiver I am using is an MTK3339, if anyone else has a standard offset they 
use (default speed and strings: 9600 8n1 with GGA, GSA, RMC, VTG and GSV enabled).
Thanks,
Scott


RE: [chrony-users] kernel PPS troubleshooting

2013-11-26 Thread Battocchi, Scott L.
Miroslav,
Thanks for the modified source.  I've recompiled it with --enable-trace and do 
indeed get a lot more information.
I modified the original chrony.conf to drop the external GPS (GPSe/PPSe) since 
they were generating a lot of "sample ignored" trace messages (no valid fix and 
no updating PPS), so the reports below are with only the GPSi/PPSi sources 
active in the configuration.

I've tried to copy key portions of the run below to avoid attaching the 5MB 
trace log; I'm open to other methods of sharing the whole log if there is 
interest.  I have attached the tracking.log since it shows when PPS was 
available compared to the long periods where GPS was active (and the PPS was 
coming into /dev/pps1).  It looks like the pulse is ignored when offdiff is 
relatively large (>0.2?), but the offdiff steps very quickly between valid 
and invalid.  It is also possible I'm interpreting the pulse handling 
completely incorrectly.
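
A rough way to check the offdiff hunch is to pull the offdiff values out of the trace and compare the ignored pulses with the accepted ones. The sketch below is Python, not anything from chrony itself: the function name summarize_offdiff is made up for illustration, the regular expressions are guessed from the message layout in this trace, and the sample lines are copied from the excerpt in this message.

```python
import re

# Patterns guessed from the trace excerpt in this message; they are not
# taken from the chrony source and may need adjusting for other builds.
IGNORED = re.compile(r"refclock pulse ignored offdiff=(-?\d+\.\d+)")
ACCEPTED = re.compile(r"refclock pulse second=\S+ offset=\S+ offdiff=(-?\d+\.\d+)")


def summarize_offdiff(trace_text):
    """Return (ignored, accepted) lists of offdiff values found in a trace."""
    # Mail clients wrap long trace lines, so collapse whitespace first.
    flat = re.sub(r"\s+", " ", trace_text)
    ignored = [float(m) for m in IGNORED.findall(flat)]
    accepted = [float(m) for m in ACCEPTED.findall(flat)]
    return ignored, accepted


# A few lines copied from the trace, still wrapped as they were mailed.
SAMPLE = """
refclock.c:440:(RCL_AddPulse)[26-18:00:29] refclock pulse ignored
offdiff=-0.459021095 refdisp=0.03000 disp=0.02001
refclock.c:440:(RCL_AddPulse)[26-18:00:30] refclock pulse ignored
offdiff=-0.440026443 refdisp=0.03000 disp=0.02001
refclock.c:440:(RCL_AddPulse)[26-18:00:32] refclock pulse ignored
offdiff=0.499073485 refdisp=0.03000 disp=0.02001
refclock.c:447:(RCL_AddPulse)[26-18:00:34] refclock pulse second=0.479480414
offset=1.520519586 offdiff=-0.083162521 samplediff=0.776838000
refclock.c:447:(RCL_AddPulse)[26-18:00:37] refclock pulse second=0.478551879
offset=1.521448121 offdiff=-0.090223050 samplediff=0.769777000
"""

if __name__ == "__main__":
    ignored, accepted = summarize_offdiff(SAMPLE)
    print("smallest ignored |offdiff|:", min(abs(v) for v in ignored))
    print("largest accepted |offdiff|:", max(abs(v) for v in accepted))
```

On these few lines the ignored pulses all have |offdiff| above 0.44 while the accepted ones are below 0.1, which is at least consistent with a cutoff somewhere in between. Running the same function over the full trace log would show how sharp that boundary really is.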

After starting the modified chrony, all appears well for a while, but within a 
couple of minutes most of the PPS pulses are ignored.  PPS goes in and out of 
being ignored for the next ~5 minutes before disappearing for another 90 
minutes.  After that brief recovery, it is ignored for the rest of the run:
:~/chronytrace# ./chronyd -d
main.c:355:(main)[26-18:00:28] chronyd version DEVELOPMENT starting
sys_linux.c:1022:(get_version_specific_details)[26-18:00:28] Linux kernel major=3 minor=3 patch=0
sys_linux.c:1080:(get_version_specific_details)[26-18:00:28] hz=100 shift_hz=7 freq_scale=1. nominal_tick=1 slew_delta_tick=833 max_tick_bias=1000 shift_pll=2
local.c:565:(lcl_RegisterSystemDrivers)[26-18:00:28] Local freq=297.043ppm
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock PPS added poll=4 dpoll=0 filter=16
refclock.c:253:(RCL_AddRefclock)[26-18:00:28] refclock SHM added poll=4 dpoll=0 filter=16
reference.c:194:(REF_Initialise)[26-18:00:28] Initial frequency 297.043 ppm
sources.c:331:(SRC_SetSelectable)[26-18:00:28] PPSi
sources.c:331:(SRC_SetSelectable)[26-18:00:28] GPSi
refclock.c:416:(RCL_AddPulse)[26-18:00:28] refclock pulse ignored no ref sample
refclock.c:687:(filter_add_sample)[26-18:00:28] filter sample 0 t=Tue 11/26/13 18:00:28.080062 offset=1.059937008 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:29] refclock pulse ignored offdiff=-0.459021095 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:29] filter sample 1 t=Tue 11/26/13 18:00:29.060753 offset=1.079246196 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:30] refclock pulse ignored offdiff=-0.440026443 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:30] filter sample 2 t=Tue 11/26/13 18:00:30.076668 offset=1.063331638 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:31] refclock pulse ignored offdiff=-0.456248500 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:31] filter sample 3 t=Tue 11/26/13 18:00:31.121044 offset=1.018955039 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:32] refclock pulse ignored offdiff=0.499073485 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:32] filter sample 4 t=Tue 11/26/13 18:00:32.121227 offset=1.018772016 dispersion=0.03000
refclock.c:440:(RCL_AddPulse)[26-18:00:33] refclock pulse ignored offdiff=0.498576090 refdisp=0.03000 disp=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:33] filter sample 5 t=Tue 11/26/13 18:00:32.702642 offset=1.437357065 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:34] refclock pulse second=0.479480414 offset=1.520519586 offdiff=-0.083162521 samplediff=0.776838000
refclock.c:687:(filter_add_sample)[26-18:00:34] filter sample 0 t=Tue 11/26/13 18:00:33.479480 offset=1.520519586 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:34] filter sample 6 t=Tue 11/26/13 18:00:33.704596 offset=1.435403234 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:35] refclock pulse second=0.479162252 offset=1.520837748 offdiff=-0.085434514 samplediff=0.774566000
refclock.c:687:(filter_add_sample)[26-18:00:35] filter sample 1 t=Tue 11/26/13 18:00:34.479162 offset=1.520837748 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:35] filter sample 7 t=Tue 11/26/13 18:00:34.692316 offset=1.447683341 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:36] refclock pulse second=0.478844465 offset=1.521155535 offdiff=-0.073472194 samplediff=0.786528000
refclock.c:687:(filter_add_sample)[26-18:00:36] filter sample 2 t=Tue 11/26/13 18:00:35.478844 offset=1.521155535 dispersion=0.02001
refclock.c:687:(filter_add_sample)[26-18:00:36] filter sample 8 t=Tue 11/26/13 18:00:35.708774 offset=1.431225071 dispersion=0.03000
refclock.c:447:(RCL_AddPulse)[26-18:00:37] refclock pulse second=0.478551879 offset=1.521448121 offdiff=-0.090223050 samplediff=0.769777000
refclock.c:687:(fil

RE: [chrony-users] kernel PPS troubleshooting

2013-11-26 Thread Bill Unruh

The pps can only give you when the second turnover occurs; it cannot tell you
which second that is. That MUST be given by some other time source, which
could be the nmea sentences from the gps or some other source. The problem
with the nmea is that it is usually late, by something like .5 to 1 sec.
But chrony must be confident that the system time is within less than .5 sec
of the real time before it will trust the PPS source. Now, it looks to me on
a very quick look that this is not happening for some reason, and so that pps
data is being rejected.

I have not looked at Miroslav's code to figure out exactly what is being
reported, so am not at all confident I am reading it properly.
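
To make the "which second" point concrete, here is a small Python sketch of how a pulse might be lined up against a coarse reference (nmea/SHM) offset. It is not chrony's code: align_pulse and its arguments are names invented here, and rounding to the nearest whole second is an assumption, although it does reproduce the offset and offdiff figures on the one trace line from Scott's log that is used below.

```python
import math


def align_pulse(second, ref_offset):
    """Line a PPS pulse up with a reference offset (illustrative sketch).

    second     -- fractional part of the system time at which the pulse arrived
    ref_offset -- offset in seconds reported by the coarse source, e.g. nmea
    Returns (offset, offdiff), where offset is the pulse offset shifted by a
    whole number of seconds and offdiff = ref_offset - offset.
    """
    # The pulse marks a second boundary, so on its own it only gives the
    # offset modulo one second.
    raw = -second
    # Assumption: choose the whole-second shift that brings the pulse offset
    # closest to the reference offset.
    shift = math.floor(ref_offset - raw + 0.5)
    offset = raw + shift
    return offset, ref_offset - offset


if __name__ == "__main__":
    # Values taken from the trace: the SHM "filter sample 5" offset and the
    # "refclock pulse second=..." line logged right after it.
    offset, offdiff = align_pulse(0.479480414, 1.437357065)
    print(f"offset={offset:.9f} offdiff={offdiff:.9f}")

    # With a reference offset taken from one of the ignored samples, the
    # pulse lands about half a second from the reference, so the choice of
    # whole second is ambiguous.
    _, offdiff = align_pulse(0.479480414, 1.018772016)
    print(f"offdiff={offdiff:.9f}")
```

The closer offdiff gets to half a second, the less the nmea can say about which second the pulse belongs to, which is why the coarse source has to be well inside that window before the pulse is usable.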








Re: [chrony-users] kernel PPS troubleshooting

2013-11-26 Thread Miroslav Lichvar
On Mon, Nov 25, 2013 at 06:50:01PM -0500, Battocchi, Scott L. wrote:
> I've recently cross-compiled chrony-0308330 to run on our armv5 platform and 
> it seems to silently/selectively ignore our PPS source even when it is 
> present.  Currently all testing is being done with our cheap receiver 
> (GPSi/PPSi below).   After ~hours I get a handful of entries into the 
> refclocks.log for the PPSi source, but no mention on the console that the 
> source is or is not present.  Right this instant we are getting updates to 
> /sys/class/pps/pps1/assert every second but chronyc sources shows the LastRX 
> as 26 minutes ago.
> 
> Is there a way to enable more verbose debugging of the chrony source 
> selection/rejection process so that I can see why it is rejecting what look 
> to be good PPS updates?  I'm happy to provide more information, logs, or 
> compile options as necessary.

The configuration looks good. A similar setup works fine here
(although I've only one GPS).

There are a number of places where the PPS sample can be dropped. I've
added some new trace messages to help us see what's going on. Can you
please pull from git, run configure with --enable-trace, recompile and
see what refclock messages you get?

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-11-25 Thread Bill Unruh

On Mon, 25 Nov 2013, Battocchi, Scott L. wrote:

> Hi,
> I'm very new to chrony and am trying to do something that I believe to be 
> supported.  I am trying to sync our system to two local GPS receivers depending 
> on whether one or both of them have a fix, with a preference for the higher 
> priced receiver.  As we are on an embedded platform I'm using the PPS-GPIO 
> kernel module to get our PPS signal in.  I'm working off of the chrony 
> development branch for the PHC support because we will eventually want to 
> tie our PTP capable PHY into chrony and the rest of the system.
>
> I've recently cross-compiled chrony-0308330 to run on our armv5 platform and it 
> seems to silently/selectively ignore our PPS source even when it is present.  
> Currently all testing is being done with our cheap receiver (GPSi/PPSi below).  
> After ~hours I get a handful of entries into the refclocks.log for the PPSi 
> source, but no mention on the console that the source is or is not present.  
> Right this instant we are getting updates to /sys/class/pps/pps1/assert every 
> second but chronyc sources shows the LastRX as 26 minutes ago.
>
> Is there a way to enable more verbose debugging of the chrony source 
> selection/rejection process so that I can see why it is rejecting what look to 
> be good PPS updates?  I'm happy to provide more information, logs, or compile 
> options as necessary.
>
> Thanks in advance!
> Scott
>
> Our system has the following sources currently configured:
> /dev/ttyS0 is our high priced GPS
> /dev/eser2 is our cheap GPS
> /dev/pps0 is the pps signal from our expensive GPS
> /dev/pps1 is the pps signal from our cheap GPS

So are you getting stuff into /dev/pps{0,1}? I assume that the ttyS0 and eser2
are NMEA type data, not PPS data.
Have you looked at /var/log/chrony/refclocks.log? It should tell you if chrony is
seeing the inputs and rejecting them for some reason. Also look at
/proc/interrupts to see if the interrupts are coming in.
(Note that I do not know the PPS-GPIO module, so I do not know what it reports.)

> We are using gpsd (3.10) to read in the GPSs as follows:
> gpsd -bn /dev/ttyS0 /dev/eser2
>
> The PPS-GPIO module is configured to look for rising edges on the two gpios 
> and connect them to pps0 and pps1. This works, as ppstest captured consecutive 
> reads from /dev/pps1 while I was running the chrony testing.
>
> chrony.conf (commented out the socket interface to gpsd, a question for another 
> post):
> refclock PPS /dev/pps0 lock GPSe refid PPSe
> refclock PPS /dev/pps1 lock GPSi refid PPSi
> refclock SHM 0 offset 0.001 delay 0.0001 refid GPSe
> refclock SHM 2 offset 0.140 delay 0.01 refid GPSi
> #refclock SOCK /var/run/chrony.ttyS0.sock refid GPSe
> #refclock SOCK /var/run/chrony.eser2.sock offset 0.140 delay 0.01 refid GPSi
> logdir /var/log/chrony
> log measurements statistics tracking refclocks
>
> The following is the console output from chronyd -d:
> :~# ./chronyd -d
> main.c:355:(main)[25-22:50:52] chronyd version DEVELOPMENT starting
> sys_linux.c:1022:(get_version_specific_details)[25-22:50:53] Linux kernel 
> major=3 minor=3 patch=0
> sys_linux.c:1080:(get_version_specific_details)[25-22:50:53] hz=100 shift_hz=7 
> freq_scale=1. nominal_tick=1 slew_delta_tick=833 max_tick_bias=1000 
> shift_pll=2
> sources.c:913:(SRC_SelectSource)[25-22:51:56] Selected source GPSi
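
On the question of whether pulses are actually arriving: the assert file under /sys/class/pps/ normally holds one line of the form seconds.nanoseconds#sequence, so two reads a few seconds apart should show the sequence number advancing by about one per second. A minimal Python sketch of that check follows. parse_assert, pulses_between and read_assert are names made up here, the sample strings are invented rather than read from this system, and I am assuming the PPS-GPIO module exposes the usual sysfs layout.

```python
from decimal import Decimal


def parse_assert(text):
    """Split a PPS sysfs assert line 'seconds.nanoseconds#sequence'."""
    stamp, _, seq = text.strip().partition("#")
    return Decimal(stamp), int(seq)


def pulses_between(first, second):
    """Return (new_pulses, elapsed_seconds) between two assert readings."""
    t1, s1 = parse_assert(first)
    t2, s2 = parse_assert(second)
    return s2 - s1, t2 - t1


def read_assert(path="/sys/class/pps/pps1/assert"):
    """Read the current assert line; the default path is the one in this thread."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # Hypothetical readings, made up for illustration only.
    before = "1385419852.000123456#100\n"
    after = "1385419855.000120001#103\n"
    count, elapsed = pulses_between(before, after)
    print(f"{count} pulses in {elapsed} s")
```

On the target one would call read_assert() twice, a few seconds apart, and feed both strings to pulses_between(). If the count keeps pace with the elapsed time, the kernel is seeing the pulses and the problem is further up, in what chrony does with them.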





--
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/
