Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Miroslav Lichvar
On Fri, Nov 29, 2013 at 09:46:32AM -0800, Bill Unruh wrote:
> On Fri, 29 Nov 2013, Bill Unruh wrote:
> By the way, does the kernel PPS do median filtering before passing on the
> times to chrony? (I.e., taking the median of, say, the past 16 inputs, throwing
> away the 6 worst outliers, and then retaking the median?)

The kernel doesn't filter the PPS samples in any way. In chronyd, the
PPS driver fetches the latest PPS sample from the kernel once per
second, and the median filter runs at each refclock poll (every 16
seconds by default).
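
The filter described in the question (median of the last 16 samples, drop the
6 worst outliers, take the median again) can be sketched in a few lines of
Python. This only illustrates that description; it is not chronyd's actual
refclock filter. The function name, the `discard` parameter and the sample
values are made up for the example.

```python
from statistics import median


def median_filter(samples, discard=6):
    """Median of the samples, drop the `discard` worst outliers, median again."""
    if not samples:
        raise ValueError("no samples")
    if not 0 <= discard < len(samples):
        raise ValueError("discard must be >= 0 and smaller than the sample count")
    centre = median(samples)
    # Sort by distance from the first median and keep the closest samples.
    by_distance = sorted(samples, key=lambda s: abs(s - centre))
    kept = by_distance[: len(samples) - discard]
    return median(kept)


# One poll's worth of once-per-second offsets in seconds (made-up values),
# two of which are wild outliers of about 0.3 s.
offsets = [2e-6, -1e-6, 3e-6, 0.0, 1e-6, -2e-6, 0.3, 1e-6,
           -3e-6, 2e-6, 0.0, -1e-6, 0.29, 1e-6, 2e-6, -2e-6]
filtered = median_filter(offsets)
print(f"filtered offset: {filtered:.1e} s")
```

With a filter like this, a couple of 0.3 s outliers within one poll do not
reach the result at all, since they are the first samples to be discarded.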

> Anyway, it should not be switching sources unless the deviation of the
> selected source exceeds the variance of the alternative (or unless the source
> has disappeared for a suitable number of poll intervals, probably related to
> how long one would expect to wait for the drift rate variance to make the
> system clock deviate by more than the second source's variance). I.e., you are
> far better off letting a clock drift unconstrained for a while than jumping to
> a source which has a hugely (by a factor of 1000) worse variance.

The selection algorithm prefers the source with the shortest distance (for a
refclock that's the measured dispersion + configured delay). If several
sources have a similar distance, they will be combined.

If a source disappears for 8 polling intervals, chronyd will select
another source even if it's much worse. I agree that could be
improved. With NMEA sources it's usually better to use the noselect
option or not to configure them at all.
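
As a rough sketch of the behaviour described above (shortest distance wins,
similar distances are combined, a source missing for 8 polls is dropped), the
logic might look like the Python below. The `Source` class, the `select`
function and the `COMBINE_RATIO` threshold for "similar distance" are
assumptions made for illustration; chronyd's real selection code is more
involved.

```python
from dataclasses import dataclass

MAX_MISSED_POLLS = 8   # polls without a sample before a source is dropped
COMBINE_RATIO = 3.0    # hypothetical threshold for "similar distance"


@dataclass
class Source:
    name: str
    dispersion: float        # measured dispersion, seconds
    delay: float             # configured delay, seconds
    missed_polls: int = 0    # consecutive polls without a sample
    selectable: bool = True  # False models the noselect option

    @property
    def distance(self) -> float:
        # Distance as described in the message: dispersion + delay.
        return self.dispersion + self.delay


def select(sources):
    """Return the sources that would be selected and combined."""
    candidates = [s for s in sources
                  if s.selectable and s.missed_polls < MAX_MISSED_POLLS]
    if not candidates:
        return []  # nothing to select; the clock is left to free-run
    best = min(candidates, key=lambda s: s.distance)
    return [s for s in candidates
            if s.distance <= COMBINE_RATIO * best.distance]


# Made-up numbers: a tight PPS source and a noisy NMEA source.
pps = Source("PPS", dispersion=1e-6, delay=1e-9)
nmea = Source("NMEA", dispersion=0.02, delay=0.2)

normal = [s.name for s in select([pps, nmea])]        # PPS only
pps.missed_polls = 8
pps_lost = [s.name for s in select([pps, nmea])]      # falls back to NMEA
nmea.selectable = False
with_noselect = [s.name for s in select([pps, nmea])]  # nothing selected
```

The last case shows why noselect helps here: when the PPS source goes
missing, nothing is selected and the clock is left alone instead of being
steered by the much worse NMEA time.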

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Tomalak Geret'kal

On 29/11/2013 18:21, Miroslav Lichvar wrote:

> > Anyway, it should not be switching sources unless the deviation of the
> > selected source exceeds the variance of the alternative (or unless the source
> > has disappeared for a suitable number of poll intervals, probably related to
> > how long one would expect to wait for the drift rate variance to make the
> > system clock deviate by more than the second source's variance). I.e., you are
> > far better off letting a clock drift unconstrained for a while than jumping to
> > a source which has a hugely (by a factor of 1000) worse variance.
>
> The selection algorithm prefers the source with the shortest distance (for a
> refclock that's the measured dispersion + configured delay). If several
> sources have a similar distance, they will be combined.
>
> If a source disappears for 8 polling intervals, chronyd will select
> another source even if it's much worse. I agree that could be
> improved. With NMEA sources it's usually better to use the noselect
> option or not to configure them at all.



With PPS and NMEA sources, I found chrony bouncing between 
the two unless I marked the NMEA source as "noselect" (see 
thread from August 2012).


It's still on my todo list to get more debugging information 
on this, as Bill indicated that it may have been a bug 
(21/08/2012 22:58). Possibly the same thing is happening here?


Tom




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Miroslav Lichvar
On Thu, Nov 28, 2013 at 11:11:18AM -0800, Bill Unruh wrote:
> On Thu, 28 Nov 2013, Miroslav Lichvar wrote:
> >That looks similar to what I see with a Garmin 18x LVC. This is a
> >capture 30 hours long I did some time ago (the NMEA source's offset
> >value was set to 0.5):
> >
> >http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
> 
> Is this the nmea time or the PPS time? And is the vertical axis seconds or
> milliseconds?

That's the NMEA time (as provided by gpsd) when the clock was
synchronized to PPS. Unfortunately, the vertical axis is in seconds.
I think it was captured at a baud rate of 115200.

> The problem in his case is that the PPS signal is occasionally
> (but far too often) off by almost .3 sec. That is ridiculous. And it is only
> when the gps-nmea and the PPS are the only sources.

He said chronyd was switching between the PPS and GPS sources, so the
0.3s spike could be just the PPS-NMEA offset. The other graph with
chronyd using NTP sources doesn't seem to have this problem.

-- 
Miroslav Lichvar




Re: [chrony-users] kernel PPS troubleshooting

2013-11-29 Thread Bill Unruh

On Fri, 29 Nov 2013, Bill Unruh wrote:


On Fri, 29 Nov 2013, Miroslav Lichvar wrote:


>  The problem in his case is that the PPS signal is occasionally
>  (but far too often) off by almost .3 sec. That is ridiculous. And it is only
>  when the gps-nmea and the PPS are the only sources.

 He said chronyd was switching between the PPS and GPS sources, so the
 0.3s spike could be just the PPS-NMEA offset. The other graph with
 chronyd using NTP sources doesn't seem to have this problem.




Hm, I guess that would do it. But why would it be switching like that? If it
is doing so, then there is a problem with the chrony selection algorithm. Your
solution of having gpsd handle it all is a possible one, but chrony itself
should not be behaving that way. The nmea has a huge variance, while the PPS
variance should be tiny, so the PPS should be the one selected. Or is the PPS
exceeding its variance occasionally, and chrony, thinking it has gone rogue,
selects the nmea? By this time I do not remember the selection algorithm
sufficiently well to be able to say.


By the way, does the kernel PPS do median filtering before passing on the
times to chrony? (I.e., taking the median of, say, the past 16 inputs, throwing
away the 6 worst outliers, and then retaking the median?)

Anyway, it should not be switching sources unless the deviation of the
selected source exceeds the variance of the alternative (or unless the source
has disappeared for a suitable number of poll intervals, probably related to
how long one would expect to wait for the drift rate variance to make the
system clock deviate by more than the second source's variance). I.e., you are
far better off letting a clock drift unconstrained for a while than jumping to
a source which has a hugely (by a factor of 1000) worse variance.
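
To put a rough number on that holdover argument, assume the error of an
unsteered clock grows linearly with the uncertainty in its frequency. The
time to wait before falling back is then the alternative source's scatter
divided by that uncertainty. The function name and both input values below
are hypothetical, chosen only to show the scale involved.

```python
def holdover_seconds(alt_stddev, freq_error):
    """Time for a free-running clock to drift by alt_stddev seconds.

    alt_stddev: scatter of the alternative source, in seconds
    freq_error: uncertainty of the clock frequency, as a fraction
                (1e-6 means 1 ppm)
    Assumes the accumulated error is simply freq_error * elapsed time.
    """
    if alt_stddev < 0 or freq_error <= 0:
        raise ValueError("alt_stddev must be >= 0 and freq_error > 0")
    return alt_stddev / freq_error


# Hypothetical values: 0.1 s of NMEA scatter, 1 ppm frequency uncertainty.
t = holdover_seconds(alt_stddev=0.1, freq_error=1e-6)

# For comparison: 8 missed polls at the default 16 s refclock poll.
poll_timeout = 8 * 16

print(f"holdover before NMEA is no worse: about {t:.0f} s")
print(f"timeout after 8 polls of 16 s:    {poll_timeout} s")
```

Under these assumptions the clock could free-run for on the order of a day
before the NMEA time is no worse, while 8 polls of 16 seconds is only about
two minutes.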








--
William G. Unruh   |  Canadian Institute for |  Tel: +1(604)822-3273
Physics            |  Advanced Research      |  Fax: +1(604)822-5324
UBC, Vancouver, BC |  Program in Cosmology   |  un...@physics.ubc.ca
Canada V6T 1Z1     |  and Gravity            |  www.theory.physics.ubc.ca/
