Re: [ntp:questions] How do I prevent sudden system time jumps.

2011-07-14 Thread Miroslav Lichvar
On Wed, Jul 13, 2011 at 04:49:37PM -0500, Hal Murray wrote:
> >The problem is that the adjustment takes too large steps, not that it
> >takes too long a time.
> 
> ntpd will slew the clock at 500 PPM.  You may be willing to wait a while
> for a second or two, but it takes a long time if you have to adjust
> by several minutes or an hour.  That may be OK for your setup, but you
> should think about it.
> 
> If you are using the -x flag, be sure to check out the -g flag that will
> let it do one long jump at startup time.

The -g option only allows the initial offset correction to be larger
than 1000 seconds; it doesn't affect the step value.

When both -x and -g are used and the initial offset is just below 600
seconds (-x is an alias for "tinker step 600"), it will still take about
two weeks to correct the offset.
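The "about two weeks" figure follows directly from the 500 PPM slew rate mentioned in the quoted text. A minimal sketch of the arithmetic, assuming the clock is slewed at the full rate for the whole correction (the function name is made up for illustration):

```python
def slew_duration_seconds(offset_s, slew_rate_ppm=500.0):
    """Time needed to remove offset_s when slewing at slew_rate_ppm."""
    return abs(offset_s) / (slew_rate_ppm * 1e-6)


# Offsets in seconds; 600 s is the step threshold implied by -x.
for offset in (1.0, 120.0, 600.0):
    days = slew_duration_seconds(offset) / 86400
    print(f"{offset:6.0f} s offset -> {days:6.2f} days of slewing")
```

An offset of 600 s at 500 PPM needs 1,200,000 s, which is a little under 14 days.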

An interesting scenario is when tinker step is larger than panic and
the initial offset is between the two. With -g the first clock update
is allowed, but the clock is not stepped, so the following updates will
still be over the panic limit and ntpd will abort.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] How do I prevent sudden system time jumps.

2011-07-14 Thread Miroslav Lichvar
On Wed, Jul 13, 2011 at 10:51:46PM +0100, David Woolley wrote:
> Hal Murray wrote:
> >>OK.  I asked since a timewarp of 200ms is a bit surprising for real HW,
> >>but is something to be expected if you were running in a VM.
> >
> >It's easy to get a time-warp of 200 ms on a DSL link.  Just download
> >a huge file, say a CD.  The queuing delay on the input to the DSL
> >link turns into asymmetrical delays.  I've seen delays up to 3.5 seconds.
> >
> 
> The huff and puff tinker option can help mitigate this.

Please note that while the huffpuff option works very well with large
temporary asymmetric delays, it makes things worse with normal
delays as the offset will contain network jitter.

-- 
Miroslav Lichvar


Re: [ntp:questions] [ntp:hackers] ntpdate removal is coming

2011-07-18 Thread Miroslav Lichvar
On Sun, Jul 17, 2011 at 12:57:06PM -0700, Harlan Stenn wrote:
> Just to be clear, there *used* to be some reasons to set the clock
> before starting ntpd.  In general, there is no need to do this anymore
> and I have not heard any good reasons it should still be needed.
> 
> If anybody knows of any *good* reasons to set the clock before starting
> ntpd, please speak up.

With the -x option (or any larger tinker step) setting the clock
before starting ntpd is useful to avoid a possibly very long initial
offset correction.

That wouldn't be needed if ntpd had an option to always step on start.

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-15 Thread Miroslav Lichvar
On Mon, Aug 15, 2011 at 12:46:04PM -0500, Ken Link wrote:
> Does anyone know of any Linux CLI tools that display (in some way or
> another) the serial control signals received on a serial port?

You can cat the /proc/tty/driver/serial file in a loop and see if the
CD bit is flipping. 

0: uart:16550A port:03F8 irq:4 tx:1422 rx:184849390 fe:4891 pe:31 RTS|DTR|CD

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Mon, Aug 15, 2011 at 10:58:45PM -0500, Ken Link wrote:
> Now, I get timestamps in the assert and clear files: 1313465775.004708342#4545
> 
> However, the source is still jittery! 15ms sometimes, like it's not
> using the PPS signal at all.

If you compare a few assert timestamps in a row, how stable is the offset?
A couple of microseconds?
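One way to make that comparison is sketched below, assuming the timestamps have the "seconds.nanoseconds#sequence" form quoted above with a nine-digit fraction. The second sample in the usage note is invented for illustration, and wrap-around at a second boundary is not handled.

```python
def parse_assert(text):
    """Split 'seconds.nanoseconds#sequence' into three integers."""
    stamp, seq = text.strip().split("#")
    sec, nsec = stamp.split(".")
    return int(sec), int(nsec), int(seq)


def fraction_deltas_us(samples):
    """Change of the sub-second part between consecutive samples, in microseconds."""
    fracs = [parse_assert(s)[1] for s in samples]
    return [(b - a) / 1000.0 for a, b in zip(fracs, fracs[1:])]
```

For example, fraction_deltas_us(["1313465775.004708342#4545", "1313465776.004708300#4546"]) gives a single delta of about -0.042 microseconds.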

> The important configuration lines are:
> server 127.127.20.0 mode 18 minpoll 4 prefer
> fudge 127.127.20.0 flag1 1 flag2 0 flag3 1 flag4 0

You wrote earlier that you use a 2.6.38 kernel. I think the kernel PPS
discipline was added later, so maybe it would help to remove the flag3
setting.

Also, how is the PPS source marked in the ntpq -p output?

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Tue, Aug 16, 2011 at 09:42:09AM -0500, Ken Link wrote:
> I wrote a script to compare the previous/current assert timestamp
> (ignoring the seconds), and this is what I get when NTP is *not*
> running:
> 
> -.04357
> -.07328
> -.08905
> -.07969
> -.07593
> -.09174
> -.06566
> -.08152
> -.07500
> -.06854

That looks good.

> I tried setting flag3 to 0 but it didn't appear to make a difference.
> Also, I would have expected NTP to mark the clock as a PPS source with
> 'o', but when I let it sync it seems to stick with '*' instead:
> 
> $ ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  LOCAL(0)        .LOCL.          12 l  232   64   10    0.000    0.000   0.002
> *GPS_NMEA(0)     .GPS.            0 l    7   16  377    0.000   20.012  26.423

It seems the GPS driver is not getting or is ignoring the PPS signal.
I think there were some issues fixed recently in it. I'd try the ATOM
driver (22) first to verify ntpd was compiled with PPS support and is
able to use it.

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Tue, Aug 16, 2011 at 12:49:16PM -0500, Ken Link wrote:
 
> I tried copying the header from ntpsrc/ports/winnt/include/timepps.h
> to /usr/include/timepps.h, but no dice. Do I just need to copy some
> more headers somewhere or does this mean I have to recompile the
> kernel?

I think you just need the timepps.h header, try this one
https://raw.github.com/ago/pps-tools/HEAD/timepps.h

-- 
Miroslav Lichvar


Re: [ntp:questions] Fwd: Re: NetBSD GPS/PPS using 4.2.6p3

2011-08-22 Thread Miroslav Lichvar
On Sun, Aug 21, 2011 at 02:55:55PM -0700, A C wrote:
> That is where I obtained the ppstest code and then later I
> discovered the test code within the ntpd source distribution.  The
> NetBSD list also suggested that I compare kernel traces on the two
> programs.  It seems that ntpd's pps-api code behaves a bit
> differently than ntpd itself when it interfaces with the kernel.  I
> can provide traces to anyone that would like them for both the
> pps-api test program and ntpd 4.2.6p3.

> 127.127.22.1  flag2 0 flag3 1 refid PPS\n\n
>  11255  1 ntpd CALL  ioctl(7,PPS_IOC_SETPARAMS,0x1052d204)
>  11255  1 ntpd CALL  ioctl(7,PPS_IOC_KCBIND,0xefffdf8c)

A shot in the dark, have you tried removing "flag3 1" to disable the
kernel PPS discipline?

-- 
Miroslav Lichvar


Re: [ntp:questions] Accuracy of GPS device

2011-09-02 Thread Miroslav Lichvar
On Fri, Sep 02, 2011 at 09:50:05AM +0100, Miguel Gonçalves wrote:
> I found out the problem and just for the record I'll explain...
> 
> The offset is larger than the delay because NTPd is using 10.0.2.254 (more
> on this switch later) as a time source and it shouldn't because it has two
> local stratum 1 clocks that are closer (0.170 ms vs 0.583 ms) and show less
> jitter. Anyway... to prove my point I removed 10.0.2.254 (the **internal**
> switch) from the configuration and here's the result of ntpq -p as of now:

It would be interesting to see the root distances for the three
servers. I think it's reasonable to expect the weights of the stratum
1 servers to be much higher than the weight of the third server, so
the combined offset isn't affected much by the third server. But what
I think is happening here is that the high default dispersion rate (15
ppm) increases the root distance so much that the weights are not that
much different. Setting "tinker dispersion" to a more realistic value
like 1 ppm (or even to 0.1 ppm in your case, see my comment below)
should help.
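To get a feeling for the scale, here is a simplified sketch of how much dispersion accumulates between updates at different rates. It assumes a 1024 s interval between samples (the poll value in the quoted ntpq output) and looks only at the aging term. The real root distance in ntpd has more components, so this is an order-of-magnitude illustration and not the actual weighting algorithm.

```python
def dispersion_growth_ms(elapsed_s, rate_ppm=15.0):
    """Dispersion accumulated over elapsed_s at rate_ppm, in milliseconds."""
    return elapsed_s * rate_ppm * 1e-6 * 1000.0


# Half of the delays mentioned in the quoted message (0.170 ms and 0.583 ms).
half_delays_ms = {"stratum 1 server": 0.170 / 2, "switch": 0.583 / 2}

for rate in (15.0, 1.0, 0.1):
    aged = dispersion_growth_ms(1024, rate)
    print(f"tinker dispersion {rate:4.1f} ppm -> {aged:7.3f} ms after 1024 s")
    for name, half in half_delays_ms.items():
        print(f"    {name}: delay/2 + aging = {half + aged:7.3f} ms")
```

At 15 ppm the aging term is roughly 15 ms after 1024 s and swamps the sub-millisecond difference in delay, so the distances of all three sources come out nearly the same. At 0.1 ppm the delay term dominates again.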

You can also use "tos minclock 2" to limit the number of combined
sources. 

> $ ntpq -p 10.0.2.2
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> +10.0.2.10       .GPS.            1 u  889 1024  377    0.179   -0.066   0.083
> *10.0.2.9        .GPS.            1 u  391 1024  377    0.166   -0.084   0.051

Those are very good numbers for such a high polling interval. Is the
crystal oscillator thermally stabilized?

In any case I'd suggest using a shorter maximum poll interval. The
default maxpoll is way too high for the jitter normally seen on LANs if
you want the best accuracy.

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-05 Thread Miroslav Lichvar
On Sat, Sep 03, 2011 at 08:05:21AM -0500, steven Sommars wrote:
> I monitored Garmin LVC (corrected firmware) NMEA time and saw variance of up
> to 50msec.  I wonder if the variation in NMEA time depends on GPS signal
> quality.

I'm wondering what the cause of the variance is too.

With 18x LVC (firmware 3.70) I see errors up to 150 ms. That
wouldn't be that bad if it was randomly distributed.

A capture over 30 hours:
http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-05 Thread Miroslav Lichvar
On Mon, Sep 05, 2011 at 03:04:54PM +0000, unruh wrote:
> On 2011-09-05, Miroslav Lichvar wrote:
> > With 18x LVC (firmware 3.70) I see errors up to 150 ms. That
> > wouldn't be that bad if it was randomly distributed.
> >
> > A capture over 30 hours:
> > http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
> 
> This was captured how? Is that the beginning or the end of the nmea
> sentence?
> You have some where the offset is negative. Does this really mean that
> the nmea came in before the beginning of the second it referred to?

No, I forgot to mention that was already with 0.5s correction applied.
It's from gpsd which seems to make the NMEA receive timestamp after
the message is processed.

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-06 Thread Miroslav Lichvar
On Mon, Sep 05, 2011 at 04:47:20PM +0000, unruh wrote:
> On 2011-09-05, Miroslav Lichvar wrote:
> > It's from gpsd which seems to make the NMEA receive timestamp after
> > the message is processed.
> >
> Never did understand that. Timestamping the beginning of the sentences
> is cheap enough and easy enough. 
> Mind you, your fluctuations are far more than I would expect simply from 
> variations in the length of the sentences.
> Are there more sentences delivered than just the one gpsd uses?

There are other messages enabled (I like to monitor the visibility of
satellites in cgps), but RMC and GGA are transmitted first. The baud
rate is set to 115200. The measured time it takes to transmit one
batch is about 85 +/- 10 ms.

Here is another capture, this time only over a couple of hours, but it's
the offset to the beginning of the transfer (i.e. start of RMC).

http://mlichvar.fedorapeople.org/tmp/18x_nmea2.png

The offset still moves in a 300ms range.

-- 
Miroslav Lichvar


Re: [ntp:questions] Loop Frequency and Offset

2011-09-27 Thread Miroslav Lichvar
On Mon, Sep 26, 2011 at 02:29:42PM +0100, Miguel Gonçalves wrote:
> So... I was getting around 180 ppm frequency offset in /etc/ntp.drift and
> this value also appeared in loopstats file.
> 
> I checked the machdep.tsc_freq value (this is estimated at boot but can be
> changed after the boot completes):
> 
> tick# sysctl machdep.tsc_freq
> machdep.tsc_freq: 498053689
> 
> Now... if the clock is running 180 ppm fast (I assumed slower and the result
> was the opposite i.e. it ran even faster) I have to decrease the frequency
> by 180 ppm:
> 
> 498053689 - (498053689 * 180/1000000) = 497964040

Just a minor correction (but I've seen others do that too). The value
in the driftfile is how much the system clock has to be sped up to match
UTC, not how much the clock is slower than UTC. The error is tiny at
such small values, but if you want to invert it accurately, you need
to divide it by 1.000180 instead of multiplying by 0.999820.
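A small sketch of how much the two ways of inverting differ, using the tsc_freq value and the 180 ppm figure from the quoted message (the function names are made up for illustration):

```python
def scale_by_multiplying(freq_hz, ppm):
    """Approximate inversion: multiply by (1 - ppm)."""
    return freq_hz * (1.0 - ppm * 1e-6)


def scale_by_dividing(freq_hz, ppm):
    """Exact inversion: divide by (1 + ppm)."""
    return freq_hz / (1.0 + ppm * 1e-6)


freq = 498053689
approx = scale_by_multiplying(freq, 180)
exact = scale_by_dividing(freq, 180)
print(f"multiply: {approx:.1f} Hz")
print(f"divide:   {exact:.1f} Hz")
print(f"difference: {exact - approx:.1f} Hz ({(exact - approx) / freq * 1e6:.4f} ppm)")
```

The two results differ by about 16 Hz, or roughly 0.03 ppm, which is the square of 180 ppm. That is why the error only matters for large frequency offsets.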

-- 
Miroslav Lichvar


Re: [ntp:questions] Loop Frequency and Offset

2011-09-27 Thread Miroslav Lichvar
On Tue, Sep 27, 2011 at 10:20:57AM +0100, David Woolley wrote:
> Richard B. Gilbert wrote:
> 
> >
> >I don't believe that accuracy of 1 microsecond, or less, is
> >obtainable without installing a GPS Timing Receiver or an
> >atomic clock of some sort.
> 
> He asked for an offset of 1 microsecond (presumably RMS or 90
> percentile?), not an accuracy of 1 microsecond.
> 
> If you ignore systematic errors, an offset of 1 microsecond
> corresponds to an accuracy in the low 100s of nanoseconds.

Only if the loop is tracking frequency changes well. If you see long
runs of offsets with the same sign in the loopstats log, the actual
error is probably closer to the reported offset.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP on embedded Linux with GPRS connection

2011-11-24 Thread Miroslav Lichvar
On Wed, Nov 23, 2011 at 08:06:28AM -0800, mas...@tlen.pl wrote:
> - use hwclock --adjust. So after ntpd synchronises the time, I would
> have to issue "hwclock -w -u" and repeat it after at least 24h, so
> hwclock can estimate the drift. Then repeat this process.

To use the --adjust option with ntpd you'll need to make sure the
kernel RTC synchronization (11 minute mode) is not enabled as it would
throw off the RTC drift estimation. See hwclock(8) for more
information.

-- 
Miroslav Lichvar


Re: [ntp:questions] Ginormous offset and slow convergance

2011-11-30 Thread Miroslav Lichvar
On Wed, Nov 30, 2011 at 10:24:45PM +0000, unruh wrote:
> If he has peerstats log file, he can look at it and see what the offset
> is of the oncore and the other ntp sources to see if it is really
> misbehaving that badly. Also, if it is out by 16 sec, why in the world
> has ntp not stepped the time? The threshold is 128ms. 

I think it did step and more than once. I'd suspect a bug in the
firmware in the GPS-UTC offset handling; the current GPS-UTC offset is
15 seconds, and that is visible in one of the ntpq outputs in the
original post.

Would be interesting to know if this happens on every ntpd restart or
only shortly after the GPS unit was powered up.

-- 
Miroslav Lichvar


Re: [ntp:questions] Ginormous offset and slow convergance

2011-11-30 Thread Miroslav Lichvar
On Wed, Nov 30, 2011 at 11:28:22PM +0000, unruh wrote:
> On 2011-11-30, Miroslav Lichvar wrote:
> > On Wed, Nov 30, 2011 at 10:24:45PM +0000, unruh wrote:
> >> If he has peerstats log file, he can look at it and see what the offset
> >> is of the oncore and the other ntp sources to see if it is really
> >> misbehaving that badly. Also, if it is out by 16 sec, why in the world
> >> has ntp not stepped the time? The threshold is 128ms. 
> >
> > I think it did step and more than once. I'd suspect a bug in the
> > firmware in the GPS-UTC offset handling, current offset is 15 seconds
> > and that is visible in one of the ntpq outputs in the original post.
> 
> But how could he get a 16 second offset, after starting out with a .1 s
> and 1 s offset. At 500PPM, 16 sec takes 32000 sec  (10 hr) to accumulate
>  which is poll interval 15. Ie, I cannot see how ntpd could have
>  allowed that huge an offset to occur. 

ntpd doesn't step more than once per 15 minutes. What I think was
happening: on start the clock is good to a couple of ms, NTP servers are
not reachable yet, but GPS is off by 16s, ntpd steps immediately; GPS
is off by 15s, NTP servers are off by 16s, ntpd doesn't step yet; GPS
and NTP are off by 16s, ntpd steps back and stabilizes.

The loopstats log would be useful.

-- 
Miroslav Lichvar


Re: [ntp:questions] Ginormous offset and slow convergance

2011-12-02 Thread Miroslav Lichvar
On Thu, Dec 01, 2011 at 12:24:44AM +0000, Pete Ashdown wrote:
> Miroslav Lichvar writes:
> 
> >Would be interesting to know if this happens on every ntpd restart or
> >only shortly after the GPS unit was powered up.
> 
> Every restart (that doesn't have 127.127.0.1 in the config).

That would suggest the problem is on the ntpd side. I wasn't able to
analyze the oncore debug messages in your other post, but maybe you
could try to switch the unit to NMEA mode and use the NMEA driver, or
try it with gpsd and the SHM driver, and see if that makes a difference.

-- 
Miroslav Lichvar


Re: [ntp:questions] New ntp Server

2011-12-08 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 03:55:50AM +0000, Mark C. Stephens wrote:
> Oh no I am quite happy with my hp3325A ;)
> 
> 
> Well Okay, after a slight detour trying to get ilo100 to work, I loaded 
> centos 6.0 x64 on the DL165 G2 (computer) and found it has 3.3V PCI slots. So 
> none of my Serial I/O cards fit, being 5V. I have seen people take a dremel 
> to them to cut a 3.3V notch, but I am not a 100% sure this works. 
> 
> Centos 6.0 is really impressive I have to say. Also the PPS kernel module is 
> already built and installed, just need to load it.

The kernel includes general PPS support, but there is no support for
PPS on serial devices (pps_ldisc module). You'll probably need to use
a newer version of kernel or backport the module to the old version.
You'll also need to recompile the ntp package with the timepps.h
header.

It might be easier to try a newer distro. For instance, Fedora 14 and
later have kernel, ntp and chrony packages compiled with PPS support
and it should work out of the box, even with SELinux enabled :).

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] New ntp Server

2011-12-08 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 07:53:15AM +0000, Mark C. Stephens wrote:
> Hello Sir Unrah,
> 
> 
> I just use ntpq -p. I am using Dave Harts rather excellent port to windows:
> 
> C:\Program Files\NTP\bin>ntpq -p
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *GPS_NMEA(1)     .GPS.            0 l    1   16  377    0.000   -0.139   0.059
> oPPS(1)          .PPS.            0 l    -   16  377    0.000   -0.007   0.002
> 
> I restarted ntpd a couple of hours ago so these number will improve.
> 
> That is a good question, are we talking seconds for offset and jitter here? 

They are milliseconds. If ntpd on Windows can really keep the clock
stable to ~10 microseconds, the recent suggestion posted here to
never use Windows for serious timekeeping might need to be revisited.

-- 
Miroslav Lichvar


Re: [ntp:questions] New ntp Server

2011-12-09 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 12:50:12PM +0000, Dave Hart wrote:
> The results are worse than FreeBSD or Linux.  I suspect the difference
> is mostly due to the interpolation code having to guess at when, on
> the counter timescale, the system clock ticked up to the present
> value.  Some ugly busy-looping logic might help refine that and also
> overcome the incompatibility with newer Windows versions' clocks.

It's a pity the system doesn't provide a function for precise clock
reading.

What resolution does the clock frequency adjustment have? I'm reading
about the SetSystemTimeAdjustment function and the adjustment is in
100-ns units applied over an lpTimeIncrement interval. If the interval
is too short I suspect this could also limit the time and frequency
accuracy of the system clock.
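A back-of-the-envelope sketch of that concern: if the adjustment can only change in whole 100-ns units per time increment, the smallest possible frequency step is one unit divided by the increment. The increments below are assumptions for illustration (15.625 ms is a commonly reported value, 1 ms is hypothetical). They are not taken from this thread.

```python
def adjustment_step_ppm(time_increment_100ns):
    """Smallest frequency step when the adjustment changes by one 100-ns unit."""
    return 1e6 / time_increment_100ns


# Time increments in 100-ns units: 15.625 ms and a hypothetical 1 ms.
for increment in (156250, 10000):
    ms = increment / 10000
    print(f"{ms:7.3f} ms increment -> {adjustment_step_ppm(increment):6.1f} ppm per unit")
```

With a 15.625 ms increment one unit corresponds to 6.4 ppm, and with a 1 ms increment it would be 100 ppm, so a shorter interval gives a coarser frequency control.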

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Configure FreeBSD or Linux to use stepping clock?

2011-12-16 Thread Miroslav Lichvar
On Thu, Dec 15, 2011 at 07:50:08PM +0000, Dave Hart wrote:
> Dr. Mills raised the possibility privately that either FreeBSD or
> Linux might be reconfigured to use a more primitive clock that steps
> once per millisecond or less.  If possible and I am able to accomplish
> it, my testing of these bug 2037 fuzzing changes would be greatly
> assisted.

On Linux, you could set a different kernel clocksource. Perhaps
to jiffies or pit, if available.

Check these files:
/sys/devices/system/clocksource/clocksource0/current_clocksource
/sys/devices/system/clocksource/clocksource0/available_clocksource
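A short sketch for inspecting the two files, assuming the sysfs layout listed above. The helper name is made up for illustration. Changing the clocksource is done by writing one of the available names to current_clocksource as root, which this sketch deliberately does not do.

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

On a Linux machine, read_clocksources() returns the active clocksource and the names that can be selected. Whether jiffies or pit appears in the list depends on the kernel and hardware.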

Why not degrade the resolution of the clock directly in the ntp sources?
In get_systime():
	GET_SYSTIME_AS_TIMESPEC(&ts);
	/* truncate the timestamp to 1 ms resolution */
	ts.tv_nsec /= 1000000;
	ts.tv_nsec *= 1000000;

-- 
Miroslav Lichvar


[ntp:questions] Visualization of clock control

2012-01-04 Thread Miroslav Lichvar
Hi,

I wrote a tool to visualize the data generated by the clknetsim
simulator and I thought some of you might find it interesting. The
goal was to show how a clock is controlled by an NTP client and at the
same time see its offset from true time and the NTP measurements (the
actual offset and delay seen by the client).

Here are some example runs of the tool captured to animated gifs:
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_10us.gif
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_100us.gif
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_1000us.gif

The simulations were done with a clock wandering at 1 ppb/s,
10/100/1000us network jitter with exponential distribution and the NTP
clients were configured to use a 64s polling interval.

The white line is the reference clock. The red line is the clock
controlled by ntp and green is chrony. The blue lines are the NTP
measurements made by chrony. Both clients were getting the same data,
but the polling intervals were not exactly the same so the frequency
changes in the red line don't match exactly with the blue lines.

The tool is included in the clknetsim git as visclocks.py. It also has
a game mode, where you control the frequency and phase of the clock by
mouse and you can try to beat the other clients. :)

-- 
Miroslav Lichvar


Re: [ntp:questions] Visualization of clock control

2012-01-05 Thread Miroslav Lichvar
On Thu, Jan 05, 2012 at 11:40:25AM +0800, Dennis Ferguson wrote:
> On 4 Jan, 2012, at 22:54 , Miroslav Lichvar wrote:
> > The simulations were done with a clock wandering at 1 ppb/s,
> > 10/100/1000us network jitter with exponential distribution and the NTP
> > clients were configured to use 64s polling interval.
> 
> That's pretty neat.  I think, however, that the clock wander of 1 ppb/s
> is about an order of magnitude too large for real life, at least for machines
> kept in an air conditioned room (and the behavior of clocks in machines
> subject to environmental variations probably can't be modeled by "wander" at
> all).  My measurements against precise hardware tended towards a value of
> 1ppb/10s, which is also consistent with the 10^-8/1000s which sometimes shows
> up on Allan variance plots (I think there's a square root relationship in
> there if the wander is a truly random walk).

I think the 1ppb/10s random walk wander corresponds to ~0.32ppb/1s.
The +0.5 slope in the variance plot intersecting 10^-8 at 1000s would
be ~0.6ppb/s wander.
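Both conversions can be reproduced with a few lines. This is a sketch under two assumptions that are not stated in the message: the random-walk steps scale with the square root of the interval, and the Allan deviation of a random-walk frequency process follows the textbook relation sigma_y(tau) = a * sqrt(tau / 3). The function names are made up for illustration.

```python
import math


def rescale_random_walk(step, step_interval_s, target_interval_s=1.0):
    """Random-walk step size rescaled to a different interval (sqrt scaling)."""
    return step * math.sqrt(target_interval_s / step_interval_s)


def wander_from_adev(adev, tau_s):
    """Per-second random-walk step implied by an Allan deviation at tau_s."""
    return adev * math.sqrt(3.0 / tau_s)


print(f"1 ppb per 10 s  -> {rescale_random_walk(1.0, 10.0):.2f} ppb per 1 s")
print(f"1e-8 at 1000 s  -> {wander_from_adev(1e-8, 1000.0) * 1e9:.2f} ppb per 1 s")
```

The first line gives about 0.32 ppb and the second about 0.55 ppb, which is close to the two figures above. The exact constant used for the second figure is not given in the message, so the factor of 3 is a guess.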

I tried to model some thermal effects by adding a sine, triangle or
pulse wave to the clock frequency, but it seemed to me the effect it
had on the overall RMS time error was similar to just increasing the
wander. So instead of three or more parameters of the clock I set only
one. Sometimes I even use 10ppb/s wander to simulate a machine with
varying CPU load, and I think the results are not that different from
what I see on my desktop.

BTW, the simulator can be configured to read the clock frequency from
a file. If you have real data from a PPS refclock, you can use that
and see at what random walk wander ntpd gives similar results.

> The other difficulty with respect to real life may be modeling network jitter
> as exponential, since I believe the probability distribution for network
> delays is heavy-tailed (i.e. with extreme values way over-represented; this
> is a problem when using statistics which assume the underlying error
> distribution is gaussian).
> I don't know how to fix that, though.

I'd definitely be interested in a better model for network delays. I
guess we could try to make a collection of the ntp rawstats logs from
various network environments and see what the distribution looks like.

-- 
Miroslav Lichvar


Re: [ntp:questions] Leap second

2012-01-06 Thread Miroslav Lichvar
On Fri, Jan 06, 2012 at 12:41:13PM +0100, Rob van der Putten wrote:
> >It is announced now, it occurs Jun 30.
> >The tzdata database contains a file called "leapseconds" which contains
> >all of the leapseconds which have occurred or are known to occur in the
> >future.
> 
> In 'right' (based on the International Atomic Time) it does, in
> 'posix' (based on the Coordinated Universal Time) it doesn't.
> Does anyone use 'right'? Is this supported by NTPD?

I don't think you can use the "right" timezones on a system
synchronized via NTP. But ntp/chrony could use the information about
leap seconds stored in the right/UTC timezone and I think that would
be a nice feature. To check if a leap second will occur on a specified
date, it just needs to call mktime() in the right/UTC zone and see if
the seconds overflowed or not, see

http://pastebin.com/DqM4s35Y
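The same idea can be sketched in a few lines. This is not the code behind the link, only an illustration of the mktime() check. It assumes a Unix system where the tz database includes the "right/" zones, which many distributions no longer ship. Without them the zone name falls back to plain UTC and the function always returns False. The function name is made up for illustration.

```python
import os
import time


def has_leap_second(year, month, day, zone="right/UTC"):
    """True if 23:59:60 exists on the given date according to the zone's leap table."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = zone
    time.tzset()
    try:
        # Ask for second 60; if no leap second is inserted on that day,
        # mktime() normalizes the time to 00:00:00 of the next day.
        t = time.mktime((year, month, day, 23, 59, 60, 0, 0, -1))
        return time.localtime(t).tm_sec == 60
    finally:
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()
```

With the right/UTC zone installed and an up-to-date leap second table, has_leap_second(2012, 6, 30) should be true for the leap second discussed in this thread, while any ordinary day returns False.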

-- 
Miroslav Lichvar


Re: [ntp:questions] Failed to test leapsecond's handling

2012-03-08 Thread Miroslav Lichvar
On Thu, Mar 08, 2012 at 01:10:02PM +0100, Marco Marongiu wrote:
> But when I graph the time log (see the log target in the makefile), I
> don't see the leap second kicking in. Based on Mills' "The NTP Timescale
> and Leap Seconds"[1], when the leap second kicks in, I'd expect two
> consecutive date commands to _appear_ to happen at a different offset than in
> normal conditions. Unfortunately, that didn't happen, and if I draw a
> line of the accumulated offsets between consecutive runs of the command,
> the line is almost perfectly straight.

Do you see the leap bit enabled in ntptime or adjtimex output? Is the
local timezone UTC? Just to make sure the date command sets the time to
before 0:00 UTC and not some other hour. It would also be interesting
to try "disable kernel" in ntp.conf.

In a clknetsim simulation with ntp-4.2.6p5 I can see the clock is
correctly stepped by 1.0 second. Here is the ntpd log (in UTC+2
timezone):

http://pastebin.com/ZRi6qv8E

-- 
Miroslav Lichvar


Re: [ntp:questions] Failed to test leapsecond's handling

2012-03-08 Thread Miroslav Lichvar
On Thu, Mar 08, 2012 at 02:28:07PM +0100, Miroslav Lichvar wrote:
> In a clknetsim simulation with ntp-4.2.6p5 I can see the clock is
> correctly stepped by 1.0 second. Here is the ntpd log (in UTC+2
> timezone):
> 
> http://pastebin.com/ZRi6qv8E

In another simulation set to start 15 seconds before midnight it
didn't work, so it seems ntpd needs to be started sooner, perhaps
by some number of polling intervals?

-- 
Miroslav Lichvar


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-20 Thread Miroslav Lichvar
On Tue, Mar 20, 2012 at 02:59:12AM +0000, Dave Hart wrote:
> Although it's the first time I've seen such, it appears the offset and
> frequency calculations both ended up overflowing.  I would have
> guessed bad input should have appeared in peerstats before loopstats
> but I didn't find anything unusual.

This sounds familiar. Perhaps the OP is hitting bug 2156, which was
fixed recently? If the emulated adjtime on Windows doesn't apply the
500 ppm limit, that could explain the huge frequency error.

-- 
Miroslav Lichvar


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-23 Thread Miroslav Lichvar
On Fri, Mar 23, 2012 at 11:49:19AM +0100, Terje Mathisen wrote:
> unruh wrote:
> >No I would not. That is not what ntpd does. It really does throw away 7
> >of the samples and never uses them. The whole question is what is the
> >best statistic to use. I do not believe that the "shortest roundtrip
> >time" is that best statistic. If you could convince me it is, I would be
> >more than happy to have ntp use it.
> 
> In _some_ scenarios, keeping only the minimum rttsample is indeed
> the best approach:

Yes, it depends on the network jitter and clock stability. But ntpd
doesn't try to estimate the stability and uses a fixed dispersion rate
and Allan intercept in the filter algorithm (15 ppm and 1024 sec by
default). By tweaking the constants you can change the ratio of
dropped samples.

But I think a much bigger problem with the clock filter and PLL
combination is that it can't drop more than 7 samples. When the
network is saturated, it's usually better to drop many more than that.
If the increase in delay is 1 second and the clock is good to 10 ppm, it
could wait for days before accepting another sample.
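The "days" estimate can be checked roughly as follows. This is an order-of-magnitude sketch which assumes that a free-running clock good to a given ppm accumulates error linearly, and that a sample is only worth accepting once that accumulated error exceeds the error the delay increase would put into the measured offset. The function name is made up for illustration.

```python
def coasting_time_s(offset_error_s, clock_stability_ppm):
    """Time for a free-running clock to accumulate offset_error_s of error."""
    return offset_error_s / (clock_stability_ppm * 1e-6)


# A 1 s one-way delay increase shifts the measured offset by up to 0.5 s.
for error in (0.5, 1.0):
    days = coasting_time_s(error, 10.0) / 86400
    print(f"{error:.1f} s error, 10 ppm clock -> {days:.2f} days")
```

This gives between half a day and a bit over a day for a 10 ppm clock, and proportionally longer for a more stable one.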

> In order to be considered OK, we can't accept more than 50 ppb
> frequency offset.
> 
> Handling this with up to 50 ms sawtooth variation (with periods up
> to several hours) in the one-way latency means that the vendor
> require sampling periods of up to 10+ hours, with multiple
> packets/second and then keeping a single packet at the end.

That seems excessive. Do they set the frequency directly just from the
last two samples? With PLL or similar, increasing the time constant
accordingly might be a better approach.

-- 
Miroslav Lichvar


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-26 Thread Miroslav Lichvar
On Fri, Mar 23, 2012 at 06:12:11PM +0100, Terje Mathisen wrote:
> Miroslav Lichvar wrote:
> >On Fri, Mar 23, 2012 at 11:49:19AM +0100, Terje Mathisen wrote:
> >But I think a much bigger problem with the clock filter and PLL
> >combination is that it can't drop more than 7 samples. When the
> >network is saturated, it's usually better to drop much more than. If
> >the increase in delay is 1 second and the clock is good to 10 ppm, it
> >could wait for days before accepting another sample.
> 
> Oh but it can!
> 
> Check out "huff-puff"!
> 
> You can easily tell ntpd to coast past multi-hour periods of
> excessive delays/traffic.

With huff-puff it doesn't really coast, it just shifts the offset in
one direction by increase in the delay. This works well when the link
is saturated in one direction, but under normal conditions it makes the
timekeeping worse, so you need to consider if it's worth enabling.
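
A simplified sketch of that correction, as I understand it (this is not the
ntpd source; the function and argument names are invented here, and the
minimum delay is assumed to be tracked elsewhere over the huffpuff window):

```python
def huffpuff_correct(offset, delay, min_delay):
    """Shift the offset toward zero by half of the increase in delay.

    offset, delay and min_delay are in seconds; min_delay is assumed to be
    the smallest delay seen during the configured huffpuff window.
    """
    correction = (delay - min_delay) / 2.0
    if offset > 0:
        return offset - correction
    return offset + correction


# Link saturated in one direction: delay grew from 20 ms to 220 ms and
# the measured offset is off by half of that increase.
print(huffpuff_correct(0.100, 0.220, 0.020))   # close to 0

# Normal conditions: a small symmetric delay variation gets "corrected"
# as well, which pushes a good offset away from its true value.
print(huffpuff_correct(0.001, 0.024, 0.020))   # close to -0.001
```

The second call shows why it can hurt under normal conditions: the filter
assumes any increase in delay is fully asymmetric.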

If you want to see why ntpd can't drop more samples, you can block the
NTP packets in a firewall, e.g. in a cycle which allows 4 packets and
drops 60. The PLL will be unstable, the frequency will jump up and
down, and the offset will be orders of magnitude higher. This is the
reason why some other NTP implementations were created.

-- 
Miroslav Lichvar


Re: [ntp:questions] WARNING: someone's faking a leap second tonight

2012-08-31 Thread Miroslav Lichvar
On Thu, Aug 02, 2012 at 05:57:43AM +0000, Dave Hart wrote:
> On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
> > I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
> > stratum-2 server in the US pool (a few of my systems have it in their
> > list).
> 
> That particular system seems to have corrected its leap indication,
> but plenty of other pool participants are advertising leap.  I have
> this laptop set to associate with every IP in a list of all pool
> servers as of late June.  The following are showing leap=01 now:
> 
[...]

From that list the following IPv4 servers still seem to be announcing
a pending leap second:

131.155.140.129  Netherlands
131.155.140.130  Netherlands
143.121.199.173  Netherlands
161.53.248.35    Croatia
164.107.116.179  United States
178.237.34.94    Netherlands
192.87.106.2     Netherlands
192.87.106.3     Netherlands
192.87.36.4      Netherlands
193.2.111.2      Slovenia
193.2.111.3      Slovenia
193.2.4.2        Slovenia
193.2.78.228     Slovenia
193.77.222.200   Slovenia
193.77.237.128   Slovenia
193.95.229.133   Slovenia
194.171.167.130  Netherlands
194.249.198.30   Slovenia
213.129.242.82   Austria
213.206.85.20    Netherlands
217.75.72.153    Slovakia
219.117.206.46   Japan
64.22.125.197    United States
67.209.225.216   United States
69.65.33.188     United States
72.14.178.210    United States
77.245.91.218    Netherlands
77.94.135.133    Slovenia
80.239.2.130     Norway
81.167.109.120   Norway
81.187.35.170    United Kingdom
81.93.163.20     Norway
81.93.163.23     Norway
82.197.80.125    United Kingdom
83.98.201.133    Netherlands
83.98.201.134    Netherlands
85.158.249.144   Netherlands
85.17.71.101     Netherlands
85.252.162.7     Norway
86.61.66.23      Slovenia
90.155.74.40     United Kingdom
91.198.87.118    Netherlands
94.26.2.134      Bulgaria
95.211.7.153     Netherlands
98.191.213.7     United States

-- 
Miroslav Lichvar


Re: [ntp:questions] Testing throughput in NTP servers

2012-09-13 Thread Miroslav Lichvar
On Wed, Sep 12, 2012 at 02:28:24PM +0200, Ulf Samuelsson wrote:
> Anyone knows if there are any available Linux based S/W to test the
> throughput of NTP servers?
> I.E:
> 
>   packets per second?
>   % of lost packets
>   etc?

I've used tcpdump and tcpreplay to measure the maximum packet rate
ntpd can handle. IIRC, the ntpd process itself needed only a couple of
percent of the CPU, so I think the bottleneck is always in the kernel or
the NIC.

-- 
Miroslav Lichvar


Re: [ntp:questions] SO_TIMESTAMPING experiments (sub-us jitter over LAN)

2012-10-18 Thread Miroslav Lichvar
On Thu, Oct 18, 2012 at 02:53:02AM -0700, gabs wrote:
> SO_TIMESTAMPING [1] is a socket option for obtaining transmit and receive
> timestamps. ntpd uses SO_TIMESTAMP to get receive timestamps. Transmit
> timestamps require NIC driver support [2] and the application gets the
> timestamp after the packet is sent.
> 
> A PTP-like protocol is used to measure the delay and offset between the
> client and server. The first part is a message exchange similar to
> NTP [3], except the client gets a more accurate transmit timestamp
> (T1) from the kernel after sending the packet. The server, after sending
> its reply, gets the kernel transmit timestamp (T4), then sends another
> packet containing T4 (similar to the PTP Sync Follow message). The
> median of 6 offsets are sent to the client's ntpd thru the SHM driver.

NTP supports interleaved mode in peer associations which does the
timestamp followup. It would be really nice if ntpd supported the
SO_TIMESTAMPING option.

http://www.eecis.udel.edu/~mills/ntp/html/xleave.html

> Both are running Linux kernel 3.3.4, tickless, no preemption,

You may want to try nohz=off; it seems there is a bug in the kernel
which can cause extra jitter of a couple of microseconds in the
system clock readings when the system is idle.

> Sample measurements (raw):
> left: delay
> right: offset (in microseconds)
> 
> 91509  509
> 89577 -991
> 88365 -795
> 89574 -731
> 90593 -163
> 89360 -1067
> 
> 92650 -318
> 90634 -910
> 89455 -989
> 89511 -1080
> 88874 -693
> 88534 -1140

Cool. Are those numbers nanoseconds?

-- 
Miroslav Lichvar


Re: [ntp:questions] What is the NTP recovery time from 16s step in GPS server?

2012-10-31 Thread Miroslav Lichvar
On Wed, Oct 31, 2012 at 05:22:44PM +0000, Rob wrote:
> Using USB ports in a service started at boot time should normally
> work ok, but when it has issues on the Raspberry maybe it could
> be solved by delaying the startup of gpsd a bit.  But don't try to
> tackle all issues at the same time.

Isn't it better to start it from udev then? The gpsd sources provide a
hotplug script, which I think is included at least in the Debian and Fedora
gpsd packages.

-- 
Miroslav Lichvar


Re: [ntp:questions] Timing issue with Linux and kernel PPS?

2012-11-19 Thread Miroslav Lichvar
On Mon, Nov 19, 2012 at 09:02:12AM +0000, David Taylor wrote:
> On 18/11/2012 15:20, Uwe Klein wrote:
> >what happens if you "insmod pps_ldisc" into the "not ready" system?
> 
> (1) I get Error: could not load pps_ldisc module: No such file or directory

insmod needs the full path to the module; it's better to call
"modprobe pps_ldisc".

> I looked at the article you referenced, but in this case the
> Raspberry Pi is not using the DCD line, but a separate GPIO pin.
> lsmod shows pps_gpio as present.

From the original post it seems you have two pps devices, one for gpio
and the other for ldisc which is created two minutes later (some USB
device?).

Do you see two /dev/pps* devices and are you sure ntpd is using the
gpio one? Perhaps there is an ordering problem?

-- 
Miroslav Lichvar


Re: [ntp:questions] Timing issue with Linux and kernel PPS?

2012-11-20 Thread Miroslav Lichvar
On Mon, Nov 19, 2012 at 06:03:06PM +0000, David Taylor wrote:
> On both systems, sudo modprobe pps_ldisc produces no output.

No message is a good message :).

> I have no idea which device ntpd is using, I simply have the type 22
> driver installed which, as I understood it, gets the accurate
> timestamp from the kernel. 

127.127.22.0 is /dev/pps0, 127.127.22.1 is /dev/pps1, ...

> How the kernel chooses which device to
> use I don't know.

With udev the order might be random. There could be a race between the
script which loads modules from /etc/modules and udev.

> In /dev I see pps0 on the system without a PPS signal connected, and
> pps0 and pps1 on the system /with/ the PPS signal active.  On the
> system /with/ the signal active, some 25 seconds in the dmesg output
> I see: pps_ldisc registered (so ldisc does matter, I stand
> corrected), followed by pps1 new source, and source /dev/ttyAMA0
> added.

You can see what pps device is actually generating events with:
grep '' /sys/class/pps/pps*/{assert,clear}
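
The same check can be scripted. Below is a minimal Python sketch (the
function name and the returned structure are my own choices, not part of
any tool) that reads the assert and clear files of every pps device and
returns their contents as raw text. Run it twice, a second or so apart:
the device whose values change is the one receiving pulses.

```python
import glob
import os


def read_pps_events(sysfs_root="/sys/class/pps"):
    """Return {device: {"assert": text, "clear": text}} for each pps device.

    A value is None when the file is missing or cannot be read.
    """
    events = {}
    for dev_dir in sorted(glob.glob(os.path.join(sysfs_root, "pps*"))):
        entry = {}
        for name in ("assert", "clear"):
            try:
                with open(os.path.join(dev_dir, name)) as f:
                    entry[name] = f.read().strip()
            except OSError:
                entry[name] = None
        events[os.path.basename(dev_dir)] = entry
    return events


# Prints nothing on a system without pps devices.
for dev, entry in read_pps_events().items():
    print(dev, entry)
```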

> So the issue appears to be that /dev/ttyAMA0 is not created until
> the GPS receiver is sending second pulses, and by that time ntpd is
> running and can't see the device.  Here are my lines from ntp.conf:
> 
> # Kernel-mode PPS ref-clock for the precise seconds
> server 127.127.22.0 minpoll 4 maxpoll 4
> fudge 127.127.22.0  flag3 1  refid PPS
> 
> I wonder whether I should be using 127.127.22.1 rather than .0?

Perhaps. Do you use the serial output from the GPS in ntpd with some
driver like NMEA?

If you don't need the pps from /dev/ttyAMA0, my suggestion would be to
prevent loading of the pps_ldisc module, so there is always only one
pps device. Any chance you added a udev rule to load pps_ldisc
automatically when the serial device is created? 

-- 
Miroslav Lichvar


Re: [ntp:questions] A proposal to use NIC launch time support to improve NTP

2012-12-13 Thread Miroslav Lichvar
On Thu, Dec 13, 2012 at 08:23:47AM -0500, Brian Utterback wrote:
> >The internal clock of the network controller is the PHC for IEEE1588,
> >it has a 1 ns resolution, and can be steered with a 32 bit fractional
> >of 1 ns. see SYSTIML and TIMINCA in the I210 datasheet.
> >
> >// jwalck
> 
> I know that. The problem is that there is going to be jitter
> introduced when you set the clock from the kernel. That is generally
> the problem with IEEE 1588, getting the time from the controller to
> the kernel and vice versa. If you have to go across a PCI bus for
> instance that will introduce jitter.

From what I have seen, with multiple readings and some filtering, the
jitter is very small, somewhere between a few nanoseconds and a couple
of tens of nanoseconds. Even if the delay was highly asymmetric, with
a 2 us RTT the error would be at most 1 us, which is still much better
than the delays causing the error in the TX timestamp on Ethernet.
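
The bound comes from the usual assumption that the round trip is split
evenly between the two directions. A short sketch of the arithmetic (the
function name is only for illustration):

```python
def asymmetry_error(delay_out, delay_back):
    """Offset error caused by assuming both one-way delays are equal.

    The estimate is off by half of the difference between the two
    directions, so its magnitude can never exceed half of the RTT.
    """
    return (delay_out - delay_back) / 2.0


# Worst case for a 2 us round trip: all of the delay in one direction.
print(asymmetry_error(2e-6, 0.0))

# A perfectly symmetric path gives no error at all.
print(asymmetry_error(1e-6, 1e-6))
```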

The phc2sys program from the linuxptp project can be used to
synchronize the system clock to the PHC or the PHC to the system
clock. It can do that via PPS or filtered clock readings. 

-- 
Miroslav Lichvar


Re: [ntp:questions] A proposal to use NIC launch time support to improve NTP

2012-12-19 Thread Miroslav Lichvar
On Wed, Dec 19, 2012 at 11:05:57AM -0500, Brian Utterback wrote:
> On 12/19/12 10:12, Ulf Samuelsson wrote:
> >The desired launchtime is compared to the network controller
> >timestamp counter in H/W, so again there is no need to synchronize
> >with the system time.
> 
> Yes there is. The ntpd program has to set a timestamp in the
> outgoing packet and then specify the launchtime when it writes the
> packet. The goal here is to have the timestamp written in the packet
> exactly match the time the packet actually hits the wire. So, the
> timestamp in the packet must be a little in the future when it is
> written so that by the time the controller gets it the packet can be
> delayed until the right time. Since ntpd cannot access the clock in
> the controller, this requires that the kernel time be relatively
> close to the controller time.

ntpd can read the clock, much more slowly than the system clock, but
still fast enough to send tens of thousands of packets per second.

I think it makes more sense to have one loop controlling just the PHC
and another, much tighter, syncing the system clock from the PHC,
rather than trying to sync the system clock through the PHC.

-- 
Miroslav Lichvar


Re: [ntp:questions] How do I validate my PPS clocks?

2013-02-25 Thread Miroslav Lichvar
On Mon, Feb 25, 2013 at 01:44:02PM +0100, Kasper Pedersen wrote:
> From the PPS arrives, and to the kernel timestamps it, is a very long time.
> I wrote this to measure it:
>  http://n1.taur.dk/edgetest.c
> (you will need a linux machine, gcc, and kernel-headers to compile)

Very interesting, thanks! For my machine it shows that the interrupt
latency is around 12 us.

I'm wondering if the kernel module could have an option which would
enable a polling method to time stamp the PPS events.

-- 
Miroslav Lichvar


Re: [ntp:questions] What should the poll be for the shared memory driver (type 28)?

2013-06-18 Thread Miroslav Lichvar
On Mon, Jun 17, 2013 at 07:02:09PM +0100, David Taylor wrote:
> Thanks, Steve.  My knowledge of the source tree is even more limited
> than my knowledge of "C"!  In refclock_shm.c, it does say that the
> "peek" routine is called every second, so if the type 28 driver has
> an internal poll of one second, does it matter what min/max poll is
> set in the ntp.conf file?  Does it even need to be set at all?

IIRC, the one second interval is used only to collect the SHM samples
and store them in a buffer. In the minpoll/maxpoll interval the
collected samples are processed in a median filter and one final
sample is used to update the clock. This improves the jitter.
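
As a rough illustration of the idea (not the refclock code itself, which may
trim and average rather than take a plain median; the names are made up),
one value per poll interval can be derived from the per-second samples like
this:

```python
import statistics


def filter_samples(offsets):
    """Reduce the samples collected during one poll interval to one value."""
    return statistics.median(offsets)


# Five per-second SHM offsets in seconds, one of them an outlier.
collected = [0.0012, 0.0009, 0.0011, 0.2500, 0.0010]
print(filter_samples(collected))
```

The outlier has no influence on the result, which is why the filtered
sample is less noisy than the raw ones.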

> I'll try setting it to 4, as a test, and see whether anything changes.

The difference from NTP sources is that changing the poll interval
also affects the jitter, as there will be a different number of samples
in the filter.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP makes a time jump

2013-07-09 Thread Miroslav Lichvar
On Mon, Jul 08, 2013 at 08:19:12PM +0000, unruh wrote:
> Now, If we know that the max difference between the client and server's
> drift rate is say 200PPM, then if one could limit the server to only
> slewing at 300PPM then the client should be able to keep up. But I do
> not know of any way of telling the server it should never slew faster
> than 300PPM. Is there one?

I think the kernel would have to be recompiled with a smaller
MAXFREQ_SCALED constant or ntpd recompiled with smaller NTP_MAXFREQ if
the kernel discipline is disabled.

-- 
Miroslav Lichvar


Re: [ntp:questions] ISP bloked port 123

2013-09-18 Thread Miroslav Lichvar
On Wed, Sep 18, 2013 at 09:53:44AM +0100, David Taylor wrote:
> On 18/09/2013 08:55, Bert Gøtterup Petersen wrote:
> >At the moment our best bet seem to be using 'ntpdate' on a
> >different port at regular intervals. From a SW perspective, this is
> >not nice nor elegant, but it would do the trick...

> If you have guaranteed Internet access,
> but with 123 blocked, then you could use:
> 
> - the HTTP protocol on port 80, and get the header information which
> includes the time from a known page on a known reliable server - one
> of your own, of course!  You could use the Last-Modified or Expires
> times, both of which I expect that you could program to return the
> current date and time.  Should be OK for 5-minute accuracy.
> 
> If the following ports are not blocked...
> 
> - use the time protocol on port 37
> 
> - use the daytime protocol on port 13

Or NTP can be used on a different port. ntpd doesn't seem to have a
configuration option to set the port number, but it can easily be
changed in the source code (NTP_PORT in ntp.h) before recompiling.

If it's ok to use a different NTP client, chrony has options to set
the local port and the remote port. If the local port is set to 0, the
port will be assigned randomly, effectively making it a client-only
mode (similar to ntpdate -u).
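
To show what a client-only query to a non-standard port looks like on the
wire, here is a minimal SNTP-style sketch in Python. It is not chrony or
ntpdate; the function names are invented, it does no filtering or
validation, and it only works if some server actually answers NTP on the
chosen remote port.

```python
import socket
import struct

NTP_UNIX_EPOCH_DELTA = 2208988800  # seconds between 1900 and 1970


def build_client_packet(version=4):
    """48-byte NTP request: LI=0, the given version, mode 3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return struct.pack("!B", first_byte) + bytes(47)


def query(host, remote_port=123, local_port=0, timeout=2.0):
    """Send one request and return the server transmit time as Unix seconds.

    local_port=0 lets the OS pick a random source port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("", local_port))
        s.settimeout(timeout)
        s.sendto(build_client_packet(), (host, remote_port))
        reply, _ = s.recvfrom(512)
    # The integer part of the transmit timestamp is at bytes 40..43.
    seconds = struct.unpack("!I", reply[40:44])[0]
    return seconds - NTP_UNIX_EPOCH_DELTA


# Hypothetical usage, with a server of your own listening on port 1123:
# print(query("ntp.example.com", remote_port=1123))
```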

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP not syncing

2013-12-06 Thread Miroslav Lichvar
On Fri, Dec 06, 2013 at 10:17:48AM +0000, David Taylor wrote:
> On 06/12/2013 09:36, Harlan Stenn wrote:
> []
> >The only systems we've seen that did this are Linux kernels, and it
> >would be good to get the starting and ending dates/kernel numbers for
> >this behavior.
> >
> >H
> 
> The only data points I can contribute to that are that 3.2.27 and
> 3.6.11 appear to be OK, at least on the Raspberry Pi (Debian).

The relevant commit seems to be
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=08ec0c58fb8a05d3191d5cb6f5d6f81adb419798

It was included in 2.6.38.

-- 
Miroslav Lichvar


Re: [ntp:questions] Slow convergence loopstats (but nice results)

2013-12-12 Thread Miroslav Lichvar
On Thu, Dec 12, 2013 at 09:59:03AM +0100, Martin Burnicki wrote:
> A major problem was that the standard NTP protocol doesn't support a
> way to send the captured time stamp of a previously sent packet to
> its client, as done by the so-called followup message in PTP.

ntpd has the peer and broadcast interleave modes to send the followup
time stamps.
http://www.eecis.udel.edu/~mills/ntp/html/xleave.html

Also, there is a feature called launch time, which is supported in
some NICs, so the follow up message is not always necessary.

> I don't know if new standard NIC chips which support PTP
> timestamping can also timestamp NTP packets, but even if they do
> then in practice there's still the problem with network switches,
> etc.

Some NICs can time stamp any packets.

> There are network switches out there which are PTP-aware and also
> timestamp incoming and outgoing PTP packets to compensate the
> introduced packet delay in some way, but there are no switches
> (AFAIK) which can do this with NTP packets, so even if you used
> hardware time stamping of NTP packets on NTP end nodes the resulting
> accuracy would still be worse than with PTP.
> 
> That's too sad.

Agreed. I think it's possible to implement HW NTP support, but the
problem is that the switch would have to keep some state about each
NTP association. If there was a standardized extension field to store
the processing delay in both directions, that wouldn't be necessary.
I'm not sure what would have to be done to not break the NTP
authentication.

A major advantage NTP has over PTP is that it knows the delay for
each measurement in the client/server and symmetric modes, which
allows it to filter out bad measurements. In PTP the delay is measured
independently (similarly to the NTP broadcast mode), so bad
measurements can't be easily ignored and it's necessary to have all
networking HW with PTP support to account for all processing delays.
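
For reference, the standard NTP calculation from the four timestamps of one
exchange gives both values at once, which is what makes the per-sample
filtering possible. A small sketch (the helper name and the sample numbers
are made up; picking the sample with the smallest delay is a simplification
of what the clock filter does):

```python
def offset_and_delay(t1, t2, t3, t4):
    """Offset and round-trip delay of one client/server exchange.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# True offset is 0.25 s in both exchanges. The first has symmetric
# one-way delays, the second was delayed by an extra second on the way out.
samples = [
    offset_and_delay(10.0, 10.5, 10.75, 10.75),
    offset_and_delay(20.0, 21.5, 21.75, 21.75),
]
for offset, delay in samples:
    print(offset, delay)

# Keeping the measurement with the smallest delay drops the bad one.
best = min(samples, key=lambda s: s[1])
print("selected:", best)
```

The second sample has a wrong offset, but its large delay gives it away.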

-- 
Miroslav Lichvar


Re: [ntp:questions] better rate limiting against amplification attacks?

2014-01-16 Thread Miroslav Lichvar
On Wed, Jan 15, 2014 at 08:35:32PM +0000, Rob wrote:
> William Unruh  wrote:
> > I do not mean the default in the config file, I mean the default if
> > there is no config file or if nothing is set in the config file.
> 
> That only becomes meaningful when ntpd starts to actually work without
> config file.  Of course that would be possible, but I don't think it
> is reality today.  Or is it, in the latest versions?

Servers can now be specified on the command line, so you don't really
need a config file to have ntpd doing something useful. The following
command seems to work as expected.

ntpd -c /dev/null 0.pool.ntp.org

-- 
Miroslav Lichvar


Re: [ntp:questions] better rate limiting against amplification attacks?

2014-01-16 Thread Miroslav Lichvar
On Thu, Jan 16, 2014 at 02:28:32PM +0100, Martin Burnicki wrote:
> Harlan Stenn wrote:
> >  pool 0.debian.pool.ntp.org iburst
> 
> I bet the "server" options for pool servers are in there because
> this was used in earlier versions before the "pool" keyword was
> introduced, and it still works.
> 
> >instead, and I'd have to look up when the 'pool' directive was put in
> >there.
> 
> IIRC this is supported in 4.2.6, but has not been supported in
> 4.2.4p8 and earlier. If the ntp.conf file shipped with a particular
> OS has been initially created a long time ago and always been
> updated for newer NTP versions then I'm not surprised to see this.

IIRC the pool command in 4.2.6 uses quite a lot of servers, which
probably is not an acceptable use of pool.ntp.org. I think it was
improved later in 4.2.7. The page about recommended configuration
doesn't mention it yet.

http://www.pool.ntp.org/en/use.html

Vendors should be careful with the pool command.

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpdc and collectd queries timeout

2014-01-24 Thread Miroslav Lichvar
On Fri, Jan 24, 2014 at 12:28:27PM +0100, Terje Mathisen wrote:
> michalpurzyns...@gmail.com wrote:
> >The ntpdc queries timeout every time on the NTP version
> >ntp-dev-4.2.7p411 (compiled myself). Looks like the type 7 packets
> >are blocked from localhost but I don't know why.
> 
> Type 7 (which is used by ntpdc) isn't blocked on ntp-dev, it has
> been _removed_!

Wasn't it only disabled by default? It still seems to be in 4.2.7p411
in the ntp_request.c file, but "enable mode7" is now required to
process the ntpdc queries.

-- 
Miroslav Lichvar


Re: [ntp:questions] simple nt.conf cases for ntp-client

2014-01-28 Thread Miroslav Lichvar
On Fri, Jan 24, 2014 at 10:13:15PM +0000, William Unruh wrote:
> On 2014-01-24, David Woolley  wrote:
> > If there is a prefer peer and it survives, it uses that one, otherwise 
> > as per clock_combine in ntp_proto.c, i.e. weighted by synchronisation 
> > distance (which grows with time).

> > The weighting may change between versions.  This is 4.2.7p333.
> >
> >  y = z = 0;
> >  for (i = 0; i < npeers; i++) {
> >          x = 1. / peers[i].synch;
> >          y += x;
> >          z += x * peers[i].peer->offset;
> >  }
> >  sys_offset = z / y;
> >
> 
> So, if this is calculated immediately after a new selected-by-filter reading
> comes in, x is infinity and only the latest one is used.

The synchronization distance also includes delay, dispersion and
precision, so it should never be zero and x should always be finite.
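
A small Python transcription of the quoted loop may make the weighting
easier to see (the function name and the sample values are made up; the
distances are assumed to be strictly positive, as argued above):

```python
def combine(peers):
    """Weighted mean of offsets, each weighted by 1 / synch distance.

    peers is a list of (offset, synch_distance) tuples in seconds.
    """
    y = z = 0.0
    for offset, synch in peers:
        x = 1.0 / synch
        y += x
        z += x * offset
    return z / y


# A source with 10 ms distance gets three times the weight of one
# with 30 ms distance.
print(combine([(0.001, 0.010), (0.003, 0.030)]))   # close to 0.0015
```

A freshly updated source has a smaller distance and therefore more weight,
but the older ones still contribute.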

-- 
Miroslav Lichvar


Re: [ntp:questions] status information after ntpd -q

2014-02-04 Thread Miroslav Lichvar
On Tue, Feb 04, 2014 at 10:29:29AM +0000, Sanal, Arjun (NSN - IN/Bangalore) wrote:
> 
> >> i would like to use the command "ntpd -q" to synchronize with a server
> >> once, but i need some feedback from the command about the status.
> >
> > ntpd was designed and is intended to run all of the time as a daemon,
> > but you can do what you've asked for by setting explicit logging path like:
> >
> > # ntpd -q -l /tmp/ntpd.log
> 
> Is there any specific reason why 'ntpd -q' doesn't return any error code
> (for example: situations like server not reachable)

There is a bug filed for that:
https://bugs.ntp.org/show_bug.cgi?id=759

-- 
Miroslav Lichvar


Re: [ntp:questions] IEEE 1588 (PTP) at the nanosecond level?

2014-03-19 Thread Miroslav Lichvar
On Tue, Mar 18, 2014 at 10:20:08PM +0100, Magnus Danielson wrote:
> No, it's not. NTP is being perceived to be "software timestamping"
> but nothing prohibits you from doing it in hardware. Similarly can
> you implement PTP with software time-stamping (with shitty
> performance).
> 
> Doing HNTP makes NTP match up against PTPv1 to some degree, but PTP
> then pulls out the explicit means to make PTP-aware transparent
> clocks to correct for delays, cancelling some of the asymmetry. You
> could do NTP with PTP 2-step processing, but what we would call such
> a bastard would be an interesting thing, NPTP?

There is already a "two step" mode implemented in ntpd that works with
NTP peers or broadcast, it's activated by the xleave option.

An NTP transparent clock could be implemented too. One problem is that
with the current protocol it would have to track the connections. For
a stateless operation a new NTP extension field would probably be needed.
Similarly to PTP, all NTP-aware routers and switches between NTP
server and client would increment a path delay correction.

-- 
Miroslav Lichvar


Re: [ntp:questions] Problem facing with Ntp client Configuration

2014-04-01 Thread Miroslav Lichvar
On Fri, Mar 28, 2014 at 10:09:25AM -0400, Brian Utterback wrote:
> Two is never fine, and not just because of clock hopping. Like the
> old adage, a man with a watch knows what time it is, a man with two
> watches is never sure, NTP will often refuse to set the time with
> just two upstream sources if the two sources do not agree and the
> dispersion intervals do not overlap.

I think we had this discussion before. I wouldn't say that two is never
fine. I think two is much better than one if you need to be able to
tell when there is a problem and don't need to recover automatically.
Clock hopping shouldn't be a problem since source combining was
implemented.

> That means that two servers can
> agree on the time to within a millisecond of each other, but is the
> dispersion is less than a half of a millisecond, NTP will not set
> the clock by either of them.

Well, at least one of the servers is a falseticker if their intervals
don't overlap and it should be fixed to not lie about its dispersion.
Adding a third source just to hide this problem doesn't seem right.

> But I would like to point out something to you. You often remind us
> that NTP only uses one in eight data points. But each server you add
> means one more data point used, which means that if eight servers
> were used then NTP would be using the same number of data points
> that it would be if it only had one server and used all the data
> points. 

Please note that the data points are not equal. The point which is
used to update the clock has the shortest distance and may carry more
useful information than the other points combined if the clock is
stable enough.
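
A much simplified sketch of that selection (not the actual clock filter,
which also ages the samples by dispersion; the class and its names are
invented for this example):

```python
from collections import deque


class ClockFilter:
    """Keep the last 8 samples and pick the one with the shortest delay."""

    def __init__(self, stages=8):
        self.samples = deque(maxlen=stages)

    def add(self, offset, delay):
        self.samples.append((offset, delay))

    def best(self):
        return min(self.samples, key=lambda s: s[1])


f = ClockFilter()
f.add(0.0010, 0.020)            # one good sample on an idle link
for _ in range(7):
    f.add(0.0500, 0.120)        # seven samples taken under load
print(f.best())                 # the good sample is still selected

f.add(0.0500, 0.120)            # an eighth loaded sample pushes it out
print(f.best())
```

With a fixed number of stages the good sample is eventually shifted out,
no matter how much better it was than the ones that follow.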

-- 
Miroslav Lichvar


Re: [ntp:questions] Three NTP servers, one strange IP-address in 'refid'

2014-04-02 Thread Miroslav Lichvar
On Wed, Apr 02, 2014 at 09:48:26AM +0200, Sander Smeenk wrote:
> Quoting E-Mail Sent to this address will be added to the BlackLists 
> (Null@BlackList.Anitech-Systems.invalid):
> > I guess it could also be a IPv6 ref mangling issue?
> 
> That could well be. We use IPv6 where we can.
> But that would constitute this refid issue a bug.
> One that is rather confusing and time-consuming.

For IPv6 addresses the refid is defined as the first 4 bytes of the MD5
sum of the address. With 2001:7b8:3:32:213:136:0:252 (tt52.ripe.net)
that is 0xac023551, or 172.2.53.81 in dotted-quad notation.
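
A sketch of the calculation in Python, assuming the hash is taken over the
16-byte binary form of the address in network byte order (that detail is my
assumption; the function names are made up). If the assumption matches what
ntpd does, the output should agree with the value above.

```python
import hashlib
import ipaddress


def ipv6_refid(address):
    """First 4 bytes of the MD5 digest of the packed IPv6 address."""
    packed = ipaddress.IPv6Address(address).packed  # 16 bytes
    return hashlib.md5(packed).digest()[:4]


def dotted_quad(refid):
    """Format 4 refid bytes the way an IPv4 address is printed."""
    return ".".join(str(b) for b in refid)


refid = ipv6_refid("2001:7b8:3:32:213:136:0:252")
print(refid.hex(), dotted_quad(refid))
```

Printing the hex form next to the dotted quad makes it obvious that the
"address" is really a hash.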

-- 
Miroslav Lichvar


Re: [ntp:questions] Three NTP servers, one strange IP-address in 'refid'

2014-04-07 Thread Miroslav Lichvar
On Mon, Apr 07, 2014 at 09:28:09AM +0200, Martin Burnicki wrote:
> Rob wrote:
> >When the NTP server puts an IPv6 hash in the refid field, it could set
> >the upper 4 bits to 1.  (so the hex value starts with F)
> >A valid IPv4 address never has that, so ntpq could print it in hex in
> >this case, and as a dotted quad in other cases.
> >
> >This also guarantees a hashed IPv6 can never collide with a valid IPv4
> >refid.  But at the same time, it shrinks the space of IPv6 hashes,
> >increasing the chance of a hash collision between two IPv6 addresses.
> 
> In my opinion this sounds reasonable. The danger of collision might
> be slightly higher (less with IPv4, a little bit more with another
> IPv6 hash), but for users it would avoid confusing IPv4 addresses
> with IPv6 hashes.

If I'm not mistaken, the main purpose of the refid value is detection
of synchronization loops. To not break that, all NTP servers would
have to update their refid definition at the same time. That's not
doable. Fixing the tools to print the value in hex instead of dotted
quads to avoid confusion seems like a better fix to me.

-- 
Miroslav Lichvar


Re: [ntp:questions] Handle ntp conf modification when ntp is already running

2014-04-11 Thread Miroslav Lichvar
On Thu, Apr 10, 2014 at 08:20:59PM +0000, Harlan Stenn wrote:
> Rob writes:
> > Furthermore, the "simple solution" of having SIGHUP perform an exec
> > of the same binary, thus in fact restarting the entire process and
> > losing all state information, is not the only possible solution.
> 
> If the current process has chroot()ed, how do you re-exec?  How do you
> handle the things that are done before the chroot()?  Again, I haven't
> looked at the code to be sure, but I believe there are some things that
> will behave differently if they are attempted from the chroot() target.
> 
> Sure, one could have a top-level master process that simply waits for
> the chroot()ed subprocess to die and then restarts it, but we're
> starting to get in to a lot of wheel-reinventing here, and would this
> really be worth the overhead on a program that is already larger and
> more complicated than many folks want?

That sounds like a horrible hack.

Even without chroot it will be difficult. If the ntpd process dropped
root privileges after start, it won't be able to re-exec and it may
not have permissions to open newly added refclocks or reread the keys,
for instance.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] NTP.log interpretation

2014-04-18 Thread Miroslav Lichvar
On Fri, Apr 18, 2014 at 09:01:09AM -0500, GregL wrote:
> >   What you should do is to add more servers to the config.
> 
> What about the idea of going to only one entry, but that entry is served by
> a DNS load balancer to choose one of two internal time servers to check.
>  Each of those, is configured to point at a pool of time servers (4 each).

Well, that will prevent the client from detecting it's getting wrong
time. Is that what you want?

From the log it seems that at least one server is completely wrong;
the offset between the two servers is around 3 seconds! I'd suggest
fixing that first.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP.log interpretation

2014-04-18 Thread Miroslav Lichvar
On Fri, Apr 18, 2014 at 10:38:10AM -0500, GregL wrote:
> But, was the "sychronization lost" message *because* ntp saw the time
> difference so great on peer servers...and chose one to synch to...resulting
> in the time reset message?

It seems so. Not sure how close this is to the version you are
running, but in xntp3-5.93e (dated 1998) it seems the system peer is
unselected (and the message logged) on every clock step.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Precision changed after upgrade from ntp 4.2.4p4 to 4.2.6p2

2014-05-05 Thread Miroslav Lichvar
On Sun, May 04, 2014 at 08:29:26PM +0100, Caecilius wrote:
> After upgrading ntp from 4.2.4p4 to 4.2.6p2 as part of a Linux upgrade
> from Debian Lenny to Squeeze, I've noticed that the precision variable
> has changed from -20 to -22. So it appears that my clock has now got a
> better precision. But the hardware is unchanged, and I'm running the
> same kernel.
> 
> I thought the precision was dependent on the "granularity" of the
> system clock, which I would have expected to be independent of the ntp
> version and any other userland code.  Am I misunderstanding something
> perhaps?

The older ntpd is probably using gettimeofday() which has microsecond
resolution (-20 in the log scale) and not the nanosecond
clock_gettime().
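
The mapping between a clock resolution and the reported value is just a
base-2 logarithm. A quick sketch (ntpd measures its precision at startup
rather than computing it from a nominal resolution, so this only shows the
scale; the function name is made up):

```python
import math


def precision_exponent(resolution_s):
    """Nearest power-of-two exponent for a clock resolution in seconds."""
    return round(math.log2(resolution_s))


# 1 microsecond, the resolution of gettimeofday()
print(precision_exponent(1e-6))   # -20

# what a reported precision of -22 corresponds to, in nanoseconds
print(2.0 ** -22 * 1e9)           # about 238 ns
```

So -22 means the measured reading granularity is around a quarter of a
microsecond, which is only possible with a clock source finer than
gettimeofday().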

-- 
Miroslav Lichvar


Re: [ntp:questions] Fwd: Re: Best ways to get the reference times from ntp

2014-05-12 Thread Miroslav Lichvar
On Wed, May 07, 2014 at 07:40:28PM +, William Unruh wrote:
> On 2014-05-07, mike cook  wrote:
> > On 7 May 2014 at 18:32, William Unruh wrote:
> >> The short answer is no, ntpd cannot play this game. You are trying to
> >> use A to discipline not only B but C as well but on machine B.
> >
> > My reading is that C is not being disciplined at all, but is to be used as a
> > reference (though non UTC) for B.
> 
> That is my reading as well. But something must be done to determine
> those values of x and y (Ctime= xT+y where T is UTC). Either that can be
> done on C using something like chrony (better) or ntpd, or B could run
> something to determine x and y for C and use those to help discipline B. 

The OP said the frequency offset of C is known, so only y is unknown
if I'm reading it right. But he also said that A and C are in the same
network, so I'm not sure if the frequency of C can be transferred to B
with better accuracy than the frequency of A, or if this idea of using
A to estimate the offset and C to estimate the frequency can give
better results than simply increasing the polling rate of A.

I think support for frequency-only sources wouldn't be very difficult
to add to chrony. A new selection option could make a source bypass
the selection algorithm and just combine its frequency with the other
sources, weighted by estimated skew. This could work with both NTP
sources and reference clocks.
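
As a rough sketch of what "combine by estimated skew" could mean, here
is an inverse-variance weighted mean of frequency estimates. The
weighting (1/skew^2), the function name and the numbers are my
assumptions for illustration; this is not chrony's actual code.

```python
def combine_frequencies(sources):
    """Weighted mean of (frequency_ppm, skew_ppm) pairs.

    Each source is weighted by 1 / skew**2, so a source with a smaller
    estimated skew (a more certain frequency) dominates the result.
    Hypothetical helper, assumed weighting.
    """
    weights = [1.0 / (skew * skew) for _, skew in sources]
    total = sum(weights)
    return sum(w * freq for w, (freq, _) in zip(weights, sources)) / total


# A: frequency 10 ppm with skew 1 ppm, C: frequency 20 ppm with skew 2 ppm
print(combine_frequencies([(10.0, 1.0), (20.0, 2.0)]))
```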

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Servers in virtual machines

2014-06-23 Thread Miroslav Lichvar
On Mon, Jun 23, 2014 at 12:28:53PM +0100, David Woolley wrote:
> On 23/06/14 12:03, Rob Heemskerk wrote:
> >Could we say it is safe to run ntp servers on a virtualized platform or do 
> >we still need a few (4?) dedicated pieces of hardware to run our internal 
> >NTP servers?
> 
> No.
> 
> Normal virtualised machines are not intended for hard realtime applications.
> Also, the host clock can and should be disciplined using NTP, so there is a
> risk of double correction.

I think it all depends on the VM implementation and what clocksource
is used in the guest. If the guest is using tsc (i.e. its frequency is
independent of the host clock), it will need to run its own NTP
client. If the guest's clock is locked to the host's system clock,
there still may be a static offset between them and an NTP client
(possibly using the host as the NTP server) can be used to correct the
offset.
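
To see which clocksource a Linux guest is actually using, the kernel
exposes it in sysfs. A minimal sketch, assuming a Linux guest with
sysfs mounted at /sys (the function name is made up here):

```python
from pathlib import Path

# Standard sysfs location on Linux; the base is a parameter so the
# function can be pointed elsewhere.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def current_clocksource(base=CLOCKSOURCE_DIR):
    """Return the name of the active clocksource, or None if unavailable."""
    try:
        return (Path(base) / "current_clocksource").read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Prints e.g. "tsc" or "kvm-clock", or None on non-Linux systems
    print(current_clocksource())
```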

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Mon, Jun 23, 2014 at 11:45:16PM -0500, Mike S wrote:
> On 6/16/2014 6:05 AM, Jochen Bern wrote:
> 
> >There are four official slots - two primary, two secondary - over the
> >course of the year to insert leap seconds,
> 
> Those are only preferences. Leap seconds may be inserted at any month
> boundary.
> 
> "A positive or negative leap-second should be the last second of a UTC
> month, but first preference should be given to the end of December and June,
> and second preference to the end of March and September." - ITU-R TF.460-6

Sooner or later, not even 12 leap seconds per year will be enough to
keep UTC close to UT1. Hopefully they will be abolished long before
that.

Practically speaking, besides having to make more than two corrections
per year (which is not expected to happen in the next few decades),
could there be any reason to do it in months other than June and
December? Older ntpd versions (< 4.2.5p53) used to check the month
before setting the leap flag and I'm wondering if that check can still
be used to detect spurious leap seconds.
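
The kind of check I have in mind is trivial. This is a hypothetical
sketch, not the code from the older ntpd versions:

```python
def leap_month_plausible(month, allowed=(6, 12)):
    """Return True if a leap second announced for `month` looks plausible.

    Hypothetical sanity check: by default only June and December are
    accepted. Passing allowed=(3, 6, 9, 12) would also accept the
    second-preference months from ITU-R TF.460-6.
    """
    return month in allowed


print(leap_month_plausible(6))   # June
print(leap_month_plausible(3))   # March, rejected by default
```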

FWIW, the IERS announcements say "Leap seconds can be introduced in
UTC at the end of the months of December or June, depending on the
evolution of UT1-TAI."

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 12:08:10PM +0200, Jochen Bern wrote:
> I've browsed the results of the infamous poll and most of the people
> voting "abolish leap seconds" apparently didn't mean to actually
> *abolish* them (as in, decouple UT1 and UTC, or whatever their
> successors might be called), but to have them *rearranged* into fewer
> and larger leaps. Of course, one can imagine that to go the other way -
> i.e., smaller but more frequent leaps.

As someone who implemented support for leap seconds in several
applications, I'd really like to see them gone. Fixing all software
where time is critical to handle them correctly may not be possible
and from what I've heard a common solution is just to turn it off and
wait until it passes.

Making smaller but more frequent corrections would probably only make
it worse.

To me, it seems the reasonable thing to do would be to decouple UTC and
UT1 completely and make the adjustment at a higher level like
timezones if necessary. Countries adjust their timezones all the time,
we can handle that better.

> (Returning to your question as phrased, and circumstances as of today:
> IIUC the quality of prediction *would* already suffice to attempt
> scheduling leap seconds so as to aim for min-sum-of-squares, rather than
> predefined schedule slots.)

Good point. The question is if they will ever choose to do that.

Thanks,

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 03:46:15PM +0200, Jochen Bern wrote:
> While I may have started from the same setting, I *did* try to put
> myself into the shoes of astronomers and people operating satellite
> systems (which, ironically, includes the popular stratum 0 of GPS).

Do these people work just with UTC? I'd think it's not accurate enough
for their purposes and they need to include the current UTC-UT1
offset anyway.

> Personally, I'd say that if a computer's clock's best suited to run on
> TAI (or equivalent) and all data needs to be converted from it to $TZ
> for the users, anyway, then having it run on TAI and disseminating and
> handling a TAI-UTC delta along with the sync and timezone deltas seems
> like the proper approach. But that wish doesn't change gettimeofday()
> implementations all over the globe with a snap of my fingers, does it.

Agreed, but wouldn't switching to TAI everywhere be much more
difficult than to stop messing with UTC and keep it at a fixed offset
from TAI?

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-25 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 06:25:37PM +0200, Jochen Bern wrote:
> On -10.01.-28163 20:59, Miroslav Lichvar wrote:
> > Agreed, but wouldn't switching to TAI everywhere be much more
> > difficult than stopping messing with UTC and keep it a fixed offset
> > from TAI?
> 
> Having computer clocks run on UTC(frozen) instead of TAI makes the
> adaptation easier today, more difficult tomorrow ("do we *really* need
> to work on that for (n<3) seconds of an offset!?"), and no less
> necessary in the long run (when UT1-TAI has grown much larger than
> UT1-UTC(frozen), and changes much faster as well). I prefer to have the
> slope right where the ball needs to get rolling. ;-)

I was thinking about larger adjustments in the timezones, like 15, 30
or 60 minutes. They could be announced decades or centuries ahead, but
possibly they would be hidden in the noise of the political/religious
adjustments that are common today. By the time the first correction is
needed, maybe a global fixed timezone (or UTC directly) will already
be used everywhere and the position of the Sun observed at 12:00 will
be left to slowly revolve around the Earth.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-25 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 06:13:17PM -0500, Mike S wrote:
> On 6/24/2014 5:59 AM, Miroslav Lichvar wrote:
> >To me, it seems the reasonable thing to do would be to decouple UTC and
> >UT1 completely and make the adjustment at a higher level like
> >timezones if necessary.
> 
> You're doing it wrong. If you don't want leap seconds, use a timescale which
> doesn't have them (e.g. TAI, GPS). UTC was created to closely track Sol.
> Decoupling that breaks its purpose, and the promise made when it took over
> from GMT.

Yes, but to me it looks like redefining UTC to not track solar time
anymore is easier than converting everyone and everything to keep time
in TAI.

-- 
Miroslav Lichvar


Re: [ntp:questions] Thoughts on KOD

2014-07-08 Thread Miroslav Lichvar
On Mon, Jul 07, 2014 at 07:04:01PM +0200, Jan Ceuleers wrote:
> I'm not sure why sending the requester's timestamp back to him is better
> than an immutable timestamp.
> 
> The effect of the former is slow drift, the effect of the latter is (I
> suspect) no lock at all due to the lack of passage of time. So I think
> that the latter is more likely to catch the admin's eye. If there is an
> admin.

I think most clients check at least one of the stratum/leap fields
and don't use the time stamps from a KOD response to actually update
their clock.

If the KOD response was modified to set the leap and stratum bits as
synchronized, the client would drift slowly away, but ntpd would need
to stick to it and never send the client correct time.

I agree that purposely serving bad time might be the best way to
get the user's attention and get the NTP implementation fixed, if
it can be identified reliably and no innocent clients behind that IP
address are harmed.

The identification could be improved, for example by monitoring the
distribution of the client's polling interval, as simple clients use a
fixed interval, but I'm not sure if it's possible to make it so
reliable that ntpd could be allowed to send a response with purposely
bad time.

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-07-31 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 10:43:20AM +0200, Martin Burnicki wrote:
> Rob schrieb:
> >However, that is broken.  Not only do you probably not want to mark
> >that clock prefer (external references are often more accurate than the
> >serial NMEA time, for example), but also you may have two or more ATOM
> >PPS clocks, each with their own status, and there is no way to do that
> >with this method.
> 
> I've already proposed some times ago that another way of assigning PPS
> signal(s) to other time source(s) would be more versatile:
> http://lists.ntp.org/pipermail/questions/2009-April/022599.html
> http://lists.ntp.org/pipermail/questions/2009-April/022600.html
> 
> This would also provide a simple way to declare a PPS signal "reliable",
> e.g. if it is derived from a Rubidium or so, in which case it could continue
> to be accepted even though other time sources become unreachable.
> 
> On the other hand, if a PPS input signal is associated to a particular time
> source the PPS signal could be discarded if the associated time source
> becomes unreachable.

Agreed, it would be useful to have an option to specify the PPS->time
source association for each PPS refclock directly.

In chrony, this is done with the lock refclock option. It's typically
used like this:

refclock SHM 0 offset 0.5 refid SHM0 noselect
refclock PPS /dev/pps0 lock SHM0

The SHM refclock (e.g. GPS NMEA) is configured with the noselect
option so it's never selected and only used by the PPS refclock to
align the pulses to the SHM time. When SHM stops getting new samples
the PPS refclock will stop immediately too.

When the PPS refclock doesn't have the lock option and the local
stratum option is not used, the pulses will be accepted only when the
clock is synchronized, first to another refclock or NTP server and
then possibly the PPS refclock itself. If local stratum is enabled,
the PPS will work immediately without any other sources, but the clock
obviously needs to be already close to the correct time on start,
otherwise it will be off by a whole number of seconds.

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 04:31:08PM +0200, Martin Burnicki wrote:
> This sounds good. I think we'd have to distinguish some basic cases a few of
> which immediately come to my mind:
> 
> 1) A refclock provides absolute time, status, and a PPS signal
> 
> 1a) The refclock contains a good oscillator, so the PPS signal could be
> accepted for some time after the refclock started freewheeling.
> 
> 1b) The refclock only has a simple xtal which starts drifting immediately
> when the refclock starts freewheeling.
> 
> 
> 2) A good PPS signal is available, but no absolute time (e.g. in case of a
> Rubidium)
> 
> 2a) Some status information is available telling if the PPS signal is "good"
> or not
> 
> 2b) No information on the PPS quality is available

To generalize it a bit more, there could be also a case of a PPS that
is not locked in phase and a case of a PPS that's not even locked in
frequency. When only a source with poor short-term stability is
available, I think it would be pretty cool if it could be combined
with a PPS derived from a cheap TCXO. Doing this in ntpd could be
tricky however.

> Beside the implementation of such a flexible concept in ntpd it would have
> to be discussed how this can easily be configured. With NTP's basic
> configuration syntax in mind a possible way could be something like this:
> 
> # a refclock with PPS signal but no good oscillator
> server 127.127.8.0
> server 127.127.22.0 ref 127.127.8.0
> 
> # a refclock with PPS signal and good oscillator
> server 127.127.8.1
> server 127.127.22.1 ref 127.127.8.1 trust 3600
> 
> # a PPS source relying on the usual system peer to
> # provide absolute time
> server 127.127.22.2 ref sys_peer
> 
> # a PPS source which should be trusted always
> server 127.127.22.3 trust always

This looks good, but shouldn't it rather be specified with a fudge
command?

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 10:43:12PM +, Rob wrote:
> William Unruh  wrote:
> > I think you need to read up on the cmos clock. As I said, it reports
> > only the seconds, but is settable and "readable" to microseconds. 
> 
> The CMOS clock is running off a 32768Hz crystal, so no way it can be
> more accurately set than 30us.
> 
> Even it could be possible in theory to set and read it accurately to
> that value, apparently Linux does not do that.  That makes it questionable
> to me if it can be done.  I could understand when Windows would not
> exploit such a capability, when there is no monetary gain to be made.
> But the Linux developers are too proud and too nerdy to skip such an
> opportunity.

Well, the problem with reading or setting the RTC accurately is that
it takes up to 1 second, which is unacceptable for a system call. It
can't really be compared to the system clock, which can be read in a
few tens of nanoseconds; on Linux that usually doesn't even involve a
real system call.

> The fact that there is a microsecond-accurate API to set and read the
> clock does not indicate anything.  Remember Linux can run on any platform,
> and there may be other platforms, now or in the future, that can use
> this accuracy.

The RTC ioctls use only one-second resolution. AFAIK there is no API
that would allow reading or setting the RTC with better resolution;
you need to do it yourself by timing the ioctl call when setting the
clock and enabling the interrupt when reading the clock. When ntpd is
running, the kernel 11-minute update mode will time the RTC update to
within a few ticks, which is a few milliseconds with a 1000 Hz kernel.

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Fri, Aug 01, 2014 at 12:59:32PM +0200, Martin Burnicki wrote:
> Miroslav Lichvar wrote:
> >To generalize it a bit more, there could be also a case of a PPS that
> >is not locked in phase and a case of a PPS that's not even locked in
> >frequency. When only a source with poor short-term stability is
> >available, I think it would be pretty cool if it could be combined
> >with a PPS derived from a cheap TCXO. Doing this in ntpd could be
> >tricky however.
> 
> Hm, maybe I don't understand correctly what you mean.
> 
> You want to use a PPS signal without proper phase and frequency, and then
> use *in addition* another PPS derived from a TCXO?

I meant to use a PPS signal from an external undisciplined *XO to
stabilize the system clock. The driver would track the phase and
frequency offsets against other sources or the system clock over a
longer interval and use that to correct the samples before normal
processing.

I think this could be useful with jittery sources (e.g. NTP) or
reference clocks that don't have their own stabilized oscillator.

-- 
Miroslav Lichvar


Re: [ntp:questions] How to measure computer clock error using PPS?

2014-09-03 Thread Miroslav Lichvar
(hmm, it took over 3 months for this message to reach the list)

On Tue, May 27, 2014 at 03:03:47PM +0400, Vladislav Ross wrote:
> I have NTP server with Ublox LEA-6T GPS receiver. I want to determine my
> server's oscillator accuracy and stability. I've read about Allan deviation 
> and learnt how to make ADEV plot, but I don't fully understand how to
> use this method.
> 
> My question is: what method should I use to determine server clock
> accuracy and stability using PPS as reference? Can I use ntpd to collect
> data about clock error? As far as I understand ntpd will adjust the
> clock, but I need freely running clock.

You could use the "disable ntp" directive in ntp.conf to disable the
clock discipline, but I'm not sure if ntpd will keep logging PPS
offsets as the clock will be slowly drifting away.

If you are on Linux, you might find the following tool useful:

https://github.com/mlichvar/ppsallan

It can be used like this:

ppsallan -p adev.plot /sys/class/pps/pps0/assert

This will collect raw PPS timestamps from the file in /sys and while
it's running, a rough graph of Allan deviation will be shown in the
console. On exit it will save the x, y coordinates of the plot to the
file specified by -p and the plotallan script can be used to create a
nice graph with gnuplot.
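
If you'd rather compute it yourself from the collected timestamps, the
overlapping Allan deviation is easy to calculate from phase (time
error) samples. The following is a minimal sketch of the textbook
estimator, not the code used by ppsallan; the function name and the
sample data are made up for the example.

```python
import math


def allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0.

    phase: clock time errors in seconds, one sample every tau0 seconds
           (for PPS, the sub-second part of each pulse timestamp).
    """
    n = len(phase)
    if n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 phase samples")
    tau = m * tau0
    total = 0.0
    for i in range(n - 2 * m):
        # second difference of the phase over two intervals of length tau
        second_diff = phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]
        total += second_diff * second_diff
    return math.sqrt(total / (2.0 * (n - 2 * m) * tau * tau))


# Synthetic example: a clock with a constant frequency drift of 1e-9 per
# second, sampled once per second. A constant frequency offset alone
# would give zero; only frequency changes show up in the result.
drift = 1e-9
phase = [0.5 * drift * t * t for t in range(200)]
for m in (1, 4, 16):
    print(m, allan_deviation(phase, 1.0, m))
```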

-- 
Miroslav Lichvar


Re: [ntp:questions] Min & max poll no longer needed for SHM/GPSD driver?

2014-09-12 Thread Miroslav Lichvar
On Thu, Sep 11, 2014 at 07:37:10PM +0100, David Taylor wrote:
> It has been pointed out to me that this page:
> 
>   http://www.eecis.udel.edu/~mills/ntp/html/drivers/driver28.html
> 
> says: "The gpsd man page suggests setting minpoll and maxpoll to 4. That was
> an attempt to reduce jitter. The SHM driver was fixed (ntp-4.2.5p138) to
> collect data each second rather than once per polling interval so that
> suggestion is no longer reasonable"
> 
> So what should minpoll and maxpoll be set to for the GPSD shared memory
> driver?  Or should they be omitted?  I'm confused

Hm, that paragraph doesn't make much sense to me either as the default
refclock poll is 6.

When collecting samples each second and processing them in the median
filter, the output has a lower jitter, so it's better to use a shorter
poll if the goal is to get the best accuracy, not longer.
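
For illustration, the poll value is a power-of-two exponent in seconds,
and with one sample per second that is also the number of samples the
filter gets per poll. The filter below is a plain symmetric trimmed
mean that I'm using as a stand-in; it is an assumption for the example
and not ntpd's actual median filter code.

```python
def poll_interval(poll):
    """Polling interval in seconds for a poll exponent."""
    return 2 ** poll


def filtered_offset(samples, trim=0.2):
    """Symmetric trimmed mean of the samples collected in one poll.

    Drops the lowest and highest `trim` fraction of the sorted samples
    and averages the rest. Simplified stand-in, hypothetical helper.
    """
    ordered = sorted(samples)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


for poll in (3, 4, 6):
    print(poll, poll_interval(poll), "samples per poll at 1 sample/s")

# one outlier among five samples is discarded before averaging
print(filtered_offset([0.0, 1.0, 2.0, 3.0, 100.0]))
```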

On Linux when the GPS PPS signal has 1us jitter, poll 3 or 4 usually
works best for me.

FWIW, in clknetsim simulations with 1us jitter and 1ppb/s wander I get
these results:

poll  RMS time error (s)  RMS freq error (1)
3     6.0e-07             1.7e-08
4     1.5e-06             9.5e-09
5     4.3e-06             1.0e-08
6     1.1e-05             1.3e-08

With 10us jitter:

poll  RMS time error (s)  RMS freq error (1)
3     2.4e-06             1.6e-07
4     2.2e-06             6.7e-08
5     4.5e-06             2.7e-08
6     1.3e-05             1.7e-08

On other systems (using the standard PLL time constant shift) the best
poll would be even shorter.

-- 
Miroslav Lichvar


Re: [ntp:questions] Possible new attack?

2014-10-07 Thread Miroslav Lichvar
On Mon, Oct 06, 2014 at 06:49:58PM -0700, Evandro Menezes wrote:
> On Monday, October 6, 2014 6:50:09 PM UTC-5, William Unruh wrote:
> > Not only that but they are probably running ntp 3 systems, which does
> > not have KOD.
> 
> The suspects are purportedly NTPV4:
> 
> remote address          port local address  count m ver rstr avgint lstint
> wnpgmb1154w-a-b          123 192.168.a.b       18 3 45f8      6      0
> a-b.dyn.suddenlink.net 42324 192.168.a.b     1590 3 45f8     14      6

Out of curiosity, do you have a pcap file or tcpdump output you could
share?

I've been trying to fix widely used open source (S)NTP implementations
to not poll frequently and I'm wondering if this is a client I know.

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for "tickless" systems

2014-11-19 Thread Miroslav Lichvar
On Wed, Nov 19, 2014 at 10:09:42AM +, David Taylor wrote:
> In bug 2314, I reported that the jitter was always reported as 0 soon after
> NTP had started, and this was traced to the Linux in use on the Raspberry Pi
> being tickless.  Recompiling the kernel without the tickless option was a
> work-round, but is it possible to get jitter values with a tickless system?

There was a problem with clock stability in the tickless mode on idle
systems, which should be fixed or at least significantly improved in
3.17. I'm not sure how it could cause the jitter to be reported as
zero though.

Can you try 3.17 or later and see if it's fixed? Also, it would be
interesting to know if adding nohz=off to the kernel command line
instead of recompiling works as a workaround too.

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for "tickless" systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 07:27:47AM +, David Taylor wrote:
> On 19/11/2014 11:56, Miroslav Lichvar wrote:
> >Can you try 3.17 or later and see if it's fixed? Also, it would be
> >interesting to know if adding nohz=off to the kernel command line
> >instead of recompiling works as a workaround too.
> 
> I found the right file (thanks, Rob, yes there are more options as you say)
> and tried setting nohz=off but it made no difference - jitter still reported
> as zero.

Interesting. When you tested the kernel compiled without CONFIG_NO_HZ,
where ntpd reported non-zero jitter, was that the only difference
compared to the original kernel which reported zero jitter?

> How would I tell whether the nohz=off was actually accepted or not, i.e. how
> to determine whether the kernel is tickless or not?

I'm not sure if there is any reliable way to tell that from
user-space, besides parsing the kernel command line.
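
Parsing the command line could look like the sketch below (the function
name is made up). Note that it only tells you what was passed to the
kernel, not whether the kernel honored it.

```python
def nohz_setting(cmdline):
    """Return the value of the last nohz= parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("nohz="):
            value = token[len("nohz="):]
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(nohz_setting(f.read()))
    except OSError:
        print("no /proc/cmdline on this system")
```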

> pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
>   3:4351879   ARMCTRL  BCM2708 Timer Tick
> pi@raspberrypi ~ $ sleep 10
> pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
>   3:4353699   ARMCTRL  BCM2708 Timer Tick
> pi@raspberrypi ~ $
> 
> I don't know how to interpret the difference of 1820 in those two numbers.
> The first two commands were typed by hand, by the way, the third with an
> up-arrow recall.

That's between 100 and 250 Hz, so the kernel could be compiled with
CONFIG_HZ=100. Do you see that in the kernel config file? Does the
interrupt rate change significantly when you load the CPU, e.g. by
running "cat /dev/urandom > /dev/null" ?
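
The estimate is just the difference of the two counters divided by the
elapsed time. A small sketch using the two lines quoted above (helper
names are made up, and it assumes a single-CPU line with one counter);
since the commands were typed by hand, the real interval was somewhat
longer than 10 seconds and the real rate correspondingly lower.

```python
def interrupt_count(line):
    """Extract the counter from a single-CPU /proc/interrupts line."""
    return int(line.split(":", 1)[1].split()[0])


def interrupt_rate(before, after, seconds):
    """Average interrupts per second between two readings."""
    return (interrupt_count(after) - interrupt_count(before)) / seconds


first = "  3:4351879   ARMCTRL  BCM2708 Timer Tick"
second = "  3:4353699   ARMCTRL  BCM2708 Timer Tick"
print(interrupt_rate(first, second, 10))
```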

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for "tickless" systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
> Running the sleep 10 sequence from a command procedure gives a difference of
> 1055, so I guess that's 105.5 interrupts per second.  Does sound like 100
> Hz, yes.
> 
> Running the command while another terminal was running "cat /dev/urandom >
> /dev/null" resulted in 1063 interrupts, so 106.3 Hz.
> 
> Does that mean I'm tickless or not?

It seems it's not running in the tickless mode and the problem with
zero jitter is caused by something else.

Do you have PPS kernel discipline enabled in your ntpd config (flag3)
and which driver do you use? The PPS discipline is always disabled
when the Linux kernel is compiled with NO_HZ, so I think that could
explain what you are seeing. I'm not sure if that would be an ntpd bug
or kernel bug, but I can look into it.

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for "tickless" systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 12:02:06PM +0100, Miroslav Lichvar wrote:
> On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
> > Running the sleep 10 sequence from a command procedure gives a difference of
> > 1055, so I guess that's 105.5 interrupts per second.  Does sound like 100
> > Hz, yes.
> > 
> > Running the command while another terminal was running "cat /dev/urandom >
> > /dev/null" resulted in 1063 interrupts, so 106.3 Hz.
> > 
> > Does that mean I'm tickless or not?
> 
> It seems it's not running in the tickless mode and the problem with
> zero jitter is caused by something else.
> 
> Do you have PPS kernel discipline enabled in your ntpd config (flag3)
> and which driver do you use? The PPS discipline is always disabled
> when the Linux kernel is compiled with NO_HZ, so I think that could
> explain what you are seeing. I'm not sure if that would be an ntpd bug
> or kernel bug, but I can look into it.

After some debugging it seems the problem is that ntpd configured to
use the PPS kernel discipline enables it even when the kernel consumer
binding failed with the ENOTSUPP error (as would happen with a kernel
compiled with NO_HZ). ntpd thinks PPS is running and is using the PPS
stats for the clock jitter.

This was broken somewhere between ntp-4.2.4 and ntp-4.2.6. I've
attached a patch to the ntp bug #2314.

-- 
Miroslav Lichvar


Re: [ntp:questions] Number of Stratum 1 & Stratum 2 Peers

2014-12-04 Thread Miroslav Lichvar
On Thu, Dec 04, 2014 at 10:46:17AM -0500, brian utterback wrote:
> I remain unconvinced. I believe that it takes three correct servers to
> outvote a single falseticker, meaning that if you want to be safe
> against one of your servers becoming a falseticker and still being
> accepted as the system server by a client, the client needs at least
> four servers.

Four servers (or any larger number) still don't guarantee that the
source selection algorithm will mark one bad source as a falseticker.
There was a very similar discussion about this a few years ago,
including an example:

http://lists.ntp.org/pipermail/questions/2011-January/028313.html

> Now imagine that the falseticker has a similar overlap with T1, but on
> the interval T1off-T1disp to T1off. That interval does not include the
> real time, so F is indeed a falseticker. So, we have a completely
> symmetric situation, with T1 and F "voting" for an interval that does
> not include the real time and T1 and T2 "voting" for an interval that
> does include the real time. By what mechanism are we to presume that the
> client will choose the interval that includes the real time?

The intersection interval determined in the source selection algorithm
will be equal to the interval of T1 and all three servers will pass as
truechimers. Adding a third good server may not be enough to change
the result.
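
Here is a simplified sketch of the scenario. It is a Marzullo-style
intersection that looks only at the interval endpoints and then keeps
every source whose interval overlaps the result; it is not ntpd's
actual selection code, and the offsets and distances are made up so
that F and T2 each overlap one half of T1's interval.

```python
def intersection(sources):
    """Find the interval covered by at least m - f source intervals.

    sources: list of (name, offset, distance); each source claims the
    true time lies in [offset - distance, offset + distance].
    f (assumed falsetickers) starts at 0 and grows while f < m / 2.
    Returns (low, high) or None if no majority interval exists.
    """
    m = len(sources)
    edges = []
    for _, offset, dist in sources:
        edges.append((offset - dist, -1))  # lower endpoint
        edges.append((offset + dist, +1))  # upper endpoint
    edges.sort()
    f = 0
    while 2 * f < m:
        need = m - f
        low = high = None
        count = 0
        for edge, kind in edges:             # scan upwards for the low end
            count -= kind
            if count >= need:
                low = edge
                break
        count = 0
        for edge, kind in reversed(edges):   # scan downwards for the high end
            count += kind
            if count >= need:
                high = edge
                break
        if low is not None and high is not None and low < high:
            return low, high
        f += 1
    return None


def survivors(sources, low, high):
    """Names of sources whose interval overlaps [low, high]."""
    return [name for name, offset, dist in sources
            if offset + dist >= low and offset - dist <= high]


sources = [("T1", 0.0, 1.0), ("T2", 1.2, 1.1), ("F", -1.2, 1.1)]
low, high = intersection(sources)
print(low, high)                      # same as T1's own interval
print(survivors(sources, low, high))  # all three pass
```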

-- 
Miroslav Lichvar


Re: [ntp:questions] Red Hat vote for chrony

2014-12-08 Thread Miroslav Lichvar
On Mon, Dec 08, 2014 at 03:27:15AM +, William Unruh wrote:
> On 2014-12-07, Charles Swiger  wrote:
> > Yes, so chrony recommends using maxpoll=4 to the LAN, and not only to local 
> > refclocks.
> 
> No, read the chrony docs. the default is maxpoll 10 minpoll 5. 

The default minpoll is 6 and maxpoll 10, exactly the same as in ntpd.

> That the faq as an example uses 2 and 4 is I agree stupid. It is faq. I
> have no idea who wrote it.

I wrote it. What exactly is wrong with poll 4 on a LAN?

-- 
Miroslav Lichvar


Re: [ntp:questions] Red Hat vote for chrony

2014-12-08 Thread Miroslav Lichvar
On Sat, Dec 06, 2014 at 03:35:10PM -0500, Paul wrote:
> On Sat, Dec 6, 2014 at 11:12 AM, William Unruh  wrote:
> 
> > And in my tests 10 years ago or so, I used a local gps clock to test the
> > ability of chrony and ntpd to discipline a computer clock networked to
> > another server which was disciplined by a gps. Thus the network was the
> > same, and the difference was ntpd vs chrony.
> > chrony was better. Primarily, I think, because chrony responded more
> > quickly to drift rate changes due to temp changes.
> >
> 
> I looked at your data back in the day.  Even then I thought they were old.
> Of course if the secret sauce is loop constants (I haven't read the Chrony
> architecture document, maybe because there isn't one) then perhaps the
> results would still be the same.

The main part of the "secret sauce" is the variable number of points
used in the linear regression. When the clock frequency changes
quickly, only a small number of points will be used to get a better
estimate of the current frequency offset. I.e. chronyd adapts to
the Allan intercept without changing the polling interval. This
adaptation doesn't always work perfectly; the current code often
reduces the number of points unnecessarily, but there are some ideas
that will likely be implemented in the future to improve it.
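
To show why the number of points matters, here is a minimal sketch
with an ordinary least-squares fit over the last N samples. It is not
chronyd's code (which decides how many points to keep by itself); the
function name and the synthetic data are assumptions for the example.

```python
def frequency_offset(times, offsets, points):
    """Least-squares slope (s/s) over the last `points` samples."""
    t = times[-points:]
    x = offsets[-points:]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    num = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


# Synthetic data: one sample every 64 s; the clock runs 1 ppm off at
# first and then its frequency error suddenly changes to 5 ppm.
times = [64.0 * i for i in range(28)]
offsets = []
x = 0.0
for i in range(28):
    offsets.append(x)
    x += 64.0 * (1e-6 if i < 19 else 5e-6)

# Few points follow the new frequency, many points lag behind it
print(frequency_offset(times, offsets, 8))
print(frequency_offset(times, offsets, 28))
```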

Of course, a similar approach could be used with the NTP PLL/FLL loop.
If the time constant wasn't fixed to the polling interval and the FLL
part of the loop wasn't active only when the update interval is longer
than 2048 seconds, the performance could be improved significantly. I
was suggesting this years ago.

It would be nice if there was at least a tinker option to shift the
constant as needed, but a patch for that was rejected. So now the
Linux kernel uses a nonstandard PLL shift to get a better performance
with ntpd and current typical network jitters.

If you like the slow response of ntpd, chronyd can be configured with
the minsamples and maxsamples options to use a fixed number of points
and to some extent imitate ntpd. In my testing, when the number was
set to 40 the overall time and frequency errors were quite close to
ntpd (running on Linux).

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Red Hat vote for chrony

2014-12-10 Thread Miroslav Lichvar
On Tue, Dec 09, 2014 at 07:38:06PM +, William Unruh wrote:
> On 2014-12-09, Charles Swiger  wrote:
> > Well, yes.  You can get a PCI(e) card with a TCXO or OCXO and an
> > optional GPS module like the Beagle ClockCard or a SpectraCom TSync
> > for a few hundred bucks.
> >
> > That's quite a bit more than a $40 GPS puck, but these will also
> > freewheel for a lot longer before losing or gaining a second in
> > error: ~2 seconds/month if kept stable at 23C, I believe one said.
> 
> I suspect even the cheap ones can do that if kept stable at 23C. 
> (that is about 1PPM) And if you could put a fast thermal probe onto the
> crystal, you could probably do as well even in a fluctuating environment
> with an addition to ntpd/chrony to use the temp data to compensate the
> clock rate.

Here is an interesting test of temperature compensation on the
BeagleBone Black where the frequency stability improved almost by a
factor of 20.

http://blog.dan.drown.org/beaglebone-black-ntpgps-server-temperature-compensation/

See also the following post, apparently the BBB can use a separate HW
timer to timestamp PPS, reducing the error and jitter significantly.

http://blog.dan.drown.org/beaglebone-black-timer-capture-driver/

It looks like this could be a really nice and cheap NTP/PTP server.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP, GPSD & PPS

2014-12-10 Thread Miroslav Lichvar
On Tue, Dec 09, 2014 at 03:29:45PM +0100, Sander Smeenk wrote:
> I run a stratum 1 server which has a Garmin LVC 18x connected to its ttyS0.
> The GPS provides a PPS signal via serial and i use gpsd to provide the
> NMEA sentences and pulse data in shared memory to NTP.
> 
> This partly works. NTP syncs against the PPS signal but the NMEA signal
> is always marked as falseticker even though i managed to bring down the
> offset to -1.5µsec average by fudging the time a bit. The NMEA signal
> offset fluctuates a lot. From ~ -65µsec to ~ +75µsec.

> 1) Can i get a 'true PPS sync' with this setup?
> Eliminating gpsd so 'ntpq -p' shows 'oSHM(1)' instead of '*SHM(1)' ?

If it's a recent version of gpsd which uses kernel timestamps (check
for KPPS messages in "gpsd -D 4" output), you may already have a "true
PPS sync". The PPS and NMEA timestamps are paired in gpsd, so it's not
necessary to add the NMEA source to ntpd. To avoid problems with the
falseticker, you can remove the source from ntpd configuration or use
the noselect option to never use it and only monitor it.

-- 
Miroslav Lichvar

Re: [ntp:questions] NTP, GPSD & PPS

2014-12-10 Thread Miroslav Lichvar
On Wed, Dec 10, 2014 at 11:50:22AM +, David Taylor wrote:
> With -D 4 I get a list of devices ending with "PPS", but presumably that is
> not the same as "KPPS"?

In gpsd the PPS without K is the userspace timestamping. With kernel
timestamping the log looks like this:

gpsd:PROG: PPS edge: 1, cycle: 100 uSec, duration:  78 uSec @ 1418214654.07216
gpsd:INFO: PPS hooks called with accepted 1418214653.99223 offset 0.00777
gpsd:PROG: PPS edge accepted 1418214653.99223 offset 0.00777
gpsd:PROG: KPPS assert 1418214653.99223, sequence: 73 - clear  1418214654.20573, sequence: 73
gpsd:PROG: KPPS data: using clear
gpsd:PROG: KPPS cycle:  99 uSec, duration:  21 uSec @ 1418214654.20573

> I did try an apt-get first to update gpsd but it
> seems I have the most recent available.  It seems I have 3.6.  Do I need a
> development version or...?

The kernel PPS support was added in 3.0 or so, but gpsd needs to be
compiled with the timepps.h header, similarly to ntpd for the ATOM and
other drivers. Also, some gpsd versions had bugs in the PPS/KPPS
support and I'm not sure if 3.6 was good or bad. The latest version,
3.11, is working well for me; 3.10 was not.

-- 
Miroslav Lichvar


Re: [ntp:questions] pool.ntp.org and authentication

2014-12-16 Thread Miroslav Lichvar
On Tue, Dec 16, 2014 at 05:43:59AM +, Harlan Stenn wrote:
> d_anderson writes:
> > Thanks! I quickly skimmed through the document, and I think I am
> > asking the wrong questions..
> 
> I've been trying to think of good reasons to authenticate pool servers
> and I haven't come up with any good ones yet.

Protection against MITM attacks?

Of course, with a public pool like pool.ntp.org an attacker could join
it with a number of his NTP servers, get their certificates signed and
serve whatever he wants, no need for a MITM. Even if DNS was secure
and all clients were configured to use four pool servers, the pool DNS
server would not likely be able to prevent some clients getting three
bad servers outvoting the fourth server.
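
To get a feel for the numbers, here is a rough sketch of that "three of
four" case. It assumes each of the four servers is picked independently
and uniformly from the pool, and that a given fraction of the pool is
malicious; the real pool DNS selects servers by zone and weight, so this
is only an approximation. The function name is made up for the example.

```python
from math import comb


def p_outvoted(bad_fraction, servers=4, needed=3):
    """Probability that at least `needed` of `servers` picks are malicious."""
    return sum(
        comb(servers, k) * bad_fraction ** k * (1 - bad_fraction) ** (servers - k)
        for k in range(needed, servers + 1)
    )


for fraction in (0.01, 0.1, 0.3):
    print(f"{fraction:.0%} bad servers -> {p_outvoted(fraction):.6f}")
```

Even a small per-client probability matters here, because it is multiplied
by a very large number of clients.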

But I think it would still be a significant improvement in security.
The NTS draft says the scheme is not applicable to pools. I'm
wondering what would be needed to make it applicable.

-- 
Miroslav Lichvar


Re: [ntp:questions] pool.ntp.org and authentication

2014-12-16 Thread Miroslav Lichvar
On Tue, Dec 16, 2014 at 10:13:43AM +, Rob wrote:
> In the NTP pool the servers are only put in the DNS when the monitoring
> system considers the time returned from that server sufficiently reliable.
> But the server can easily separate the queries from the monitoring system
> from the queries by the clients they want to mislead, so it is trivial
> to keep the servers in the pool while returning wrong time to others.

Agreed. The assumption is that most servers in the pool are not doing
that and it's much less likely that a client gets three malicious
servers from the pool than someone on the network path to the internet
running a tool like this:

https://github.com/PentesterES/Delorean/

-- 
Miroslav Lichvar


Re: [ntp:questions] Default total number of servers NTP wants to have when using pool .....

2014-12-16 Thread Miroslav Lichvar
On Mon, Dec 08, 2014 at 01:47:33PM -0500, Paul wrote:
> On Mon, Dec 8, 2014 at 12:27 PM, David Taylor <
> david-tay...@blueyonder.co.uk.invalid> wrote:
> 
> > When using the pool directive, NTP tries to get a certain total number of
> > servers.  What is that number, please (I don't know where to find it in the
> > source code).  I'm seeing a total of 9 servers, with ten lines in the ntpq
> > -pn output, one line being the pool directive itself.  Is that correct and
> > expected?
> >
> 
> maxclock *maxclock* Specify the maximum number of servers retained by the
> server discovery schemes. The default is 10. See the Automatic Server
> Discovery <http://www.eecis.udel.edu/%7Emills/ntp/html/discover.html> page
> for further details.

Should the example of a simple client configuration on the ntp wiki [1]
be updated to include "tos maxclock 5", so that NTP traffic doesn't
increase as users and OS vendors switch to the pool directive? Or
should the default value of maxclock be changed?

[1] http://support.ntp.org/bin/view/Support/ConfiguringNTP#Section_6.10.

-- 
Miroslav Lichvar


Re: [ntp:questions] Default total number of servers NTP wants to have when using pool .....

2014-12-17 Thread Miroslav Lichvar
On Wed, Dec 17, 2014 at 12:04:04PM +, Harlan Stenn wrote:
> I'd love to see discussion about "what should the default number of
> servers queried be for the 'pool' directive?"

The "How do I use pool.ntp.org?" page [1] is pretty clear, quoting:

  Be friendly. Many servers are provided by volunteers, and almost
  all time servers are really file or mail or webservers which just
  happen to also run ntp. So don't use more than four time servers
  in your configuration, and don't play tricks with burst or
  minpoll - all you will gain is extra load on the volunteer time
  servers.

> There is clearly a tradeoff, and I'm inclined to say that "between 5 and
> 9" is probably a good number.

Ok, but examples of ntpd configuration using pool.ntp.org should follow
their policy. Maybe you can convince them to change it. Do you think
the servers are ready to handle twice as many clients?

[1] http://www.pool.ntp.org/en/use.html

-- 
Miroslav Lichvar


Re: [ntp:questions] Leap second to be introduced in June

2015-01-26 Thread Miroslav Lichvar
On Mon, Jan 26, 2015 at 01:03:48PM +0100, Terje Mathisen wrote:
> One of the good points about Google's smear is the fact that they use a half
> cosine to distribute the offset, which means that they have a time function
> which is both continuous and monotonic, as well as having an infinite number
> of defined derivatives, i.e. it is maximally smooth.

They could have chosen a better function though. If its second
derivative (wander) started at zero, the NTP clients could adjust
their polling interval if necessary before the error gets large and
the maximum error between the clients could be minimized.

Here is a test showing the error between two clients of a server
smearing a large offset. With the cosine function you can see a large
spike when smearing started.

https://mlichvar.fedorapeople.org/tmp/smear_cos.png
https://mlichvar.fedorapeople.org/tmp/smear_sinx.png

-- 
Miroslav Lichvar


Re: [ntp:questions] Leap second to be introduced in June

2015-01-27 Thread Miroslav Lichvar
On Mon, Jan 26, 2015 at 06:45:58PM +0100, Terje Mathisen wrote:
> Miroslav Lichvar wrote:
> >Here is a test showing error between two clients of a server
> >smearing a large offset. With the cosine function you can see a large
> >spike when smearing started.
> >
> >https://mlichvar.fedorapeople.org/tmp/smear_cos.png
> >https://mlichvar.fedorapeople.org/tmp/smear_sinx.png
> >
> This seems wrong!?!
> 
> First of all, you seem to extend the smearing over a million seconds or so?
> I.e. 10-15 days?

Yes.

> How large is the adjustment to be smeared out?

1 seconds. It was a test to see how useful is smearing when
bringing an isolated network back to UTC in a controlled manner.

> The google cosine approach starts with a derivative of zero and ends the same
> way, I really can't see how that leads to that huge (more than 128 ms!)
> spike at the start?

The frequency is changing too quickly at start (2nd derivative is at
the maximum) and the clients don't have a chance to shorten their polling
interval to better track the server.

The point is that there are better functions than cosine for this.
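
The difference can be sketched numerically. The half-cosine below is the
function discussed above; the second function, x - sin(2*pi*x)/(2*pi), is
only an assumed example of a smear whose second derivative starts at zero,
not necessarily the one used for the smear_sinx plot. The function names
and the demo values (1 second over 1e6 seconds) are illustrative too.

```python
import math


def cosine_smear(t, total, amount):
    """Half-cosine smear: returns (offset, frequency, wander) at time t."""
    x = math.pi * t / total
    offset = amount * (1 - math.cos(x)) / 2
    freq = amount * math.pi / (2 * total) * math.sin(x)
    wander = amount * math.pi ** 2 / (2 * total ** 2) * math.cos(x)
    return offset, freq, wander


def sine_ramp_smear(t, total, amount):
    """x - sin(2*pi*x)/(2*pi) smear: wander is zero at the start."""
    x = t / total
    offset = amount * (x - math.sin(2 * math.pi * x) / (2 * math.pi))
    freq = amount / total * (1 - math.cos(2 * math.pi * x))
    wander = amount / total ** 2 * 2 * math.pi * math.sin(2 * math.pi * x)
    return offset, freq, wander


total, amount = 1e6, 1.0  # hypothetical smear length (s) and offset (s)
for name, smear in (("cosine", cosine_smear), ("sine ramp", sine_ramp_smear)):
    _, freq, wander = smear(0.0, total, amount)
    print(f"{name}: frequency at start {freq:.3e}, wander at start {wander:.3e}")
```

Both functions start and end with zero frequency offset, but only the
cosine starts with its wander at the maximum. The price of the second
function is a higher peak frequency offset in the middle of the smear.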

-- 
Miroslav Lichvar


[ntp:questions] ntpd -x and leap seconds

2015-02-09 Thread Miroslav Lichvar
I was wondering what others think about handling leap seconds when
ntpd is running in the "slew only" mode (-x option).

The -x option disables the kernel discipline, so the kernel is not
told about pending leap seconds and it's up to ntpd to do whatever is
needed. Older ntpd versions (before 4.2.6) didn't handle leap seconds
in the daemon loop, so -x could be used to avoid the backward step in
the Unix time scale, which could upset applications running on the
system.

Support for leap seconds in the daemon loop was added in 4.2.6, and
ntpd now steps the clock by calling settimeofday() or clock_settime(),
even if the step threshold (set by -x or tinker step) is larger than
one second.

Should leap seconds be treated as a normal offset and not corrected
by a step when the threshold is larger than 1.0? Should there be a
separate option for them?

http://bugs.ntp.org/show_bug.cgi?id=2745

-- 
Miroslav Lichvar


Re: [ntp:questions] Authenticated TLS "constraints" in ntpd

2015-02-12 Thread Miroslav Lichvar
On Wed, Feb 11, 2015 at 02:29:54PM +0100, Terje Mathisen wrote:
> Jan Ceuleers wrote:
> >I'd like to draw this list's attention to an idea that Reyk Floeter
> >floated, namely to use TLS to help sanity-check NTP timestamps:
> >
> >http://marc.info/?l=openbsd-tech&m=142356166731390&w=2
> >
> Isn't public/private signed timestamps far better?

It surely is, but NTP currently doesn't have a suitable authentication
scheme for such use, does it?

My understanding is this will change when the new Network Time
Security (NTS) specification is implemented in NTP. Does anyone know
how far along it is? Is anyone working on it?

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP offset doesn't change.

2015-02-13 Thread Miroslav Lichvar
On Fri, Feb 13, 2015 at 05:42:54AM +, William Unruh wrote:
> On 2015-02-13, Paul  wrote:
> > On Thu, Feb 12, 2015 at 7:27 PM, William Unruh  wrote:
> >
> >> It was based on measurements I made with ntpd
> >
> > Are you assuming the numbers I provided are based on theory or were you
> > looking over my shoulder when I perturbed system time by two milliseconds
> > and watched it converge to O(10) microseconds in ~180 seconds.
> 
> OK, so we seem to have two different sets of experiments with very
> different results. Note that I did not erase the drift file, or restart
> ntpd after my perturbation. 

The difference probably comes from different ntp versions. In one of
the 4.2.7 versions the code was reworked to correct the initial offset
by adjtime() without touching the PLL. If the drift file contains an
accurate value, the PLL should now be able to lock pretty quickly.

There is an interesting problem with a larger step threshold, however
[1]. The code assumes the adjtime() correction can finish in 300
seconds. If the correction is so large that it can't finish before
the next clock update, it results in worse behavior than without this
feature.

On systems that use the standard adjtime() slew rate of 500 ppm the
maximum reliable correction is 150 ms; on systems with a faster slew
rate it's proportionally larger.
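
The 150 ms figure is just the slew rate multiplied by the 300-second
window. A small sketch of the arithmetic (the function names are made up
for this example):

```python
def max_slew_correction(slew_rate_ppm, window_s=300.0):
    """Largest offset (seconds) adjtime() can remove within window_s."""
    return slew_rate_ppm * 1e-6 * window_s


def slew_duration(offset_s, slew_rate_ppm):
    """Time (seconds) needed to slew away offset_s at the given rate."""
    return abs(offset_s) / (slew_rate_ppm * 1e-6)


print(f"500 ppm: {max_slew_correction(500) * 1e3:.0f} ms in 300 s")
print(f"1 s at 500 ppm takes {slew_duration(1.0, 500):.0f} s")
```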

[1] https://bugs.ntp.org/show_bug.cgi?id=2021

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Sun, Feb 15, 2015 at 10:40:11PM +, Rob wrote:
> However, it does not reply to NTP requests from other systems with ntpd.
> (I can confirm that in a network trace)

> Is there a magic command that has to be in the config to make it work
> as a server?

No, your configuration looks good. Any chance there is a forgotten
firewall rule blocking NTP or that clients are actually using IPv6?

Is chronyd listening on the port?

# netstat -a -n -p | grep 123
udp        0      0 0.0.0.0:123             0.0.0.0:*                           29615/chronyd
udp6       0      0 :::123                  :::*                                29615/chronyd

> Configuration:
> 
> driftfile   /var/lib/ntp/ntp.drift
> logdir  /var/log/ntpstats
> log statistics measurements tracking tempcomp
> local stratum   10
> makestep10 3
> refclockPPS /dev/pps0
> server 192.168.42.1 iburst
> server 192.168.42.60iburst
> server 192.168.42.61iburst
> allow   0/0
> cmdallow    192.168.42.0/24

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 09:59:27AM +, Rob wrote:
> I have strace'd the daemon and I see that it does receive the datagram
> from the socket, but it does not send a reply.

Hm, interesting. Can you post what follows that recvmsg() call?

You could try running it with -d -d (after it's compiled with
--enable-debug) and see if there are any debug messages indicating why
it's dropping the client request. If there aren't any, you could try
it with chrony-2.0-pre1 and see if it's different.

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 11:29:31AM +0100, Miroslav Lichvar wrote:
> On Mon, Feb 16, 2015 at 09:59:27AM +, Rob wrote:
> > I have strace'd the daemon and I see that it does receive the datagram
> > from the socket, but it does not send a reply.
> 
> Hm, interesting. Can you post what follows that recvmsg() call?
> 
> You could try running it with -d -d (after it's compiled with
> --enable-debug) and see if there are any debug messages indicating why
> it's dropping the client request. If there aren't any, you could try
> it with chrony-2.0-pre1 and see if it's different.

BTW, could it be that the client is one of the servers configured in
chrony.conf? The client request from the configured server would be
dropped as an invalid reply to chrony's own client request. This bug
was in 1.30 and 1.31, it should be fixed in 2.0-pre1.

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 12:56:27PM +, David Lord wrote:
> I've just fetched chrony-2.0-pre1. It seemed to compile and
> install ok on NetBSD-6/i386. The client IS one of the servers
> configured in chrony.conf and it behaved same as with 1.31.

I didn't know this was such a common configuration.

As a workaround you can add "acquisitionport 123" to chrony.conf to
use just one socket for all (client, peer, server) communication,
which will effectively disable the check in which the server's request
is failing.

> That was it, as restart after the client had been removed from
> chrony.conf the client picked up a reply from chrony. So that
> bug still needs fixing.

I'm not sure what's wrong, it seems to be working for me with
2.0-pre1.

ntp.conf (for ntp-4.2.6p5) on host 1:

server 192.168.100.2 minpoll 3 maxpoll 3
driftfile /var/lib/ntp/drift

chrony.conf on host 2:

pool 2.pool.ntp.org iburst
server 192.168.100.1 minpoll 3 maxpoll 3
driftfile /var/lib/chrony/drift
allow 0/0

# /usr/sbin/chronyd -d -d
...
2015-02-16T14:16:27Z ntp_core.c:906:(transmit_timeout) Transmit timeout for [192.168.100.1:123]
(this is chrony sending its client request)
2015-02-16T14:16:27Z ntp_io.c:679:(send_packet) Sent 48 bytes to 192.168.100.1:123 from [UNSPEC] fd 6
(receiving reply from ntpd)
2015-02-16T14:16:27Z ntp_io.c:562:(read_from_socket) Received 48 bytes from 192.168.100.1:123 to 192.168.100.2 fd 6
...
(discarding it for synchronization loop testD=0)
2015-02-16T14:16:27Z ntp_core.c:1287:(receive_packet) test123=111 test567=111 testABCD=1110 kod_rate=0 valid=1 good=0

(this is ntpd's request)
2015-02-16T14:16:33Z ntp_io.c:562:(read_from_socket) Received 48 bytes from 192.168.100.1:123 to 192.168.100.2 fd 5
(and chrony sending reply)
2015-02-16T14:16:33Z ntp_io.c:679:(send_packet) Sent 48 bytes to 192.168.100.1:123 from 192.168.100.2 fd 5

# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.168.100.2   176.9.1.148      4 u    6    8  377    0.143    0.044   0.055


If you compile chrony with --enable-debug, do you see similar Received
and Sent message pairs in the chronyd -d -d output?

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 02:00:30PM +, Rob wrote:
> Is chronyc of 1.31 compatible with chronyd 2.0?

Yes, old configuration should still work. But you can use
"acquisitionport 123" as a workaround if you prefer the stable version.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 03:30:52PM +, Rob wrote:
> Miroslav Lichvar  wrote:
> > On Mon, Feb 16, 2015 at 02:00:30PM +, Rob wrote:
> >> Is chronyc of 1.31 compatible with chronyd 2.0?
> >
> > Yes, old configuration should still work. But you can use
> > "acquisitionport 123" as a workaround if you prefer stable version.
> 
> Well I tried that before and it did not solve that issue.

Hm, you are right. I tried it again and it seems this works only with
1.30 and not 1.31.

> What I mean is can I manage a mix of 1.31 and 2.0 servers from a single
> system with one version of chronyc.

Yes, that should be compatible. The cmdmon protocol was just extended
(with one command - runtime makestep configuration) between 1.31 and
2.0. With 2.0 chronyc you can do everything 1.31 chronyc does, with
1.31 chronyc you can do everything except that one command.

For 2.0, you will need to add "bindcmdaddress 0.0.0.0" to chrony.conf,
as it binds to the loopback interface by default now.

> It would be nice when chronyd could be contacted using ntpq with at
> least the -p and the -c rv commands.  Then the monitoring system does
> not need to know what kind of ntp daemon is running on the servers.

It would make the monitoring easier, but chronyd has different
internal variables so it would have to be an emulation even if only
the -p and -c rv commands were supported.

From the security point of view, I'd prefer to not have any support
for the private/control modes of NTP. The chrony protocol runs on a
separate port and the access can be tightly controlled, independently
from NTP access.

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 03:51:07PM +, David Lord wrote:
> Miroslav Lichvar wrote:
> >As a workaround you can add "acquisitionport 123" to chrony.conf to
> >use just one socket for all (client, peer, server) communication,
> >which will effectively disable the check in which the server's request
> >is failing.
> 
> Done and ready for next restart.

Apparently, that workaround is not usable with 1.31, sorry for the
noise.

> >>That was it, as restart after the client had been removed from
> >>chrony.conf the client picked up a reply from chrony. So that
> >>bug still needs fixing.
> >
> >I'm not sure what's wrong, it seems to be working for me with
> >2.0-pre1.
> 
> Nothing wrong, it started working ok after I had removed that
> client from the config file.

I meant with 2.0-pre1 the clients should be getting responses even if
they are configured as servers in chrony.conf with otherwise standard
configuration. It seems to work for me. If it doesn't for you, can you
please post your chronyd -d -d output?

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] chrony as a server

2015-02-16 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 03:12:27PM +0100, Terje Mathisen wrote:
> William Unruh wrote:
> >I think, but am not sure, that the biggest problem with porting chrony
> >to windows is that windows does not have a good way of having the kernel
> >discipline the clock-- the equivalent of adjtimex on Linux.
> 
> If this is the biggest problem, then it would already be running there!

There is also the part with porting all the code to Win32/Cygwin :).

> GetSystemTimeAdjustment()
> SetSystemTimeAdjustment()
> 
> The only "hard" part is that you have to manually convert the adjustment
> rate to an absolute value:
> 
> Call Get* to retrieve the amount the system clock is incremented by on each
> timer tick/basic clock interval, then scale this value by the adjustment
> rate, i.e. to add 5.6ppm you would take the base value and multiply by
> 1.0000056.

With what resolution can the frequency be controlled? I'm not sure if I
remember correctly, but I thought it was rather bad and would require
dithering. Looking at nt_clockstuff.c in the ntp distribution, it
certainly doesn't look easy.
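
A rough sketch of why dithering might be needed. It assumes the values
passed to SetSystemTimeAdjustment() are integers in 100 ns units and that
the time increment is 156250 (15.625 ms); both are assumptions to check
against the system in question, and the function names are made up. With
those numbers one unit of adjustment is 6.4 ppm, so a 5.6 ppm correction
can only be approximated by alternating between two integer values.

```python
def adjustment_for_ppm(increment, ppm):
    """Ideal (fractional) adjustment value for a frequency offset in ppm."""
    return increment * (1 + ppm * 1e-6)


def resolution_ppm(increment):
    """Frequency change caused by one unit of the integer adjustment."""
    return 1e6 / increment


def dither(increment, ppm, ticks):
    """Integer adjustments whose running average approximates the ideal value."""
    ideal = adjustment_for_ppm(increment, ppm)
    total = 0.0    # ideal accumulated adjustment
    applied = 0    # integer adjustment applied so far
    values = []
    for _ in range(ticks):
        total += ideal
        value = round(total - applied)
        applied += value
        values.append(value)
    return values


increment = 156250  # assumed time increment in 100 ns units
print(f"resolution: {resolution_ppm(increment):.1f} ppm per unit")
print(f"ideal value for 5.6 ppm: {adjustment_for_ppm(increment, 5.6):.3f}")
print(dither(increment, 5.6, 8))
```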

-- 
Miroslav Lichvar


Re: [ntp:questions] chrony as a server

2015-02-17 Thread Miroslav Lichvar
On Mon, Feb 16, 2015 at 07:19:39PM +, Rob wrote:
> The PPS refclock has changed its refid from PPP0 to PPP1 with this version.

That is a bug; the refid numbering wasn't supposed to change in the
new version. Fixed in git. Thanks.

-- 
Miroslav Lichvar

