Re: [ntp:questions] Red Hat vote for chrony

2014-12-08 Thread Miroslav Lichvar
On Sat, Dec 06, 2014 at 03:35:10PM -0500, Paul wrote:
 On Sat, Dec 6, 2014 at 11:12 AM, William Unruh un...@invalid.ca wrote:
 
  And in my tests 10 years ago or so, I used a local gps clock to test the
  ability of chrony and ntpd to discipline a computer clock networked to
  another server which was disciplined by a gps. Thus the network was the
  same, and the difference was ntpd vs chrony.
  chrony was better. Primarily, I think, because chrony responded more
  quickly to drift rate changes due to temp changes.
 
 
 I looked at your data back in the day.  Even then I thought they were old.
 Of course if the secret sauce is loop constants (I haven't read the Chrony
 architecture document, maybe because there isn't one) then perhaps the
 results would still be the same.

The main part of the secret sauce is the variable number of points
used in the linear regression. When the clock frequency changes
quickly, only a small number of points will be used to get a better
estimate of the current frequency offset. I.e. chronyd adapts to
the Allan intercept without changing the polling interval. This
adaptation doesn't always work perfectly; the current code often
reduces the number of points unnecessarily, but there are some ideas
that will likely be implemented in the future to improve it.
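
As a rough illustration of why the window length matters (a minimal sketch, not chronyd's actual code; the function name and the synthetic data are made up for this example), a least-squares fit over only the most recent points follows a frequency change much faster than a fit over the whole history:

```python
def freq_estimate(times, offsets, npoints):
    """Least-squares slope (frequency offset) over the last npoints samples."""
    t = times[-npoints:]
    y = offsets[-npoints:]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den

# Synthetic data (assumption): one sample every 16 s, clock 1 ppm fast,
# then 5 ppm fast after the 25th sample, e.g. due to a temperature change.
times = [16.0 * i for i in range(30)]
knee = times[24]
offsets = [1e-6 * t if t <= knee else 1e-6 * knee + 5e-6 * (t - knee)
           for t in times]

short = freq_estimate(times, offsets, 4)   # only points after the change
full = freq_estimate(times, offsets, 30)   # all points, old and new
print(f"4 points: {short * 1e6:.2f} ppm, 30 points: {full * 1e6:.2f} ppm")
```

With noisy measurements the short window would of course give a noisier estimate, which is the trade-off the adaptation has to balance.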

Of course, a similar approach could be used with the NTP PLL/FLL loop.
If the time constant wasn't fixed to the polling interval and the FLL
part of the loop wasn't active only when the update interval is longer
than 2048 seconds, the performance could be improved significantly. I
suggested this years ago.

It would be nice if there was at least a tinker option to shift the
constant as needed, but a patch for that was rejected. So now the
Linux kernel uses a nonstandard PLL shift to get better performance
with ntpd and current typical network jitters.

If you like the slow response of ntpd, chronyd can be configured with
the minsamples and maxsamples options to use a fixed number of points
and to some extent imitate ntpd. In my testing, when the number was
set to 40 the overall time and frequency errors were quite close to
ntpd (running on Linux).

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Number of Stratum 1 Stratum 2 Peers

2014-12-04 Thread Miroslav Lichvar
On Thu, Dec 04, 2014 at 10:46:17AM -0500, brian utterback wrote:
 I remain unconvinced. I believe that it takes three correct servers to
 outvote a single falseticker, meaning that if you want to be safe
 against one of your servers becoming a falseticker and still being
 accepted as the system server by a client, the client needs at least
 four servers.

Four servers (or any larger number) still don't guarantee that the
source selection algorithm will mark one bad source as a falseticker.
There was a very similar discussion about this a few years ago,
including an example:

http://lists.ntp.org/pipermail/questions/2011-January/028313.html

 Now imagine that the falseticker has a similar overlap with T1, but on
 the interval T1off-T1disp to T1off. That interval does not include the
 real time, so F is indeed a falseticker. So, we have a completely
 symmetric situation, with T1 and F voting for an interval that does
 not include the real time and T1 and T2 voting for an interval that
 does include the real time. By what mechanism are we to presume that the
 client will choose the interval that includes the real time?

The intersection interval determined in the source selection algorithm
will be equal to the interval of T1 and all three servers will pass as
truechimers. Adding a third good server may not be enough to change
the result.
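
The situation can be reproduced with a simplified sketch of the intersection step, loosely following the algorithm described in RFC 5905. This is not ntpd's code, and the offsets and dispersions (in milliseconds) are invented to match the example: T2 overlaps the upper part of T1's interval and F the lower part.

```python
def intersection(sources):
    """sources: list of (offset, dispersion). Returns (low, high) or None."""
    n = len(sources)
    edges = []
    for off, disp in sources:
        # type -1 = lower endpoint, 0 = midpoint, +1 = upper endpoint
        edges += [(off - disp, -1), (off, 0), (off + disp, +1)]
    edges.sort()

    allow = 0                      # number of falsetickers tolerated
    while 2 * allow < n:
        found = 0                  # midpoints seen outside the interval
        low = high = None
        chime = 0
        for edge, kind in edges:                 # scan upwards
            chime -= kind
            if chime >= n - allow:
                low = edge
                break
            if kind == 0:
                found += 1
        chime = 0
        for edge, kind in reversed(edges):       # scan downwards
            chime += kind
            if chime >= n - allow:
                high = edge
                break
            if kind == 0:
                found += 1
        if found <= allow and low is not None and high is not None and low < high:
            return low, high
        allow += 1
    return None

# Hypothetical values: T1 at 0 +/- 100, T2 at +90 +/- 100, F at -90 +/- 100.
servers = {"T1": (0, 100), "T2": (90, 100), "F": (-90, 100)}
low, high = intersection(list(servers.values()))
truechimers = [name for name, (off, _) in servers.items() if low <= off <= high]
print((low, high), truechimers)
```

In this sketch the resulting interval is exactly T1's interval and the midpoints of both T2 and F fall inside it, so nothing is rejected.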

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for tickless systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 07:27:47AM +, David Taylor wrote:
 On 19/11/2014 11:56, Miroslav Lichvar wrote:
 Can you try 3.17 or later and see if it's fixed? Also, it would be
 interesting to know if adding nohz=off to the kernel command line
 instead of recompiling works as a workaround too.
 
 I found the right file (thanks, Rob, yes there are more options as you say)
 and tried setting nohz=off but it made no difference - jitter still reported
 as zero.

Interesting. When you tested the kernel compiled without CONFIG_NO_HZ,
where ntpd reported non-zero jitter, was that the only difference
compared to the original kernel which reported zero jitter?

 How would I tell whether the nohz=off was actually accepted or not, i.e. how
 to determine whether the kernel is tickless or not?

I'm not sure if there is any reliable way to tell that from
user-space, besides parsing the kernel command line.

 pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
   3:4351879   ARMCTRL  BCM2708 Timer Tick
 pi@raspberrypi ~ $ sleep 10
 pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
   3:4353699   ARMCTRL  BCM2708 Timer Tick
 pi@raspberrypi ~ $
 
 I don't know how to interpret the difference of 1820 in those two numbers.
 The first two commands were typed by hand, by the way, the third with an
 up-arrow recall.

That's between 100 and 250 Hz, so the kernel could be compiled with
CONFIG_HZ=100. Do you see that in the kernel config file? Does the
interrupt rate change significantly when you load the CPU, e.g. by
running cat /dev/urandom > /dev/null ?
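
For reference, the arithmetic behind that estimate, as a small sketch (the helper name is made up; the counter values are the ones quoted above, and the 10 seconds is only a lower bound on the elapsed time since the commands were typed by hand):

```python
def interrupt_rate(count_before, count_after, seconds):
    """Average timer interrupts per second between two counter readings."""
    return (count_after - count_before) / seconds

rate = interrupt_rate(4351879, 4353699, 10.0)
print(f"{rate:.1f} interrupts/s")
```

Any extra time spent typing makes the real rate lower than this figure, which is why it is still consistent with a 100 Hz tick.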

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for tickless systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
 Running the sleep 10 sequence from a command procedure gives a difference of
 1055, so I guess that's 105.5 interrupts per second.  Does sound like 100
 Hz, yes.
 
 Running the command while another terminal was running cat /dev/urandom >
 /dev/null resulted in 1063 interrupts, so 106.3 Hz.
 
 Does that mean I'm tickless or not?

It seems it's not running in the tickless mode and the problem with
zero jitter is caused by something else.

Do you have PPS kernel discipline enabled in your ntpd config (flag3)
and which driver do you use? The PPS discipline is always disabled
when the Linux kernel is compiled with NO_HZ, so I think that could
explain what you are seeing. I'm not sure if that would be an ntpd bug
or kernel bug, but I can look into it.

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for tickless systems

2014-11-20 Thread Miroslav Lichvar
On Thu, Nov 20, 2014 at 12:02:06PM +0100, Miroslav Lichvar wrote:
 On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
  Running the sleep 10 sequence from a command procedure gives a difference of
  1055, so I guess that's 105.5 interrupts per second.  Does sound like 100
  Hz, yes.
  
  Running the command while another terminal was running cat /dev/urandom >
  /dev/null resulted in 1063 interrupts, so 106.3 Hz.
  
  Does that mean I'm tickless or not?
 
 It seems it's not running in the tickless mode and the problem with
 zero jitter is caused by something else.
 
 Do you have PPS kernel discipline enabled in your ntpd config (flag3)
 and which driver do you use? The PPS discipline is always disabled
 when the Linux kernel is compiled with NO_HZ, so I think that could
 explain what you are seeing. I'm not sure if that would be an ntpd bug
 or kernel bug, but I can look into it.

After some debugging it seems the problem is that ntpd configured to
use the PPS kernel discipline enables it even when the kernel consumer
binding failed with the ENOTSUPP error (as would happen with a kernel
compiled with NO_HZ). ntpd thinks PPS is running and is using the PPS
stats for the clock jitter.

This was broken somewhere between ntp-4.2.4 and ntp-4.2.6. I've
attached a patch to the ntp bug #2314.

-- 
Miroslav Lichvar


Re: [ntp:questions] Support for tickless systems

2014-11-19 Thread Miroslav Lichvar
On Wed, Nov 19, 2014 at 10:09:42AM +, David Taylor wrote:
 In bug 2314, I reported that the jitter was always reported as 0 soon after
 NTP had started, and this was traced to the Linux in use on the Raspberry Pi
 being tickless.  Recompiling the kernel without the tickless option was a
 work-round, but is it possible to get jitter values with a tickless system?

There was a problem with clock stability in the tickless mode on idle
systems, which should be fixed or at least significantly improved in
3.17. I'm not sure how it could cause the jitter to be reported as
zero though.

Can you try 3.17 or later and see if it's fixed? Also, it would be
interesting to know if adding nohz=off to the kernel command line
instead of recompiling works as a workaround too.

-- 
Miroslav Lichvar


Re: [ntp:questions] Possible new attack?

2014-10-07 Thread Miroslav Lichvar
On Mon, Oct 06, 2014 at 06:49:58PM -0700, Evandro Menezes wrote:
 On Monday, October 6, 2014 6:50:09 PM UTC-5, William Unruh wrote:
  Not only that but they are probably running ntp 3 systems, which does
  not have KOD.
 
 The suspects are purportedly NTPV4:
 
 remote address          port local address  count m ver rstr avgint lstint
 wnpgmb1154w-a-b          123 192.168.a.b       18 3   4  5f8      6      0
 a-b.dyn.suddenlink.net 42324 192.168.a.b     1590 3   4  5f8     14      6

Out of curiosity, do you have a pcap file or tcpdump output you could
share?

I've been trying to fix widely used open source (S)NTP implementations
to not poll frequently and I'm wondering if this is a client I know.

-- 
Miroslav Lichvar


Re: [ntp:questions] Min max poll no longer needed for SHM/GPSD driver?

2014-09-12 Thread Miroslav Lichvar
On Thu, Sep 11, 2014 at 07:37:10PM +0100, David Taylor wrote:
 It has been pointed out to me that this page:
 
   http://www.eecis.udel.edu/~mills/ntp/html/drivers/driver28.html
 
 says: The gpsd man page suggests setting minpoll and maxpoll to 4. That was
 an attempt to reduce jitter. The SHM driver was fixed (ntp-4.2.5p138) to
 collect data each second rather than once per polling interval so that
 suggestion is no longer reasonable
 
 So what should minpoll and maxpoll be set to for the GPSD shared memory
 driver?  Or should they be omitted?  I'm confused

Hm, that paragraph doesn't make much sense to me either as the default
refclock poll is 6.

When collecting samples each second and processing them in the median
filter, the output has a lower jitter, so it's better to use a shorter
poll if the goal is to get the best accuracy, not longer.

On Linux when the GPS PPS signal has 1us jitter, poll 3 or 4 usually
works best for me.
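
The poll values here are base-2 exponents of the interval in seconds, so the numbers discussed translate as follows (a trivial sketch; the helper name is made up):

```python
def poll_seconds(poll):
    """Convert an NTP poll exponent to an interval in seconds."""
    return 2 ** poll

for poll in (3, 4, 6):
    print(f"poll {poll}: {poll_seconds(poll)} s")
```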

FWIW, in clknetsim simulations with 1us jitter and 1ppb/s wander I get
these results:

poll    RMS time error (s)  RMS freq error (1)
3       6.0e-07             1.7e-08
4       1.5e-06             9.5e-09
5       4.3e-06             1.0e-08
6       1.1e-05             1.3e-08

With 10us jitter:

poll    RMS time error (s)  RMS freq error (1)
3       2.4e-06             1.6e-07
4       2.2e-06             6.7e-08
5       4.5e-06             2.7e-08
6       1.3e-05             1.7e-08

On other systems (using the standard PLL time constant shift) the best
poll would be even shorter.

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 04:31:08PM +0200, Martin Burnicki wrote:
 This sounds good. I think we'd have to distinguish some basic cases a few of
 which immediately come to my mind:
 
 1) A refclock provides absolute time, status, and a PPS signal
 
 1a) The refclock contains a good oscillator, so the PPS signal could be
 accepted for some time after the refclock started freewheeling.
 
 1b) The refclock only has a simple xtal which starts drifting immediately
 when the refclock starts freewheeling.
 
 
 2) A good PPS signal is available, but no absolute time (e.g. in case of a
 Rubidium)
 
 2a) Some status information is available telling if the PPS signal is good
 or not
 
 2b) No information on the PPS quality is available

To generalize it a bit more, there could be also a case of a PPS that
is not locked in phase and a case of a PPS that's not even locked in
frequency. When only a source with poor short-term stability is
available, I think it would be pretty cool if it could be combined
with a PPS derived from a cheap TCXO. Doing this in ntpd could be
tricky however.

 Beside the implementation of such a flexible concept in ntpd it would have
 to be discussed how this can easily be configured. With NTP's basic
 configuration syntax in mind a possible way could be something like this:
 
 # a refclock with PPS signal but no good oscillator
 server 127.127.8.0
 server 127.127.22.0 ref 127.127.8.0
 
 # a refclock with PPS signal and good oscillator
 server 127.127.8.1
 server 127.127.22.1 ref 127.127.8.1 trust 3600
 
 # a PPS source relying on the usual system peer to
 # provide absolute time
 server 127.127.22.2 ref sys_peer
 
 # a PPS source which should be trusted always
 server 127.127.22.3 trust always

This looks good, but shouldn't it rather be specified with a fudge
command?

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 10:43:12PM +, Rob wrote:
 William Unruh un...@invalid.ca wrote:
  I think you need to read up on the cmos clock. As I said, it reports
  only the seconds, but is settable and readable to microseconds. 
 
 The CMOS clock is running off a 32768Hz crystal, so no way it can be
 more accurately set than 30us.
 
 Even it could be possible in theory to set and read it accurately to
 that value, apparently Linux does not do that.  That makes it questionable
 to me if it can be done.  I could understand when Windows would not
 exploit such a capability, when there is no monetary gain to be made.
 But the Linux developers are too proud and too nerdy to skip such an
 opportunity.

Well, the problem with reading or setting the RTC accurately is that
it takes up to 1 second; for a system call that's unacceptable. It
can't really be compared to the system clock, which can be read in a few
tens of nanoseconds; on Linux it usually doesn't even involve a real
system call.

 The fact that there is a microsecond-accurate API to set and read the
 clock does not indicate anything.  Remember Linux can run on any platform,
 and there may be other platforms, now or in the future, that can use
 this accuracy.

The RTC ioctls use only second resolution. AFAIK there is no API that
would allow reading or setting the RTC with better resolution; you
need to do it yourself by timing the ioctl call when setting the clock
and enabling the interrupt when reading the clock. When ntpd is
running, the kernel 11-minute update mode will time the RTC update to
within a few ticks, which is a few milliseconds with a 1000 Hz kernel.
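
To put the numbers from this exchange side by side (plain arithmetic only; the constant names are made up for the example):

```python
RTC_CRYSTAL_HZ = 32768   # RTC crystal frequency mentioned above
KERNEL_HZ = 1000         # kernel tick rate used in the example above

rtc_period_us = 1e6 / RTC_CRYSTAL_HZ   # one crystal period in microseconds
tick_ms = 1e3 / KERNEL_HZ              # one kernel tick in milliseconds

print(f"RTC crystal period: {rtc_period_us:.1f} us")
print(f"kernel tick: {tick_ms:.1f} ms")
```

So the crystal period (about 30 us) is roughly the limit Rob refers to, while the tick-based timing of the 11-minute update is coarser by well over an order of magnitude.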

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-08-01 Thread Miroslav Lichvar
On Fri, Aug 01, 2014 at 12:59:32PM +0200, Martin Burnicki wrote:
 Miroslav Lichvar wrote:
 To generalize it a bit more, there could be also a case of a PPS that
 is not locked in phase and a case of a PPS that's not even locked in
 frequency. When only a source with poor short-term stability is
 available, I think it would be pretty cool if it could be combined
 with a PPS derived from a cheap TCXO. Doing this in ntpd could be
 tricky however.
 
 Hm, maybe I don't understand correctly what you mean.
 
 You want to use a PPS signal without proper phase and frequency, and then
 use *in addition* another PPS derived from a TCXO?

I meant to use a PPS signal from an external undisciplined *XO to
stabilize the system clock. The driver would track the phase and
frequency offsets against other sources or the system clock over a
longer interval and use that to correct the samples before normal
processing.

I think this could be useful with jittery sources (e.g. NTP) or
reference clocks that don't have their own stabilized oscillator.

-- 
Miroslav Lichvar


Re: [ntp:questions] LOCL clock reachability not 377?

2014-07-31 Thread Miroslav Lichvar
On Thu, Jul 31, 2014 at 10:43:20AM +0200, Martin Burnicki wrote:
 Rob schrieb:
 However, that is broken.  Not only do you probably not want to mark
 that clock prefer (external references are often more accurate than the
 serial NMEA time, for example), but also you may have two or more ATOM
 PPS clocks, each with their own status, and there is no way to do that
 with this method.
 
 I've already proposed some times ago that another way of assigning PPS
 signal(s) to other time source(s) would be more versatile:
 http://lists.ntp.org/pipermail/questions/2009-April/022599.html
 http://lists.ntp.org/pipermail/questions/2009-April/022600.html
 
 This would also provide a simple way to declare a PPS signal reliable,
 e.g. if it is derived from a Rubidium or so, in which case it could continue
 to be accepted even though other time sources become unreachable.
 
 On the other hand, if a PPS input signal is associated to a particular time
 source the PPS signal could be discarded if the associated time source
 becomes unreachable.

Agreed, it would be useful to have an option to specify the PPS-time
source association for each PPS refclock directly.

In chrony, this is done with the lock refclock option. It's typically
used like this:

refclock SHM 0 offset 0.5 refid SHM0 noselect
refclock PPS /dev/pps0 lock SHM0

The SHM refclock (e.g. GPS NMEA) is configured with the noselect
option so it's never selected and only used by the PPS refclock to
align the pulses to the SHM time. When SHM stops getting new samples
the PPS refclock will stop immediately too.

When the PPS refclock doesn't have the lock option and the local
stratum option is not used, the pulses will be accepted only when the
clock is synchronized, first to another refclock or NTP server and
then possibly the PPS refclock itself. If local stratum is enabled,
the PPS will work immediately without any other sources, but the clock
obviously needs to be already close to the correct time on start,
otherwise it will be off by a whole number of seconds.

-- 
Miroslav Lichvar


Re: [ntp:questions] Thoughts on KOD

2014-07-08 Thread Miroslav Lichvar
On Mon, Jul 07, 2014 at 07:04:01PM +0200, Jan Ceuleers wrote:
 I'm not sure why sending the requester's timestamp back to him is better
 than an immutable timestamp.
 
 The effect of the former is slow drift, the effect of the latter is (I
 suspect) no lock at all due to the lack of passage of time. So I think
 that the latter is more likely to catch the admin's eye. If there is an
 admin.

I think most clients check at least one of the stratum/leap fields
and don't use the time stamps from a KOD response to actually update
their clock.

If the KOD response was modified to set the leap and stratum bits as
synchronized, the client would drift slowly away, but ntpd would need
to stick to it and never send the client correct time.

I agree that purposely serving bad time might be the best way to
get the attention of the user and get the NTP implementation fixed, if
it can be identified reliably and no innocent clients behind that IP
address are harmed.

The identification could be improved, for example by monitoring the
distribution of the client's polling interval as simple clients use a
fixed interval, but I'm not sure if it's possible to make it so
reliable that ntpd could be allowed to send a response with purposely
bad time.
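
One way such a check on the interval distribution might look, purely as a hypothetical sketch (nothing like this exists in ntpd as far as this thread says; the function name and the 5% threshold are assumptions):

```python
from statistics import mean, pstdev

def looks_fixed_interval(arrival_times, max_rel_spread=0.05):
    """Guess whether a client polls at a fixed interval.

    arrival_times: request arrival times in seconds, in ascending order.
    max_rel_spread: assumed threshold on stddev/mean of the intervals.
    """
    intervals = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if len(intervals) < 2:
        return False               # not enough data to say anything
    return pstdev(intervals) / mean(intervals) <= max_rel_spread

print(looks_fixed_interval([0, 64, 128, 192, 256]))   # constant 64 s polling
print(looks_fixed_interval([0, 64, 192, 448, 960]))   # interval keeps doubling
```

Several clients behind one NAT address would blur the picture, which is part of why reliable identification is hard.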

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-25 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 06:25:37PM +0200, Jochen Bern wrote:
 On -10.01.-28163 20:59, Miroslav Lichvar wrote:
  Agreed, but wouldn't switching to TAI everywhere be much more
  difficult than to stop messing with UTC and keep it at a fixed offset
  from TAI?
 
 Having computer clocks run on UTC(frozen) instead of TAI makes the
 adaptation easier today, more difficult tomorrow (do we *really* need
 to work on that for (n3) seconds of an offset!?), and no less
 necessary in the long run (when UT1-TAI has grown much larger than
 UT1-UTC(frozen), and changes much faster as well). I prefer to have the
 slope right where the ball needs to get rolling. ;-)

I was thinking about larger adjustments in the timezones, like 15, 30
or 60 minutes. They could be announced decades or centuries ahead, but
possibly they would be hidden in the noise of the political/religious
adjustments that are common today. Before the first correction is
needed, maybe a global fixed timezone (or UTC directly) is already
used everywhere and the position of the Sun observed at 12:00 is let
to slowly revolve around Earth.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-25 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 06:13:17PM -0500, Mike S wrote:
 On 6/24/2014 5:59 AM, Miroslav Lichvar wrote:
 To me, it seems the reasonable thing to do would be to decouple UTC and
 UT1 completely and make the adjustment at a higher level like
 timezones if necessary.
 
 You're doing it wrong. If you don't want leap seconds, use a timescale which
 doesn't have them (e.g. TAI, GPS). UTC was created to closely track Sol.
 Decoupling that breaks its purpose, and the promise made when it took over
 from GMT.

Yes, but to me it looks like redefining UTC to not track solar time
anymore is easier than converting everyone and everything to keep time
in TAI.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Mon, Jun 23, 2014 at 11:45:16PM -0500, Mike S wrote:
 On 6/16/2014 6:05 AM, Jochen Bern wrote:
 
 There are four official slots - two primary, two secondary - over the
 course of the year to insert leap seconds,
 
 Those are only preferences. Leap seconds may be inserted at any month
 boundary.
 
 A positive or negative leap-second should be the last second of a UTC
 month, but first preference should be given to the end of December and June,
 and second preference to the end of March and September. - ITU-R TF.460-6

Sooner or later, not even 12 leap seconds per year will be enough to
keep UTC close to UT1. Hopefully they will be abolished long before
that.

Practically speaking, besides having to make more than two corrections
per year (which is not expected to happen in the next few decades),
could there be any reason to do it in months other than June and
December? Older ntpd versions (< 4.2.5p53) used to check the month
before setting the leap flag and I'm wondering if it can still be used to
detect spurious leap seconds.

FWIW, the IERS announcements say "Leap seconds can be introduced in
UTC at the end of the months of December or June, depending on the
evolution of UT1-TAI."

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 12:08:10PM +0200, Jochen Bern wrote:
 I've browsed the results of the infamous poll and most of the people
 voting abolish leap seconds apparently didn't mean to actually
 *abolish* them (as in, decouple UT1 and UTC, or whatever their
 successors might be called), but to have them *rearranged* into fewer
 and larger leaps. Of course, one can imagine that to go the other way -
 i.e., smaller but more frequent leaps.

As someone who implemented support for leap seconds in several
applications, I'd really like to see them gone. Fixing all software
where time is critical to handle them correctly may not be possible
and from what I've heard a common solution is just to turn it off and
wait until it passes.

Making smaller but more frequent corrections would probably only make
it worse.

To me, it seems the reasonable thing to do would be to decouple UTC and
UT1 completely and make the adjustment at a higher level like
timezones if necessary. Countries adjust their timezones all the time,
we can handle that better.

 (Returning to your question as phrased, and circumstances as of today:
 IIUC the quality of prediction *would* already suffice to attempt
 scheduling leap seconds so as to aim for min-sum-of-squares, rather than
 predefined schedule slots.)

Good point. The question is if they will ever choose to do that.

Thanks,

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Pool Server Costs me $40/mo in Bandwidth--is

2014-06-24 Thread Miroslav Lichvar
On Tue, Jun 24, 2014 at 03:46:15PM +0200, Jochen Bern wrote:
 While I may have started from the same setting, I *did* try to put
 myself into the shoes of astronomers and people operating satellite
 systems (which, ironically, includes the popular stratum 0 of GPS).

Do these people work just with UTC? I'd think it's not accurate enough
for their purposes and they need to include the current UTC-UT1
offset anyway.

 Personally, I'd say that if a computer's clock's best suited to run on
 TAI (or equivalent) and all data needs to be converted from it to $TZ
 for the users, anyway, then having it run on TAI and disseminating and
 handling a TAI-UTC delta along with the sync and timezone deltas seems
 like the proper approach. But that wish doesn't change gettimeofday()
 implementations all over the globe with a snap of my fingers, does it.

Agreed, but wouldn't switching to TAI everywhere be much more
difficult than to stop messing with UTC and keep it at a fixed offset
from TAI?

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP Servers in virtual machines

2014-06-23 Thread Miroslav Lichvar
On Mon, Jun 23, 2014 at 12:28:53PM +0100, David Woolley wrote:
 On 23/06/14 12:03, Rob Heemskerk wrote:
 Could we say it is safe to run ntp servers on a virtualized platform or do 
 we still need a few (4?) dedicated pieces of hardware to run our internal 
 NTP servers?
 
 No.
 
 Normal virtualised machines are not intended for hard realtime applications.
 Also, the host clock can and should be disciplined using NTP, so there is a
 risk of double correction.

I think it all depends on the VM implementation and what clocksource
is used in the guest. If the guest is using tsc (i.e. its frequency is
independent of the host clock), it will need to run its own NTP
client. If the guest's clock is locked to the host's system clock,
there still may be a static offset between them and an NTP client
(possibly using the host as the NTP server) can be used to correct the
offset.
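
On a Linux guest, the clocksource in use can be read from sysfs. A minimal sketch (the helper name is made up; the sysfs directory is the usual location on Linux, and the base path is a parameter so the function can be tried against a copy of the files):

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")

def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by the kernel."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available

if __name__ == "__main__":
    if CLOCKSOURCE_DIR.exists():
        current, available = read_clocksource()
        print("current:", current)
        print("available:", ", ".join(available))
    else:
        print("no clocksource information in sysfs on this system")
```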

-- 
Miroslav Lichvar


Re: [ntp:questions] Fwd: Re: Best ways to get the reference times from ntp

2014-05-12 Thread Miroslav Lichvar
On Wed, May 07, 2014 at 07:40:28PM +, William Unruh wrote:
 On 2014-05-07, mike cook michael.c...@sfr.fr wrote:
  On 7 May 2014 at 18:32, William Unruh wrote:
  The short answer is no, ntpd cannot play this game. You are trying to
  use A to discipline not only B but C as well but on machine B.
 
  My reading is that C is not being disciplined at all, but is to be used as a
  reference (though non UTC) for B.
 
 That is my reading as well. But something must be done to determine
 those values of x and y (Ctime= xT+y where T is UTC). Either that can be
 done on C using something like chrony (better) or ntpd, or B could run
 something to determine x and y for C and use those to help discipline B. 

The OP said the frequency offset of C is known, so only y is unknown
if I'm reading it right. But he also said that A and C are in the same
network, so I'm not sure if the frequency of C can be transferred to B
with better accuracy than the frequency of A and if this idea of using
A to estimate offset and using C to estimate frequency can give better
results than just simply increasing the rate of polling of A.

I think support for frequency only sources wouldn't be very difficult
to add to chrony. Add a new selection option to bypass the selection
algorithm and just combine its frequency with other sources by
estimated skew. This could work with both NTP sources and reference
clocks.

-- 
Miroslav Lichvar


Re: [ntp:questions] Precision changed after upgrade from ntp 4.2.4p4 to 4.2.6p2

2014-05-05 Thread Miroslav Lichvar
On Sun, May 04, 2014 at 08:29:26PM +0100, Caecilius wrote:
 After upgrading ntp from 4.2.4p4 to 4.2.6p2 as part of a Linux upgrade
 from Debian Lenny to Squeeze, I've noticed that the precision variable
 has changed from -20 to -22. So it appears that my clock has now got a
 better precision. But the hardware is unchanged, and I'm running the
 same kernel.
 
 I thought the precision was dependent on the granularity of the
 system clock, which I would have expected to be independent of the ntp
 version and any other userland code.  Am I misunderstanding something
 perhaps?

The older ntpd is probably using gettimeofday() which has microsecond
resolution (-20 in the log scale) and not the nanosecond
clock_gettime().
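
The precision variable is a base-2 exponent of seconds, so the two values can be compared like this (a sketch of the arithmetic only; the helper name is made up and this is not how ntpd itself derives the value):

```python
import math

def precision_exponent(resolution_seconds):
    """Nearest power-of-two exponent for a clock resolution in seconds."""
    return round(math.log2(resolution_seconds))

print(precision_exponent(1e-6))            # microsecond resolution
print(f"2^-20 = {2.0 ** -20 * 1e6:.3f} us")
print(f"2^-22 = {2.0 ** -22 * 1e6:.3f} us")
```

A value of -22 corresponds to roughly a quarter of a microsecond, which is only possible with a clock interface finer than the microsecond gettimeofday().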

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP.log interpretation

2014-04-18 Thread Miroslav Lichvar
On Fri, Apr 18, 2014 at 09:01:09AM -0500, GregL wrote:
What you should do is to add more servers to the config.
 
 What about the idea of going to only one entry, but that entry is served by
 a DNS load balancer to choose one of two internal time servers to check.
  Each of those, is configured to point at a pool of time servers (4 each).

Well, that will prevent the client from detecting it's getting wrong
time. Is that what you want?

From the log it seems that at least one server is completely wrong;
the offset between the two servers is around 3 seconds! I'd suggest
fixing that first.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP.log interpretation

2014-04-18 Thread Miroslav Lichvar
On Fri, Apr 18, 2014 at 10:38:10AM -0500, GregL wrote:
 But, was the synchronization lost message *because* ntp saw the time
 difference so great on peer servers...and chose one to synch to...resulting
 in the time reset message?

It seems so. Not sure how close this is to the version you are
running, but in xntp3-5.93e (dated 1998) it seems the system peer is
unselected (and the message logged) on every clock step.

-- 
Miroslav Lichvar


Re: [ntp:questions] Handle ntp conf modification when ntp is already running

2014-04-11 Thread Miroslav Lichvar
On Thu, Apr 10, 2014 at 08:20:59PM +, Harlan Stenn wrote:
 Rob writes:
  Furthermore, the simple solution of having SIGHUP perform an exec
  of the same binary, thus in fact restarting the entire process and
  losing all state information, is not the only possible solution.
 
 If the current process has chroot()ed, how do you re-exec?  How do you
 handle the things that are done before the chroot()?  Again, I haven't
 looked at the code to be sure, but I believe there are some things that
 will behave differently if they are attempted from the chroot() target.
 
 Sure, one could have a top-level master process that simply waits for
 the chroot()ed subprocess to die and then restarts it, but we're
 starting to get in to a lot of wheel-reinventing here, and would this
 really be worth the overhead on a program that is already larger and
 more complicated than many folks want?

That sounds like a horrible hack.

Even without chroot it will be difficult. If the ntpd process dropped
root privileges after start, it won't be able to re-exec and it may
not have permissions to open newly added refclocks or reread the keys,
for instance.

-- 
Miroslav Lichvar


Re: [ntp:questions] Three NTP servers, one strange IP-address in 'refid'

2014-04-07 Thread Miroslav Lichvar
On Mon, Apr 07, 2014 at 09:28:09AM +0200, Martin Burnicki wrote:
 Rob wrote:
 When the NTP server puts an IPv6 hash in the refid field, it could set
 the upper 4 bits to 1.  (so the hex value starts with F)
 A valid IPv4 address never has that, so ntpq could print it in hex in
 this case, and as a dotted quad in other cases.
 
 This also guarantees a hashed IPv6 can never collide with a valid IPv4
 refid.  But at the same time, it shrinks the space of IPv6 hashes,
 increasing the chance of a hash collision between two IPv6 addresses.
 
 In my opinion this sounds reasonable. The danger of collision might
 be slightly higher (less with IPv4, a little bit more with another
 IPv6 hash), but for users it would avoid confusing IPv4 addresses
 with IPv6 hashes.

If I'm not mistaken, the main purpose of the refid value is detection
of synchronization loops. To not break that, all NTP servers would
have to update their refid definition at the same time. That's not
doable. Fixing the tools to print the value in hex instead of dotted
quads to avoid confusion seems like a better fix to me.

-- 
Miroslav Lichvar


Re: [ntp:questions] Three NTP servers, one strange IP-address in 'refid'

2014-04-02 Thread Miroslav Lichvar
On Wed, Apr 02, 2014 at 09:48:26AM +0200, Sander Smeenk wrote:
 Quoting E-Mail Sent to this address will be added to the BlackLists 
 (Null@BlackList.Anitech-Systems.invalid):
  I guess it could also be a IPv6 ref mangling issue?
 
 That could well be. We use IPv6 where we can.
 But that would constitute this refid issue a bug.
 One that is rather confusing and time-consuming.

For IPv6 addresses the refid is defined as first 4 bytes of the MD5
sum of the address. With 2001:7b8:3:32:213:136:0:252 (tt52.ripe.net)
that is 0xac023551, or 172.2.53.81 in the quad-dotted notation.
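
A minimal Python sketch of that definition, assuming the MD5 sum is taken over the 16-byte binary form of the address (the function names are made up for this example):

```python
import hashlib
import ipaddress

def ipv6_refid(address: str) -> bytes:
    """First 4 bytes of the MD5 sum of the packed 16-byte IPv6 address."""
    packed = ipaddress.IPv6Address(address).packed
    return hashlib.md5(packed).digest()[:4]

def dotted_quad(refid: bytes) -> str:
    """Format a 4-byte refid the way it would look as an IPv4 address."""
    return ".".join(str(b) for b in refid)

refid = ipv6_refid("2001:7b8:3:32:213:136:0:252")
print(refid.hex(), dotted_quad(refid))
```

The second helper shows why the value is confusing: any 4-byte hash prints as a plausible-looking IPv4 address, e.g. the bytes ac 02 35 51 print as 172.2.53.81.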

-- 
Miroslav Lichvar


Re: [ntp:questions] Problem facing with Ntp client Configuration

2014-04-01 Thread Miroslav Lichvar
On Fri, Mar 28, 2014 at 10:09:25AM -0400, Brian Utterback wrote:
 Two is never fine, and not just because of clock hopping. Like the
 old adage, a man with a watch knows what time it is, a man with two
 watches is never sure, NTP will often refuse to set the time with
 just two upstream sources if the two sources do not agree and the
 dispersion intervals do not overlap.

I think we had this discussion before. I wouldn't say that two is never
fine. I think two is much better than one if you need to be able to
tell when there is a problem and don't need to recover automatically.
Clock hopping shouldn't be a problem since source combining was
implemented.

 That means that two servers can
 agree on the time to within a millisecond of each other, but if the
 dispersion is less than half a millisecond, NTP will not set
 the clock by either of them.

Well, at least one of the servers is a falseticker if their intervals
don't overlap and it should be fixed to not lie about its dispersion.
Adding a third source just to hide this problem doesn't seem right.

 But I would like to point out something to you. You often remind us
 that NTP only uses one in eight data points. But each server you add
 means one more data point used, which means that if eight servers
 were used then NTP would be using the same number of data points
 that it would be if it only had one server and used all the data
 points. 

Please note that the data points are not equal. The point which is
used to update the clock has the shortest distance and may carry more
useful information than the other points combined if the clock is
stable enough.
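
To illustrate the idea of keeping one point out of eight, here is a simplified Python sketch of a filter that keeps the last eight samples and selects the one with the shortest delay. It is not the ntpd clock filter: the real algorithm orders samples by a distance that also grows with sample age, and the class and method names here are made up.

```python
from collections import deque

class MinDelayFilter:
    """Keep the last `size` samples and pick the one with the lowest delay."""

    def __init__(self, size: int = 8):
        self.samples = deque(maxlen=size)

    def add(self, offset: float, delay: float) -> None:
        # Store delay first so that min() compares by delay.
        self.samples.append((delay, offset))

    def best(self):
        """Return (delay, offset) of the sample with the shortest delay."""
        return min(self.samples)

f = MinDelayFilter()
for offset, delay in [(0.004, 0.030), (0.001, 0.010), (0.009, 0.050)]:
    f.add(offset, delay)
print(f.best())
```

A sample with a short delay has a small bound on its offset error, which is why it can be worth more than several samples with long delays.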

-- 
Miroslav Lichvar


Re: [ntp:questions] status information after ntpd -q

2014-02-04 Thread Miroslav Lichvar
On Tue, Feb 04, 2014 at 10:29:29AM +, Sanal, Arjun (NSN - IN/Bangalore) wrote:
 
  i would like to use the command ntpd -q to synchronize with a server once,
  but i need some feedback from the command about the status.
 
  ntpd was designed and is intended to run all of the time as a daemon,
  but you can do what you've asked for by setting explicit logging path like:
 
  # ntpd -q -l /tmp/ntpd.log
 
 Is there any specific reason why 'ntpd -q' doesn't return any error code (for 
 example: situations like server not reachable)

There is a bug filed for that:
https://bugs.ntp.org/show_bug.cgi?id=759

-- 
Miroslav Lichvar


Re: [ntp:questions] simple nt.conf cases for ntp-client

2014-01-28 Thread Miroslav Lichvar
On Fri, Jan 24, 2014 at 10:13:15PM +, William Unruh wrote:
 On 2014-01-24, David Woolley david@ex.djwhome.demon.invalid wrote:
  If there is a prefer peer and it survives, it uses that one, otherwise 
  as per clock_combine in ntp_proto.c, i.e. weighted by synchronisation 
  distance (which grows with time).

  The weighting may change between versions.  This is 4.2.7p333.
 
   y = z = 0;
   for (i = 0; i < npeers; i++) {
       x = 1. / peers[i].synch;
       y += x;
       z += x * peers[i].peer->offset;
   }
   sys_offset = z / y;
 
 
 So, if this is calculated immediately after a new selected-by-filter reading
 comes in, x is infinity and only the latest one is used.

The synchronization distance also includes delay, dispersion and
precision, so it should never be zero and x should always be finite.
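
The weighting in the quoted loop can be sketched in Python as follows. The input is assumed to be a list of (synchronization distance, offset) pairs with positive distances; the function name is made up for this example.

```python
def combine_offsets(peers):
    """Average the offsets weighted by 1 / synchronization distance."""
    y = z = 0.0
    for synch, offset in peers:
        x = 1.0 / synch      # weight; finite because synch > 0
        y += x               # sum of weights
        z += x * offset      # weighted sum of offsets
    return z / y

# Two sources: the one with half the distance gets twice the weight.
print(combine_offsets([(0.010, 0.001), (0.020, 0.004)]))
```

With distances of 10 ms and 20 ms the weights are 100 and 50, so the combined offset is pulled towards the source with the shorter distance.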

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpdc and collectd queries timeout

2014-01-24 Thread Miroslav Lichvar
On Fri, Jan 24, 2014 at 12:28:27PM +0100, Terje Mathisen wrote:
 michalpurzyns...@gmail.com wrote:
 The ntpdc queries timeout every time on the NTP version
 ntp-dev-4.2.7p411 (compiled myself). Looks like the type 7 packets
 are blocked from localhost but I don't know why.
 
 Type 7 (which is used by ntpdc) isn't blocked on ntp-dev, it has
 been _removed_!

Wasn't it only disabled by default? It still seems to be in 4.2.7p411
in the ntp_request.c file, but enable mode7 is now required to
process the ntpdc queries.

-- 
Miroslav Lichvar


Re: [ntp:questions] better rate limiting against amplification attacks?

2014-01-16 Thread Miroslav Lichvar
On Wed, Jan 15, 2014 at 08:35:32PM +, Rob wrote:
 William Unruh un...@invalid.ca wrote:
  I do not mean the default in the config file, I mean the default if
  there is no config file or if nothing is set in the config file.
 
 That only becomes meaningful when ntpd starts to actually work without
 config file.  Of course that would be possible, but I don't think it
 is reality today.  Or is it, in the latest versions?

Servers can now be specified on the command line, so you don't really
need a config file to have ntpd doing something useful. The following
command seems to work as expected.

ntpd -c /dev/null 0.pool.ntp.org

-- 
Miroslav Lichvar


Re: [ntp:questions] better rate limiting against amplification attacks?

2014-01-16 Thread Miroslav Lichvar
On Thu, Jan 16, 2014 at 02:28:32PM +0100, Martin Burnicki wrote:
 Harlan Stenn wrote:
   pool 0.debian.pool.ntp.org iburst
 
 I bet the server options for pool servers are in there because
 this was used in earlier versions before the pool keyword was
 introduced, and it still works.
 
 instead, and I'd have to look up when the 'pool' directive was put in
 there.
 
 IIRC this is supported in 4.2.6, but has not been supported in
 4.2.4p8 and earlier. If the ntp.conf file shipped with a particular
 OS has been initially created a long time ago and always been
 updated for newer NTP versions then I'm not surprised to see this.

IIRC the pool command in 4.2.6 uses quite a lot of servers, which
probably is not an acceptable use of pool.ntp.org. I think it was
improved later in 4.2.7. The page about recommended configuration
doesn't mention it yet.

http://www.pool.ntp.org/en/use.html

Vendors should be careful with the pool command.

-- 
Miroslav Lichvar


Re: [ntp:questions] Slow convergence loopstats (but nice results)

2013-12-12 Thread Miroslav Lichvar
On Thu, Dec 12, 2013 at 09:59:03AM +0100, Martin Burnicki wrote:
 A major problem was that the standard NTP protocol doesn't support a
 way to send the captured time stamp of a previously sent packet to
 its client, as done by the so-called followup message in PTP.

ntpd has the peer and broadcast interleave modes to send the followup
time stamps.
http://www.eecis.udel.edu/~mills/ntp/html/xleave.html

Also, there is a feature called launch time, which is supported in
some NICs, so the follow up message is not always necessary.

 I don't know if new standard NIC chips which support PTP
 timestamping can also timestamp NTP packets, but even if they do
 then in practice there's still the problem with network switches,
 etc.

Some NICs can time stamp any packets.

 There are network switches out there which are PTP-aware and also
 timestamp incoming and outgoing PTP packets to compensate the
 introduced packet delay in some way, but there are no switches
 (AFAIK) which can do this with NTP packets, so even if you used
 hardware time stamping of NTP packets on NTP end nodes the resulting
 accuracy would still be worse than with PTP.
 
 That's too sad.

Agreed. I think it's possible to implement HW NTP support, but there
is a problem: the switch would have to keep some state about each
NTP association. If there was a standardized extension field to store
the processing delay in both directions, that wouldn't be necessary.
I'm not sure what would have to be done to not break the NTP
authentication.

A major advantage NTP has over PTP is that it knows the delay for
each measurement in the client/server and symmetric modes, which
allows it to filter out bad measurements. In PTP the delay is measured
independently (similarly to the NTP broadcast mode), so bad
measurements can't be easily ignored and it's necessary to have all
networking HW with PTP support to account for all processing delays.
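
For reference, the per-measurement delay comes from the standard four-timestamp calculation, sketched here in Python. T1 and T4 are the client's transmit and receive times, T2 and T3 the server's receive and transmit times; the function name is made up for this example.

```python
def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float):
    """Return (offset, delay) from one client/server exchange."""
    offset = ((t2 - t1) + (t3 - t4)) / 2   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)          # round trip minus server processing
    return offset, delay

# Server 10 s ahead, 5 ms each way, 1 ms processing time on the server.
print(ntp_offset_delay(0.0, 10.005, 10.006, 0.011))
```

Because each measurement carries its own delay, a client can discard the measurements whose delay is unusually long, which is not possible when the delay is measured in a separate exchange.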

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP not syncing

2013-12-06 Thread Miroslav Lichvar
On Fri, Dec 06, 2013 at 10:17:48AM +, David Taylor wrote:
 On 06/12/2013 09:36, Harlan Stenn wrote:
 []
 The only systems we've seen that did this are Linux kernels, and it
 would be good to get the starting and ending dates/kernel numbers for
 this behavior.
 
 H
 
 The only data points I can contribute to that are that 3.2.27 and
 3.6.11 appear to be OK, at least on the Raspberry Pi (Debian).

The relevant commit seems to be
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=08ec0c58fb8a05d3191d5cb6f5d6f81adb419798

It was included in 2.6.38.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP makes a time jump

2013-07-09 Thread Miroslav Lichvar
On Mon, Jul 08, 2013 at 08:19:12PM +, unruh wrote:
 Now, If we know that the max difference between the client and server's
 drift rate is say 200PPM, then if one could limit the server to only
 slewing at 300PPM then the client should be able to keep up. But I do
 not know of any way of telling the server it should never slew faster
 than 300PPM. Is there one?

I think the kernel would have to be recompiled with a smaller
MAXFREQ_SCALED constant or ntpd recompiled with smaller NTP_MAXFREQ if
the kernel discipline is disabled.

-- 
Miroslav Lichvar


Re: [ntp:questions] What should the poll be for the shared memory driver (type 28)?

2013-06-18 Thread Miroslav Lichvar
On Mon, Jun 17, 2013 at 07:02:09PM +0100, David Taylor wrote:
 Thanks, Steve.  My knowledge of the source tree is even more limited
 than my knowledge of C!  In refclock_shm.c, it does say that the
 peek routine is called every second, so if the type 28 driver has
 an internal poll of one second, does it matter what min/max poll is
 set in the ntp.conf file?  Does it even need to be set at all?

IIRC, the one second interval is used only to collect the SHM samples
and store them in a buffer. In the minpoll/maxpoll interval the
collected samples are processed in a median filter and one final
sample is used to update the clock. This improves the jitter.
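
A rough Python sketch of such a filter, loosely modeled on the idea of sorting the collected samples, discarding those furthest from the median and averaging the rest. The 60% keep fraction and the function name are assumptions for illustration, not values taken from the ntpd source.

```python
def median_filter(samples, keep_fraction=0.6):
    """Drop the samples furthest from the median, average what remains."""
    ordered = sorted(samples)
    keep = max(1, int(len(ordered) * keep_fraction))
    lo, hi = 0, len(ordered)
    while hi - lo > keep:
        median = ordered[(lo + hi) // 2]
        # Discard whichever end is further from the current median.
        if ordered[hi - 1] - median > median - ordered[lo]:
            hi -= 1
        else:
            lo += 1
    kept = ordered[lo:hi]
    return sum(kept) / len(kept)

# One sample per second; the 50.0 outlier does not reach the final value.
print(median_filter([1.0, 1.1, 0.9, 1.0, 50.0]))
```

With minpoll 4 there would be about 16 one-second samples per poll, with a longer poll more, so the poll setting changes how many samples the final value is based on.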

 I'll try setting it to 4, as a test, and see whether anything changes.

The difference from NTP sources is that changing the poll interval
also affects the jitter, as there will be a different number of samples
in the filter.

-- 
Miroslav Lichvar


Re: [ntp:questions] How do I validate my PPS clocks?

2013-02-25 Thread Miroslav Lichvar
On Mon, Feb 25, 2013 at 01:44:02PM +0100, Kasper Pedersen wrote:
 From when the PPS arrives to when the kernel timestamps it is a very long time.
 I wrote this to measure it:
  http://n1.taur.dk/edgetest.c
 (you will need a linux machine, gcc, and kernel-headers to compile)

Very interesting, thanks! For my machine it shows that the interrupt
latency is around 12 us.

I'm wondering if the kernel module could have an option which would
enable a polling method to time stamp the PPS events.

-- 
Miroslav Lichvar


Re: [ntp:questions] A proposal to use NIC launch time support to improve NTP

2012-12-19 Thread Miroslav Lichvar
On Wed, Dec 19, 2012 at 11:05:57AM -0500, Brian Utterback wrote:
 On 12/19/12 10:12, Ulf Samuelsson wrote:
 The desired launchtime is compared to the network controller
 timestamp counter in H/W, so again there is no need to synchronize
 with the system time.
 
 Yes there is. The ntpd program has to set a timestamp in the
 outgoing packet and then specify the launchtime when it writes the
 packet. The goal here is to have the timestamp written in the packet
 exactly match the time the packet actually hits the wire. So, the
 timestamp in the packet must be a little in the future when it is
 written so that by the time the controller gets it the packet can be
 delayed until the right time. Since ntpd cannot access the clock in
 the controller, this requires that the kernel time be relatively
 close to the controller time.

ntpd can read the clock, much more slowly than the system clock, but
still fast enough to send tens of thousands of packets per second.

I think it makes more sense to have one loop controlling just the PHC
and another, much tighter, syncing the system clock from the PHC,
rather than trying to sync the system clock through the PHC.

-- 
Miroslav Lichvar


Re: [ntp:questions] A proposal to use NIC launch time support to improve NTP

2012-12-13 Thread Miroslav Lichvar
On Thu, Dec 13, 2012 at 08:23:47AM -0500, Brian Utterback wrote:
 The internal clock of the network controller is the PHC for IEEE1588,
 it has a 1 ns resolution, and can be steered with a 32 bit fractional
 of 1 ns. see SYSTIML and TIMINCA in the I210 datasheet.
 
 // jwalck
 
 I know that. The problem is that there is going to be jitter
 introduced when you set the clock from the kernel. That is generally
 the problem with IEEE 1588, getting the time from the controller to
 the kernel and vice versa. If you have to go across a PCI bus for
 instance that will introduce jitter.

From what I have seen, with multiple readings and some filtering, the
jitter is very small, somewhere in nanoseconds or a couple of tens of
nanoseconds. Even if the delay was highly asymmetric, with a 2 us RTT the
error would be only 1 us, which is still much better than the delays
causing the error in the TX timestamp on Ethernet.
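
The reasoning about multiple readings and the RTT bound can be sketched in Python. Each reading is assumed to be a triple of system time before the PHC read, the PHC time, and system time after the read, all in nanoseconds; the function name is made up and this is not the phc2sys code.

```python
def phc_offset(readings):
    """Return (offset, max_error) from the reading with the shortest RTT."""
    before, phc, after = min(readings, key=lambda r: r[2] - r[0])
    rtt = after - before
    offset = phc - (before + after) / 2   # assumes a symmetric delay
    return offset, rtt / 2                # worst case if fully asymmetric

readings = [
    (0, 1500, 5000),          # slow read, 5 us round trip
    (10000, 11500, 12000),    # fast read, 2 us round trip
]
print(phc_offset(readings))
```

The fast reading gives an offset of 500 ns with an error of at most 1000 ns, which matches the 2 us RTT and 1 us error figures above.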

The phc2sys program from the linuxptp project can be used to
synchronize the system clock to the PHC or the PHC to the system
clock. It can do that via PPS or filtered clock readings. 

-- 
Miroslav Lichvar


Re: [ntp:questions] Timing issue with Linux and kernel PPS?

2012-11-20 Thread Miroslav Lichvar
On Mon, Nov 19, 2012 at 06:03:06PM +, David Taylor wrote:
 On both systems, sudo modprobe pps_ldisc produces no output.

No message is a good message :).

 I have no idea which device ntpd is using, I simply have the type 22
 driver installed which, as I understood it, gets the accurate
 timestamp from the kernel. 

127.127.22.0 is /dev/pps0, 127.127.22.1 is /dev/pps1, ...
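
The mapping from the refclock address to the device can be written as a small Python helper. Refclock addresses have the form 127.127.t.u with driver type t and unit u; the function name is made up for this example.

```python
def pps_device(refclock_address: str) -> str:
    """Map a type 22 (PPS) refclock address to its /dev/ppsN device."""
    octets = refclock_address.split(".")
    if len(octets) != 4 or octets[:2] != ["127", "127"]:
        raise ValueError("not a refclock address: " + refclock_address)
    driver_type, unit = int(octets[2]), int(octets[3])
    if driver_type != 22:
        raise ValueError("not the PPS driver: type %d" % driver_type)
    return "/dev/pps%d" % unit

print(pps_device("127.127.22.0"))
print(pps_device("127.127.22.1"))
```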

 How the kernel chooses which device to
 use I don't know.

With udev the order might be random. There could be a race between the
script which loads modules from /etc/modules and udev.

 In /dev I see pps0 on the system without a PPS signal connected, and
 pps0 and pps1 on the system /with/ the PPS signal active.  On the
 system /with/ the signal active, some 25 seconds in the dmesg output
 I see: pps_ldisc registered (so ldisc does matter, I stand
 corrected), followed by pps1 new source, and source /dev/ttyAMA0
 added.

You can see what pps device is actually generating events with:
grep '' /sys/class/pps/pps*/{assert,clear}

 So the issue appears to be that /dev/ttyAMA0 is not created until
 the GPS receiver is sending second pulses, and by that time ntpd is
 running and can't see the device.  Here are my lines from ntp.conf:
 
 # Kernel-mode PPS ref-clock for the precise seconds
 server 127.127.22.0 minpoll 4 maxpoll 4
 fudge 127.127.22.0  flag3 1  refid PPS
 
 I wonder whether I should be using 127.127.22.1 rather than .0?

Perhaps. Do you use in ntpd the serial output from the GPS with some
driver like NMEA?

If you don't need the pps from /dev/ttyAMA0, my suggestion would be to
prevent loading of the pps_ldisc module, so there is always only one
pps device. Any chance you added a udev rule to load pps_ldisc
automatically when the serial device is created? 

-- 
Miroslav Lichvar


Re: [ntp:questions] Timing issue with Linux and kernel PPS?

2012-11-19 Thread Miroslav Lichvar
On Mon, Nov 19, 2012 at 09:02:12AM +, David Taylor wrote:
 On 18/11/2012 15:20, Uwe Klein wrote:
 what happens if you insmod pps_ldisc into the not ready system?
 
 (1) I get Error: could not load pps_ldisc module: No such file or directory

insmod needs full path to the module, it's better to call modprobe pps_ldisc.

 I looked at the article you referenced, but in this case the
 Raspberry Pi is not using the DCD line, but a separate GPIO pin.
 lsmod shows pps_gpio as present.

From the original post it seems you have two pps devices, one for gpio
and the other for ldisc which is created two minutes later (some USB
device?).

Do you see two /dev/pps* devices and are you sure ntpd is using the
gpio one? Perhaps there is an ordering problem?

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] What is the NTP recovery time from 16s step in GPS server?

2012-10-31 Thread Miroslav Lichvar
On Wed, Oct 31, 2012 at 05:22:44PM +, Rob wrote:
 Using USB ports in a service started at boot time should normally
 work ok, but when it has issues on the Raspberry maybe it could
 be solved by delaying the startup of gpsd a bit.  But don't try to
 tackle all issues at the same time.

Isn't it better to start it from udev then? The gpsd sources provide a
hotplug script, which I think is included at least in the Debian and Fedora
gpsd packages.

-- 
Miroslav Lichvar


Re: [ntp:questions] SO_TIMESTAMPING experiments (sub-us jitter over LAN)

2012-10-18 Thread Miroslav Lichvar
On Thu, Oct 18, 2012 at 02:53:02AM -0700, gabs wrote:
 SO_TIMESTAMPING [1] is a socket option for obtaining transmit and receive
 timestamps. ntpd uses SO_TIMESTAMP to get receive timestamps. Transmit
 timestamps require NIC driver support [2] and the application gets the
 timestamp after the packet is sent.
 
 A PTP-like protocol is used to measure the delay and offset between the
 client and server. The first part is a message exchange similar to
 NTP [3], except the client gets a more accurate transmit timestamp
 (T1) from the kernel after sending the packet. The server, after sending
 its reply, gets the kernel transmit timestamp (T4), then sends another
 packet containing T4 (similar to the PTP Sync Follow message). The
 median of 6 offsets are sent to the client's ntpd thru the SHM driver.

NTP supports interleaved mode in peer associations which does the
timestamp followup. It would be really nice if ntpd supported the
SO_TIMESTAMPING option.

http://www.eecis.udel.edu/~mills/ntp/html/xleave.html

 Both are running Linux kernel 3.3.4, tickless, no preemption,

You may want to try nohz=off; it seems there is a bug in the kernel
which can cause extra jitter of a couple of microseconds in the
system clock readings when the system is idle.

 Sample measurements (raw):
 left: delay
 right: offset (in microseconds)
 
 91509  509
 89577 -991
 88365 -795
 89574 -731
 90593 -163
 89360 -1067
 
 92650 -318
 90634 -910
 89455 -989
 89511 -1080
 88874 -693
 88534 -1140

Cool. Are those numbers nanoseconds?

-- 
Miroslav Lichvar


Re: [ntp:questions] Testing throughput in NTP servers

2012-09-13 Thread Miroslav Lichvar
On Wed, Sep 12, 2012 at 02:28:24PM +0200, Ulf Samuelsson wrote:
 Anyone knows if there are any available Linux based S/W to test the
 throughput of NTP servers?
 I.E:
 
   packets per second?
   % of lost packets
   etc?

I've used tcpdump and tcpreplay to measure the maximum packet rate
ntpd can handle. IIRC, the ntpd process itself needed only a couple of
percent of the CPU, I think the bottleneck is always in the kernel or
the NIC.

-- 
Miroslav Lichvar


Re: [ntp:questions] WARNING: someone's faking a leap second tonight

2012-08-31 Thread Miroslav Lichvar
On Thu, Aug 02, 2012 at 05:57:43AM +, Dave Hart wrote:
 On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
  I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
  stratum-2 server in the US pool (a few of my systems have it in their
  list).
 
 That particular system seems to have corrected its leap indication,
 but plenty of other pool participants are advertising leap.  I have
 this laptop set to associate with every IP in a list of all pool
 servers as of late June.  The following are showing leap=01 now:
 
[...]

From that list the following IPv4 servers still seem to be announcing
a pending leap second:

131.155.140.129  Netherlands
131.155.140.130  Netherlands
143.121.199.173  Netherlands
161.53.248.35    Croatia
164.107.116.179  United States
178.237.34.94    Netherlands
192.87.106.2     Netherlands
192.87.106.3     Netherlands
192.87.36.4      Netherlands
193.2.111.2      Slovenia
193.2.111.3      Slovenia
193.2.4.2        Slovenia
193.2.78.228     Slovenia
193.77.222.200   Slovenia
193.77.237.128   Slovenia
193.95.229.133   Slovenia
194.171.167.130  Netherlands
194.249.198.30   Slovenia
213.129.242.82   Austria
213.206.85.20    Netherlands
217.75.72.153    Slovakia
219.117.206.46   Japan
64.22.125.197    United States
67.209.225.216   United States
69.65.33.188     United States
72.14.178.210    United States
77.245.91.218    Netherlands
77.94.135.133    Slovenia
80.239.2.130     Norway
81.167.109.120   Norway
81.187.35.170    United Kingdom
81.93.163.20     Norway
81.93.163.23     Norway
82.197.80.125    United Kingdom
83.98.201.133    Netherlands
83.98.201.134    Netherlands
85.158.249.144   Netherlands
85.17.71.101     Netherlands
85.252.162.7     Norway
86.61.66.23      Slovenia
90.155.74.40     United Kingdom
91.198.87.118    Netherlands
94.26.2.134      Bulgaria
95.211.7.153     Netherlands
98.191.213.7     United States

-- 
Miroslav Lichvar


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-26 Thread Miroslav Lichvar
On Fri, Mar 23, 2012 at 06:12:11PM +0100, Terje Mathisen wrote:
 Miroslav Lichvar wrote:
 On Fri, Mar 23, 2012 at 11:49:19AM +0100, Terje Mathisen wrote:
 But I think a much bigger problem with the clock filter and PLL
 combination is that it can't drop more than 7 samples. When the
 network is saturated, it's usually better to drop many more than that. If
 the increase in delay is 1 second and the clock is good to 10 ppm, it
 could wait for days before accepting another sample.
 
 Oh but it can!
 
 Check out huff-puff!
 
 You can easily tell ntpd to coast past multi-hour periods of
 excessive delays/traffic.

With huff-puff it doesn't really coast, it just shifts the offset in
one direction by the increase in the delay. This works well when the link
is saturated in one direction, but under normal conditions it makes the
timekeeping worse, so you need to consider if it's worth enabling.

If you want to see why ntpd can't drop more samples, you can block the
NTP packets in a firewall, e.g. in a cycle which allows 4 packets and
drops 60. The PLL will be unstable: the frequency will jump up and
down and the offset will be orders of magnitude higher. This is the
reason why some other NTP implementations were created.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-23 Thread Miroslav Lichvar
On Fri, Mar 23, 2012 at 11:49:19AM +0100, Terje Mathisen wrote:
 unruh wrote:
 No I would not. That is not what ntpd does. It really does throw away 7
 of the samples and never uses them. The whole question is what is the
 best statistic to use. I do not believe that the shortest roundtrip
 time is that best statistic. If you could convince me it is, I would be
 more than happy to have ntp use it.
 
 In _some_ scenarios, keeping only the minimum rttsample is indeed
 the best approach:

Yes, it depends on the network jitter and clock stability. But ntpd
doesn't try to estimate the stability and uses a fixed dispersion rate
and Allan intercept in the filter algorithm (15 ppm and 1024 sec by
default). By tweaking the constants you can change the ratio of
dropped samples.

But I think a much bigger problem with the clock filter and PLL
combination is that it can't drop more than 7 samples. When the
network is saturated, it's usually better to drop many more than that. If
the increase in delay is 1 second and the clock is good to 10 ppm, it
could wait for days before accepting another sample.
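
As a back-of-the-envelope check of that last estimate, this Python sketch computes how long a free-running clock takes to accumulate a given error at a given stability. Treating the extra delay as an offset error of up to half the delay is an assumption of this example, and the function name is made up.

```python
def coast_time_seconds(error_s: float, stability_ppm: float) -> float:
    """Time for a clock drifting at stability_ppm to accumulate error_s."""
    return error_s / (stability_ppm * 1e-6)

# Up to 0.5 s offset error from 1 s of extra one-way delay, 10 ppm clock.
half_delay = coast_time_seconds(0.5, 10.0)
full_delay = coast_time_seconds(1.0, 10.0)
print(half_delay / 3600.0, "hours")
print(full_delay / 86400.0, "days")
```

Until the accumulated drift is comparable to the error of the delayed sample, ignoring that sample and coasting gives the better time, and at 10 ppm that takes on the order of a day.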

 In order to be considered OK, we can't accept more than 50 ppb
 frequency offset.
 
 Handling this with up to 50 ms sawtooth variation (with periods up
 to several hours) in the one-way latency means that the vendor
 require sampling periods of up to 10+ hours, with multiple
 packets/second and then keeping a single packet at the end.

That seems excessive. Do they set the frequency directly just from the
last two samples? With PLL or similar, increasing the time constant
accordingly might be a better approach.

-- 
Miroslav Lichvar


Re: [ntp:questions] PSYCHO PC clock is advancing at 2 HR per second

2012-03-20 Thread Miroslav Lichvar
On Tue, Mar 20, 2012 at 02:59:12AM +, Dave Hart wrote:
 Although it's the first time I've seen such, it appears the offset and
 frequency calculations both ended up overflowing.  I would have
 guessed bad input should have appeared in peerstats before loopstats
 but I didn't find anything unusual.

This sounds familiar. Perhaps the OP is hitting bug 2156, which was fixed
recently? If the emulated adjtime on Windows doesn't apply the 500 ppm
limit, that could explain the huge frequency error.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Failed to test leapsecond's handling

2012-03-08 Thread Miroslav Lichvar
On Thu, Mar 08, 2012 at 01:10:02PM +0100, Marco Marongiu wrote:
 But when I graph the time log (see the log target in the makefile), I
 don't see the leap second kicking in. Based on Mills' The NTP Timescale
 and Leap Seconds[1], when the leap second kicks in, I'd expect two
 consecutive date commands to _appear_ to happen at a different offset than in
 normal conditions. Unfortunately, that didn't happen, and if I draw a
 line of the accumulated offsets between consecutive runs of the command,
 the line is almost perfectly straight.

Do you see the leap bit enabled in ntptime or adjtimex output? Is the
local timezone UTC? Just to make sure the date command sets the time to
before 0:00 UTC and not some other hour. It would also be interesting to
try disable kernel in ntp.conf.

In a clknetsim simulation with ntp-4.2.6p5 I can see the clock is
correctly stepped by 1.0 second. Here is the ntpd log (in UTC+2
timezone):

http://pastebin.com/ZRi6qv8E

-- 
Miroslav Lichvar


Re: [ntp:questions] Failed to test leapsecond's handling

2012-03-08 Thread Miroslav Lichvar
On Thu, Mar 08, 2012 at 02:28:07PM +0100, Miroslav Lichvar wrote:
 In a clknetsim simulation with ntp-4.2.6p5 I can see the clock is
 correctly stepped by 1.0 second. Here is the ntpd log (in UTC+2
 timezone):
 
 http://pastebin.com/ZRi6qv8E

In another simulation set to start 15 seconds before midnight it
didn't work, so it seems ntpd needs to be started sooner, perhaps
by some number of polling intervals?

-- 
Miroslav Lichvar


Re: [ntp:questions] Leap second

2012-01-06 Thread Miroslav Lichvar
On Fri, Jan 06, 2012 at 12:41:13PM +0100, Rob van der Putten wrote:
 It is announced now, it occurs Jun 30.
 The tzdata database contains a file called leapseconds which contains
 all of the leap seconds which have occurred or are known to occur in the
 future.
 
 In 'right' (based on the International Atomic Time) it does, in
 'posix' (based on the Coordinated Universal Time) it doesn't.
 Does anyone use 'right'? Is this supported by NTPD?

I don't think you can use the right timezones on a system that is
synchronized via NTP. But ntp/chrony could use the information about
leap seconds stored in the right/UTC timezone, and I think that would
be a nice feature. To check if a leap second will occur on a specified
date, it just needs to call mktime() in the right/UTC zone and see if
the seconds overflowed or not, see

http://pastebin.com/DqM4s35Y
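
For illustration, the same check can be sketched in Python. This is not
the code behind the link above; the function name and its zone parameter
are made up for the example, and it assumes a Unix-like system where the
"right" zones are installed. If they are missing, the C library
typically falls back to UTC and the check simply reports no leap second.

```python
import os
import time


def has_leap_second(year, month, day, zone="right/UTC"):
    """Return True if 23:59:60 exists on the given date in `zone`."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = zone
    time.tzset()
    try:
        # Ask for second 60 and let mktime() normalize the result.
        stamp = time.mktime((year, month, day, 23, 59, 60, 0, 0, 0))
        # With a leap second the seconds field stays at 60, otherwise
        # it overflows to 00:00:00 of the following day.
        return time.localtime(stamp).tm_sec == 60
    finally:
        # Restore the original timezone setting of the process.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


print(has_leap_second(2012, 6, 30))
```

In a plain UTC zone the function always returns False, which is a quick
way to confirm that the seconds really overflow without leap second data.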

-- 
Miroslav Lichvar


Re: [ntp:questions] Visualization of clock control

2012-01-05 Thread Miroslav Lichvar
On Thu, Jan 05, 2012 at 11:40:25AM +0800, Dennis Ferguson wrote:
 On 4 Jan, 2012, at 22:54 , Miroslav Lichvar wrote:
  The simulations were done with a clock wandering at 1 ppb/s,
  10/100/1000us network jitter with exponential distribution and the NTP
  clients were configured to use 64s polling interval.
 
 That's pretty neat.  I think, however, that the clock wander of 1 ppb/s
 is about an order of magnitude too large for real life, at least for machines
 kept in an air conditioned room (and the behavior of clocks in machines
 subject to environmental variations probably can't be modeled by wander at
 all).  My measurements against precise hardware tended towards a value of
 1ppb/10s, which is also consistent with the 10^-8/1000s which sometimes shows
 up on Allan variance plots (I think there's a square root relationship in
 there if the wander is a truly random walk).

I think the 1ppb/10s random walk wander corresponds to ~0.32ppb/1s.
The +0.5 slope in the variance plot intersecting 10^-8 at 1000s would
be ~0.6ppb/s wander.
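
The two conversions can be reproduced with a few lines of Python. This
is only a sketch: the function names are invented here, and the second
one assumes the usual random-walk FM relation, Allan variance =
wander^2 * tau / 3, which is not spelled out in this thread.

```python
import math


def wander_per_second(wander_ppb, interval_s):
    """Scale random-walk wander quoted over interval_s to a 1 s interval."""
    # A random walk grows with the square root of the elapsed time.
    return wander_ppb / math.sqrt(interval_s)


def wander_from_adev(adev_ppb, tau_s):
    """Wander per second from an Allan deviation point on the +0.5 slope.

    Assumes adev^2 = wander^2 * tau / 3 (random-walk frequency noise).
    """
    return adev_ppb * math.sqrt(3.0 / tau_s)


# 1ppb/10s, and 10^-8 (10 ppb) at 1000 s
print(round(wander_per_second(1.0, 10.0), 2))   # 0.32
print(round(wander_from_adev(10.0, 1000.0), 2))  # 0.55
```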

I tried to model some thermal effects by adding a sine, triangle or
pulse wave to the clock frequency, but it seemed to me the effect it
had on the overall RMS time error was similar to just increasing the
wander. So instead of three or more parameters of the clock I set only
one. Sometimes I use even 10ppb/s wander, to simulate a machine with
varying CPU load and I think the results are not that different from
what I see on my desktop.
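
A rough Python sketch of such a clock model, for readers who want to
experiment. It is not how clknetsim generates its clocks; the Gaussian
steps, the function names and the default sine period are assumptions
made for this example.

```python
import math
import random


def simulate_frequency(n_seconds, wander_ppb=1.0, sine_amp_ppb=0.0,
                       sine_period_s=3600.0, seed=0):
    """Return one frequency offset (in ppb) per simulated second."""
    rng = random.Random(seed)
    walk = 0.0
    freqs = []
    for t in range(n_seconds):
        # Random-walk wander: one Gaussian step per second.
        walk += rng.gauss(0.0, wander_ppb)
        # Optional "thermal" component modeled as a sine wave.
        thermal = sine_amp_ppb * math.sin(2.0 * math.pi * t / sine_period_s)
        freqs.append(walk + thermal)
    return freqs


def time_error_us(freqs_ppb):
    """Integrate the frequency over 1 s steps; offsets in microseconds."""
    offset_ns = 0.0
    offsets = []
    for freq in freqs_ppb:
        offset_ns += freq  # 1 ppb held for 1 s accumulates 1 ns
        offsets.append(offset_ns / 1000.0)
    return offsets


freqs = simulate_frequency(3600, wander_ppb=1.0, sine_amp_ppb=50.0)
print(len(freqs), time_error_us(freqs)[-1])
```

Comparing the uncorrected time error with and without the sine term
gives a feel for how much of it could be absorbed into a larger wander.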

BTW, the simulator can be configured to read the clock frequency from
a file. If you have real data from a PPS refclock, you can use that
and see at what random-walk wander ntpd gives similar results.

 The other difficulty with respect to real life may be modeling network jitter
 as exponential, since I believe the probability distribution for network
 delays is heavy-tailed (i.e. with extreme values way over-represented; this
 is a problem when using statistics which assume the underlying error
 distribution is gaussian).  I don't know how to fix that, though.

I'd definitely be interested in a better model for network delays. I
guess we could try to make a collection of ntp rawstats logs from
various network environments and see what the distribution looks like.
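
As a starting point, the delays could be pulled out of rawstats with
something like the Python sketch below. It assumes the documented
rawstats layout (day, second, two addresses, then the origin, receive,
transmit and destination timestamps); the sample line and the helper
names are made up for the example.

```python
from decimal import Decimal


def rawstats_delay_offset(line):
    """Return (delay, offset) in seconds from one rawstats line."""
    fields = line.split()
    # Decimal keeps the nanosecond digits that a float would lose.
    t1, t2, t3, t4 = (Decimal(x) for x in fields[4:8])
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return delay, offset


def quantile(values, q):
    """Nearest-rank quantile, good enough for eyeballing a tail."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


# Hypothetical sample line, not taken from a real log.
sample = ("55930 3600.123 192.0.2.1 192.0.2.2 "
          "3534700000.100000000 3534700000.100300000 "
          "3534700000.100400000 3534700000.100800000")
print(rawstats_delay_offset(sample))
```

Comparing a high quantile of the delays (say the 99th percentile) with
the median would be one simple way to see how heavy the tail is.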

-- 
Miroslav Lichvar


[ntp:questions] Visualization of clock control

2012-01-04 Thread Miroslav Lichvar
Hi,

I wrote a tool to visualize the data generated by the clknetsim
simulator and I thought some of you might find it interesting. The
goal was to show how a clock is controlled by an NTP client and at the
same time see its offset from true time and the NTP measurements (the
actual offset and delay seen by the client).

Here are some example runs of the tool captured to animated gifs:
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_10us.gif
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_100us.gif
http://mlichvar.fedorapeople.org/clknetsim/chrony_ntp/vis/visclocks_1000us.gif

The simulations were done with a clock wandering at 1 ppb/s,
10/100/1000us network jitter with exponential distribution and the NTP
clients were configured to use 64s polling interval.

The white line is the reference clock. The red line is the clock
controlled by ntp and green is chrony. The blue lines are the NTP
measurements made by chrony. Both clients were getting the same data,
but the polling intervals were not exactly the same so the frequency
changes in the red line don't match exactly with the blue lines.

The tool is included in the clknetsim git as visclocks.py. It also has
a game mode, where you control the frequency and phase of the clock by
mouse and you can try to beat the other clients. :)

-- 
Miroslav Lichvar


Re: [ntp:questions] Configure FreeBSD or Linux to use stepping clock?

2011-12-16 Thread Miroslav Lichvar
On Thu, Dec 15, 2011 at 07:50:08PM +, Dave Hart wrote:
 Dr. Mills raised the possibility privately that either FreeBSD or
 Linux might be reconfigured to use a more primitive clock that steps
 once per millisecond or less.  If possible and I am able to accomplish
 it, my testing of these bug 2037 fuzzing changes would be greatly
 assisted.

On Linux, you could select a different kernel clocksource, perhaps
jiffies or pit, if available.

Check these files:
/sys/devices/system/clocksource/clocksource0/current_clocksource
/sys/devices/system/clocksource/clocksource0/available_clocksource
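
A small Python sketch for reading those two files. The helper name is
invented for the example, and it returns empty results on systems where
the files don't exist. Changing the clocksource means writing a name to
current_clocksource as root, which the sketch deliberately doesn't do.

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available), or (None, []) if the files are missing."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # Not Linux, or sysfs is not mounted.
        return None, []
    return current, available


current, available = read_clocksources()
print("current:", current)
print("available:", ", ".join(available))
```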

Why not degrade the resolution of the clock directly in the ntp sources?
In get_systime():

	GET_SYSTIME_AS_TIMESPEC(ts);
	/* truncate the nanoseconds to a multiple of 100 */
	ts.tv_nsec /= 100;
	ts.tv_nsec *= 100;
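
The effect of that integer division and multiplication can be tried out
in Python. The helper below is hypothetical and only mirrors the
arithmetic; a larger resolution_ns value gives a coarser clock.

```python
def truncate_timestamp(sec, nsec, resolution_ns=100):
    """Round the nanoseconds down to a multiple of resolution_ns."""
    return sec, (nsec // resolution_ns) * resolution_ns


print(truncate_timestamp(5, 123456789))             # (5, 123456700)
print(truncate_timestamp(5, 123456789, 1_000_000))  # (5, 123000000)
```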

-- 
Miroslav Lichvar


Re: [ntp:questions] New ntp Server

2011-12-09 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 12:50:12PM +, Dave Hart wrote:
 The results are worse than FreeBSD or Linux.  I suspect the difference
 is mostly due to the interpolation code having to guess at when, on
 the counter timescale, the system clock ticked up to the present
 value.  Some ugly busy-looping logic might help refine that and also
 overcome the incompatibility with newer Windows versions' clocks.

It's a pity the system doesn't provide a function for precise clock
reading.

What is the resolution of the clock frequency adjustment? I'm reading
about the SetSystemTimeAdjustment function, and the adjustment is in
100-ns units applied over an lpTimeIncrement interval. If the interval
is too short, I suspect this could also limit the time and frequency
accuracy of the system clock.
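
To put numbers on that suspicion, here is a back-of-the-envelope
calculation in Python. The function name is made up, and the two
interval values (15.625 ms and 1 ms) are just examples, not measurements
from any particular Windows system.

```python
def adjustment_resolution_ppm(time_increment_100ns):
    """Frequency step (ppm) of one 100-ns unit per adjustment interval."""
    # One unit out of time_increment_100ns units per interval.
    return 1e6 / time_increment_100ns


for increment in (156250, 10000):  # 15.625 ms and 1 ms, in 100-ns units
    print(increment, adjustment_resolution_ppm(increment), "ppm")
```

With these example values one adjustment unit corresponds to a
frequency step of 6.4 ppm and 100 ppm respectively, so a shorter
interval gives a coarser frequency control.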

-- 
Miroslav Lichvar


Re: [ntp:questions] New ntp Server

2011-12-08 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 03:55:50AM +, Mark C. Stephens wrote:
 Oh no I am quite happy with my hp3325A ;)
 
 
 Well Okay, after a slight detour trying to get ilo100 to work, I loaded 
 centos 6.0 x64 on the DL165 G2 (computer) and found it has 3.3V PCI slots. So 
 none of my Serial I/O cards fit, being 5V. I have seen people take a dremel 
 to them to cut a 3.3V notch, but I am not a 100% sure this works. 
 
 Centos 6.0 is really impressive I have to say. Also the PPS kernel module is 
 already built and installed, just need to load it.

The kernel includes general PPS support, but there is no support for
PPS on serial devices (pps_ldisc module). You'll probably need to use
a newer kernel version or backport the module to the old one.
You'll also need to recompile the ntp package with the timepps.h
header.

It might be easier to try a newer distro. For instance, Fedora 14 and
later have kernel, ntp and chrony packages compiled with PPS support
and it should work out of the box, even with SELinux enabled :).

-- 
Miroslav Lichvar


Re: [ntp:questions] New ntp Server

2011-12-08 Thread Miroslav Lichvar
On Thu, Dec 08, 2011 at 07:53:15AM +, Mark C. Stephens wrote:
 Hello Sir Unrah,
 
 
 I just use ntpq -p. I am using Dave Harts rather excellent port to windows:
 
 C:\Program Files\NTP\bin>ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 *GPS_NMEA(1)     .GPS.            0 l    1   16  377    0.000   -0.139   0.059
 oPPS(1)          .PPS.            0 l    -   16  377    0.000   -0.007   0.002
 
 I restarted ntpd a couple of hours ago so these number will improve.
 
 That is a good question, are we talking seconds for offset and jitter here? 

They are milliseconds. If ntpd on Windows can really keep the clock
stable to ~10 microseconds, the recent suggestion posted here to
never use Windows for serious timekeeping might need to be revisited.

-- 
Miroslav Lichvar


Re: [ntp:questions] Ginormous offset and slow convergance

2011-12-02 Thread Miroslav Lichvar
On Thu, Dec 01, 2011 at 12:24:44AM +, Pete Ashdown wrote:
 Miroslav Lichvar mlich...@redhat.com writes:
 
 Would be interesting to know if this happens on every ntpd restart or
 only shortly after the GPS unit was powered up.
 
 Every restart (that doesn't have 127.127.0.1 in the config).

That would suggest a problem rather on the ntpd side. I wasn't able to
analyze the oncore debug messages in your other post, but maybe you
could try to switch the unit to NMEA mode and use the NMEA driver or
try it with gpsd and SHM driver and see if that makes a difference.

-- 
Miroslav Lichvar


Re: [ntp:questions] Ginormous offset and slow convergance

2011-11-30 Thread Miroslav Lichvar
On Wed, Nov 30, 2011 at 11:28:22PM +, unruh wrote:
 On 2011-11-30, Miroslav Lichvar mlich...@redhat.com wrote:
  On Wed, Nov 30, 2011 at 10:24:45PM +, unruh wrote:
  If he has peerstats log file, he can look at it and see what the offset
  is of the oncore and the other ntp sources to see if it is really
  misbehaving that badly. Also, if it is out by 16 sec, why in the world
  has ntp not stepped the time? The threshold is 128ms. 
 
  I think it did step and more than once. I'd suspect a bug in the
  firmware in the GPS-UTC offset handling, current offset is 15 seconds
  and that is visible in one of the ntpq outputs in the original post.
 
 But how could he get a 16 second offset, after starting out with a .1 s
 and 1 s offset. At 500PPM, 16 sec takes 32000 sec  (10 hr) to accumulate
  which is poll interval 15. Ie, I cannot see how ntpd could have
  allowed that huge an offset to occur. 

ntpd doesn't step more than once per 15 minutes. What I think was
happening: on start the clock is good to a couple of ms, the NTP servers
are not reachable yet, but the GPS is off by 16s, so ntpd steps
immediately; then the GPS is off by 15s and the NTP servers are off by
16s, and ntpd doesn't step yet; finally both the GPS and the NTP servers
are off by 16s, and ntpd steps back and stabilizes.

The loopstats log would be useful.

-- 
Miroslav Lichvar


Re: [ntp:questions] NTP on embedded Linux with GPRS connection

2011-11-24 Thread Miroslav Lichvar
On Wed, Nov 23, 2011 at 08:06:28AM -0800, mas...@tlen.pl wrote:
 - use hwclock --adjust. So after ntpd synchronises the time, I would
 have to issue hwclock -w -u and repeat it after at least 24h, so
 hwclock can estimate the drift. Then repeat this process.

To use the --adjust option with ntpd you'll need to make sure the
kernel RTC synchronization (11 minute mode) is not enabled as it would
throw off the RTC drift estimation. See hwclock(8) for more
information.

-- 
Miroslav Lichvar


Re: [ntp:questions] Loop Frequency and Offset

2011-09-27 Thread Miroslav Lichvar
On Tue, Sep 27, 2011 at 10:20:57AM +0100, David Woolley wrote:
 Richard B. Gilbert wrote:
 
 
 I don't believe that accuracy of 1 microsecond, or less, is
 obtainable without installing a GPS Timing Receiver or an
 atomic clock of some sort.
 
 He asked for an offset of 1 microsecond (presumably RMS or 90
 percentile?), not an accuracy of 1 microsecond.
 
 If you ignore systematic errors, an offset of 1 microsecond
 corresponds to an accuracy in the low 100s of nanoseconds.

Only if the loop is tracking frequency changes well. If you see long
runs of offsets with the same sign in the loopstats log, the actual
error is probably closer to the reported offset.
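
One way to look for such runs is sketched below in Python. It assumes
the usual loopstats layout with the offset in the third field; the
function names and the sample line are invented for the example, and
what counts as a "long" run is left to the reader.

```python
def loopstats_offsets(lines):
    """Extract the clock offset (third field, seconds) from loopstats lines."""
    return [float(line.split()[2]) for line in lines if line.strip()]


def longest_same_sign_run(offsets):
    """Length of the longest run of consecutive offsets with the same sign."""
    longest = current = 0
    previous_sign = 0
    for value in offsets:
        sign = (value > 0) - (value < 0)
        if sign != 0 and sign == previous_sign:
            current += 1
        else:
            # A sign change (or an exact zero) starts a new run.
            current = 1 if sign != 0 else 0
        previous_sign = sign
        longest = max(longest, current)
    return longest


# Hypothetical loopstats line: day, second, offset, frequency, ...
print(loopstats_offsets(["55930 100.0 0.000012 1.5 0.00001 0.001 6"]))
print(longest_same_sign_run([1e-6, 2e-6, -1e-6, -2e-6, -3e-6, 4e-6]))  # 3
```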

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-06 Thread Miroslav Lichvar
On Mon, Sep 05, 2011 at 04:47:20PM +, unruh wrote:
 On 2011-09-05, Miroslav Lichvar mlich...@redhat.com wrote:
  It's from gpsd which seems to make the NMEA receive timestamp after
  the message is processed.
 
 Never did understand that. Timestamping the beginning of the sentences
 is cheap enough and easy enough. 
 Mind you, your fluctuations are far more than I would expect simply from 
 variations in the length of the sentences.
 Are there more sentences delivered than just the one gpsd uses?

There are other messages enabled (I like to monitor the visibility of
satellites in cgps), but RMC and GGA are transmitted first. The baud
rate is set to 115200. The measured time it takes to transmit one
batch is about 85 +/- 10 ms.

Here is another capture, this time only over a couple of hours, but it's
the offset to the beginning of the transfer (i.e. start of RMC).

http://mlichvar.fedorapeople.org/tmp/18x_nmea2.png

The offset still moves in a 300ms range.

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-05 Thread Miroslav Lichvar
On Sat, Sep 03, 2011 at 08:05:21AM -0500, steven Sommars wrote:
 I monitored Garmin LVC (corrected firmware) NMEA time and saw variance of up
 to 50msec.  I wonder if the variation in NMEA time depends on GPS signal
 quality.

I'm wondering what the cause of the variance is, too.

With 18x LVC (firmware 3.70) I see errors up to 150 ms. That
wouldn't be that bad if it was randomly distributed.

A capture over 30 hours:
http://mlichvar.fedorapeople.org/tmp/18x_nmea.png

-- 
Miroslav Lichvar


Re: [ntp:questions] garmin 18x and linux

2011-09-05 Thread Miroslav Lichvar
On Mon, Sep 05, 2011 at 03:04:54PM +, unruh wrote:
 On 2011-09-05, Miroslav Lichvar mlich...@redhat.com wrote:
  With 18x LVC (firmware 3.70) I see errors up to 150 ms. That
  wouldn't be that bad if it was randomly distributed.
 
  A capture over 30 hours:
  http://mlichvar.fedorapeople.org/tmp/18x_nmea.png
 
 This was captured how? Is that the beginning or the end of the nmea
 sentence?
 You have some where the offset is negative. Does this really mean that
 the nmea came in before the beginning of the second it referred to?

No, I forgot to mention that was already with a 0.5s correction applied.
It's from gpsd which seems to make the NMEA receive timestamp after
the message is processed.

-- 
Miroslav Lichvar


Re: [ntp:questions] Accuracy of GPS device

2011-09-02 Thread Miroslav Lichvar
On Fri, Sep 02, 2011 at 09:50:05AM +0100, Miguel Gonçalves wrote:
 I found out the problem and just for the record I'll explain...
 
 The offset is larger than the delay because NTPd is using 10.0.2.254 (more
 on this switch later) as a time source and it shouldn't because it has two
 local stratum 1 clocks that are closer (0.170 ms vs 0.583 ms) and show less
 jitter. Anyway... to prove my point I removed 10.0.2.254 (the **internal**
 switch) from the configuration and here's the result of ntpq -p as of now:

It would be interesting to see the root distances for the three
servers. I think it's reasonable to expect the weights of the stratum
1 servers to be much higher than the weight of the third server, so
that the combined offset isn't affected much by the third server. But
what I think is happening here is that the high default dispersion
rate (15 ppm) increases the root distance so much that the weights are
not that much different. Setting tinker dispersion to a more realistic
value like 1 ppm (or even to 0.1 ppm in your case, see my comment below)
should help.
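
A simplified illustration of the effect in Python. It assumes the root
distance is just half the delay plus the dispersion accumulated since
the last update, and that the combining weight is 1/distance; the real
calculation in ntpd has more terms. The delays are the 0.170 ms and
0.583 ms from the quoted message, and the age of 1024 s is an assumption
based on the polling interval.

```python
def root_distance_ms(delay_ms, age_s, dispersion_ppm):
    """Half the delay plus dispersion accumulated over age_s."""
    # 1 ppm over 1 s is 1 microsecond, i.e. 0.001 ms.
    return delay_ms / 2.0 + dispersion_ppm * age_s * 0.001


def weight_ratio(delay_a_ms, delay_b_ms, age_s, dispersion_ppm):
    """How much more weight source A gets than source B."""
    dist_a = root_distance_ms(delay_a_ms, age_s, dispersion_ppm)
    dist_b = root_distance_ms(delay_b_ms, age_s, dispersion_ppm)
    return dist_b / dist_a  # weights are taken as 1/distance


for rate in (15.0, 1.0, 0.1):
    print(rate, "ppm:", round(weight_ratio(0.170, 0.583, 1024, rate), 2))
```

Under these assumptions the ratio is roughly 1.01 at 15 ppm, 1.19 at
1 ppm and 2.1 at 0.1 ppm, so the closer servers only start to dominate
once the dispersion rate is small.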

You can also use tos minclock 2 to limit the number of combined
sources. 

 $ ntpq -p 10.0.2.2
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 +10.0.2.10       .GPS.            1 u  889 1024  377    0.179   -0.066   0.083
 *10.0.2.9        .GPS.            1 u  391 1024  377    0.166   -0.084   0.051

Those are very good numbers for such a long polling interval. Is the
crystal oscillator thermally stabilized?

In any case, I'd suggest using a shorter maximum poll interval. The
default maxpoll is way too high for the jitter normally seen on LANs if
you want the best accuracy.

-- 
Miroslav Lichvar


Re: [ntp:questions] Fwd: Re: NetBSD GPS/PPS using 4.2.6p3

2011-08-22 Thread Miroslav Lichvar
On Sun, Aug 21, 2011 at 02:55:55PM -0700, A C wrote:
 That is where I obtained the ppstest code and then later I
 discovered the test code within the ntpd source distribution.  The
 NetBSD list also suggested that I compare kernel traces on the two
 programs.  It seems that ntpd's pps-api code behaves a bit
 differently than ntpd itself when it interfaces with the kernel.  I
 can provide traces to anyone that would like them for both the
 pps-api test program and ntpd 4.2.6p3.

 127.127.22.1  flag2 0 flag3 1 refid PPS\n\n
  11255  1 ntpd CALL  ioctl(7,PPS_IOC_SETPARAMS,0x1052d204)
  11255  1 ntpd CALL  ioctl(7,PPS_IOC_KCBIND,0xefffdf8c)

A shot in the dark: have you tried removing "flag3 1" to disable the
kernel PPS discipline?

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Mon, Aug 15, 2011 at 10:58:45PM -0500, Ken Link wrote:
 Now, I get timestamps in the assert and clear files: 1313465775.004708342#4545
 
 However, the source is still jittery! 15ms sometimes, like it's not
 using the PPS signal at all.

If you compare a few assert timestamps in a row, how stable is the
offset? A couple of microseconds?
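
A possible way to do that comparison is sketched below in Python. It
assumes the assert file holds "seconds.nanoseconds#sequence" with a
9-digit nanosecond field, as in the timestamp quoted above; the second
sample and the function names are made up for the example.

```python
def parse_assert(text):
    """Parse 'seconds.nanoseconds#sequence' into three integers."""
    stamp, seq = text.strip().split("#")
    sec, nsec = stamp.split(".")
    return int(sec), int(nsec), int(seq)


def successive_differences_ns(samples):
    """Change of the sub-second part between consecutive pulses, in ns."""
    totals = [sec * 1_000_000_000 + nsec for sec, nsec, _ in samples]
    diffs = []
    for a, b in zip(totals, totals[1:]):
        step = b - a
        # Remove the whole seconds between the two pulses.
        diffs.append(step - round(step / 1e9) * 1_000_000_000)
    return diffs


samples = [parse_assert("1313465775.004708342#4545"),
           parse_assert("1313465776.004709000#4546")]  # second one is made up
print(successive_differences_ns(samples))  # [658]
```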

 The important configuration lines are:
 server 127.127.20.0 mode 18 minpoll 4 prefer
 fudge 127.127.20.0 flag1 1 flag2 0 flag3 1 flag4 0

You wrote earlier that you use a 2.6.38 kernel. I think the kernel PPS
discipline was added later, so maybe it would help to remove the flag3
setting.

Also, how is the PPS source marked in the ntpq -p output?

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Tue, Aug 16, 2011 at 09:42:09AM -0500, Ken Link wrote:
 I wrote a script to compare the previous/current assert timestamp
 (ignoring the seconds), and this is what I get when NTP is *not*
 running:
 
 -.04357
 -.07328
 -.08905
 -.07969
 -.07593
 -.09174
 -.06566
 -.08152
 -.07500
 -.06854

That looks good.

 I tried setting flag3 to 0 but it didn't appear to make a difference.
 Also, I would have expected NTP to mark the clock as a PPS source with
 'o', but when I let it sync it seems to stick with '*' instead:
 
 $ ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  LOCAL(0)        .LOCL.          12 l  232   64   10    0.000    0.000   0.002
 *GPS_NMEA(0)     .GPS.            0 l    7   16  377    0.000   20.012  26.423

It seems the GPS driver is not getting or is ignoring the PPS signal.
I think there were some issues fixed recently in it. I'd try the ATOM
driver (22) first to verify ntpd was compiled with PPS support and is
able to use it.

-- 
Miroslav Lichvar


Re: [ntp:questions] Sure GPS - Very High Jitter and Offset

2011-08-16 Thread Miroslav Lichvar
On Tue, Aug 16, 2011 at 12:49:16PM -0500, Ken Link wrote:
 
 I tried copying the header from ntpsrc/ports/winnt/include/timepps.h
 to /usr/include/timepps.h, but no dice. Do I just need to copy some
 more headers somewhere or does this mean I have to recompile the
 kernel?

I think you just need the timepps.h header, try this one
https://raw.github.com/ago/pps-tools/HEAD/timepps.h

-- 
Miroslav Lichvar


Re: [ntp:questions] [ntp:hackers] ntpdate removal is coming

2011-07-18 Thread Miroslav Lichvar
On Sun, Jul 17, 2011 at 12:57:06PM -0700, Harlan Stenn wrote:
 Just to be clear, there *used* to be some reasons to set the clock
 before starting ntpd.  In general, there is no need to do this anymore
 and I have not heard any good reasons it should still be needed.
 
 If anybody knows of any *good* reasons to set the clock before starting
 ntpd, please speak up.

With the -x option (or any larger tinker step), setting the clock
before starting ntpd is useful to avoid a possibly very long initial
offset correction.

That wouldn't be needed if ntpd had an option to always step on start.

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] How do I prevent sudden system time jumps.

2011-07-14 Thread Miroslav Lichvar
On Wed, Jul 13, 2011 at 04:49:37PM -0500, Hal Murray wrote:
 The problem is that the adjustment takes too large steps, not that it
 takes too long.
 
 ntpd will slew the clock at 500 PPM.  You may be willing to wait a while
 for a second or two, but it takes a long time if you have to adjust
 by several minutes or an hour.  That may be OK for your setup, but you
 should think about it.
 
 If you are using the -x flag, be sure to check out the -g flag that will
 let it do one long jump at startup time.

The -g option only allows the initial offset correction to be larger
than 1000 seconds, it doesn't affect the step value.

When both -x and -g are used and the initial offset is just below 600
seconds (-x is an alias for tinker step 600), it will still take about
two weeks to correct the offset.
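
The "about two weeks" follows directly from the 500 ppm slew limit
mentioned in the quoted text, as this small Python calculation shows
(the function name is made up for the example):

```python
def slew_duration_days(offset_s, max_slew_ppm=500.0):
    """Time needed to slew away offset_s at the maximum slew rate, in days."""
    seconds = offset_s / (max_slew_ppm * 1e-6)
    return seconds / 86400.0


print(round(slew_duration_days(600), 1))  # 13.9
```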

An interesting scenario is when tinker step is larger than panic and
the initial offset is between the two. With -g the first clock update
is allowed, but the clock is not stepped, so the following updates will
still be over the panic limit and ntpd will abort.

-- 
Miroslav Lichvar


Re: [ntp:questions] How do I prevent sudden system time jumps.

2011-07-14 Thread Miroslav Lichvar
On Wed, Jul 13, 2011 at 10:51:46PM +0100, David Woolley wrote:
 Hal Murray wrote:
 OK.  I asked since a timewarp of 200ms is a bit surprising for real HW,
 but is something to be expected if you were running in a VM.
 
 It's easy to get a time-warp of 200 ms on a DSL link.  Just download
 a huge file, say a CD.  The queuing delay on the input to the DSL
 link turns into asymmetrical delays.  I've seen delays up to 3.5 seconds.
 
 
 The huff and puff tinker option can help mitigate this.

Please note that while the huffpuff option works very well with large
temporary asymmetric delays, it makes things worse with normal
delays as the offset will contain network jitter.

-- 
Miroslav Lichvar


Re: [ntp:questions] how does jitter and round trip time affect the accuracy of the local clock?

2011-06-27 Thread Miroslav Lichvar
On Sun, Jun 26, 2011 at 09:05:59PM +0100, David Woolley wrote:
 If the jitter is of the order of 500 microseconds, and your
 delays are perfectly symmetric, and there is no clock wander (in
 particular, the temperature is tightly controlled), the error will
 exceed 500 microseconds, a small but significant amount of the time.

That wouldn't be a very good clock discipline if it wasn't able to
keep the clock error significantly below the jitter in these ideal
conditions.

In a simulation with 500us exponentially distributed jitter and
clock wander insignificant relative to the PLL time constant, the RMS
time error is about 40 us for the standard PLL and 80 us for the Linux
PLL. The 99th percentiles are about 100 us and 200 us respectively.

-- 
Miroslav Lichvar


Re: [ntp:questions] Garmin firmware update - GPS 18x 5Hz software version 3.20

2011-06-22 Thread Miroslav Lichvar
On Fri, Jun 17, 2011 at 10:11:19AM +0100, David J Taylor wrote:
 You may want to watch for several days.  Previous 18x LVC firmware (3.60)
 drifted on my system from 0.5 to 1.4 seconds over a span of 6 days. I've
 been running a beta version of 3.70 for several weeks with no problems.
 
 How stable is it?
 
 I don't care what the offset is as long as it's constant.
 
 It's constant, according to a graph I've seen from Steve.

I've updated the firmware on my unit from 3.20 to 3.70 and it seems
the jitter has indeed improved, see:
http://i.imgur.com/6NFBA.png

The middle part is when the unit was upgraded.

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpd[7602]: synchronized to

2011-05-26 Thread Miroslav Lichvar
On Thu, May 26, 2011 at 09:49:55AM -0700, Chuck Swiger wrote:
 On May 26, 2011, at 9:28 AM, Florin Andrei wrote:
  On 05/23/2011 03:57 PM, Chuck Swiger wrote:
  
  That being said, it's not expected that the preferred time source
  would change that frequently.  You'd probably do better to run
  with at least 4 servers, so that it can better judge their
  reliability; with only two, there isn't much basis for
  comparison.
  
  So, basically you're saying the quality of the NTP servers
  fluctuates for some reason and this machine is flipping back and
  forth between them?
 
 Evidently yes for your case.  With only two servers, it may not be
 possible to find a best intersection via ntpd's variant of
 Marzullo's algorithm:

I think the frequent source switching can happen with any number of
sources if the two best servers have similar synchronization distance.
With more servers you may have a better chance that the best server is
significantly better than the others though.

The problem can usually be fixed by increasing the anti-clockhopping
distance with tos mindist, perhaps to 10 ms.

-- 
Miroslav Lichvar


Re: [ntp:questions] Loop Filter Gains vs. Polling Interval

2011-05-17 Thread Miroslav Lichvar
On Sat, May 14, 2011 at 12:23:56PM -0500, Mischanko, Edward T wrote:
 Can anyone tell me, does the sensitivity for frequency adjustment lessen
 as the polling interval increases?  I ask because I'm observing that my
 offset increases and the frequency adjustment decreases to the point I
 fall out of sync at polling intervals above 256.  What am I doing wrong?

As I have recently learned, the Windows ntpd works with the daemon loop
(instead of the kernel loop), which is optimized more for stable clocks
and noisy networks. So if your clock's frequency changes rapidly, ntpd
won't be able to keep up as well as it would with the kernel loop.

Fortunately, you can improve that significantly by enabling the FLL
part of the loop at shorter polling intervals by setting a shorter Allan
intercept. In 4.2.6 it's 11 by default (set in log2(s)), i.e. FLL is
active with poll 11 and above.

For example:

tinker allan 7
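
Since both values are given as log2 of seconds, a quick Python check
shows what they mean in practice. The helper names are made up, and the
rule that the FLL is active when the poll is at or above the intercept
is only a restatement of the sentence above.

```python
def log2_to_seconds(log2_value):
    """Convert an ntpd log2(s) setting to seconds."""
    return 2 ** log2_value


def fll_active(poll_log2, allan_log2):
    """FLL is active when the poll is at or above the Allan intercept."""
    return poll_log2 >= allan_log2


print(log2_to_seconds(11), log2_to_seconds(7))  # 2048 128
print(fll_active(8, 11), fll_active(8, 7))      # False True
```

So with tinker allan 7 the FLL should already contribute at the 256 s
polling interval (poll 8) mentioned in the question.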

HTH,

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpd stays synced after loosing gps

2011-05-11 Thread Miroslav Lichvar
On Wed, May 11, 2011 at 05:17:14AM +, Dave Hart wrote:
 2011/5/11 Николай Орехов nowh...@mail.ru:
  Thanks, I'll try it! But is the recent tarball stable enough?
  I'm using latest 4.2.6 with some minor changes to support my LassenIQ TSIP
  and I need it to be very stable.
 
 I prefer ntp-dev in general over ntp-stable, but my perspective may be warped.
 
  Could you provide me link to this bug and fixes so I can make a patch?
 
 No, I'm afraid I can't.  I recall the problem you describe, and I
 haven't seen it in ntp-dev for a while, so I'm pretty sure it was
 fixed after 4.2.6 based on your experience, but I'd like to confirm
 that, as we are likely to start the push to polish 4.2.7 into a
 (-stable) 4.2.8 soon.  I do not know of an existing bug report for the
 behavior, though I haven't searched for one either.  Similarly while I
 believe it to be fixed in more recent versions, I can't at this point
 suggest when it was fixed or which change fixed it.

I think it's bug #1554, which was fixed only in 4.2.7.

https://bugs.ntp.org/show_bug.cgi?id=1554

-- 
Miroslav Lichvar

Re: [ntp:questions] POSIX leap seconds versus the current NTP behaviour

2011-05-11 Thread Miroslav Lichvar
On Fri, May 06, 2011 at 12:44:42PM +0800, Dennis Ferguson wrote:
 Of course I do want to adjust the system's clock too, I just can't
 do it the way the NTP code was doing it.  What could be done,
 however, was design a system call interface which allowed time and
 frequency adjustments (which are done solely with arithmetic as
 well), but did so in a way which returned enough information about
 what was done to allow one to precisely compute the time one would
 have gotten from the unadjusted clock as a function of the adjusted
 clock's timestamps.

This is very interesting. Do you have a description of the new
interface? I'd really like to see something like that supported in
stock kernels, although I'm primarily interested in Linux. I've
proposed to the kernel maintainer of the subsystem an extension of the
adjtimex interface with a variant of the SINGLESHOT mode which would
allow slewing at nanosecond resolution and at a specified rate, and
which would provide a timestamp of exactly when the adjustment started.
This should allow us to accurately reconstruct any timestamp in history
as if the adjustment had never happened or was complete.

Currently, we use an ugly combination of three different slewing
mechanisms, each with different shortcomings. The first is a temporary
frequency/tick adjustment through adjtimex(): samples collected while
the adjustment is running are accurate, but the total adjustment is
not, and with each frequency change an error has to be estimated and
added to the dispersion of old samples. The second is adjtime(), which
slews accurately at microsecond resolution, but the reported remaining
adjustment is updated only once per second, which means that samples
collected while it's running have an error of up to 500 us and it's hard
to determine when exactly the adjustment finished (or started). The
third is the PLL in FREQHOLD mode: it allows nanosecond resolution, but
has the same problems as adjtime() and it's even harder to estimate the
error. It's a horrible mess, but it seems to be able to keep the clock
stable to 200-300 nanoseconds at a 16s update interval, on a machine
with a PPS refclock (1us jitter) and an ordinary clock oscillator
(wander estimated at 1ppb/s).

 And as for results transferring time from the card to the system
 clock, I have found that if it samples the offset 4 times per second
 and processes that data to determine time and frequency errors
 (using a least squares fit, after outlier filtering) then, if an
 adjustment is made only when it computes a time or frequency result
 which differs from the current clock setting at an 80% confidence
 level, it will typically end up making an adjustment roughly every
 10 seconds or so with the time adjustments tending to be about 10
 nanoseconds in size and the frequency adjustments being very roughly
 on the order of 10^-9.

Impressive numbers. I'd expect larger offset corrections if the
frequency needs to be changed by 10 ppb every 10 seconds though. I
assume you have an ordinary clock oscillator without any
stabilization.

How many samples do you use to make the fit? Is it fixed or variable?
We use the runs statistical test (the number of runs of same-sign
offsets) and keep the maximum number of samples that pass the test. For
best performance, I think it should correspond to the Allan intercept.
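
A minimal sketch of the runs idea in Python, assuming the residuals of
the fit are available as a list ordered from oldest to newest. The
function names and the pass criterion (at least a fixed fraction of the
runs expected for random signs) are illustrative assumptions, not
chronyd's actual code or thresholds, and a real implementation would
redo the fit for each candidate length.

```python
def count_runs(residuals):
    """Count runs of consecutive residuals with the same sign (zero counts as positive)."""
    signs = [r >= 0 for r in residuals]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def samples_to_keep(residuals, min_samples=4, min_fraction=0.5):
    """Return the largest number of most recent samples whose runs count passes.

    For n residuals with random, equally likely signs the expected number
    of runs is (n + 1) / 2.  min_fraction is a hypothetical stand-in for a
    proper table of critical values.
    """
    for n in range(len(residuals), min_samples - 1, -1):
        recent = residuals[-n:]
        if count_runs(recent) >= min_fraction * (n + 1) / 2:
            return n
    return min(len(residuals), min_samples)
```

With a long stretch of same-sign residuals followed by a few alternating
ones, the sketch drops the oldest samples until the number of runs is no
longer suspiciously low.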

Thanks,

-- 
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] POSIX leap seconds versus the current NTP behaviour

2011-05-11 Thread Miroslav Lichvar
On Wed, May 11, 2011 at 12:55:15PM +0200, Miroslav Lichvar wrote:
 On Fri, May 06, 2011 at 12:44:42PM +0800, Dennis Ferguson wrote:
  level, it will typically end up making an adjustment roughly every
  10 seconds or so with the time adjustments tending to be about 10
  nanoseconds in size and the frequency adjustments being very roughly
  on the order of 10^-9.
 
 Impressive numbers. I'd expect larger offset corrections if the
 frequency needs to be changed by 10 ppb every 10 seconds though.

Hm, I can't read, it's 1 ppb per 10 seconds, sorry for the noise.
That seems to be a high quality oscillator.
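
For reference, the arithmetic behind the correction, as a small Python
sketch. It assumes the frequency error is constant over the interval;
the function name is only illustrative.

```python
def accumulated_offset_ns(freq_error_ppb, interval_s):
    """Time offset in ns accumulated by a constant frequency error over an interval."""
    # 1 ppb corresponds to 1 ns of drift per second.
    return freq_error_ppb * interval_s


print(accumulated_offset_ns(1, 10))   # 1 ppb held for 10 s
print(accumulated_offset_ns(10, 10))  # 10 ppb held for 10 s
```

So 1 ppb over 10 seconds gives 10 ns, which is consistent with the
quoted time adjustments, while 10 ppb would give 100 ns.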

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpd stays synced after loosing gps

2011-05-11 Thread Miroslav Lichvar
On Wed, May 11, 2011 at 05:48:21PM +0600, Nickolay Orekhov wrote:
 Just one more question :-)
 After losing sync, server should lower its stratum so other servers can't
 synchronize with it, but:
 
 ntpq pe
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  PPS(0)          .PPSE.           0 l  153    8    0    0.000    5.051   0.000
  GENERIC(0)      .TSIP.           0 l  158    8    0    0.000   -6.027   0.000
 
 ntpq rv
 associd=0 status=0028 leap_none, sync_unspec, 2 events, no_sys_peer,
 version=ntpd 4.2.7p164@1.2483 Wed May 11 07:27:34 UTC 2011 (1),
 processor=UNKNOWN, system=QNX/6.5.0, leap=00, stratum=1,
 precision=-10, rootdelay=0.000, rootdisp=9.456, refid=PPSE,

That has changed with ntp version 4. It doesn't increase the stratum and
change leap to unsync; it keeps increasing the root dispersion
instead. When the distance reaches a certain limit, the clients will
switch to another source.
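
A rough sketch of how long that takes, in Python. It assumes the root
dispersion grows at the RFC 5905 frequency tolerance of 15 ppm and that
the client rejects a source once rootdelay/2 + rootdisp exceeds 1.5 s;
the client's own delay and dispersion contributions are ignored, so the
real switch would happen somewhat earlier.

```python
PHI = 15e-6      # dispersion growth rate in s/s (RFC 5905 frequency tolerance)
MAXDIST = 1.5    # assumed client distance threshold in seconds


def seconds_until_rejected(rootdelay_s, rootdisp_s, maxdist=MAXDIST, phi=PHI):
    """Time until rootdelay/2 + rootdisp grows past maxdist if rootdisp grows at phi."""
    distance = rootdelay_s / 2 + rootdisp_s
    return max(0.0, (maxdist - distance) / phi)


# Values from the rv output above: rootdelay=0.000 ms, rootdisp=9.456 ms.
hours = seconds_until_rejected(0.0, 9.456e-3) / 3600
print(f"about {hours:.1f} hours")
```

Under these assumptions the server stays selectable for a bit more than
a day after losing its reference.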

-- 
Miroslav Lichvar


[ntp:questions] reftime xmt in server reply valid?

2011-05-05 Thread Miroslav Lichvar
Hi,

RFC 5905 has in section 5.1.1.:

/*
 * Verify valid root distance.
 */

if (r->rootdelay / 2 + r->rootdisp >= MAXDISP || p->reftime >
    r->xmt)
        return; /* invalid header values */


But it seems that ntpd (at least 4.2.6p3 and 4.2.7p162) doesn't have
the reftime > xmt check, only the distance check, is that correct?
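
To make the two conditions explicit, here is the same check as a small
Python sketch. Timestamps and intervals are plain seconds as floats,
which is a simplification, and the function name is only illustrative;
MAXDISP is 16 s in RFC 5905.

```python
MAXDISP = 16.0  # maximum dispersion in seconds (RFC 5905)


def header_is_valid(rootdelay, rootdisp, reftime, xmt):
    """Mirror the RFC check: reject if the root distance is too large or if
    the reference timestamp is later than the transmit timestamp."""
    if rootdelay / 2 + rootdisp >= MAXDISP:
        return False  # distance check
    if reftime > xmt:
        return False  # reftime > xmt check
    return True
```

The second condition is the one in question: a reply whose reference
timestamp is newer than its transmit timestamp passes the distance check
alone.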

Thanks,

-- 
Miroslav Lichvar


Re: [ntp:questions] Venting steam: Autokey in 4.2.6/4.2.7

2011-03-29 Thread Miroslav Lichvar
On Mon, Mar 28, 2011 at 11:11:28PM +, Dave Hart wrote:
 Autokey is very clever in dealing with some unique challenges other
 PKI OpenSSL client code doesn't have to.  Anyone attempting to
 configure it should be on payroll, if not time and a half.
 
 (insert series of profanities here)

I had a similar feeling when I was expanding my NTP test suite to test
basic Autokey functionality and compatibility between the 4.2.2, 4.2.4
and 4.2.6 versions. I eventually got most of it working, but I'm not
sure if it's working as intended or accidentally, by misplacing a
private key, etc.

I wasn't able to get the MV scheme working though. I have read the
official ntp-keygen page and the wiki document.

-- 
Miroslav Lichvar


Re: [ntp:questions] Secure NTP

2011-03-25 Thread Miroslav Lichvar
On Thu, Mar 24, 2011 at 05:01:07PM -0700, Chris Albertson wrote:
 Security is so that you know you are not being spoofed.  Or if you are
 providing the time so that you can prove to your users that you are
 who you claim to be and are not spoofing them.
 
 There is the chance that someone might impersonate one of your
 servers or a server you use. and then make a computer's clock be set
 to the wrong time.   Again who cares if you only use your computer
 to surf the web and read emails but what if you were a bank processing
 ATM or visa card transactions or worse a computer routing trains or
 airplanes or controlling stop lights.

There is one important thing I haven't seen mentioned here. A MITM
doesn't need to modify the NTP packets to seriously degrade your
timekeeping. He can exploit the PLL instability when undersampled: by
dropping and delaying the packets (up to maxdist, 1.5s by default)
he can fairly quickly throw your clock off and let you drift away.

In addition to the authentication, it's important to monitor
reachability of the peers.

-- 
Miroslav Lichvar


Re: [ntp:questions] Flash 400 on all peers; can't get ntpd to be happy

2011-03-09 Thread Miroslav Lichvar
On Tue, Mar 08, 2011 at 03:26:34PM -0800, Chuck Swiger wrote:
  You are better off running ntpdate (or sntp) periodically via cron in
  the DomUs.
  
  Perhaps in certain cases, but not across the board.
 
 I'd be happy to review counterexamples to my generalization

I'd say it depends on the VM.

For instance, Fedora 14 running in kvm on Fedora 14. There are four
clocksources available in the guest system: kvm-clock, tsc, hpet and
acpi_pm. With each of them the frequency seems to be stable, even when
the host or guest CPU is heavily loaded. The kvm-clock and hpet
clocks seem to be running at the same rate as the host's system clock,
tsc at the real CPU's rate, and acpi_pm is off by a few tens of ppm.

Here is the rv output from ntpd running in the guest with the tsc clock;
the host is not synchronized:

associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version=ntpd 4.2.6p2@1.2194-o Mon Aug 23 12:18:41 UTC 2010 (1),
processor=x86_64, system=Linux/2.6.35.2-9.fc14.x86_64, leap=00,
stratum=3, precision=-23, rootdelay=128.742, rootdisp=41.165,
refid=10.34.32.125,
reftime=d121ddab.0ab5b995  Wed, Mar  9 2011  6:06:19.041,
clock=d121ddab.5dd3a440  Wed, Mar  9 2011  6:06:19.366, peer=47730,
tc=4,
mintc=3, offset=-0.013, frequency=22.454, sys_jitter=0.011,
clk_jitter=0.016, clk_wander=0.028

-- 
Miroslav Lichvar


Re: [ntp:questions] Flash 400 on all peers; can't get ntpd to be happy

2011-03-08 Thread Miroslav Lichvar
On Tue, Mar 08, 2011 at 05:00:44PM +, unruh wrote:
  filtoffset= 67671.8 66534.8 65931.3 65118.0 63317.3 63029.5 62216.4 
  58156.6,

 Not at all sure how Mills comes into the picture. On a system where the
 frequency fluctuates wildly, ntpd is not the right answer, nor is any
 system. I suspect that the best you could do would be to run something
 like ntpdate often and jump the clock around.

The frequency offset in this case seems to be around 2%, which is still
well below the 10% maximum Linux can adjust. I'd try chrony before
resorting to ntpdate. The timekeeping probably won't be very good, but
at least the clock won't be stepped.

-- 
Miroslav Lichvar


Re: [ntp:questions] Detecting bufferbloat via ntp?

2011-02-14 Thread Miroslav Lichvar
On Mon, Feb 14, 2011 at 08:44:46AM +, Rob wrote:
 When the users would set their TCP window to a reasonable value, the
 bufferbloat problem would not exist!
 When the TCP window is correct for the delay*bandwidth product of a
 TCP session, there are no packets piling up in buffers halfway, as
 there is a continuous stream of just enough data.

How do you control the delay? Do you communicate only with servers
located on a circle around you and only with one at a time? What
about the other computers sharing the same internet connection?

I think you'll get much better results with a large window and traffic
shaping configured properly.

-- 
Miroslav Lichvar


Re: [ntp:questions] ntpstat source code master location?

2011-02-03 Thread Miroslav Lichvar
On Wed, Feb 02, 2011 at 05:07:19PM -0700, Schmidt, Bryan wrote:
 Normally my team uses the ntpstat utility to check that systems are in
 sync and so on.  Usually we use the one provided by our OS vendor, but
 in a particular case we require the latest stable ntpd build because of
 a bugfix that affects some of our systems.  The old build of ntpstat
 does not work with latest stable ntpd, I assume because of some sort of
 data structure change.

The ntpstat code needs an update, because some variables reported by
ntpd were renamed. Apply ntpstat-0.2-sysvars.patch from here:

http://pkgs.fedoraproject.org/gitweb/?p=ntp.git;a=tree

The other ntpstat-* patches might be useful too.

-- 
Miroslav Lichvar


Re: [ntp:questions] ntp-4.2.6p3-1.el5 - minpoll local PPS source

2011-02-02 Thread Miroslav Lichvar
On Sun, Jan 30, 2011 at 07:11:07PM -, Q wrote:
 My local PPS source is set for 'minpoll 4' (16 sec) this has had the knock 
 on effect that the other network based servers have all decided to poll at 
 64sec intervals.

I can confirm this with ntp-4.2.6p3 and a recent ntp-dev. But it seems
to be a design decision to use a similar polling interval for all
sources, even when they have very different jitter.

As others have said, a workaround is to set minpoll to 10 for the NTP
sources.

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-19 Thread Miroslav Lichvar
On Tue, Jan 18, 2011 at 09:48:08PM +, David Woolley wrote:
 That was my point.  Unruh's main issue is that, on modern LANs, the
 dominant low frequency error is in the local clock, rather than the
 measurements.

That's the theory behind NTP.

 It's more complicated.  I don't think the current version ages the
 samples,

It still does, see clock_filter() in ntp_proto.c:

dtemp = clock_phi * (current_time - peer->update);

 I
 think the filter will take out more like 7 in 8 for gaussian input,
 but the expected input pattern isn't actually gaussian, either.

I think the delay is assumed to be exponentially distributed (and
that's what I use in my simulations). It would be interesting to
analyse real data from rawstats.

Also, to answer the question whether the PLL has a good lock, you can
use the runs test. Run this command on your loopstats and we'll see if
the offset is random or not.

awk '{ n++; r += $3 * l < 0; l = $3 } END { print r / n }'
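
For those who prefer Python, a sketch of the same calculation. It
assumes the usual loopstats layout with the offset in the third
whitespace-separated field; the function name is only illustrative.

```python
def sign_change_fraction(lines):
    """Fraction of loopstats records whose offset sign differs from the previous one."""
    n = 0
    changes = 0
    last = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip empty or truncated records
        offset = float(fields[2])  # third field is the clock offset
        n += 1
        if offset * last < 0:
            changes += 1
        last = offset
    return changes / n if n else 0.0
```

It can be called with an open file, e.g.
sign_change_fraction(open("loopstats")). For independent offsets
scattered around zero the result should be close to 0.5; a much smaller
value means the offset keeps the same sign for long stretches.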

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-18 Thread Miroslav Lichvar
On Tue, Jan 18, 2011 at 09:02:55AM +, David Woolley wrote:
 The actual error, when locked, should be almost an order of
 magnitude less than the typical offsets.  Moreover, if there is both
 jitter and wander, and you set a very fast poll, you could get low
 offsets but a high error, because the real error is in the wander.
 Increasing the loop time constant will report higher offsets, but
 the time will actually be more correct.

The trouble is with "when locked". When the jitter reaches a certain
point (or better the ratio between jitter and clock stability --
usually expressed as the Allan intercept in the NTP docs), the PLL won't
be able to get a good lock and the clock accuracy will be limited only
by the clock stability, not the jitter.

As ntpd fixes the time constant to the polling interval, the only
thing you can do is to use a lower polling interval. If ntpd was able
to change the time constant and PLL/FLL mode independently from the
polling interval, it would be a huge improvement. Would be very tricky
though.

I'd suggest looking at the clknetsim graphs, I think you can get a
very good understanding of how time and frequency accuracy are affected
by jitter/wander, poll interval and the PLL gain.

http://mlichvar.fedorapeople.org/clknetsim/

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-18 Thread Miroslav Lichvar
On Tue, Jan 18, 2011 at 12:49:52PM +, David Woolley wrote:
 Miroslav Lichvar wrote:
 
 The trouble is with when locked. When the jitter reaches a certain
 point (or better the ratio between jitter and clock stability --
 usually expressed as Allan intercept in the NTP docs), the PLL won't
 be able to get a good lock and the clock accuracy will be limited only
 by the clock stability, not the jitter.
 
 Are you suggesting that the assumed Allan intercept if a couple of
 orders of magnitude too high?

Yes, when the jitter is too low or the clock too unstable. Ideally,
ntp would run a statistic and adjust it at runtime. Chrony counts the
number of runs of residuals with the same sign, perhaps it could work
for ntp too. But in ntp the assumed Allan intercept only controls the
PLL/FLL switch (and only in daemon mode), the PLL time constant is
always fixed to the polling interval.

 As ntpd fixes the time constant to the polling interval, the only
 thing you can do is to use a lower polling interval. If ntpd was able
 
 Linking them makes total sense.  The poll interval needs to be a
 small submultiple of the time constant, so that there is reasonable
 oversampling and allowance is made for the subsetting of the samples
 in the initial filter.  Polling faster than this adds very little
 information to the timing solution and polling slower will break the
 Nyquist criterion.

Linking them makes sense if you want to keep things simple and robust.
The problem is that even the minimum allowed poll 3 is too long in
some situations and that it wastes network bandwidth.

 to change time constant and PLL/FLL mode independently from polling
 interval, it would be a huge improvement. Would be very tricky though.
 
 I'd suggest you to look at the clknetsim graphs, I think you can get a
 very good understanding how is time and frequency accuracy affected by
 jitter/wander, poll interval and the PLL gain.
 
 http://mlichvar.fedorapeople.org/clknetsim/
 
 You are measuring (RMS) offset, which is not the same as error, and
 you are not accounting for network wander, which can reach 100s of
 ms, if NTP isn't prioritised.

Define "error" and "network wander".

Please note that the RMS offset is from the actual clock offset
maintained by the simulator, not the one reported by ntp.

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-18 Thread Miroslav Lichvar
On Tue, Jan 18, 2011 at 02:11:12PM +, David Woolley wrote:
 Miroslav Lichvar wrote:
 Yes, when the jitter is too low or the clock too unstable. Ideally,
 ntp would run a statistic and adjust it in runtime. Chrony counts
 
 It does.  I forget the exact metric, but look for the term poll adjust.

The allan_xpt variable is 11 by default and can be changed only by
the tinker allan command (which is what was used to force ntpd to
enable FLL in the clknetsim tests).

 Linking them makes sense if you want to keep things simple and robust.
 The problem is that even the minimum allowed poll 3 is too long in
 some situations and that it wastes network bandwidth.
 
 Reducing the poll interval without reducing the time constant will
 simply result in oversampling.  The time constant will still
 determine the loop behaviour. 

It will improve the accuracy when your Allan intercept is high (e.g.
you have a thermally stabilized oscillator). Reducing the time constant
is probably the more interesting case, but this needs to be done
carefully to avoid undersampling. The constant probably could be
reduced only to a half or a quarter, then it would have to switch to FLL.

 (It may make things worse if network
 delay transients no longer fit within the 8 sample filter.)

Good point. That's another advantage of reducing the time constant
independently from the poll interval.

 Network wander is when, for example, an asymmetric load is applied
 temporarily.  With 3.1 kHz modems, that would result in such a large
 wander that ntpd would start ignoring the server entirely, so was
 rather benign.  With lower speed DSL, it results in clock steps.
 With higher speed connections, the peak to peak wander may be under
 128ms.

Ok, there are some tests with temporary asymmetric delays on the
clknetsim page too. It's actually a good example where increasing the
time constant would allow dropping more than 8 consecutive samples,
instead of mangling them with huffpuff to avoid undersampling.

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-18 Thread Miroslav Lichvar
On Tue, Jan 18, 2011 at 07:18:08PM +, unruh wrote:
 On 2011-01-18, David Woolley david@ex.djwhome.demon.invalid wrote:
  The actual error, when locked, should be almost an order of magnitude 
  less than the typical offsets.  Moreover, if there is both jitter and 
 
 Uh, it depends. If the frequency shifts ( more or less work being done by
 the computer) ntp will track very badly. Anyway, if the errors are
 random, many measurements over a short period will keep the accuracy
 better than the same number of measurements over a long period. And be
 more responsive to frequency shifts. Also ntpd's averaging time is NOT of
 the order of 100 poll intervals ( which you would need to get your order
 of magnitude) Especially as ntpd uses only one of every 8 polls, the
 actual statistical improvement is less than 3, not an order of
 magnitude. 

It seems to depend on the distribution of network delays too. With
normally distributed delays I see an improvement of about 3, but with
exponentially distributed delays it's slightly more than 10. Is that
because the selected sample is the one with the best delay, so it
carries more information than the others?
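
A small Python simulation that isolates just this effect. It is not
ntpd's code: it only picks the minimum-delay sample out of each group of
8 and compares its offset error with the raw samples, with no PLL
behind it, so the ratios will not match the figures above. The
unit-mean exponential delays and the normal delays with mean 10 and
deviation 1 are arbitrary assumptions.

```python
import random
from math import sqrt


def rms(values):
    return sqrt(sum(v * v for v in values) / len(values))


def filter_gain(draw_delay, groups=5000, size=8, seed=1):
    """Ratio of the raw RMS offset error to the RMS error of the minimum-delay sample.

    draw_delay(rng) returns one one-way delay; the offset error of a sample
    is half the difference between the forward and return delays.
    """
    rng = random.Random(seed)
    raw, picked = [], []
    for _ in range(groups):
        samples = []
        for _ in range(size):
            fwd, back = draw_delay(rng), draw_delay(rng)
            samples.append((fwd + back, (fwd - back) / 2))
        raw.extend(err for _, err in samples)
        picked.append(min(samples)[1])  # sample with the smallest round-trip delay
    return rms(raw) / rms(picked)


gain_exp = filter_gain(lambda rng: rng.expovariate(1.0))
gain_norm = filter_gain(lambda rng: rng.gauss(10.0, 1.0))
print(f"exponential: {gain_exp:.1f}, normal: {gain_norm:.1f}")
```

With normal delays the sum and the difference of the two one-way delays
are independent, so a small round-trip delay says nothing about the
offset error and the gain stays near 1. With exponential delays a small
round-trip delay forces both one-way delays to be small, so the selected
sample is much better than average.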

-- 
Miroslav Lichvar


Re: [ntp:questions] Use ntpd as a daemon so that it continuously disciplines clock, no

2011-01-17 Thread Miroslav Lichvar
On Sat, Jan 15, 2011 at 11:06:43AM -0800, Chris Albertson wrote:
 As for resources ntpd takes up less than you can measure. After it has
 been running for a while it takes up almost zero.  Most of the
 activity is when it first starts up.  So letting it run might use less
 CPU cycles than starting it several times per day. Running the crontab
 script involves starting multiple new processes.  This is a very
 resource intensive thing to do, much more so than letting ntpd run.

Even when completely idle, ntpd wakes up every second and does quite a
lot (updating timers, scanning the peer hash table, etc). I'd say that
starting ntpd two times per day will take much less resources than
running it continuously.

-- 
Miroslav Lichvar


Re: [ntp:questions] Polling interval in FreeBSD vs. Windows

2011-01-17 Thread Miroslav Lichvar
On Sun, Jan 16, 2011 at 05:37:18PM -0800, Chris Albertson wrote:
 A longer polling interval is not a bad thing.  The polling interval is
 adjusted so as to reduce total noise.  There is a sweet spot where
 polling faster or slower is worse.

Yes, there is a sweet spot, but ntpd isn't looking for it. It strongly
prefers a longer polling interval to save network bandwidth. If you want
the best accuracy, you will need to set maxpoll according to the
network jitter and clock stability you have.

For a typical clock oscillator and the standard kernel PLL, poll 3
will give you better accuracy than poll 4 when the network jitter is
about 100 microseconds or less. Such jitter is not uncommon on a LAN,
sometimes I observe 100us jitter to close pool.ntp.org servers!

 As an example, let's say you wanted to measure the thickness of a sheet
 of paper but your ruler only goes to 1/100 inch divisions.  You get
 some gross errors if you tried to measure one sheet.  But stack 1,000
 sheets and you will do well.   Longer polling interval works kind of
 the same  way.

Except the thickness is slowly changing during the production, so you
have to use a compromise to keep the noise down and to get the current
thickness.

 I think the longer poll time is telling you something good about the
 internal clock in the BSD system.

What exactly?

-- 
Miroslav Lichvar


Re: [ntp:questions] Use ntpd as a daemon so that it continuously disciplines clock, no

2011-01-17 Thread Miroslav Lichvar
On Mon, Jan 17, 2011 at 02:46:28PM -, David J Taylor wrote:
 Even when completely idle, ntpd wakes up every second and does quite a
 lot (updating timers, scanning the peer hash table, etc). I'd say that
 starting ntpd two times per day will take much less resources than
 running it continuously.
 
 Have you ever measured the resources used by ntpd on a modern CPU?
 Absolutely negligible - at least when serving a dozen clients and
 serving as a stratum-1 PPS clock.  Perhaps a little more with
 thousands of clients, of course.  Not running ntpd continuously will
 ruin its accuracy.

For notebook users running ntpd only as an NTP client the extra wakeup
per second may make a measurable difference in battery life.

I was just pointing out it will take more resources than ntpd -q run
twice a day. Of course, the accuracy will be orders of magnitude
worse than continuously running ntpd (even with poll 15 or 16).

-- 
Miroslav Lichvar


Re: [ntp:questions] Use ntpd as a daemon so that it continuously disciplines clock, no

2011-01-17 Thread Miroslav Lichvar
On Mon, Jan 17, 2011 at 04:51:47PM -, David J Taylor wrote:
 However, David Malone reports 0.1% CPU for serving a thousand users,
 so the CPU for a single user will be far less, and negligible for a
 notebook user.  Likely the overhead of launching the new process
 would outweigh any CPU/battery saved.  On the PC I'm using right
 now, ntpd working purely as a client has used 0.484 seconds total
 CPU in almost 2.5 days uptime, and around 5MB of memory.

How much CPU had it used right after start? Is it more than a fifth of
the 0.484 (which would be spent in 0.5 days)?

Here, ntpd -q takes 14 milliseconds of CPU, including system time.
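
The comparison can be written out in a few lines of Python. It assumes
the daemon's CPU time was spread evenly over the uptime, which is
exactly what the question about startup cost is probing, and the two
measurements come from different machines, so this is only a rough
comparison.

```python
daemon_cpu_s = 0.484   # total CPU reported for ntpd over the uptime
uptime_days = 2.5
oneshot_cpu_s = 0.014  # CPU for one "ntpd -q" run, including system time
runs_per_day = 2

daemon_per_day = daemon_cpu_s / uptime_days
oneshot_per_day = oneshot_cpu_s * runs_per_day
print(f"daemon:  {daemon_per_day * 1000:.1f} ms/day")
print(f"ntpd -q: {oneshot_per_day * 1000:.1f} ms/day")
```

That is roughly 194 ms per day for the daemon against 28 ms per day for
two one-shot runs.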

Note that CPU power consumption depends on which sleep state it's in
and it usually takes a lot of time to switch to/from deeper states, so
it's more energy efficient to load the CPU once for 0.5 seconds than 5
times for 10 microseconds.

-- 
Miroslav Lichvar


Re: [ntp:questions] GPX18x LVC 3.50 firmware - high serial delay problem workround

2011-01-14 Thread Miroslav Lichvar
On Fri, Jan 14, 2011 at 08:40:25AM -, David J Taylor wrote:
 Folks,
 
 You may recall that I had a problem with a Garmin GPS18x LVC after
 firmware upgrades, where the offset between the leading edge of the
 PPS signal and the end of the NMEA serial data exceeded one second.
 With some help from Hal Murray who knows more of NTP than I do, we
 have worked round the problem as described here:
 
  http://www.satsignal.eu/ntp/FreeBSD-GPS-PPS.htm#gps-18x

 - add that offset (as +1.000 seconds) to the fudge time2 value for
 the 18x in the ntp.conf

Thanks for the information. I was curious about the new position
averaging mode, but I'll wait until this is resolved.

A 1.0s offset is horrible, that will certainly break gpsd or any
application that pairs pulses with the following NMEA timestamps.

Have you tried increasing the baud rate to 38400 and disabling all
unneeded NMEA sentences?

-- 
Miroslav Lichvar


Re: [ntp:questions] Number of servers needed to detect one falseticker?

2011-01-05 Thread Miroslav Lichvar
On Wed, Jan 05, 2011 at 09:23:59AM +0100, Terje Mathisen wrote:
 Two servers which don't overlap, and a third which overlaps (partly)
 both of them:
 
       server A and B
   ---   server C
 
 In this particular situation C must be a survivor, but since it
 overlaps both A and B with an identical amount, there is no way to
 determine if (A^C) or (B^C) is the best interval to pick.

The select algorithm doesn't care how much they overlap. Recent
ntp-dev versions work as described on the select.html web page, so the
intersection interval will be equal to C and all three sources will
pass. Older versions also worked with the centers of the intervals, and
as the centers of A and B are lying outside the intersection interval,
C would be the only truechimer.
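
A Python sketch of the interval-only variant, as I understand it from
the description: find the interval contained in the largest number of
source intervals, allowing an increasing number f of falsetickers while
f stays below half of the sources. This is an illustration and not
ntpd's code; the function names and the example endpoints are made up.

```python
def intersection(intervals):
    """Find the intersection interval, allowing as few falsetickers as possible.

    intervals is a list of (low, high) correctness intervals, one per source.
    Returns (low, high), or None if no majority of the sources agrees.
    """
    m = len(intervals)
    # Edge type -1 marks a low endpoint, +1 a high endpoint.
    edges = sorted([(lo, -1) for lo, _ in intervals] +
                   [(hi, +1) for _, hi in intervals])
    f = 0
    while 2 * f < m:
        low = high = None
        count = 0
        for value, kind in edges:            # scan upward from the lowest edge
            count -= kind
            if count >= m - f:
                low = value
                break
        count = 0
        for value, kind in reversed(edges):  # scan downward from the highest edge
            count += kind
            if count >= m - f:
                high = value
                break
        if low is not None and high is not None and low < high:
            return low, high
        f += 1
    return None


def truechimers(intervals):
    """Return the sources whose interval overlaps the intersection interval."""
    found = intersection(intervals)
    if found is None:
        return []
    low, high = found
    return [(lo, hi) for lo, hi in intervals if hi >= low and lo <= high]
```

With A = (0, 2), B = (3, 5) and C = (1, 4), no point is covered by all
three, so one falseticker is allowed; the scans then stop at the
endpoints of C, the intersection interval is (1, 4), and all three
sources overlap it and pass.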

I'd be curious to hear why that approach was dropped.

-- 
Miroslav Lichvar


Re: [ntp:questions] Number of servers needed to detect one falseticker?

2011-01-05 Thread Miroslav Lichvar
On Wed, Jan 05, 2011 at 10:31:15AM -0500, Brian Utterback wrote:
 Let's equalize a bit to make it a bit more fair:
 
  c   b-
   a--
 
 So, now, if you were NTP, which would you choose? You are correct in
 your assessment that NTP would accept them all as truechimers. You are
 correct also that adding a fourth still does not guarantee that you
 will throw out the falseticker, but NTP uses intervals at this stage,
 not actual servers, so adding another truechimer will guarantee that
 the interval used will contain the real time.

Not necessarily.

  |
   -   A
  | -  B
   C 
   --- D
  |
== X

Here, B is the only server off, but the result X doesn't contain the
actual time.

  I think clockhopping can happen with any number of servers, there just
  needs to be two or more similar sources on top of the list sorted by
  synchronization distance.
  
 
 With more servers on the list, the clustering and combining algorithms
 will merge them into a single offset and they will not hop. With two
 servers, these algorithms cannot function.

Combining doesn't affect clockhopping, it happens after the system
peer is selected.

 By the way, over time Dr. Mills has added features to try to suppress
 clock hopping as much as possible without compromising the correctness
 proofs. With the latest versions, clock hopping may not be so much of
 a problem. But it is still an issue. Even if you prefer one clock, it
 might be inaccessible for a while and you will hop anyway.

Yes, the maximum anti-clockhopping threshold is a fixed value (1 ms by
default), so it can't work well in all situations. But it can be tuned
with the tos mindist command.

-- 
Miroslav Lichvar

