Julian,

Thanks for the paper reference. Your ideas on feed-forward are similar to the ideas in Greg Troxel's MIT dissertation. These ideas were partially implemented in NTPv3 a very long time ago.

There are some minor misinterpretations in the paper. The NTP discipline loop is not critically damped; it is purposely underdamped, with a minor overshoot of a few percent, in order to improve the transient response. The impulse response was slavishly copied in the microkernel and nanokernel code that left here over a decade ago. The microkernel survives in Solaris and the nanokernel in most BSD systems with varying degrees of fidelity; however, many systems have elected to modify the original BSD tickadj semantics, which results in an extra pole. The result is a moderate instability at the longer poll intervals, especially if the step threshold is increased or eliminated. In any case, the response has no serious misbehavior as the paper described. Note that in no case are the daemon and kernel algorithms cascaded as the paper implies. Either one or the other is used, but not both.
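
For illustration only, here is a schematic sketch (in Python, with made-up gains) of such a proportional-plus-integral feedback discipline; it is not the ntpd or kernel code, and the real loop derives its gains from the poll interval, tuned slightly underdamped as described above:

    class FeedbackDiscipline:
        def __init__(self, kp=0.25, ki=0.01):
            self.kp = kp        # proportional (phase) gain, illustrative value
            self.ki = ki        # integral (frequency) gain, illustrative value
            self.freq = 0.0     # accumulated frequency correction (s/s)

        def update(self, offset, dt):
            """offset: measured offset in seconds; dt: seconds since last update."""
            self.freq += self.ki * offset * dt        # integral path remembers persistent error
            return self.kp * offset + self.freq * dt  # slew to apply over the next interval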

The system behavior with multiple servers is indeed as the paper suggests, but there is considerable algorithm horsepower to diminish the effects, including the cluster and combine algorithms, plus the anti-clockhop and prefer-peer mechanisms. These provisions were first implemented in the Internet of twenty years ago, when the congestion on overseas links frequently reached over one second. Perhaps today these algorithms could be more carefully tuned for LANs and even WiFi networks.
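
For flavor, a rough sketch in the spirit of the combine step: survivor offsets are averaged with weights inversely proportional to a quality metric (root distance in ntpd). The function and values are illustrative, not the ntpd code:

    def combine(survivors):
        """survivors: list of (offset_seconds, root_distance_seconds) tuples."""
        num = den = 0.0
        for offset, dist in survivors:
            w = 1.0 / max(dist, 1e-9)   # smaller root distance -> larger weight
            num += w * offset
            den += w
        return num / den

    # e.g. combine([(0.002, 0.05), (-0.001, 0.02), (0.0005, 0.01)])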

As the paper describes, NTP algorithms are designed for traditional mathematical analysis, but with both linear and nonlinear components. However, the FLL algorithm is based on a model described by Levine as predictive. The model in the documentation describes both the PLL and FLL in predictive terms, but that doesn't change the conclusions in the paper.
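
To illustrate the distinction only (schematic, not ntpd's exact formulas): a PLL integrates the phase error into the frequency, while an FLL predicts the frequency directly from successive offsets over the update interval, which is where the "predictive" description comes from:

    def pll_freq_step(freq, offset, interval, tc):
        # phase-locked: integrate the phase (offset) error, scaled by the time constant
        return freq + offset * interval / (tc * tc)

    def fll_freq_step(freq, offset, prev_offset, interval, gain=0.25):
        # frequency-locked: predict the rate from the change in offset over the interval
        return freq + gain * (offset - prev_offset) / interval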

The paper suggests possible improvements in data filtering and analysis. The clock filter and popcorn spike suppressor algorithms in NTP represent one approach. A persistent observation is that NTP does not effectively use offset/delay samples other than at the apex of the scattergram. While it does indeed do that for the huff-n'-puff filter, the possible improvement in other cases is problematic. The paper does not mention the implications of roundtrip delay in the maximum error statistic, as in Cristian's model, which NTP uses. It is a natural error bound for asymmetric paths such as those mentioned in the paper.
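
For reference, the usual on-wire calculation and the Cristian-style bound look like this (a sketch; t1..t4 are the standard client-send, server-receive, server-send, client-receive timestamps):

    def offset_and_bound(t1, t2, t3, t4):
        """Return (offset estimate, maximum error) from one request/reply exchange."""
        delay = (t4 - t1) - (t3 - t2)            # round-trip delay, less server processing
        offset = ((t2 - t1) + (t3 - t4)) / 2.0   # estimate assumes symmetric paths
        max_error = delay / 2.0                  # bound holds even if the paths are asymmetric
        return offset, max_error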

In summary, the NTP algorithms have evolved over thirty years in response to major changes in Internet service models and surely could use some further evolution. I am glad there is continuing interest in improvements.

Dave

Julien Ridoux wrote:

On Thursday, June 7, 2012 4:19:31 PM UTC+10, unruh wrote:
On 2012-06-07, skillz...@gmail.com <skillz...@gmail.com> wrote:
On Jun 5, 6:46 pm, Julien Ridoux <jul...@synclab.org> wrote:
On Tuesday, June 5, 2012 9:12:42 AM UTC+10, E-Mail Sent to this address will be 
added to the BlackLists wrote:
Thanks for response. It would be great to have a simplified
description of the algorithm. I've read most of the docs on the
synclab site.

That is a very impressive effort :)

I'm trying to synchronize several devices to a reference
clock of a similar device rather than to a true wall clock time of a
real NTP server. I can't use the RADclock code directly because it's
GPL so I'd like to distill the algorithm down to something I can
implement from scratch. I'd like to adapt my current NTP client and
server code to use RADclock so I can compare performance. I'm aiming
Why not just use the reference implementation of radclock for the tests?

The next version of RADclock is likely to be released under a BSD licence. This 
should save you the trouble of reimplementing the algorithm.

for 10 microseconds or less of sync on a wireless LAN with a fast
Not going to work.
initial sync time (less than 5 seconds, but I can send many packets
In 5 sec you could set the time, but you cannot get a good estimate of
the rate.
close together if needed). I haven't been able to get even close to
this with my current NTP code (Ethernet is close, but WiFi introduces
a lot of variability). Some of the key points seem to be:
Precisely. And Radclock will not help with that. It uses the same (or
perhaps much slower) synchronization algorithm.

I agree with Unruh, you are going to bump into quite a few issues, and the much 
noisier characteristics of the WiFi channel make it much harder for any 
synchronisation algorithm. RADclock, however, does a pretty good job at filtering 
noise, but there is no magic involved. If the overall noise characteristics of 
all packets are bad, then you cannot do much about it.
I have run RADclock over WiFi, but I would not dare quote any achievable 
clock error value at this stage: we do not yet have an independent hardware time 
reference over WiFi, so we would have to rely on more convoluted 
setups to do an objective assessment of RADclock there. I'll put this on my 
TODO list.


1. Feed forward: Use raw TSC instead of the time value adjusted by the
algorithm. Is there more to this? Should raw TSC values be used for
offset sync calculations or should the raw TSC be scaled by the latest
adjusted rate (which would seem to be partially feed-back vs fully
feed forward)?
The "time" of the remote clock is measured with the TSC/HPET/.... raw
clock. The packets exchanged are filtered to get the best ones (takes
time) and the rate and offset of the TSC calculated. If you want a
difference clock, only the rate is used. to make the transfer from TSC
differences to time differences.

Actually, the time of the local clock is measured in counter units (that is, Ta and 
Tf in NTP jargon). The RTT of the NTP request/reply is measured in counter units, and 
filtering of RTTs is done in counter units. This prevents new inputs to the 
algorithm from being "polluted" by wrong estimates of drift based on past NTP 
exchanges. This is a simplified example of the feed-forward approach.
Of course, raw counter values have to be converted to time in seconds at some 
stage, but this is done when a consumer of time requests the creation of a 
timestamp (a kernel sub-system or a userland application).

Note that you do not have to convert the counter values. You can store them, as 
well as the radclock outputs, and delay the conversion.
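
As a minimal sketch of that idea (the names are mine, not the radclock API): events are stamped in raw counter units, and the conversion to seconds happens only when a consumer asks for a timestamp, using the current rate/origin estimates:

    RAW_HISTORY = []    # raw counter readings, e.g. from rdtsc

    def stamp(raw_counter):
        RAW_HISTORY.append(raw_counter)   # store raw units; no conversion yet

    def to_abs_time(raw_counter, period, origin_time, origin_raw):
        # conversion is deferred until a timestamp is actually requested
        return origin_time + (raw_counter - origin_raw) * period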

2. Separate offset sync from rate sync. What I wasn't sure about is
why the rate sync would be so much more stable than offset sync since
they seem to both come from the same packet exchanges. Is the
advantage that the rate sync is smoothed over time whereas the offset
sync isn't?
If you correct the clock for offset, the only way you can do that is to
somehow shift the time. Thus, if you want the difference in times from t0
to t2 and you corrected the offset at t1, that offset correction will
affect the measured difference between t2 and t0. If instead you only
determine the rate, the difference in raw times multiplied by the rate will
directly give you the time difference.

Unruh's explanation is pretty much accurate. I would add that the algorithm is 
layered. The rate is estimated first in a very reliable way (that is the easy 
task) and is used for the difference clock, where drift errors cancel out if the 
interval measured is small enough. For the purpose of absolute time (that is 
the hard bit, and as Unruh pointed out, very much error prone), the drift 
estimation algorithm is built on top of the rate estimation one.
One last piece of information: the rate estimation has been designed to change very 
slowly to guarantee that the case described by Unruh does not happen (again, if 
you measure small intervals).

A little disclaimer: the 'rate' defined in radclock is computed and defined very differently from the way ntpd does it. At the risk of oversimplifying, radclock sees the rate and drift as two different components of the clock. Radclock does not 'adjust the rate'. I am well aware that this can create confusion, but it is hard to come up with a better wording. Please refer to the papers mentioned earlier for the nitty-gritty details.
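
A minimal sketch of the difference-clock point above (values are illustrative): for an interval measurement only the counter period matters, so an offset correction made between the two readings does not perturb the result:

    def interval_seconds(raw_start, raw_end, period):
        # only the rate (counter period) is used, never the absolute offset
        return (raw_end - raw_start) * period

    # e.g. with a 1 GHz counter: interval_seconds(1_000_000, 3_500_000, 1e-9) -> 0.0025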


3. Use only the lowest RTT samples. NTP also does this so I wasn't
sure where the advantage was here.
Their algorithm seems, from the hints in the paper, to be more
sophisticated/complex than the simple "best of 8" algorithm of NTP.

Indeed. Radclock maintains a much longer history of NTP exchanges. This is 
particularly useful to provide a stable rate estimate and to adjust the minimum 
RTT filtering if the minimum 'jumps up'. A typical case would be a network 
reconfiguration after which the previous minimum RTT is never observed again.

The obvious tradeoff here is memory. However, on Linux and FreeBSD, radclock 
and ntpd have comparable memory footprints. I am well aware that this could be a 
limitation for embedded systems, and it is something we will work on if/when someone 
shows interest in such needs.

Coming back to one of Unruh's comments, the algorithm is not "slower" than ntpd. 
Unruh, I am actually not sure what you meant by this, but if it clarifies things, 
radclock does need to wait for its history to fill in before providing rate and drift estimates.
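
A sketch of the idea only (not the radclock code): keep a long RTT history in raw counter units, filter against the window minimum, and let the baseline move up if the old minimum is never seen again (e.g. after a route change). Window size and slack factor here are made up:

    from collections import deque

    class MinRTTFilter:
        def __init__(self, window=1024, slack=1.2):
            self.rtts = deque(maxlen=window)   # long history: the memory cost mentioned above
            self.slack = slack                 # accept samples within 20% of the minimum (illustrative)

        def accept(self, rtt_raw):
            self.rtts.append(rtt_raw)
            baseline = min(self.rtts)          # baseline can drift up as old samples age out
            return rtt_raw <= baseline * self.slack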


4. Weight samples by their RTT. There's some mention of this in the
documents, but I wasn't sure what the algorithm was for determining
the weights or adapting them as conditions change (e.g. if network
load increased for a long period of time, which would also cause RTTs
to increase due to queuing in routers, etc.).
Then those samples would be thrown out, and the clock allowed to
freewheel until the samples settled down again.

Pretty much. The weights reflect the increase in RTT and translate into a quality measure. 
See page 9 of "Robust Synchronization of Absolute and Difference Clocks over 
Networks" for the details of the exponential weights assigned.

Then, there are a few different cases. If the input is very bad, then freewheeling 
for a little while (based on older but good input) is better than believing bad 
input. But of course, after a long time, poor input is better than no input at 
all. This is really getting into the details of the algorithm, calibration of 
quality thresholds and the like. Here I would refer you to the code for the 
details.
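
One plausible form of such an exponential weight (the exact expression and thresholds are in the paper cited above, so treat this purely as an illustration of the idea):

    import math

    def quality_weight(rtt, rtt_min, scale):
        excess = max(rtt - rtt_min, 0.0)
        return math.exp(-(excess / scale) ** 2)   # ~1 near the minimum RTT, -> 0 for congested samples

    # Samples whose weight falls below some threshold would be ignored (freewheel);
    # after long enough without good samples, even poor ones get used again.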


Are there other key aspects of the algorithm that I'm missing?

I think these are the key points.


I do not know it; the above are my conclusions from the docs I
have read. They could all be wrong.

Unruh, you definitely did a pretty good job  ;-).

Julien Ridoux

_______________________________________________
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions
