Re: Proposal for kernel clock changes

2014-04-02 Thread Warner Losh

On Apr 1, 2014, at 1:50 PM, David Laight  wrote:

> This may mean that you can (effectively) count the ticks on all your
> clocks since 'boot' and then scale the frequency of each to give the
> same 'time since boot' - even though that will slightly change the
> relationship between old timestamps taken on different clocks.
> Possibly you do need a small offset for each clock to avoid
> discrepancies in the 'current time' when you recalculate the clocks'
> frequencies.

If the underlying clock moves in frequency, you need both a scale on the
frequency and a time offset adjustment as well. Otherwise, on long-running
systems you accumulate a fair amount of error. It doesn’t take much more
than 1ppm of error to accumulate a second of error in 10 days (1ppm is
86.4ms per day) if you don’t have ‘on time’ marks that integrate all of
time up to that point. Then the error in phase will be related to the time
since the last phase sync, rather than the time since boot.
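
To make that concrete, here's a sketch (all names invented, nanosecond
units assumed) of a conversion that keeps both a rate and an ‘on time’
mark, folding elapsed time into the mark at every rate change so the
accumulated error is bounded by the interval since the last mark:

    #include <stdint.h>

    struct clk_conv {
        uint64_t base_ticks;   /* tick count at the last 'on time' mark */
        uint64_t base_ns;      /* time (ns) at the last mark */
        uint64_t ns_per_tick;  /* current rate, 32.32 fixed point */
    };

    /* ticks -> nanoseconds under the current rate (naive 64-bit
     * multiply; the real thing needs a 128-bit intermediate) */
    static uint64_t
    clk_to_ns(const struct clk_conv *c, uint64_t ticks)
    {
        return c->base_ns +
            (((ticks - c->base_ticks) * c->ns_per_tick) >> 32);
    }

    /* change the rate without stepping the phase: integrate the old
     * rate up to 'now' first, then start the new segment there */
    static void
    clk_set_rate(struct clk_conv *c, uint64_t now, uint64_t rate)
    {
        c->base_ns = clk_to_ns(c, now);
        c->base_ticks = now;
        c->ns_per_tick = rate;
    }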

Warner



Re: Proposal for kernel clock changes

2014-04-02 Thread Dennis Ferguson

On 1 Apr, 2014, at 12:50 , David Laight  wrote:
> On Fri, Mar 28, 2014 at 06:16:23PM -0400, Dennis Ferguson wrote:
>> I would like to rework the clock support in the kernel a bit to correct
>> some deficiencies which exist now, and to provide new functionality.  The
>> issues I would like to try to address include:
> 
> A few comments, I've deleted the body so they aren't hidden!

Thanks very much for looking at it.  I know that reading about
clocks is, for most people, a good way to put themselves to sleep
at night.

> One problem I do see is knowing which counter to trust most.
> You are trying to cross synchronise values and it might be that
> the clock with the best long term accuracy is a very slow one
> with a lot of jitter (NTP over dialup anyone?).
> Whereas the fastest clock is likely to have the least jitter, but
> may not have the long term stability.

This is true, but when considering the quality of non-special-purpose
computer clock hardware running on its own, either on the CPU board
or on an ethernet card, what you'll effectively end up trying to
determine is whether the clock is just crappy, or is crappier
than that.  The stability of cheap, uncompensated free-running
crystals is always poor; you shouldn't trust any of them unless
you have no choice, and life is too short to worry about trying to
measure degrees of crappiness.

Since all the clocks in your system are likely to be crappy if left
running free, the "best" clock in the system will always be the one
which is making the most accurate measurements of the most accurate
external time source you have available and steering itself to that.
The only important "quality" of a clock is how well it is measuring
its time source and how good that time source is.  The measurement
clocks are only useful if you have an application which is interested
in taking and processing those measurements, and if that application
is not broken it will certainly come to some opinion about which of
those clocks is the best one based on those measurements.  That will
be the clock the time comes from; the polling is the mechanism to get
it to the others.  The kernel itself will see the polling and see
adjustments being made to clocks, but it will be the application which
knows why that is being done and which way the time is moving.  If
there are no external time sources, however, you'll probably just live
with whatever your chosen system clock does and not worry about the
measurement clocks.

> There are places where you are only interested in the difference
> between timestamps - rather than needing them converted to absolute
> times.

I'm not quite sure how to read that, but I'll guess.  I over-simplified
the description of what is being maintained a bit.  I'm fond of, and
the system call interface I like makes use of, the two timescales the
kernel maintains now, i.e.

time = uptime + boottime;

where `time' has a UTC-aligned epoch, `uptime's epoch is around the
time the machine was booted, and boottime is a mostly-constant value
which expresses uptime's epoch in terms of time's epoch.  uptime is
maintained to advance at the same rate as time but to be phase
continuous, which means that uptime will advance at as close to the
rate of the SI second as we can determine (since it advances
at the same rate as time, which advances at the rate of UTC, which
advances at the rate of the SI second) but is unaffected by step
changes made to time to bring it into phase alignment with UTC
(boottime changes instead).  uptime hence tracks UTC's frequency but
not its phase.

If you want to measure the interval between timestamps, then, I think
you would take your timestamps in terms of uptime and then compute

interval = uptime[1] - uptime[0];

which should reliably give you the system's best estimate of the elapsed
number of SI seconds between the times the two stamps were acquired.
I like to record event timestamps in terms of uptime as well since it
makes it unambiguous when the events occurred even if someone calls
clock_settime() in between.  Also, the tuple describing a conversion
from a tickcount_t tc to a systime_t, which I over-simplified, actually
maintains the pair of timescales by maintaining two `c' values, so that

time = (tc << s) * r + c_time;
uptime = (tc << s) * r + c_uptime;

and

boottime = c_time - c_uptime;
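
In C-ish terms, that tuple might look like this (a sketch; field names
invented, systime_t assumed to be nanoseconds, and the naive 64-bit
multiply would really need a 128-bit intermediate):

    #include <stdint.h>

    typedef uint64_t tickcount_t;
    typedef int64_t systime_t;      /* nanoseconds, say */

    struct tc_conv {
        unsigned s;                 /* pre-shift */
        uint64_t r;                 /* rate, 32.32 fixed point */
        systime_t c_time;           /* offset giving UTC-aligned time */
        systime_t c_uptime;         /* offset giving phase-continuous uptime */
    };

    static systime_t
    tc_to_time(const struct tc_conv *c, tickcount_t tc)
    {
        return (systime_t)(((tc << c->s) * c->r) >> 32) + c->c_time;
    }

    static systime_t
    tc_to_uptime(const struct tc_conv *c, tickcount_t tc)
    {
        return (systime_t)(((tc << c->s) * c->r) >> 32) + c->c_uptime;
    }

A clock_settime() step then adjusts c_time alone: boottime (= c_time -
c_uptime) absorbs the step while uptime remains phase continuous.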

So if "absolute time" means UTC, in the form of UTC-aligned `time',
then I agree.  You can't reliably compute time intervals from two
UTC timestamps since, almost unavoidably, some day the system's
estimate of UTC will be wrong and will require a step change to
fix, and you'll compute a bogus time interval if your timestamps
straddle that.  On the other hand, if avoiding "needing them
converted to absolute times" means hanging on to the raw
tickstamp/tickcount for an extended period then I don't see
the point.  The conversion isn't very expensive, and a pair of
uptime timestamps taken from the system clock will rel

Re: Proposal for kernel clock changes

2014-04-01 Thread David Laight
On Fri, Mar 28, 2014 at 06:16:23PM -0400, Dennis Ferguson wrote:
> I would like to rework the clock support in the kernel a bit to correct
> some deficiencies which exist now, and to provide new functionality.  The
> issues I would like to try to address include:

A few comments, I've deleted the body so they aren't hidden!

One problem I do see is knowing which counter to trust most.
You are trying to cross synchronise values and it might be that
the clock with the best long term accuracy is a very slow one
with a lot of jitter (NTP over dialup anyone?).
Whereas the fastest clock is likely to have the least jitter, but
may not have the long term stability.

There are places where you are only interested in the difference
between timestamps - rather than needing them converted to absolute
times.

I also wonder whether there are timestamps for which you are never
really interested in the absolute accuracy of old values.
Possibly because 'old' timestamps will already have been converted
to some other clock.
This might be the case for ethernet packet timestamps: you may want
to be able to synchronise the timestamps from different interfaces,
but you may not be interested in the absolute accuracy of timestamps
from packets taken several hours ago.

This may mean that you can (effectively) count the ticks on all your
clocks since 'boot' and then scale the frequency of each to give the
same 'time since boot' - even though that will slightly change the
relationship between old timestamps taken on different clocks.
Possibly you do need a small offset for each clock to avoid
discrepancies in the 'current time' when you recalculate the clocks'
frequencies.

If the 128bit divides are being done to generate corrected frequencies,
it might be that you can use the error term to adjust the current value
- and remove the need for the divide at all (after the initial setup).
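
One way to read that (my sketch, invented names): if the updates happen
over a fixed-length sample interval, compute the interval's reciprocal
once at setup, and every later update is just a multiply:

    #include <stdint.h>

    struct rate_state {
        uint64_t rate;          /* ns per tick, 32.32 fixed point */
        uint64_t inv_interval;  /* ~2^64 / interval_ticks, the one
                                 * divide, done at initial setup */
    };

    /* delta_rate = err / interval, computed as
     * err * (2^64 / interval) >> 32 (gcc/clang __int128 assumed) */
    static void
    rate_feedback(struct rate_state *rs, int64_t phase_err_ns)
    {
        rs->rate += (int64_t)(((__int128)phase_err_ns *
            (__int128)rs->inv_interval) >> 32);
    }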

One thought I've sometimes had is that, instead of trying to synchronise
the TSC counters in an SMP system, move them as far from each other
as possible!
Then, when you read the TSC, you can tell from the value which cpu
it must have come from!
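
As a toy illustration (everything here invented; limits you to 16 cpus
with a 64-bit TSC):

    #include <stdint.h>

    #define CPU_TSC_STRIDE  (1ULL << 60)    /* ~12 years at 3GHz */

    /* at boot, each cpu would write its TSC to cpu_id * CPU_TSC_STRIDE */

    static inline unsigned
    tsc_to_cpu(uint64_t tsc)
    {
        return (unsigned)(tsc >> 60);       /* which cpu read this? */
    }

    static inline uint64_t
    tsc_to_ticks(uint64_t tsc)
    {
        return tsc & (CPU_TSC_STRIDE - 1);  /* ticks since boot */
    }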

David



Proposal for kernel clock changes

2014-03-28 Thread Dennis Ferguson
I would like to rework the clock support in the kernel a bit to correct
some deficiencies which exist now, and to provide new functionality.  The
issues I would like to try to address include:

- It has become common for systems to include clocks which are unsuitable
  for use as the time source for the system clock but which are nonetheless
  useful because they are the timestamp source for (hardware) measurements of
  external events.  The most frequently encountered example of this may be the
  counter included in many Ethernet MAC chips which is sampled when IEEE 1588
  packets are sent and received; many systems may have more than one of these.
  Peripherals which hardware timestamp other types of events (e.g. signal
  transitions, like the PPS output of a GPS receiver) are often found in
  integrated SoCs, as are devices which use a free-running counter to generate
  events.

  Making all of these "measurement" clocks useful to the system seems to
  require two things.  It first requires that each of these clocks be visible
  to and independently adjustable via a clock adjustment interface.  The only
  thing one can do with accurate-in-time external events which are measured
  with a particular clock is to use the information to adjust that particular
  clock into synchronization, so each such clock must be independently
  adjustable.

  The second requirement is that it must be possible to measure the times
  of independent pairs of clocks in the system against each other as
  precisely as possible, perhaps with a sequence of the form

read clock A
read clock B
read clock A

  to provide an estimate of both the offset between the clocks and the
  uncertainty/ambiguity of the measurement itself.  The reason for this
  is that having a precisely synchronized measurement clock in, say, an
  Ethernet MAC chip is clearly fairly useless by itself.  Its time becomes
  useful only when it can be transferred to the system clock and/or other
  clocks in need of synchronization so that other applications can use it too.
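
  Turning such an A/B/A poll into numbers might look like this (a
  sketch; names invented, both clocks' readings assumed already
  converted to nanoseconds):

      #include <stdint.h>

      struct offset_est {
          int64_t offset_ns;        /* estimate of B - A */
          int64_t uncertainty_ns;   /* half the A..A window */
      };

      /* take the midpoint of the two A readings as the instant at
       * which the single B reading was taken */
      static struct offset_est
      est_offset(int64_t a1_ns, int64_t b_ns, int64_t a2_ns)
      {
          struct offset_est e;
          e.offset_ns = b_ns - (a1_ns + (a2_ns - a1_ns) / 2);
          e.uncertainty_ns = (a2_ns - a1_ns) / 2;
          return e;
      }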

- Acquiring a timestamp from a clock is generally done by (1) reading a
  value from a hardware register, then (2) manipulating the value and doing
  the arithmetic necessary to convert it to a time of day.  I would like
  to be able to separate (1) from (2), storing the raw output from the
  hardware now (I've been calling this a "tickstamp") but deferring the
  work of converting it to a meaningful timestamp until a bit later.  An
  example use of this might be to tickstamp every packet arrival early
  in a network controller's interrupt routine but only go to the trouble
  of converting the tickstamp to a timestamp if the packet arrives
  somewhere which cares about this (e.g. an NTP or PTP socket), with
  unused tickstamps perhaps being donated to the random number pool.

  Event timestamping has many uses, and with a suitably inexpensive hardware
  time source capturing event times in tickstamp form has many advantages.
  It minimizes the overhead in the case that not all tickstamps are
  consumed as timestamps, often allows the bulk of the work of acquiring
  a timestamp to be moved from time-critical code to code with more relaxed
  constraints, and is probably the most appropriate way to provide random
  number pool entropy.  The clock-pair polling above might be implemented as

read clock A tickstamp
read clock B tickstamp
read clock A tickstamp

  with the corresponding timestamps being sorted out after the time-critical
  polling code segment has been completed.  It might also be possible to
  provide arithmetic functions computing nominal time intervals directly
  from tickstamps themselves to make the implementation of things like
  Codel's packet timestamping more economical.
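
  As a sketch of the split (types and helpers invented; read_counter()
  and tickstamp_to_time() stand in for whatever the real interface
  becomes):

      #include <stdint.h>

      typedef uint64_t tickcount_t;
      typedef int64_t systime_t;

      struct tickstamp {
          int clock_id;       /* which hardware counter */
          tickcount_t tc;     /* raw register value */
      };

      extern tickcount_t read_counter(int clock_id);
      extern systime_t tickstamp_to_time(const struct tickstamp *);

      /* time-critical path, e.g. a network interrupt: one register
       * read, no conversion arithmetic */
      static inline struct tickstamp
      tickstamp_get(int clock_id)
      {
          struct tickstamp ts = { clock_id, read_counter(clock_id) };
          return ts;
      }

      /* later, and only if the packet landed somewhere that cares:
       * systime_t t = tickstamp_to_time(&ts); */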

- The fundamental problem that clock synchronization software needs to
  deal with is that the oscillator (likely a crystal soldered to a board
  somewhere) driving the clock the software is trying to keep correct
  makes errors: the output frequency of the oscillator usually differs
  considerably from the number written on its package and will measurably
  vary with time.  The operating system will likely be setting the rate
  of advance of the digital clock being driven by the oscillator based on
  the number written on the oscillator's package; the job of clock
  synchronization software is to measure the actual frequency of the
  oscillator and to update the clock's rate of advance to match.
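
  For scale (a made-up but representative example): a "24MHz" crystal
  actually running at 24.000300MHz is 12.5ppm fast, and a clock advanced
  at the nominal rate gains about a second a day:

      #include <stdio.h>

      int main(void)
      {
          double nominal_hz = 24e6;
          double measured_hz = 24.000300e6;   /* hypothetical measurement */
          double ppm = (measured_hz - nominal_hz) / nominal_hz * 1e6;
          double gain = ppm * 1e-6 * 86400.0; /* seconds gained per day */
          printf("%.1f ppm fast, gains %.2f s/day\n", ppm, gain);
          return 0;
      }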

  The traditional BSD clock adjustment interface, adjtime(2), provides
  only a slewing phase (i.e. time) adjustment.  There is no way to directly
  alter the clock's rate of advance to reduce the need for phase adjustments,
  nor is the adjtime(2) slew rate adjustable, and by modern standards the
  precision of the phase adjustments it implements is quite limited.  The
  optional NTP adjustment interface does provide a frequency adjustment, but
  the resolution is lower than the measurem