Re: Proposal for kernel clock changes
On Apr 1, 2014, at 1:50 PM, David Laight wrote:

> This may mean that you can (effectively) count the ticks on all your
> clocks since 'boot' and then scale the frequency of each to give the
> same 'time since boot' - even though that will slightly change the
> relationship between old timestamps taken on different clocks.
> Possibly you do need a small offset for each clock to avoid
> discrepancies in the 'current time' when you recalculate the clock's
> frequency.

If the underlying clock moves in frequency, you need both a scale on the frequency and an offset adjustment in the count-to-time conversion as well. Otherwise, on long-running systems you accumulate a fair amount of error. It doesn't take much more than 1 ppm of error to accumulate a second of error in 10 days if you don't have 'on time' marks that integrate all of time up to that point. With such marks, the error in phase is related to the time since the last phase sync rather than the time since boot.

Warner
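[The rate-plus-offset model and the 1 ppm arithmetic above can be sketched as follows. This is illustrative Python, not kernel code; the names `Clock` and `resync` are hypothetical.]

```python
# A clock that converts a raw tick count to time using a frequency scale
# (rate) plus an offset.  A resync ("on time" mark) adjusts both, so that
# subsequent phase error grows from the last sync, not from boot.

class Clock:
    def __init__(self, rate, offset=0.0):
        self.rate = rate      # seconds of time per raw tick
        self.offset = offset  # offset fixed at the last resync

    def time(self, ticks):
        return ticks * self.rate + self.offset

    def resync(self, ticks, true_time, new_rate):
        # Apply the newly measured rate and pick the offset so the
        # clock reads true_time right now.
        self.rate = new_rate
        self.offset = true_time - ticks * new_rate

# Warner's arithmetic: 1 ppm of rate error, left uncorrected for 10 days.
ppm_error = 1e-6
ten_days = 10 * 86400
drift = ppm_error * ten_days   # about 0.86 seconds of accumulated error
```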
Re: Proposal for kernel clock changes
On 1 Apr, 2014, at 12:50, David Laight wrote:

> On Fri, Mar 28, 2014 at 06:16:23PM -0400, Dennis Ferguson wrote:
>> I would like to rework the clock support in the kernel a bit to correct
>> some deficiencies which exist now, and to provide new functionality. The
>> issues I would like to try to address include:
>
> A few comments, I've deleted the body so they aren't hidden!

Thanks very much for looking at it. I know that reading about clocks is, for most people, a good way to put oneself to sleep at night.

> One problem I do see is knowing which counter to trust most.
> You are trying to cross synchronise values and it might be that
> the clock with the best long term accuracy is a very slow one
> with a lot of jitter (NTP over dialup anyone?).
> Whereas the fastest clock is likely to have the least jitter, but
> may not have the long term stability.

This is true, but when considering the quality of non-special-purpose computer clock hardware running on its own, either on the CPU board or on an ethernet card, what you'll effectively end up trying to determine by this is whether the clock is just crappy, or is crappier than that. The stability of cheap, uncompensated free-running crystals is always poor; you shouldn't trust any of these unless you have no choice, and life is too short to worry about trying to measure degrees of crappiness.

Since all the clocks in your system are likely to be crappy if left running free, the "best" clock in the system will always be the one which is making the most accurate measurements of the most accurate external time source you have available and steering itself to that. The only important "quality" of a clock is how well it is measuring its time source and how good that time source is.
The measurement clocks are only useful if you have an application which is interested in taking and processing those measurements, and if that application is not broken it will certainly come to some opinion about which of those clocks is the best one based on those measurements. That will be the clock the time comes from; the polling is the mechanism to get it to the others. The kernel itself will see the polling and see adjustments being made to clocks, but it will be the application which knows why that is being done and which way the time is moving. If there are no external time sources, however, you'll probably just live with whatever your chosen system clock does and not worry about the measurement clocks.

> There are places where you are only interested in the difference
> between timestamps - rather than needing them converted to absolute
> times.

I'm not quite sure how to read that, but I'll guess. I over-simplified the description of what is being maintained a bit. I'm fond of, and the system call interface I like makes use of, the two timescales the kernel maintains now, i.e.

    time = uptime + boottime;

where `time' has a UTC-aligned epoch, `uptime's epoch is around the time the machine was booted, and boottime is a mostly-constant value which expresses uptime's epoch in terms of time's epoch. uptime is maintained to advance at the same rate as time but to be phase continuous, which means that uptime will advance at as close to the rate of the SI second as we can determine it (since it advances at the same rate as time, which advances at the rate of UTC, which advances at the rate of the SI second) but is unaffected by step changes made to time to bring it into phase alignment with UTC (boottime changes instead). uptime hence tracks UTC's frequency but not its phase.
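[The two-timescale arrangement can be sketched in a few lines. This is illustrative Python with hypothetical names, not the real kernel state; it just shows that both scales derive from the same tick count, so time = uptime + boottime holds by construction and a step to time moves only boottime.]

```python
# Two timescales maintained from one tick count tc, via a shared
# rate r and shift s but two separate offsets c_time and c_uptime.

class Timescales:
    def __init__(self, r, s, c_time, c_uptime):
        self.r, self.s = r, s        # conversion rate and shift
        self.c_time = c_time         # offset giving the UTC-aligned epoch
        self.c_uptime = c_uptime     # offset giving the boot-relative epoch

    def time(self, tc):
        return (tc << self.s) * self.r + self.c_time

    def uptime(self, tc):
        return (tc << self.s) * self.r + self.c_uptime

    def boottime(self):
        return self.c_time - self.c_uptime

    def settime_step(self, delta):
        # A phase step to UTC time; uptime stays phase continuous,
        # so boottime absorbs the change.
        self.c_time += delta
```

Intervals computed as differences of uptime values are then unaffected by any clock_settime()-style step made in between.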
If you want to measure the interval between timestamps, then, I think you would take your timestamps in terms of uptime and then compute

    interval = uptime[1] - uptime[0];

which should reliably give you the system's best estimate of the elapsed number of SI seconds between the times the two stamps were acquired. I like to record event timestamps in terms of uptime as well, since it makes it unambiguous when the events occurred even if someone calls clock_settime() in between.

Also, the tuple describing a conversion from a tickcount_t tc to a systime_t, which I over-simplified, actually maintains the pair of timescales by maintaining two `c' values, so that

    time = (tc << s) * r + c_time;
    uptime = (tc << s) * r + c_uptime;

and

    boottime = c_time - c_uptime;

So if "absolute time" means UTC, in the form of UTC-aligned `time', then I agree. You can't reliably compute time intervals from two UTC timestamps since, almost unavoidably, some day the system's estimate of UTC will be wrong and will require a step change to fix, and you'll compute a bogus time interval if your timestamps straddle that. On the other hand, if avoiding "needing them converted to absolute times" means hanging on to the raw tickstamp/tickcount for an extended period, then I don't see the point: the conversion isn't very expensive, and a pair of uptime timestamps taken from the system clock will reliably measure the interval in any case.
Re: Proposal for kernel clock changes
On Fri, Mar 28, 2014 at 06:16:23PM -0400, Dennis Ferguson wrote:

> I would like to rework the clock support in the kernel a bit to correct
> some deficiencies which exist now, and to provide new functionality. The
> issues I would like to try to address include:

A few comments, I've deleted the body so they aren't hidden!

One problem I do see is knowing which counter to trust most. You are trying to cross synchronise values and it might be that the clock with the best long term accuracy is a very slow one with a lot of jitter (NTP over dialup anyone?). Whereas the fastest clock is likely to have the least jitter, but may not have the long term stability.

There are places where you are only interested in the difference between timestamps - rather than needing them converted to absolute times.

I also wonder whether there are timestamps for which you are never really interested in the absolute accuracy of old values, possibly because 'old' timestamps will already have been converted to some other clock. This might be the case for ethernet packet timestamps: you may want to be able to synchronise the timestamps from different interfaces, but you may not be interested in the absolute accuracy of timestamps from packets taken several hours ago.

This may mean that you can (effectively) count the ticks on all your clocks since 'boot' and then scale the frequency of each to give the same 'time since boot' - even though that will slightly change the relationship between old timestamps taken on different clocks. Possibly you do need a small offset for each clock to avoid discrepancies in the 'current time' when you recalculate the clock's frequency.

If the 128-bit divides are being done to generate corrected frequencies, it might be that you can use the error term to adjust the current value - and remove the need for the divide at all (after the initial setup).
One thought I've sometimes had is that, instead of trying to synchronise the TSC counters in an SMP system, move them as far from each other as possible! Then, when you read the TSC, you can tell from the value which cpu it must have come from!

David
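[David's thought can be illustrated with toy arithmetic: if each CPU's counter is offset by a spacing larger than any tick count the system can ever reach, a raw reading identifies its CPU by simple division. Illustrative Python; `SPACING` and the helper names are hypothetical.]

```python
# Give each CPU's counter a huge fixed offset so that a raw value
# encodes both the originating CPU and the local tick count.

SPACING = 1 << 56  # must exceed any tick count reachable in the
                   # system's lifetime (assumed here for illustration)

def stamp(cpu, ticks):
    # what a per-CPU counter initialized to cpu * SPACING would read
    return cpu * SPACING + ticks

def cpu_of(value):
    return value // SPACING

def ticks_of(value):
    return value % SPACING
```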
Proposal for kernel clock changes
I would like to rework the clock support in the kernel a bit to correct some deficiencies which exist now, and to provide new functionality. The issues I would like to try to address include:

- It has become common for systems to include clocks which are unsuitable for use as the time source for the system clock but which are none-the-less useful because they are the timestamp source for (hardware) measurements of external events. The most frequently encountered example of this may be the counter included in many Ethernet MAC chips which is sampled when IEEE 1588 packets are sent and received; many systems may have more than one of these. Peripherals which hardware-timestamp other types of events (e.g. signal transitions, like the PPS output of a GPS receiver) are often found in integrated SoCs, as are devices which use a free-running counter to generate events.

  Making all of these "measurement" clocks useful to the system seems to require two things. It first requires that each of these clocks be visible to and independently adjustable via a clock adjustment interface. The only thing one can do with accurate-in-time external events which are measured with a particular clock is to use the information to adjust that particular clock into synchronization, so each such clock must be independently adjustable. The second requirement is that it must be possible to measure the times of independent pairs of clocks in the system against each other as precisely as possible, perhaps with a sequence of the form

      read clock A
      read clock B
      read clock A

  to provide an estimate of both the offset between the clocks and the uncertainty/ambiguity of the measurement itself. The reason for this is that having a precisely synchronized measurement clock in, say, an Ethernet MAC chip is clearly fairly useless by itself. Its time becomes useful only when it can be transferred to the system clock and/or other clocks in need of synchronization so that other applications can use it too.
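[The read-A/read-B/read-A comparison above reduces to a few lines of arithmetic. A minimal sketch in Python; `compare_clocks` is a hypothetical helper, and real code would of course read hardware registers rather than take arguments.]

```python
# Estimate the offset between clocks A and B, and bound the ambiguity
# of the measurement, from two readings of A bracketing one reading of B.

def compare_clocks(a1, b, a2):
    """a1, a2: clock A read before and after reading b from clock B."""
    midpoint = (a1 + a2) / 2.0     # best guess at A's time when B was read
    offset = b - midpoint          # estimated (B - A) offset
    uncertainty = (a2 - a1) / 2.0  # half-width of the window B fell in
    return offset, uncertainty
```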
- Acquiring a timestamp from a clock is generally done by (1) reading a value from a hardware register, then (2) manipulating the value and doing the arithmetic necessary to convert it to a time of day. I would like to be able to separate (1) from (2), storing the raw output from the hardware now (I've been calling this a "tickstamp") but deferring the work of converting it to a meaningful timestamp until a bit later. An example use of this might be to tickstamp every packet arrival early in a network controller's interrupt routine but only go to the trouble of converting the tickstamp to a timestamp if the packet arrives somewhere which cares about this (e.g. an NTP or PTP socket), with unused tickstamps perhaps being donated to the random number pool.

  Event timestamping has many uses, and with a suitably inexpensive hardware time source capturing event times in tickstamp form has many advantages. It minimizes the overhead in the case that not all tickstamps are consumed as timestamps, it often allows the bulk of the work of acquiring a timestamp to be moved from time-critical code to code with more relaxed constraints, and is probably the most appropriate way to provide random number pool entropy. The clock-pair polling above might be implemented as

      read clock A tickstamp
      read clock B tickstamp
      read clock A tickstamp

  with the corresponding timestamps being sorted out after the time-critical polling code segment has been completed. It might also be possible to provide arithmetic functions computing nominal time intervals directly from tickstamps themselves to make the implementation of things like CoDel's packet timestamping more economical.
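  [The tickstamp split described above can be sketched as follows. Illustrative Python with hypothetical names; the "counter" is a stand-in for a hardware register, and real conversion state would live in the kernel.]

  ```python
  # Separate the cheap capture step (1) from the deferred conversion
  # step (2): the fast path stores only the raw counter value.

  _counter = 0

  def read_counter():
      # stand-in for reading a free-running hardware counter register
      global _counter
      _counter += 7
      return _counter

  def take_tickstamp():
      # step (1): cheap enough for an interrupt fast path
      return read_counter()

  def tickstamp_to_time(tc, rate, offset):
      # step (2): deferred arithmetic converting raw ticks to a time
      # of day, done only for tickstamps someone actually wants
      return tc * rate + offset
  ```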
- The fundamental problem that clock synchronization software needs to deal with is that the oscillator (likely a crystal soldered to a board somewhere) driving the clock the software is trying to keep correct makes errors: the output frequency of the oscillator usually differs considerably from the number written on its package and will measurably vary with time. The operating system will likely be setting the rate of advance of the digital clock being driven by the oscillator based on the number written on the oscillator's package; the job of clock synchronization software is to measure the actual frequency of the oscillator and to update the clock's rate of advance to match.

  The traditional BSD clock adjustment interface, adjtime(2), provides only a slewing phase (i.e. time) adjustment. There is no way to directly alter the clock's rate of advance to reduce the need for phase adjustments, nor is the adjtime(2) slew rate adjustable, and by modern standards the precision of the phase adjustments it implements is quite limited. The optional NTP adjustment interface does provide a frequency adjustment, but the resolution is lower than the measurement precision that is now achievable.
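  [The rate-versus-phase distinction above can be made concrete with a little arithmetic. Illustrative Python only; the 50 ppm and 0.1 ppm figures are assumed numbers, not taken from any particular hardware.]

  ```python
  # A one-time rate correction attacks the error at its source, whereas
  # adjtime(2)-style phase slews must be reapplied forever: phase error
  # grows linearly with the uncorrected rate error.

  TRUE_RATE = 1.0
  CLOCK_RATE = 1.0 + 50e-6   # oscillator running 50 ppm fast (assumed)

  def phase_error_after(seconds, rate):
      return seconds * (rate - TRUE_RATE)

  # Left uncorrected: 50 ppm over one day is about 4.3 s of phase error.
  uncorrected = phase_error_after(86400, CLOCK_RATE)

  # With the measured frequency applied as a rate correction, residual
  # error comes only from measurement error (0.1 ppm assumed here).
  corrected = phase_error_after(86400, TRUE_RATE + 0.1e-6)
  ```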