Two issues explained below:
- unconventional Era 0 split point
- undefined behaviour when casting a double
On Mon, 17 Feb 2025, ruggero rossi wrote:
On Tue, 18 Feb 2025 00:22:16 +1000 (AEST)
David Leonard <[email protected]> wrote:
Hello David,
This is the first time I've thought about NTP eras, so I'm not sure my
reasoning is correct, but here are my considerations:
in lpf_to_d, d was calculated by adding 1<<32 to dates < 1970, so d is
always the number of seconds since 1900-01-01T00:00:00.0 - no era here.
Agreed. My reading was that doubles are always used in this program to
hold the real seconds since 1900Z. Moreover, their values are always
non-negative.
This is different from the l_fixedpt_t structure, which holds the
(endian-corrected) encoding of the wrapped NTP timestamp, and where the era
number is missing.
Just to recap the problem, NTP eras, which are defined in the RFCs,
are the high bits lost when truncating the timestamp seconds to uint32.
Here are start and mid dates for some Eras, along with their values
in decimal and hex:
Era    Point  Date (UTC)            Seconds since 1900  Hex
Era 0  start  1900-01-01T00:00:00Z                   0  0x000000000
       mid    1968-01-20T03:14:08Z          2147483648  0x080000000
Era 1  start  2036-02-07T06:28:16Z          4294967296  0x100000000
       mid    2104-02-26T09:42:24Z          6442450944  0x180000000
Era 2  start  2172-03-15T12:56:32Z          8589934592  0x200000000
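To make the table concrete, here is a minimal sketch of the split between era number and on-the-wire seconds. The helper names ntp_era() and ntp_era_offset() are hypothetical and do not exist in ntpd.c; the input is assumed to be an unwrapped count of seconds since 1900Z.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of busybox's ntpd.c.
 * secs1900 is an unwrapped count of seconds since 1900-01-01T00:00:00Z. */

/* The era number is the part above bit 31, which the 32-bit
 * seconds field of an NTP timestamp cannot carry. */
static uint32_t ntp_era(uint64_t secs1900)
{
	return (uint32_t)(secs1900 >> 32);
}

/* The era offset is the low 32 bits, i.e. what actually goes
 * into the packet. */
static uint32_t ntp_era_offset(uint64_t secs1900)
{
	return (uint32_t)(secs1900 & 0xffffffffu);
}
```

For the mid-Era-1 row (6442450944) this gives era 1 and offset 0x80000000, which on the wire is indistinguishable from the mid-Era-0 row.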
Because ntpd.c originally assumed Era 0, it was going to break in 2036.
A previous commit to busybox's ntpd.c to make it Y2036/Y2038 ready
fixed that Era 0 assumption by inserting code to instead assume Era 0 for
large uint32s, and Era 1 for small uint32s. The split point chosen was
1970-01-01Z.
That choice of 1970 may be related to the confusing time_t
cast. The split point didn't have to be 1970, because there is already
an RFC 4330 convention of using ~1968 (the midpoint of Era 0):
As the NTP timestamp format has been in use for over 20 years, it
is possible that it will be in use 32 years from now, when the
seconds field overflows. As it is probably inappropriate to
archive NTP timestamps before bit 0 was set in 1968, a convenient
way to extend the useful life of NTP timestamps is the following
convention: If bit 0 is set, the UTC time is in the range 1968-
2036, and UTC time is reckoned from 0h 0m 0s UTC on 1 January
1900. If bit 0 is not set, the time is in the range 2036-2104 and
UTC time is reckoned from 6h 28m 16s UTC on 7 February 2036. Note
that when calculating the correspondence, 2000 is a leap year, and
leap seconds are not included in the reckoning.
The arithmetic calculations used by NTP to determine the clock
offset and roundtrip delay require the client time to be within 34
years of the server time before the client is launched. As the
time since the Unix base 1970 is now more than 34 years, means
must be available to initialize the clock at a date closer to the
present, either with a time-of-year (TOY) chip or from firmware.
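The bit-0 convention quoted above can be sketched in integer arithmetic as follows. The name ntp_unwrap_rfc4330() is hypothetical, and "bit 0" is taken in the RFC's numbering, where it is the most significant bit of the seconds field.

```c
#include <stdint.h>

/* Hypothetical helper, not part of busybox's ntpd.c.
 * Unwrap a host-order 32-bit NTP seconds field into seconds since
 * 1900-01-01T00:00:00Z using the RFC 4330 convention:
 *   MSB set   -> 1968..2036, reckoned from 1900 (Era 0)
 *   MSB clear -> 2036..2104, reckoned from 2036-02-07T06:28:16Z (Era 1)
 */
static uint64_t ntp_unwrap_rfc4330(uint32_t wire_secs)
{
	uint64_t secs1900 = wire_secs;

	if ((wire_secs & 0x80000000u) == 0)
		secs1900 += UINT64_C(0x100000000); /* add one era (2^32 s) */
	return secs1900;
}
```

With this split, timestamps from 1900 to 1968 are no longer representable, which is the trade-off the RFC text accepts.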
The update in RFC 5905 appears to have removed this convention and
replaced it with this text:
Eras cannot be produced by NTP directly, nor is there need to do so.
When necessary, they can be derived from external means, such as
the filesystem or dedicated hardware.
(I haven't looked at the rest of the protocol, but given that busybox may
be installed on embedded devices for more than an era, and that an RTC
with a dead battery can forget the era, I'm not sure how this might be
reconciled.)
Now we have to go back, using d_to_lpf
Say time_t is 64 bits.
then (time_t)d is never limited to maxInt64 for the next 500E9 years.
If 1970 < d < 2036-02-07 then (time_t)d is a full 32 bit quantity: actual
era.
If d > 2036-02-07 then (time_t)d is a 33 bit quantity, where 33 bit = 1 and
32 bits = 0.
Casting to an uint32_t takes only the lower 32
bits and represents a date in the next era.
Yes except for the last sentence. C23 says the final cast is undefined when
d >= MAX(time_t)+1:
6.3.1.4 Real floating and integer
When a finite value of standard floating type is converted to an
integer type other than bool, the fractional part is discarded (i.e.,
the value is truncated toward zero). If the value of the integral
part cannot be represented by the integer type, the behavior is
undefined.^65)
^65 The remaindering operation performed when a value of integer
type is converted to unsigned type need not be performed when a
value of real floating type is converted to unsigned type. Thus,
the range of portable real floating values is (−1, Utype_MAX + 1).
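One way to stay inside the portable range that footnote describes is to range-check the double, convert it to a 64-bit unsigned integer, and only then truncate with integer arithmetic, which is well defined. This is a sketch under assumptions: double_to_wire_secs() is a hypothetical name, and rejecting out-of-range input (rather than clamping) is my choice for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of busybox's ntpd.c.
 * Store the low 32 bits of the integral part of d in *out.
 * Returns false, leaving *out untouched, when d is negative, NaN,
 * or too large for the double -> uint64_t conversion to be defined. */
static bool double_to_wire_secs(double d, uint32_t *out)
{
	/* 18446744073709551616.0 is 2^64, exactly representable as a
	 * double. The negated form also rejects NaN, for which every
	 * comparison is false. */
	if (!(d >= 0.0 && d < 18446744073709551616.0))
		return false;

	/* The double -> uint64_t conversion is in range here; the mask
	 * is plain unsigned integer arithmetic. */
	*out = (uint32_t)((uint64_t)d & 0xffffffffu);
	return true;
}
```

The wraparound into the next era then comes from the integer mask, not from the floating-point conversion.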
Now, let's say time_t is 32 bit.
If d > 0x7fff.ffff, then the float conversion in (time_t)(d) is screwing us. We
get 0x7fff.ffff, that is 1968-01-20.
If we do (uint32_t)d (like I did), the float conversion limits the value
to 0xffff.ffff, and we can't switch era.
Hence we should do:
uint32_t intl = (uint32_t)(int64_t)d;
uint32_t frac = (uint32_t)((d - (int64_t)d) * 0xffffffff);
and this is also correct in case of 64 bit time_t. A good point is that we
never call for time_t: writing this e-mail I came to the conclusion
that time_t here only creates confusion between Unix dates and
NTP data.
Well, I completely agree that involving time_t is unnecessary and also
confusing.
But I think that a guard is needed when converting the double to uint32,
instead of relying on the UB wraparound. Something like this:
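static NOINLINE void
d_to_lfp(l_fixedpt_t *lfp, double d)
{
	double di, df;
	/* modf() returns the fractional part and stores the integral part */
	df = modf(d, &di);
	/* lfp->int_partl = fmod(di, 0x100000000); */
	/* Wraps once only, so d must be below 0x200000000 (end of Era 1) */
	lfp->int_partl = di < 0x100000000 ? di : di - 0x100000000;
	lfp->fractionl = df * 0x100000000;
}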
static NOINLINE double
lfp_to_d(l_fixedpt_t lfp)
{
	/* Treat truncated timestamps from the first half of an era as
	 * belonging to Era 1, and those from the second half as
	 * belonging to Era 0. This works for timestamps generated during
	 * the years 1968-2104. */
	double ret = lfp.int_partl + lfp.fractionl / (double)0x100000000;
	if (lfp.int_partl < 0x80000000)
		ret += 0x100000000;
	return ret;
}
I'm unsure of the size impact of modf() and fmod().
About the starting point, the only comment I can make is that we must be
sure that a system starting with Unix time_t = 0 (no battery in the
real-time clock) starts in the right era.
This should be fine, because (time_t)0 maps to NTP timestamp 0x83aa_7e80, which
is safely within the 1968-2036 half of Era 0, so it is unaliased in the lfp
representation.
Note that an NTP timestamp of exactly {0,0} has a special meaning. It's
only a problem if your system clock is stuck at 1900-01-01Z.
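As a quick check of the 0x83aa_7e80 figure, here is a small sketch of the Unix-to-NTP seconds mapping. NTP_UNIX_OFFSET and unix_to_wire_secs() are hypothetical names chosen for this example; 2208988800 is the number of seconds from 1900-01-01Z to 1970-01-01Z.

```c
#include <stdint.h>

/* Hypothetical names, not taken from busybox's ntpd.c. */
#define NTP_UNIX_OFFSET UINT64_C(2208988800) /* 1900-01-01Z to 1970-01-01Z */

/* Map Unix seconds to the 32-bit NTP seconds field (host order).
 * The addition is done in uint64_t and then truncated, so the era
 * number is dropped by well-defined integer arithmetic. */
static uint32_t unix_to_wire_secs(int64_t unix_secs)
{
	return (uint32_t)((uint64_t)unix_secs + NTP_UNIX_OFFSET);
}
```

A clock reset to Unix time 0 yields 0x83aa7e80, which has its top bit set and so falls in the 1968-2036 half. Unix time 2085978496 (2036-02-07T06:28:16Z) yields 0, the start of Era 1.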