Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-27 Thread Peter Hilber
On 21.06.24 10:45, David Woodhouse wrote:
> On Thu, 2024-06-20 at 17:19 +0100, David Woodhouse wrote:
>>
>>>
 +
 +   /* Counter frequency, and error margin. Units of (second >> 64) */
 +   uint64_t counter_period_frac_sec;
>>>
>>> AFAIU this might limit the precision in case of high counter frequencies.
>>> Could the unit be aligned to the expected frequency band of counters?
>>
>> This field indicates the period of a single tick, in units of 1>>64 of
>> a second. That's about 5.4e-20 seconds, or 54 zeptoseconds? 
>>
>> Can you walk me through a calculation where you believe that level of
>> precision is insufficient?
>>
>> I guess the precision matters if the structure isn't updated for a long
>> period of time, and the delta between the current counter and the
>> snapshot is high? That's a *lot* of 54 zeptosecondses? But you really
>> would need a *lot* of them before you care? And if nobody's been
>> calibrating your counter for that long, surely you have bigger worries?
>>
>> Am I missing something there?
> 
> Hm, that was a bit rushed at the end of the day; let's take a better look...
> 
> Let's take a hypothetical example of a 100GHz counter. That's two
> orders of magnitude more than today's Arm arch counter.
> 
> The period of such a counter would be 10 picoseconds. 
> 
> (Let's ignore the question of how far light actually travels in that
> time and how *realistic* that example is, for the moment.)
> 
> It turns out that at that rate, there *are* a lot of 54 zeptosecondses
> of precision loss in the day. It could be half a millisecond a day, or
> 20µs an hour.
> 
> That particular example of 10 picoseconds is 184467440.7370955
> (seconds>>64) which could be truncated to 184467440 — losing about 4PPB
> (a third of a millisecond a day; 14µs an hour).
> 
> So yeah, I suppose a 'shift' field could make sense. It's easy enough
> to consume on the guest side as it doesn't really perturb the 128-bit
> multiplication very much; especially if we don't let it be negative.
> 
> And implementations *can* just set it to zero. It hurts nobody.
> 
> Or were you thinking of just using a fixed shift like (seconds>>80)
> instead?

The 'shift' field should be fine.



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-21 Thread David Woodhouse
On Thu, 2024-06-20 at 14:37 +0200, Peter Hilber wrote:
> Should implement .gettimex64 instead.

Thanks. This look sane?

As noted in the code comment, in the *ideal* case we just build all
three pre/post/device timestamps from the very same counter read. So
sts->pre_ts == sts->post_ts.

In the less ideal case (which will happen on x86 when kvmclock is being
used for the system time), we use the time from ktime_get_snapshot() as
the pre_ts and take a new snapshot immediately after the get_cycles().


diff --git a/drivers/ptp/ptp_vmclock.c b/drivers/ptp/ptp_vmclock.c
index e8c65405a8f3..07a81a94d29a 100644
--- a/drivers/ptp/ptp_vmclock.c
+++ b/drivers/ptp/ptp_vmclock.c
@@ -96,9 +96,11 @@ static inline uint64_t mul_u64_u64_add_u64(uint64_t *res_hi, 
uint64_t delta,
 }
 
 static int vmclock_get_crosststamp(struct vmclock_state *st,
+  struct ptp_system_timestamp *sts,
   struct system_counterval_t *system_counter,
   struct timespec64 *tspec)
 {
+   struct system_time_snapshot systime_snapshot;
uint64_t cycle, delta, seq, frac_sec;
int ret = 0;
 
@@ -119,7 +121,17 @@ static int vmclock_get_crosststamp(struct vmclock_state 
*st,
continue;
}
 
-   cycle = get_cycles();
+   if (sts) {
+   ktime_get_snapshot(&systime_snapshot);
+
+   if (systime_snapshot.cs_id == st->cs_id) {
+   cycle = systime_snapshot.cycles;
+   } else {
+   cycle = get_cycles();
+   ptp_read_system_postts(sts);
+   }
+   } else
+   cycle = get_cycles();
 
delta = cycle - st->clk->counter_value;
 
@@ -139,6 +151,21 @@ static int vmclock_get_crosststamp(struct vmclock_state 
*st,
if (ret)
return ret;
 
+   /*
+* When invoked for gettimex64, fill in the pre/post system times.
+* The ideal case is when system time is based on the the same
+* counter as st->cs_id, in which case all three pre/post/device
+* times are derived from the *same* counter value. If cs_id does
+* not match, then the value from ktime_get_snapshot() is used as
+* pre_ts, and ptp_read_system_postts() was already called above
+* for the post_ts. Those are either side of the get_cycles() call.
+*/
+   if (sts) {
+   sts->pre_ts = ktime_to_timespec64(systime_snapshot.real);
+   if (systime_snapshot.cs_id == st->cs_id)
+   sts->post_ts = sts->pre_ts;
+   }
+
if (system_counter) {
system_counter->cycles = cycle;
system_counter->cs_id = st->cs_id;
@@ -155,7 +182,7 @@ static int ptp_vmclock_get_time_fn(ktime_t *device_time,
struct timespec64 tspec;
int ret;
 
-   ret = vmclock_get_crosststamp(st, system_counter, &tspec);
+   ret = vmclock_get_crosststamp(st, NULL, system_counter, &tspec);
if (!ret)
*device_time = timespec64_to_ktime(tspec);
 
@@ -198,7 +225,16 @@ static int ptp_vmclock_gettime(struct ptp_clock_info *ptp, 
struct timespec64 *ts
struct vmclock_state *st = container_of(ptp, struct vmclock_state,
ptp_clock_info);
 
-   return vmclock_get_crosststamp(st, NULL, ts);
+   return vmclock_get_crosststamp(st, NULL, NULL, ts);
+}
+
+static int ptp_vmclock_gettimex(struct ptp_clock_info *ptp, struct timespec64 
*ts,
+   struct ptp_system_timestamp *sts)
+{
+   struct vmclock_state *st = container_of(ptp, struct vmclock_state,
+   ptp_clock_info);
+
+   return vmclock_get_crosststamp(st, sts, NULL, ts);
 }
 
 static int ptp_vmclock_enable(struct ptp_clock_info *ptp,
@@ -216,6 +252,7 @@ static const struct ptp_clock_info ptp_vmclock_info = {
.adjfine= ptp_vmclock_adjfine,
.adjtime= ptp_vmclock_adjtime,
.gettime64  = ptp_vmclock_gettime,
+   .gettimex64 = ptp_vmclock_gettimex,
.settime64  = ptp_vmclock_settime,
.enable = ptp_vmclock_enable,
.getcrosststamp = ptp_vmclock_getcrosststamp,




smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-21 Thread David Woodhouse
On Thu, 2024-06-20 at 17:19 +0100, David Woodhouse wrote:
> 
> > 
> > > +
> > > +   /* Counter frequency, and error margin. Units of (second >> 64) */
> > > +   uint64_t counter_period_frac_sec;
> > 
> > AFAIU this might limit the precision in case of high counter frequencies.
> > Could the unit be aligned to the expected frequency band of counters?
> 
> This field indicates the period of a single tick, in units of 1>>64 of
> a second. That's about 5.4e-20 seconds, or 54 zeptoseconds? 
> 
> Can you walk me through a calculation where you believe that level of
> precision is insufficient?
> 
> I guess the precision matters if the structure isn't updated for a long
> period of time, and the delta between the current counter and the
> snapshot is high? That's a *lot* of 54 zeptosecondses? But you really
> would need a *lot* of them before you care? And if nobody's been
> calibrating your counter for that long, surely you have bigger worries?
> 
> Am I missing something there?

Hm, that was a bit rushed at the end of the day; let's take a better look...

Let's take a hypothetical example of a 100GHz counter. That's two
orders of magnitude more than today's Arm arch counter.

The period of such a counter would be 10 picoseconds. 

(Let's ignore the question of how far light actually travels in that
time and how *realistic* that example is, for the moment.)

It turns out that at that rate, there *are* a lot of 54 zeptosecondses
of precision loss in the day. It could be half a millisecond a day, or
20µs an hour.

That particular example of 10 picoseconds is 184467440.7370955
(seconds>>64) which could be truncated to 184467440 — losing about 4PPB
(a third of a millisecond a day; 14µs an hour).

So yeah, I suppose a 'shift' field could make sense. It's easy enough
to consume on the guest side as it doesn't really perturb the 128-bit
multiplication very much; especially if we don't let it be negative.

And implementations *can* just set it to zero. It hurts nobody.

Or were you thinking of just using a fixed shift like (seconds>>80)
instead?


smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-20 Thread David Woodhouse
On Thu, 2024-06-20 at 14:37 +0200, Peter Hilber wrote:
> Changing virtio-dev address to the new one. The discussion might also be
> relevant for virtio-comment, but it is discouraged to forward it to both.

I will happily take it to whichever forum you think is most
appropriate. (And you have my permission to direct replies to whichever
you choose.)

> On 15.06.24 10:40, David Woodhouse wrote:
> > As discussed before, I don't think it makes sense to design a new high-
> > precision virtual clock which only gets it right *most* of the time. We
> > absolutely need to address the issue of live migration.
 ...
> > Here's a first attempt at defining such a memory structure. For now
> > I've done it as a "vmclock" ACPI device based loosely on vmgenid, but I
> > think it makes most sense for this to be part of the virtio_rtc spec.
> > Ultimately it doesn't matter *how* the guest finds the memory region.
> 
> This looks sensible to me. I also think it would be possible to adapt this for
> the virtio-rtc spec. The proposal also supports some other use cases which are
> not in the virtio-rtc RFC spec v4 [2], notably vDSO-style clock reading, and
> others such as indication of a past leap second.

Right. The vDSO-style clock reading is key to solving the live
migration problem.

The other key thing this adds is the error bounds, which some
applications care deeply about. I've been working with the team that
owns ClockBound on that part: https://github.com/aws/clock-bound

> Compared to the virtio-rtc RFC spec v4 [2], this proposal does not seem to
> support leap second smearing. 

That's kind of intentional. Leap second smearing should be considered
the *antithesis* of precise time. Anyone who wants a monotonic realtime
clock should be using the POSIX CLOCK_TAI.

Part of my motivation for fixing the LM problem is because some
financial institutions can incur significant penalties for putting
inaccurate timestamps on transactions — even the disruption caused by
live migration is enough to trigger that. So deliberately lying to them
about what the UTC time is, by up to a second in either direction, is
not necessarily in their best interest.

As you noted, this proposal does expose leap seconds in the recent
past, which is all that's needed to allow a guest to generate a smeared
clock *from* the accurate clock that is provided through this
mechanism.

(Knowledge of past leap seconds is needed because in some modes,
smearing adjustments continue for some hours *afte* the leap second
should have occurred. So the NTP style of leap indicator isn't
sufficient).

> Also, it would be helpful to allow indicating
> when some of the fields are not valid (such as leapsecond_counter_value,
> leapsecond_tai_time, tai_offset_sec, utc_time_maxerror_picosec, ...).

Right. For some of those the answer can just be 'zero means invalid',
for the max error, perhaps MAX_UINT64. But we should definitely make
that explicit.

I'm also not entirely sure I understood Julien's insistence that we
include the leapsecond_counter_value as *well* as the
leapsecond_tai_time. It seems to me that the implementation would have
to recalculate that every time the frequency is adjusted. 

For some of those fields, I was expecting a certain amount of
bikeshedding to occur and figured it was better to post an early straw
man and solicit feedback.

> Do you have plans to contribute this to the Virtio specification and Linux
> driver implementation?

Yes, absolutely. For now I've implemented it in the Linux guest¹ and in
QEMU² as an ACPI device modelled on vmgenid, but I'd love *not* to have
to do that, and just to do it based on virtio instead.

¹ https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/vmclock
² https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/vmclock


> > +static const struct ptp_clock_info ptp_vmclock_info = {
> > +   .owner  = THIS_MODULE,
> > +   .max_adj= 0,
> > +   .n_ext_ts   = 0,
> > +   .n_pins = 0,
> > +   .pps= 0,
> > +   .adjfine= ptp_vmclock_adjfine,
> > +   .adjtime= ptp_vmclock_adjtime,
> > +   .gettime64  = ptp_vmclock_gettime,
> 
> Should implement .gettimex64 instead.

Ack, thanks. I'll go play with that.

> 
> > +
> > +   /* Counter frequency, and error margin. Units of (second >> 64) */
> > +   uint64_t counter_period_frac_sec;
> 
> AFAIU this might limit the precision in case of high counter frequencies.
> Could the unit be aligned to the expected frequency band of counters?

This field indicates the period of a single tick, in units of 1>>64 of
a second. That's about 5.4e-20 seconds, or 54 zeptoseconds? 

Can you walk me through a calculation where you believe that level of
precision is insufficient?

I guess the precision matters if the structure isn't updated for a long
period of time, and the delta between the current counter and the
snapshot is high? That's a *lot* of 54 zeptosecondses? But 

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-20 Thread Peter Hilber
Changing virtio-dev address to the new one. The discussion might also be
relevant for virtio-comment, but it is discouraged to forward it to both.

On 15.06.24 10:40, David Woodhouse wrote:
> On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
>> RFC v3 updates
>> --
>>
>> This series implements a driver for a virtio-rtc device conforming to spec
>> RFC v3 [1]. It now includes an RTC class driver with alarm, in addition to
>> the PTP clock driver already present before.
>>
>> This patch series depends on the patch series "treewide: Use clocksource id
>> for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
>> series on top of mainline.
>>
>> Overview
>> 
>>
>> This patch series adds the virtio_rtc module, and related bugfixes. The
>> virtio_rtc module implements a driver compatible with the proposed Virtio
>> RTC device specification [1]. The Virtio RTC (Real Time Clock) device
>> provides information about current time. The device can provide different
>> clocks, e.g. for the UTC or TAI time standards, or for physical time
>> elapsed since some past epoch. The driver can read the clocks with simple
>> or more accurate methods. Optionally, the driver can set an alarm.
>>
>> The series first fixes some bugs in the get_device_system_crosststamp()
>> interpolation code, which is required for reliable virtio_rtc operation.
>> Then, add the virtio_rtc implementation.
>>
>> For the Virtio RTC device, there is currently a proprietary implementation,
>> which has been used for provisional testing.
> 
> As discussed before, I don't think it makes sense to design a new high-
> precision virtual clock which only gets it right *most* of the time. We
> absolutely need to address the issue of live migration.
> 
> When live migration occurs, the guest's time precision suffers in two
> ways.
> 
> First, even when migrating to a supposedly identical host the precise
> rate of the underlying counter (TSC, arch counter, etc.) can change
> within the tolerances (e.g. ±50PPM) of the hardware. Unlike the natural
> changes that NTP normally manages to track, this is a *step* change,
> potentially from -50PPM to +50PPM from one host to the next.
> 
> Second, the accuracy of the counter as preserved across migration is
> limited by the accuracy of each host's NTP synchronization. So there is
> also a step change in the value of the counter itself.
> 
> At the moment of migration, the guest's timekeeping should be
> considered invalid. Any previous cross-timestamps are meaningless, and
> blindly using the previously-known relationship between the counter and
> real time can lead to problems such as corruption in distributed
> databases, fines for mis-timestamped transactions, and other general
> unhappiness.
> 
> We obviously can't get a new timestamp from the virtio_rtc device every
> time an application wants to know the time reliably. We don't even want
> *system* calls for that, which is why we have it in a vDSO.
> 
> We can take the same approach to warning the guest about clock
> disruption due to live migration. A shared memory region from the
> virtual clock device even be mapped all the way to userspace, for those
> applications which need precise and *reliable* time to check a
> 'disruption' marker in it, and do whatever is appropriate (e.g. abort
> transactions and wait for time to resync) when it happens.
> 
> We can do better than just letting the guest know about disruption, of
> course. We can put the actual counter→realtime relationship into the
> memory region too. As well as being able to provide a PTP driver with
> this, the applications which care about *reliable* timestamps can mmap
> the page directly and use it, vDSO-style, to have accurate timestamps
> even from the first cycle after migration.
> 
> When disruption is signalled, the guest needs to throw away any
> *additional* refinement that it's done with NTP/PTP/PPS/etc. and revert
> to knowing nothing more than what the hypervisor advertises here.
> 
> Here's a first attempt at defining such a memory structure. For now
> I've done it as a "vmclock" ACPI device based loosely on vmgenid, but I
> think it makes most sense for this to be part of the virtio_rtc spec.
> Ultimately it doesn't matter *how* the guest finds the memory region.

This looks sensible to me. I also think it would be possible to adapt this for
the virtio-rtc spec. The proposal also supports some other use cases which are
not in the virtio-rtc RFC spec v4 [2], notably vDSO-style clock reading, and
others such as indication of a past leap second.

Compared to the virtio-rtc RFC spec v4 [2], this proposal does not seem to
support leap second smearing. Also, it would be helpful to allow indicating
when some of the fields are not valid (such as leapsecond_counter_value,
leapsecond_tai_time, tai_offset_sec, utc_time_maxerror_picosec, ...).

Do you have plans to contribute this to the Virtio specification and Linux
driver implementation?

I also added a 

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-06-15 Thread David Woodhouse
On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
> RFC v3 updates
> --
> 
> This series implements a driver for a virtio-rtc device conforming to spec
> RFC v3 [1]. It now includes an RTC class driver with alarm, in addition to
> the PTP clock driver already present before.
> 
> This patch series depends on the patch series "treewide: Use clocksource id
> for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
> series on top of mainline.
> 
> Overview
> 
> 
> This patch series adds the virtio_rtc module, and related bugfixes. The
> virtio_rtc module implements a driver compatible with the proposed Virtio
> RTC device specification [1]. The Virtio RTC (Real Time Clock) device
> provides information about current time. The device can provide different
> clocks, e.g. for the UTC or TAI time standards, or for physical time
> elapsed since some past epoch. The driver can read the clocks with simple
> or more accurate methods. Optionally, the driver can set an alarm.
> 
> The series first fixes some bugs in the get_device_system_crosststamp()
> interpolation code, which is required for reliable virtio_rtc operation.
> Then, add the virtio_rtc implementation.
> 
> For the Virtio RTC device, there is currently a proprietary implementation,
> which has been used for provisional testing.

As discussed before, I don't think it makes sense to design a new high-
precision virtual clock which only gets it right *most* of the time. We
absolutely need to address the issue of live migration.

When live migration occurs, the guest's time precision suffers in two
ways.

First, even when migrating to a supposedly identical host the precise
rate of the underlying counter (TSC, arch counter, etc.) can change
within the tolerances (e.g. ±50PPM) of the hardware. Unlike the natural
changes that NTP normally manages to track, this is a *step* change,
potentially from -50PPM to +50PPM from one host to the next.

Second, the accuracy of the counter as preserved across migration is
limited by the accuracy of each host's NTP synchronization. So there is
also a step change in the value of the counter itself.

At the moment of migration, the guest's timekeeping should be
considered invalid. Any previous cross-timestamps are meaningless, and
blindly using the previously-known relationship between the counter and
real time can lead to problems such as corruption in distributed
databases, fines for mis-timestamped transactions, and other general
unhappiness.

We obviously can't get a new timestamp from the virtio_rtc device every
time an application wants to know the time reliably. We don't even want
*system* calls for that, which is why we have it in a vDSO.

We can take the same approach to warning the guest about clock
disruption due to live migration. A shared memory region from the
virtual clock device even be mapped all the way to userspace, for those
applications which need precise and *reliable* time to check a
'disruption' marker in it, and do whatever is appropriate (e.g. abort
transactions and wait for time to resync) when it happens.

We can do better than just letting the guest know about disruption, of
course. We can put the actual counter→realtime relationship into the
memory region too. As well as being able to provide a PTP driver with
this, the applications which care about *reliable* timestamps can mmap
the page directly and use it, vDSO-style, to have accurate timestamps
even from the first cycle after migration.

When disruption is signalled, the guest needs to throw away any
*additional* refinement that it's done with NTP/PTP/PPS/etc. and revert
to knowing nothing more than what the hypervisor advertises here.

Here's a first attempt at defining such a memory structure. For now
I've done it as a "vmclock" ACPI device based loosely on vmgenid, but I
think it makes most sense for this to be part of the virtio_rtc spec.
Ultimately it doesn't matter *how* the guest finds the memory region.

Very preliminary QEMU hacks at 
https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/vmclock
(I still need to do the KVM side helper for actually filling in the
host clock information, converted to the *guest* TSC)

This is the guest side. H aving heckled your assumption that we can use
the virt counter on Arm, I concede I'm doing the same thing for now.
The structure itself is at the end, or see
https://git.infradead.org/?p=users/dwmw2/linux.git;a=blob;f=include/uapi/linux/vmclock.h;hb=vmclock

From 9e1c3b823d497efa4e0acb21b226a72e4d6e8a53 Mon Sep 17 00:00:00 2001
From: David Woodhouse 
Date: Mon, 10 Jun 2024 15:10:11 +0100
Subject: [PATCH] ptp: Add vDSO-style vmclock support

The vmclock "device" provides a shared memory region with precision clock
information. By using shared memory, it is safe across Live Migration.

Signed-off-by: David Woodhouse 
---
 drivers/ptp/Kconfig  |  13 ++
 drivers/ptp/Makefile |   1 +
 drivers/ptp/ptp_vmclock.c| 404 ++

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-20 Thread David Woodhouse
On Tue, 2024-03-19 at 14:47 +0100, Peter Hilber wrote:
> While the virtio-comment list is not available, now also CC'ing Parav,
> which may be interested in this virtio-rtc spec related discussion thread.
> 
> On 14.03.24 15:19, David Woodhouse wrote:
> > On 14 March 2024 11:13:37 CET, Peter Hilber  
> > wrote:
> > > > To a certain extent, as long as the virtio-rtc device is designed to 
> > > > expose time precisely and unambiguously, it's less important if the 
> > > > Linux kernel *today* can use that. Although of course we should strive 
> > > > for that. Let's be...well, *unambiguous*, I suppose... that we've 
> > > > changed topics to discuss that though.
> > > > 
> > > 
> > > As Virtio is extensible (unlike hardware), my approach is to mostly 
> > > specify
> > > only what also has a PoC user and a use case.
> > 
> > If we get memory-mapped (X, Y, Z, ±x, ±y) I'll have a user and a use case 
> > on day one. Otherwise, as I said in my first response, I can go do that as 
> > a separate device and decide that virtio_rtc doesn't meet our needs 
> > (especially for maintaining accuracy over LM).
> 
> We plan to add 
> 
> - leap second indication,
> 
> - UTC-to-TAI offset,
> 
> - clock smearing indication (including the noon-to-noon linear smearing
>   variant which seems to be somewhat popular), and
>
> - clock accuracy indication
> 
> to the initial spec and to the PoC implementation.

Sounds good, thanks! I look forward to seeing the new revision. I'm
hoping Julien can give feedback on the clock accuracy parts.

> However, due to resource restrictions, we cannot ourselves add the
> memory-mapped clock to the initial spec.
>
> Everyone is very welcome to contribute the memory-mapped clock to the spec,
> and I think it might then still make it to the initial version.

Makes sense. That is my primary target, so I'm *hoping* we can converge
and get that into your initial spec, otherwise for expediency I'm going
to have to define an ACPI or DT or PCI device of our own and expose the
memory region through that instead.

(Even if I have to do that in the short term to stop the bleeding with
customers' clocks and live migration, I'd still aspire to migrate to a
virtio_rtc version of it in future)



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-19 Thread Peter Hilber
While the virtio-comment list is not available, now also CC'ing Parav,
which may be interested in this virtio-rtc spec related discussion thread.

On 14.03.24 15:19, David Woodhouse wrote:
> On 14 March 2024 11:13:37 CET, Peter Hilber  
> wrote:
>>> To a certain extent, as long as the virtio-rtc device is designed to expose 
>>> time precisely and unambiguously, it's less important if the Linux kernel 
>>> *today* can use that. Although of course we should strive for that. Let's 
>>> be...well, *unambiguous*, I suppose... that we've changed topics to discuss 
>>> that though.
>>>
>>
>> As Virtio is extensible (unlike hardware), my approach is to mostly specify
>> only what also has a PoC user and a use case.
> 
> If we get memory-mapped (X, Y, Z, ±x, ±y) I'll have a user and a use case on 
> day one. Otherwise, as I said in my first response, I can go do that as a 
> separate device and decide that virtio_rtc doesn't meet our needs (especially 
> for maintaining accuracy over LM).

We plan to add 

- leap second indication,

- UTC-to-TAI offset,

- clock smearing indication (including the noon-to-noon linear smearing
  variant which seems to be somewhat popular), and

- clock accuracy indication

to the initial spec and to the PoC implementation.

However, due to resource restrictions, we cannot ourselves add the
memory-mapped clock to the initial spec.

Everyone is very welcome to contribute the memory-mapped clock to the spec,
and I think it might then still make it to the initial version.

> 
> My main concern for virto_rtc is that we avoid *ambiguity*. Yes, I get that 
> it's extensible but we don't want a v1.0 of the spec, implemented by various 
> hypervisors, which still leaves guests not knowing what the actual time is. 
> That would not be good. And even UTC without a leap second indicator has that 
> problem.

Agreed. That should be addressed by the above changes.

Best regards,

Peter



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-14 Thread David Woodhouse
On 14 March 2024 11:13:37 CET, Peter Hilber  
wrote:
>> To a certain extent, as long as the virtio-rtc device is designed to expose 
>> time precisely and unambiguously, it's less important if the Linux kernel 
>> *today* can use that. Although of course we should strive for that. Let's 
>> be...well, *unambiguous*, I suppose... that we've changed topics to discuss 
>> that though.
>> 
>
>As Virtio is extensible (unlike hardware), my approach is to mostly specify
>only what also has a PoC user and a use case.

If we get memory-mapped (X, Y, Z, ±x, ±y) I'll have a user and a use case on 
day one. Otherwise, as I said in my first response, I can go do that as a 
separate device and decide that virtio_rtc doesn't meet our needs (especially 
for maintaining accuracy over LM).

My main concern for virto_rtc is that we avoid *ambiguity*. Yes, I get that 
it's extensible but we don't want a v1.0 of the spec, implemented by various 
hypervisors, which still leaves guests not knowing what the actual time is. 
That would not be good. And even UTC without a leap second indicator has that 
problem.



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-14 Thread Peter Hilber
Now CC'ing the previous commenters to the virtio-rtc spec draft, since
this discussion is mostly about the spec, and the Virtio mailing lists
still seem to be in a migration hiatus...

On 13.03.24 19:18, David Woodhouse wrote:
> On 13 March 2024 17:50:48 GMT, Peter Hilber  
> wrote:
>> On 13.03.24 13:45, David Woodhouse wrote:
>>> Surely the whole point of this effort is to provide guests with precise
>>> and *unambiguous* knowledge of what the time is? 
>>
>> I would say, a fundamental point of this effort is to enable such
>> implementations, and to detect if a device is promising to support this.
>>
>> Where we might differ is as to whether the Virtio clock *for every
>> implementation* has to be *continuously* accurate w.r.t. a time standard,
>> or whether *for some implementations* it could be enough that all guests in
>> the local system have the same, precise local notion of time, which might
>> be off from the actual time standard.
> 
> That makes sense, but remember I don't just want {X, Y, Z} but *also* the 
> error bounds of ±deltaY and ±deltaZ too.
> 
> So your example just boils down to "I'm calling it UTC, and it's really 
> precise, but we make no promises about its *accuracy*". And that's fine.
> 
>> Also, cf. ptp_kvm, which AFAIU doesn't address leap seconds at all...
> 
> KVM is not an exemplar of good time practices. 
> Not in *any* respect :)
> 
>> With your described use case the UTC_SMEARED clock should of course not be
>> used. The UTC_SMEARED clock would get a distinct name through udev, like
>> /dev/ptp_virtio_utc_smeared, so the incompatibility could at least be
>> detected.
> 
> As long as it's clear to all concerned that this is fundamentally not usable 
> as an accurate time source, and is only for the local-sync case you 
> described, sure.
> 
>>> Using UTC is bad enough, because for a UTC timestamp in the middle of a
>>> leap second the guest can't know know *which* occurrence of that leap
>>> second it is, so it might be wrong by a second. To resolve that
>>> ambiguity needs a leap indicator and/or tai_offset field.
>>
>> I agree that virtio-rtc should communicate this. The question is, what
>> exactly, and for which clock read request?
> 
> Are we now conflating software architecture (and Linux in particular) with 
> "hardware" design?
> 
> To a certain extent, as long as the virtio-rtc device is designed to expose 
> time precisely and unambiguously, it's less important if the Linux kernel 
> *today* can use that. Although of course we should strive for that. Let's 
> be...well, *unambiguous*, I suppose... that we've changed topics to discuss 
> that though.
> 

As Virtio is extensible (unlike hardware), my approach is to mostly specify
only what also has a PoC user and a use case.

>> As for PTP clocks:
>>
>> - It doesn't fit into the ioctl PTP_SYS_OFFSET_PRECISE2.
>>
>> - The clock_adjtime(2) tai_offset and return value could be set (if
>>  upstream will accept this). Would this help? As discussed, user space
>>  would need to interpret this (and currently no dynamic POSIX clock sets
>>  this).
> 
> Hm, maybe?
> 
> 
 I think I can add a SHOULD requirement which vaguely refers to vCPU 0, or
 boot vCPU. But the Virtio device is not necessarily hosted by a hypervisor,
 so the device might not even know which vCPUs there are. E.g. there is even
 interest to make virtio-rtc work as part of the virtio-net device (which
 might be implemented in hardware).
>>>
>>> Sure, but those implementations aren't going to offer the TSC pairing
>>> at all, are they?
>>>
>>
>> They could offer an Intel ART pairing (some physical PTP NICs are already
>> doing this, look for the convert_art_to_tsc() users).
> 
> Right, but isn't that software's problem? The time pairing is defined against 
> the ART in that case.

My point was that such a device would then not necessarily have an idea
what vCPU 0 is. But let's just say that this will be phrased as a SHOULD
best-effort requirement anyway.

Thanks for the comments,

Peter



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-14 Thread Peter Hilber
On 13.03.24 21:12, Andrew Lunn wrote:
>> As long as it doesn't behave differently from the other RTC, I'm fine
>> with this. This is important because I don't want to carry any special
>> infrastructure for this driver or to have to special case this driver
>> later on because it is incompatible with some evolution of the
>> subsystem.
> 
> Maybe deliberately throw away all the sub-second accuracy when
> accessing the time via the RTC API? That helps to make it look like an
> RTC. And when doing the rounding, add a constant offset of 10ms to
> emulate the virtual i2c bus it is hanging off, like most RTCs?
> 
> Andrew

The truncating to whole seconds is already done. As to the offset, I do not
understand how this would help. I can read out my CMOS RTC in ~50 us.

Thanks for the comment,

Peter



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Andrew Lunn
> As long as it doesn't behave differently from the other RTC, I'm fine
> with this. This is important because I don't want to carry any special
> infrastructure for this driver or to have to special case this driver
> later on because it is incompatible with some evolution of the
> subsystem.

Maybe deliberately throw away all the sub-second accuracy when
accessing the time via the RTC API? That helps to make it look like an
RTC. And when doing the rounding, add a constant offset of 10ms to
emulate the virtual i2c bus it is hanging off, like most RTCs?

  Andrew



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread David Woodhouse
On 13 March 2024 17:50:48 GMT, Peter Hilber  
wrote:
>On 13.03.24 13:45, David Woodhouse wrote:
>> Surely the whole point of this effort is to provide guests with precise
>> and *unambiguous* knowledge of what the time is? 
>
>I would say, a fundamental point of this effort is to enable such
>implementations, and to detect if a device is promising to support this.
>
>Where we might differ is as to whether the Virtio clock *for every
>implementation* has to be *continuously* accurate w.r.t. a time standard,
>or whether *for some implementations* it could be enough that all guests in
>the local system have the same, precise local notion of time, which might
>be off from the actual time standard.

That makes sense, but remember I don't just want {X, Y, Z} but *also* the error 
bounds of ±deltaY and ±deltaZ too.

So your example just boils down to "I'm calling it UTC, and it's really 
precise, but we make no promises about its *accuracy*". And that's fine.

>Also, cf. ptp_kvm, which AFAIU doesn't address leap seconds at all...

KVM is not an exemplar of good time practices. 
Not in *any* respect :)

>With your described use case the UTC_SMEARED clock should of course not be
>used. The UTC_SMEARED clock would get a distinct name through udev, like
>/dev/ptp_virtio_utc_smeared, so the incompatibility could at least be
>detected.

As long as it's clear to all concerned that this is fundamentally not usable as 
an accurate time source, and is only for the local-sync case you described, 
sure.

>> Using UTC is bad enough, because for a UTC timestamp in the middle of a
>> leap second the guest can't know know *which* occurrence of that leap
>> second it is, so it might be wrong by a second. To resolve that
>> ambiguity needs a leap indicator and/or tai_offset field.
>
>I agree that virtio-rtc should communicate this. The question is, what
>exactly, and for which clock read request?

Are we now conflating software architecture (and Linux in particular) with 
"hardware" design?

To a certain extent, as long as the virtio-rtc device is designed to expose 
time precisely and unambiguously, it's less important if the Linux kernel 
*today* can use that. Although of course we should strive for that. Let's 
be...well, *unambiguous*, I suppose... that we've changed topics to discuss 
that though.

>As for PTP clocks:
>
>- It doesn't fit into the ioctl PTP_SYS_OFFSET_PRECISE2.
>
>- The clock_adjtime(2) tai_offset and return value could be set (if
>  upstream will accept this). Would this help? As discussed, user space
>  would need to interpret this (and currently no dynamic POSIX clock sets
>  this).

Hm, maybe?


>>> I think I can add a SHOULD requirement which vaguely refers to vCPU 0, or
>>> boot vCPU. But the Virtio device is not necessarily hosted by a hypervisor,
>>> so the device might not even know which vCPUs there are. E.g. there is even
>>> interest to make virtio-rtc work as part of the virtio-net device (which
>>> might be implemented in hardware).
>> 
>> Sure, but those implementations aren't going to offer the TSC pairing
>> at all, are they?
>> 
>
>They could offer an Intel ART pairing (some physical PTP NICs are already
>doing this, look for the convert_art_to_tsc() users).

Right, but isn't that software's problem? The time pairing is defined against 
the ART in that case.



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Peter Hilber
On 13.03.24 15:06, David Woodhouse wrote:
> On Wed, 2024-03-13 at 13:58 +0100, Alexandre Belloni wrote:
>> The TSC or whatever CPU counter/clock that is used to keep the system
>> time is not an RTC, I don't get why it has to be exposed as such to the
>> guests. PTP is fine and precise, RTC is not.
> 
> Ah, I see. But the point of the virtio_rtc is not really to expose that
> CPU counter. The point is to report the wallclock time, just like an
> actual RTC. The real difference is the *precision*.
> 
> The virtio_rtc device has a facility to *also* expose the counter,
> because that's what we actually need to gain that precision...
> 
> Applications don't read the RTC every time they want to know what the
> time is. These days, they don't even make a system call; it's done
> entirely in userspace mode. The kernel exposes some shared memory,
> essentially saying "the counter was X at time Y, and runs at Z Hz".
> Then applications just read the CPU counter and do some arithmetic.
> 
> As we require more and more precision in the calibration, it becomes
> important to get *paired* readings of the CPU counter and the wallclock
> time at precisely the same moment. If the guest has to read one and
> then the other, potentially taking interrupts, getting preempted and
> suffering steal/SMI time in the middle, that introduces an error which
> is increasingly significant as we increasingly care about precision.
> 
> Peter's proposal exposes the pairs of {X,Y} and leaves *all* the guest
> kernels having to repeat readings over time and perform the calibration
> as the underlying hardware oscillator frequency (Z) drifts with
> temperature. I'm trying to get him to let the hypervisor expose the
> calibrated frequency Z too. Along with *error* bounds for ±δX and ±δZ.
> Which aside from reducing the duplication of effort, will *also* fix
> the problem of live migration where *all* those things suffer a step
> change and leave the guest with an inaccurate clock but not knowing it.

I am already convinced that this would work significantly better than the
{X,Y} pair (but would be a bit more effort to implement):

1. when accessed by user space, obviously

2. when backing the PTP clock, it saves CPU time and makes non-paired
   reads more precise.

I would just prefer to try upstreaming the {X,Y} pairing first. I think the
{X,Y,Z...} pairing could be discussed and developed in parallel.

Thanks for the comments,

Peter



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Peter Hilber
On 13.03.24 13:45, David Woodhouse wrote:
> On Wed, 2024-03-13 at 10:45 +0100, Peter Hilber wrote:
>> On 12.03.24 18:15, David Woodhouse wrote:
>>> On Mon, 2024-03-11 at 19:24 +0100, Peter Hilber wrote:
 On 08.03.24 13:33, David Woodhouse wrote:
> On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
>> On 07.03.24 15:02, David Woodhouse wrote:
>>> Hm, should we allow UTC? If you tell me the time in UTC, then
>>> (sometimes) I still don't actually know what the time is, because some
>>> UTC seconds occur twice. UTC only makes sense if you provide the TAI
>>> offset, surely? Should the virtio_rtc specification make it mandatory
>>> to provide such?
>>>
>>> Otherwise you're just designing it to allow crappy hypervisors to
>>> expose incomplete information.
>>>
>>
>> Hi David,
>>
>> (adding virtio-comm...@lists.oasis-open.org for spec discussion),
>>
>> thank you for your insightful comments. I think I take a broadly similar
>> view. The reason why the current spec and driver is like this is that I
>> took a pragmatic approach at first and only included features which work
>> out-of-the-box for the current Linux ecosystem.
>>
>> The current virtio_rtc features work similar to ptp_kvm, and therefore
>> can work out-of-the-box with time sync daemons such as chrony.
>>
>> As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock
>> as well, I am afraid that
>>
>> - in some (embedded) scenarios, the TAI clock may not be available
>>
>> - crappy hypervisors will pass off the UTC clock as the TAI clock.
>>
>> For the same reasons, I am also not sure about adding a *mandatory* TAI
>> offset to each readout. I don't know user-space software which would
>> leverage this already (at least not through the PTP clock interface).
>> And why would such software not go straight for the TAI clock instead?
>>
>> How about adding a requirement to the spec that the virtio-rtc device
>> SHOULD expose the TAI clock whenever it is available - would this
>> address your concerns?
>
> I think that would be too easy for implementors to miss, or decide not
> to obey. Or to get *wrong*, by exposing a TAI clock but actually
> putting UTC in it.
>
> I think I prefer to mandate the tai_offset field with the UTC clock.
> Crappy implementations will just set it to zero, but at least that
> gives a clear signal to the guests that it's *their* problem to
> resolve.

 To me there are some open questions regarding how this would work. Is there
 a use case for this with the v3 clock reading methods, or would it be
 enough to address this with the Virtio timekeeper?

 Looking at clock_adjtime(2), the tai_offset could be exposed, but probably
 best alongside some additional information about leap seconds. I am not
 aware about any user-space user. In addition, leap second smearing should
 also be addressed.

>>>
>>> Is there even a standard yet for leap-smearing? Will it be linear over
>>> 1000 seconds like UTC-SLS? Or semi-raised-cosine over 24 hours, which I
>>> think is what Google does? Meta does something different again, don't
>>> they?
>>>
>>> Exposing UTC as the only clock reference is bad enough; when leap
>>> seconds happen there's a whole second during which you don't *know*
>>> which second it is. It seems odd to me, for a precision clock to be
>>> deliberately ambiguous about what the time is!
>>
>> Just to be clear, the device can perfectly expose only a TAI reference
>> clock (or both UTC and TAI), the spec is just completely open about this,
>> as it tries to work for diverse use cases.
> 
> As long as the guest *knows* what it's getting, sure.
> 
>>>
>>> But if the virtio-rtc clock is defined as UTC and then expose something
>>> *different* in it, that's even worse. You potentially end up providing
>>> inaccurate time for a whole *day* leading up to the leap second.
>>>
>>> I think you're right that leap second smearing should be addressed. At
>>> the very least, by making it clear that the virtio-rtc clock which
>>> advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
>>> yet-to-be-defined variant.
>>>
>>
>> Agreed.
>>
>>> Please make it explicit that any hypervisor which wants to advertise a
>>> smeared clock shall define a new type which specifies the precise
>>> smearing algorithm and cannot be conflated with the one you're defining
>>> here.
>>>
>>
>> I will add a requirement that the UTC clock can never have smeared/smoothed
>> leap seconds.
> 
> Thanks.
> 
>> I think that not every vendor would bother to first add a definition of a
>> smearing algorithm. Also, I think in some cases knowing the precise
>> smearing algorithm might not be important (when having the same time as the
>> hypervisor is enough and accuracy w.r.t. actual time is less important).
>>
>> S

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Alexandre Belloni
On 13/03/2024 14:06:42+, David Woodhouse wrote:
> If you're asking why patch 7/7 in Peter's series exists to expose the
> virtio clock through RTC, and you're not particularly interested in the
> first six, I suppose that's a fair question. As is the question of "why
> is it called virtio_rtc not virtio_ptp?". 
> 

Exactly my question, thanks :)

> But let me turn it around: if the kernel has access to this virtio
> device and *not* any other RTC, why *wouldn't* the kernel use the time
> from it? The fact that it can optionally *also* provide paired readings
> with the CPU counter doesn't actually *hurt* for the RTC use case, does
> it?

As long as it doesn't behave differently from the other RTC, I'm fine
with this. This is important because I don't want to carry any special
infrastructure for this driver or to have to special case this driver
later on because it is incompatible with some evolution of the
subsystem.

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Peter Hilber
On 13.03.24 12:18, Alexandre Belloni wrote:
> On 13/03/2024 10:45:54+0100, Peter Hilber wrote:
>>> Exposing UTC as the only clock reference is bad enough; when leap
>>> seconds happen there's a whole second during which you don't *know*
>>> which second it is. It seems odd to me, for a precision clock to be
>>> deliberately ambiguous about what the time is!
>>
>> Just to be clear, the device can perfectly expose only a TAI reference
>> clock (or both UTC and TAI), the spec is just completely open about this,
>> as it tries to work for diverse use cases.
>>
>>>
>>> But if the virtio-rtc clock is defined as UTC and then expose something
>>> *different* in it, that's even worse. You potentially end up providing
>>> inaccurate time for a whole *day* leading up to the leap second.
>>>
>>> I think you're right that leap second smearing should be addressed. At
>>> the very least, by making it clear that the virtio-rtc clock which
>>> advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
>>> yet-to-be-defined variant.
>>>
>>
>> Agreed.
>>
>>> Please make it explicit that any hypervisor which wants to advertise a
>>> smeared clock shall define a new type which specifies the precise
>>> smearing algorithm and cannot be conflated with the one you're defining
>>> here.
>>>
>>
>> I will add a requirement that the UTC clock can never have smeared/smoothed
>> leap seconds.
>>
>> I think that not every vendor would bother to first add a definition of a
>> smearing algorithm. Also, I think in some cases knowing the precise
>> smearing algorithm might not be important (when having the same time as the
>> hypervisor is enough and accuracy w.r.t. actual time is less important).
>>
>> So maybe I should add a VIRTIO_RTC_CLOCK_UTC_SMEARED clock type, which for
>> now could catch every UTC-like clock which smears/smoothes leap seconds,
>> where the vendor cannot be bothered to add the smearing algorithm to spec
>> and implementations.
>>
> 
> I still don't know anything about virtio but under Linux, an RTC is
> always UTC (or localtime when dual booting but let's not care) and never
> accounts for leap seconds. Having an RTC and RTC driver behaving
> differently would be super inconvenient. Why don't you leave this to
> userspace?
> 
> I guess I'm still questioning whether this is the correct interface to
> expose the host system time instead of an actual RTC.
> 

virtio_rtc only registers RTC class devices for virtio_rtc clock type UTC,
so adding an UTC_SMEARED clock type would avoid leap seconds for the RTC
class.

But I understand that there are more concerns and I will re-consider. From
my POV CLOCK_{REALTIME,MONOTONIC}_ALARM support is very important however.

So the only alternative to me seems to be adding an alternative alarmtimer
backend (and time synchronization through user space).

Thanks for the comment,

Peter



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread David Woodhouse
On Wed, 2024-03-13 at 13:58 +0100, Alexandre Belloni wrote:
> The TSC or whatever CPU counter/clock that is used to keep the system
> time is not an RTC, I don't get why it has to be exposed as such to the
> guests. PTP is fine and precise, RTC is not.

Ah, I see. But the point of the virtio_rtc is not really to expose that
CPU counter. The point is to report the wallclock time, just like an
actual RTC. The real difference is the *precision*.

The virtio_rtc device has a facility to *also* expose the counter,
because that's what we actually need to gain that precision...

Applications don't read the RTC every time they want to know what the
time is. These days, they don't even make a system call; it's done
entirely in userspace mode. The kernel exposes some shared memory,
essentially saying "the counter was X at time Y, and runs at Z Hz".
Then applications just read the CPU counter and do some arithmetic.

As we require more and more precision in the calibration, it becomes
important to get *paired* readings of the CPU counter and the wallclock
time at precisely the same moment. If the guest has to read one and
then the other, potentially taking interrupts, getting preempted and
suffering steal/SMI time in the middle, that introduces an error which
is increasingly significant as we increasingly care about precision.

Peter's proposal exposes the pairs of {X,Y} and leaves *all* the guest
kernels having to repeat readings over time and perform the calibration
as the underlying hardware oscillator frequency (Z) drifts with
temperature. I'm trying to get him to let the hypervisor expose the
calibrated frequency Z too. Along with *error* bounds for ±δX and ±δZ.
Which aside from reducing the duplication of effort, will *also* fix
the problem of live migration where *all* those things suffer a step
change and leave the guest with an inaccurate clock but not knowing it.

But that part isn't relevant to the RTC, as you say. RTC doesn't care
about that level of precision; it's just what the system uses to know
roughly what year it is, when it powers up. (And isn't even really
trusted for that, which is a large part of why I added the
X509_V_FLAG_NO_CHECK_TIME flag to OpenSSL, so Secure Boot doesn't break
when the RTC is catastrophically wrong :)

If you're asking why patch 7/7 in Peter's series exists to expose the
virtio clock through RTC, and you're not particularly interested in the
first six, I suppose that's a fair question. As is the question of "why
is it called virtio_rtc not virtio_ptp?". 

But let me turn it around: if the kernel has access to this virtio
device and *not* any other RTC, why *wouldn't* the kernel use the time
from it? The fact that it can optionally *also* provide paired readings
with the CPU counter doesn't actually *hurt* for the RTC use case, does
it?





smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Alexandre Belloni
On 13/03/2024 12:29:38+, David Woodhouse wrote:
> On Wed, 2024-03-13 at 12:18 +0100, Alexandre Belloni wrote:
> > 
> > I still don't know anything about virtio but under Linux, an RTC is
> > always UTC (or localtime when dual booting but let's not care) and never
> > accounts for leap seconds. Having an RTC and RTC driver behaving
> > differently would be super inconvenient. Why don't you leave this to
> > userspace?
> 
> Well yes, we don't need to expose *anything* from the hypervisor and we
> can leave it all to guest userspace. We can run NTP on every single one
> of *hundreds* of guests, leaving them all to duplicate the work of
> calibrating the *same* underlying oscillator.
> 

Really, I see the point of sharing the time accurately between the host
and the guest and not duplicating the effort. This is not what I am
objecting to.

> I thought we were trying to avoid that, by having the hypervisor tell
> them what the time was. If we're going to do that, we need it to be
> sufficiently precise (and some clients want to *know* the precision),
> and above all we need it to be *unambiguous*.
> 
> If the hypervisor says that the time is 3692217600.001, then the guest
> doesn't actually know *which* 3692217600.001 it is, and thus it still
> doesn't know the time to an accuracy better than 1 second.
> 

The RTC subsystem has a 1 second resolution and this is not going to
change because there is no point doing so for the hardware, to get the
time precisely, UIE MUST be used there is no vendor that will just
support reading the time/date or timestamp as this is way too imprecise.

> And if we start allowing the hypervisor to smear clocks in some other
> underspecified ways, then we end up with errors of up to 1 second in
> the clock for long periods of time *around* the leap second.
> 
> We need to avoid that ambiguity.

Exactly my point and I said, reading an RTC is always UTC and never
handles leap seconds so if userspace doesn't handle the leap second and
updates the RTC, too bad. There are obviously no RTC that will smear
clock unless instructed to, so the hypervisor must not smear the clock.

Note that this is not an issue for the subsystem because if you are not
capable to track an accurate clock, the RTC itself will have drifted by
much more than a second in between leap second inclusions.

> 
> > I guess I'm still questioning whether this is the correct interface to
> > expose the host system time instead of an actual RTC.
> 
> If an RTC device is able to report '23:59:60' as the time of day, I
> suppose that *could* resolve the ambiguity. But talking to a device is
> slow; we want guests to be able to know the time — accurately — with a
> simple counter/tsc read and some arithmetic. Which means *paired* reads
> of 'RTC' and the counter, and a precise indication of the counter
> frequency.

23:59:60 is not and will never be allowed in the RTC subsystem as this
is an invalid value for the hardware.

The TSC or whatever CPU counter/clock that is used to keep the system
time is not an RTC, I don't get why it has to be exposed as such to the
guests. PTP is fine and precise, RTC is not.

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread David Woodhouse
On Wed, 2024-03-13 at 10:45 +0100, Peter Hilber wrote:
> On 12.03.24 18:15, David Woodhouse wrote:
> > On Mon, 2024-03-11 at 19:24 +0100, Peter Hilber wrote:
> > > On 08.03.24 13:33, David Woodhouse wrote:
> > > > On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
> > > > > On 07.03.24 15:02, David Woodhouse wrote:
> > > > > > Hm, should we allow UTC? If you tell me the time in UTC, then
> > > > > > (sometimes) I still don't actually know what the time is, because 
> > > > > > some
> > > > > > UTC seconds occur twice. UTC only makes sense if you provide the TAI
> > > > > > offset, surely? Should the virtio_rtc specification make it 
> > > > > > mandatory
> > > > > > to provide such?
> > > > > > 
> > > > > > Otherwise you're just designing it to allow crappy hypervisors to
> > > > > > expose incomplete information.
> > > > > > 
> > > > > 
> > > > > Hi David,
> > > > > 
> > > > > (adding virtio-comm...@lists.oasis-open.org for spec discussion),
> > > > > 
> > > > > thank you for your insightful comments. I think I take a broadly 
> > > > > similar
> > > > > view. The reason why the current spec and driver is like this is that 
> > > > > I
> > > > > took a pragmatic approach at first and only included features which 
> > > > > work
> > > > > out-of-the-box for the current Linux ecosystem.
> > > > > 
> > > > > The current virtio_rtc features work similar to ptp_kvm, and therefore
> > > > > can work out-of-the-box with time sync daemons such as chrony.
> > > > > 
> > > > > As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock
> > > > > as well, I am afraid that
> > > > > 
> > > > > - in some (embedded) scenarios, the TAI clock may not be available
> > > > > 
> > > > > - crappy hypervisors will pass off the UTC clock as the TAI clock.
> > > > > 
> > > > > For the same reasons, I am also not sure about adding a *mandatory* 
> > > > > TAI
> > > > > offset to each readout. I don't know user-space software which would
> > > > > leverage this already (at least not through the PTP clock interface).
> > > > > And why would such software not go straight for the TAI clock instead?
> > > > > 
> > > > > How about adding a requirement to the spec that the virtio-rtc device
> > > > > SHOULD expose the TAI clock whenever it is available - would this
> > > > > address your concerns?
> > > > 
> > > > I think that would be too easy for implementors to miss, or decide not
> > > > to obey. Or to get *wrong*, by exposing a TAI clock but actually
> > > > putting UTC in it.
> > > > 
> > > > I think I prefer to mandate the tai_offset field with the UTC clock.
> > > > Crappy implementations will just set it to zero, but at least that
> > > > gives a clear signal to the guests that it's *their* problem to
> > > > resolve.
> > > 
> > > To me there are some open questions regarding how this would work. Is 
> > > there
> > > a use case for this with the v3 clock reading methods, or would it be
> > > enough to address this with the Virtio timekeeper?
> > > 
> > > Looking at clock_adjtime(2), the tai_offset could be exposed, but probably
> > > best alongside some additional information about leap seconds. I am not
> > > aware about any user-space user. In addition, leap second smearing should
> > > also be addressed.
> > > 
> > 
> > Is there even a standard yet for leap-smearing? Will it be linear over
> > 1000 seconds like UTC-SLS? Or semi-raised-cosine over 24 hours, which I
> > think is what Google does? Meta does something different again, don't
> > they?
> > 
> > Exposing UTC as the only clock reference is bad enough; when leap
> > seconds happen there's a whole second during which you don't *know*
> > which second it is. It seems odd to me, for a precision clock to be
> > deliberately ambiguous about what the time is!
> 
> Just to be clear, the device can perfectly expose only a TAI reference
> clock (or both UTC and TAI), the spec is just completely open about this,
> as it tries to work for diverse use cases.

As long as the guest *knows* what it's getting, sure.

> > 
> > But if the virtio-rtc clock is defined as UTC and then expose something
> > *different* in it, that's even worse. You potentially end up providing
> > inaccurate time for a whole *day* leading up to the leap second.
> > 
> > I think you're right that leap second smearing should be addressed. At
> > the very least, by making it clear that the virtio-rtc clock which
> > advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
> > yet-to-be-defined variant.
> > 
> 
> Agreed.
> 
> > Please make it explicit that any hypervisor which wants to advertise a
> > smeared clock shall define a new type which specifies the precise
> > smearing algorithm and cannot be conflated with the one you're defining
> > here.
> > 
> 
> I will add a requirement that the UTC clock can never have smeared/smoothed
> leap seconds.

Thanks.

> I think that not every vendor would bother to first add a definition of a
> smearing algorithm. Also, I think in some 

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread David Woodhouse
On Wed, 2024-03-13 at 12:18 +0100, Alexandre Belloni wrote:
> 
> I still don't know anything about virtio but under Linux, an RTC is
> always UTC (or localtime when dual booting but let's not care) and never
> accounts for leap seconds. Having an RTC and RTC driver behaving
> differently would be super inconvenient. Why don't you leave this to
> userspace?

Well yes, we don't need to expose *anything* from the hypervisor and we
can leave it all to guest userspace. We can run NTP on every single one
of *hundreds* of guests, leaving them all to duplicate the work of
calibrating the *same* underlying oscillator.

I thought we were trying to avoid that, by having the hypervisor tell
them what the time was. If we're going to do that, we need it to be
sufficiently precise (and some clients want to *know* the precision),
and above all we need it to be *unambiguous*.

If the hypervisor says that the time is 3692217600.001, then the guest
doesn't actually know *which* 3692217600.001 it is, and thus it still
doesn't know the time to an accuracy better than 1 second.

And if we start allowing the hypervisor to smear clocks in some other
underspecified ways, then we end up with errors of up to 1 second in
the clock for long periods of time *around* the leap second.

We need to avoid that ambiguity.

> I guess I'm still questioning whether this is the correct interface to
> expose the host system time instead of an actual RTC.

If an RTC device is able to report '23:59:60' as the time of day, I
suppose that *could* resolve the ambiguity. But talking to a device is
slow; we want guests to be able to know the time — accurately — with a
simple counter/tsc read and some arithmetic. Which means *paired* reads
of 'RTC' and the counter, and a precise indication of the counter
frequency.



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Alexandre Belloni
On 13/03/2024 10:45:54+0100, Peter Hilber wrote:
> > Exposing UTC as the only clock reference is bad enough; when leap
> > seconds happen there's a whole second during which you don't *know*
> > which second it is. It seems odd to me, for a precision clock to be
> > deliberately ambiguous about what the time is!
> 
> Just to be clear, the device can perfectly expose only a TAI reference
> clock (or both UTC and TAI), the spec is just completely open about this,
> as it tries to work for diverse use cases.
> 
> > 
> > But if the virtio-rtc clock is defined as UTC and then expose something
> > *different* in it, that's even worse. You potentially end up providing
> > inaccurate time for a whole *day* leading up to the leap second.
> > 
> > I think you're right that leap second smearing should be addressed. At
> > the very least, by making it clear that the virtio-rtc clock which
> > advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
> > yet-to-be-defined variant.
> > 
> 
> Agreed.
> 
> > Please make it explicit that any hypervisor which wants to advertise a
> > smeared clock shall define a new type which specifies the precise
> > smearing algorithm and cannot be conflated with the one you're defining
> > here.
> > 
> 
> I will add a requirement that the UTC clock can never have smeared/smoothed
> leap seconds.
> 
> I think that not every vendor would bother to first add a definition of a
> smearing algorithm. Also, I think in some cases knowing the precise
> smearing algorithm might not be important (when having the same time as the
> hypervisor is enough and accuracy w.r.t. actual time is less important).
> 
> So maybe I should add a VIRTIO_RTC_CLOCK_UTC_SMEARED clock type, which for
> now could catch every UTC-like clock which smears/smoothes leap seconds,
> where the vendor cannot be bothered to add the smearing algorithm to spec
> and implementations.
> 

I still don't know anything about virtio but under Linux, an RTC is
always UTC (or localtime when dual booting but let's not care) and never
accounts for leap seconds. Having an RTC and RTC driver behaving
differently would be super inconvenient. Why don't you leave this to
userspace?

I guess I'm still questioning whether this is the correct interface to
expose the host system time instead of an actual RTC.

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-13 Thread Peter Hilber
On 12.03.24 18:15, David Woodhouse wrote:
> On Mon, 2024-03-11 at 19:24 +0100, Peter Hilber wrote:
>> On 08.03.24 13:33, David Woodhouse wrote:
>>> On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
 On 07.03.24 15:02, David Woodhouse wrote:
> Hm, should we allow UTC? If you tell me the time in UTC, then
> (sometimes) I still don't actually know what the time is, because some
> UTC seconds occur twice. UTC only makes sense if you provide the TAI
> offset, surely? Should the virtio_rtc specification make it mandatory
> to provide such?
>
> Otherwise you're just designing it to allow crappy hypervisors to
> expose incomplete information.
>

 Hi David,

 (adding virtio-comm...@lists.oasis-open.org for spec discussion),

 thank you for your insightful comments. I think I take a broadly similar
 view. The reason why the current spec and driver is like this is that I
 took a pragmatic approach at first and only included features which work
 out-of-the-box for the current Linux ecosystem.

 The current virtio_rtc features work similar to ptp_kvm, and therefore
 can work out-of-the-box with time sync daemons such as chrony.

 As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock
 as well, I am afraid that

 - in some (embedded) scenarios, the TAI clock may not be available

 - crappy hypervisors will pass off the UTC clock as the TAI clock.

 For the same reasons, I am also not sure about adding a *mandatory* TAI
 offset to each readout. I don't know user-space software which would
 leverage this already (at least not through the PTP clock interface).
 And why would such software not go straight for the TAI clock instead?

 How about adding a requirement to the spec that the virtio-rtc device
 SHOULD expose the TAI clock whenever it is available - would this
 address your concerns?
>>>
>>> I think that would be too easy for implementors to miss, or decide not
>>> to obey. Or to get *wrong*, by exposing a TAI clock but actually
>>> putting UTC in it.
>>>
>>> I think I prefer to mandate the tai_offset field with the UTC clock.
>>> Crappy implementations will just set it to zero, but at least that
>>> gives a clear signal to the guests that it's *their* problem to
>>> resolve.
>>
>> To me there are some open questions regarding how this would work. Is there
>> a use case for this with the v3 clock reading methods, or would it be
>> enough to address this with the Virtio timekeeper?
>>
>> Looking at clock_adjtime(2), the tai_offset could be exposed, but probably
>> best alongside some additional information about leap seconds. I am not
>> aware about any user-space user. In addition, leap second smearing should
>> also be addressed.
>>
> 
> Is there even a standard yet for leap-smearing? Will it be linear over
> 1000 seconds like UTC-SLS? Or semi-raised-cosine over 24 hours, which I
> think is what Google does? Meta does something different again, don't
> they?
> 
> Exposing UTC as the only clock reference is bad enough; when leap
> seconds happen there's a whole second during which you don't *know*
> which second it is. It seems odd to me, for a precision clock to be
> deliberately ambiguous about what the time is!

Just to be clear, the device can perfectly expose only a TAI reference
clock (or both UTC and TAI), the spec is just completely open about this,
as it tries to work for diverse use cases.

> 
> But if the virtio-rtc clock is defined as UTC and then expose something
> *different* in it, that's even worse. You potentially end up providing
> inaccurate time for a whole *day* leading up to the leap second.
> 
> I think you're right that leap second smearing should be addressed. At
> the very least, by making it clear that the virtio-rtc clock which
> advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
> yet-to-be-defined variant.
> 

Agreed.

> Please make it explicit that any hypervisor which wants to advertise a
> smeared clock shall define a new type which specifies the precise
> smearing algorithm and cannot be conflated with the one you're defining
> here.
> 

I will add a requirement that the UTC clock can never have smeared/smoothed
leap seconds.

I think that not every vendor would bother to first add a definition of a
smearing algorithm. Also, I think in some cases knowing the precise
smearing algorithm might not be important (when having the same time as the
hypervisor is enough and accuracy w.r.t. actual time is less important).

So maybe I should add a VIRTIO_RTC_CLOCK_UTC_SMEARED clock type, which for
now could catch every UTC-like clock which smears/smoothes leap seconds,
where the vendor cannot be bothered to add the smearing algorithm to spec
and implementations.

As for UTC-SLS, this *could* also be added, although [1] says

It is inappropriate to use Internet-Drafts as reference material or
   

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-12 Thread David Woodhouse
On Mon, 2024-03-11 at 19:24 +0100, Peter Hilber wrote:
> On 08.03.24 13:33, David Woodhouse wrote:
> > On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
> > > On 07.03.24 15:02, David Woodhouse wrote:
> > > > Hm, should we allow UTC? If you tell me the time in UTC, then
> > > > (sometimes) I still don't actually know what the time is, because some
> > > > UTC seconds occur twice. UTC only makes sense if you provide the TAI
> > > > offset, surely? Should the virtio_rtc specification make it mandatory
> > > > to provide such?
> > > > 
> > > > Otherwise you're just designing it to allow crappy hypervisors to
> > > > expose incomplete information.
> > > > 
> > > 
> > > Hi David,
> > > 
> > > (adding virtio-comm...@lists.oasis-open.org for spec discussion),
> > > 
> > > thank you for your insightful comments. I think I take a broadly similar
> > > view. The reason why the current spec and driver is like this is that I
> > > took a pragmatic approach at first and only included features which work
> > > out-of-the-box for the current Linux ecosystem.
> > > 
> > > The current virtio_rtc features work similar to ptp_kvm, and therefore
> > > can work out-of-the-box with time sync daemons such as chrony.
> > > 
> > > As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock
> > > as well, I am afraid that
> > > 
> > > - in some (embedded) scenarios, the TAI clock may not be available
> > > 
> > > - crappy hypervisors will pass off the UTC clock as the TAI clock.
> > > 
> > > For the same reasons, I am also not sure about adding a *mandatory* TAI
> > > offset to each readout. I don't know user-space software which would
> > > leverage this already (at least not through the PTP clock interface).
> > > And why would such software not go straight for the TAI clock instead?
> > > 
> > > How about adding a requirement to the spec that the virtio-rtc device
> > > SHOULD expose the TAI clock whenever it is available - would this
> > > address your concerns?
> > 
> > I think that would be too easy for implementors to miss, or decide not
> > to obey. Or to get *wrong*, by exposing a TAI clock but actually
> > putting UTC in it.
> > 
> > I think I prefer to mandate the tai_offset field with the UTC clock.
> > Crappy implementations will just set it to zero, but at least that
> > gives a clear signal to the guests that it's *their* problem to
> > resolve.
> 
> To me there are some open questions regarding how this would work. Is there
> a use case for this with the v3 clock reading methods, or would it be
> enough to address this with the Virtio timekeeper?
> 
> Looking at clock_adjtime(2), the tai_offset could be exposed, but probably
> best alongside some additional information about leap seconds. I am not
> aware about any user-space user. In addition, leap second smearing should
> also be addressed.
> 

Is there even a standard yet for leap-smearing? Will it be linear over
1000 seconds like UTC-SLS? Or semi-raised-cosine over 24 hours, which I
think is what Google does? Meta does something different again, don't
they?

Exposing UTC as the only clock reference is bad enough; when leap
seconds happen there's a whole second during which you don't *know*
which second it is. It seems odd to me, for a precision clock to be
deliberately ambiguous about what the time is!

But if the virtio-rtc clock is defined as UTC and then expose something
*different* in it, that's even worse. You potentially end up providing
inaccurate time for a whole *day* leading up to the leap second.

I think you're right that leap second smearing should be addressed. At
the very least, by making it clear that the virtio-rtc clock which
advertises UTC shall be used *only* for UTC, never UTC-SLS or any other
yet-to-be-defined variant.

Please make it explicit that any hypervisor which wants to advertise a
smeared clock shall define a new type which specifies the precise
smearing algorithm and cannot be conflated with the one you're defining
here.

> > One other thing to note is I think we're being very naïve about the TSC
> > on x86 hosts. Theoretically, the TSC for every vCPU might run at a
> > different frequency, and even if they run at the same frequency they
> > might be offset from each other. I'm happy to be naïve but I think we
> > should be *explicitly* so, and just say for example that it's defined
> > against vCPU0 so if other vCPUs are different then all bets are off.
> 
> ATM Virtio has no notion of vCPUs, or vCPU topology. So I wonder if you
> have an opinion on how to represent this in a platform-independent way.

Well, it doesn't have a notion of TSCs either; you include that by
implicit reference don't you?



smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-11 Thread Peter Hilber
On 08.03.24 13:33, David Woodhouse wrote:
> On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
>> On 07.03.24 15:02, David Woodhouse wrote:
>>> On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
 RFC v3 updates
 --

 This series implements a driver for a virtio-rtc device conforming to
 spec
 RFC v3 [1]. It now includes an RTC class driver with alarm, in
 addition to
 the PTP clock driver already present before.

 This patch series depends on the patch series "treewide: Use
 clocksource id
 for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
 series on top of mainline.

 Overview
 

 This patch series adds the virtio_rtc module, and related bugfixes.
 The
 virtio_rtc module implements a driver compatible with the proposed
 Virtio
 RTC device specification [1]. The Virtio RTC (Real Time Clock) device
 provides information about current time. The device can provide
 different
 clocks, e.g. for the UTC or TAI time standards, or for physical time
 elapsed since some past epoch.
>>>
>>> Hm, should we allow UTC? If you tell me the time in UTC, then
>>> (sometimes) I still don't actually know what the time is, because some
>>> UTC seconds occur twice. UTC only makes sense if you provide the TAI
>>> offset, surely? Should the virtio_rtc specification make it mandatory
>>> to provide such?
>>>
>>> Otherwise you're just designing it to allow crappy hypervisors to
>>> expose incomplete information.
>>>
>>
>> Hi David,
>>
>> (adding virtio-comm...@lists.oasis-open.org for spec discussion),
>>
>> thank you for your insightful comments. I think I take a broadly similar
>> view. The reason why the current spec and driver is like this is that I
>> took a pragmatic approach at first and only included features which work
>> out-of-the-box for the current Linux ecosystem.
>>
>> The current virtio_rtc features work similar to ptp_kvm, and therefore
>> can
>> work out-of-the-box with time sync daemons such as chrony.
>>
>> As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock
>> as
>> well, I am afraid that
>>
>> - in some (embedded) scenarios, the TAI clock may not be available
>>
>> - crappy hypervisors will pass off the UTC clock as the TAI clock.
>>
>> For the same reasons, I am also not sure about adding a *mandatory* TAI
>> offset to each readout. I don't know user-space software which would
>> leverage this already (at least not through the PTP clock interface).
>> And
>> why would such software not go straight for the TAI clock instead?
>>
>> How about adding a requirement to the spec that the virtio-rtc device
>> SHOULD expose the TAI clock whenever it is available - would this
>> address
>> your concerns?
>
> I think that would be too easy for implementors to miss, or decide not
> to obey. Or to get *wrong*, by exposing a TAI clock but actually
> putting UTC in it.
>
> I think I prefer to mandate the tai_offset field with the UTC clock.
> Crappy implementations will just set it to zero, but at least that
> gives a clear signal to the guests that it's *their* problem to
> resolve.

To me there are some open questions regarding how this would work. Is there
a use case for this with the v3 clock reading methods, or would it be
enough to address this with the Virtio timekeeper?

Looking at clock_adjtime(2), the tai_offset could be exposed, but probably
best alongside some additional information about leap seconds. I am not
aware about any user-space user. In addition, leap second smearing should
also be addressed.

>
>
>
>
 PTP clock interface
 ---

 virtio_rtc exposes clocks as PTP clocks to userspace, similar to
 ptp_kvm.
 If both the Virtio RTC device and this driver have special support for
 the
 current clocksource, time synchronization programs can use
 cross-timestamping using ioctl PTP_SYS_OFFSET_PRECISE2 aka
 PTP_SYS_OFFSET_PRECISE. Similar to ptp_kvm, system time
 synchronization
 with single-digit ns precision is possible with a quiescent reference
 clock
 (from the Virtio RTC device). This works even when the Virtio device
 response is slow compared to ptp_kvm hypercalls.
>>>
>>> Is PTP the right mechanism for this? As I understand it, PTP is a way
>>> to precisely synchronize one clock with another. But in the case of
>>> virt guests synchronizing against the host, it isn't really *another*
>>> clock. It really is the *same* underlying clock. As the host clock
>>> varies with temperature, for example, so does the guest clock. The only
>>> difference is an offset and (on x86 perhaps) a mathematical scaling of
>>> the frequency.
>>>
>>> I was looking at this another way, when I came across this virtio-rtc
>>> work.
>>>
>>> My idea was just for the hypervisor to expose its own timekeeping
>>> information — the counter/TSC value and TAI time at a given moment,
>>> frequenc

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-08 Thread David Woodhouse
On Fri, 2024-03-08 at 11:32 +0100, Peter Hilber wrote:
> On 07.03.24 15:02, David Woodhouse wrote:
> > On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
> > > RFC v3 updates
> > > --
> > > 
> > > This series implements a driver for a virtio-rtc device conforming to spec
> > > RFC v3 [1]. It now includes an RTC class driver with alarm, in addition to
> > > the PTP clock driver already present before.
> > > 
> > > This patch series depends on the patch series "treewide: Use clocksource 
> > > id
> > > for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
> > > series on top of mainline.
> > > 
> > > Overview
> > > 
> > > 
> > > This patch series adds the virtio_rtc module, and related bugfixes. The
> > > virtio_rtc module implements a driver compatible with the proposed Virtio
> > > RTC device specification [1]. The Virtio RTC (Real Time Clock) device
> > > provides information about current time. The device can provide different
> > > clocks, e.g. for the UTC or TAI time standards, or for physical time
> > > elapsed since some past epoch. 
> > 
> > Hm, should we allow UTC? If you tell me the time in UTC, then
> > (sometimes) I still don't actually know what the time is, because some
> > UTC seconds occur twice. UTC only makes sense if you provide the TAI
> > offset, surely? Should the virtio_rtc specification make it mandatory
> > to provide such?
> > 
> > Otherwise you're just designing it to allow crappy hypervisors to
> > expose incomplete information.
> > 
> 
> Hi David,
> 
> (adding virtio-comm...@lists.oasis-open.org for spec discussion),
> 
> thank you for your insightful comments. I think I take a broadly similar
> view. The reason why the current spec and driver is like this is that I
> took a pragmatic approach at first and only included features which work
> out-of-the-box for the current Linux ecosystem.
> 
> The current virtio_rtc features work similar to ptp_kvm, and therefore can
> work out-of-the-box with time sync daemons such as chrony.
> 
> As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock as
> well, I am afraid that
> 
> - in some (embedded) scenarios, the TAI clock may not be available
> 
> - crappy hypervisors will pass off the UTC clock as the TAI clock.
> 
> For the same reasons, I am also not sure about adding a *mandatory* TAI
> offset to each readout. I don't know user-space software which would
> leverage this already (at least not through the PTP clock interface). And
> why would such software not go straight for the TAI clock instead?
> 
> How about adding a requirement to the spec that the virtio-rtc device
> SHOULD expose the TAI clock whenever it is available - would this address
> your concerns?

I think that would be too easy for implementors to miss, or decide not
to obey. Or to get *wrong*, by exposing a TAI clock but actually
putting UTC in it.

I think I prefer to mandate the tai_offset field with the UTC clock.
Crappy implementations will just set it to zero, but at least that
gives a clear signal to the guests that it's *their* problem to
resolve.




> > > PTP clock interface
> > > ---
> > > 
> > > virtio_rtc exposes clocks as PTP clocks to userspace, similar to ptp_kvm.
> > > If both the Virtio RTC device and this driver have special support for the
> > > current clocksource, time synchronization programs can use
> > > cross-timestamping using ioctl PTP_SYS_OFFSET_PRECISE2 aka
> > > PTP_SYS_OFFSET_PRECISE. Similar to ptp_kvm, system time synchronization
> > > with single-digit ns precision is possible with a quiescent reference 
> > > clock
> > > (from the Virtio RTC device). This works even when the Virtio device
> > > response is slow compared to ptp_kvm hypercalls.
> > 
> > Is PTP the right mechanism for this? As I understand it, PTP is a way
> > to precisely synchronize one clock with another. But in the case of
> > virt guests synchronizing against the host, it isn't really *another*
> > clock. It really is the *same* underlying clock. As the host clock
> > varies with temperature, for example, so does the guest clock. The only
> > difference is an offset and (on x86 perhaps) a mathematical scaling of
> > the frequency.
> > 
> > I was looking at this another way, when I came across this virtio-rtc
> > work.
> > 
> > My idea was just for the hypervisor to expose its own timekeeping
> > information — the counter/TSC value and TAI time at a given moment,
> > frequency of the counter, and the precision of both that frequency
> > (±PPM) and the TAI timestamp (±µs).
> > 
> > By putting that in a host/guest shared data structure with a seqcount
> > for lockless updates, we can update it as time synchronization on the
> > host is refined, and we can even cleanly handle live migration where
> > the guest ends up on a completely different host. It allows for use
> > cases which *really* care (e.g. timestamping financial transactions) to
> > ensure that there is never even a moment of getting 

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-08 Thread Peter Hilber
On 07.03.24 15:02, David Woodhouse wrote:
> On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
>> RFC v3 updates
>> --
>>
>> This series implements a driver for a virtio-rtc device conforming to spec
>> RFC v3 [1]. It now includes an RTC class driver with alarm, in addition to
>> the PTP clock driver already present before.
>>
>> This patch series depends on the patch series "treewide: Use clocksource id
>> for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
>> series on top of mainline.
>>
>> Overview
>> 
>>
>> This patch series adds the virtio_rtc module, and related bugfixes. The
>> virtio_rtc module implements a driver compatible with the proposed Virtio
>> RTC device specification [1]. The Virtio RTC (Real Time Clock) device
>> provides information about current time. The device can provide different
>> clocks, e.g. for the UTC or TAI time standards, or for physical time
>> elapsed since some past epoch. 
> 
> Hm, should we allow UTC? If you tell me the time in UTC, then
> (sometimes) I still don't actually know what the time is, because some
> UTC seconds occur twice. UTC only makes sense if you provide the TAI
> offset, surely? Should the virtio_rtc specification make it mandatory
> to provide such?
> 
> Otherwise you're just designing it to allow crappy hypervisors to
> expose incomplete information.
> 

Hi David,

(adding virtio-comm...@lists.oasis-open.org for spec discussion),

thank you for your insightful comments. I think I take a broadly similar
view. The reason why the current spec and driver is like this is that I
took a pragmatic approach at first and only included features which work
out-of-the-box for the current Linux ecosystem.

The current virtio_rtc features work similar to ptp_kvm, and therefore can
work out-of-the-box with time sync daemons such as chrony.

As of RFC spec v3, UTC clock only is allowed. If mandating a TAI clock as
well, I am afraid that

- in some (embedded) scenarios, the TAI clock may not be available

- crappy hypervisors will pass off the UTC clock as the TAI clock.

For the same reasons, I am also not sure about adding a *mandatory* TAI
offset to each readout. I don't know user-space software which would
leverage this already (at least not through the PTP clock interface). And
why would such software not go straight for the TAI clock instead?

How about adding a requirement to the spec that the virtio-rtc device
SHOULD expose the TAI clock whenever it is available - would this address
your concerns?

>> PTP clock interface
>> ---
>>
>> virtio_rtc exposes clocks as PTP clocks to userspace, similar to ptp_kvm.
>> If both the Virtio RTC device and this driver have special support for the
>> current clocksource, time synchronization programs can use
>> cross-timestamping using ioctl PTP_SYS_OFFSET_PRECISE2 aka
>> PTP_SYS_OFFSET_PRECISE. Similar to ptp_kvm, system time synchronization
>> with single-digit ns precision is possible with a quiescent reference clock
>> (from the Virtio RTC device). This works even when the Virtio device
>> response is slow compared to ptp_kvm hypercalls.
> 
> Is PTP the right mechanism for this? As I understand it, PTP is a way
> to precisely synchronize one clock with another. But in the case of
> virt guests synchronizing against the host, it isn't really *another*
> clock. It really is the *same* underlying clock. As the host clock
> varies with temperature, for example, so does the guest clock. The only
> difference is an offset and (on x86 perhaps) a mathematical scaling of
> the frequency.
> 
> I was looking at this another way, when I came across this virtio-rtc
> work.
> 
> My idea was just for the hypervisor to expose its own timekeeping
> information — the counter/TSC value and TAI time at a given moment,
> frequency of the counter, and the precision of both that frequency
> (±PPM) and the TAI timestamp (±µs).
> 
> By putting that in a host/guest shared data structure with a seqcount
> for lockless updates, we can update it as time synchronization on the
> host is refined, and we can even cleanly handle live migration where
> the guest ends up on a completely different host. It allows for use
> cases which *really* care (e.g. timestamping financial transactions) to
> ensure that there is never even a moment of getting *wrong* timestamps
> if they haven't yet resynced after a migration.

I considered a similar approach as well, but integrating that with the
kernel timekeeping seemed too much effort for the first step. However,
reading the clock from user space would be much simpler.

> 
> Now I'm trying to work out if I should attempt to reconcile with your
> existing virtio-rtc work, or just decide that virtio-rtc isn't trying
> to solve the actual problem that we have, and go ahead with something
> different... ?
> 

We are certainly interested into the discussed, say, "virtual timekeeper"
mechanism, which would also solve a lot of problems for us (especially if
it w

Re: [RFC PATCH v3 0/7] Add virtio_rtc module and related changes

2024-03-07 Thread David Woodhouse
On Mon, 2023-12-18 at 08:38 +0100, Peter Hilber wrote:
> RFC v3 updates
> --
> 
> This series implements a driver for a virtio-rtc device conforming to spec
> RFC v3 [1]. It now includes an RTC class driver with alarm, in addition to
> the PTP clock driver already present before.
> 
> This patch series depends on the patch series "treewide: Use clocksource id
> for get_device_system_crosststamp()" [3]. Pull [4] to get the combined
> series on top of mainline.
> 
> Overview
> 
> 
> This patch series adds the virtio_rtc module, and related bugfixes. The
> virtio_rtc module implements a driver compatible with the proposed Virtio
> RTC device specification [1]. The Virtio RTC (Real Time Clock) device
> provides information about current time. The device can provide different
> clocks, e.g. for the UTC or TAI time standards, or for physical time
> elapsed since some past epoch. 

Hm, should we allow UTC? If you tell me the time in UTC, then
(sometimes) I still don't actually know what the time is, because some
UTC seconds occur twice. UTC only makes sense if you provide the TAI
offset, surely? Should the virtio_rtc specification make it mandatory
to provide such?

Otherwise you're just designing it to allow crappy hypervisors to
expose incomplete information.

> PTP clock interface
> ---
> 
> virtio_rtc exposes clocks as PTP clocks to userspace, similar to ptp_kvm.
> If both the Virtio RTC device and this driver have special support for the
> current clocksource, time synchronization programs can use
> cross-timestamping using ioctl PTP_SYS_OFFSET_PRECISE2 aka
> PTP_SYS_OFFSET_PRECISE. Similar to ptp_kvm, system time synchronization
> with single-digit ns precision is possible with a quiescent reference clock
> (from the Virtio RTC device). This works even when the Virtio device
> response is slow compared to ptp_kvm hypercalls.

Is PTP the right mechanism for this? As I understand it, PTP is a way
to precisely synchronize one clock with another. But in the case of
virt guests synchronizing against the host, it isn't really *another*
clock. It really is the *same* underlying clock. As the host clock
varies with temperature, for example, so does the guest clock. The only
difference is an offset and (on x86 perhaps) a mathematical scaling of
the frequency.

I was looking at this another way, when I came across this virtio-rtc
work.

My idea was just for the hypervisor to expose its own timekeeping
information — the counter/TSC value and TAI time at a given moment,
frequency of the counter, and the precision of both that frequency
(±PPM) and the TAI timestamp (±µs).

By putting that in a host/guest shared data structure with a seqcount
for lockless updates, we can update it as time synchronization on the
host is refined, and we can even cleanly handle live migration where
the guest ends up on a completely different host. It allows for use
cases which *really* care (e.g. timestamping financial transactions) to
ensure that there is never even a moment of getting *wrong* timestamps
if they haven't yet resynced after a migration.

Now I'm trying to work out if I should attempt to reconcile with your
existing virtio-rtc work, or just decide that virtio-rtc isn't trying
to solve the actual problem that we have, and go ahead with something
different... ?



smime.p7s
Description: S/MIME cryptographic signature