On Mon, Dec 14, 2015 at 10:07:21AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 11, 2015 at 3:48 PM, Marcelo Tosatti <[email protected]> wrote:
> > On Fri, Dec 11, 2015 at 01:57:23PM -0800, Andy Lutomirski wrote:
> >> On Thu, Dec 10, 2015 at 1:32 PM, Marcelo Tosatti <[email protected]>
> >> wrote:
> >> > On Wed, Dec 09, 2015 at 01:10:59PM -0800, Andy Lutomirski wrote:
> >> >> I'm trying to clean up kvmclock and I can't get it to work at all. My
> >> >> host is 4.4.0-rc3-ish on a Skylake laptop that has a working TSC.
> >> >>
> >> >> If I boot an SMP (2 vcpus) guest, tracing says:
> >> >>
> >> >> qemu-system-x86-2517 [001] 102242.610654: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 0
> >> >> qemu-system-x86-2521 [000] 102242.613742: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 0 nr_online 1 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.622959: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2521 [000] 102242.645123: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.647291: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2521 [000] 102242.653369: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.653429: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2517 [001] 102242.653447: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2521 [000] 102242.653657: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2522 [002] 102242.664448: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >>
> >> >>
> >> >> If I boot a UP guest, tracing says:
> >> >>
> >> >> qemu-system-x86-2567 [001] 102370.447484: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2571 [002] 102370.447688: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >>
> >> >> I suspect, but I haven't verified, that this is fallout from:
> >> >>
> >> >> commit 16a9602158861687c78b6de6dc6a79e6e8a9136f
> >> >> Author: Marcelo Tosatti <[email protected]>
> >> >> Date: Wed May 14 12:43:24 2014 -0300
> >> >>
> >> >> KVM: x86: disable master clock if TSC is reset during suspend
> >> >>
> >> >> Updating system_time from the kernel clock once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >>
> >> >> Disable master clock in case it is necessary to update it
> >> >> from the resume path.
> >> >>
> >> >> Signed-off-by: Marcelo Tosatti <[email protected]>
> >> >> Signed-off-by: Paolo Bonzini <[email protected]>
> >> >>
> >> >>
> >> >> Can we please stop making kvmclock more complex? It's a beast right
> >> >> now, and not in a good way. It's far too tangled with the vclock
> >> >> machinery on both the host and guest sides, the pvclock stuff is not
> >> >> well thought out (even in principle in an ABI sense), and it's never
> >> >> been clear to me what problem exactly the kvmclock stuff is supposed
> >> >> to solve.
> >> >>
> >> >> I'm somewhat tempted to suggest that we delete kvmclock entirely and
> >> >> start over. A correctly functioning KVM guest using TSC (i.e.
> >> >> ignoring kvmclock entirely) seems to work rather more reliably
> >> >> and considerably faster than a kvmclock guest.
> >> >>
> >> >> --Andy
> >> >>
> >> >> --
> >> >> Andy Lutomirski
> >> >> AMA Capital Management, LLC
> >> >
> >> > Andy,
> >> >
> >> > I am all for solving practical problems rather than pleasing aesthetic
> >> > pleasure.
> >> >
> >> >> Updating system_time from the kernel clock once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >>
> >> >> Disable master clock in case it is necessary to update it
> >> >> from the resume path.
> >> >
> >> >> once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >
> >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc
> >> > reads.
> >> >
> >> > If the effective frequency of the kernel clock is lower (for example
> >> > due to NTP correcting the TSC frequency of the system), and you resume
> >> > and update the system, the following happens:
> >> >
> >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc
> >> > reads=LARGE VALUE.
> >
> > guest reads clock to memory at location A = scaled tsc read.
> >
> > (note TSC is counting at frequency higher than advertised by
> > processor, that's why NTP has to "slow down" the kernel clock
> > which is maintained by successive reads of the TSC).
> >
> >> > suspend/resume event.
> >> > guest visible clock = tsc_timestamp (updated at time N) + scaled tsc
> >> > reads=0.
^^^^^^^^^^^^^
Err this was tsc_systemtime
> > Now the guest visible clock contains a tsc_timestamp that has been
> > corrected by NTP, over say 5 days. So the tiny NTP correction has
> > been added up to something significant.
> >
> > guest reads clock to memory at location B = reads tsc_timestamp.
> >
> > Clock value in B (NTP corrected TSC) < clock value in A (RAW TSC)
> >
> > Yes?
>
> Sure, but I still don't see why this is a problem.
Time as seen by the guest goes backwards:

clock_gettime() = 1000.
followed by
clock_gettime() = 999.

Can't allow that.
> Why would the
> guest compare raw TSC to NTP corrected TSC?
It's "raw TSC" because that's what KVM exports to the guest via the
tsc_timestamp field.
It's "corrected TSC" because that's what KVM exports to the guest
via the system_time field (because the host is using the TSC clocksource,
and the host TSC clocksource is corrected by NTP).
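
To make the sequence concrete, here is a rough toy model in Python. It is
only a sketch of the arithmetic, not the kernel code: the real pvclock
scaling goes through tsc_to_system_mul and tsc_shift, which I replace
with a flat 1 ns per tick. The 10 ppm NTP slowdown, the five-day
interval and all of the names (PvclockSnapshot, guest_read,
host_clock_ns) are made up for illustration.

```python
from dataclasses import dataclass

# Assumptions for illustration only, not values taken from KVM:
NS_PER_TICK = 1          # scale exported to the guest (nominal 1 GHz TSC)
NTP_SLOWDOWN_PPM = 10    # hypothetical NTP correction on the host clock


@dataclass
class PvclockSnapshot:
    """The two fields the guest extrapolates from."""
    tsc_timestamp: int   # raw TSC value when the snapshot was taken
    system_time: int     # host clock (ns) when the snapshot was taken


def guest_read(snap: PvclockSnapshot, tsc: int) -> int:
    """Guest visible clock = system_time + scaled TSC delta."""
    return snap.system_time + (tsc - snap.tsc_timestamp) * NS_PER_TICK


def host_clock_ns(tsc: int) -> int:
    """Host kernel clock: TSC based, but slowed down by NTP."""
    return tsc * NS_PER_TICK * (1_000_000 - NTP_SLOWDOWN_PPM) // 1_000_000


FIVE_DAYS_TICKS = 5 * 86_400 * 1_000_000_000

# Snapshot taken at time 0, then never refreshed (master clock enabled).
snap = PvclockSnapshot(tsc_timestamp=0, system_time=0)
tsc = FIVE_DAYS_TICKS
reading_a = guest_read(snap, tsc)        # extrapolated with the raw TSC

# Suspend/resume: the snapshot is refreshed from the NTP-corrected clock.
snap = PvclockSnapshot(tsc_timestamp=tsc, system_time=host_clock_ns(tsc))
reading_b = guest_read(snap, tsc + 1_000)  # read 1000 ticks later

print("A =", reading_a)
print("B =", reading_b)
print("backwards by", reading_a - reading_b, "ns")
```

With these made-up numbers B comes out roughly 4.3 seconds behind A, even
though B was read later. That is the backwards step.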
>
> >
> >>
> >> I'm still not seeing the issue.
> >
> > I'll add two items to the three snapshots above, hopefully will make it
> > clearer.
>
> Maybe that'll help.
>
> --Andy