On 9/25/25 1:44 AM, David Woodhouse wrote:
> On Wed, 2025-09-24 at 13:53 -0700, Dongli Zhang wrote:
>>
>>
>> On 9/23/25 10:47 AM, David Woodhouse wrote:
>>> On Tue, 2025-09-23 at 10:25 -0700, Dongli Zhang wrote:
>>>>
>>>>
>>>> On 9/23/25 9:26 AM, David Woodhouse wrote:
>>>>> On Mon, 2025-09-22 at 12:37 -0700, Dongli Zhang wrote:
>>>>>> On 9/22/25 11:16 AM, David Woodhouse wrote:
>>>>
>>>> [snip]
>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> As demonstrated in my test, currently guest_tsc doesn't stop counting
>>>>>>>> during blackout because of the lack of "MSR_IA32_TSC put" at
>>>>>>>> kvmclock_vm_state_change(). Per my understanding, it is a bug and we
>>>>>>>> may need to fix it.
>>>>>>>>
>>>>>>>> BTW, kvmclock_vm_state_change() already utilizes KVM_SET_CLOCK to
>>>>>>>> re-configure kvm-clock before continuing the guest VM.
>>>>>
>>>>> Yeah, right now it's probably just introducing errors for a stop/start
>>>>> of the VM.
>>>>
>>>> But doesn't that help meet the expectation?
>>>>
>>>> Thanks to KVM_GET_CLOCK and KVM_SET_CLOCK, QEMU saves the clock with
>>>> KVM_GET_CLOCK when the VM is stopped, and restores it with KVM_SET_CLOCK
>>>> when the VM is continued.
>>>
>>> It saves the actual *value* of the clock. I would prefer to phrase that
>>> as "it makes the clock jump backwards to the time at which the guest
>>> was paused".
>>>
>>>> This ensures that the clock value itself does not change between stop
>>>> and cont.
>>>>
>>>> However, QEMU does not adjust the TSC offset via MSR_IA32_TSC during stop.
>>>>
>>>> As a result, when execution resumes, the guest TSC suddenly jumps forward.
>>>
>>> Oh wow, that seems really broken. If we're going to make it experience
>>> a time warp, we should at least be *consistent*.
>>>
>>> So a guest which uses the TSC for timekeeping should be mostly
>>> unaffected by this and its wallclock should still be accurate. A guest
>>> which uses the KVM clock will be hosed by it.
>>>
>>> I think we should fix this so that the KVM clock is unaffected too.
>>
>> From my understanding of your reply, the kvm-clock/tsc should always be
>> adjusted whenever a QEMU VM is paused and then resumed (i.e. via stop/cont).
> 
> I think I agree, except I still hate the way you use the word
> 'adjusted'.
> 
> If I look at my clock, and then go to sleep for a while and look at the
> clock again, nobody *adjusts* it. It just keeps running.
> 
> That's the effect we should always strive for, and that's how we should
> think about it and talk about it.
> 
> It's difficult to talk about clocks because what does it mean for a
> clock to be "unchanged"? Does it mean that it should return the same
> time value? Or that it should continue to count consistently? I would
> argue that we should *always* use language which assumes the latter.
> 
> Turning to physics for a clumsy analogy, it's about the frame of
> reference. We're all on a moving train. I look at you in the seat
> opposite me, I go to sleep for a while, and I wake up and you're still
> there. Nobody has "adjusted" your position to accommodate for the
> movement of the train while I was asleep.
> 

Thank you very much for the explanation!

I will use something like "keeps running".
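
To check my understanding of the two meanings of "unchanged", here is a toy
model in plain Python. It is not QEMU or KVM code, and every name in it is
made up for illustration. It treats the guest clock as "host reference time
plus an offset" and compares restoring the saved *value* on resume with
restoring the saved *relationship* (the offset):

```python
# Toy model only: hypothetical names, not QEMU/KVM code.
# All times are nanoseconds on one host reference clock.

class GuestClock:
    """Guest clock modelled as host reference time plus a fixed offset."""

    def __init__(self, offset_ns):
        self.offset_ns = offset_ns

    def read(self, host_now_ns):
        return host_now_ns + self.offset_ns


def resume_restoring_value(clock, saved_value_ns, host_now_ns):
    """Make the clock read the value saved at stop time.

    The offset is recomputed, so the clock jumps backwards to the time at
    which the guest was paused.
    """
    clock.offset_ns = saved_value_ns - host_now_ns


def resume_restoring_relationship(clock, saved_offset_ns):
    """Put back the exact offset saved at stop time.

    Nothing is added or subtracted; the clock simply kept running.
    """
    clock.offset_ns = saved_offset_ns


# Usage: stop at host time 5000, resume at host time 8000.
clock = GuestClock(offset_ns=1000)
saved_value = clock.read(5000)       # 6000
saved_offset = clock.offset_ns       # 1000

resume_restoring_value(clock, saved_value, 8000)
value_semantics = clock.read(8000)   # 6000: the 3000 ns pause is lost

resume_restoring_relationship(clock, saved_offset)
running_semantics = clock.read(8000) # 9000: the pause is accounted for
```

With the second helper there is no arithmetic at resume time at all, which
is how I read "it just keeps running".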

> 
> 
> 
>> This applies to:
>>
>> - stop / cont
>> - savevm / loadvm
>> - live migration
>> - cpr
>>
>> It is a bug if the clock jumps backwards to the time at which the guest was
>> paused.
>>
>> The time elapsed while the VM is paused should always be accounted for and
>> reflected in kvm-clock/tsc once the VM resumes.
> 
> In particular, in *all* but the live migration case, there should be
> basically nothing to do. No addition, no subtraction. Only restoring
> the *existing* relationships, precisely as they were before. That is
> the TSC *offset* value, and the precise TSC→kvmclock parameters, all
> bitwise *exactly* the same as before.
> 
> And the only thing that changes on live migration is that you have to
> set the TSC offset such that the guest sees the values it *would* have
> seen on the original host at any given moment in time... and doesn't
> know it was kidnapped and moved onto a different train while it was
> sleeping...?
> 

I see. That means we should only re-configure tsc_offset, while keeping the
tsc->kvmclock PVTI exactly as it was. That is why you would like to remove
'kvm_arch->kvmclock_offset' entirely as future work.
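
For the live migration case, this is roughly the arithmetic I have in mind,
again as plain Python rather than QEMU or KVM code. The function names are
hypothetical. It also relies on two assumptions of mine: that both hosts
share a synchronized wall clock, and that the guest TSC frequency is the same
on source and destination:

```python
# Sketch only: hypothetical names, not QEMU/KVM code.
# Assumes source and destination share a synchronized wall clock and that
# the guest TSC frequency is identical on both hosts.

NSEC_PER_SEC = 1_000_000_000


def expected_guest_tsc(src_guest_tsc, src_wall_ns, now_wall_ns, guest_tsc_hz):
    """Guest TSC the guest *would* have seen on the source at now_wall_ns."""
    elapsed_ns = now_wall_ns - src_wall_ns
    return src_guest_tsc + elapsed_ns * guest_tsc_hz // NSEC_PER_SEC


def dest_tsc_offset(src_guest_tsc, src_wall_ns,
                    dst_host_tsc, dst_wall_ns, guest_tsc_hz):
    """Offset for the destination so that host TSC + offset == expected TSC.

    The result may be negative here; real hardware would treat it as a
    64-bit value that wraps around.
    """
    expected = expected_guest_tsc(src_guest_tsc, src_wall_ns,
                                  dst_wall_ns, guest_tsc_hz)
    return expected - dst_host_tsc


# Usage: snapshot taken on the source, guest resumes 2 seconds later on a
# destination whose own TSC happens to read 500 at that moment.
offset = dest_tsc_offset(src_guest_tsc=10_000_000_000,
                         src_wall_ns=0,
                         dst_host_tsc=500,
                         dst_wall_ns=2 * NSEC_PER_SEC,
                         guest_tsc_hz=3_000_000_000)
guest_tsc_on_dest = 500 + offset     # 16_000_000_000
```

Only the offset is computed here; the tsc->kvmclock parameters are not
touched.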

Thank you very much!

Dongli Zhang
