On Monday 29 December 2008 02:38:07 Marcelo Tosatti wrote:
> On Tue, Nov 25, 2008 at 01:52:59PM +0100, Andi Kleen wrote:
> > > But yeah - the remapping of HPET timers to virtual HPET timers sounds
> > > pretty tough. I wonder if one could overcome that with a little
> > > hardware support though ...
> >
> > For gettimeofday better make TSC work. Even in the best case (no
> > virtualization) it is much faster than HPET because it sits in the CPU,
> > while HPET is far away on the external south bridge.
>
> The tsc clock on older Linux 2.6 kernels compensates for lost ticks.
> The algorithm uses the PIT count (latched) to measure the delay between
> interrupt generation and handling, and sums that value, on the next
> interrupt, to the TSC delta.
>
> Sheng investigated this problem in the discussions before in-kernel PIT
> was merged:
>
> http://www.mail-archive.com/kvm-de...@lists.sourceforge.net/msg13873.html
>
> The algorithm overcompensates for lost ticks and the guest time runs
> faster than the host's.
>
> There are two issues:
>
> 1) A bug in the in-kernel PIT which miscalculates the count value.
>
> 2) For the case where more than one interrupt is lost, and later
> reinjected, the value read from PIT count is meaningless for the purpose
> of the tsc algorithm. The count is interpreted as the delay until the
> next interrupt, which is not the case with reinjection.
>
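
To make issue 2 concrete, here is a toy user-space model of the effect, not
the actual kernel code. It assumes a guest tick handler that credits every
whole period elapsed since the previous tick, and a host that delivers the
missed ticks back to back after a stall. All names (guest_clock, guest_tick,
run_stall, PERIOD_US) are made up for illustration, and the PIT-count delay
term is left out for simplicity.

```c
#define PERIOD_US 1000L  /* assumed tick period: 1 ms (HZ=1000) */

/* Hypothetical, simplified guest clock state. */
struct guest_clock {
    long last_now_us;  /* time base reading at the previous tick */
    long jiffies;      /* ticks the guest believes have elapsed */
};

/* Tick handler with lost-tick compensation: if two or more periods
 * passed since the last tick, credit the missing ones as well. */
static void guest_tick(struct guest_clock *g, long now_us)
{
    long lost = (now_us - g->last_now_us) / PERIOD_US;

    if (lost >= 2)
        g->jiffies += lost - 1;  /* compensate for ticks assumed lost */
    g->jiffies += 1;             /* account for this tick */
    g->last_now_us = now_us;
}

/* The guest is stalled for 'stalled_periods' periods, then receives
 * either every missed interrupt (reinject != 0) or just one. Returns
 * the number of jiffies the guest has counted afterwards. */
static long run_stall(int stalled_periods, int reinject)
{
    struct guest_clock g = { 0, 0 };
    long now_us = stalled_periods * PERIOD_US;
    int deliveries = reinject ? stalled_periods : 1;
    int i;

    for (i = 0; i < deliveries; i++)
        guest_tick(&g, now_us);  /* delivered back to back */
    return g.jiffies;
}
```

In this model run_stall(5, 1) counts 9 jiffies for 5 periods of real time,
because the first handler already compensates for the 4 lost ticks and the 4
reinjected interrupts are then counted again. run_stall(5, 0) counts 5.
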
> As Sheng mentioned in the thread above, Xen pulls back the TSC value
> when reinjecting interrupts. VMWare ESX has a notion of "virtual TSC",
> which I believe is similar in this context.
>
> For KVM I believe the best immediate solution (for now) is to provide an
> option to disable reinjection, behaving similarly to real hardware. The
> advantage is simplicity compared to virtualizing the time sources.
>
> The QEMU PIT emulation has a limit on the rate of interrupt reinjection;
> perhaps something similar should be investigated in the future.
>
> The following patch (which contains the bugfix for issue 1 and disables
> reinjection) fixes the severe time drift on RHEL4 with "clock=tsc".
> What I'm proposing is to make reinjection conditional on an option
> (-kvm-pit-no-reinject or something).

I agree that it should go with a user space option to disable reinjection, as
it's hard to overcome the problem that we delayed interrupt injection...
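
A rough user-space sketch of how such an option could gate the clamp from
the patch below, using C11 atomics in place of the kernel's atomic_t. The
struct, the reinject flag and the function names are hypothetical and only
model the pending counter. As in the patch, the increment and the clamp are
two separate steps rather than one atomic operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the PIT timer state; not the kernel struct. */
struct pit_timer_model {
    atomic_int pending;  /* interrupts waiting to be injected */
    bool reinject;       /* false would correspond to the proposed option */
};

/* Called on every timer expiry. With reinjection disabled, at most one
 * interrupt is kept pending, so missed ticks are dropped as on real
 * hardware instead of being queued for later delivery. */
static void pit_timer_expired(struct pit_timer_model *pt)
{
    atomic_fetch_add(&pt->pending, 1);

    if (!pt->reinject && atomic_load(&pt->pending) > 1)
        atomic_store(&pt->pending, 1);
}

/* Let the timer expire n times while the guest cannot take interrupts
 * and return how many interrupts are pending afterwards. */
static int fire_n(bool reinject, int n)
{
    struct pit_timer_model pt;
    int i;

    atomic_init(&pt.pending, 0);
    pt.reinject = reinject;

    for (i = 0; i < n; i++)
        pit_timer_expired(&pt);
    return atomic_load(&pt.pending);
}
```

With reinjection enabled, fire_n(true, 4) leaves 4 interrupts pending; with
it disabled, fire_n(false, 4) leaves 1.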

-- 
regards
Yang, Sheng

> Comments or better ideas?
>
>
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index e665d1c..608af7b 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -201,13 +201,16 @@ static int __pit_timer_fn(struct kvm_kpit_state *ps)
>       if (!atomic_inc_and_test(&pt->pending))
>               set_bit(KVM_REQ_PENDING_TIMER, &vcpu0->requests);
>
> +     if (atomic_read(&pt->pending) > 1)
> +             atomic_set(&pt->pending, 1);
> +
>       if (vcpu0 && waitqueue_active(&vcpu0->wq))
>               wake_up_interruptible(&vcpu0->wq);
>
>       hrtimer_add_expires_ns(&pt->timer, pt->period);
>       pt->scheduled = hrtimer_get_expires_ns(&pt->timer);
>       if (pt->period)
> -             ps->channels[0].count_load_time = hrtimer_get_expires(&pt->timer);
> +             ps->channels[0].count_load_time = ktime_get();
>
>       return (pt->period == 0 ? 0 : 1);
>  }
