(Sorry, forgot to switch to plain text in Gmail, rejected by vger.kernel.org...)

On Wed, Jul 30, 2008 at 10:15 PM, Marcelo Tosatti <[EMAIL PROTECTED]> wrote:
> Hi Dor,
>
> On Wed, Jul 30, 2008 at 12:50:06AM +0300, Dor Laor wrote:
>> Marcelo Tosatti wrote:
>>> The in-kernel PIT emulation can either inject too many or too few
>>> interrupts.
>>>
>>>
>> While it's an improvement, the in-kernel PIT is still not perfect. For
>> example, on PIT frequency changes the pending count should be
>> recalculated and matched to the new frequency.
>
> Point. That one can be addressed.
>
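
(For reference, a minimal sketch of what that recalculation could look
like. The struct and field names are invented for illustration, not the
actual kvm PIT code. The idea: keep the amount of time owed to the guest
constant and re-express it in ticks of the new period.)

    #include <stdint.h>

    /* Hypothetical, simplified channel state (not the real kvm_kpit_state). */
    struct pit_channel {
        uint32_t pending;     /* ticks owed to the guest, not yet injected */
        int64_t  period_ns;   /* current tick period */
    };

    /* Called when the guest reprograms the reload value: the owed time
     * stays the same, only its expression in ticks changes. */
    static void pit_change_period(struct pit_channel *c, int64_t new_period_ns)
    {
        int64_t owed_ns = (int64_t)c->pending * c->period_ns;

        c->period_ns = new_period_ns;
        c->pending   = new_period_ns ?
                       (uint32_t)(owed_ns / new_period_ns) : 0;
    }
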
>> I also stumbled on a live migration problem
>
> Can you provide more details? How to reproduce?
>
>> and there is your guest SMP fix.
>> IMHO we need to switch back to the userspace PIT. [Actually I did
>> consider the in-kernel PIT myself in the past.] The reasons:
>> 1. There is no performance advantage to doing this in the kernel.
>>    It just potentially reduces host stability and reduces code
>
> Keeping device emulation in userspace is desired, of course. The
> drawbacks of userspace timer emulation, AFAICS, are:
>
> - Timers in QEMU are currently not handled separately from other
> IO processing. The delay between timer expiration and interrupt
> injection depends on the time spent handling unrelated IO in QEMU's
> main loop. This can be fixed by treating guest timer expiration
> with higher priority (see the sketch below).
>
> - The in-kernel emulation allows the host timer to be locked to the vcpu
> it belongs to. With userspace emulation, an IPI is necessary whenever the
> iothread is running on a different physical CPU than the target vcpu.
> The overall cost to wake up a vcpu on a different physical CPU is:
> syscall, IRQ lock acquisition (currently kvm->lock, which also protects
> access to in-kernel devices) and finally the IPI cost, which is hundreds
> of ns (googling around, it seems to be in the range of 700ns).
>
> That cost is non-existent with timers locked to the vcpu.
>
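(To illustrate the "higher priority" point above, a rough sketch of such
a main loop. It is not QEMU's actual code; the helper functions are
assumed for illustration. Expired guest timers are drained before any
other fd is serviced, and the poll timeout never sleeps past the next
guest deadline.)

    #include <poll.h>
    #include <stdint.h>

    /* Hypothetical helpers standing in for QEMU's timer and IO plumbing. */
    extern int64_t clock_now_ns(void);
    extern int64_t guest_timer_next_deadline_ns(void);
    extern void    guest_timer_run_expired(int64_t now_ns);
    extern void    handle_ready_fds(struct pollfd *fds, int nfds);

    /* Sketch: guest timer expiry gets strictly higher priority than other IO. */
    static void main_loop(struct pollfd *fds, int nfds)
    {
        for (;;) {
            int64_t now  = clock_now_ns();
            int64_t next = guest_timer_next_deadline_ns();

            while (next <= now) {              /* inject expired timers first */
                guest_timer_run_expired(now);
                next = guest_timer_next_deadline_ns();
            }

            /* Only then block on other fds, capped at the next deadline. */
            int64_t wait_ns = next - now;
            int timeout_ms = wait_ns > 1000000000LL ?
                             1000 : (int)(wait_ns / 1000000);

            if (poll(fds, nfds, timeout_ms) > 0)
                handle_ready_fds(fds, nfds);
        }
    }
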
> I don't know what specific problems the in-kernel PIT emulation solved
> (I recall certain Windows configurations were largely improved). Do you
> have the details? Sheng?

Basically, after the in-kernel irqchip was checked in and before the
in-kernel PIT was checked in, the timing mechanism in KVM was chaos...
Simply because the userspace PIT couldn't stay in sync with the in-kernel
irqchip, tons of interrupts were typically lost, e.g. a bug named "guest
time is 1/6 of host time".

The main purpose of moving the PIT into the kernel was to fix this timer
issue. There was another purpose when we did the same for Xen: Xen's IO
exits to qemu are too heavy and affect performance badly. But KVM at
least doesn't suffer that much.

Personally, I am not against the idea of moving the IO part of the PIT
back to qemu and keeping the interrupt handling part in the kernel
(though it still makes me a little sad...). I just don't know if we can
do it elegantly, generically, precisely, efficiently and relatively
simply, for all (important) timer sources. "vpt.c" in Xen has been
criticized as too complex.

--
regards
Yang, Sheng
