Gregory Haskins wrote:
Hi Dor,
Please find a patch attached for your review which adds support for dynamic
substitution of the PIC/APIC code to QEMU. This will allow us to selectively
choose the KVM in-kernel apic emulation vs the QEMU user-space apic emulation.
Support for both is key
Michael Cloran wrote:
Hello
I have done the following
1) installed SUSE 10 on a Dell XPS with a T7400 and 2 GB RAM
2) compiled kernel 2.6.20.2 and booted from it
3) got gcc3.4 and compiled it
4) got qemu and compiled it from source with gcc3.4
4.1) modprobe kvm-intel
5) created image
Qemu-img
Gregory Haskins wrote:
My current thoughts are that we at least move the IOAPIC into the kernel as
well. That will give sufficient control to generate ISA bus interrupts for
guests that understand APICs. If we want to be able to generate ISA
interrupts for legacy guests which talk to the
Does KVM allow something like memory hotplug for its guests?
For example, let's say you are running several guests, and would like to
start yet another one for a while - but have no free memory left.
Obviously, your guests are so important that you don't want to stop them
- so you simply
Anthony Liguori wrote:
Then again, are we really positive that we have to move the APIC into
the kernel? A lot of things will get much more complicated.
The following arguments are in favor:
- allow in-kernel paravirt drivers to interrupt the guest without going
through qemu (which involves
Avi Kivity wrote:
Anthony Liguori wrote:
Then again, are we really positive that we have to move the APIC into
the kernel? A lot of things will get much more complicated.
The following arguments are in favor:
- allow in-kernel paravirt drivers to interrupt the guest without going
Anthony Liguori wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
Then again, are we really positive that we have to move the APIC
into the kernel? A lot of things will get much more complicated.
The following arguments are in favor:
- allow in-kernel paravirt drivers to interrupt the
This is for the TPR right? VT has special logic to handle TPR
virtualization doesn't it? I thought SVM did too...
Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
also supports CR8 in 32-bit mode through a new instruction encoding, but
nobody uses that to my knowledge.
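For readers following the TPR/CR8 discussion, here is a minimal sketch of how the two registers relate architecturally: CR8 bits 3:0 mirror bits 7:4 of the local APIC TPR, and a CR8 write clears the low four TPR bits. The helper names tpr_to_cr8 and cr8_to_tpr are hypothetical, chosen for illustration only; they are not taken from KVM or QEMU.

```c
#include <stdint.h>

/* Hypothetical helpers (names are illustrative, not from KVM/QEMU). */

/* CR8[3:0] reflects the priority class held in TPR[7:4]. */
static inline uint8_t tpr_to_cr8(uint8_t tpr)
{
    return (uint8_t)((tpr >> 4) & 0x0f);
}

/* Writing CR8 sets TPR[7:4] and leaves TPR[3:0] as zero. */
static inline uint8_t cr8_to_tpr(uint8_t cr8)
{
    return (uint8_t)((cr8 & 0x0f) << 4);
}
```

Because the mapping is a plain shift, a TPR update via CR8 carries the same priority class as the equivalent MMIO write to the APIC TPR register, minus the sub-class bits.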
Dor Laor wrote:
This is for the TPR right? VT has special logic to handle TPR
virtualization doesn't it? I thought SVM did too...
Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
also supports CR8 in 32-bit mode through a new instruction encoding, but
nobody
Casey,
On Tue, Apr 03, 2007 at 10:46:38PM -0400, Casey Jeffery wrote:
Stephane,
I'm glad you found this; I thought I was going to have to repost while
actually remembering to change the subject line.
Someone else pointed me to your message. The title was indeed misleading.
On Wed, Mar
Anthony Liguori wrote:
This pushes towards in kernel apic too. Can't see how we avoid it.
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
It's a bit more complex than that, as userspace would need to
Avi Kivity wrote:
Anthony Liguori wrote:
Maybe some brave soul can hack kvm to patch the new instruction in
place of the mmio instruction Windows uses to bang on the tpr.
It seems like that shouldn't be too hard assuming that the MMIO
instructions are = the new CR8 instruction. It would
Anthony Liguori wrote:
If we do this, then we can probably just handle the TPR as a special
case anyway and not bother returning to userspace when the TPR is
updated through MMIO. That saves the round trip without adding
emulation complexity.
That means the emulation is split among user
Stephane
There may be some propagation delay yet you, supposedly, do not suffer
from masked interrupt windows. Also something to watch out for is that
when you restore you must make sure that the MSRs' upper bits are set
to 1. Otherwise you may trigger involuntary interrupts.
I'm not
Avi Kivity wrote:
Anthony Liguori wrote:
This pushes towards in kernel apic too. Can't see how we avoid it.
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
It's a bit more complex than that, as
On Wed, Apr 4, 2007 at 3:40 AM, in message [EMAIL PROTECTED],
Avi Kivity [EMAIL PROTECTED] wrote:
I would avoid moving down anything that's not strictly necessary.
Agreed.
I still don't have an opinion as to whether it is necessary; I'll need
to study the details. Xen pushes most of
If we do this, then we can probably just handle the TPR as a special
case anyway and not bother returning to userspace when the TPR is
updated through MMIO. That saves the round trip without adding
emulation complexity.
That means the emulation is split among user space and kernel. Not
This pushes towards in kernel apic too. Can't see how we avoid it.
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
It's a bit more complex than that, as userspace would need to tell the
kernel the
Leslie Mann wrote:
I'll prepare the first patch. Can you ensure that your upgraded setup
still works with kvm-17.
It does, as I use it daily in order to run a Win app that I need.
Please test the attached patch, against kvm-17. This is subversion
revision 4546 and git commit
Nakajima, Jun wrote:
Most H/W-virtualization capable processors out there don't support
that feature today. I think the decision (kvm or qemu) should be done
based on performance data. I'm not worried about maintenance issues; the
APIC code is not expected to change frequently. I'm a bit
I swear this has been brought up before in this forum, but I can't
find it. I'm curious what the virtualization gurus in this forum think
of the possibilities for recursive virtualization. I know vbox claims
to support it, but I haven't come across many details on how they do
it and I don't think
Dor Laor wrote:
This pushes towards in kernel apic too. Can't see how we avoid it.
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
It's a bit more complex than that, as userspace would
Dor Laor wrote:
If we do this, then we can probably just handle the TPR as a special
case anyway and not bother returning to userspace when the TPR is
updated through MMIO. That saves the round trip without adding
emulation complexity.
That means the emulation is split among user
Avi Kivity wrote:
Nakajima, Jun wrote:
Most H/W-virtualization capable processors out there don't support
that feature today. I think the decision (kvm or qemu) should be done
based on performance data. I'm not worried about maintenance issues;
the APIC code is not expected to change
Gregory Haskins wrote:
What I was planning on doing was using that QEMU patch I provided to
intercept all pic_send_irq() calls and forward them directly to the kernel
via a new ioctl(). This ioctl would be directed at the VM fd, not the VCPU,
since it's a pure ISA global pin reference and
It seems from cursory inspection that this is possible in theory, even on HVM
hardware. My thoughts are as follows (Intel oriented, which I know better):
*) The hypervisor is set to trap on VMX type operations (VMXON/OFF/START/RESUME,
etc) and provides emulation of them as follows:
*) When a
Anthony Liguori wrote:
BTW, I see CPU utilization of qemu is almost always 99% in the top
command when I run kernel build in an x86-64 Linux guest.
qemu would be 99% even if all the time is being spent in the guest
context.
If the user time is high, an oprofile run would be
Dor,
Thanks, I realize there will certainly be a lot of work in
virtualizing them. Maybe Intel can help out with VVT-x to give a
root-root mode. ;)
Any idea at a high level how vbox does it? I will post in their forum,
but I assume somebody here has a good idea.
Thanks.
On 4/4/07, Dor Laor
Nakajima, Jun wrote:
I compared the performance on Xen and KVM for kernel build using the
same guest image. Looks like KVM (kvm-17) was three times slower as far
as we tested, and that high load of qemu was one of the symptoms. We are
looking at the shadow code, but the load of qemu looks very
On Wed, Apr 4, 2007 at 12:49 PM, in message [EMAIL PROTECTED],
Avi Kivity [EMAIL PROTECTED] wrote:
Gregory Haskins wrote:
Hmm. If the ioapic is in the kernel, then it's a platform-wide resource
and you would need a vm ioctl. If ioapic emulation is in userspace,
then the ioapic logic
Avi Kivity wrote:
Nakajima, Jun wrote:
I compared the performance on Xen and KVM for kernel build using the
same guest image. Looks like KVM (kvm-17) was three times slower as
far as we tested, and that high load of qemu was one of the
symptoms. We are looking at the shadow code, but the load
On Wed, Apr 4, 2007 at 1:43 PM, in message [EMAIL PROTECTED],
Avi Kivity [EMAIL PROTECTED] wrote:
Gregory Haskins wrote:
Agreed. I was thinking that the interface for the IOAPIC in kernel model
would look something like the way the pic_send_irq() function looks, except
it would also
Nakajima, Jun wrote:
Avi Kivity wrote:
Nakajima, Jun wrote:
Most H/W-virtualization capable processors out there don't support
that feature today. I think the decision (kvm or qemu) should be done
based on performance data. I'm not worried about maintenance issues;
the APIC code
Gregory Haskins wrote:
On Wed, Apr 4, 2007 at 10:20 AM, in message [EMAIL PROTECTED],
Anthony Liguori [EMAIL PROTECTED] wrote:
The devices are already written to take a set_irq function. Instead of
hijacking the emulated PIC device, I think it would be better if in
pc.c, we
* Avi Kivity [EMAIL PROTECTED] wrote:
It still exists in userspace. Having the code duplication
(especially when it's not the same code base) is unfortunate.
This remains true.
but it's the wrong argument. Of course there's duplicate functionality,
and that's _good_ because it
* Anthony Liguori [EMAIL PROTECTED] wrote:
Keeping the apic in the kernel simplifies this with the cost of
maintaining an apic/pic implementation.
Hrm, this is definitely starting to sound like a PITA to deal with.
Maybe in-kernel platform devices are unavoidable :-/
yes, very much
* Avi Kivity [EMAIL PROTECTED] wrote:
My current thoughts are that we at least move the IOAPIC into the
kernel as well. That will give sufficient control to generate ISA
bus interrupts for guests that understand APICs. If we want to be
able to generate ISA interrupts for legacy
* Gregory Haskins [EMAIL PROTECTED] wrote:
pci is level triggered, so maybe the guests just handle the
inaccuracy.
Good point. I'm not sure how this works today. Perhaps we just get
lucky that nothing checks the IRR in the IOAPIC coupled with a bug in
the IOAPIC model that an
Ingo Molnar wrote:
* Avi Kivity [EMAIL PROTECTED] wrote:
It still exists in userspace. Having the code duplication
(especially when it's not the same code base) is unfortunate.
This remains true.
but it's the wrong argument. Of course there's duplicate functionality,
On Wed, Apr 4, 2007 at 4:32 PM, in message [EMAIL PROTECTED],
Ingo Molnar [EMAIL PROTECTED] wrote:
My current thoughts are that we at least move the IOAPIC into the
kernel as well. [...]
yes. And then do the final 10% move of handling the i8259A in KVM too.
Hi Ingo,
We are in full
Dor,
Thanks, I realize there will certainly be a lot of work in
virtualizing them. Maybe Intel can help out with VVT-x to give a
root-root mode. ;)
Any idea at a high level how vbox does it? I will post in their forum,
but I assume somebody here has a good idea.
Vbox branched out from qemu.
Avi Kivity wrote:
Nakajima, Jun wrote:
I compared the performance on Xen and KVM for kernel build using the
same guest image. Looks like KVM (kvm-17) was three times slower as
far as we tested, and that high load of qemu was one of the
symptoms. We are looking at the shadow code, but the load
Gregory Haskins wrote:
On Wed, Apr 4, 2007 at 4:32 PM, in message [EMAIL PROTECTED],
Ingo Molnar [EMAIL PROTECTED] wrote:
My current thoughts are that we at least move the IOAPIC into the
kernel as well. [...]
yes. And then do the final 10% move of handling the i8259A
we should move all the PICs into KVM proper - and that includes the
i8259A PIC too. Qemu-space drivers are then wired to pins on these PICs,
but nothing in Qemu does vector generation or vector prioritization -
that task is purely up to KVM. There are mixed i8259A+lapic models
possible too and the
Gregory Haskins wrote:
Hmm. If the ioapic is in the kernel, then it's a platform-wide resource
and you would need a vm ioctl. If ioapic emulation is in userspace,
then the ioapic logic will have decided which cpu is targeted and you
would issue a vcpu ioctl.
That's exactly in line with
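The rule quoted above (in-kernel ioapic means a vm ioctl, userspace ioapic means a vcpu ioctl) can be sketched as a small routing decision. This is only an illustration of the reasoning in the thread: the type irq_route, the enum values, and the function route_irq are hypothetical and do not correspond to any actual KVM or QEMU interface.

```c
#include <stdbool.h>

/* Hypothetical names throughout; not an actual KVM/QEMU API. */
enum irq_target {
    IRQ_TARGET_VM,   /* platform-wide: issue the ioctl on the VM fd */
    IRQ_TARGET_VCPU  /* per-cpu: issue the ioctl on a VCPU fd */
};

struct irq_route {
    enum irq_target target;
    int vcpu; /* -1 when the request is VM-wide */
};

/*
 * If the ioapic lives in the kernel, userspace only names a pin and the
 * kernel picks the destination cpu, so the request is VM-wide. If the
 * ioapic is emulated in userspace, the destination cpu has already been
 * decided there and the request goes to that vcpu.
 */
static struct irq_route route_irq(bool ioapic_in_kernel,
                                  int vcpu_chosen_by_userspace)
{
    struct irq_route r;

    if (ioapic_in_kernel) {
        r.target = IRQ_TARGET_VM;
        r.vcpu = -1;
    } else {
        r.target = IRQ_TARGET_VCPU;
        r.vcpu = vcpu_chosen_by_userspace;
    }
    return r;
}
```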
But why is it a good thing to do PV drivers in the kernel? You lose
flexibility and functionality to gain performance. Really, it's more
about there not being good enough userspace interfaces to do network IO.
The lapic/PIC code should also be available in Qemu for OSs that don't have
On Wed, 2007-04-04 at 23:21 +0200, Ingo Molnar wrote:
* Anthony Liguori [EMAIL PROTECTED] wrote:
But why is it a good thing to do PV drivers in the kernel? You lose
flexibility and functionality to gain performance. [...]
in Linux a kernel-space network driver can still be tunneled
* Gregory Haskins ([EMAIL PROTECTED]) wrote:
LAPICs can be remapped on a per-cpu basis via an MSR, whereas something
like an IOAPIC is a system-wide resource.
Yes, I see now, no vcpu in kvm_io_device callbacks' context (admittedly,
I'm used to the Xen implementation ;-)
+struct kvm_io_device