Re: [PATCH 3/3] Add KVM support to QEMU
Anthony Liguori wrote: I think live migration is now broken, since kvm accesses will not update the qemu dirty bitmap. Eh? I don't follow you here. guest writes in qemu set the qemu dirty bitmap (for cpu_physical_memory_get_dirty()), but guest accesses in kvm won't, unless you enable dirty tracking and synchronize kvm's bitmap to qemu's. See the changes to migration.c in kvm's qemu. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add KVM support to QEMU
Avi Kivity wrote: Anthony Liguori wrote: I think live migration is now broken, since kvm accesses will not update the qemu dirty bitmap. Eh? I don't follow you here. guest writes in qemu set the qemu dirty bitmap (for cpu_physical_memory_get_dirty()), but guest accesses in kvm won't, unless you enable dirty tracking and synchronize kvm's bitmap to qemu's. See the changes to migration.c in kvm's qemu. Oh yes, that falls under the save/restore TODO item. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add KVM support to QEMU
Anthony Liguori wrote: This patch adds very basic KVM support. KVM is a kernel module for Linux that allows userspace programs to make use of hardware virtualization support. It current supports x86 hardware virtualization using Intel VT-x or AMD-V. It also supports IA64 VT-i, PPC 440, and S390. This patch only implements the bare minimum support to get a guest booting. It has very little impact the rest of QEMU and attempts to integrate nicely with the rest of QEMU. Even though this implementation is basic, it is significantly faster than TCG. Booting and shutting down a Linux guest: w/TCG: 1:32.36 elapsed 84% CPU w/KVM: 0:31.14 elapsed 59% CPU Right now, KVM is disabled by default and must be explicitly enabled with -enable-kvm. We can enable it by default later when we have had better testing. Signed-off-by: Anthony Liguori [EMAIL PROTECTED] diff --git a/KVM_TODO b/KVM_TODO new file mode 100644 index 000..9529049 --- /dev/null +++ b/KVM_TODO @@ -0,0 +1,9 @@ +1) Add hooks for load/save of register state + o Fixes gdbstub, save/restore, and vmport +2) Add VGA optimization +3) Add IO thread +4) Add guest SMP support +5) Add TPR optimization +6) Add support for in-kernel APIC +7) Add support for in-kernel PIT +8) Merge in additional changes in kvm-userspace tree One of the important changes is running with signal delivery disabled, since that's particularly slow (requires save/restore of the floating point state, for example). + +typedef struct kvm_userspace_memory_region KVMSlot; KVMMemorySlot? + +static KVMState *kvm_state; Why a pointer? +if (ret 0) { +dprintf(kvm_create_vcpu failed\n); showing errno would be nice. + +static void kvm_getput_reg(__u64 *kvm_reg, target_ulong *qemu_reg, int set) +{ +if (set) +*kvm_reg = *qemu_reg; +else +*qemu_reg = *kvm_reg; +} Ugh. I think live migration is now broken, since kvm accesses will not update the qemu dirty bitmap. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Anthony Liguori wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. The same is needed for live migration, so I think what we really need is to change the memory dirty bit tracking API to have a concept of refresh that we can use to hook for KVM. Can you elaborate on this refresh? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
On Wed, Oct 29, 2008 at 11:54:11AM +0200, Avi Kivity wrote: Anthony Liguori wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. As it seems to me, the real difference is that qemu has to explicitly set certain regions as dirty, while kvm get dirty bit automatically from the kernel. So I believe we can have markers on the code to refresh dirty bitmap for certain area ranges (for kvm use), and also enable a manual override (for qemu). After that, the cpu_physical_memory_get_dirty() will simply return whether or not the page is dirty. Also, kvm only tracks dirty bits, whereas qemu has at least three kinds of them. But I think for now we can assume that kvm's dirty mean all dirty -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Glauber Costa wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. As it seems to me, the real difference is that qemu has to explicitly set certain regions as dirty, while kvm get dirty bit automatically from the kernel. I'm completely lost. I don't see how one or the other is more or less automatic, or how qemu has to explicitly set regions as dirty (except when emulating bitblt). So I believe we can have markers on the code to refresh dirty bitmap for certain area ranges (for kvm use), and also enable a manual override (for qemu). After that, the cpu_physical_memory_get_dirty() will simply return whether or not the page is dirty. Does not cpu_p_m_g_dirty() simply return whether or not the page is dirty now? Also, kvm only tracks dirty bits, whereas qemu has at least three kinds of them. But I think for now we can assume that kvm's dirty mean all dirty kvm's dirty bits mean that kvm has seen the page written to since the last query. A zero doesn't mean the page is clean though -- it could have been written to by qemu. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
On Wed, Oct 29, 2008 at 02:39:57PM +0200, Avi Kivity wrote: Glauber Costa wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. As it seems to me, the real difference is that qemu has to explicitly set certain regions as dirty, while kvm get dirty bit automatically from the kernel. I'm completely lost. I don't see how one or the other is more or less automatic, or how qemu has to explicitly set regions as dirty (except when emulating bitblt). Or maybe I am. But I don't see any way in which qemu sets dirty bits but explicitly with cpu_physical_memory_set_dirty(). This is pretty explicit. So I believe we can have markers on the code to refresh dirty bitmap for certain area ranges (for kvm use), and also enable a manual override (for qemu). After that, the cpu_physical_memory_get_dirty() will simply return whether or not the page is dirty. Does not cpu_p_m_g_dirty() simply return whether or not the page is dirty now? If you look at the vga code, you see something like: cpu_physical_memory_get_dirty(page0, VGA_DIRTY_FLAG) | cpu_physical_memory_get_dirty(page1, VGA_DIRTY_FLAG); if (kvm_enabled()) { update |= bitmap_get_dirty(bitmap, (page0 - s-vram_offset) TARGET_PAGE_BITS); update |= bitmap_get_dirty(bitmap, (page1 - s-vram_offset) TARGET_PAGE_BITS); } so if the page is not dirty to cpu_p_m_g_dirty() (I liked that abb), it can still be dirty for kvm. Ideally, it would not be necessary. Also, kvm only tracks dirty bits, whereas qemu has at least three kinds of them. But I think for now we can assume that kvm's dirty mean all dirty kvm's dirty bits mean that kvm has seen the page written to since the last query. A zero doesn't mean the page is clean though -- it could have been written to by qemu. Right. The point here is more like kvm has 1 type of dirty whereas qemu has many. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Avi Kivity wrote: Anthony Liguori wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. Right. I'm just not sure that it's going to be as much of an optimization for TCG as it is for KVM. The same is needed for live migration, so I think what we really need is to change the memory dirty bit tracking API to have a concept of refresh that we can use to hook for KVM. Can you elaborate on this refresh? Right now, in QEMU, code looks like this: for (i = 0; i addr; i += TARGET_PAGE_SIZE) { if (cpu_p_m_g_dirty(i, DIRTY_FLAG)) { cpu_p_m_r_dirty(i, i + TARGET_PAGE_SIZE, DIRTY_FLAG); // do something with dirty memory } All we need to do is add another cpu_physical_memory_sync_dirty(i, i + REGION_SIZE, DIRTY_FLAG); that would go at the start of this. For QEMU, this is a nop since dirty bits are updated as soon as they are reset. For KVM, this would update the entire set of dirty bits for the given memory region. We also need something to enable dirty tracking for a particular region. We already have something for migration, we could perhaps extend that API (cpu_p_m_s_dirty_tracking). Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Anthony Liguori wrote: Avi Kivity wrote: Anthony Liguori wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. Right. I'm just not sure that it's going to be as much of an optimization for TCG as it is for KVM. When qemu is loaded certainly this is lost in the noise. But when idling (and as a bonus, the screen doesn't change much), tcg and kvm will benefit equally. Right now, in QEMU, code looks like this: for (i = 0; i addr; i += TARGET_PAGE_SIZE) { if (cpu_p_m_g_dirty(i, DIRTY_FLAG)) { cpu_p_m_r_dirty(i, i + TARGET_PAGE_SIZE, DIRTY_FLAG); // do something with dirty memory } All we need to do is add another cpu_physical_memory_sync_dirty(i, i + REGION_SIZE, DIRTY_FLAG); that would go at the start of this. For QEMU, this is a nop since dirty bits are updated as soon as they are reset. For KVM, this would update the entire set of dirty bits for the given memory region. You mean merge the kvm bitmap into the qemu bitmap? That's what it does now, no? We also need something to enable dirty tracking for a particular region. We already have something for migration, we could perhaps extend that API (cpu_p_m_s_dirty_tracking). I don't think qemu would benefit much from per-region tracking, but maybe I'm wrong. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Avi Kivity wrote: Anthony Liguori wrote: Avi Kivity wrote: Anthony Liguori wrote: Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. I don't understand this. The VGA optimization really is qemu's, the kvm modifications only cater to the different way of getting the dirty bits. Right. I'm just not sure that it's going to be as much of an optimization for TCG as it is for KVM. When qemu is loaded certainly this is lost in the noise. But when idling (and as a bonus, the screen doesn't change much), tcg and kvm will benefit equally. Right now, in QEMU, code looks like this: for (i = 0; i addr; i += TARGET_PAGE_SIZE) { if (cpu_p_m_g_dirty(i, DIRTY_FLAG)) { cpu_p_m_r_dirty(i, i + TARGET_PAGE_SIZE, DIRTY_FLAG); // do something with dirty memory } All we need to do is add another cpu_physical_memory_sync_dirty(i, i + REGION_SIZE, DIRTY_FLAG); that would go at the start of this. For QEMU, this is a nop since dirty bits are updated as soon as they are reset. For KVM, this would update the entire set of dirty bits for the given memory region. You mean merge the kvm bitmap into the qemu bitmap? That's what it does now, no? We also need something to enable dirty tracking for a particular region. We already have something for migration, we could perhaps extend that API (cpu_p_m_s_dirty_tracking). I don't think qemu would benefit much from per-region tracking, but maybe I'm wrong. QEMU doesn't do per region tracking but instead, uses flags to simulate regions. At any rate, the suggestion for an API to enable dirty tracking was really for KVMs benefit. It's a nop for QEMU. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
On Tue, Oct 28, 2008 at 6:36 PM, Anthony Liguori [EMAIL PROTECTED] wrote: Something I was thinking about this morning, and I think the first place where we'll definitely need a hook, is how to deal with kvm_load_registers(). I think there's overlap between KVM and the IO thread here. There are two reasons (I can think of) that most of the device model code can't run in conjunction with TCG. The first is that TCG may modify CPUState in a non-atomic way. The device model may need to access CPUState although there are very few places that it does. Out of curiosity, where are those places? -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Hollis Blanchard wrote: On Tue, Oct 28, 2008 at 6:36 PM, Anthony Liguori [EMAIL PROTECTED] wrote: Something I was thinking about this morning, and I think the first place where we'll definitely need a hook, is how to deal with kvm_load_registers(). I think there's overlap between KVM and the IO thread here. There are two reasons (I can think of) that most of the device model code can't run in conjunction with TCG. The first is that TCG may modify CPUState in a non-atomic way. The device model may need to access CPUState although there are very few places that it does. Out of curiosity, where are those places? local apic -- needs to access interrupt disable flag acpi sleep -- halts the current processor, so tied to cpustate vmport -- bad ABI requires access to registers -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Fabrice Bellard wrote: Avi Kivity wrote: Hollis Blanchard wrote: Out of curiosity, where are those places? local apic -- needs to access interrupt disable flag acpi sleep -- halts the current processor, so tied to cpustate vmport -- bad ABI requires access to registers These accesses are the exception and should be done with specific CPU methods. IMHO, direct access to the CPU state should otherwise never be done from devices. Yes, this is what I'm working on now. There are also non-devices that access CPUState but they are also special cases (gdbstub, monitor, save/restore). Regards, Anthony Liguori Regards, Fabrice. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Avi Kivity wrote: Hollis Blanchard wrote: On Tue, Oct 28, 2008 at 6:36 PM, Anthony Liguori [EMAIL PROTECTED] wrote: Something I was thinking about this morning, and I think the first place where we'll definitely need a hook, is how to deal with kvm_load_registers(). I think there's overlap between KVM and the IO thread here. There are two reasons (I can think of) that most of the device model code can't run in conjunction with TCG. The first is that TCG may modify CPUState in a non-atomic way. The device model may need to access CPUState although there are very few places that it does. Out of curiosity, where are those places? local apic -- needs to access interrupt disable flag acpi sleep -- halts the current processor, so tied to cpustate vmport -- bad ABI requires access to registers These accesses are the exception and should be done with specific CPU methods. IMHO, direct access to the CPU state should otherwise never be done from devices. Regards, Fabrice. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add KVM support to QEMU
resending, now with less quoted text: diff --git a/kvm-all.c b/kvm-all.c new file mode 100644 index 000..4379071 --- /dev/null +++ b/kvm-all.c @@ -0,0 +1,377 @@ +/* + * QEMU KVM support + * + * Copyright IBM, Corp. 2008 + * + * Authors: + * Anthony Liguori [EMAIL PROTECTED] + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include sys/types.h +#include sys/ioctl.h +#include sys/mman.h + +#include linux/kvm.h + +#include qemu-common.h +#include sysemu.h +#include kvm.h + +//#define DEBUG_KVM + +#ifdef DEBUG_KVM +#define dprintf(fmt, ...) \ +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0) +#else +#define dprintf(fmt, ...) \ +do { } while (0) +#endif + +typedef struct kvm_userspace_memory_region KVMSlot; Actually, I don't think it is a good idea. We may want to keep internal-only data tied to the slot, such the slot's dirty bitmap if we do per-slot dirty tracking. Of course there may be other ways to do it, but this is the cleaner and more adequate since we're going through a fresh start. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add KVM support to QEMU
Glauber Costa wrote: resending, now with less quoted text: Yeah, I was intending to scold you about that ;-) diff --git a/kvm-all.c b/kvm-all.c new file mode 100644 index 000..4379071 --- /dev/null +++ b/kvm-all.c @@ -0,0 +1,377 @@ +/* + * QEMU KVM support + * + * Copyright IBM, Corp. 2008 + * + * Authors: + * Anthony Liguori [EMAIL PROTECTED] + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include sys/types.h +#include sys/ioctl.h +#include sys/mman.h + +#include linux/kvm.h + +#include qemu-common.h +#include sysemu.h +#include kvm.h + +//#define DEBUG_KVM + +#ifdef DEBUG_KVM +#define dprintf(fmt, ...) \ +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0) +#else +#define dprintf(fmt, ...) \ +do { } while (0) +#endif + +typedef struct kvm_userspace_memory_region KVMSlot; Actually, I don't think it is a good idea. We may want to keep internal-only data tied to the slot, such the slot's dirty bitmap if we do per-slot dirty tracking. Of course there may be other ways to do it, but this is the cleaner and more adequate since we're going through a fresh start. For now, I think it's fine. Switching to making it a proper structure is certainly a good idea when we get more stuff per slot. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
On 10/29/08, Avi Kivity [EMAIL PROTECTED] wrote: Hollis Blanchard wrote: On Tue, Oct 28, 2008 at 6:36 PM, Anthony Liguori [EMAIL PROTECTED] wrote: Something I was thinking about this morning, and I think the first place where we'll definitely need a hook, is how to deal with kvm_load_registers(). I think there's overlap between KVM and the IO thread here. There are two reasons (I can think of) that most of the device model code can't run in conjunction with TCG. The first is that TCG may modify CPUState in a non-atomic way. The device model may need to access CPUState although there are very few places that it does. Out of curiosity, where are those places? local apic -- needs to access interrupt disable flag acpi sleep -- halts the current processor, so tied to cpustate It should be possible to avoid these, just use a qemu_irq for per-CPU interrupt lines and halt signals. vmport -- bad ABI requires access to registers Ugly. Maybe there could be two parts, one in pc.c which registers the ioport and checks EAX/ECX, maybe using a CPU specific helper, and second part, generic port in vmport.c, which does not know about CPU state. I don't know if this would solve the original atomicity problem, though. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 3/3] Add KVM support to QEMU
Hollis Blanchard wrote: Just a quick skim... On Tue, Oct 28, 2008 at 3:13 PM, Anthony Liguori [EMAIL PROTECTED] wrote: +int kvm_cpu_exec(CPUState *env) +{ +struct kvm_run *run = env-kvm_run; +int ret; + +dprintf(kvm_cpu_exec()\n); + +do { +kvm_arch_pre_run(env, run); + +if ((env-interrupt_request CPU_INTERRUPT_EXIT)) { +dprintf(interrupt exit requested\n); +ret = 0; +break; +} + +dprintf(setting tpr\n); +run-cr8 = cpu_get_apic_tpr(env); This belongs in the arch_pre_run hook above. Good catch, I've updated the patch. How did you decide which exit handlers should go into architecture-specific code? Looking at just the KVM architecture set: Based on whether the implementation required target-specific code. IO: x86 and ia64, not PowerPC or s390 cpu_{in,out}[bwl] are defined in vl.c and are available for all architectures. They are no-ops on most architectures because they are never used. MMIO: everybody except s390 cpu_physical_memory_rw() is defined by everyone. DCRs: PowerPC only This will have to be an architecture specific handler. IRQ window: not sure It's a no-op implementation. I would think that this would be needed on PPC. If you want to inject an interrupt, but the guest is unable to handle an interrupt, you need to exit to userspace when the guest re-enables interrupts. Otherwise, you may never return to userspace for the interrupt to be injected. How do you handle that now? Does PPC have something that makes this unnecessary? Regards, Anthony Liguori -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add KVM support to QEMU
Anthony Liguori wrote: This patch only implements the bare minimum support to get a guest booting. It has very little impact the rest of QEMU and attempts to integrate nicely with the rest of QEMU. Huh? That isn't based on the qemu-accel patches ... surprised, Gerd -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
On Tue, Oct 28, 2008 at 7:51 PM, Anthony Liguori [EMAIL PROTECTED] wrote: Gerd Hoffmann wrote: Anthony Liguori wrote: This patch only implements the bare minimum support to get a guest booting. It has very little impact the rest of QEMU and attempts to integrate nicely with the rest of QEMU. Huh? That isn't based on the qemu-accel patches ... This is part of the reason for this exercise. I'd rather introduce KVM support first and then look at abstracting things, than vice versa. A number of the hooks in the current QEMUAccel tree are there for the wrong reason (to support the out-of-tree IO thread, for instance). If you just introduce something with various hooks and say, these are hooks we'll need, it's not possible to really evaluate whether the hooks are needed because nothing in the tree makes use of them. We talked extensively on monday about it, and I'm in agreement with it. Regards, Anthony Liguori surprised, Gerd -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- Glauber Costa. Free as in Freedom http://glommer.net The less confident you are, the more serious you have to act. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: [PATCH 3/3] Add KVM support to QEMU
Glauber Costa wrote: On Tue, Oct 28, 2008 at 7:51 PM, Anthony Liguori [EMAIL PROTECTED] wrote: This is part of the reason for this exercise. I'd rather introduce KVM support first and then look at abstracting things, than vice versa. A number of the hooks in the current QEMUAccel tree are there for the wrong reason (to support the out-of-tree IO thread, for instance). If you just introduce something with various hooks and say, these are hooks we'll need, it's not possible to really evaluate whether the hooks are needed because nothing in the tree makes use of them. We talked extensively on monday about it, and I'm in agreement with it. Something I was thinking about this morning, and I think the first place where we'll definitely need a hook, is how to deal with kvm_load_registers(). I think there's overlap between KVM and the IO thread here. There are two reasons (I can think of) that most of the device model code can't run in conjunction with TCG. The first is that TCG may modify CPUState in a non-atomic way. The device model may need to access CPUState although there are very few places that it does. The other reason is accessing guest memory. TCG does not preserve atomicity when a guest accesses device memory. There are probably only a few places in the device model (like virtio) that depend on atomicity. If we implemented an API that implemented a lock/unlock for CPUState and for portions of memory, then I think this could be used both as a hook for kvm_load_registers and as a way to introduce an IO thread with TCG. The CPUState lock/unlock is pretty straight forward. For the memory implementation to be efficient, I think you would have to acquire the lock when TCG brings a physical address into the TLB (preferrably, at a page granularity), or whenever someone tries to access memory (via cpu_physical_memory_rw). I think in the vast majority of the cases, there wouldn't be any contention and both could TCG could run along side the IO thread. Another place hook is updating a slot's dirty bitmap. Right now, with my patchset we don't have live migration or the VGA RAM optimization. There's nothing about the VGA RAM optimization that wouldn't work for QEMU. I'm not sure that it really is an optimization in the context of TCG, but I certainly don't think it's any worse. The only thing you really need is to query the KVM dirty bitmap when it comes time to enable start over querying the VGA dirty bits. The same is needed for live migration, so I think what we really need is to change the memory dirty bit tracking API to have a concept of refresh that we can use to hook for KVM. FWIW, I included a TODO in my patch if people are interesting in tackling any of these things. Regards, Anthony Liguori Regards, Anthony Liguori Regards, Anthony Liguori surprised, Gerd -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 3/3] Add KVM support to QEMU
Hollis Blanchard wrote: Just a quick skim... On Tue, Oct 28, 2008 at 3:13 PM, Anthony Liguori [EMAIL PROTECTED] wrote: +int kvm_cpu_exec(CPUState *env) +{ +struct kvm_run *run = env-kvm_run; +int ret; + +dprintf(kvm_cpu_exec()\n); + +do { +kvm_arch_pre_run(env, run); + +if ((env-interrupt_request CPU_INTERRUPT_EXIT)) { +dprintf(interrupt exit requested\n); +ret = 0; +break; +} + +dprintf(setting tpr\n); +run-cr8 = cpu_get_apic_tpr(env); This belongs in the arch_pre_run hook above. Good catch, I've updated the patch. How did you decide which exit handlers should go into architecture-specific code? Looking at just the KVM architecture set: Based on whether the implementation required target-specific code. IO: x86 and ia64, not PowerPC or s390 cpu_{in,out}[bwl] are defined in vl.c and are available for all architectures. They are no-ops on most architectures because they are never used. MMIO: everybody except s390 cpu_physical_memory_rw() is defined by everyone. DCRs: PowerPC only This will have to be an architecture specific handler. IRQ window: not sure It's a no-op implementation. I would think that this would be needed on PPC. If you want to inject an interrupt, but the guest is unable to handle an interrupt, you need to exit to userspace when the guest re-enables interrupts. Otherwise, you may never return to userspace for the interrupt to be injected. How do you handle that now? Does PPC have something that makes this unnecessary? Regards, Anthony Liguori -Hollis -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html