[PATCH] QEMU kill CR3_CACHE references
Hi,

The CR3 caching was never implemented in QEMU and is obsoleted by NPT/EPT.
This patch removes the unused references to it from target-i386/kvm.c.

Cheers,
Jes

commit 5ed16687929511d015dd3542c4359cabe170401a
Author: Jes Sorensen
Date:   Fri Feb 19 07:39:56 2010 +0100

    Remove all references to KVM_CR3_CACHE as it was never implemented.

    Signed-off-by: Jes Sorensen

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0d08cd5..5d9aecc 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -158,9 +158,6 @@ struct kvm_para_features {
 #ifdef KVM_CAP_PV_MMU
     { KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP },
 #endif
-#ifdef KVM_CAP_CR3_CACHE
-    { KVM_CAP_CR3_CACHE, KVM_FEATURE_CR3_CACHE },
-#endif
     { -1, -1 }
 };
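The table this patch trims is the usual sentinel-terminated pattern; a minimal sketch of how such a para_features table is scanned (the numeric values are illustrative placeholders, not the real KVM_CAP_*/KVM_FEATURE_* constants):

```c
#include <stddef.h>

/* Sketch of how a sentinel-terminated capability table like the one in
 * the patch is consumed.  Values are illustrative placeholders, not the
 * real KVM_CAP_* / KVM_FEATURE_* constants. */
struct para_feature { int cap; int feature; };

static const struct para_feature para_features[] = {
    { 2 /* e.g. KVM_CAP_PV_MMU */, 3 /* e.g. KVM_FEATURE_MMU_OP */ },
    { -1, -1 }                     /* sentinel terminates the scan */
};

static int feature_for_cap(int cap)
{
    for (size_t i = 0; para_features[i].cap != -1; i++)
        if (para_features[i].cap == cap)
            return para_features[i].feature;
    return -1;  /* capability absent: no guest feature bit to expose */
}
```

Removing an entry, as the patch does, leaves the scan untouched because termination depends only on the sentinel.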
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On Thu, Feb 18, 2010 at 12:18:25PM +0200, Avi Kivity wrote: > On 02/18/2010 12:05 PM, Jan Kiszka wrote: > >Avi Kivity wrote: > >>On 02/18/2010 11:45 AM, Avi Kivity wrote: > >>>On 02/18/2010 11:40 AM, Jan Kiszka wrote: > >Meanwhile, if anyone has any idea how to kill this lock, I'd love to > >see it. > > > What concurrency does it resolve in the end? On first glance, it only > synchronize the fiddling with pre-VCPU request bits, right? What forces > us to do this? Wouldn't it suffice to disable preemption (thus > migration) and the let concurrent requests race for setting the bits? I > mean if some request bit was already set on entry, we don't include the > related VCPU in smp_call_function_many anyway. > >>>It's more difficult. > >>> > >>>vcpu 0: sets request bit on vcpu 2 > >>> vcpu 1: test_and_set request bit on vcpu 2, returns already set > >>> vcpu 1: returns > >>>vcpu 0: sends IPI > >>>vcpu 0: returns > >>> > >>>so vcpu 1 returns before the IPI was performed. If the request was a > >>>tlb flush, for example, vcpu 1 may free a page that is still in vcpu > >>>2's tlb. > >>One way out would be to have a KVM_REQ_IN_PROGRESS, set it in > >>make_request, clear it in the IPI function. > >> > >>If a second make_request sees it already set, it can simply busy wait > >>until it is cleared, without sending the IPI. Of course the busy wait > >>means we can't enable preemption (or we may busy wait on an unscheduled > >>task), but at least the requests can proceed in parallel instead of > >>serializing. > > >...or include VCPUs with KVM_REQ_IN_PROGRESS set into the IPI set even > >if they already have the desired request bit set. > > But then we're making them take the IPI, which is pointless and > expensive. My approach piggy backs multiple requesters on one IPI. I have played with this in the past (collapsing that would avoid two simultaneous requestors from issuing two IPI's to a given vcpu, and unification with KVM_REQ_KICK to avoid the IPI if vcpu not in guest mode). 
It's not worthwhile though: this is not a contention point with TDP (maybe it becomes one in the future with fine-grained flushing, but not at the moment).

> > Then we should
> > serialize in smp_call_function_many.
>
> Do you mean rely on s_c_f_m's internal synchronization?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
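A rough userspace model of the KVM_REQ_IN_PROGRESS scheme Avi sketches above: a requester that finds the request bit already set spins on an in-progress bit instead of sending a redundant IPI, so it cannot return before the flush has actually happened. All names are illustrative, and the sketch deliberately ignores the handler/requester ordering subtleties the thread goes on to debate:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define KVM_REQ_TLB_FLUSH   0
#define KVM_REQ_IN_PROGRESS 1   /* hypothetical, per Avi's proposal */

struct vcpu { atomic_ulong requests; };

/* Returns true if the caller must send the IPI itself. */
static bool make_request(struct vcpu *v, int req)
{
    unsigned long bit = 1UL << req;
    unsigned long inp = 1UL << KVM_REQ_IN_PROGRESS;
    unsigned long old = atomic_fetch_or(&v->requests, bit | inp);

    if (old & bit) {
        /* Someone else already requested: wait for its IPI handler to
         * finish rather than returning early (the race in the thread). */
        while (atomic_load(&v->requests) & inp)
            ; /* busy wait; preemption must stay disabled here */
        return false;   /* no IPI needed, request already served */
    }
    return true;        /* caller sends the IPI */
}

/* What the IPI handler would do on the target vcpu. */
static void ipi_handler(struct vcpu *v)
{
    /* ...perform the flush..., then retire the request and finally the
     * in-progress marker, releasing any spinners. */
    atomic_fetch_and(&v->requests, ~(1UL << KVM_REQ_TLB_FLUSH));
    atomic_fetch_and(&v->requests, ~(1UL << KVM_REQ_IN_PROGRESS));
}
```

This also shows where Marcelo's collapsing idea fits: multiple requesters piggyback on the single in-flight IPI instead of each issuing their own.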
Re: [PATCH 1/2] qemu-kvm: extboot: Keep variables in RAM
On 02/18/2010 12:27 PM, Anthony Liguori wrote:
> On 02/18/2010 10:13 AM, Jan Kiszka wrote:
>> Instead of saving the old INT 0x13 and 0x19 handlers in ROM which fails
>> under QEMU as it enforces protection, keep them in spare vectors of the
>> interrupt table, namely INT 0x80 and 0x81.
>>
>> Signed-off-by: Jan Kiszka
>
> commit a4492b03932ea3c9762372f3e15e8c6526ee56c6
> Author: H. Peter Anvin
> Date:   Fri Jul 18 11:22:59 2008 -0700
>
>     kvm: extboot: don't use interrupt vectors $0x2b and $0x2c
>
>     extboot's use of interrupt vectors $0x2b and $0x2c is unsafe, as these
>     interrupt vectors fall in the OS-use range (0x20-0x3f). Furthermore,
>     it's unnecessary: we can keep a local pointer instead of hooking
>     another interrupt as long as we can write to our own segment.
>
>     Make the extboot segment writable, and use local variables to hold the
>     old link pointers.
>
>     If this turns out to cause problems, we should probably switch to
>     using vectors in the 0xc0-0xef range, and/or other BIOS-reserved
>     memory.
>
>     Signed-off-by: H. Peter Anvin
>     Signed-off-by: Avi Kivity
>
> Sounds like 0x80/0x81 is probably not the best choice. hpa: any
> suggestions?

There isn't really any free memory that you can just use -- there are no free interrupt vectors which are safe to use. Furthermore, this implies that there is a bug in the QEMU BIOS in this respect: the PCI spec requires that the "ROM region" (upper memory area) that contains shadow ROM contents be writable until INT 19h is executed -- at *that* point it gets write protected.

However, as I did point out in the original comment, there are some BIOSes in the field which use vectors 0xc0-0xdf as a scratch memory pool -- usually to have somewhere to stash a small stack -- so if you absolutely have to go down this route, that range would probably be the safest. An alternative would be to use memory in the BDA in the range 0x4ac-0x4ff (absolute), which appears to be available for BIOS-specific uses.
>> NEITHER OF THESE OPTIONS ARE SAFE ON REAL HARDWARE << These are both BIOS-specific use areas. -hpa
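For reference, the address arithmetic behind the options discussed above can be sketched as follows; the helpers simply restate hpa's numbers (IVT entry n at linear address n*4, scratch vectors 0xc0-0xdf, BDA tail 0x4ac-0x4ff) and are not any existing API:

```c
#include <stdint.h>

/* IVT entry for vector n lives at linear address n*4 (segment 0), which
 * is what the patch's OLD_INT19/OLD_INT13 defines rely on. */
static uint32_t ivt_entry_addr(uint8_t vector)
{
    return (uint32_t)vector * 4;
}

/* Vectors some field BIOSes already treat as a scratch memory pool. */
static int in_bios_scratch_vectors(uint8_t vector)
{
    return vector >= 0xc0 && vector <= 0xdf;
}

/* Tail of the BIOS Data Area that appears free for BIOS-specific use. */
static int in_bda_scratch(uint32_t linear)
{
    return linear >= 0x4ac && linear <= 0x4ff;
}
```

So the patch's INT 0x80/0x81 slots sit at linear 0x200 and 0x204, inside the OS-visible vector table, which is exactly why hpa objects to them on real hardware.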
[PATCH] virtio-spec: document block CMD and FLUSH
I took a stab at documenting CMD and FLUSH request types in virtio block. Christoph, could you look over this please?

I note that the interface seems full of warts to me; this might be a first step to cleaning them up. One issue I struggled with especially is how the type field mixes bits and non-bit values. I ended up simply defining all legal values, so that we have CMD = 2, CMD_OUT = 3 and so on. I also avoided introducing the inhdr/outhdr structures that the virtio-blk driver in Linux uses; I was concerned that nesting tables will confuse the reader.

Comments welcome.

Signed-off-by: Michael S. Tsirkin

--

diff --git a/virtio-spec.lyx b/virtio-spec.lyx index d16104a..ed35893 100644 --- a/virtio-spec.lyx +++ b/virtio-spec.lyx @@ -67,7 +67,11 @@ IBM Corporation \end_layout \begin_layout Standard + +\change_deleted 0 1266531118 FIXME: virtio block scsi passthrough section +\change_unchanged + \end_layout \begin_layout Standard @@ -4376,7 +4380,7 @@ struct virtio_net_ctrl_mac { The device can filter incoming packets by any number of destination MAC addresses. \begin_inset Foot -status open +status collapsed \begin_layout Plain Layout Since there are no guarentees, it can use a hash filter orsilently switch @@ -4549,6 +4553,22 @@ blk_size \end_inset . +\change_inserted 0 1266444580 + +\end_layout + +\begin_layout Description + +\change_inserted 0 1266471229 +VIRTIO_BLK_F_SCSI (7) Device supports scsi packet commands. +\end_layout + +\begin_layout Description + +\change_inserted 0 1266444605 +VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
+\change_unchanged + \end_layout \begin_layout Description @@ -4700,17 +4720,25 @@ struct virtio_blk_req { \begin_layout Plain Layout +\change_deleted 0 1266472188 + #define VIRTIO_BLK_T_IN 0 \end_layout \begin_layout Plain Layout +\change_deleted 0 1266472188 + #define VIRTIO_BLK_T_OUT 1 \end_layout \begin_layout Plain Layout +\change_deleted 0 1266472188 + #define VIRTIO_BLK_T_BARRIER0x8000 +\change_unchanged + \end_layout \begin_layout Plain Layout @@ -4735,11 +4763,15 @@ struct virtio_blk_req { \begin_layout Plain Layout +\change_deleted 0 1266472204 + #define VIRTIO_BLK_S_OK0 \end_layout \begin_layout Plain Layout +\change_deleted 0 1266472204 + #define VIRTIO_BLK_S_IOERR 1 \end_layout @@ -4759,32 +4791,481 @@ struct virtio_blk_req { \end_layout \begin_layout Standard -The type of the request is either a read (VIRTIO_BLK_T_IN) or a write (VIRTIO_BL -K_T_OUT); the high bit indicates that this request acts as a barrier and - that all preceeding requests must be complete before this one, and all - following requests must not be started until this is complete. 
+ +\change_inserted 0 1266472490 +If the device has VIRTIO_BLK_F_SCSI feature, it can also support scsi packet + command requests, each of these requests is of form: +\begin_inset listings +inline false +status open + +\begin_layout Plain Layout + +\change_inserted 0 1266472395 + +struct virtio_scsi_pc_req { +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472375 + + u32 type; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472375 + + u32 ioprio; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266474298 + + u64 sector; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266474308 + +char cmd[]; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266505809 + + char data[][512]; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266505825 + +#define SCSI_SENSE_BUFFERSIZE 96 +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266505848 + +u8 sense[SCSI_SENSE_BUFFERSIZE]; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472969 + +u32 errors; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472979 + +u32 data_len; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472984 + +u32 sense_len; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472987 + +u32 residual; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472375 + + u8 status; +\end_layout + +\begin_layout Plain Layout + +\change_inserted 0 1266472375 + +}; +\end_layout + +\end_inset + + +\change_unchanged + \end_layout \begin_layout Standard -The ioprio field is a hint about the relative priorities of requests to - the device: higher numbers indicate more important requests. 
+The +\emph on +type +\emph default + of the request is either a read (VIRTIO_BLK_T_IN) +\change_inserted 0 1266495815 +, +\change_unchanged + +\change_deleted 0 1266495817 +or +\change_unchanged + a write (VIRTIO_BLK_T_OUT) +\change_inserted 0 1266497316 +, a scsi packet command (VIRTIO_BLK_T_SCSI_CMD or VIRTIO_BLK_T_SCSI_CMD_OUT +\begin_inset Foot +status open + +\begin_layout Plain Layout + +\change_inserted 0 1266497390
Re: [PATCH 1/2] qemu-kvm: extboot: Keep variables in RAM
On 02/18/2010 10:13 AM, Jan Kiszka wrote: Instead of saving the old INT 0x13 and 0x19 handlers in ROM which fails under QEMU as it enforces protection, keep them in spare vectors of the interrupt table, namely INT 0x80 and 0x81. Signed-off-by: Jan Kiszka commit a4492b03932ea3c9762372f3e15e8c6526ee56c6 Author: H. Peter Anvin Date: Fri Jul 18 11:22:59 2008 -0700 kvm: extboot: don't use interrupt vectors $0x2b and $0x2c extboot's use of interrupt vectors $0x2b and $0x2c is unsafe, as these interrupt vectors fall in the OS-use range (0x20-0x3f). Furthermore, it's unnecessary: we can keep a local pointer instead of hooking another interrupt as long as we can write to our own segment. Make the extboot segment writable, and use local variables to hold the old link pointers. If this turns out to cause problems, we should probably switch to using vectors in the 0xc0-0xef range, and/or other BIOS-reserved memory. Signed-off-by: H. Peter Anvin Signed-off-by: Avi Kivity Sounds like 0x80/0x81 is probably not the best choice. hpa: any suggestions? Regards, Anthony Liguori --- Don't forget to update extboot.bin after merging both patches. 
pc-bios/optionrom/extboot.S | 41 ++--- 1 files changed, 30 insertions(+), 11 deletions(-) diff --git a/pc-bios/optionrom/extboot.S b/pc-bios/optionrom/extboot.S index 1e60f68..1eeb172 100644 --- a/pc-bios/optionrom/extboot.S +++ b/pc-bios/optionrom/extboot.S @@ -19,6 +19,9 @@ * Authors: Anthony Liguori */ +#define OLD_INT19 (0x80 * 4) /* re-use INT 0x80 BASIC vector */ +#define OLD_INT13 (0x81 * 4) /* re-use INT 0x81 BASIC vector */ + .code16 .text .global _start @@ -37,7 +40,7 @@ _start: /* save old int 19 */ mov (0x19*4), %eax - mov %eax, %cs:old_int19 + mov %eax, (OLD_INT19) /* install out int 19 handler */ movw $int19_handler, (0x19*4) @@ -48,6 +51,7 @@ _start: lret int19_handler: + push %eax /* reserve space for lret */ push %eax push %bx push %cx @@ -69,7 +73,7 @@ int19_handler: 1: /* hook int13: intb(0x404) == 1 */ /* save old int 13 to int 2c */ mov (0x13*4), %eax - mov %eax, %cs:old_int13 + mov %eax, (OLD_INT13) /* install our int 13 handler */ movw $int13_handler, (0x13*4) @@ -90,15 +94,21 @@ int19_handler: 3: /* fall through: inb(0x404) == 0 */ /* restore previous int $0x19 handler */ - mov %cs:old_int19,%eax + mov (OLD_INT19),%eax mov %eax,(0x19*4) - + + /* write old handler as return address onto stack */ + push %bp + mov %sp, %bp + mov %eax, 14(%bp) + pop %bp + pop %ds pop %dx pop %cx pop %bx pop %eax - ljmpw *%cs:old_int19 + lret #define FLAGS_CF 0x01 @@ -626,7 +636,21 @@ terminate_disk_emulation: int13_handler: cmp $0x80, %dl je 1f - ljmpw *%cs:old_int13 + + /* write old handler as return address onto stack */ + push %eax + push %eax + push %ds + push %bp + mov %sp, %bp + xor %ax, %ax + mov %ax, %ds + mov (OLD_INT13), %eax + mov %eax, 8(%bp) + pop %bp + pop %ds + pop %eax + lret 1: cmp $0x0, %ah jne 1f @@ -686,10 +710,5 @@ int13_handler: int $0x18 /* boot failed */ iret -/* Variables */ -.align 4, 0 -old_int13: .long 0 -old_int19: .long 0 - .align 512, 0 _end:
Re: [PATCH 2/2] qemu-kvm: extboot: Clean up host-guest interface
On 02/18/2010 10:13 AM, Jan Kiszka wrote: Drop the unused boot mode port 0x404 from the host-guest interface of extboot and remove related code from both sides. Signed-off-by: Jan Kiszka Makes sense. Acked-by: Anthony Liguori --- hw/extboot.c| 14 +- hw/pc.c |2 +- hw/pc.h |2 +- pc-bios/optionrom/extboot.S | 23 --- 4 files changed, 3 insertions(+), 38 deletions(-) diff --git a/hw/extboot.c b/hw/extboot.c index b91d54f..8ada21b 100644 --- a/hw/extboot.c +++ b/hw/extboot.c @@ -65,12 +65,6 @@ static void get_translated_chs(BlockDriverState *bs, int *c, int *h, int *s) } } -static uint32_t extboot_read(void *opaque, uint32_t addr) -{ -int *pcmd = opaque; -return *pcmd; -} - static void extboot_write_cmd(void *opaque, uint32_t addr, uint32_t value) { union extboot_cmd cmd; @@ -123,13 +117,7 @@ static void extboot_write_cmd(void *opaque, uint32_t addr, uint32_t value) qemu_free(buf); } -void extboot_init(BlockDriverState *bs, int cmd) +void extboot_init(BlockDriverState *bs) { -int *pcmd; - -pcmd = qemu_mallocz(sizeof(int)); - -*pcmd = cmd; -register_ioport_read(0x404, 1, 1, extboot_read, pcmd); register_ioport_write(0x405, 1, 2, extboot_write_cmd, bs); } diff --git a/hw/pc.c b/hw/pc.c index 97e16ce..8175874 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -1057,7 +1057,7 @@ static void pc_init1(ram_addr_t ram_size, bdrv_set_geometry_hint(info->bdrv, cyls, heads, secs); } - extboot_init(info->bdrv, 1); + extboot_init(info->bdrv); } #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT diff --git a/hw/pc.h b/hw/pc.h index b00f311..e9da683 100644 --- a/hw/pc.h +++ b/hw/pc.h @@ -165,7 +165,7 @@ void isa_ne2000_init(int base, int irq, NICInfo *nd); /* extboot.c */ -void extboot_init(BlockDriverState *bs, int cmd); +void extboot_init(BlockDriverState *bs); int cpu_is_bsp(CPUState *env); diff --git a/pc-bios/optionrom/extboot.S b/pc-bios/optionrom/extboot.S index 1eeb172..db6c2b6 100644 --- a/pc-bios/optionrom/extboot.S +++ b/pc-bios/optionrom/extboot.S @@ -62,15 +62,6 @@ int19_handler: xor %ax, %ax mov 
%ax, %ds - movw $0x404, %dx - inb %dx, %al - cmp $1, %al - je 1f - cmp $2, %al - je 2f - jmp 3f - -1: /* hook int13: intb(0x404) == 1 */ /* save old int 13 to int 2c */ mov (0x13*4), %eax mov %eax, (OLD_INT13) @@ -78,21 +69,7 @@ int19_handler: /* install our int 13 handler */ movw $int13_handler, (0x13*4) mov %cs, (0x13*4+2) - jmp 3f -2: /* linux boot: intb(0x404) == 2 */ - cli - cld - mov $0x9000, %ax - mov %ax, %ds - mov %ax, %es - mov %ax, %fs - mov %ax, %gs - mov %ax, %ss - mov $0x8ffe, %sp - ljmp $0x9000 + 0x20, $0 - -3: /* fall through: inb(0x404) == 0 */ /* restore previous int $0x19 handler */ mov (OLD_INT19),%eax mov %eax,(0x19*4)
Re: [PATCH 0/10] Nested SVM fixes (and Win7-64bit bringup)
On Thu, Feb 18, 2010 at 04:54:37PM +0200, Avi Kivity wrote: > On 02/18/2010 04:48 PM, Alexander Graf wrote: > >On 18.02.2010, at 15:33, Avi Kivity wrote: > > > >>On 02/18/2010 01:38 PM, Joerg Roedel wrote: > >>>Hi, > >>> > >>>here is a couple of fixes for the nested SVM implementation. I collected > >>>these > >>>fixes mostly when trying to get Windows 7 64bit running as an L2 guest. > >>>Most > >>>important fixes in this set make lazy fpu switching working with nested > >>>SVM and > >>>the nested tpr handling fixes. Without the latter fix the l1 guest freezes > >>>when > >>>trying to run win7 as l2 guest. Please review and comment on these patches > >>>:-) > >>> > >>Overall looks good. Would appreciate Alex looking over these as well. > >The kmap thing is broken though, right? > > Oh yes, but that's a search and replace, not something needing deep rework. Great. I'll post again with fixes soon. Joerg
Re: [PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
On Thu, Feb 18, 2010 at 04:55:06PM +0200, Avi Kivity wrote: > On 02/18/2010 04:51 PM, Alexander Graf wrote: > >On 18.02.2010, at 12:38, Joerg Roedel wrote: > > > >>TDB. > >TDB? That's not a patch description. > > > > Short for "To De Befined", I presume. Oops, I just forgot to give this patch a proper commit message. I'll add one for the next post. Joerg
Re: [PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
On Thu, Feb 18, 2010 at 04:32:02PM +0200, Avi Kivity wrote: > On 02/18/2010 01:38 PM, Joerg Roedel wrote: > >TDB. > > > > ... > > >@@ -973,6 +973,7 @@ static void svm_decache_cr4_guest_bits(struct kvm_vcpu > >*vcpu) > > > > static void update_cr0_intercept(struct vcpu_svm *svm) > > { > >+struct vmcb *vmcb = svm->vmcb; > > ulong gcr0 = svm->vcpu.arch.cr0; > > u64 *hcr0 =&svm->vmcb->save.cr0; > > > >@@ -984,11 +985,25 @@ static void update_cr0_intercept(struct vcpu_svm *svm) > > > > > > if (gcr0 == *hcr0&& svm->vcpu.fpu_active) { > >-svm->vmcb->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; > >-svm->vmcb->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; > >+vmcb->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; > >+vmcb->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; > >+if (is_nested(svm)) { > >+struct vmcb *hsave = svm->nested.hsave; > >+ > >+hsave->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; > >+hsave->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; > >+vmcb->control.intercept_cr_read |= > >svm->nested.intercept_cr_read; > >+vmcb->control.intercept_cr_write |= > >svm->nested.intercept_cr_write; > > Why are the last two lines needed? Because we don't know if the l1 hypervisor wants to intercept cr0. In this case we need this intercept to stay enabled. > >+} > > } else { > > svm->vmcb->control.intercept_cr_read |= INTERCEPT_CR0_MASK; > > svm->vmcb->control.intercept_cr_write |= INTERCEPT_CR0_MASK; > >+if (is_nested(svm)) { > >+struct vmcb *hsave = svm->nested.hsave; > >+ > >+hsave->control.intercept_cr_read |= INTERCEPT_CR0_MASK; > >+hsave->control.intercept_cr_write |= INTERCEPT_CR0_MASK; > >+} > > } > > } > > Maybe it's better to call update_cr0_intercept() after a vmexit > instead, to avoid this repetition, and since the if () may take a > different branch for the nested guest and guest cr0. Thinking again about it I am not sure if this is needed at all. At vmexit emulation we call svm_set_cr0 which itself calls update_cr0_intercept. I'll try this. 
Joerg
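Joerg's answer -- that the L1 hypervisor's own CR0 intercepts must survive when the host drops its intercept -- can be modeled in a few lines of plain C; the names here are illustrative, not the kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

#define INTERCEPT_CR0_MASK 1u

/* Effective CR0 intercept mask for a nested guest: the host's own need
 * to intercept dominates, but when the host no longer cares, the L1
 * hypervisor's requested bits must be kept (the two "|=" lines Avi
 * questioned), since the host cannot know whether L1 wants the exit. */
static uint32_t effective_cr0_intercept(bool host_wants_intercept,
                                        uint32_t l1_intercept_bits)
{
    if (host_wants_intercept)
        return INTERCEPT_CR0_MASK;          /* host always intercepts */

    /* Host doesn't care: fall back to the L1 hypervisor's choice. */
    return l1_intercept_bits & INTERCEPT_CR0_MASK;
}
```

Dropping the `l1_intercept_bits` term would silently eat intercepts the L1 hypervisor asked for, which is the bug the extra lines in the patch avoid.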
Re: [PATCH 03/10] KVM: SVM: Fix schedule-while-atomic on nested exception handling
On Thu, Feb 18, 2010 at 03:52:20PM +0200, Avi Kivity wrote: > On 02/18/2010 01:38 PM, Joerg Roedel wrote: > >Move the actual vmexit routine out of code that runs with > >irqs and preemption disabled. > > > >Cc: sta...@kernel.org > >Signed-off-by: Joerg Roedel > >--- > > arch/x86/kvm/svm.c | 20 +--- > > 1 files changed, 17 insertions(+), 3 deletions(-) > > > >diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > >index 7c96b8b..25d26ec 100644 > >--- a/arch/x86/kvm/svm.c > >+++ b/arch/x86/kvm/svm.c > >@@ -128,6 +128,7 @@ static void svm_flush_tlb(struct kvm_vcpu *vcpu); > > static void svm_complete_interrupts(struct vcpu_svm *svm); > > > > static int nested_svm_exit_handled(struct vcpu_svm *svm); > >+static int nested_svm_exit_handled_atomic(struct vcpu_svm *svm); > > static int nested_svm_vmexit(struct vcpu_svm *svm); > > static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, > > bool has_error_code, u32 error_code); > >@@ -1386,7 +1387,7 @@ static int nested_svm_check_exception(struct vcpu_svm > >*svm, unsigned nr, > > svm->vmcb->control.exit_info_1 = error_code; > > svm->vmcb->control.exit_info_2 = svm->vcpu.arch.cr2; > > > >-return nested_svm_exit_handled(svm); > >+return nested_svm_exit_handled_atomic(svm); > > } > > What do you say to > > >if (nested_svm_intercepts(svm)) > svm->nested.exit_required = true; > > here, and recoding nested_svm_exit_handled() to call > nested_svm_intercepts()? I think it improves readability a little > by avoiding a function that changes behaviour according to how it is > called. That's a good idea, will change that. It improves readability. > Long term, we may want to split out the big switch into the > individual handlers, to avoid decoding the exit reason twice. I don't think that's a good idea. The nested exit handling is at the beginning of svm_handle_exit to hide the nested vcpu state from the kvm logic which may not be aware of nesting at all.
My rationale is that the host hypervisor doesn't see any exits that need to be reinjected to the l1 hypervisor. Joerg
Re: [PATCH 01/10] KVM: SVM: Don't use kmap_atomic in nested_svm_map
On Thu, Feb 18, 2010 at 03:40:56PM +0200, Avi Kivity wrote: > On 02/18/2010 01:38 PM, Joerg Roedel wrote: > >Use of kmap_atomic disables preemption but if we run in > >shadow-shadow mode the vmrun emulation executes kvm_set_cr3 > >which might sleep or fault. So use kmap instead for > >nested_svm_map. > > > > > > > >-static void nested_svm_unmap(void *addr, enum km_type idx) > >+static void nested_svm_unmap(void *addr) > > { > > struct page *page; > > > >@@ -1443,7 +1443,7 @@ static void nested_svm_unmap(void *addr, enum km_type > >idx) > > > > page = kmap_atomic_to_page(addr); > > > >-kunmap_atomic(addr, idx); > >+kunmap(addr); > > kvm_release_page_dirty(page); > > } > > kunmap() takes a struct page *, not the virtual address (a > consistent source of bugs). Ah true, thanks. I'll fix that. > kmap() is generally an unloved interface, it is slow and possibly > deadlock prone, but it's better than sleeping in atomic context. If > you can hack your way around it, that is preferred. Best would be to use kvm_read_guest, but I fear that this will have a performance impact. Maybe I'll try this and measure if it really has a significant performance impact. Joerg
[PATCH 2/2] qemu-kvm: extboot: Clean up host-guest interface
Drop the unused boot mode port 0x404 from the host-guest interface of extboot and remove related code from both sides. Signed-off-by: Jan Kiszka --- hw/extboot.c| 14 +- hw/pc.c |2 +- hw/pc.h |2 +- pc-bios/optionrom/extboot.S | 23 --- 4 files changed, 3 insertions(+), 38 deletions(-) diff --git a/hw/extboot.c b/hw/extboot.c index b91d54f..8ada21b 100644 --- a/hw/extboot.c +++ b/hw/extboot.c @@ -65,12 +65,6 @@ static void get_translated_chs(BlockDriverState *bs, int *c, int *h, int *s) } } -static uint32_t extboot_read(void *opaque, uint32_t addr) -{ -int *pcmd = opaque; -return *pcmd; -} - static void extboot_write_cmd(void *opaque, uint32_t addr, uint32_t value) { union extboot_cmd cmd; @@ -123,13 +117,7 @@ static void extboot_write_cmd(void *opaque, uint32_t addr, uint32_t value) qemu_free(buf); } -void extboot_init(BlockDriverState *bs, int cmd) +void extboot_init(BlockDriverState *bs) { -int *pcmd; - -pcmd = qemu_mallocz(sizeof(int)); - -*pcmd = cmd; -register_ioport_read(0x404, 1, 1, extboot_read, pcmd); register_ioport_write(0x405, 1, 2, extboot_write_cmd, bs); } diff --git a/hw/pc.c b/hw/pc.c index 97e16ce..8175874 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -1057,7 +1057,7 @@ static void pc_init1(ram_addr_t ram_size, bdrv_set_geometry_hint(info->bdrv, cyls, heads, secs); } - extboot_init(info->bdrv, 1); + extboot_init(info->bdrv); } #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT diff --git a/hw/pc.h b/hw/pc.h index b00f311..e9da683 100644 --- a/hw/pc.h +++ b/hw/pc.h @@ -165,7 +165,7 @@ void isa_ne2000_init(int base, int irq, NICInfo *nd); /* extboot.c */ -void extboot_init(BlockDriverState *bs, int cmd); +void extboot_init(BlockDriverState *bs); int cpu_is_bsp(CPUState *env); diff --git a/pc-bios/optionrom/extboot.S b/pc-bios/optionrom/extboot.S index 1eeb172..db6c2b6 100644 --- a/pc-bios/optionrom/extboot.S +++ b/pc-bios/optionrom/extboot.S @@ -62,15 +62,6 @@ int19_handler: xor %ax, %ax mov %ax, %ds - movw $0x404, %dx - inb %dx, %al - cmp $1, %al - je 1f - cmp $2, %al - je 
2f - jmp 3f - -1: /* hook int13: intb(0x404) == 1 */ /* save old int 13 to int 2c */ mov (0x13*4), %eax mov %eax, (OLD_INT13) @@ -78,21 +69,7 @@ int19_handler: /* install our int 13 handler */ movw $int13_handler, (0x13*4) mov %cs, (0x13*4+2) - jmp 3f -2: /* linux boot: intb(0x404) == 2 */ - cli - cld - mov $0x9000, %ax - mov %ax, %ds - mov %ax, %es - mov %ax, %fs - mov %ax, %gs - mov %ax, %ss - mov $0x8ffe, %sp - ljmp $0x9000 + 0x20, $0 - -3: /* fall through: inb(0x404) == 0 */ /* restore previous int $0x19 handler */ mov (OLD_INT19),%eax mov %eax,(0x19*4)
[PATCH 1/2] qemu-kvm: extboot: Keep variables in RAM
Instead of saving the old INT 0x13 and 0x19 handlers in ROM which fails under QEMU as it enforces protection, keep them in spare vectors of the interrupt table, namely INT 0x80 and 0x81. Signed-off-by: Jan Kiszka --- Don't forget to update extboot.bin after merging both patches. pc-bios/optionrom/extboot.S | 41 ++--- 1 files changed, 30 insertions(+), 11 deletions(-) diff --git a/pc-bios/optionrom/extboot.S b/pc-bios/optionrom/extboot.S index 1e60f68..1eeb172 100644 --- a/pc-bios/optionrom/extboot.S +++ b/pc-bios/optionrom/extboot.S @@ -19,6 +19,9 @@ * Authors: Anthony Liguori */ +#define OLD_INT19 (0x80 * 4) /* re-use INT 0x80 BASIC vector */ +#define OLD_INT13 (0x81 * 4) /* re-use INT 0x81 BASIC vector */ + .code16 .text .global _start @@ -37,7 +40,7 @@ _start: /* save old int 19 */ mov (0x19*4), %eax - mov %eax, %cs:old_int19 + mov %eax, (OLD_INT19) /* install out int 19 handler */ movw $int19_handler, (0x19*4) @@ -48,6 +51,7 @@ _start: lret int19_handler: + push %eax /* reserve space for lret */ push %eax push %bx push %cx @@ -69,7 +73,7 @@ int19_handler: 1: /* hook int13: intb(0x404) == 1 */ /* save old int 13 to int 2c */ mov (0x13*4), %eax - mov %eax, %cs:old_int13 + mov %eax, (OLD_INT13) /* install our int 13 handler */ movw $int13_handler, (0x13*4) @@ -90,15 +94,21 @@ int19_handler: 3: /* fall through: inb(0x404) == 0 */ /* restore previous int $0x19 handler */ - mov %cs:old_int19,%eax + mov (OLD_INT19),%eax mov %eax,(0x19*4) - + + /* write old handler as return address onto stack */ + push %bp + mov %sp, %bp + mov %eax, 14(%bp) + pop %bp + pop %ds pop %dx pop %cx pop %bx pop %eax - ljmpw *%cs:old_int19 + lret #define FLAGS_CF 0x01 @@ -626,7 +636,21 @@ terminate_disk_emulation: int13_handler: cmp $0x80, %dl je 1f - ljmpw *%cs:old_int13 + + /* write old handler as return address onto stack */ + push %eax + push %eax + push %ds + push %bp + mov %sp, %bp + xor %ax, %ax + mov %ax, %ds + mov (OLD_INT13), %eax + mov %eax, 8(%bp) + pop %bp + pop %ds + pop %eax + 
lret 1: cmp $0x0, %ah jne 1f @@ -686,10 +710,5 @@ int13_handler: int $0x18 /* boot failed */ iret -/* Variables */ -.align 4, 0 -old_int13: .long 0 -old_int19: .long 0 - .align 512, 0 _end:
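The lret trick this patch introduces can be modeled in plain C: the handler reserves a far-pointer slot on the stack (the two dummy pushes), later patches the saved IVT entry into that slot via %bp, and `lret` then pops offset and segment to chain to the old handler without needing a writable `old_int13`/`old_int19` variable in the ROM segment. The array-based stack below is purely an illustration of that mechanism, not real-mode code:

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Model of the 16-bit stack: sp indexes the top-of-stack word and the
 * stack grows toward lower indices, like %sp. */
struct rm_stack {
    uint16_t w[STACK_WORDS];
    int sp;
};

static void push16(struct rm_stack *s, uint16_t v) { s->w[--s->sp] = v; }
static uint16_t pop16(struct rm_stack *s) { return s->w[s->sp++]; }

/* Mimic the patch: reserve a far-pointer slot, then rewrite it with the
 * saved IVT entry (offset in the low word, segment in the high word) so
 * that a subsequent lret jumps to the old handler. */
static void chain_via_lret(struct rm_stack *s, uint32_t old_vec)
{
    push16(s, 0);                  /* placeholder: segment (popped second) */
    push16(s, 0);                  /* placeholder: offset (popped first) */
    /* ...handler body would run here, then patch the slot via %bp... */
    s->w[s->sp]     = (uint16_t)old_vec;          /* offset */
    s->w[s->sp + 1] = (uint16_t)(old_vec >> 16);  /* segment */
}
```

This is why the diff can delete the `old_int13`/`old_int19` variables entirely: the chained target now lives in the IVT slot plus the stack, never in the (write-protected) option ROM.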
Re: [PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
On 02/18/2010 04:51 PM, Alexander Graf wrote: On 18.02.2010, at 12:38, Joerg Roedel wrote: TDB. TDB? That's not a patch description. Short for "To De Befined", I presume. -- error compiling committee.c: too many arguments to function
Re: [PATCH 0/10] Nested SVM fixes (and Win7-64bit bringup)
On 02/18/2010 04:48 PM, Alexander Graf wrote: On 18.02.2010, at 15:33, Avi Kivity wrote: On 02/18/2010 01:38 PM, Joerg Roedel wrote: Hi, here is a couple of fixes for the nested SVM implementation. I collected these fixes mostly when trying to get Windows 7 64bit running as an L2 guest. Most important fixes in this set make lazy fpu switching working with nested SVM and the nested tpr handling fixes. Without the latter fix the l1 guest freezes when trying to run win7 as l2 guest. Please review and comment on these patches :-) Overall looks good. Would appreciate Alex looking over these as well. The kmap thing is broken though, right? Oh yes, but that's a search and replace, not something needing deep rework. -- error compiling committee.c: too many arguments to function
Re: [PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
On 18.02.2010, at 12:38, Joerg Roedel wrote: > TDB. TDB? That's not a patch description. Alex
Re: [PATCH 0/10] Nested SVM fixes (and Win7-64bit bringup)
On 18.02.2010, at 15:33, Avi Kivity wrote: > On 02/18/2010 01:38 PM, Joerg Roedel wrote: >> Hi, >> >> here is a couple of fixes for the nested SVM implementation. I collected >> these >> fixes mostly when trying to get Windows 7 64bit running as an L2 guest. Most >> important fixes in this set make lazy fpu switching working with nested SVM >> and >> the nested tpr handling fixes. Without the latter fix the l1 guest freezes >> when >> trying to run win7 as l2 guest. Please review and comment on these patches >> :-) >> > > Overall looks good. Would appreciate Alex looking over these as well. The kmap thing is broken though, right? Alex
Re: [patch uq/master 2/4] qemu: kvm specific wait_io_event
On 02/18/2010 03:58 PM, Marcelo Tosatti wrote: On Thu, Feb 18, 2010 at 10:29:35AM +0200, Avi Kivity wrote: +static void qemu_kvm_wait_io_event(CPUState *env) +{ +while (!cpu_has_work(env)) +qemu_cond_timedwait(env->halt_cond,&qemu_global_mutex, 1000); + +qemu_wait_io_event_common(env); } Shouldn't kvm specific code be in kvm-all.c? The context is in vl.c, so don't see much gain. ok.
Re: [PATCH 0/10] Nested SVM fixes (and Win7-64bit bringup)
On 02/18/2010 01:38 PM, Joerg Roedel wrote: Hi, here are a couple of fixes for the nested SVM implementation. I collected these fixes mostly while trying to get Windows 7 64bit running as an L2 guest. The most important fixes in this set make lazy fpu switching work with nested SVM and fix the nested tpr handling. Without the latter fix the l1 guest freezes when trying to run win7 as l2 guest. Please review and comment on these patches :-) Overall looks good. Would appreciate Alex looking over these as well.
Re: [PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
On 02/18/2010 01:38 PM, Joerg Roedel wrote: TDB. ... @@ -973,6 +973,7 @@ static void svm_decache_cr4_guest_bits(struct kvm_vcpu *vcpu) static void update_cr0_intercept(struct vcpu_svm *svm) { + struct vmcb *vmcb = svm->vmcb; ulong gcr0 = svm->vcpu.arch.cr0; u64 *hcr0 =&svm->vmcb->save.cr0; @@ -984,11 +985,25 @@ static void update_cr0_intercept(struct vcpu_svm *svm) if (gcr0 == *hcr0&& svm->vcpu.fpu_active) { - svm->vmcb->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; - svm->vmcb->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; + if (is_nested(svm)) { + struct vmcb *hsave = svm->nested.hsave; + + hsave->control.intercept_cr_read&= ~INTERCEPT_CR0_MASK; + hsave->control.intercept_cr_write&= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_read |= svm->nested.intercept_cr_read; + vmcb->control.intercept_cr_write |= svm->nested.intercept_cr_write; Why are the last two lines needed? + } } else { svm->vmcb->control.intercept_cr_read |= INTERCEPT_CR0_MASK; svm->vmcb->control.intercept_cr_write |= INTERCEPT_CR0_MASK; + if (is_nested(svm)) { + struct vmcb *hsave = svm->nested.hsave; + + hsave->control.intercept_cr_read |= INTERCEPT_CR0_MASK; + hsave->control.intercept_cr_write |= INTERCEPT_CR0_MASK; + } } } Maybe it's better to call update_cr0_intercept() after a vmexit instead, to avoid this repetition, and since the if () may take a different branch for the nested guest and guest cr0. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: extboot: Purpose of cmd value
On 02/18/2010 08:01 AM, Jan Kiszka wrote: Hi Anthony, I have to fix extboot for non-KVM usage which means pushing its variables to writable RAM (probably some spare INT vector(s) in the BASIC area). In that process, I stumbled over the cmd value passed to extboot_init, then stored in qemu_malloc'ed memory, and finally reported to the guest side via port 0x404. It's constant so far, always 1. So there is quite a bit of unused code and data on both sides. Can we drop that? Yes, that's historic from the days when extboot also did Linux kernel loading. I don't see a problem dropping it. Regards, Anthony Liguori Jan
extboot: Purpose of cmd value
Hi Anthony, I have to fix extboot for non-KVM usage which means pushing its variables to writable RAM (probably some spare INT vector(s) in the BASIC area). In that process, I stumbled over the cmd value passed to extboot_init, then stored in qemu_malloc'ed memory, and finally reported to the guest side via port 0x404. It's constant so far, always 1. So there is quite a bit of unused code and data on both sides. Can we drop that? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [patch uq/master 2/4] qemu: kvm specific wait_io_event
On Thu, Feb 18, 2010 at 10:29:35AM +0200, Avi Kivity wrote: > >+static void qemu_kvm_wait_io_event(CPUState *env) > >+{ > >+while (!cpu_has_work(env)) > >+qemu_cond_timedwait(env->halt_cond,&qemu_global_mutex, 1000); > >+ > >+qemu_wait_io_event_common(env); > > } > > Shouldn't kvm specific code be in kvm-all.c? The context is in vl.c, so don't see much gain. > > > > static int qemu_cpu_exec(CPUState *env); > >@@ -3448,7 +3462,7 @@ static void *kvm_cpu_thread_fn(void *arg > > while (1) { > > if (cpu_can_run(env)) > > qemu_cpu_exec(env); > >-qemu_wait_io_event(env); > >+qemu_kvm_wait_io_event(env); > > } > > > > return NULL; > > Well, kvm_cpu_thread_fn() apparently isn't.
Re: [PATCH 03/10] KVM: SVM: Fix schedule-while-atomic on nested exception handling
On 02/18/2010 01:38 PM, Joerg Roedel wrote: Move the actual vmexit routine out of code that runs with irqs and preemption disabled. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 20 +--- 1 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 7c96b8b..25d26ec 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -128,6 +128,7 @@ static void svm_flush_tlb(struct kvm_vcpu *vcpu); static void svm_complete_interrupts(struct vcpu_svm *svm); static int nested_svm_exit_handled(struct vcpu_svm *svm); +static int nested_svm_exit_handled_atomic(struct vcpu_svm *svm); static int nested_svm_vmexit(struct vcpu_svm *svm); static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, bool has_error_code, u32 error_code); @@ -1386,7 +1387,7 @@ static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, svm->vmcb->control.exit_info_1 = error_code; svm->vmcb->control.exit_info_2 = svm->vcpu.arch.cr2; - return nested_svm_exit_handled(svm); + return nested_svm_exit_handled_atomic(svm); } What do you say to if (nested_svm_intercepts(svm)) svm->nested.exit_required = true; here, and recoding nested_svm_exit_handled() to call nested_svm_intercepts()? I think it improves readability a little by avoiding a function that changes behaviour according to how it is called. Long term, we may want to split out the big switch into the individual handlers, to avoid decoding the exit reason twice. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] KVM: SVM: Don't use kmap_atomic in nested_svm_map
On 02/18/2010 01:38 PM, Joerg Roedel wrote: Use of kmap_atomic disables preemption but if we run in shadow-shadow mode the vmrun emulation executes kvm_set_cr3 which might sleep or fault. So use kmap instead for nested_svm_map. -static void nested_svm_unmap(void *addr, enum km_type idx) +static void nested_svm_unmap(void *addr) { struct page *page; @@ -1443,7 +1443,7 @@ static void nested_svm_unmap(void *addr, enum km_type idx) page = kmap_atomic_to_page(addr); - kunmap_atomic(addr, idx); + kunmap(addr); kvm_release_page_dirty(page); } kunmap() takes a struct page *, not the virtual address (a consistent source of bugs). kmap() is generally an unloved interface, it is slow and possibly deadlock prone, but it's better than sleeping in atomic context. If you can hack your way around it, that is preferred.
Re: [PATCH] Remove all references to KVM_CR3_CACHE
On 02/18/2010 11:57 AM, jes.soren...@redhat.com wrote: This patch removes all references to KVM_CR3_CACHE as suggested by Marcelo. Applied, thanks.
Re: qemu-kvm: do not allow vcpu stop with in progress PIO
On 02/09/2010 10:58 PM, Marcelo Tosatti wrote: You're right... this should be enough to avoid a stop with incomplete PIO (and this is what happens for MMIO already). The signal will not be dequeued, so KVM will complete_pio and exit before entering with -EAGAIN. Please review and queue for stable. qemu upstream needs a bit more work. --- Re-enter the kernel to complete in progress PIO. Otherwise the operation can be lost during migration. Thanks - applied to master and 0.12.
Re: qemu-kvm-0.12.2 hangs when booting grub, when kvm is disabled
Gleb Natapov wrote: > On Thu, Feb 18, 2010 at 12:32:39PM +0100, Jan Kiszka wrote: >> Jacques Landru wrote: >>> Hi, >>> >>> Same problem here >>> >>> qemu-kvm-0.12.x hangs if I have at the same time "-no-kvm" and >>> "file=essai-slitaz.raw,if=ide,index=0,boot=on" sometime with the >>> message below >>> >>> but >>> >>> qemu-kvm with "-no-kvm" and without "boot=on" option for the file >>> parameter, works >>> qemu-kvm with "boot=on" option and kvm enable works. (kvm-kmod is 2.6.32.7) >> I have to confirm this issue: Something badly crashes here as well, >> either grub or the extboot ROM or Seabios. >> >> Does anyone has a good idea what makes the difference here, ie. where to >> start debugging? >> > May be TCG interprets something incorrectly in extboot.bin. Sounds > unlikely, but symptoms look like this is the case. Looks like the old story again: extboot tries to write to ROM (old_int13 and old_int19 variables). That happens to work due to KVM limitations, but breaks once true protection is established. Don't we have some heap managed by Seabios that extension ROMs can use? Jan
[PATCH 04/10] KVM: SVM: Sync all control registers on nested vmexit
Currently the vmexit emulation does not sync control registers where the access is typically intercepted by the nested hypervisor. But we cannot count on those intercepts, so sync these registers too and make the code architecturally more correct. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 25d26ec..9a4f9ee 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1644,9 +1644,13 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) nested_vmcb->save.ds = vmcb->save.ds; nested_vmcb->save.gdtr = vmcb->save.gdtr; nested_vmcb->save.idtr = vmcb->save.idtr; + nested_vmcb->save.cr0= kvm_read_cr0(&svm->vcpu); if (npt_enabled) nested_vmcb->save.cr3= vmcb->save.cr3; + else + nested_vmcb->save.cr3= svm->vcpu.arch.cr3; nested_vmcb->save.cr2= vmcb->save.cr2; + nested_vmcb->save.cr4= svm->vcpu.arch.cr4; nested_vmcb->save.rflags = vmcb->save.rflags; nested_vmcb->save.rip= vmcb->save.rip; nested_vmcb->save.rsp= vmcb->save.rsp; -- 1.6.6
Re: qemu-kvm-0.12.2 hangs when booting grub, when kvm is disabled
Gleb Natapov wrote: > On Thu, Feb 18, 2010 at 12:32:39PM +0100, Jan Kiszka wrote: >> Jacques Landru wrote: >>> Hi, >>> >>> Same problem here >>> >>> qemu-kvm-0.12.x hangs if I have at the same time "-no-kvm" and >>> "file=essai-slitaz.raw,if=ide,index=0,boot=on" sometime with the >>> message below >>> >>> but >>> >>> qemu-kvm with "-no-kvm" and without "boot=on" option for the file >>> parameter, works >>> qemu-kvm with "boot=on" option and kvm enable works. (kvm-kmod is 2.6.32.7) >> I have to confirm this issue: Something badly crashes here as well, >> either grub or the extboot ROM or Seabios. >> >> Does anyone has a good idea what makes the difference here, ie. where to >> start debugging? >> > May be TCG interprets something incorrectly in extboot.bin. Sounds > unlikely, but symptoms look like this is the case. Yes, maybe. I'm currently building a debug version that has support for TCG block tracing enabled. Should get us closer to the crash. Jan
Re: qemu-kvm-0.12.2 hangs when booting grub, when kvm is disabled
On Thu, Feb 18, 2010 at 12:32:39PM +0100, Jan Kiszka wrote: > Jacques Landru wrote: > > Hi, > > > > Same problem here > > > > qemu-kvm-0.12.x hangs if I have at the same time "-no-kvm" and > > "file=essai-slitaz.raw,if=ide,index=0,boot=on" sometime with the > > message below > > > > but > > > > qemu-kvm with "-no-kvm" and without "boot=on" option for the file > > parameter, works > > qemu-kvm with "boot=on" option and kvm enable works. (kvm-kmod is 2.6.32.7) > > I have to confirm this issue: Something badly crashes here as well, > either grub or the extboot ROM or Seabios. > > Does anyone have a good idea what makes the difference here, i.e. where to > start debugging? > May be TCG interprets something incorrectly in extboot.bin. Sounds unlikely, but symptoms look like this is the case. -- Gleb.
[PATCH 05/10] KVM: SVM: Annotate nested_svm_map with might_sleep()
The nested_svm_map() function can sleep and must not be called from atomic context. So annotate that function. Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 9a4f9ee..3f59cbd 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1423,6 +1423,8 @@ static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa) { struct page *page; + might_sleep(); + page = gfn_to_page(svm->vcpu.kvm, gpa >> PAGE_SHIFT); if (is_error_page(page)) goto error; -- 1.6.6
[PATCH 01/10] KVM: SVM: Don't use kmap_atomic in nested_svm_map
Use of kmap_atomic disables preemption but if we run in shadow-shadow mode the vmrun emulation executes kvm_set_cr3 which might sleep or fault. So use kmap instead for nested_svm_map. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 32 1 files changed, 16 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 52f78dd..041ef6f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1417,7 +1417,7 @@ static inline int nested_svm_intr(struct vcpu_svm *svm) return 0; } -static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa, enum km_type idx) +static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa) { struct page *page; @@ -1425,7 +1425,7 @@ static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa, enum km_type idx) if (is_error_page(page)) goto error; - return kmap_atomic(page, idx); + return kmap(page); error: kvm_release_page_clean(page); @@ -1434,7 +1434,7 @@ error: return NULL; } -static void nested_svm_unmap(void *addr, enum km_type idx) +static void nested_svm_unmap(void *addr) { struct page *page; @@ -1443,7 +1443,7 @@ static void nested_svm_unmap(void *addr, enum km_type idx) page = kmap_atomic_to_page(addr); - kunmap_atomic(addr, idx); + kunmap(addr); kvm_release_page_dirty(page); } @@ -1458,7 +1458,7 @@ static bool nested_svm_exit_handled_msr(struct vcpu_svm *svm) if (!(svm->nested.intercept & (1ULL << INTERCEPT_MSR_PROT))) return false; - msrpm = nested_svm_map(svm, svm->nested.vmcb_msrpm, KM_USER0); + msrpm = nested_svm_map(svm, svm->nested.vmcb_msrpm); if (!msrpm) goto out; @@ -1486,7 +1486,7 @@ static bool nested_svm_exit_handled_msr(struct vcpu_svm *svm) ret = msrpm[t1] & ((1 << param) << t0); out: - nested_svm_unmap(msrpm, KM_USER0); + nested_svm_unmap(msrpm); return ret; } @@ -1616,7 +1616,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) vmcb->control.exit_int_info, vmcb->control.exit_int_info_err); - nested_vmcb = nested_svm_map(svm, svm->nested.vmcb, KM_USER0); + 
nested_vmcb = nested_svm_map(svm, svm->nested.vmcb); if (!nested_vmcb) return 1; @@ -1706,7 +1706,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) /* Exit nested SVM mode */ svm->nested.vmcb = 0; - nested_svm_unmap(nested_vmcb, KM_USER0); + nested_svm_unmap(nested_vmcb); kvm_mmu_reset_context(&svm->vcpu); kvm_mmu_load(&svm->vcpu); @@ -1719,7 +1719,7 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm) u32 *nested_msrpm; int i; - nested_msrpm = nested_svm_map(svm, svm->nested.vmcb_msrpm, KM_USER0); + nested_msrpm = nested_svm_map(svm, svm->nested.vmcb_msrpm); if (!nested_msrpm) return false; @@ -1728,7 +1728,7 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm) svm->vmcb->control.msrpm_base_pa = __pa(svm->nested.msrpm); - nested_svm_unmap(nested_msrpm, KM_USER0); + nested_svm_unmap(nested_msrpm); return true; } @@ -1739,7 +1739,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) struct vmcb *hsave = svm->nested.hsave; struct vmcb *vmcb = svm->vmcb; - nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, KM_USER0); + nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax); if (!nested_vmcb) return false; @@ -1851,7 +1851,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) svm->vmcb->control.event_inj = nested_vmcb->control.event_inj; svm->vmcb->control.event_inj_err = nested_vmcb->control.event_inj_err; - nested_svm_unmap(nested_vmcb, KM_USER0); + nested_svm_unmap(nested_vmcb); enable_gif(svm); @@ -1884,12 +1884,12 @@ static int vmload_interception(struct vcpu_svm *svm) svm->next_rip = kvm_rip_read(&svm->vcpu) + 3; skip_emulated_instruction(&svm->vcpu); - nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, KM_USER0); + nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax); if (!nested_vmcb) return 1; nested_svm_vmloadsave(nested_vmcb, svm->vmcb); - nested_svm_unmap(nested_vmcb, KM_USER0); + nested_svm_unmap(nested_vmcb); return 1; } @@ -1904,12 +1904,12 @@ static int vmsave_interception(struct vcpu_svm *svm) svm->next_rip = 
kvm_rip_read(&svm->vcpu) + 3; skip_emulated_instruction(&svm->vcpu); - nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, KM_USER0); + nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax); if (!nested_vmcb) return 1; nested_svm_vmloadsav
[PATCH 09/10] KVM: SVM: Make lazy FPU switching work with nested svm
TDB. Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 43 +++ 1 files changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a64b871..ad419aa 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -973,6 +973,7 @@ static void svm_decache_cr4_guest_bits(struct kvm_vcpu *vcpu) static void update_cr0_intercept(struct vcpu_svm *svm) { + struct vmcb *vmcb = svm->vmcb; ulong gcr0 = svm->vcpu.arch.cr0; u64 *hcr0 = &svm->vmcb->save.cr0; @@ -984,11 +985,25 @@ static void update_cr0_intercept(struct vcpu_svm *svm) if (gcr0 == *hcr0 && svm->vcpu.fpu_active) { - svm->vmcb->control.intercept_cr_read &= ~INTERCEPT_CR0_MASK; - svm->vmcb->control.intercept_cr_write &= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_read &= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_write &= ~INTERCEPT_CR0_MASK; + if (is_nested(svm)) { + struct vmcb *hsave = svm->nested.hsave; + + hsave->control.intercept_cr_read &= ~INTERCEPT_CR0_MASK; + hsave->control.intercept_cr_write &= ~INTERCEPT_CR0_MASK; + vmcb->control.intercept_cr_read |= svm->nested.intercept_cr_read; + vmcb->control.intercept_cr_write |= svm->nested.intercept_cr_write; + } } else { svm->vmcb->control.intercept_cr_read |= INTERCEPT_CR0_MASK; svm->vmcb->control.intercept_cr_write |= INTERCEPT_CR0_MASK; + if (is_nested(svm)) { + struct vmcb *hsave = svm->nested.hsave; + + hsave->control.intercept_cr_read |= INTERCEPT_CR0_MASK; + hsave->control.intercept_cr_write |= INTERCEPT_CR0_MASK; + } } } @@ -1263,7 +1278,22 @@ static int ud_interception(struct vcpu_svm *svm) static void svm_fpu_activate(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); - svm->vmcb->control.intercept_exceptions &= ~(1 << NM_VECTOR); + u32 excp; + + if (is_nested(svm)) { + u32 h_excp, n_excp; + + h_excp = svm->nested.hsave->control.intercept_exceptions; + n_excp = svm->nested.intercept_exceptions; + h_excp &= ~(1 << NM_VECTOR); + excp= h_excp | n_excp; + } else { + excp = 
svm->vmcb->control.intercept_exceptions; + excp &= ~(1 << NM_VECTOR); + } + + svm->vmcb->control.intercept_exceptions = excp; + svm->vcpu.fpu_active = 1; update_cr0_intercept(svm); } @@ -1507,6 +1537,9 @@ static int nested_svm_exit_special(struct vcpu_svm *svm) if (!npt_enabled) return NESTED_EXIT_HOST; break; + case SVM_EXIT_EXCP_BASE + NM_VECTOR: + nm_interception(svm); + break; default: break; } @@ -2972,8 +3005,10 @@ static void svm_fpu_deactivate(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); - update_cr0_intercept(svm); svm->vmcb->control.intercept_exceptions |= 1 << NM_VECTOR; + if (is_nested(svm)) + svm->nested.hsave->control.intercept_exceptions |= 1 << NM_VECTOR; + update_cr0_intercept(svm); } static struct kvm_x86_ops svm_x86_ops = { -- 1.6.6 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/10] KVM: SVM: Fix wrong interrupt injection in enable_irq_windows
The nested_svm_intr() function does not execute the vmexit anymore. Therefore we may still be in the nested state after that function has run. This patch changes the nested_svm_intr() function to return whether the irq window could be enabled. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 17 - 1 files changed, 8 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 041ef6f..7c96b8b 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1389,16 +1389,17 @@ static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, return nested_svm_exit_handled(svm); } -static inline int nested_svm_intr(struct vcpu_svm *svm) +/* This function returns true if it is save to enable the irq window */ +static inline bool nested_svm_intr(struct vcpu_svm *svm) { if (!is_nested(svm)) - return 0; + return true; if (!(svm->vcpu.arch.hflags & HF_VINTR_MASK)) - return 0; + return true; if (!(svm->vcpu.arch.hflags & HF_HIF_MASK)) - return 0; + return false; svm->vmcb->control.exit_code = SVM_EXIT_INTR; @@ -1411,10 +1412,10 @@ static inline int nested_svm_intr(struct vcpu_svm *svm) */ svm->nested.exit_required = true; trace_kvm_nested_intr_vmexit(svm->vmcb->save.rip); - return 1; + return false; } - return 0; + return true; } static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa) @@ -2562,13 +2563,11 @@ static void enable_irq_window(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); - nested_svm_intr(svm); - /* In case GIF=0 we can't rely on the CPU to tell us when * GIF becomes 1, because that's a separate STGI/VMRUN intercept. * The next time we get that intercept, this function will be * called again though and we'll get the vintr intercept.
*/ - if (gif_set(svm)) { + if (gif_set(svm) && nested_svm_intr(svm)) { svm_set_vintr(svm); svm_inject_irq(svm, 0x0); } -- 1.6.6 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/10] KVM: SVM: Activate nested state only when guest state is complete
Certain functions called during the emulated world switch behave differently when the vcpu is running nested. This is not the expected behavior during a world switch emulation. This patch ensures that the nested state is activated only if the vcpu is completely in nested state. Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 15 +-- 1 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 2a3d525..a64b871 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1631,6 +1631,9 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) if (!nested_vmcb) return 1; + /* Exit nested SVM mode */ + svm->nested.vmcb = 0; + /* Give the current vmcb to the guest */ disable_gif(svm); @@ -1718,9 +1721,6 @@ static int nested_svm_vmexit(struct vcpu_svm *svm) svm->vmcb->save.cpl = 0; svm->vmcb->control.exit_int_info = 0; - /* Exit nested SVM mode */ - svm->nested.vmcb = 0; - nested_svm_unmap(nested_vmcb); kvm_mmu_reset_context(&svm->vcpu); @@ -1753,14 +1753,14 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) struct vmcb *nested_vmcb; struct vmcb *hsave = svm->nested.hsave; struct vmcb *vmcb = svm->vmcb; + u64 vmcb_gpa; + + vmcb_gpa = svm->vmcb->save.rax; nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax); if (!nested_vmcb) return false; - /* nested_vmcb is our indicator if nested SVM is activated */ - svm->nested.vmcb = svm->vmcb->save.rax; - trace_kvm_nested_vmrun(svm->vmcb->save.rip - 3, svm->nested.vmcb, nested_vmcb->save.rip, nested_vmcb->control.int_ctl, @@ -1875,6 +1875,9 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) nested_svm_unmap(nested_vmcb); + /* nested_vmcb is our indicator if nested SVM is activated */ + svm->nested.vmcb = vmcb_gpa; + enable_gif(svm); return true; -- 1.6.6
[PATCH 06/10] KVM: SVM: Fix nested msr intercept handling
The nested_svm_exit_handled_msr() function maps only one page of the guest's msr permission bitmap. This patch changes the code to use kvm_read_guest to fix the bug. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 12 +++- 1 files changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 3f59cbd..cbf798f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1457,16 +1457,11 @@ static bool nested_svm_exit_handled_msr(struct vcpu_svm *svm) u32 msr = svm->vcpu.arch.regs[VCPU_REGS_RCX]; bool ret = false; u32 t0, t1; - u8 *msrpm; + u8 val; if (!(svm->nested.intercept & (1ULL << INTERCEPT_MSR_PROT))) return false; - msrpm = nested_svm_map(svm, svm->nested.vmcb_msrpm); - - if (!msrpm) - goto out; - switch (msr) { case 0 ... 0x1fff: t0 = (msr * 2) % 8; @@ -1487,11 +1482,10 @@ static bool nested_svm_exit_handled_msr(struct vcpu_svm *svm) goto out; } - ret = msrpm[t1] & ((1 << param) << t0); + if (!kvm_read_guest(svm->vcpu.kvm, svm->nested.vmcb_msrpm + t1, &val, 1)) + ret = val & ((1 << param) << t0); out: - nested_svm_unmap(msrpm); - return ret; } -- 1.6.6
[PATCH 10/10] KVM: SVM: Remove newlines from nested trace points
The tracing infrastructure adds its own newlines. Remove them from the trace point printk format strings. Signed-off-by: Joerg Roedel --- arch/x86/kvm/trace.h | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index 6ad30a2..12f8d2d 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -413,7 +413,7 @@ TRACE_EVENT(kvm_nested_vmrun, ), TP_printk("rip: 0x%016llx vmcb: 0x%016llx nrip: 0x%016llx int_ctl: 0x%08x " - "event_inj: 0x%08x npt: %s\n", + "event_inj: 0x%08x npt: %s", __entry->rip, __entry->vmcb, __entry->nested_rip, __entry->int_ctl, __entry->event_inj, __entry->npt ? "on" : "off") @@ -447,7 +447,7 @@ TRACE_EVENT(kvm_nested_vmexit, __entry->exit_int_info_err = exit_int_info_err; ), TP_printk("rip: 0x%016llx reason: %s ext_inf1: 0x%016llx " - "ext_inf2: 0x%016llx ext_int: 0x%08x ext_int_err: 0x%08x\n", + "ext_inf2: 0x%016llx ext_int: 0x%08x ext_int_err: 0x%08x", __entry->rip, ftrace_print_symbols_seq(p, __entry->exit_code, kvm_x86_ops->exit_reasons_str), @@ -482,7 +482,7 @@ TRACE_EVENT(kvm_nested_vmexit_inject, ), TP_printk("reason: %s ext_inf1: 0x%016llx " - "ext_inf2: 0x%016llx ext_int: 0x%08x ext_int_err: 0x%08x\n", + "ext_inf2: 0x%016llx ext_int: 0x%08x ext_int_err: 0x%08x", ftrace_print_symbols_seq(p, __entry->exit_code, kvm_x86_ops->exit_reasons_str), __entry->exit_info1, __entry->exit_info2, @@ -504,7 +504,7 @@ TRACE_EVENT(kvm_nested_intr_vmexit, __entry->rip= rip ), - TP_printk("rip: 0x%016llx\n", __entry->rip) + TP_printk("rip: 0x%016llx", __entry->rip) ); /* @@ -526,7 +526,7 @@ TRACE_EVENT(kvm_invlpga, __entry->address= address; ), - TP_printk("rip: 0x%016llx asid: %d address: 0x%016llx\n", + TP_printk("rip: 0x%016llx asid: %d address: 0x%016llx", __entry->rip, __entry->asid, __entry->address) ); @@ -547,7 +547,7 @@ TRACE_EVENT(kvm_skinit, __entry->slb= slb; ), - TP_printk("rip: 0x%016llx slb: 0x%08x\n", + TP_printk("rip: 0x%016llx slb: 0x%08x", __entry->rip, __entry->slb) ); 
-- 1.6.6
[PATCH 07/10] KVM: SVM: Don't sync nested cr8 to lapic and back
This patch makes syncing of the guest tpr to the lapic conditional on !nested. Otherwise a nested guest using the TPR could freeze the guest. Another important change this patch introduces is that the cr8 intercept bits are no longer ORed at vmrun emulation if the guest sets VINTR_MASKING in its VMCB. The reason is that nested cr8 accesses must always be handled by the nested hypervisor because they change the shadow version of the tpr. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 46 +++--- 1 files changed, 31 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index cbf798f..2a3d525 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1828,21 +1828,6 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) svm->vmcb->save.dr6 = nested_vmcb->save.dr6; svm->vmcb->save.cpl = nested_vmcb->save.cpl; - /* We don't want a nested guest to be more powerful than the guest, - so all intercepts are ORed */ - svm->vmcb->control.intercept_cr_read |= - nested_vmcb->control.intercept_cr_read; - svm->vmcb->control.intercept_cr_write |= - nested_vmcb->control.intercept_cr_write; - svm->vmcb->control.intercept_dr_read |= - nested_vmcb->control.intercept_dr_read; - svm->vmcb->control.intercept_dr_write |= - nested_vmcb->control.intercept_dr_write; - svm->vmcb->control.intercept_exceptions |= - nested_vmcb->control.intercept_exceptions; - - svm->vmcb->control.intercept |= nested_vmcb->control.intercept; - svm->nested.vmcb_msrpm = nested_vmcb->control.msrpm_base_pa; /* cache intercepts */ @@ -1860,6 +1845,28 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm) else svm->vcpu.arch.hflags &= ~HF_VINTR_MASK; + if (svm->vcpu.arch.hflags & HF_VINTR_MASK) { + /* We only want the cr8 intercept bits of the guest */ + svm->vmcb->control.intercept_cr_read &= ~INTERCEPT_CR8_MASK; + svm->vmcb->control.intercept_cr_write &= ~INTERCEPT_CR8_MASK; + } + + /* We don't want a nested guest to be more powerful than the guest, + so all
intercepts are ORed */ + svm->vmcb->control.intercept_cr_read |= + nested_vmcb->control.intercept_cr_read; + svm->vmcb->control.intercept_cr_write |= + nested_vmcb->control.intercept_cr_write; + svm->vmcb->control.intercept_dr_read |= + nested_vmcb->control.intercept_dr_read; + svm->vmcb->control.intercept_dr_write |= + nested_vmcb->control.intercept_dr_write; + svm->vmcb->control.intercept_exceptions |= + nested_vmcb->control.intercept_exceptions; + + svm->vmcb->control.intercept |= nested_vmcb->control.intercept; + + svm->vmcb->control.lbr_ctl = nested_vmcb->control.lbr_ctl; svm->vmcb->control.int_vector = nested_vmcb->control.int_vector; svm->vmcb->control.int_state = nested_vmcb->control.int_state; svm->vmcb->control.tsc_offset += nested_vmcb->control.tsc_offset; @@ -2520,6 +2527,9 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) { struct vcpu_svm *svm = to_svm(vcpu); + if (is_nested(svm) && (vcpu->arch.hflags & HF_VINTR_MASK)) + return; + if (irr == -1) return; @@ -2621,6 +2631,9 @@ static inline void sync_cr8_to_lapic(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); + if (is_nested(svm) && (vcpu->arch.hflags & HF_VINTR_MASK)) + return; + if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR8_MASK)) { int cr8 = svm->vmcb->control.int_ctl & V_TPR_MASK; kvm_set_cr8(vcpu, cr8); @@ -2632,6 +2645,9 @@ static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu) struct vcpu_svm *svm = to_svm(vcpu); u64 cr8; + if (is_nested(svm) && (vcpu->arch.hflags & HF_VINTR_MASK)) + return; + cr8 = kvm_get_cr8(vcpu); svm->vmcb->control.int_ctl &= ~V_TPR_MASK; svm->vmcb->control.int_ctl |= cr8 & V_TPR_MASK; -- 1.6.6 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/10] KVM: SVM: Fix schedule-while-atomic on nested exception handling
Move the actual vmexit routine out of code that runs with irqs and preemption disabled. Cc: sta...@kernel.org Signed-off-by: Joerg Roedel --- arch/x86/kvm/svm.c | 20 +--- 1 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 7c96b8b..25d26ec 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -128,6 +128,7 @@ static void svm_flush_tlb(struct kvm_vcpu *vcpu); static void svm_complete_interrupts(struct vcpu_svm *svm); static int nested_svm_exit_handled(struct vcpu_svm *svm); +static int nested_svm_exit_handled_atomic(struct vcpu_svm *svm); static int nested_svm_vmexit(struct vcpu_svm *svm); static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, bool has_error_code, u32 error_code); @@ -1386,7 +1387,7 @@ static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr, svm->vmcb->control.exit_info_1 = error_code; svm->vmcb->control.exit_info_2 = svm->vcpu.arch.cr2; - return nested_svm_exit_handled(svm); + return nested_svm_exit_handled_atomic(svm); } /* This function returns true if it is save to enable the irq window */ @@ -1520,7 +1521,7 @@ static int nested_svm_exit_special(struct vcpu_svm *svm) /* * If this function returns true, this #vmexit was already handled */ -static int nested_svm_exit_handled(struct vcpu_svm *svm) +static int __nested_svm_exit_handled(struct vcpu_svm *svm, bool atomic) { u32 exit_code = svm->vmcb->control.exit_code; int vmexit = NESTED_EXIT_HOST; @@ -1567,12 +1568,25 @@ static int nested_svm_exit_handled(struct vcpu_svm *svm) } if (vmexit == NESTED_EXIT_DONE) { - nested_svm_vmexit(svm); + if (!atomic) + nested_svm_vmexit(svm); + else + svm->nested.exit_required = true; } return vmexit; } +static int nested_svm_exit_handled(struct vcpu_svm *svm) +{ + return __nested_svm_exit_handled(svm, false); +} + +static int nested_svm_exit_handled_atomic(struct vcpu_svm *svm) +{ + return __nested_svm_exit_handled(svm, true); +} + static inline void 
copy_vmcb_control_area(struct vmcb *dst_vmcb, struct vmcb *from_vmcb) { struct vmcb_control_area *dst = &dst_vmcb->control; -- 1.6.6
[PATCH 0/10] Nested SVM fixes (and Win7-64bit bringup)
Hi, here are a couple of fixes for the nested SVM implementation. I collected these fixes mostly while trying to get Windows 7 64bit running as an L2 guest. The most important fixes in this set make lazy FPU switching work with nested SVM and repair the nested TPR handling. Without the latter fix the L1 guest freezes when trying to run Win7 as an L2 guest. Please review and comment on these patches :-) Joerg

Diffstat: arch/x86/kvm/svm.c | 187 ++ arch/x86/kvm/trace.h | 12 ++-- 2 files changed, 133 insertions(+), 66 deletions(-)

Shortlog: Joerg Roedel (10):
KVM: SVM: Don't use kmap_atomic in nested_svm_map
KVM: SVM: Fix wrong interrupt injection in enable_irq_windows
KVM: SVM: Fix schedule-while-atomic on nested exception handling
KVM: SVM: Sync all control registers on nested vmexit
KVM: SVM: Annotate nested_svm_map with might_sleep()
KVM: SVM: Fix nested msr intercept handling
KVM: SVM: Don't sync nested cr8 to lapic and back
KVM: SVM: Activate nested state only when guest state is complete
KVM: SVM: Make lazy FPU switching work with nested svm
KVM: SVM: Remove newlines from nested trace points
Re: qemu-kvm-0.12.2 hangs when booting grub, when kvm is disabled
Jacques Landru wrote: > Hi, > > Same problem here > > qemu-kvm-0.12.x hangs if I have at the same time "-no-kvm" and > "file=essai-slitaz.raw,if=ide,index=0,boot=on" sometime with the > message below > > but > > qemu-kvm with "-no-kvm" and without "boot=on" option for the file > parameter, works > qemu-kvm with "boot=on" option and kvm enable works. (kvm-kmod is 2.6.32.7) I can confirm this issue: something badly crashes here as well, either grub, the extboot ROM, or SeaBIOS. Does anyone have a good idea what makes the difference here, i.e. where to start debugging? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
- "Lucas Meneghel Rodrigues" wrote: > On Sun, 2010-02-14 at 12:07 -0500, Michael Goldish wrote: > > - "Lucas Meneghel Rodrigues" wrote: > > > > > As our configuration system generates a list of dicts > > > with test parameters, and that list might be potentially > > > *very* large, keeping all this information in memory might > > > be a problem for smaller virtualization hosts due to > > > the memory pressure created. Tests made on my 4GB laptop > > > show that most of the memory is being used during a > > > typical kvm autotest session. > > > > > > So, instead of keeping all this information in memory, > > > let's take a different approach and unfold all the > > > tests generated by the config system and generate a > > > control file: > > > > > > job.run_test('kvm', params={param1, param2, ...}, tag='foo', ...) > > > job.run_test('kvm', params={param1, param2, ...}, tag='bar', ...) > > > > > > By dumping all the dicts that were before in the memory to > > > a control file, the memory usage of a typical kvm autotest > > > session is drastically reduced making it easier to run in smaller > > > virt hosts. > > > > > > The advantages of taking this new approach are: > > > * You can see what tests are going to run and the dependencies > > >between them by looking at the generated control file > > > * The control file is all ready to use, you can for example > > >paste it on the web interface and profit > > > * As mentioned, a lot less memory consumption, avoiding > > >memory pressure on virtualization hosts. > > > > > > This is a crude 1st pass at implementing this approach, so please > > > provide comments. > > > > > > Signed-off-by: Lucas Meneghel Rodrigues > > > --- > > > > Interesting idea! > > > > - Personally I don't like the renaming of kvm_config.py to > > generate_control.py, and prefer to keep them separate, so that > > generate_control.py has the create_control() function and > > kvm_config.py has everything else. 
It's just a matter of naming; > > kvm_config.py deals mostly with config files, not with control > files, > > and it can be used for other purposes than generating control > files. > > Fair enough, no problem. > > > - I wonder why so much memory is used by the test list. Our daily > > test sets aren't very big, so although the parser should use a huge > > amount of memory while parsing, nearly all of that memory should be > > freed by the time the parser is done, because the final 'only' > > statement reduces the number of tests to a small fraction of the > total > > number in a full set. What test set did you try with that 4 GB > > machine, and how much memory was used by the test list? If a > > ridiculous amount of memory was used, this might indicate a bug in > > kvm_config.py (maybe it keeps references to deleted tests, forcing > > them to stay in memory). > > This problem wasn't found during the daily test routine, rather it was > a > comment I heard from Naphtali about the typical autotest memory > usage. > Also Marcelo made a similar comment, so I thought it was a problem > worth > looking. I tried to run the default test set that we selected for > upstream (3 resulting dicts) on my 4GB RAM laptop, here are my > findings: > > * Before autotest usage: Around 20% of memory used, 10% used as > cache. > * During autotest usage: About 99% of memory used, 27% used as > cache. Before autotest usage, were there any VMs running? 3 dicts can't possibly take up so much space. If it is indeed kvm_config's fault (which I doubt), there's probably a bug in it that prevents it from freeing unused memory, and once we fix that bug the problem should be gone. > So yes, there's a significant memory usage increase, that doesn't > happen > using a "flat", autogenerated control file. Sure it doesn't make my > laptop crawl, but it is a *lot* of resource usage anyway. > > Also, let's assume that for small test sets, we can can reclaim all > memory back. 
Still we have to consider large test sets. I am all for > profiling the memory usage and fix eventual bugs, but we need to keep > in > mind that one might want to run large test sets, and large test sets > imply keeping a fairly large amount of data in memory. If the amount > of > memory is negligible on most use cases, then let's just fix bugs and > forget about using the proposed approach. > > Also, a "flat" control file is quicker to run, because there's no > parsing of the config file happening in there. So, this control file Agreed, but on the other hand, the static control file idea introduces an extra preprocessing step (not necessarily bad). > generation thing makes some sense, that's why I decided to code this > 1st pass attempt at doing it. > > > - I don't think this approach will work for control.parallel, > because > > the tests have to be assigned dynamically to available queues, and > > AFAIK this can't be done by a simple static control file. > > Not necessarily, as the control file is a
Re: [PATCH 0/4] More emulator correctness fixes
Gleb Natapov wrote: This patch series adds proper permission checking during segment selector loading. Some missing fault injections are added. Gleb Natapov (2): KVM: Forbid modifying CS segment register by mov instruction. KVM: Fix segment descriptor loading. Takuya Yoshikawa (2): KVM: Fix load_guest_segment_descriptor() to inject page fault KVM: Fix emulate_sys[call, enter, exit]()'s fault handling Thank you for rebasing my work and making the fix more comprehensive. arch/x86/include/asm/kvm_host.h | 3 +- arch/x86/kvm/emulate.c | 71 +++ arch/x86/kvm/x86.c | 190 +++ 3 files changed, 186 insertions(+), 78 deletions(-)
Re: Poor performance with KVM, how to figure out why?
On Thursday 18 February 2010 11:31:36 you wrote: > Hi, sorry about the lengthy e-mail. Hi, are you sure the kvm-intel kernel module is loaded? What is the output of "lsmod" ? Any useful kernel messages on the host or the VMs? What's the output of "dmesg"? Cheers, Thomas > We've been evaluating KVM for a while. Currently the host is on > 2.6.30-bpo.1-amd64, 4 CPU cores on an Intel Xeon 2,33. Disk controller is > Areca ARC-1210 and the machine has 12GB of memory. KVM 85+dfsg-4~bpo50+1 > and libvirt 0.6.5-3~bpo50+1, both from backports.org. Guests are in qcow2 > images. > > A few test servers have been running here for a while. That worked ok, > so we've moved a few production servers on it as well. It's now running > 8 guests, none of them are CPU or disk intensive (well, there are a mail > server and web server there, which from time to time spike, but it's > generally very low). > > After a reboot the other day, performance is suddenly disaster. The only > change we can see we've done is that we've allocated a bit more memory > to the guests, and enabled 4 vcpus on all guests (some of them ran with > 1 vcpu before). When I say performance is bad, it's to the point where > typing on the keyboard is lagging. It seems load on one guest affects all > of the others. > > What is weird is that before the reboot, the host machine usually had a > system load of about 0.30 on average, and a CPU load of 20-30% (total of > all cores). 
After the reboot, this is a typical top output: > > top - 11:17:49 up 1 day, 5:32, 3 users, load average: 3.81, 3.85, 3.96 > Tasks: 113 total, 4 running, 109 sleeping, 0 stopped, 0 zombie > Cpu0 : 93.7%us, 3.6%sy, 0.0%ni, 0.7%id, 1.7%wa, 0.3%hi, 0.0%si, > 0.0%st Cpu1 : 96.3%us, 3.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, > 0.0%si, 0.0%st Cpu2 : 93.7%us, 5.0%sy, 0.0%ni, 1.0%id, 0.0%wa, > 0.0%hi, 0.3%si, 0.0%st Cpu3 : 91.4%us, 5.6%sy, 0.0%ni, 2.7%id, > 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Mem: 12335492k total, 12257056k used, > 78436k free, 24k buffers Swap: 7807580k total, 744584k used, > 7062996k free, 4927212k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND > 3398 root 20 0 2287m 1.9g 680 S 149 16.5 603:10.92 kvm > 5041 root 20 0 2255m 890m 540 R 99 7.4 603:38.07 kvm > 5055 root 20 0 2272m 980m 668 S 86 8.1 305:42.95 kvm > 5095 root 20 0 2287m 1.9g 532 R 33 16.6 655:11.53 kvm > 5073 root 20 0 2253m 435m 532 S 19 3.6 371:59.80 kvm > 3334 root 20 0 2254m 66m 532 S6 0.5 106:58.20 kvm > > Now this is the weird part: The guests are (really!) doing nothing. > Before this started, each guest's load were typically 0.02 - 0.30. Now > their load is suddenly 2.x and in top, even simple CPU processes like > syslogd uses 20% CPU. > > It _might_ seem like an i/o problem, because disk performance seems bad > on all guests. find / would ususally fly by, now you'd see a bit lag'ish > output (I know, bad performance test). > > The host machine seems fast&fine, except it has a system load of about > 2-6. It seems snappy, though. 
Here you have some info like iostat etc: > > # iostat -kdxx1 > > inux 2.6.30-bpo.1-amd64 (cf01) 02/18/2010 _x86_64_ > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await svctm %util sda 1.7318.91 29.89 > 18.40 855.89 454.1754.26 0.489.98 2.23 10.75 sda1 >0.3516.17 29.58 18.24 849.11 442.5854.03 0.47 > 9.79 2.20 10.52 sda2 1.38 2.740.310.16 > 6.7811.5977.44 0.01 29.14 13.60 0.64 > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await svctm %util sda 6.00 0.001.00 > 0.00 4.00 0.00 8.00 0.310.00 308.00 30.80 sda1 > 0.00 0.000.000.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 sda2 6.00 0.001.000.00 > 4.00 0.00 8.00 0.310.00 308.00 30.80 > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await svctm %util sda 0.00 0.001.00 > 0.0028.00 0.0056.00 0.63 936.00 628.00 62.80 sda1 > 0.00 0.000.000.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 sda2 0.00 0.001.000.00 > 28.00 0.0056.00 0.63 936.00 628.00 62.80 > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await svctm %util sda 0.00 0.004.00 > 0.00 228.00 0.00 114.00 0.049.00 5.00 2.00 sda1 > 0.00 0.004.000.00 228.00 0.00 114.00 0.04 > 9.00 5.00 2.00 sda2 0.00 0.000.000.00 > 0.00 0.00 0.00
Re: Poor performance with KVM, how to figure out why?
On 02/18/2010 12:31 PM, Vegard Svanberg wrote: Hi, sorry about the lengthy e-mail. We've been evaluating KVM for a while. Currently the host is on 2.6.30-bpo.1-amd64, 4 CPU cores on an Intel Xeon 2,33. Disk controller is Areca ARC-1210 and the machine has 12GB of memory. KVM 85+dfsg-4~bpo50+1 and libvirt 0.6.5-3~bpo50+1, both from backports.org. Guests are in qcow2 images. These are all really old. A few test servers have been running here for a while. That worked ok, so we've moved a few production servers on it as well. It's now running 8 guests, none of them are CPU or disk intensive (well, there are a mail server and web server there, which from time to time spike, but it's generally very low). After a reboot the other day, performance is suddenly disaster. The only change we can see we've done is that we've allocated a bit more memory to the guests, and enabled 4 vcpus on all guests (some of them ran with 1 vcpu before). When I say performance is bad, it's to the point where typing on the keyboard is lagging. It seems load on one guest affects all of the others. What is weird is that before the reboot, the host machine usually had a system load of about 0.30 on average, and a CPU load of 20-30% (total of all cores). After the reboot, this is a typical top output: top - 11:17:49 up 1 day, 5:32, 3 users, load average: 3.81, 3.85, 3.96 Tasks: 113 total, 4 running, 109 sleeping, 0 stopped, 0 zombie Cpu0 : 93.7%us, 3.6%sy, 0.0%ni, 0.7%id, 1.7%wa, 0.3%hi, 0.0%si, 0.0%st Cpu1 : 96.3%us, 3.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 93.7%us, 5.0%sy, 0.0%ni, 1.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu3 : 91.4%us, 5.6%sy, 0.0%ni, 2.7%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Mem: 12335492k total, 12257056k used,78436k free, 24k buffers Swap: 7807580k total, 744584k used, 7062996k free, 4927212k cached Looks like you're running into swap. Does 'vmstat 1' show swap activity? 
Try dropping the vcpu count and memory back to initial levels, separately, to see which triggers the bad behaviour. PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 3398 root 20 0 2287m 1.9g 680 S 149 16.5 603:10.92 kvm 5041 root 20 0 2255m 890m 540 R 99 7.4 603:38.07 kvm 5055 root 20 0 2272m 980m 668 S 86 8.1 305:42.95 kvm 5095 root 20 0 2287m 1.9g 532 R 33 16.6 655:11.53 kvm 5073 root 20 0 2253m 435m 532 S 19 3.6 371:59.80 kvm 3334 root 20 0 2254m 66m 532 S6 0.5 106:58.20 kvm None of the RSS figures are nice round numbers, which might mean some of guest memory is swapped out. (note: you can run kvm as non-root). Now this is the weird part: The guests are (really!) doing nothing. Before this started, each guest's load were typically 0.02 - 0.30. Now their load is suddenly 2.x and in top, even simple CPU processes like syslogd uses 20% CPU. That may also be an indication of swap. When the guest accesses a swapped out page, the time to swap in the page is accounted to the instruction that caused the access. # vmstat procs ---memory-- ---swap-- -io -system-- cpu r b swpd free buff cache si sobibo in cs us sy id wa 4 0 747292 98584 24 491236823 207 1108 14 35 5 58 2 'vmstat' rate data is from machine boot up. 'vmstat 1' gives current rates. 
kvm statistics efer_reload 2493 0 exits 7012998022 86420 fpu_reload 1079562451121 halt_exits 839269827 10930 halt_wakeup 795288051364 host_state_reload 1159155068 15293 hypercalls 1471039754 17008 insn_emulation 2782749902 35121 insn_emulation_fail 0 0 invlpg 1721196871754 io_exits 1294826882084 irq_exits4555154344884 irq_injections 973172925 12423 irq_window41631517 635 largepages 0 0 mmio_exits 941756 0 mmu_cache_miss74512394 849 mmu_flooded5132926 41 mmu_pde_zapped40341877 356 mmu_pte_updated 1150029759 13443 mmu_pte_write 2184765599 27182 mmu_recycled 52261 0 mmu_shadow_zapped 74494953 766 mmu_unsync 390 23 mmu_unsync_global0 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 4700579395144 pf_guest 4638018765900 remote_tlb_flush 1287650571024 request_irq 0 0 request_nmi 0 0 signal_exits 0 0 tlb_flush 1528830191 17996 This is reasonable for a 4-cpu host under moderate load. -- error compiling committee.c: too many arguments to fu
Re: [PATCH 0/4] More emulator correctness fixes
On 02/18/2010 12:14 PM, Gleb Natapov wrote: This patch series adds proper permission checking during segment selector loading. Some missing fault injections are added. Thanks, applied. -- error compiling committee.c: too many arguments to function
Poor performance with KVM, how to figure out why?
Hi, sorry about the lengthy e-mail. We've been evaluating KVM for a while. Currently the host is on 2.6.30-bpo.1-amd64, 4 CPU cores on an Intel Xeon 2,33. Disk controller is Areca ARC-1210 and the machine has 12GB of memory. KVM 85+dfsg-4~bpo50+1 and libvirt 0.6.5-3~bpo50+1, both from backports.org. Guests are in qcow2 images. A few test servers have been running here for a while. That worked ok, so we've moved a few production servers on it as well. It's now running 8 guests, none of them are CPU or disk intensive (well, there are a mail server and web server there, which from time to time spike, but it's generally very low). After a reboot the other day, performance is suddenly disaster. The only change we can see we've done is that we've allocated a bit more memory to the guests, and enabled 4 vcpus on all guests (some of them ran with 1 vcpu before). When I say performance is bad, it's to the point where typing on the keyboard is lagging. It seems load on one guest affects all of the others. What is weird is that before the reboot, the host machine usually had a system load of about 0.30 on average, and a CPU load of 20-30% (total of all cores). 
After the reboot, this is a typical top output:

top - 11:17:49 up 1 day, 5:32, 3 users, load average: 3.81, 3.85, 3.96
Tasks: 113 total, 4 running, 109 sleeping, 0 stopped, 0 zombie
Cpu0 : 93.7%us, 3.6%sy, 0.0%ni, 0.7%id, 1.7%wa, 0.3%hi, 0.0%si, 0.0%st
Cpu1 : 96.3%us, 3.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 93.7%us, 5.0%sy, 0.0%ni, 1.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu3 : 91.4%us, 5.6%sy, 0.0%ni, 2.7%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 12335492k total, 12257056k used, 78436k free, 24k buffers
Swap: 7807580k total, 744584k used, 7062996k free, 4927212k cached

 PID USER  PR NI  VIRT  RES SHR S %CPU %MEM     TIME+ COMMAND
3398 root  20  0 2287m 1.9g 680 S  149 16.5 603:10.92 kvm
5041 root  20  0 2255m 890m 540 R   99  7.4 603:38.07 kvm
5055 root  20  0 2272m 980m 668 S   86  8.1 305:42.95 kvm
5095 root  20  0 2287m 1.9g 532 R   33 16.6 655:11.53 kvm
5073 root  20  0 2253m 435m 532 S   19  3.6 371:59.80 kvm
3334 root  20  0 2254m  66m 532 S    6  0.5 106:58.20 kvm

Now this is the weird part: the guests are (really!) doing nothing. Before this started, each guest's load was typically 0.02 - 0.30. Now their load is suddenly 2.x and in top, even simple processes like syslogd use 20% CPU. It _might_ seem like an i/o problem, because disk performance seems bad on all guests. find / would usually fly by, now you'd see a bit lag'ish output (I know, bad performance test). The host machine seems fast&fine, except it has a system load of about 2-6. It seems snappy, though.
Here you have some info like iostat etc: # iostat -kdxx1 inux 2.6.30-bpo.1-amd64 (cf01) 02/18/2010 _x86_64_ Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sda 1.7318.91 29.89 18.40 855.89 454.1754.26 0.489.98 2.23 10.75 sda1 0.3516.17 29.58 18.24 849.11 442.5854.03 0.479.79 2.20 10.52 sda2 1.38 2.740.310.16 6.7811.5977.44 0.01 29.14 13.60 0.64 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sda 6.00 0.001.000.00 4.00 0.00 8.00 0.310.00 308.00 30.80 sda1 0.00 0.000.000.00 0.00 0.00 0.00 0.000.00 0.00 0.00 sda2 6.00 0.001.000.00 4.00 0.00 8.00 0.310.00 308.00 30.80 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.001.000.0028.00 0.0056.00 0.63 936.00 628.00 62.80 sda1 0.00 0.000.000.00 0.00 0.00 0.00 0.000.00 0.00 0.00 sda2 0.00 0.001.000.0028.00 0.0056.00 0.63 936.00 628.00 62.80 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.004.000.00 228.00 0.00 114.00 0.049.00 5.00 2.00 sda1 0.00 0.004.000.00 228.00 0.00 114.00 0.049.00 5.00 2.00 sda2 0.00 0.000.000.00 0.00 0.00 0.00 0.000.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await svctm %util sda 6.00 0.003.000.0036.00 0.0024.00 0.06 21.33 14.67 4.40 sda1 0.00 0.001.000.00 4.00 0.00 8.00 0.02 24.00 24.00 2.40 sda2 6.00 0.002.00
Re: [Qemu-devel] Re: [PATCH v2] qemu-kvm: Speed up of the dirty-bitmap-traveling
On 18.02.2010, at 06:57, OHMURA Kei wrote: >> "We think"? I mean - yes, I think so too. But have you actually measured >> it? >> How much improvement are we talking here? >> Is it still faster when a bswap is involved? > Thanks for pointing out. > I will post the data for x86 later. > However, I don't have a test environment to check the impact of bswap. > Would you please measure the run time between the following section if > possible? It'd make more sense to have a real stand alone test program, no? I can try to write one today, but I have some really nasty important bugs to fix first. >>> >>> OK. I will prepare a test code with sample data. Since I found a ppc >>> machine around, I will run the code and post the results of >>> x86 and ppc. >>> >>> >>> By the way, the following data is a result of x86 measured in QEMU/KVM. >>> This data shows, how many times the function is called (#called), runtime >>> of original function (orig.), runtime of this patch (patch), speedup ratio >>> (ratio). >> That does indeed look promising! >> Thanks for doing this micro-benchmark. I just want to be 100% sure that it >> doesn't affect performance for big endian badly. > > > I measured runtime of the test code with sample data. My test environment > and results are described below. > > x86 Test Environment: > CPU: 4x Intel Xeon Quad Core 2.66GHz > Mem size: 6GB > > ppc Test Environment: > CPU: 2x Dual Core PPC970MP > Mem size: 2GB > > The sample data of dirty bitmap was produced by QEMU/KVM while the guest OS > was live migrating. To measure the runtime I copied cpu_get_real_ticks() of > QEMU to my test program.
>
> Experimental results:
> Test1: Guest OS read 3GB file, which is bigger than memory.
>         orig.(msec)  patch(msec)  ratio
> x86     0.3          0.1          6.4
> ppc     7.9          2.7          3.0
>
> Test2: Guest OS read/write 3GB file, which is bigger than memory.
>         orig.(msec)  patch(msec)  ratio
> x86     12.0         3.2          3.7
> ppc     251.1        123          2.0
>
> I also measured the runtime of bswap itself on ppc, and I found it was only
> just 0.3% ~ 0.7% of the runtime described above.

Awesome! Thank you so much for giving actual data to make me feel comfortable with it :-).

Alex
Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
On Sun, 2010-02-14 at 12:07 -0500, Michael Goldish wrote: > - "Lucas Meneghel Rodrigues" wrote: > > > As our configuration system generates a list of dicts > > with test parameters, and that list might be potentially > > *very* large, keeping all this information in memory might > > be a problem for smaller virtualization hosts due to > > the memory pressure created. Tests made on my 4GB laptop > > show that most of the memory is being used during a > > typical kvm autotest session. > > > > So, instead of keeping all this information in memory, > > let's take a different approach and unfold all the > > tests generated by the config system and generate a > > control file: > > > > job.run_test('kvm', params={param1, param2, ...}, tag='foo', ...) > > job.run_test('kvm', params={param1, param2, ...}, tag='bar', ...) > > > > By dumping all the dicts that were before in the memory to > > a control file, the memory usage of a typical kvm autotest > > session is drastically reduced making it easier to run in smaller > > virt hosts. > > > > The advantages of taking this new approach are: > > * You can see what tests are going to run and the dependencies > >between them by looking at the generated control file > > * The control file is all ready to use, you can for example > >paste it on the web interface and profit > > * As mentioned, a lot less memory consumption, avoiding > >memory pressure on virtualization hosts. > > > > This is a crude 1st pass at implementing this approach, so please > > provide comments. > > > > Signed-off-by: Lucas Meneghel Rodrigues > > --- > > Interesting idea! > > - Personally I don't like the renaming of kvm_config.py to > generate_control.py, and prefer to keep them separate, so that > generate_control.py has the create_control() function and > kvm_config.py has everything else. 
It's just a matter of naming; > kvm_config.py deals mostly with config files, not with control files, > and it can be used for other purposes than generating control files. Fair enough, no problem. > - I wonder why so much memory is used by the test list. Our daily > test sets aren't very big, so although the parser should use a huge > amount of memory while parsing, nearly all of that memory should be > freed by the time the parser is done, because the final 'only' > statement reduces the number of tests to a small fraction of the total > number in a full set. What test set did you try with that 4 GB > machine, and how much memory was used by the test list? If a > ridiculous amount of memory was used, this might indicate a bug in > kvm_config.py (maybe it keeps references to deleted tests, forcing > them to stay in memory). This problem wasn't found during the daily test routine, rather it was a comment I heard from Naphtali about the typical autotest memory usage. Also Marcelo made a similar comment, so I thought it was a problem worth looking into. I tried to run the default test set that we selected for upstream (3 resulting dicts) on my 4GB RAM laptop, here are my findings: * Before autotest usage: Around 20% of memory used, 10% used as cache. * During autotest usage: About 99% of memory used, 27% used as cache. So yes, there's a significant memory usage increase, which doesn't happen using a "flat", autogenerated control file. Sure it doesn't make my laptop crawl, but it is a *lot* of resource usage anyway. Also, let's assume that for small test sets, we can reclaim all memory back. Still we have to consider large test sets. I am all for profiling the memory usage and fixing any bugs, but we need to keep in mind that one might want to run large test sets, and large test sets imply keeping a fairly large amount of data in memory. If the amount of memory is negligible in most use cases, then let's just fix bugs and forget about using the proposed approach.
Also, a "flat" control file is quicker to run, because there's no parsing of the config file happening in there. So, this control file generation thing makes some sense, and that's why I decided to code this 1st pass attempt at doing it. > - I don't think this approach will work for control.parallel, because > the tests have to be assigned dynamically to available queues, and > AFAIK this can't be done by a simple static control file. Not necessarily: as the control file is a program, we can just generate the code using some sort of function that can do the assignment. I don't fully see all that's needed to get the job done, but in theory it should be possible.
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/18/2010 12:05 PM, Jan Kiszka wrote: Avi Kivity wrote: On 02/18/2010 11:45 AM, Avi Kivity wrote: On 02/18/2010 11:40 AM, Jan Kiszka wrote: Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. What concurrency does it resolve in the end? On first glance, it only synchronizes the fiddling with per-VCPU request bits, right? What forces us to do this? Wouldn't it suffice to disable preemption (thus migration) and then let concurrent requests race for setting the bits? I mean if some request bit was already set on entry, we don't include the related VCPU in smp_call_function_many anyway. It's more difficult. vcpu 0: sets request bit on vcpu 2 vcpu 1: test_and_set request bit on vcpu 2, returns already set vcpu 1: returns vcpu 0: sends IPI vcpu 0: returns so vcpu 1 returns before the IPI was performed. If the request was a tlb flush, for example, vcpu 1 may free a page that is still in vcpu 2's tlb. One way out would be to have a KVM_REQ_IN_PROGRESS, set it in make_request, clear it in the IPI function. If a second make_request sees it already set, it can simply busy wait until it is cleared, without sending the IPI. Of course the busy wait means we can't enable preemption (or we may busy wait on an unscheduled task), but at least the requests can proceed in parallel instead of serializing. ...or include VCPUs with KVM_REQ_IN_PROGRESS set into the IPI set even if they already have the desired request bit set. But then we're making them take the IPI, which is pointless and expensive. My approach piggybacks multiple requesters on one IPI. Then we should serialize in smp_call_function_many. Do you mean rely on s_c_f_m's internal synchronization? -- error compiling committee.c: too many arguments to function
[PATCH 3/4] KVM: Fix segment descriptor loading.
Add proper error and permission checking. This patch also changes the task switching code to load segment selectors before segment descriptors, as the SDM requires; otherwise permission checking during segment descriptor loading will be incorrect. Signed-off-by: Gleb Natapov --- arch/x86/include/asm/kvm_host.h |3 +- arch/x86/kvm/emulate.c | 30 ++- arch/x86/kvm/x86.c | 177 +++ 3 files changed, 151 insertions(+), 59 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 96b1e6e..d46e791 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -609,8 +609,7 @@ int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value); void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); -int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, - int type_bits, int seg); +int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg); int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason); diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 08ac9cf..8d315ab 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1304,7 +1304,7 @@ static int emulate_pop_sreg(struct x86_emulate_ctxt *ctxt, if (rc != X86EMUL_CONTINUE) return rc; - rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)selector, 1, seg); + rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)selector, seg); return rc; } @@ -1482,7 +1482,7 @@ static int emulate_ret_far(struct x86_emulate_ctxt *ctxt, rc = emulate_pop(ctxt, ops, &cs, c->op_bytes); if (rc != X86EMUL_CONTINUE) return rc; - rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)cs, 1, VCPU_SREG_CS); + rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)cs, VCPU_SREG_CS); return rc; } @@ -2118,12 +2118,11 @@ special_insn: break; case 0x8e: { /* mov seg, r/m16 */ uint16_t sel; - int type_bits; - int err; sel = c->src.val; - if (c->modrm_reg == VCPU_SREG_CS) {
+ c->modrm_reg > VCPU_SREG_GS) { kvm_queue_exception(ctxt->vcpu, UD_VECTOR); goto done; } @@ -2131,18 +2130,7 @@ special_insn: if (c->modrm_reg == VCPU_SREG_SS) toggle_interruptibility(ctxt, X86_SHADOW_INT_MOV_SS); - if (c->modrm_reg <= 5) { - type_bits = (c->modrm_reg == 1) ? 9 : 1; - err = kvm_load_segment_descriptor(ctxt->vcpu, sel, - type_bits, c->modrm_reg); - } else { - printk(KERN_INFO "Invalid segreg in modrm byte 0x%02x\n", - c->modrm); - goto cannot_emulate; - } - - if (err < 0) - goto cannot_emulate; + rc = kvm_load_segment_descriptor(ctxt->vcpu, sel, c->modrm_reg); c->dst.type = OP_NONE; /* Disable writeback. */ break; @@ -2316,11 +2304,9 @@ special_insn: case 0xe9: /* jmp rel */ goto jmp; case 0xea: /* jmp far */ - if (kvm_load_segment_descriptor(ctxt->vcpu, c->src2.val, 9, - VCPU_SREG_CS) < 0) { - DPRINTF("jmp far: Failed to load CS descriptor\n"); - goto cannot_emulate; - } + if (kvm_load_segment_descriptor(ctxt->vcpu, c->src2.val, + VCPU_SREG_CS)) + goto done; c->eip = c->src.val; break; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0273980..c43d73d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4787,7 +4787,7 @@ static int kvm_load_realmode_segment(struct kvm_vcpu *vcpu, u16 selector, int se .unusable = 0, }; kvm_x86_ops->set_segment(vcpu, &segvar, seg); - return 0; + return X86EMUL_CONTINUE; } static int is_vm86_segment(struct kvm_vcpu *vcpu, int seg) @@ -4797,43 +4797,112 @@ static int is_vm86_segment(struct kvm_vcpu *vcpu, int seg) (kvm_get_rflags(vcpu) & X86_EFLAGS_VM); } -static void kvm_check_segment_descriptor(struct kvm_vcpu *vcpu, int seg, -u16 selector) -{ - /* NULL selector is not valid for CS and SS */ - if (seg == VCPU_SREG_CS || seg == VCPU_SREG_SS) - if (!selector) - kvm_queue_exception_e(vcpu, TS_VECTOR, s
[PATCH 1/4] KVM: Forbid modifying CS segment register by mov instruction.
Inject #UD if the guest attempts to do so. This is in accordance with the Intel SDM. Signed-off-by: Gleb Natapov --- arch/x86/kvm/emulate.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 9beda8e..08ac9cf 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2122,6 +2122,12 @@ special_insn: int err; sel = c->src.val; + + if (c->modrm_reg == VCPU_SREG_CS) { + kvm_queue_exception(ctxt->vcpu, UD_VECTOR); + goto done; + } + if (c->modrm_reg == VCPU_SREG_SS) toggle_interruptibility(ctxt, X86_SHADOW_INT_MOV_SS); -- 1.6.5
[PATCH 2/4] KVM: Fix load_guest_segment_descriptor() to inject page fault
From: Takuya Yoshikawa This patch injects a page fault when reading a descriptor in load_guest_segment_descriptor() fails with a fault. Effects of this injection: This function is used by kvm_load_segment_descriptor(), which is necessary for the following instructions. - mov seg,r/m16 - jmp far - pop ?s This patch makes it possible to emulate the page faults generated by these instructions. Note that unless we change kvm_load_segment_descriptor()'s return value propagation, this patch has no effect. Signed-off-by: Takuya Yoshikawa Signed-off-by: Gleb Natapov --- arch/x86/kvm/x86.c | 13 ++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b2335f6..0273980 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4713,6 +4713,9 @@ static int load_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, { struct desc_ptr dtable; u16 index = selector >> 3; + int ret; + u32 err; + gva_t addr; get_segment_descriptor_dtable(vcpu, selector, &dtable); @@ -4720,9 +4723,13 @@ static int load_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, kvm_queue_exception_e(vcpu, GP_VECTOR, selector & 0xfffc); return X86EMUL_PROPAGATE_FAULT; } - return kvm_read_guest_virt_system(dtable.address + index*8, - seg_desc, sizeof(*seg_desc), - vcpu, NULL); + addr = dtable.address + index * 8; + ret = kvm_read_guest_virt_system(addr, seg_desc, sizeof(*seg_desc), +vcpu, &err); + if (ret == X86EMUL_PROPAGATE_FAULT) + kvm_inject_page_fault(vcpu, addr, err); + + return ret; } /* allowed just for 8 bytes segments */ -- 1.6.5
[PATCH 0/4] More emulator correctness fixes
This patch series adds proper permission checking during segment selector loading. Some missing fault injections are added. Gleb Natapov (2): KVM: Forbid modifying CS segment register by mov instruction. KVM: Fix segment descriptor loading. Takuya Yoshikawa (2): KVM: Fix load_guest_segment_descriptor() to inject page fault KVM: Fix emulate_sys[call, enter, exit]()'s fault handling arch/x86/include/asm/kvm_host.h |3 +- arch/x86/kvm/emulate.c | 71 +++ arch/x86/kvm/x86.c | 190 +++ 3 files changed, 186 insertions(+), 78 deletions(-)
[PATCH 4/4] KVM: Fix emulate_sys[call, enter, exit]()'s fault handling
From: Takuya Yoshikawa This patch fixes emulate_syscall(), emulate_sysenter() and emulate_sysexit() to handle injected faults properly. Even though the original code injects faults in these functions, we cannot handle them unless we use a different return value from the UNHANDLEABLE case. So this patch uses X86EMUL_* codes instead of -1 and 0 and makes x86_emulate_insn() handle these propagated faults. Note that, in x86_emulate_insn(), goto cannot_emulate and goto done with rc equal to X86EMUL_UNHANDLEABLE have the same effect. Signed-off-by: Takuya Yoshikawa Signed-off-by: Gleb Natapov --- arch/x86/kvm/emulate.c | 37 - 1 files changed, 20 insertions(+), 17 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 8d315ab..35f7acd 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1590,7 +1590,7 @@ emulate_syscall(struct x86_emulate_ctxt *ctxt) /* syscall is not available in real mode */ if (ctxt->mode == X86EMUL_MODE_REAL || ctxt->mode == X86EMUL_MODE_VM86) - return -1; + return X86EMUL_UNHANDLEABLE; setup_syscalls_segments(ctxt, &cs, &ss); @@ -1627,7 +1627,7 @@ emulate_syscall(struct x86_emulate_ctxt *ctxt) ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF); } - return 0; + return X86EMUL_CONTINUE; } static int @@ -1640,14 +1640,14 @@ emulate_sysenter(struct x86_emulate_ctxt *ctxt) /* inject #GP if in real mode */ if (ctxt->mode == X86EMUL_MODE_REAL) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_UNHANDLEABLE; } /* XXX sysenter/sysexit have not been tested in 64bit mode. * Therefore, we inject an #UD.
*/ if (ctxt->mode == X86EMUL_MODE_PROT64) - return -1; + return X86EMUL_UNHANDLEABLE; setup_syscalls_segments(ctxt, &cs, &ss); @@ -1656,13 +1656,13 @@ emulate_sysenter(struct x86_emulate_ctxt *ctxt) case X86EMUL_MODE_PROT32: if ((msr_data & 0xfffc) == 0x0) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_PROPAGATE_FAULT; } break; case X86EMUL_MODE_PROT64: if (msr_data == 0x0) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_PROPAGATE_FAULT; } break; } @@ -1687,7 +1687,7 @@ emulate_sysenter(struct x86_emulate_ctxt *ctxt) kvm_x86_ops->get_msr(ctxt->vcpu, MSR_IA32_SYSENTER_ESP, &msr_data); c->regs[VCPU_REGS_RSP] = msr_data; - return 0; + return X86EMUL_CONTINUE; } static int @@ -1702,7 +1702,7 @@ emulate_sysexit(struct x86_emulate_ctxt *ctxt) if (ctxt->mode == X86EMUL_MODE_REAL || ctxt->mode == X86EMUL_MODE_VM86) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_UNHANDLEABLE; } setup_syscalls_segments(ctxt, &cs, &ss); @@ -1720,7 +1720,7 @@ emulate_sysexit(struct x86_emulate_ctxt *ctxt) cs.selector = (u16)(msr_data + 16); if ((msr_data & 0xfffc) == 0x0) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_PROPAGATE_FAULT; } ss.selector = (u16)(msr_data + 24); break; @@ -1728,7 +1728,7 @@ emulate_sysexit(struct x86_emulate_ctxt *ctxt) cs.selector = (u16)(msr_data + 32); if (msr_data == 0x0) { kvm_inject_gp(ctxt->vcpu, 0); - return -1; + return X86EMUL_PROPAGATE_FAULT; } ss.selector = cs.selector + 8; cs.db = 0; @@ -1744,7 +1744,7 @@ emulate_sysexit(struct x86_emulate_ctxt *ctxt) c->eip = ctxt->vcpu->arch.regs[VCPU_REGS_RDX]; c->regs[VCPU_REGS_RSP] = ctxt->vcpu->arch.regs[VCPU_REGS_RCX]; - return 0; + return X86EMUL_CONTINUE; } static bool emulator_bad_iopl(struct x86_emulate_ctxt *ctxt) @@ -2472,8 +2472,9 @@ twobyte_insn: } break; case 0x05: /* syscall */ - if (emulate_syscall(ctxt) == -1) - goto cannot_emulate; + rc = emulate_syscall(ctxt); + if (rc != X86EMUL_CONTINUE) + goto done; else goto writeback; break; @@ -2541,14 
+2542,16 @@ twobyte_insn: c->dst.type = OP_NONE; break; case 0x34: /* sysenter */ - if (emulate_sysenter(ctxt) == -1) - goto cannot_emulate; +
Re: [PATCH] KVM test: Modifying finish.exe to support parallel installs
On Sun, 2010-02-14 at 11:21 -0500, Michael Goldish wrote: > - "Lucas Meneghel Rodrigues" wrote: > > > In order to adapt all the OS unattended installs to parallel > > installs, finish.exe also had to be adapted to be a server > > instead of a client. These are the modifications needed. > > > > Once the whole patchset is worked out, an updated version > > of finish.exe will be shipped on version control. > > > > Signed-off-by: Lucas Meneghel Rodrigues > > Now that finish.exe is a server it looks like a stripped version > of rss.exe. Since we're already running rss.exe at the end of > each unattended_install, why not just use VM.remote_login() to > verify that the installation was successful? If we can guarantee that the sshd daemon is up and running by the end of linux guest installs, then that is a perfectly fine solution. Need to check though. > > client/tests/kvm/deps/finish.cpp | 111 > > -- > > 1 files changed, 59 insertions(+), 52 deletions(-) > > > > diff --git a/client/tests/kvm/deps/finish.cpp > > b/client/tests/kvm/deps/finish.cpp > > index 9c2867c..e5ba128 100644 > > --- a/client/tests/kvm/deps/finish.cpp > > +++ b/client/tests/kvm/deps/finish.cpp > > @@ -1,12 +1,13 @@ > > -// Simple app that only sends an ack string to the KVM unattended > > install > > -// watch code. > > +// Simple application that creates a server socket, listening for > > connections > > +// of the unattended install test. Once it gets a client connected, > > the > > +// app will send back an ACK string, indicating the install process > > is done. > > // > > // You must link this code with Ws2_32.lib, Mswsock.lib, and > > Advapi32.lib > > // > > // Author: Lucas Meneghel Rodrigues > > // Code was adapted from an MSDN sample. > > > > -// Usage: finish.exe [Host OS IP] > > +// Usage: finish.exe > > > > // MinGW's ws2tcpip.h only defines getaddrinfo and other functions > > only for > > // the case _WIN32_WINNT >= 0x0501. 
> > @@ -21,24 +22,18 @@ > > #include > > #include > > > > -#define DEFAULT_BUFLEN 512 > > #define DEFAULT_PORT "12323" > > - > > int main(int argc, char **argv) > > { > > WSADATA wsaData; > > -SOCKET ConnectSocket = INVALID_SOCKET; > > -struct addrinfo *result = NULL, > > -*ptr = NULL, > > -hints; > > +SOCKET ListenSocket = INVALID_SOCKET, ClientSocket = > > INVALID_SOCKET; > > +struct addrinfo *result = NULL, hints; > > char *sendbuf = "done"; > > -char recvbuf[DEFAULT_BUFLEN]; > > -int iResult; > > -int recvbuflen = DEFAULT_BUFLEN; > > +int iResult, iSendResult; > > > > // Validate the parameters > > -if (argc != 2) { > > -printf("usage: %s server-name\n", argv[0]); > > +if (argc != 1) { > > +printf("usage: %s", argv[0]); > > return 1; > > } > > > > @@ -49,72 +44,84 @@ int main(int argc, char **argv) > > return 1; > > } > > > > -ZeroMemory( &hints, sizeof(hints) ); > > -hints.ai_family = AF_UNSPEC; > > +ZeroMemory(&hints, sizeof(hints)); > > +hints.ai_family = AF_INET; > > hints.ai_socktype = SOCK_STREAM; > > hints.ai_protocol = IPPROTO_TCP; > > +hints.ai_flags = AI_PASSIVE; > > > > // Resolve the server address and port > > -iResult = getaddrinfo(argv[1], DEFAULT_PORT, &hints, &result); > > -if ( iResult != 0 ) { > > +iResult = getaddrinfo(NULL, DEFAULT_PORT, &hints, &result); > > +if (iResult != 0) { > > printf("getaddrinfo failed: %d\n", iResult); > > WSACleanup(); > > return 1; > > } > > > > -// Attempt to connect to an address until one succeeds > > -for(ptr=result; ptr != NULL ;ptr=ptr->ai_next) { > > - > > -// Create a SOCKET for connecting to server > > -ConnectSocket = socket(ptr->ai_family, ptr->ai_socktype, > > -ptr->ai_protocol); > > -if (ConnectSocket == INVALID_SOCKET) { > > -printf("Error at socket(): %ld\n", WSAGetLastError()); > > -freeaddrinfo(result); > > -WSACleanup(); > > -return 1; > > -} > > - > > -// Connect to server. 
> > -iResult = connect( ConnectSocket, ptr->ai_addr, > > (int)ptr->ai_addrlen); > > -if (iResult == SOCKET_ERROR) { > > -closesocket(ConnectSocket); > > -ConnectSocket = INVALID_SOCKET; > > -continue; > > -} > > -break; > > +// Create a SOCKET for connecting to server > > +ListenSocket = socket(result->ai_family, result->ai_socktype, > > + result->ai_protocol); > > +if (ListenSocket == INVALID_SOCKET) { > > +printf("socket failed: %ld\n", WSAGetLastError()); > > +freeaddrinfo(result); > > +WSACleanup(); > > +return 1; > > +} > > + > > +// Setup the TCP listening socket > > +iResult = bind(ListenSocket, result->ai_addr,
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
Avi Kivity wrote: > On 02/18/2010 11:45 AM, Avi Kivity wrote: >> On 02/18/2010 11:40 AM, Jan Kiszka wrote: Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. >>> What concurrency does it resolve in the end? On first glance, it only >>> synchronizes the fiddling with per-VCPU request bits, right? What forces >>> us to do this? Wouldn't it suffice to disable preemption (thus >>> migration) and then let concurrent requests race for setting the bits? I >>> mean if some request bit was already set on entry, we don't include the >>> related VCPU in smp_call_function_many anyway. >> It's more difficult. >> >> vcpu 0: sets request bit on vcpu 2 >> vcpu 1: test_and_set request bit on vcpu 2, returns already set >> vcpu 1: returns >> vcpu 0: sends IPI >> vcpu 0: returns >> >> so vcpu 1 returns before the IPI was performed. If the request was a >> tlb flush, for example, vcpu 1 may free a page that is still in vcpu >> 2's tlb. > One way out would be to have a KVM_REQ_IN_PROGRESS, set it in > make_request, clear it in the IPI function. > > If a second make_request sees it already set, it can simply busy wait > until it is cleared, without sending the IPI. Of course the busy wait > means we can't enable preemption (or we may busy wait on an unscheduled > task), but at least the requests can proceed in parallel instead of > serializing. ...or include VCPUs with KVM_REQ_IN_PROGRESS set into the IPI set even if they already have the desired request bit set. Then we should serialize in smp_call_function_many. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[PATCH] Remove all references to KVM_CR3_CACHE
This patch removes all references to KVM_CR3_CACHE as suggested by Marcelo. Jes commit 3e3b0979d7d1464baeb5770f1de4954da7e59e1b Author: Jes Sorensen Date: Wed Feb 17 18:03:37 2010 +0100 Remove all references to KVM_CR3_CACHE as it was never implemented. Signed-off-by: Jes Sorensen diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 7f820a4..e57c479 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -1248,9 +1248,6 @@ struct kvm_para_features { #ifdef KVM_CAP_PV_MMU { KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP }, #endif -#ifdef KVM_CAP_CR3_CACHE - { KVM_CAP_CR3_CACHE, KVM_FEATURE_CR3_CACHE }, -#endif { -1, -1 } }; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 504f501..36fa736 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -154,9 +154,6 @@ struct kvm_para_features { #ifdef KVM_CAP_PV_MMU { KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP }, #endif -#ifdef KVM_CAP_CR3_CACHE -{ KVM_CAP_CR3_CACHE, KVM_FEATURE_CR3_CACHE }, -#endif { -1, -1 } };
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/18/2010 11:49 AM, Jan Kiszka wrote: Avi Kivity wrote: On 02/18/2010 11:40 AM, Jan Kiszka wrote: Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. What concurrency does it resolve in the end? On first glance, it only synchronizes the fiddling with per-VCPU request bits, right? What forces us to do this? Wouldn't it suffice to disable preemption (thus migration) and then let concurrent requests race for setting the bits? I mean if some request bit was already set on entry, we don't include the related VCPU in smp_call_function_many anyway. It's more difficult. vcpu 0: sets request bit on vcpu 2 vcpu 1: test_and_set request bit on vcpu 2, returns already set vcpu 1: returns vcpu 0: sends IPI vcpu 0: returns so vcpu 1 returns before the IPI was performed. If the request was a tlb flush, for example, vcpu 1 may free a page that is still in vcpu 2's tlb. So the request bits we are interested in are exclusively set in this function under requests_lock? Yes. vcpu 1 may still see the bit set (if vcpu 2 hasn't yet reached the start of its loop and cleared it), so it's not total serialization, but nearly so. -- error compiling committee.c: too many arguments to function
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/18/2010 11:45 AM, Avi Kivity wrote: On 02/18/2010 11:40 AM, Jan Kiszka wrote: Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. What concurrency does it resolve in the end? On first glance, it only synchronizes the fiddling with per-VCPU request bits, right? What forces us to do this? Wouldn't it suffice to disable preemption (thus migration) and then let concurrent requests race for setting the bits? I mean if some request bit was already set on entry, we don't include the related VCPU in smp_call_function_many anyway. It's more difficult. vcpu 0: sets request bit on vcpu 2 vcpu 1: test_and_set request bit on vcpu 2, returns already set vcpu 1: returns vcpu 0: sends IPI vcpu 0: returns so vcpu 1 returns before the IPI was performed. If the request was a tlb flush, for example, vcpu 1 may free a page that is still in vcpu 2's tlb. One way out would be to have a KVM_REQ_IN_PROGRESS, set it in make_request, clear it in the IPI function. If a second make_request sees it already set, it can simply busy wait until it is cleared, without sending the IPI. Of course the busy wait means we can't enable preemption (or we may busy wait on an unscheduled task), but at least the requests can proceed in parallel instead of serializing. -- error compiling committee.c: too many arguments to function
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
Avi Kivity wrote: > On 02/18/2010 11:40 AM, Jan Kiszka wrote: >>> Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. >>> >>> >> What concurrency does it resolve in the end? On first glance, it only >> synchronizes the fiddling with per-VCPU request bits, right? What forces >> us to do this? Wouldn't it suffice to disable preemption (thus >> migration) and then let concurrent requests race for setting the bits? I >> mean if some request bit was already set on entry, we don't include the >> related VCPU in smp_call_function_many anyway. >> > > It's more difficult. > > vcpu 0: sets request bit on vcpu 2 >vcpu 1: test_and_set request bit on vcpu 2, returns already set >vcpu 1: returns > vcpu 0: sends IPI > vcpu 0: returns > > so vcpu 1 returns before the IPI was performed. If the request was a > tlb flush, for example, vcpu 1 may free a page that is still in vcpu 2's > tlb. So the request bits we are interested in are exclusively set in this function under requests_lock? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/18/2010 11:40 AM, Jan Kiszka wrote: Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. What concurrency does it resolve in the end? On first glance, it only synchronizes the fiddling with per-VCPU request bits, right? What forces us to do this? Wouldn't it suffice to disable preemption (thus migration) and then let concurrent requests race for setting the bits? I mean if some request bit was already set on entry, we don't include the related VCPU in smp_call_function_many anyway. It's more difficult. vcpu 0: sets request bit on vcpu 2 vcpu 1: test_and_set request bit on vcpu 2, returns already set vcpu 1: returns vcpu 0: sends IPI vcpu 0: returns so vcpu 1 returns before the IPI was performed. If the request was a tlb flush, for example, vcpu 1 may free a page that is still in vcpu 2's tlb. -- error compiling committee.c: too many arguments to function
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
Avi Kivity wrote: > On 02/18/2010 11:12 AM, Jan Kiszka wrote: >> Thomas Gleixner wrote: >> >>> The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert >>> them to raw_spinlock. No change for !RT kernels. >>> >> Last time I ran KVM over RT, I also had to convert requests_lock (struct >> kvm): make_all_cpus_request assumes that this lock prevents migration. >> >> > > True. Will commit something to that effect. > > Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. > What concurrency does it resolve in the end? On first glance, it only synchronizes the fiddling with per-VCPU request bits, right? What forces us to do this? Wouldn't it suffice to disable preemption (thus migration) and then let concurrent requests race for setting the bits? I mean if some request bit was already set on entry, we don't include the related VCPU in smp_call_function_many anyway. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/18/2010 11:12 AM, Jan Kiszka wrote: Thomas Gleixner wrote: The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert them to raw_spinlock. No change for !RT kernels. Last time I ran KVM over RT, I also had to convert requests_lock (struct kvm): make_all_cpus_request assumes that this lock prevents migration. True. Will commit something to that effect. Meanwhile, if anyone has any idea how to kill this lock, I'd love to see it. -- error compiling committee.c: too many arguments to function
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
On 02/17/2010 04:00 PM, Thomas Gleixner wrote: The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert them to raw_spinlock. No change for !RT kernels. Applied, thanks. -- error compiling committee.c: too many arguments to function
Re: [patch] x86: kvm: Convert i8254/i8259 locks to raw_spinlocks
Thomas Gleixner wrote: > The i8254/i8259 locks need to be real spinlocks on preempt-rt. Convert > them to raw_spinlock. No change for !RT kernels. Last time I ran KVM over RT, I also had to convert requests_lock (struct kvm): make_all_cpus_request assumes that this lock prevents migration. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [patch uq/master 2/4] qemu: kvm specific wait_io_event
On 02/18/2010 12:14 AM, Marcelo Tosatti wrote: In KVM mode the global mutex is released when vcpus are executing, which means acquiring the fairness mutex is not required. Also for KVM there is one thread per vcpu, so tcg_has_work is meaningless. Add a new qemu_wait_io_event_common function to hold common code between TCG/KVM. Signed-off-by: Marcelo Tosatti Index: qemu/vl.c === --- qemu.orig/vl.c +++ qemu/vl.c @@ -3382,6 +3382,7 @@ static QemuCond qemu_pause_cond; static void block_io_signals(void); static void unblock_io_signals(void); static int tcg_has_work(void); +static int cpu_has_work(CPUState *env); static int qemu_init_main_loop(void) { @@ -3402,6 +3403,15 @@ static int qemu_init_main_loop(void) return 0; } +static void qemu_wait_io_event_common(CPUState *env) +{ +if (env->stop) { +env->stop = 0; +env->stopped = 1; +qemu_cond_signal(&qemu_pause_cond); +} +} + static void qemu_wait_io_event(CPUState *env) { while (!tcg_has_work()) @@ -3418,11 +3428,15 @@ static void qemu_wait_io_event(CPUState qemu_mutex_unlock(&qemu_fair_mutex); qemu_mutex_lock(&qemu_global_mutex); -if (env->stop) { -env->stop = 0; -env->stopped = 1; -qemu_cond_signal(&qemu_pause_cond); -} +qemu_wait_io_event_common(env); +} + +static void qemu_kvm_wait_io_event(CPUState *env) +{ +while (!cpu_has_work(env)) +qemu_cond_timedwait(env->halt_cond,&qemu_global_mutex, 1000); + +qemu_wait_io_event_common(env); } Shouldn't KVM-specific code be in kvm-all.c? static int qemu_cpu_exec(CPUState *env); @@ -3448,7 +3462,7 @@ static void *kvm_cpu_thread_fn(void *arg while (1) { if (cpu_can_run(env)) qemu_cpu_exec(env); -qemu_wait_io_event(env); +qemu_kvm_wait_io_event(env); } return NULL; Well, kvm_cpu_thread_fn() apparently isn't. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [patch uq/master 0/4] uq/master: iothread consume signals via sigtimedwait and cleanups
On 02/18/2010 12:14 AM, Marcelo Tosatti wrote: See individual patches for details. Please repost, copying qemu-devel, since this code is to be queued for qemu.git. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: Recommended network driver for a windows KVM guest
On 02/17/2010 12:51 PM, carlopmart wrote: Hi all, I need to install several Windows KVM (rhel5.4 host, fully updated) guests for iSCSI boot. The iSCSI servers are Solaris/OpenSolaris storage servers, and I need to boot the Windows guests (2008R2 and Win7) using gpxe. Can I use the virtio net driver during the Windows install, or the e1000 driver? rhel5.4 does not have gpxe, so it won't work. rhel5.5 will have it, but I don't recall anyone testing iSCSI with kvm+gpxe upstream either; worth testing. Anyway, virtio performs better than e1000 and is potentially more stable. Many thanks.
Re: [PATCH 00/18] KVM: PPC: Virtualize Gekko guests
On 02/18/2010 09:40 AM, Avi Kivity wrote: Now you made me check how fast the real hw is. I get about 65,000,000 fmul operations per second on it. That's surprisingly low. I get 3.7 Gflops on my home machine (1G loops, 4 fmul and 4 fadds, all independent, in 2.15 seconds; otherwise I can't saturate the pipeline). -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.