Re: trouble with dvd drives
On Sat, Apr 11, 2009 at 11:31:41PM -0500, Gene Horodecki wrote: > Hi there.. I have a new machine with an AMD Phenom processor and two > fairly average LG DVD burners. I've gotten to the point that I seem to > be able to see both but they are not coming up as writable drives.. AFAIK the IDE device emulation from QEMU only supports up to (incomplete) DVD-ROM profile, hence why support for writting isn't working. Carlo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
trouble with dvd drives
Hi there.. I have a new machine with an AMD Phenom processor and two fairly average LG DVD burners. I've gotten to the point that I seem to be able to see both but they are not coming up as writable drives.. I have a data DVD in one drive and a blank in the other. Can anyone help me out? here is the kvm command I'm using: /usr/bin/kvm -M pc -m 512 -smp 1 -name fury -monitor pty -localtime -boot c -drive file=/v/fury/fury.img,if=scsi,index=0,boot=on -drive file=/dev/scd0,if=ide,media=cdrom -drive file=/dev/scd1,if=ide,media=cdrom -net nic,macaddr=00:16:36:53:18:96,vlan=0 -net tap,vlan=0,script=/v/scripts/kvm-ifup -serial pty -parallel none -usb -usbdevice tablet -vnc 0.0.0.0:4 -daemonize -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] deal with interrupt shadow state for emulated instruction
Avi Kivity wrote: >> >> -rip = kvm_rip_read(vcpu); >> -rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN); >> -kvm_rip_write(vcpu, rip); >> +if (!(interruptibility & interruptibility_mask)) >> +vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, >> + interruptibility | interruptibility_mask); >> +vcpu->arch.interrupt_window_open = 0; >> > > Setting both _MOV_SS and _STI is wierd; can't happen on real hardware. > Not at architecturally visible boundaries, for sure. It can be an implementation artifact internally to an instruction, though. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Consolidate userspace and kernel interrupt injection for VMX.
On Sat, Apr 11, 2009 at 02:29:28PM +0300, Avi Kivity wrote: > Gleb Natapov wrote: >> Use the same callback to inject irq/nmi events no matter what irqchip is >> in use. Only from VMX for now. >> >> Signed-off-by: Gleb Natapov >> --- >> >> arch/x86/include/asm/kvm_host.h |2 + >> arch/x86/kvm/svm.c |2 + >> arch/x86/kvm/vmx.c | 71 >> +-- >> arch/x86/kvm/x86.c |2 + >> 4 files changed, 19 insertions(+), 58 deletions(-) >> >> diff --git a/arch/x86/include/asm/kvm_host.h >> b/arch/x86/include/asm/kvm_host.h >> index e672ca5..4e39c40 100644 >> --- a/arch/x86/include/asm/kvm_host.h >> +++ b/arch/x86/include/asm/kvm_host.h >> @@ -520,7 +520,7 @@ struct kvm_x86_ops { >> void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr, >> bool has_error_code, u32 error_code); >> bool (*exception_injected)(struct kvm_vcpu *vcpu); >> -void (*inject_pending_irq)(struct kvm_vcpu *vcpu); >> +void (*inject_pending_irq)(struct kvm_vcpu *vcpu, struct kvm_run *run); >> > > kvm_run is available as vcpu->run, so this isn't needed. But better to > keep it for now and drop it in a later patch. > >> static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) >> { >> int ret; >> @@ -3351,8 +3309,11 @@ static void vmx_complete_interrupts(struct vcpu_vmx >> *vmx) >> } >> } >> -static void vmx_intr_assist(struct kvm_vcpu *vcpu) >> +static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) >> { >> +bool req_int_win = !irqchip_in_kernel(vcpu->kvm) && >> +kvm_run->request_interrupt_window; >> + >> update_tpr_threshold(vcpu); >> vmx_update_window_states(vcpu); >> @@ -3373,25 +3334,25 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu) >> return; >> } >> } >> > > Why not convert the 'enable_nmi_window(); return;' above to a 'goto out' > like you do elsewhere? > > Mainly because of the next patch in the series that reworks this code anyway. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Cleanup vmx_intr_assist()
On Sat, Apr 11, 2009 at 02:30:30PM +0300, Avi Kivity wrote: > Gleb Natapov wrote: >> Signed-off-by: Gleb Natapov >> --- >> >> arch/x86/kvm/vmx.c | 55 >> >> 1 files changed, 30 insertions(+), 25 deletions(-) >> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index 06252f7..9eb518f 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -3309,6 +3309,34 @@ static void vmx_complete_interrupts(struct vcpu_vmx >> *vmx) >> } >> } >> +static void vmx_intr_inject(struct kvm_vcpu *vcpu) >> +{ >> +/* try to reinject previous events if any */ >> +if (vcpu->arch.nmi_injected) { >> +vmx_inject_nmi(vcpu); >> +return; >> +} >> + >> +if (vcpu->arch.interrupt.pending) { >> +vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); >> +return; >> +} >> + >> +/* try to inject new event if pending */ >> +if (vcpu->arch.nmi_pending) { >> +if (vcpu->arch.nmi_window_open) { >> +vcpu->arch.nmi_pending = false; >> +vcpu->arch.nmi_injected = true; >> +vmx_inject_nmi(vcpu); >> +} >> +} else if (kvm_cpu_has_interrupt(vcpu)) { >> +if (vcpu->arch.interrupt_window_open) { >> +kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu)); >> +vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); >> +} >> +} >> +} >> + >> static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) >> { >> bool req_int_win = !irqchip_in_kernel(vcpu->kvm) && >> @@ -3323,32 +3351,9 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu, >> struct kvm_run *kvm_run) >> GUEST_INTR_STATE_STI | >> GUEST_INTR_STATE_MOV_SS); >> - if (vcpu->arch.nmi_pending && !vcpu->arch.nmi_injected) { >> -if (vcpu->arch.interrupt.pending) { >> -enable_nmi_window(vcpu); >> -} else if (vcpu->arch.nmi_window_open) { >> -vcpu->arch.nmi_pending = false; >> -vcpu->arch.nmi_injected = true; >> -} else { >> -enable_nmi_window(vcpu); >> -return; >> -} >> -} >> - >> -if (vcpu->arch.nmi_injected) { >> -vmx_inject_nmi(vcpu); >> -goto out; >> -} >> - >> -if (!vcpu->arch.interrupt.pending && kvm_cpu_has_interrupt(vcpu)) { >> -if (vcpu->arch.interrupt_window_open) >> -kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu)); >> -} >> - >> -if (vcpu->arch.interrupt.pending) >> -vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); >> +vmx_intr_inject(vcpu); >> -out: >> +/* enable NMI/IRQ window open exits if needed */ >> if (vcpu->arch.nmi_pending) >> enable_nmi_window(vcpu); >> else if (kvm_cpu_has_interrupt(vcpu) || req_int_win) >> > > Not sure I understand the motivation. Just replace a 'goto out' with a > return? > Yes. The code is much cleaner this way. It is easy to see that no matter what happened during injection irq/nmi windows is enabled if there are pending events. With gotos and returns scattered over the function you'll need to check every single path that may or may not lead to the window enable code. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Server network went away. Host + Guests not reachable.
Hello, i have a big problem with my kvm setup. My network went away and i cant find any informations about the reasons. Its not possible to reach the Host or the Guest by SSH. I'm using the latest kvm (QEMU PC emulator version 0.9.1 (kvm-84)) and a setup with a network bridge. server1:~# dmesg [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Linux version 2.6.26-1-amd64 (Debian 2.6.26-13lenny2) (da...@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 SMP Fri Mar 13 17:46:45 UTC 2009 [0.00] Command line: root=/dev/md1 ro [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009fc00 (usable) [0.00] BIOS-e820: 0009fc00 - 000a (reserved) [0.00] BIOS-e820: 000e4000 - 0010 (reserved) [0.00] BIOS-e820: 0010 - ddfd (usable) [0.00] BIOS-e820: ddfd - ddfde000 (ACPI data) [0.00] BIOS-e820: ddfde000 - de00 (ACPI NVS) [0.00] BIOS-e820: fff0 - 0001 (reserved) [0.00] BIOS-e820: 0001 - 00012000 (usable) [0.00] Entering add_active_range(0, 0, 159) 0 entries of 3200 used [0.00] Entering add_active_range(0, 256, 909264) 1 entries of 3200 used [0.00] Entering add_active_range(0, 1048576, 1179648) 2 entries of 3200 used [0.00] max_pfn_mapped = 1179648 [0.00] init_memory_mapping [0.00] DMI present. [0.00] ACPI: RSDP 000F98A0, 0014 (r0 ACPIAM) [0.00] ACPI: RSDT DDFD, 003C (r1 M S I OEMRSDT 1731 MSFT 97) [0.00] ACPI: FACP DDFD0200, 0084 (r2 M S I OEMFACP 1731 MSFT 97) [0.00] ACPI: DSDT DDFD0430, 40D9 (r1 1ADNC 1ADNC0000 INTL 20051117) [0.00] ACPI: FACS DDFDE000, 0040 [0.00] ACPI: APIC DDFD0390, 005C (r1 M S I OEMAPIC 1731 MSFT 97) [0.00] ACPI: MCFG DDFD03F0, 003C (r1 M S I OEMMCFG 1731 MSFT 97) [0.00] ACPI: OEMB DDFDE040, 0060 (r1 M S I AMI_OEM 1731 MSFT 97) [0.00] ACPI: EPTH DDFD4510, 0038 (r1 M S I OEMHPET 1731 MSFT 97) [0.00] ACPI: SSDT DDFD4550, 030E (r1 A M I POWERNOW1 AMD 1) [0.00] Scanning NUMA topology in Northbridge 24 [0.00] No NUMA configuration found [0.00] Faking a node at -00012000 [0.00] Entering add_active_range(0, 0, 159) 0 entries of 3200 used [0.00] Entering add_active_range(0, 256, 909264) 1 entries of 3200 used [0.00] Entering add_active_range(0, 1048576, 1179648) 2 entries of 3200 used [0.00] Bootmem setup node 0 -00012000 [0.00] NODE_DATA [e000 - 00012fff] [0.00] bootmap [00013000 - 00036fff] pages 24 [0.00] early res: 0 [0-fff] BIOS data page [0.00] early res: 1 [6000-7fff] TRAMPOLINE [0.00] early res: 2 [20-675397] TEXT DATA BSS [0.00] early res: 3 [37953000-37fefff3] RAMDISK [0.00] early res: 4 [9fc00-f] BIOS reserved [0.00] early res: 5 [8000-dfff] PGTABLE [0.00] [e200-e20002df] PMD -> [81000120-810003ff] on node 0 [0.00] [e20002e0-e20003ff] PMD -> [81000c00-81000cbf] on node 0 [0.00] Zone PFN ranges: [0.00] DMA 0 -> 4096 [0.00] DMA324096 -> 1048576 [0.00] Normal1048576 -> 1179648 [0.00] Movable zone start PFN for each node [0.00] early_node_map[3] active PFN ranges [0.00] 0:0 -> 159 [0.00] 0: 256 -> 909264 [0.00] 0: 1048576 -> 1179648 [0.00] On node 0 totalpages: 1040239 [0.00] DMA zone: 56 pages used for memmap [0.00] DMA zone: 1248 pages reserved [0.00] DMA zone: 2695 pages, LIFO batch:0 [0.00] DMA32 zone: 14280 pages used for memmap [0.00] DMA32 zone: 890888 pages, LIFO batch:31 [0.00] Normal zone: 1792 pages used for memmap [0.00] Normal zone: 129280 pages, LIFO batch:31 [0.00] Movable zone: 0 pages used for memmap [0.00] ATI board detected. Disabling timer routing over 8254. [0.00] Detected use of extended apic ids on hypertransport bus [0.00] ACPI: PM-Timer IO Port: 0x808 [0.00] ACPI: Local APIC address 0xfee0 [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) [0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) [0.00] ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) [0.00] IOAPIC[0]: apic_id 2, version 0, address 0xfec0, GSI 0-23 [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl d
Server network went away. Host + Guests not reachable.
Hello, i have a big problem with my kvm setup. My network went away and i cant find any informations about the reasons. Its not possible to reach the Host or the Guest by SSH. I'm using the latest kvm (QEMU PC emulator version 0.9.1 (kvm-84)) and a setup with a network bridge. server1:~# dmesg [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Linux version 2.6.26-1-amd64 (Debian 2.6.26-13lenny2) (da...@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 SMP Fri Mar 13 17:46:45 UTC 2009 [0.00] Command line: root=/dev/md1 ro [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009fc00 (usable) [0.00] BIOS-e820: 0009fc00 - 000a (reserved) [0.00] BIOS-e820: 000e4000 - 0010 (reserved) [0.00] BIOS-e820: 0010 - ddfd (usable) [0.00] BIOS-e820: ddfd - ddfde000 (ACPI data) [0.00] BIOS-e820: ddfde000 - de00 (ACPI NVS) [0.00] BIOS-e820: fff0 - 0001 (reserved) [0.00] BIOS-e820: 0001 - 00012000 (usable) [0.00] Entering add_active_range(0, 0, 159) 0 entries of 3200 used [0.00] Entering add_active_range(0, 256, 909264) 1 entries of 3200 used [0.00] Entering add_active_range(0, 1048576, 1179648) 2 entries of 3200 used [0.00] max_pfn_mapped = 1179648 [0.00] init_memory_mapping [0.00] DMI present. [0.00] ACPI: RSDP 000F98A0, 0014 (r0 ACPIAM) [0.00] ACPI: RSDT DDFD, 003C (r1 M S I OEMRSDT 1731 MSFT 97) [0.00] ACPI: FACP DDFD0200, 0084 (r2 M S I OEMFACP 1731 MSFT 97) [0.00] ACPI: DSDT DDFD0430, 40D9 (r1 1ADNC 1ADNC0000 INTL 20051117) [0.00] ACPI: FACS DDFDE000, 0040 [0.00] ACPI: APIC DDFD0390, 005C (r1 M S I OEMAPIC 1731 MSFT 97) [0.00] ACPI: MCFG DDFD03F0, 003C (r1 M S I OEMMCFG 1731 MSFT 97) [0.00] ACPI: OEMB DDFDE040, 0060 (r1 M S I AMI_OEM 1731 MSFT 97) [0.00] ACPI: EPTH DDFD4510, 0038 (r1 M S I OEMHPET 1731 MSFT 97) [0.00] ACPI: SSDT DDFD4550, 030E (r1 A M I POWERNOW1 AMD 1) [0.00] Scanning NUMA topology in Northbridge 24 [0.00] No NUMA configuration found [0.00] Faking a node at -00012000 [0.00] Entering add_active_range(0, 0, 159) 0 entries of 3200 used [0.00] Entering add_active_range(0, 256, 909264) 1 entries of 3200 used [0.00] Entering add_active_range(0, 1048576, 1179648) 2 entries of 3200 used [0.00] Bootmem setup node 0 -00012000 [0.00] NODE_DATA [e000 - 00012fff] [0.00] bootmap [00013000 - 00036fff] pages 24 [0.00] early res: 0 [0-fff] BIOS data page [0.00] early res: 1 [6000-7fff] TRAMPOLINE [0.00] early res: 2 [20-675397] TEXT DATA BSS [0.00] early res: 3 [37953000-37fefff3] RAMDISK [0.00] early res: 4 [9fc00-f] BIOS reserved [0.00] early res: 5 [8000-dfff] PGTABLE [0.00] [e200-e20002df] PMD -> [81000120-810003ff] on node 0 [0.00] [e20002e0-e20003ff] PMD -> [81000c00-81000cbf] on node 0 [0.00] Zone PFN ranges: [0.00] DMA 0 -> 4096 [0.00] DMA324096 -> 1048576 [0.00] Normal1048576 -> 1179648 [0.00] Movable zone start PFN for each node [0.00] early_node_map[3] active PFN ranges [0.00] 0:0 -> 159 [0.00] 0: 256 -> 909264 [0.00] 0: 1048576 -> 1179648 [0.00] On node 0 totalpages: 1040239 [0.00] DMA zone: 56 pages used for memmap [0.00] DMA zone: 1248 pages reserved [0.00] DMA zone: 2695 pages, LIFO batch:0 [0.00] DMA32 zone: 14280 pages used for memmap [0.00] DMA32 zone: 890888 pages, LIFO batch:31 [0.00] Normal zone: 1792 pages used for memmap [0.00] Normal zone: 129280 pages, LIFO batch:31 [0.00] Movable zone: 0 pages used for memmap [0.00] ATI board detected. Disabling timer routing over 8254. [0.00] Detected use of extended apic ids on hypertransport bus [0.00] ACPI: PM-Timer IO Port: 0x808 [0.00] ACPI: Local APIC address 0xfee0 [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) [0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) [0.00] ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) [0.00] IOAPIC[0]: apic_id 2, version 0, address 0xfec0, GSI 0-23 [0.00] ACPI: INT_SRC_OVR (bus 0 bu
Re: [RFC PATCH v2 15/19] kvm: add dynamic IRQ support
Gregory Haskins wrote: This patch provides the ability to dynamically declare and map an interrupt-request handle to an x86 8-bit vector. Problem Statement: Emulated devices (such as PCI, ISA, etc) have interrupt routing done via standard PC mechanisms (MP-table, ACPI, etc). However, we also want to support a new class of devices which exist in a new virtualized namespace and therefore should not try to piggyback on these emulated mechanisms. Rather, we create a way to dynamically register interrupt resources that acts indepent of the emulated counterpart. On x86, a simplistic view of the interrupt model is that each core has a local-APIC which can recieve messages from APIC-compliant routing devices (such as IO-APIC and MSI) regarding details about an interrupt (such as which vector to raise). These routing devices are controlled by the OS so they may translate a physical event (such as "e1000: raise an RX interrupt") to a logical destination (such as "inject IDT vector 46 on core 3"). A dynirq is a virtual implementation of such a router (think of it as a virtual-MSI, but without the coupling to an existing standard, such as PCI). The model is simple: A guest OS can allocate the mapping of "IRQ" handle to "vector/core" in any way it sees fit, and provide this information to the dynirq module running in the host. The assigned IRQ then becomes the sole handle needed to inject an IDT vector to the guest from a host. A host entity that wishes to raise an interrupt simple needs to call kvm_inject_dynirq(irq) and the routing is performed transparently. +static int +_kvm_inject_dynirq(struct kvm *kvm, struct dynirq *entry) +{ + struct kvm_vcpu *vcpu; + int ret; + + mutex_lock(&kvm->lock); + + vcpu = kvm->vcpus[entry->dest]; + if (!vcpu) { + ret = -ENOENT; + goto out; + } + + ret = kvm_apic_set_irq(vcpu, entry->vec, 1); + +out: + mutex_unlock(&kvm->lock); + + return ret; +} + Given that you're using the apic to inject the IRQ, you'll need an EOI. So what's the difference between dynirq and MSI, performance wise? -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v2 14/19] kvm: add a reset capability
Gregory Haskins wrote: We need a way to detect if a VM is reset later in the series, so lets add a capability for userspace to signal a VM reset down to the kernel. As I mentioned, this won't be reliable. It needs to be driven from userspace and be per-device. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: Defer remote tlb flushes on invlpg (v4)
On Sun, Mar 29, 2009 at 01:36:01PM +0300, Avi Kivity wrote: > Marcelo, Andrea? Had to read the code a bit more to understand the reason of the unsync_mmu flush in cr3 overwrite. > Avi Kivity wrote: >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c >> index 2a36f7f..f0ea56c 100644 >> --- a/arch/x86/kvm/mmu.c >> +++ b/arch/x86/kvm/mmu.c >> @@ -1184,8 +1184,7 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu, >> for_each_sp(pages, sp, parents, i) >> protected |= rmap_write_protect(vcpu->kvm, sp->gfn); >> - if (protected) >> -kvm_flush_remote_tlbs(vcpu->kvm); >> +kvm_flush_remote_tlbs_cond(vcpu->kvm, protected); >> for_each_sp(pages, sp, parents, i) { >> kvm_sync_page(vcpu, sp); Ok so because we didn't flush the tlb on the other vcpus when invlpg run, if cr3 overwrite needs to re-sync sptes wrprotecting them, we've to flush the tlb in all vcpus to be sure the possibly writable tlb entry reflecting the old writable spte instantiated before invlpg run, is removed from the physical cpus. We wouldn't find it in for_each_sp because it was rmap_removed, but we'll find something in mmu_unsync_walk (right? we definitely have to find something in mmu_unsync_walk for this to work, the parent sp have to leave child->unsync set even after rmap_remove run in invlpg without flushing the other vcpus tlbs). >> @@ -465,7 +464,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, >> gva_t gva) >> rmap_remove(vcpu->kvm, sptep); >> if (is_large_pte(*sptep)) >> --vcpu->kvm->stat.lpages; >> -need_flush = 1; >> +vcpu->kvm->remote_tlbs_dirty = true; >> } >> set_shadow_pte(sptep, shadow_trap_nonpresent_pte); >> break; >> @@ -475,8 +474,6 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t >> gva) >> break; >> } >> - if (need_flush) >> -kvm_flush_remote_tlbs(vcpu->kvm); >> spin_unlock(&vcpu->kvm->mmu_lock); AFIK to be compliant with lowlevel archs (without ASN it doesn't matter I think as vmx always flush on exit), we have to flush the local tlb here, with set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests). I don't see why it's missing. Or am I wrong? >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >> index 68b217e..12afa50 100644 >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -758,10 +758,18 @@ static bool make_all_cpus_request(struct kvm *kvm, >> unsigned int req) >> void kvm_flush_remote_tlbs(struct kvm *kvm) >> { >> +kvm->remote_tlbs_dirty = false; >> +smp_wmb(); Still no lock prefix to the asm insn and here it runs outside the mmu_lock, but ok, I tend to agree smp_wmb should be enough to be sure the write is fully finished by the time smb_wmb returns. There's another problem though. CPU0CPU1 --- - remote_tlbs_dirty = false remote_tlbs_dirty = true smp_tlb_flush set_shadow_pte(sptep, shadow_trap_nonpresent_pte); The flush for the sptep will be lost. >> @@ -907,8 +913,7 @@ static int kvm_mmu_notifier_clear_flush_young(struct >> mmu_notifier *mn, >> young = kvm_age_hva(kvm, address); >> spin_unlock(&kvm->mmu_lock); >> - if (young) >> -kvm_flush_remote_tlbs(kvm); >> +kvm_flush_remote_tlbs_cond(kvm, young); >> return young; >> } No need to flush for clear_flush_young method, pages can't be freed there. I mangled over the patch a bit, plus fixed the above smp race, let me know what you think. The the best workload to exercise this is running a VM with lots of VCPUs and 8G of ram with a 32bit guest kernel and then just malloc and touch a byte for each 4096 page allocated by malloc. That will run a flood of invlpg. Then push the system to swap. while :; do cp /dev/hda /dev/null; done, also works without O_DIRECT so the host cache make it fast at the second run (not so much faster with host swapping though). I only tested it so far with 12 VM on swap with 64bit kernels with heavy I/O so it's not good test as I doubt any invlpg runs, not even munmap(addr, 4k) uses invlpg. Signed-off-by: Andrea Arcangeli --- diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d5bdf3a..900bc31 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1185,7 +1185,11 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu, for_each_sp(pages, sp, parents, i) protected |= rmap_write_protect(vcpu->kvm, sp->gfn); - if (protected) + /* +* Avoid leaving stale tlb entries in this vcpu representing +* sptes rmap_removed by invlpg without vcpu smp tlb flush. +
Re: [RFC PATCH v2 00/19] virtual-bus
Gregory Haskins wrote: Avi, Gregory Haskins wrote: Todo: *) Develop some kind of hypercall registration mechanism for KVM so that we can use that as an integration point instead of directly hooking kvm hypercalls What would you like to see here? I now remember why I removed the original patch I had for registration...it requires some kind of discovery mechanism on its own. Note that this is hard, but I figured it would make the overall series simpler if I didn't go this route and instead just integrated with a statically allocated vector. That being said, I have no problem adding this back in but figure we should discuss the approach so I don't go down a rat-hole ;) One idea is similar to signalfd() or eventfd(). Provide a kvm ioctl that takes a gsi and returns an fd. Writes to the fd change the state of the line, possible triggering an interrupt. Another ioctl takes a hypercall number or pio port as well as an existing fd. Invocations of the hypercall or writes to the port write to the fd (using the same protocol as eventfd), so the other end can respond. The nice thing is that this can be used by both kernel and userspace components, and for kernel components, hypercalls can be either buffered or unbuffered. So, one thing we could do is use a string-identifier to discover hypercall resources. In this model, we would have one additional hypercall registered with kvm (in addition to the mmu-ops, etc) called KVM_HC_DYNHC or something like that. The support for DYNHC could be indicated in the cpuid (much like I do with the RESET, DYNIRQ, and VBUS support today. When hypercall provides register, the could provide a string such as "vbus", and they would be allocated a hypercall id. Likewise, the HC_DYNHC interface would allow a guest to query the cpuid for the DYNHC feature, and then query the HC_DYNHC vector for a string to hc# translation. If the provider is not present, we return -1 for the hc#, otherwise we return the one that was allocated. I know how you feel about string-ids in general, but I am not quite sure how to design this otherwise without it looking eerily similar to what I already have (which is registering a new HC vector in kvm_para.h) No need for a string ID. Reserve a range of hypercall numbers for dynamic IDs. Userspace allocates one and gives it to the device using its configuration space (as applies to whatever bus it is using). -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] deal with interrupt shadow state for emulated instruction
Glauber Costa wrote: we currently unblock shadow interrupt state when we skip an instruction, but failing to do so when we actually emulate one. This blocks interrupts in key instruction blocks, in particular sti; hlt; sequences If the instruction emulated is an sti, we have to block shadow interrupts. The same goes for mov ss. pop ss also needs it, but we don't currently emulate it. For sequences of two or more instructions of the same type among those instructions, only the first one has this effect. Without this patch, I cannot boot gpxe option roms at vmx machines. This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -513,6 +513,8 @@ struct kvm_x86_ops { void (*run)(struct kvm_vcpu *vcpu, struct kvm_run *run); int (*handle_exit)(struct kvm_run *run, struct kvm_vcpu *vcpu); void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu); + void (*block_interrupt_shadow)(struct kvm_vcpu *vcpu); + void (*unblock_interrupt_shadow)(struct kvm_vcpu *vcpu); set_interrupt_shadow(), clear_interrupt_shadow(). The interrupt shadow blocks interrupts, but what happens when you block the interrupt shadow? Maybe better to fold into one callback with a parameter. int x86_emulate_insn(struct x86_emulate_ctxt *ctxt, -struct x86_emulate_ops *ops); +struct x86_emulate_ops *ops, int *interruptibility); Add to x86_emulate_ctxt, there's already some +static void svm_block_interrupt_shadow(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK; + vcpu->arch.interrupt_window_open = 0; +} + +static void svm_unblock_interrupt_shadow(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + svm->vmcb->control.int_state &= ~SVM_INTERRUPT_SHADOW_MASK; + vcpu->arch.interrupt_window_open = (svm->vcpu.arch.hflags & HF_GIF_MASK); If eflags.if = 0, the interrupt window is closed. diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index c6997c0..5158c2b 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -736,26 +736,45 @@ static void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) vmcs_writel(GUEST_RFLAGS, rflags); } -static void skip_emulated_instruction(struct kvm_vcpu *vcpu) +static void vmx_block_interrupt_shadow(struct kvm_vcpu *vcpu) { - unsigned long rip; - u32 interruptibility; + u32 interruptibility = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO); + u32 interruptibility_mask = ((GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)); - rip = kvm_rip_read(vcpu); - rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN); - kvm_rip_write(vcpu, rip); + if (!(interruptibility & interruptibility_mask)) + vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, +interruptibility | interruptibility_mask); + vcpu->arch.interrupt_window_open = 0; Setting both _MOV_SS and _STI is wierd; can't happen on real hardware. { unsigned long memop = 0; u64 msr_data; @@ -1359,6 +1360,10 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops) unsigned int port; int io_dir_in; int rc = 0; + static int movss_int_flag, movss_int_flag_old; static, for shame. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: AW: AW: AW: KVM performance
BRAUN, Stefanie wrote: Hello, now I was able to start the guest vmu with disk virtio, and some of the tests with disk involvement even improved a bit. But the test in which a logo is added to the video stream does not improve. I don't know why the performance is so bad? Subtest: Reading video locally, adding a logo to the video stream and then saving the video locally Host performance: 50% kvm process in host (top) : 99% vlc process in vmu (top) : 99% The output of kvm_stat -1 during the subtest is the following: efer_reload0 0 exits9913473 3994 This indicates that kvm is running in guest mode all of the time and is therefore quite efficient. Perhaps the test uses sse instructions which kvm doesn't expose? Try adding -cpu core2duo to the command line. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add MCE support to KVM
Andi Kleen wrote: Right, but we can allocate the maximum number, no? it's a fairly small amount of memory. There are 255 banks worst case which need upto 4 (in theory 5) MSRs each That's 8KB, which would normally not be allocated. But I think we can live with it. Do we actually need to support 255 banks? Or can we support a smaller number transparently? (IOW: are the banks interchangable?) -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add MCE support to KVM
> Right, but we can allocate the maximum number, no? it's a fairly small > amount of memory. There are 255 banks worst case which need upto 4 (in theory 5) MSRs each -Andi -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add MCE support to KVM
Huang Ying wrote: On Thu, 2009-04-09 at 23:50 +0800, Avi Kivity wrote: Huang Ying wrote: +int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data) +{ + switch (msr) { + case MSR_EFER: + set_efer(vcpu, data); break; case MSR_IA32_DEBUGCTLMSR: if (!data) { @@ -807,6 +828,8 @@ int kvm_set_msr_common(struct kvm_vcpu * break; } default: + if (!set_msr_mce(vcpu, msr, data)) + break; pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n", msr, data); return 1; } Is there any reason you split kvm_set_msr_common() into two functions? I want to group MCE related MSR together. And most MCE MSR read/write need to access vcpu->arch.mcg_xxx or vcpu->arch_mce_banks, So I think use a MCE specific function would be cleaner. But It seems that something as follow would be better. kvm_set_msr_comm() { switch (msr) { case MSR_IA32_P5_MC_ADDR: case MSR_IA32_P5_MC_TYPE: case MSR_IA32_MCG_CAP: case MSR_IA32_MCG_CTL: case MSR_IA32_MCG_STATUS: case MSR_IA32_MC0_CTL ... MSR_IA32_MC0_MISC + 4 * KVM_MCE_MAX_BANK: set_msr_mce(); break; ... } ... } Yes. Just make sure KVM_MCE_MAX_BANK (better change to KVM_MCE_NR_BANK, with MAX you never know if it's the index of the last bank or the number of banks) doesn't conflict with any other MSRs. + +int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) +{ + u64 data; + + switch (msr) { + case 0xc0010010: /* SYSCFG */ + case 0xc0010015: /* HWCR */ Please use MSR_ constants (add them if they don't exist yet). In fact, this is not added by me. But I can change this by the way. Oh okay. So don't change them in this patch. Why not always allocate it on vcpu setup? Because the MCE bank number is not fixed, it is based on mcg_cap from user space. Right, but we can allocate the maximum number, no? it's a fairly small amount of memory. +static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu, + struct kvm_x86_mce *mce) +{ + u64 mcg_cap = vcpu->arch.mcg_cap; + unsigned bank_num = mcg_cap & 0xff; + u64 *banks = vcpu->arch.mce_banks; + + if (mce->bank >= bank_num || !(mce->status & MCI_STATUS_VAL)) + return -EINVAL; + /* +* if IA32_MCG_CTL is not all 1s, the uncorrected error +* reporting is disabled +*/ + if ((mce->status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && + vcpu->arch.mcg_ctl != ~(u64)0) + return 0; + banks += 4 * mce->bank; + /* +* if IA32_MCi_CTL is not all 1s, the uncorrected error +* reporting is disabled for the bank +*/ + if ((mce->status & MCI_STATUS_UC) && banks[0] != ~(u64)0) + return 0; + if (mce->status & MCI_STATUS_UC) { + u64 status = mce->status; + if ((vcpu->arch.mcg_status & MCG_STATUS_MCIP) || + !(vcpu->arch.cr4 & X86_CR4_MCE)) { + printk(KERN_DEBUG "kvm: set_mce: " + "injects mce exception while " + "previous one is in progress!\n"); + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); + return 0; + } + if (banks[1] & MCI_STATUS_VAL) + status |= MCI_STATUS_OVER; + banks[1] = mce->status; + banks[2] = mce->addr; + banks[3] = mce->misc; + vcpu->arch.mcg_status = mce->mcg_status; + kvm_queue_exception(vcpu, MC_VECTOR); + } else if (!(banks[1] & MCI_STATUS_VAL) || + (!(banks[1] & MCI_STATUS_UC) && + !((mcg_cap & MCG_TES_P) && ((banks[1]>>53) & 0x3) < 2))) { + u64 status = mce->status; + if (banks[1] & MCI_STATUS_VAL) + status |= MCI_STATUS_OVER; + banks[1] = mce->status; + banks[2] = mce->addr; + banks[3] = mce->misc; + } else + banks[1] |= MCI_STATUS_OVER; + return 0; +} Can userspace just use KVM_SET_MSR for this? In addition to assigning MSR, we have some other logic for MCE, such as BANK reporting disabling, overwriting rules, triple fault for UC MCE during MCIP. So I think we need some dedicate interface. Yes, you're right. + case KVM_X86_SETUP_MCE: { + u64 mcg_cap; + + r = -EFAULT; + if (copy_from_user(&mcg_cap, argp, sizeof mcg_cap)) + goto out; + /* +* extended machine-check state registers and CMCI are +
Re: deactivate fpu when cr0.PE ON?
Dong, Eddie wrote: Following code deactivate fpu when CR0.PE is on, any explaination? Rest of code active/deactive fpu based on cr0.TS bit. The idea is, if the guest is doing a context switch, then it will likely set its own lazy fpu code, so we anticipate it and deactivate the cpu. When cr0.pe is cleared, lazy fpu usually doesn't work so we leave it enabled. However the mechanism of deactivating the fpu on cr3 write doesn't work with npt or ept, I'd like to replace it with something better. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libkvm: fix build with --with-patched-kernel on 2.6.29
Michael S. Tsirkin wrote: Make libkvm build with --with-patched-kernel on 2.6.29. Applied, thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Cleanup vmx_intr_assist()
Gleb Natapov wrote: Signed-off-by: Gleb Natapov --- arch/x86/kvm/vmx.c | 55 1 files changed, 30 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 06252f7..9eb518f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -3309,6 +3309,34 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx) } } +static void vmx_intr_inject(struct kvm_vcpu *vcpu) +{ + /* try to reinject previous events if any */ + if (vcpu->arch.nmi_injected) { + vmx_inject_nmi(vcpu); + return; + } + + if (vcpu->arch.interrupt.pending) { + vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); + return; + } + + /* try to inject new event if pending */ + if (vcpu->arch.nmi_pending) { + if (vcpu->arch.nmi_window_open) { + vcpu->arch.nmi_pending = false; + vcpu->arch.nmi_injected = true; + vmx_inject_nmi(vcpu); + } + } else if (kvm_cpu_has_interrupt(vcpu)) { + if (vcpu->arch.interrupt_window_open) { + kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu)); + vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); + } + } +} + static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { bool req_int_win = !irqchip_in_kernel(vcpu->kvm) && @@ -3323,32 +3351,9 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS); - if (vcpu->arch.nmi_pending && !vcpu->arch.nmi_injected) { - if (vcpu->arch.interrupt.pending) { - enable_nmi_window(vcpu); - } else if (vcpu->arch.nmi_window_open) { - vcpu->arch.nmi_pending = false; - vcpu->arch.nmi_injected = true; - } else { - enable_nmi_window(vcpu); - return; - } - } - - if (vcpu->arch.nmi_injected) { - vmx_inject_nmi(vcpu); - goto out; - } - - if (!vcpu->arch.interrupt.pending && kvm_cpu_has_interrupt(vcpu)) { - if (vcpu->arch.interrupt_window_open) - kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu)); - } - - if (vcpu->arch.interrupt.pending) - vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr); + vmx_intr_inject(vcpu); -out: + /* enable NMI/IRQ window open exits if needed */ if (vcpu->arch.nmi_pending) enable_nmi_window(vcpu); else if (kvm_cpu_has_interrupt(vcpu) || req_int_win) Not sure I understand the motivation. Just replace a 'goto out' with a return? -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Consolidate userspace and kernel interrupt injection for VMX.
Gleb Natapov wrote: Use the same callback to inject irq/nmi events no matter what irqchip is in use. Only from VMX for now. Signed-off-by: Gleb Natapov --- arch/x86/include/asm/kvm_host.h |2 + arch/x86/kvm/svm.c |2 + arch/x86/kvm/vmx.c | 71 +-- arch/x86/kvm/x86.c |2 + 4 files changed, 19 insertions(+), 58 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e672ca5..4e39c40 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -520,7 +520,7 @@ struct kvm_x86_ops { void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr, bool has_error_code, u32 error_code); bool (*exception_injected)(struct kvm_vcpu *vcpu); - void (*inject_pending_irq)(struct kvm_vcpu *vcpu); + void (*inject_pending_irq)(struct kvm_vcpu *vcpu, struct kvm_run *run); kvm_run is available as vcpu->run, so this isn't needed. But better to keep it for now and drop it in a later patch. static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) { int ret; @@ -3351,8 +3309,11 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx) } } -static void vmx_intr_assist(struct kvm_vcpu *vcpu) +static void vmx_intr_assist(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { + bool req_int_win = !irqchip_in_kernel(vcpu->kvm) && + kvm_run->request_interrupt_window; + update_tpr_threshold(vcpu); vmx_update_window_states(vcpu); @@ -3373,25 +3334,25 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu) return; } } Why not convert the 'enable_nmi_window(); return;' above to a 'goto out' like you do elsewhere? -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] Make kvm_cpu_(has|get)_interrupt() work for userspace irqchip too.
Gleb Natapov wrote: * check if there are pending timer events @@ -48,14 +49,17 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v) { struct kvm_pic *s; - if (kvm_apic_has_interrupt(v) == -1) { /* LAPIC */ - if (kvm_apic_accept_pic_intr(v)) { - s = pic_irqchip(v->kvm); /* PIC */ - return s->output; - } else - return 0; + if (irqchip_in_kernel(v->kvm)) { + if (kvm_apic_has_interrupt(v) == -1) { /* LAPIC */ + if (kvm_apic_accept_pic_intr(v)) { + s = pic_irqchip(v->kvm); /* PIC */ + return s->output; + } else + return 0; + } + return 1; } - return 1; + return v->arch.irq_summary; } Use if (!irqchip_in_kernel(v->kvm)) for userspace seems more simple(rather than a series of indention...). As long as lines are smaller then 80 chars I don't care much :) If new version of the patch will be needed I'll change this. Please change it, the less indentation levels the better. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] align vga rom to 4k boundary.
Glauber Costa wrote: Instead of aligning to 2k boundary, as required by the bios, align to 4k boundary, as required by kvm memory functions. Without this patch, starting kvm with -vga std option fails with: create_userspace_phys_mem: Invalid argument kvm_cpu_register_physical_memory: failed as described by: https://bugzilla.redhat.com/show_bug.cgi?id=494376 It does not fail with cirrus vga, because it is naturally aligned. This problem does not seem to affect upstream qemu. exit(1); } /* Round up vga bios size to the next 2k boundary */ - vga_bios_size = (vga_bios_size + 2047) & ~2047; + vga_bios_size = (vga_bios_size + 4095) & ~4095; option_rom_start = 0xc + vga_bios_size; /* setup basic memory access */ Comment needs updating. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] propagate errors on failed migration.
Glauber Costa wrote: We're currently ignoring any errors if dirty logging fails. Set error on migration file if we're unable to put dirty logging on. Applied, thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] Add lcall decoding.
Gleb Natapov wrote: No emulation yet. Signed-off-by: Gleb Natapov --- arch/x86/kvm/x86_emulate.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index c015063..fe0dec2 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -154,7 +154,8 @@ static u32 opcode_table[256] = { /* 0x90 - 0x97 */ DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, /* 0x98 - 0x9F */ - 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, + 0, 0, SrcImm | Src2Imm16 | ImplicitOps, 0, ImplicitOps | Stack, + ImplicitOps | Stack, 0, 0, /* 0xA0 - 0xA7 */ ByteOp | DstReg | SrcMem | Mov | MemAbs, DstReg | SrcMem | Mov | MemAbs, ByteOp | DstMem | SrcReg | Mov | MemAbs, DstMem | SrcReg | Mov | MemAbs, Please keep 4 or 8 opcodes per line. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/9] Completely decode in/out at decoding stage.
Gleb Natapov wrote: Signed-off-by: Gleb Natapov --- arch/x86/kvm/x86_emulate.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 3c23af0..cf27e62 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -193,8 +193,10 @@ static u32 opcode_table[256] = { 0, 0, 0, 0, 0, 0, 0, 0, /* 0xE0 - 0xE7 */ 0, 0, 0, 0, - SrcNone | ByteOp | ImplicitOps, SrcNone | ImplicitOps, - SrcNone | ByteOp | ImplicitOps, SrcNone | ImplicitOps, + SrcNone | ByteOp | SrcImmByte | ImplicitOps, + SrcNone | SrcImmByte | ImplicitOps, + SrcNone | ByteOp | SrcImmByte | ImplicitOps, + SrcNone | SrcImmByte | ImplicitOps, SrcImmByte sign extends, but you want zero extension here. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvm: Fix overlapping check for memory slots
This nice little buglet complicates a smarter slot management in qemu user space just "slightly". Sigh... > When checking for overlapping slots on registration of a new one, kvm currently also considers zero-length (ie. deleted) slots and rejects requests incorrectly. This finally denies user space from joining slots. Fix the check by skipping deleted slots. Signed-off-by: Jan Kiszka --- virt/kvm/kvm_main.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 363af32..18f06d2 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1117,7 +1117,7 @@ int __kvm_set_memory_region(struct kvm *kvm, for (i = 0; i < KVM_MEMORY_SLOTS; ++i) { struct kvm_memory_slot *s = &kvm->memslots[i]; - if (s == memslot) + if (s == memslot || !s->npages) continue; if (!((base_gfn + npages <= s->base_gfn) || (base_gfn >= s->base_gfn + s->npages))) signature.asc Description: OpenPGP digital signature
Re: kvm-84: Nested virtualization, crashes, and kvm binary name
On 10.04.2009, at 19:08, Mihail Panev wrote: Hi Alex, Alexander Graf wrote: So question #1: Is this the right thing to start, and if yes, what's the story behind that name? I ran across some qemu-system-i386 on google, but my compile did not produce such a binary. Yes, the binary is called "qemu-system-x86_64". The story behind all this is somewhere in the archive of this ML. Yeah, I had already checked that out before I posted, but what I found only seems to clarify the difference between qemu- and qemu-system-. What baffles me most is why the heck does it generate a x86_64 target on a i686 system?! Or has KVM dropped support for 32-bit x86 as a guest platform altogether, relying on x86_64's legacy/compatibility mode? That would make sense, but on the other hand it could mislead into thinking that you can run a 64-bit guest on it, which wouldn't work when the host is actually 32-bit (like in my case). Or do I miss something here? Please don't question the usefulness of the decision. It is called "qemu-system-x86_64" for both i386 and x86_64 alike. This will probably go away when qemu and kvm-userspace merge some day. I actually thought that had already happened, after I read the changelog for kvm-84. Now I see that I must have interpreted "merge qemu-svn" the other way round as it was meant :-) kvm-userspace merges in upstream qemu changes from time to time. Also some developers try to push their changes in kvm-userspace to upstream qemu. The process is not finished yet though, as you still have 2 distinct trees. I agree that this is not exactly obvious, but I wanted to guard the code as heavily as possible :-). Guard it? Is that still experimental? I thought this was enabled by default... Yes, it's experimental. Actually it would have been more obvious if there was a manpage documenting that. However, my build did not produce any manpages, although the configure script said "Manual directory /usr/local/ share/man". Even if so, I think that needing to set the module parameter explicitly is enough of a safeguard. Apropos parameter, I'd rather name it something more self-explanatory like "nested_virt" or "nested_svm" or something. Most people, including me, usually associate "nested" with NPT, when it comes to virtualization. I was imagining a world where you would always have the feature enabled and choose its activation via -cpu. Per default -cpu is set to "compatible" - a CPU that can be migrated even across intel and AMD platforms. If you choose -cpu barcelona or so you get SVM features in the guest. For now it's the way you see it though and will be until I find time to make the code work perfectly ;-). And while we are at parameters, kvm-intel.ko's enable/disable parameters are bool, which actually makes more sense. I think it wouldn't hurt to make that consistent across modules, i.e. either turn amd's to bool, or intel's to int. Also, keep in mind that for now only KVM in KVM works for me. Getting for example Xen running should be definitely doable - it just didn't work for me last time I tried. Speed is not exactly great yet either. After I "discovered" the -enable-nesting thing, I tried it and it worked fine. More precisely, I was able to start another guest within the guest, and it ran OK in text mode. As soon as it booted gdm, it hanged. I used the cirrus vga for the nested guest. The (outer) guest was a standard Debian Lenny, so the nested guest used the kvm version shipped with that, which is kvm-72. That actually shouldn't matter, but who knows... At that point, I somehow managed to shoot off the sshd, since I was doing that over SSH to my university machine. Now the remote machine seems to run, but the SSH daemon is down. Thus, I cannot reconnect to make further tests. Due to Easter holidays, I have no physical access to the machine right now, so I will be able to follow up on the matter from Tuesday onwards. Btw, SDL performance over SSH X-forwarding just plain sucks, even on a 100MBit network. Yep - there was a discussion on qemu-devel about this. VNC is a bit better, but there I have an issue with a "double mouse pointer", having the local one as a black dot and the remote one the usual way, and they get terribly out of sync all the time. It's an enormous PITA! Do you experience that too? It happens here no matter how I set the "render mouse pointer locally" setting in xvnc. This is because you have several conversion layers here. Normal VNC just sets the mouse pointer on the remote display. With a VM you emulate a normal mouse though, that only knows things like "move left by 20". That means you're stuck with a relative input device, which will always bring you the "double mouse cursor" effect. If you want to have an absolute input device, try to use "vmmouse" instead of "mouse" as input driver in your xorg.conf. PS: I'm offline for 2 weeks now, so don't expect any reply