[v2 00/25] Add VT-d Posted-Interrupts support
VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts. With VT-d Posted-Interrupts enabled, external interrupts from direct-assigned devices can be delivered to guests without VMM intervention while the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

v1->v2:
* Use the VFIO framework to enable this feature. The VFIO part of this series is based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control".
* Rebase this patchset on git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, then revise some irq logic based on the new hierarchy irqdomain patches provided by Jiang Liu jiang@linux.intel.com

This patch series is made of the following groups:
1-6: Some preparation changes in the iommu and irq components, based on the new hierarchy irqdomain logic.
7-9, 25: IOMMU changes for VT-d Posted-Interrupts, such as feature detection and the command line parameter.
10-16, 21-24: Changes related to KVM itself.
17-19: Changes in the VFIO component. This part was previously sent out as "[RFC PATCH v2 0/2] kvm-vfio: implement the vfio skeleton for VT-d Posted-Interrupts".
20: x86 irq related changes.

Feng Wu (25):
  genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
  iommu: Add new member capability to struct irq_remap_ops
  iommu, x86: Define new irte structure for VT-d Posted-Interrupts
  iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
  x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
  iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  iommu, x86: Add cap_pi_support() to detect VT-d PI capability
  iommu, x86: Add intel_irq_remapping_capability() for Intel
  iommu, x86: define irq_remapping_cap()
  KVM: change struct pi_desc for VT-d Posted-Interrupts
  KVM: Add some helper functions for Posted-Interrupts
  KVM: Initialize VT-d Posted-Interrupts Descriptor
  KVM: Define a new interface kvm_find_dest_vcpu() for VT-d PI
  KVM: Get Posted-Interrupts descriptor address from struct kvm_vcpu
  KVM: Make struct kvm_irq_routing_table accessible
  KVM: make kvm_set_msi_irq() public
  KVM: kvm-vfio: User API for VT-d Posted-Interrupts
  KVM: kvm-vfio: implement the VFIO skeleton for VT-d Posted-Interrupts
  KVM: x86: kvm-vfio: VT-d posted-interrupts setup
  x86, irq: Define a global vector for VT-d Posted-Interrupts
  KVM: Update Posted-Interrupts descriptor during vCPU scheduling
  KVM: Change NDST field after vCPU scheduling
  KVM: Add the handler for Wake-up Vector
  KVM: Suppress posted-interrupt when 'SN' is set
  iommu/vt-d: Add a command line parameter for VT-d posted-interrupts

 Documentation/kernel-parameters.txt        |   1 +
 Documentation/virtual/kvm/devices/vfio.txt |   9 +
 arch/x86/include/asm/entry_arch.h          |   2 +
 arch/x86/include/asm/hardirq.h             |   1 +
 arch/x86/include/asm/hw_irq.h              |   2 +
 arch/x86/include/asm/irq_remapping.h       |  11 ++
 arch/x86/include/asm/irq_vectors.h         |   1 +
 arch/x86/include/asm/kvm_host.h            |  14 ++
 arch/x86/kernel/apic/msi.c                 |   1 +
 arch/x86/kernel/entry_64.S                 |   2 +
 arch/x86/kernel/irq.c                      |  27 +++
 arch/x86/kernel/irqinit.c                  |   2 +
 arch/x86/kvm/Makefile                      |   2 +-
 arch/x86/kvm/kvm_vfio_x86.c                |  68
 arch/x86/kvm/vmx.c                         | 251 +++-
 arch/x86/kvm/x86.c                         |  38 -
 drivers/iommu/intel_irq_remapping.c        |  64 +++
 drivers/iommu/irq_remapping.c              |  24 +++-
 drivers/iommu/irq_remapping.h              |   8 +
 include/linux/dmar.h                       |  32
 include/linux/intel-iommu.h                |   1 +
 include/linux/irq.h                        |   7 +
 include/linux/kvm_host.h                   |  43 +
 include/uapi/linux/kvm.h                   |  10 +
 kernel/irq/chip.c                          |  14 ++
 kernel/irq/manage.c                        |  20 +++
 virt/kvm/irq_comm.c                        |  43 +-
 virt/kvm/irqchip.c                         |  11 --
 virt/kvm/kvm_main.c                        |  14 ++
 virt/kvm/vfio.c                            | 103
 30 files changed, 799 insertions(+), 27 deletions(-)
 create mode 100644 arch/x86/kvm/kvm_vfio_x86.c

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: cpuid: mask more bits in leaf 0xd and subleaves
On 03/12/2014 00:05, Radim Krčmář wrote:
> 2014-12-02 14:09+0100, Paolo Bonzini:
>> - EAX=0Dh, ECX=1: output registers EBX/ECX/EDX are reserved.
>
> (As good as reserved without XSAVES/IA32_XSS.)
>
>> - EAX=0Dh, ECX>1: output register ECX is zero for all the CPUID leaves we support, because variable "supported" comes from XCR0 and not XSS. However, only bits above 0 are reserved. Output register EDX is reserved.
>
> (Yes. Well, EDX is 0 when the sub-leaf is invalid.)
>
>> Source: Intel Architecture Instruction Set Extensions Programming Reference, ref. number 319433-022
>>
>> Signed-off-by: Paolo Bonzini pbonz...@redhat.com
>> ---
>>  arch/x86/kvm/cpuid.c | 13 ++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 0d919bc33b02..b1366743a728 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -470,10 +470,17 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>  			goto out;
>>
>>  		do_cpuid_1_ent(&entry[i], function, idx);
>> -		if (idx == 1)
>> +		if (idx == 1) {
>>  			entry[i].eax &= kvm_supported_word10_x86_features;
>> -		else if (entry[i].eax == 0 || !(supported & mask))
>> -			continue;
>> +			entry[i].ebx = 0;
>> +			entry[i].ecx = 0;
>> +		} else {
>> +			if (entry[i].eax == 0 || !(supported & mask))
>> +				continue;
>> +			WARN_ON_ONCE(entry[i].ecx & 1);
>> +			entry[i].ecx &= 1;
>
> ECX Bit 0 is set if the sub-leaf index, n, maps to a valid bit in the IA32_XSS MSR and bit 0 is clear if n maps to a valid bit in XCR0.
>
> ECX should be set to 0 instead, we definitely don't map to a valid bit in IA32_XSS now.

Well, there is a WARN just above. :) But I can change it to zero instead.

> (Having only one part of cpuid ready for it is weird ...)
>
>> +		}
>> +		entry[i].edx = 0;
>>  		entry[i].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>
> (Unrelated, I have yet to understand how this flag translates
>  * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0.)

If the index is invalid, entry[i].eax is zero and we do not return anything at all.

Paolo

>>  		++*nent;
>
> Forcing a change of the XSAVES implementation is a likely purpose of this patch and it is correct after changing the ecx handling, so then,
>
> Reviewed-by: Radim Krčmář rkrc...@redhat.com
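The masking rules agreed on in this exchange can be sketched as a standalone helper. This is only an illustration, not the kernel code: `struct cpuid_regs` and `mask_leaf_0xd_subleaf()` are hypothetical names, `xsaveopt_mask` stands in for kvm_supported_word10_x86_features, `supported` stands in for the XCR0-derived bitmap, and ECX is forced to zero as Radim suggested rather than masked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPUID output entry; not the kernel's struct. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Mask one sub-leaf (1..63) of CPUID leaf 0xD as discussed in the thread.
 * Returns false when the sub-leaf should not be reported at all.
 */
static bool mask_leaf_0xd_subleaf(struct cpuid_regs *r, uint32_t idx,
                                  uint64_t supported, uint32_t xsaveopt_mask)
{
    if (idx == 0 || idx >= 64)
        return false;           /* outside the range this sketch covers */

    if (idx == 1) {
        r->eax &= xsaveopt_mask;
        r->ebx = 0;             /* as good as reserved without XSAVES/IA32_XSS */
        r->ecx = 0;
    } else {
        if (r->eax == 0 || !(supported & (1ULL << idx)))
            return false;       /* invalid or unsupported state component */
        r->ecx = 0;             /* bit 0 clear: component maps to XCR0, not IA32_XSS */
    }
    r->edx = 0;                 /* reserved */
    return true;
}
```

Returning false mirrors the `continue` in the patch: an invalid sub-leaf produces no entry at all instead of an all-zero one.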
Re: usb audio device troubles
Hi all,

On 12/02/2014 01:43 PM, Paolo Bonzini wrote:
> On 02/12/2014 13:16, Eric S. Johansson wrote:
>> I got win7 installed, virtio devices working and took forever to trickle in updates because of a w7 update manager bug that takes up all cpu resources. now I got DNS 13 installed but I'm getting no audio. I pass through the usb audio device (logitech h800 USB 046d:0a29) and it is seen as a device in windows. then I hear the headset sync-up beeps and the device vanishes from windows.
>>
>> pointers as to what I should look at next?
>
> Adding back Hans and Gerd...

Eric, are you using usb-host redirection, or Spice's usb network redir?

If you do not know, please describe how (which ui-elements / cmdline) you are redirecting the device.

Regards,

Hans
Re: usb audio device troubles
On 12/3/2014 3:21 AM, Hans de Goede wrote:
> Hi all,
>
> On 12/02/2014 01:43 PM, Paolo Bonzini wrote:
>> On 02/12/2014 13:16, Eric S. Johansson wrote:
>>> I got win7 installed, virtio devices working and took forever to trickle in updates because of a w7 update manager bug that takes up all cpu resources. now I got DNS 13 installed but I'm getting no audio. I pass through the usb audio device (logitech h800 USB 046d:0a29) and it is seen as a device in windows. then I hear the headset sync-up beeps and the device vanishes from windows.
>>>
>>> pointers as to what I should look at next?
>>
>> Adding back Hans and Gerd...
>
> Eric, are you using usb-host redirection, or Spice's usb network redir?

Host redirection I assume. It was from the collection of devices UI and I added the device to pass through from the list of host USB devices.
Re: [PATCH RESCEND v2] target-i386: Intel xsaves
On 03/12/2014 03:36, Wanpeng Li wrote:
> Add xsaves related definition, it also adds corresponding part to kvm_get/put, and vmstate.
>
> Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
> ---
> v1 -> v2:
>  * use a subsection instead of bumping the version number.
>
>  target-i386/cpu.h     |  2 ++
>  target-i386/kvm.c     | 15 +++
>  target-i386/machine.c | 21 +
>  3 files changed, 38 insertions(+)
>
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 015f5b5..cff7433 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -389,6 +389,7 @@
>  #define MSR_VM_HSAVE_PA 0xc0010117
>
>  #define MSR_IA32_BNDCFGS 0x0d90
> +#define MSR_IA32_XSS 0x0da0
>
>  #define XSTATE_FP (1ULL << 0)
>  #define XSTATE_SSE (1ULL << 1)
> @@ -1019,6 +1020,7 @@ typedef struct CPUX86State {
>      uint64_t xstate_bv;
>
>      uint64_t xcr0;
> +    uint64_t xss;
>
>      TPRAccess tpr_access_type;
>  } CPUX86State;
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index ccf36e8..c6fc417 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -80,6 +80,7 @@ static bool has_msr_hv_hypercall;
>  static bool has_msr_hv_vapic;
>  static bool has_msr_hv_tsc;
>  static bool has_msr_mtrr;
> +static bool has_msr_xss;
>
>  static bool has_msr_architectural_pmu;
>  static uint32_t num_architectural_pmu_counters;
> @@ -826,6 +827,10 @@ static int kvm_get_supported_msrs(KVMState *s)
>                      has_msr_bndcfgs = true;
>                      continue;
>                  }
> +                if (kvm_msr_list->indices[i] == MSR_IA32_XSS) {
> +                    has_msr_xss = true;
> +                    continue;
> +                }
>              }
>          }
>
> @@ -1224,6 +1229,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>      if (has_msr_bndcfgs) {
>          kvm_msr_entry_set(&msrs[n++], MSR_IA32_BNDCFGS, env->msr_bndcfgs);
>      }
> +    if (has_msr_xss) {
> +        kvm_msr_entry_set(&msrs[n++], MSR_IA32_XSS, env->xss);
> +    }
>  #ifdef TARGET_X86_64
>      if (lm_capable_kernel) {
>          kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar);
> @@ -1570,6 +1578,10 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (has_msr_bndcfgs) {
>          msrs[n++].index = MSR_IA32_BNDCFGS;
>      }
> +    if (has_msr_xss) {
> +        msrs[n++].index = MSR_IA32_XSS;
> +    }
> +
>
>      if (!env->tsc_valid) {
>          msrs[n++].index = MSR_IA32_TSC;
> @@ -1717,6 +1729,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>          case MSR_IA32_BNDCFGS:
>              env->msr_bndcfgs = msrs[i].data;
>              break;
> +        case MSR_IA32_XSS:
> +            env->xss = msrs[i].data;
> +            break;
>          default:
>              if (msrs[i].index >= MSR_MC0_CTL &&
>                  msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
> diff --git a/target-i386/machine.c b/target-i386/machine.c
> index 1c13b14..722d62e 100644
> --- a/target-i386/machine.c
> +++ b/target-i386/machine.c
> @@ -687,6 +687,24 @@ static const VMStateDescription vmstate_avx512 = {
>      }
>  };
>
> +static bool xss_needed(void *opaque)
> +{
> +    X86CPU *cpu = opaque;
> +    CPUX86State *env = &cpu->env;
> +
> +    return env->xss != 0;
> +}
> +
> +static const VMStateDescription vmstate_xss = {
> +    .name = "cpu/xss",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(env.xss, X86CPU),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  VMStateDescription vmstate_x86_cpu = {
>      .name = "cpu",
>      .version_id = 12,
> @@ -832,6 +850,9 @@ VMStateDescription vmstate_x86_cpu = {
>          }, {
>              .vmsd = &vmstate_avx512,
>              .needed = avx512_needed,
> +        }, {
> +            .vmsd = &vmstate_xss,
> +            .needed = xss_needed,
>          } , {
>              /* empty */
>          }

Thanks, applied to uq/master.

Paolo
Re: usb audio device troubles
Hi,

On 12/03/2014 09:31 AM, Eric S. Johansson wrote:
> On 12/3/2014 3:21 AM, Hans de Goede wrote:
>> Hi all,
>>
>> On 12/02/2014 01:43 PM, Paolo Bonzini wrote:
>>> On 02/12/2014 13:16, Eric S. Johansson wrote:
>>>> I got win7 installed, virtio devices working and took forever to trickle in updates because of a w7 update manager bug that takes up all cpu resources. now I got DNS 13 installed but I'm getting no audio. I pass through the usb audio device (logitech h800 USB 046d:0a29) and it is seen as a device in windows. then I hear the headset sync-up beeps and the device vanishes from windows.
>>>>
>>>> pointers as to what I should look at next?
>>>
>>> Adding back Hans and Gerd...
>>
>> Eric, are you using usb-host redirection, or Spice's usb network redir?
>
> Host redirection I assume. It was from the collection of devices UI and I added the device to pass through from the list of host USB devices.

Ok, then Gerd is probably the best person to help you further.

Regards,

Hans
Re: [Qemu-devel] KVM call for agenda for 2014-12-08
Hi Juan, is this for the 9th, or did I get the day wrong?

Anyway - I would like to talk about Multi-core - a huge thank you to everybody for your feedback, we’ll be starting work on this, and I’d like to bring a proposal in terms of the path we’ll take and get consensus on the first steps.

Cheers

Mark.

On 2 Dec 2014, at 20:56, Juan Quintela quint...@redhat.com wrote:
> Hi
>
> Please, send any topic that you are interested in covering.
>
> Thanks, Juan.
>
> By popular demand, a google calendar public entry with it
>
> https://www.google.com/calendar/embed?src=dG9iMXRqcXAzN3Y4ZXZwNzRoMHE4a3BqcXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
>
> (Let me know if you have any problems with the calendar entry)
>
> If you need phone number details, contact me privately
>
> Thanks, Juan.
>
> PD. Use the google calendar entry to know the time, I gave up at getting three timezones right.

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210
+33 (0)603762104
mark.burton
Re: [kvm-unit-tests PATCH] x86: emulator: Fix h_mem usage in tests_smsw
On 02/12/2014 23:22, Chris J Arges wrote:
> In emulator.c/tests_smsw, smsw (3) fails because h_mem isn't being set correctly before smsw is called. By declaring the h_mem function parameter as volatile, the compiler no longer optimizes out the assignment before smsw.
>
> Signed-off-by: Chris J Arges chris.j.ar...@canonical.com
> ---
>  x86/emulator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/x86/emulator.c b/x86/emulator.c
> index 5aa4dbf..570628f 100644
> --- a/x86/emulator.c
> +++ b/x86/emulator.c
> @@ -337,7 +337,7 @@ void test_incdecnotneg(void *mem)
>      report("lock notb", *mb == vb);
>  }
>
> -void test_smsw(uint64_t *h_mem)
> +void test_smsw(volatile uint64_t *h_mem)
>  {
>  	char mem[16];
>  	unsigned short msw, msw_orig, *pmsw;

What if you change

    asm volatile("smsw %0" : "=m"(*h_mem));

to

    asm volatile("smsw %0" : "+m"(*h_mem));

Paolo
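A small sketch of why the "+m" constraint matters when an instruction only writes part of a memory operand. It is not the kvm-unit-tests code: `store_low16()` and `partial_write_demo()` are made-up helpers, and a plain 16-bit `movw` stands in for smsw so that the example runs unprivileged. With "=m" the compiler is told the old contents are dead and may drop the earlier store; with "+m" the operand is read-write, so the store has to be kept.

```c
#include <stdint.h>

/*
 * Overwrite only the low 16 bits of *p, the way smsw does with a memory
 * operand. Hypothetical helper for illustration.
 */
static void store_low16(uint64_t *p, uint16_t v)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+m": *p is both input and output, so earlier stores to it must stay. */
    __asm__ volatile("movw %1, %0" : "+m"(*p) : "r"(v));
#else
    /* Portable fallback with the same effect on other targets. */
    *p = (*p & ~(uint64_t)0xffff) | v;
#endif
}

static uint64_t partial_write_demo(void)
{
    uint64_t mem = 0x12345678deadbeefULL;  /* upper 48 bits must survive */

    store_low16(&mem, 0xabcd);
    return mem;
}
```

Compared with marking the pointer volatile, the read-write constraint describes the actual data dependency to the compiler, which is presumably why it was suggested as the alternative.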
Re: usb audio device troubles
On 12/3/2014 3:52 AM, Hans de Goede wrote:
>>> Eric, are you using usb-host redirection, or Spice's usb network redir?
>>
>> Host redirection I assume. It was from the collection of devices UI and I added the device to pass through from the list of host USB devices.
>
> Ok, then Gerd is probably the best person to help you further

Let's try this from a different perspective. Which connection method will give me the most reliable/stable USB connection with the cleanest audio outcomes? Whatever would work best is what I want to use.

--- eric
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Tue, 2 Dec 2014 21:03:45 +0200 "Michael S. Tsirkin" m...@redhat.com wrote:
> On Tue, Dec 02, 2014 at 04:41:36PM +0100, Cornelia Huck wrote:
>>  void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
>>  {
>> +    /*
>> +     * For virtio-1 devices, the number of buffers may only be
>> +     * updated if the ring addresses have not yet been set up.
>
> Where does it say that?

Hmpf, may have imagined that.

This means we either need to track whether used/avail have been specified or calculated or move responsibility for re-calculation of used/avail for the old layout into the callers.

>> +     */
>> +    if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1) &&
>> +        vdev->vq[n].vring.desc) {
>> +        error_report("tried to modify buffer num for virtio-1 device");
>> +        return;
>> +    }
>>      /* Don't allow guest to flip queue between existent and
>>       * nonexistent states, or to set it to an invalid size.
>>       */
Re: usb audio device troubles
  Hi,

>>>> I pass through the usb audio device (logitech h800 USB 046d:0a29) and it is seen as a device in windows. then I hear the headset sync-up beeps and the device vanishes from windows.
>>>>
>>>> pointers as to what I should look at next?
>>>
>>> Adding back Hans and Gerd...
>>
>> Eric, are you using usb-host redirection, or Spice's usb network redir?
>
> Host redirection I assume. It was from the collection of devices UI and I added the device to pass through from the list of host USB devices.

Sounds like virt-manager. qemu logs should be at /var/log/libvirt/qemu/$guest.log then. Any error messages in there? Any messages in the host kernel log (about usb device reset maybe?)

Do you use a usb 2 controller? If not, can you try whether that improves things? It can be switched when you pick the controller usb in the virt-manager devices ui.

cheers,
  Gerd
What's the difference between EPT_MISCONFIG and EPT_VIOLATION?
Hi,

EXIT_REASON_EPT_VIOLATION's corresponding handler is handle_ept_violation(), and EXIT_REASON_EPT_MISCONFIG's corresponding handler is handle_ept_misconfig(). What's the difference between them?

I read SDM 3C, section 28.2.3 "EPT-Induced VM Exits", and found the description below:

  An EPT misconfiguration occurs when, in the course of translating a guest-physical address, the logical processor encounters an EPT paging-structure entry that contains an unsupported value. An EPT violation occurs when there is no EPT misconfiguration but the EPT paging-structure entries disallow an access using the guest physical address.

According to the above description, EPT_MISCONFIG comes from erroneous settings, but judging from its exit handler handle_ept_misconfig(), it seems that handle_ept_misconfig() handles mmio pagefaults. I'm really confused, I think I'm missing something. Any advice?

Thanks,
Zhang Haoyu
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Wed, 3 Dec 2014 10:27:36 +0100 Cornelia Huck cornelia.h...@de.ibm.com wrote:
> On Tue, 2 Dec 2014 21:03:45 +0200 "Michael S. Tsirkin" m...@redhat.com wrote:
>> On Tue, Dec 02, 2014 at 04:41:36PM +0100, Cornelia Huck wrote:
>>>  void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
>>>  {
>>> +    /*
>>> +     * For virtio-1 devices, the number of buffers may only be
>>> +     * updated if the ring addresses have not yet been set up.
>>
>> Where does it say that?
>
> Hmpf, may have imagined that.
>
> This means we either need to track whether used/avail have been specified or calculated or move responsibility for re-calculation of used/avail for the old layout into the callers.

What about this one instead?

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 43b7e02..1e2a720 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -244,9 +244,13 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value,
     case VIRTIO_MMIO_QUEUENUM:
         DPRINTF("mmio_queue write %d max %d\n", (int)value, VIRTQUEUE_MAX_SIZE);
         virtio_queue_set_num(vdev, vdev->queue_sel, value);
+        /* Note: only call this function for legacy devices */
+        virtio_queue_update_rings(vdev, vdev->queue_sel);
         break;
     case VIRTIO_MMIO_QUEUEALIGN:
+        /* Note: this is only valid for legacy devices */
         virtio_queue_set_align(vdev, vdev->queue_sel, value);
+        virtio_queue_update_rings(vdev, vdev->queue_sel);
         break;
     case VIRTIO_MMIO_QUEUEPFN:
         if (value == 0) {
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 8f69ffa..b2d553e 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -69,7 +69,6 @@ typedef struct VRing
 struct VirtQueue
 {
     VRing vring;
-    hwaddr pa;
     uint16_t last_avail_idx;
     /* Last used index value we have signalled on */
     uint16_t signalled_used;
@@ -92,15 +91,18 @@ struct VirtQueue
 };

 /* virt queue functions */
-static void virtqueue_init(VirtQueue *vq)
+void virtio_queue_update_rings(VirtIODevice *vdev, int n)
 {
-    hwaddr pa = vq->pa;
+    VRing *vring = &vdev->vq[n].vring;

-    vq->vring.desc = pa;
-    vq->vring.avail = pa + vq->vring.num * sizeof(VRingDesc);
-    vq->vring.used = vring_align(vq->vring.avail +
-                                 offsetof(VRingAvail, ring[vq->vring.num]),
-                                 vq->vring.align);
+    if (!vring->desc) {
+        /* not yet setup - nothing to do */
+        return;
+    }
+    vring->avail = vring->desc + vring->num * sizeof(VRingDesc);
+    vring->used = vring_align(vring->avail +
+                              offsetof(VRingAvail, ring[vring->num]),
+                              vring->align);
 }

 static inline uint64_t vring_desc_addr(VirtIODevice *vdev, hwaddr desc_pa,
@@ -605,7 +607,6 @@ void virtio_reset(void *opaque)
         vdev->vq[i].vring.avail = 0;
         vdev->vq[i].vring.used = 0;
         vdev->vq[i].last_avail_idx = 0;
-        vdev->vq[i].pa = 0;
         vdev->vq[i].vector = VIRTIO_NO_VECTOR;
         vdev->vq[i].signalled_used = 0;
         vdev->vq[i].signalled_used_valid = false;
@@ -708,13 +709,21 @@ void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data)

 void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
 {
-    vdev->vq[n].pa = addr;
-    virtqueue_init(&vdev->vq[n]);
+    vdev->vq[n].vring.desc = addr;
+    virtio_queue_update_rings(vdev, n);
 }

 hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n)
 {
-    return vdev->vq[n].pa;
+    return vdev->vq[n].vring.desc;
+}
+
+void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc,
+                            hwaddr avail, hwaddr used)
+{
+    vdev->vq[n].vring.desc = desc;
+    vdev->vq[n].vring.avail = avail;
+    vdev->vq[n].vring.used = used;
 }

 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
@@ -728,7 +737,6 @@ void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
         return;
     }
     vdev->vq[n].vring.num = num;
-    virtqueue_init(&vdev->vq[n]);
 }

 int virtio_queue_get_num(VirtIODevice *vdev, int n)
@@ -748,6 +756,11 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align)
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);

+    /* virtio-1 compliant devices cannot change the alignment */
+    if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+        error_report("tried to modify queue alignment for virtio-1 device");
+        return;
+    }
     /* Check that the transport told us it was going to do this
      * (so a buggy transport will immediately assert rather than
      * silently failing to migrate this state)
@@ -755,7 +768,6 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align)
     assert(k->has_variable_vring_alignment);

     vdev->vq[n].vring.align = align;
-    virtqueue_init(&vdev->vq[n]);
 }

 void virtio_queue_notify_vq(VirtQueue *vq)
@@ -949,7 +961,8 @@ void
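For reference, the legacy ring layout that virtio_queue_update_rings() recomputes can be worked through outside QEMU. The sketch below is an assumption-laden illustration: `struct ring_layout`, `align_up()` and `legacy_ring_layout()` are invented names, and the sizes (16-byte descriptors, a 4-byte avail header followed by 2-byte slots) are taken from the virtio ring definitions rather than from QEMU's headers.

```c
#include <stdint.h>

/* Sizes from the virtio split-ring layout. */
#define DESC_SIZE        16u  /* u64 addr, u32 len, u16 flags, u16 next */
#define AVAIL_HDR_SIZE   4u   /* u16 flags + u16 idx */
#define AVAIL_SLOT_SIZE  2u   /* one u16 per ring entry */

struct ring_layout {
    uint64_t desc, avail, used;
};

/* Round addr up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Derive avail and used from the descriptor table address, as the legacy
 * (pre virtio-1) layout requires. Returns 0 and leaves *r untouched when the
 * descriptor address has not been set yet.
 */
static int legacy_ring_layout(struct ring_layout *r, uint64_t desc,
                              unsigned int num, uint64_t align)
{
    if (!desc)
        return 0;

    r->desc = desc;
    r->avail = desc + (uint64_t)num * DESC_SIZE;
    r->used = align_up(r->avail + AVAIL_HDR_SIZE +
                       (uint64_t)num * AVAIL_SLOT_SIZE, align);
    return 1;
}
```

With 256 entries, a descriptor table at 0x10000 and 4096-byte alignment, avail lands at 0x11000 and used at 0x12000. A virtio-1 device sets all three addresses independently, which is why the patch adds virtio_queue_set_rings() and keeps this derivation for legacy callers only.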
Re: What's the difference between EPT_MISCONFIG and EPT_VIOLATION?
On Wed, Dec 03, 2014 at 05:50:33PM +0800, Zhang Haoyu wrote:
> Hi,
>
> EXIT_REASON_EPT_VIOLATION's corresponding handler is handle_ept_violation(), and EXIT_REASON_EPT_MISCONFIG's corresponding handler is handle_ept_misconfig(). What's the difference between them?
>
> I read SDM 3C, section 28.2.3 "EPT-Induced VM Exits", and found the description below:
>
>   An EPT misconfiguration occurs when, in the course of translating a guest-physical address, the logical processor encounters an EPT paging-structure entry that contains an unsupported value. An EPT violation occurs when there is no EPT misconfiguration but the EPT paging-structure entries disallow an access using the guest physical address.
>
> According to the above description, EPT_MISCONFIG comes from erroneous settings, but judging from its exit handler handle_ept_misconfig(), it seems that handle_ept_misconfig() handles mmio pagefaults. I'm really confused, I think I'm missing something. Any advice?

EXIT_REASON_EPT_VIOLATION is similar to a "page not present" pagefault.
EXIT_REASON_EPT_MISCONFIG is similar to a "reserved bit set" pagefault.
handle_ept_misconfig() handles mmio pagefaults because KVM has an optimization that uses reserved bits to mark mmio regions.

--
	Gleb.
Re: What's the difference between EPT_MISCONFIG and EPT_VIOLATION?
>> Hi,
>>
>> EXIT_REASON_EPT_VIOLATION's corresponding handler is handle_ept_violation(), and EXIT_REASON_EPT_MISCONFIG's corresponding handler is handle_ept_misconfig(). What's the difference between them?
>>
>> I read SDM 3C, section 28.2.3 "EPT-Induced VM Exits", and found the description below:
>>
>>   An EPT misconfiguration occurs when, in the course of translating a guest-physical address, the logical processor encounters an EPT paging-structure entry that contains an unsupported value. An EPT violation occurs when there is no EPT misconfiguration but the EPT paging-structure entries disallow an access using the guest physical address.
>>
>> According to the above description, EPT_MISCONFIG comes from erroneous settings, but judging from its exit handler handle_ept_misconfig(), it seems that handle_ept_misconfig() handles mmio pagefaults. I'm really confused, I think I'm missing something. Any advice?
>
> EXIT_REASON_EPT_VIOLATION is similar to a "page not present" pagefault.
> EXIT_REASON_EPT_MISCONFIG is similar to a "reserved bit set" pagefault.
> handle_ept_misconfig() handles mmio pagefaults because KVM has an optimization that uses reserved bits to mark mmio regions.

Thanks, Gleb, where does kvm use the reserved bits to mark mmio regions?

> --
> 	Gleb.
Windows 7 VM BSOD
Hi All,

I am running the 3.13.0-24-generic kernel on Ubuntu 14. The Windows 7 VM installation was fine, but the VM randomly reboots by itself; the error code is 0x0101. Does anyone know how to fix this?
Re: What's the difference between EPT_MISCONFIG and EPT_VIOLATION?
On Wed, Dec 03, 2014 at 06:12:10PM +0800, Zhang Haoyu wrote:
>>> Hi,
>>>
>>> EXIT_REASON_EPT_VIOLATION's corresponding handler is handle_ept_violation(), and EXIT_REASON_EPT_MISCONFIG's corresponding handler is handle_ept_misconfig(). What's the difference between them?
>>>
>>> I read SDM 3C, section 28.2.3 "EPT-Induced VM Exits", and found the description below:
>>>
>>>   An EPT misconfiguration occurs when, in the course of translating a guest-physical address, the logical processor encounters an EPT paging-structure entry that contains an unsupported value. An EPT violation occurs when there is no EPT misconfiguration but the EPT paging-structure entries disallow an access using the guest physical address.
>>>
>>> According to the above description, EPT_MISCONFIG comes from erroneous settings, but judging from its exit handler handle_ept_misconfig(), it seems that handle_ept_misconfig() handles mmio pagefaults. I'm really confused, I think I'm missing something. Any advice?
>>
>> EXIT_REASON_EPT_VIOLATION is similar to a "page not present" pagefault.
>> EXIT_REASON_EPT_MISCONFIG is similar to a "reserved bit set" pagefault.
>> handle_ept_misconfig() handles mmio pagefaults because KVM has an optimization that uses reserved bits to mark mmio regions.
>
> Thanks, Gleb, where does kvm use the reserved bits to mark mmio regions?

arch/x86/kvm/mmu.c:mark_mmio_spte

--
	Gleb.
Re: Windows 7 VM BSOD
> Hi All,
>
> I am running the 3.13.0-24-generic kernel on Ubuntu 14. The Windows 7 VM installation was fine, but the VM randomly reboots by itself; the error code is 0x0101. Does anyone know how to fix this?

Could you try hv_relaxed, like -cpu kvm64,hv_relaxed?

Thanks,
Zhang Haoyu
Re: What's the difference between EPT_MISCONFIG and EPT_VIOLATION?
On 03/12/2014 11:12, Zhang Haoyu wrote:
>> EXIT_REASON_EPT_VIOLATION is similar to a "page not present" pagefault.
>> EXIT_REASON_EPT_MISCONFIG is similar to a "reserved bit set" pagefault.
>> handle_ept_misconfig() handles mmio pagefaults because KVM has an optimization that uses reserved bits to mark mmio regions.
>
> Thanks, Gleb, where does kvm use the reserved bits to mark mmio regions?

ept_set_mmio_spte_mask is where KVM tells mmu.c how to mark MMIO regions. You can search mmu.c for shadow_mmio_mask and is_mmio_spte in order to find out more about this optimization; you'll also get to the mark_mmio_spte function that Gleb mentioned.

Paolo
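As a rough model of the optimization described in this thread, the toy below tags MMIO entries with a permission pattern the hardware treats as misconfigured, so that accesses arrive as EPT_MISCONFIG exits instead of EPT_VIOLATION exits. Everything prefixed `toy_`/`TOY_` is invented for illustration. The 110b pattern (write and execute without read) is an invalid EPT permission combination per the SDM, but the mask KVM actually uses is whatever ept_set_mmio_spte_mask() installs, and the real mark_mmio_spte() stores more than the frame number.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PERM_BITS   0x7ULL  /* bits 2:0: read, write, execute */
#define TOY_MMIO_VALUE  0x6ULL  /* 110b: write+execute without read */
#define TOY_GFN_SHIFT   12

enum toy_exit { TOY_NO_EXIT, TOY_EPT_VIOLATION, TOY_EPT_MISCONFIG };

/* Tag an entry as MMIO and stash the guest frame number in it. */
static uint64_t toy_mark_mmio_spte(uint64_t gfn)
{
    return (gfn << TOY_GFN_SHIFT) | TOY_MMIO_VALUE;
}

static bool toy_is_mmio_spte(uint64_t spte)
{
    return (spte & TOY_PERM_BITS) == TOY_MMIO_VALUE;
}

static uint64_t toy_get_mmio_gfn(uint64_t spte)
{
    return spte >> TOY_GFN_SHIFT;
}

/* Classify what a guest read through this entry would cause. */
static enum toy_exit toy_classify_read(uint64_t spte)
{
    if (toy_is_mmio_spte(spte))
        return TOY_EPT_MISCONFIG;   /* unsupported value: the fast MMIO path */
    if (!(spte & 0x1))
        return TOY_EPT_VIOLATION;   /* read not permitted, e.g. not present */
    return TOY_NO_EXIT;
}
```

The point of the trick is that the misconfig handler can recognize the entry as MMIO immediately and go to emulation, without first working out whether the faulting address is backed by memory.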
Re: [PATCH] KVM: arm/arm64: vgic: add init entry to VGIC KVM device
On Tue, Dec 02, 2014 at 05:50:00PM +, Peter Maydell wrote:
> On 2 December 2014 at 17:27, Eric Auger eric.au...@linaro.org wrote:
>> Since the advent of dynamic initialization of VGIC, this latter is initialized very late, on the first vcpu run. This initialization could be initiated much earlier by the user, as soon as it has provided the requested dimensioning parameters:
>> - number of IRQs and number of vCPUs,
>> - DIST and CPU interface base address.
>>
>> One motivation behind being able to initialize the VGIC sooner is related to the setup of IRQ injection in VFIO use case. The VFIO signaling, especially when used along with irqfd must be set *after* vgic initialization to prevent any virtual IRQ injection before VGIC initialization. If virtual IRQ injection occurs before the VGIC init, the IRQ cannot be injected and subsequent injection is blocked due to VFIO completion mechanism (unmask/mask or forward/unforward).
>
> This implies that you're potentially injecting virtual IRQs (and changing the state of the VGIC) before we actually start running the VM (ie before userspace calls KVM_RUN). Is that right? It seems odd, but maybe vfio works that way?

Yeah, I can't think of a cleaner way to do this. VFIO doesn't know anything about KVM or whether a machine is running or not. QEMU has to configure all this before starting a VM (wiring up IRQs after the VM is running is even more weird imho, when would you even do that?) so interrupts from the real hardware are bound to hit VFIO just before/during/after VCPUs are started, and VFIO doesn't have any caching mechanism for this state, it really has to go to the consumer of the interrupt, which is KVM in the case of forwarded interrupts.

Did I miss something obvious here?

-Christoffer
Re: Windows 7 VM BSOD
Hi,

How do I know if my qemu-kvm version supports this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>> Hi All,
>>
>> I am running the 3.13.0-24-generic kernel on Ubuntu 14. The Windows 7 VM installation was fine, but the VM randomly reboots by itself; the error code is 0x0101. Does anyone know how to fix this?
>
> Could you try hv_relaxed, like -cpu kvm64,hv_relaxed?
>
> Thanks,
> Zhang Haoyu
Re: [PATCH] KVM: arm/arm64: vgic: add init entry to VGIC KVM device
On Tue, Dec 02, 2014 at 06:27:31PM +0100, Eric Auger wrote: Since the advent of dynamic initialization of VGIC, this latter is initialized very late, on the first vcpu run. This initialization could be initiated much earlier by the user, as soon as it has provided the requested dimensioning parameters: - number of IRQs and number of vCPUs, - DIST and CPU interface base address. One motivation behind being able to initialize the VGIC sooner is related to the setup of IRQ injection in VFIO use case. The VFIO signaling, especially when used along with irqfd must be set *after* vgic initialization to prevent any virtual IRQ injection before VGIC initialization. If virtual IRQ injection occurs before the VGIC init, the IRQ cannot be injected and subsequent injection is blocked due to VFIO completion mechanism (unmask/mask or forward/unforward). This patch adds a new entry to the VGIC KVM device that allows the user to manually request the VGIC init: - a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced. - Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT The rationale behind introducing a group is to be able to add other controls later on, if needed. Obviously, as soon as the init is done, the dimensioning parameters cannot be changed. you would need to add a check in the vcpu_create path, which I don't believe we currently have. That may conflict with Andre's series so we need to coordinate. We're also seeing this potentially being useful for migration, so my feeling is that the GICv3 patches should be rebased on this patch and this patch should include a check in the vcpu create path. 
Signed-off-by: Eric Auger eric.au...@linaro.org --- Documentation/virtual/kvm/devices/arm-vgic.txt | 11 +++ arch/arm/include/uapi/asm/kvm.h| 2 ++ arch/arm64/include/uapi/asm/kvm.h | 2 ++ virt/kvm/arm/vgic.c| 14 +- 4 files changed, 28 insertions(+), 1 deletion(-) diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt index df8b0c7..80db43f 100644 --- a/Documentation/virtual/kvm/devices/arm-vgic.txt +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt @@ -81,3 +81,14 @@ Groups: -EINVAL: Value set is out of the expected range -EBUSY: Value has already be set, or GIC has already been initialized with default values. + + KVM_DEV_ARM_VGIC_GRP_CTRL + Attributes: +KVM_DEV_ARM_VGIC_CTRL_INIT + request the initialization of the VGIC, no additional parameter in + kvm_device_attr.addr. + Errors: +-ENXIO: distributor or CPU interface base address were not set prior +to that call this should be more generic to also apply to GICv3, I would suggest: VGIC not properly configured as required prior to calling this attribute. alternatively, the attribute should be KVM_DEV_ARM_VGIC_V2_CTRL_INIT. +-EINVAL: number of vcpus is not known can we have a different error code for this case? ENODEV for example? 
+-ENOMEM: memory shortage when allocating vgic internal data
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 77547bb..2499867 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -175,6 +175,8 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
 #define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
+#define KVM_DEV_ARM_VGIC_GRP_CTRL 4
+#define KVM_DEV_ARM_VGIC_CTRL_INIT 0

 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1ed4417..b35c95a 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -161,6 +161,8 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
 #define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
+#define KVM_DEV_ARM_VGIC_GRP_CTRL 4
+#define KVM_DEV_ARM_VGIC_CTRL_INIT 0

 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index b76c38c..2fe5bdb 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2474,7 +2474,14 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
         return ret;
     }
-
+    case KVM_DEV_ARM_VGIC_GRP_CTRL: {
+        switch (attr->attr) {
+        case KVM_DEV_ARM_VGIC_CTRL_INIT:
+            r = kvm_vgic_init(dev->kvm);
+            return r;
+        }
+        break;
+    }
     }

     return -ENXIO;
@@ -2553,6 +2560,11 @@ static int
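For readers following the thread, the ordering rules discussed above (configure first, then init, then no more dimensioning changes) can be sketched as a small user-space model. This is only a toy under stated assumptions, not kernel code: struct toy_vgic and the toy_* functions are hypothetical names, the error codes are the ones listed in the patch's documentation hunk, and treating a repeated init as a harmless no-op is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the VGIC configuration state; not a kernel structure. */
struct toy_vgic {
    bool dist_base_set;   /* distributor base address was provided */
    bool cpu_base_set;    /* CPU interface base address was provided */
    int  nr_vcpus;        /* number of vCPUs created so far */
    int  nr_irqs;         /* dimensioning parameter */
    bool initialized;     /* set once the explicit init has succeeded */
};

/* Models setting the number of IRQs: rejected with -EBUSY once init is done. */
static int toy_vgic_set_nr_irqs(struct toy_vgic *v, int nr_irqs)
{
    assert(v != NULL);
    if (v->initialized)
        return -EBUSY;
    v->nr_irqs = nr_irqs;
    return 0;
}

/* Models the proposed init attribute with the documented error cases. */
static int toy_vgic_ctrl_init(struct toy_vgic *v)
{
    assert(v != NULL);
    if (v->initialized)
        return 0;                   /* assumption: repeated init is a no-op */
    if (!v->dist_base_set || !v->cpu_base_set)
        return -ENXIO;              /* base addresses not set before the call */
    if (v->nr_vcpus == 0)
        return -EINVAL;             /* number of vCPUs is not known */
    v->initialized = true;
    return 0;
}
```

Whether the "no vCPUs" case should report -EINVAL or something like -ENODEV is exactly what the review comment above questions; the model simply follows the patch as posted.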
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Wed, Dec 03, 2014 at 10:50:04AM +0100, Cornelia Huck wrote: On Wed, 3 Dec 2014 10:27:36 +0100 Cornelia Huck cornelia.h...@de.ibm.com wrote: On Tue, 2 Dec 2014 21:03:45 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Tue, Dec 02, 2014 at 04:41:36PM +0100, Cornelia Huck wrote: void virtio_queue_set_num(VirtIODevice *vdev, int n, int num) { +/* + * For virtio-1 devices, the number of buffers may only be + * updated if the ring addresses have not yet been set up. Where does it say that? Hmpf, may have imagined that. This means we either need to track whether used/avail have been specified or calculated or move responsibility for re-calculation of used/avail for the old layout into the callers. What about this one instead? Looks ok overall - some questions below. diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 43b7e02..1e2a720 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -244,9 +244,13 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, case VIRTIO_MMIO_QUEUENUM: DPRINTF(mmio_queue write %d max %d\n, (int)value, VIRTQUEUE_MAX_SIZE); virtio_queue_set_num(vdev, vdev-queue_sel, value); +/* Note: only call this function for legacy devices */ +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEALIGN: +/* Note: this is only valid for legacy devices */ virtio_queue_set_align(vdev, vdev-queue_sel, value); +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEPFN: if (value == 0) { Let's just call virtio_queue_update_rings from virtio_queue_set_align? 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 8f69ffa..b2d553e 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -69,7 +69,6 @@ typedef struct VRing struct VirtQueue { VRing vring; -hwaddr pa; uint16_t last_avail_idx; /* Last used index value we have signalled on */ uint16_t signalled_used; @@ -92,15 +91,18 @@ struct VirtQueue }; /* virt queue functions */ -static void virtqueue_init(VirtQueue *vq) +void virtio_queue_update_rings(VirtIODevice *vdev, int n) { -hwaddr pa = vq-pa; +VRing *vring = vdev-vq[n].vring; -vq-vring.desc = pa; -vq-vring.avail = pa + vq-vring.num * sizeof(VRingDesc); -vq-vring.used = vring_align(vq-vring.avail + - offsetof(VRingAvail, ring[vq-vring.num]), - vq-vring.align); +if (!vring-desc) { +/* not yet setup - nothing to do */ +return; +} +vring-avail = vring-desc + vring-num * sizeof(VRingDesc); +vring-used = vring_align(vring-avail + + offsetof(VRingAvail, ring[vring-num]), + vring-align); } static inline uint64_t vring_desc_addr(VirtIODevice *vdev, hwaddr desc_pa, @@ -605,7 +607,6 @@ void virtio_reset(void *opaque) vdev-vq[i].vring.avail = 0; vdev-vq[i].vring.used = 0; vdev-vq[i].last_avail_idx = 0; -vdev-vq[i].pa = 0; vdev-vq[i].vector = VIRTIO_NO_VECTOR; vdev-vq[i].signalled_used = 0; vdev-vq[i].signalled_used_valid = false; @@ -708,13 +709,21 @@ void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data) void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr) { -vdev-vq[n].pa = addr; -virtqueue_init(vdev-vq[n]); +vdev-vq[n].vring.desc = addr; +virtio_queue_update_rings(vdev, n); } hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n) { -return vdev-vq[n].pa; +return vdev-vq[n].vring.desc; +} + +void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc, +hwaddr avail, hwaddr used) +{ +vdev-vq[n].vring.desc = desc; +vdev-vq[n].vring.avail = avail; +vdev-vq[n].vring.used = used; } void virtio_queue_set_num(VirtIODevice *vdev, int n, int num) @@ -728,7 +737,6 @@ void 
virtio_queue_set_num(VirtIODevice *vdev, int n, int num) return; } vdev-vq[n].vring.num = num; -virtqueue_init(vdev-vq[n]); } int virtio_queue_get_num(VirtIODevice *vdev, int n) @@ -748,6 +756,11 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align) BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); +/* virtio-1 compliant devices cannot change the aligment */ +if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) { +error_report(tried to modify queue alignment for virtio-1 device); +return; +} /* Check that the transport told us it was going to do this * (so a buggy transport will immediately assert rather than *
Re: Windows 7 VM BSOD
https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Windows 7 VM BSOD
Hi, How do I know if my qemu-kvm version supports this? I don't know which qemu version first supported hv-relaxed, but I'm sure qemu-1.4.1 and later versions support it. qemu will report an error if it doesn't support it. Please show your qemu version. Thanks, Zhang Haoyu hv-relaxed has been supported since commit 89314504, On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
qemu-system-x86_64 -version QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.7), Copyright (c) 2003-2008 Fabrice Bellard On Wed, Dec 3, 2014 at 7:01 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi, How do I know if my qemu-kvm version supports this? I don't know which qemu version first supported hv-relaxed, but I'm sure qemu-1.4.1 and later versions support it. qemu will report an error if it doesn't support it. Please show your qemu version. Thanks, Zhang Haoyu hv-relaxed has been supported since commit 89314504, On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Windows 7 VM BSOD
it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Wed, 3 Dec 2014 12:52:51 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Wed, Dec 03, 2014 at 10:50:04AM +0100, Cornelia Huck wrote: diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 43b7e02..1e2a720 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -244,9 +244,13 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, case VIRTIO_MMIO_QUEUENUM: DPRINTF(mmio_queue write %d max %d\n, (int)value, VIRTQUEUE_MAX_SIZE); virtio_queue_set_num(vdev, vdev-queue_sel, value); +/* Note: only call this function for legacy devices */ +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEALIGN: +/* Note: this is only valid for legacy devices */ virtio_queue_set_align(vdev, vdev-queue_sel, value); +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEPFN: if (value == 0) { Let's just call virtio_queue_update_rings from virtio_queue_set_align? You're right, set_align is legacy only so we can always call update_rings. @@ -748,6 +756,11 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align) BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); +/* virtio-1 compliant devices cannot change the aligment */ +if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) { +error_report(tried to modify queue alignment for virtio-1 device); +return; +} /* Check that the transport told us it was going to do this * (so a buggy transport will immediately assert rather than * silently failing to migrate this state) Do we have to touch this now? It's only used by MMIO, right? I don't think it hurts to put a guard in here. @@ -755,7 +768,6 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align) assert(k-has_variable_vring_alignment); vdev-vq[n].vring.align = align; -virtqueue_init(vdev-vq[n]); Don't we need to update rings? See above, I'll call update_rings in there. 
} void virtio_queue_notify_vq(VirtQueue *vq) @@ -949,7 +961,8 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f) if (k-has_variable_vring_alignment) { qemu_put_be32(f, vdev-vq[i].vring.align); } -qemu_put_be64(f, vdev-vq[i].pa); +/* XXX virtio-1 devices */ +qemu_put_be64(f, vdev-vq[i].vring.desc); qemu_put_be16s(f, vdev-vq[i].last_avail_idx); if (k-save_queue) { k-save_queue(qbus-parent, i, f); @@ -1044,13 +1057,14 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id) if (k-has_variable_vring_alignment) { vdev-vq[i].vring.align = qemu_get_be32(f); } -vdev-vq[i].pa = qemu_get_be64(f); +vdev-vq[i].vring.desc = qemu_get_be64(f); qemu_get_be16s(f, vdev-vq[i].last_avail_idx); vdev-vq[i].signalled_used_valid = false; vdev-vq[i].notification = true; -if (vdev-vq[i].pa) { -virtqueue_init(vdev-vq[i]); +if (vdev-vq[i].vring.desc) { +/* XXX virtio-1 devices */ What does XXX mean here? That I have not cared about migration of virtio-1 devices yet :) +virtio_queue_update_rings(vdev, i); } else if (vdev-vq[i].last_avail_idx) { error_report(VQ %d address 0x0 inconsistent with Host index 0x%x, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Wed, Dec 03, 2014 at 12:14:10PM +0100, Cornelia Huck wrote: On Wed, 3 Dec 2014 12:52:51 +0200 Michael S. Tsirkin m...@redhat.com wrote: On Wed, Dec 03, 2014 at 10:50:04AM +0100, Cornelia Huck wrote: diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 43b7e02..1e2a720 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -244,9 +244,13 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, case VIRTIO_MMIO_QUEUENUM: DPRINTF(mmio_queue write %d max %d\n, (int)value, VIRTQUEUE_MAX_SIZE); virtio_queue_set_num(vdev, vdev-queue_sel, value); +/* Note: only call this function for legacy devices */ +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEALIGN: +/* Note: this is only valid for legacy devices */ virtio_queue_set_align(vdev, vdev-queue_sel, value); +virtio_queue_update_rings(vdev, vdev-queue_sel); break; case VIRTIO_MMIO_QUEUEPFN: if (value == 0) { Let's just call virtio_queue_update_rings from virtio_queue_set_align? You're right, set_align is legacy only so we can always call update_rings. @@ -748,6 +756,11 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align) BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); +/* virtio-1 compliant devices cannot change the aligment */ +if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) { +error_report(tried to modify queue alignment for virtio-1 device); +return; +} /* Check that the transport told us it was going to do this * (so a buggy transport will immediately assert rather than * silently failing to migrate this state) Do we have to touch this now? It's only used by MMIO, right? I don't think it hurts to put a guard in here. I'd say let's not touch mmio ATM. @@ -755,7 +768,6 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align) assert(k-has_variable_vring_alignment); vdev-vq[n].vring.align = align; -virtqueue_init(vdev-vq[n]); Don't we need to update rings? 
See above, I'll call update_rings in there. } void virtio_queue_notify_vq(VirtQueue *vq) @@ -949,7 +961,8 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f) if (k-has_variable_vring_alignment) { qemu_put_be32(f, vdev-vq[i].vring.align); } -qemu_put_be64(f, vdev-vq[i].pa); +/* XXX virtio-1 devices */ +qemu_put_be64(f, vdev-vq[i].vring.desc); qemu_put_be16s(f, vdev-vq[i].last_avail_idx); if (k-save_queue) { k-save_queue(qbus-parent, i, f); @@ -1044,13 +1057,14 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id) if (k-has_variable_vring_alignment) { vdev-vq[i].vring.align = qemu_get_be32(f); } -vdev-vq[i].pa = qemu_get_be64(f); +vdev-vq[i].vring.desc = qemu_get_be64(f); qemu_get_be16s(f, vdev-vq[i].last_avail_idx); vdev-vq[i].signalled_used_valid = false; vdev-vq[i].notification = true; -if (vdev-vq[i].pa) { -virtqueue_init(vdev-vq[i]); +if (vdev-vq[i].vring.desc) { +/* XXX virtio-1 devices */ What does XXX mean here? That I have not cared about migration of virtio-1 devices yet :) OK sure, but why put comment here not at start of function? +virtio_queue_update_rings(vdev, i); } else if (vdev-vq[i].last_avail_idx) { error_report(VQ %d address 0x0 inconsistent with Host index 0x%x, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Windows 7 VM BSOD
If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time before you hit a 101 BSOD. Bugcheck 78 is quite a rare one. What is your setup, and how easily can it be reproduced? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
Oh I see. So the 101 BSOD problem is well known? I can't find any document mentioning the 101 BSOD online. I tried other hv_ options, but Win7 can't boot properly and gets stuck at the Starting Windows screen. Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time before you hit a 101 BSOD. Bugcheck 78 is quite a rare one. What is your setup, and how easily can it be reproduced? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: [PATCH RFC v5 07/19] virtio: allow virtio-1 queue layout
On Wed, 3 Dec 2014 13:19:17 +0200 Michael S. Tsirkin m...@redhat.com wrote:
On Wed, Dec 03, 2014 at 12:14:10PM +0100, Cornelia Huck wrote:
On Wed, 3 Dec 2014 12:52:51 +0200 Michael S. Tsirkin m...@redhat.com wrote:
On Wed, Dec 03, 2014 at 10:50:04AM +0100, Cornelia Huck wrote:

@@ -748,6 +756,11 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int align)
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    /* virtio-1 compliant devices cannot change the alignment */
+    if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+        error_report("tried to modify queue alignment for virtio-1 device");
+        return;
+    }
     /* Check that the transport told us it was going to do this
      * (so a buggy transport will immediately assert rather than
      * silently failing to migrate this state)

Do we have to touch this now? It's only used by MMIO, right?

I don't think it hurts to put a guard in here.

I'd say let's not touch mmio ATM.

This is not mmio but common code :) I don't really see how this can possibly hurt us; when mmio is converted to virtio-1, their queue setup code needs to be changed anyway.

@@ -949,7 +961,8 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
         if (k->has_variable_vring_alignment) {
             qemu_put_be32(f, vdev->vq[i].vring.align);
         }
-        qemu_put_be64(f, vdev->vq[i].pa);
+        /* XXX virtio-1 devices */
+        qemu_put_be64(f, vdev->vq[i].vring.desc);
         qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
         if (k->save_queue) {
             k->save_queue(qbus->parent, i, f);
@@ -1044,13 +1057,14 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
         if (k->has_variable_vring_alignment) {
             vdev->vq[i].vring.align = qemu_get_be32(f);
         }
-        vdev->vq[i].pa = qemu_get_be64(f);
+        vdev->vq[i].vring.desc = qemu_get_be64(f);
         qemu_get_be16s(f, &vdev->vq[i].last_avail_idx);
         vdev->vq[i].signalled_used_valid = false;
         vdev->vq[i].notification = true;
-        if (vdev->vq[i].pa) {
-            virtqueue_init(&vdev->vq[i]);
+        if (vdev->vq[i].vring.desc) {
+            /* XXX virtio-1 devices */

What does XXX mean here?

That I have not cared about migration of virtio-1 devices yet :)

OK sure, but why put comment here not at start of function?

I find it easier to annotate the places I notice. YMMV.

+            virtio_queue_update_rings(vdev, i);
         } else if (vdev->vq[i].last_avail_idx) {
             error_report("VQ %d address 0x0 inconsistent with Host index 0x%x",
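As a side note for readers, the legacy ring recalculation that virtio_queue_update_rings() performs in the patch under discussion can be sketched on its own. This is a minimal sketch under stated assumptions: the structure layouts follow the legacy split-ring format, while RingLayout, ring_align() and legacy_ring_layout() are hypothetical names made up for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy split-ring structures as they sit in guest memory. */
typedef struct VRingDesc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
} VRingDesc;                  /* 16 bytes per descriptor */

typedef struct VRingAvail {
    uint16_t flags;
    uint16_t idx;
    uint16_t ring[];          /* one entry per descriptor */
} VRingAvail;

/* Hypothetical holder for the three ring addresses. */
typedef struct RingLayout {
    uint64_t desc, avail, used;
} RingLayout;

/* Round addr up to a multiple of align; align must be a power of two. */
static uint64_t ring_align(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Derive avail and used from desc, as in the legacy layout. A zero desc
 * means the queue is not set up yet, so nothing is computed.
 */
static RingLayout legacy_ring_layout(uint64_t desc, unsigned num, uint64_t align)
{
    RingLayout l = { desc, 0, 0 };

    if (!desc)
        return l;
    l.avail = desc + (uint64_t)num * sizeof(VRingDesc);
    l.used = ring_align(l.avail + offsetof(VRingAvail, ring)
                        + (uint64_t)num * sizeof(uint16_t), align);
    return l;
}
```

With a 256-entry queue at 0x10000 and 4096-byte alignment, the descriptor table occupies 4096 bytes, so avail starts at 0x11000 and used is rounded up to 0x12000. A virtio-1 device would instead receive all three addresses explicitly, which is what virtio_queue_set_rings() in the patch is for.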
Re: [PATCH] KVM: cpuid: mask more bits in leaf 0xd and subleaves
2014-12-03 09:04+0100, Paolo Bonzini:
On 03/12/2014 00:05, Radim Krčmář wrote:
2014-12-02 14:09+0100, Paolo Bonzini:

+ } else {
+     if (entry[i].eax == 0 || !(supported & mask))
+         continue;
+     WARN_ON_ONCE(entry[i].ecx & 1);
+     entry[i].ecx &= 1;

ECX Bit 0 is set if the sub-leaf index, n, maps to a valid bit in the IA32_XSS MSR and bit 0 is clear if n maps to a valid bit in XCR0.

ECX should be set to 0 instead, we definitely don't map to a valid bit in IA32_XSS now.

Well, there is a WARN just above. :) But I can change it to zero instead.

Yeah, I wasn't sure about the WARN ... I can only see it trigger after host xcr0 changes and we are much more screwed in that case anyway :) (But it has a chance of catching a bug, so it isn't only bad.)

The guest expects 0 here, so I'd rather have it ... (Having only one part of cpuid ready for it is weird ...)

+ }
+ entry[i].edx = 0;
  entry[i].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;

(Unrelated, I have yet to understand how this flag translates
 * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0.)

If the index is invalid, entry[i].eax is zero and we do not return anything at all.

I see, the field is sparse and ++*nent; ++i;, not the flag, does it, thanks.
Re: [PATCH] KVM: cpuid: mask more bits in leaf 0xd and subleaves
On 03/12/2014 13:07, Radim Krčmář wrote:

Well, there is a WARN just above. :) But I can change it to zero instead.

Yeah, I wasn't sure about the WARN ... I can only see it trigger after host xcr0 changes and we are much more screwed in that case anyway :) (But it has a chance of catching a bug, so it isn't only bad.)

The guest expects 0 here, so I'd rather have it ...

Ok, I'll have

        if (WARN_ON_ONCE(entry[i].ecx & 1))
            continue;
    }
    entry[i].ecx = 0;
    entry[i].edx = 0;
    ...

Thanks for the review!

Paolo
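The filtering agreed on above can be tried outside the kernel with a small stand-alone sketch. Everything named toy_* here is hypothetical and only mirrors the shape of the discussion: sub-leaves of leaf 0xD with index 2 and up are dropped when they are empty, unsupported, or flagged as IA32_XSS state (ECX bit 0), and the survivors get ECX and EDX cleared. The kernel's WARN_ON_ONCE is reduced to a plain skip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy CPUID entry; the real struct kvm_cpuid_entry2 has more fields. */
struct toy_cpuid_entry {
    uint32_t index;           /* sub-leaf number (ECX input) */
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Filter sub-leaves 2..63 of leaf 0xD in place and return how many are kept.
 * 'supported' is the mask of XCR0 state components the host can expose.
 */
static size_t toy_filter_xsave_subleaves(struct toy_cpuid_entry *e, size_t n,
                                         uint64_t supported)
{
    size_t kept = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t mask;

        assert(e[i].index >= 2 && e[i].index < 64);
        mask = 1ULL << e[i].index;

        /* Empty or unsupported component: leave it out entirely. */
        if (e[i].eax == 0 || !(supported & mask))
            continue;
        /* ECX bit 0 set means IA32_XSS state, which is not exposed here. */
        if (e[i].ecx & 1)
            continue;

        e[i].ecx = 0;
        e[i].edx = 0;
        e[kept++] = e[i];     /* compact the surviving entries */
    }
    return kept;
}
```

Because rejected entries are simply not emitted, the resulting table is sparse in terms of sub-leaf numbers, which matches the observation in the thread that it is the entry count, not the flag, that makes an invalid index return nothing.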
Re: [PATCH RFC 0/2] assign each vcpu an owning thread and improve yielding
This series improves yielding on architectures that cannot disable preemption while entering the guest and makes the creating thread of a VCPU the owning thread and therefore the yield target when yielding to that VCPU. We should focus on the case creating thread == executing thread and therefore remove the complicated handling of PIDs involving synchronize_rcus. This way we can speed up the creation of VCPUs and directly yield to the executing vcpu threads. Please note that - in theory - all VCPU ioctls should be triggered from the same VCPU thread, so changing threads is not a scenario we should optimize. David Hildenbrand (2): KVM: don't check for PF_VCPU when yielding KVM: thread creating a vcpu is the owner of that vcpu include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 22 ++ 2 files changed, 3 insertions(+), 20 deletions(-) Hi Paolo, would be good if you could have a look at these patches. Thanks! -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Windows 7 VM BSOD
On Wed, 2014-12-03 at 19:36 +0800, t...@tetrioncapital.com wrote: Oh I see. So the 101 BSOD problem is well known? I can't find any document mentioning the 101 BSOD online. http://msdn.microsoft.com/en-us/library/windows/hardware/ff557211%28v=vs.85%29.aspx I tried other hv_ options, but Win7 can't boot properly and gets stuck at the Starting Windows screen. Can you post the qemu command line? Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time before you hit a 101 BSOD. Bugcheck 78 is quite a rare one. What is your setup, and how easily can it be reproduced? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
Re: [PATCH v3 3/3] kvm: x86: Enable Intel XSAVES for guest
2014-12-02 19:21+0800, Wanpeng Li:
> Expose Intel XSAVES feature to guest.

0xD.1:ebx ought to be non-zero with XSAVES, even if IA32_XSS is known
to be 0, so we'll need to set it after Paolo's patch.

> Signed-off-by: Wanpeng Li <wanpeng...@linux.intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index a4f5ac4..0d919bc 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -267,6 +267,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  	unsigned f_rdtscp = kvm_x86_ops->rdtscp_supported() ? F(RDTSCP) : 0;
>  	unsigned f_invpcid = kvm_x86_ops->invpcid_supported() ? F(INVPCID) : 0;
>  	unsigned f_mpx = kvm_x86_ops->mpx_supported() ? F(MPX) : 0;
> +	unsigned f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
>  
>  	/* cpuid 1.edx */
>  	const u32 kvm_supported_word0_x86_features =
> @@ -322,7 +323,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  
>  	/* cpuid 0xD.1.eax */
>  	const u32 kvm_supported_word10_x86_features =
> -		F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1);
> +		F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | f_xsaves;
>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> --
> 1.9.1

--- rant ---
The documentation isn't clearly backward-compatible; it took me a while
to understand why

  EBX Bits 31-00: The size in bytes of the XSAVE area containing all
  states enabled by XCR0 | IA32_XSS.

is 0 on my old machine. A different section gives a better hint by
mentioning XSAVES directly:

  * EBX enumerates the size (in bytes) required by the XSAVES
    instruction for an XSAVE area containing all the state components
    corresponding to bits currently set in XCR0 | IA32_XSS.
Re: [PATCH RFC 1/2] KVM: don't check for PF_VCPU when yielding
On 28/11/2014 12:40, Raghavendra K T wrote:
> I am seeing very small improvement in = 1x commit cases
> and for 1x overcommit, a very slight regression. But considering the
> test environment noises, I do not see much effect from the
> patch.

I think these results are the only ones that could be statistically
significant:

                 base     %stdev    patched   %stdev  %improvement
kernbench 1x    53.1421   2.3086    54.6671   2.9673    -2.86966
dbench    1x  6386.4737   1.0487  6703.9113   1.2298     4.97047

and, of course :) one of them says things get worse and the other
says things get better.

Paolo

> But I admit, I have not explored deeply about,
> 1. assumption of preempted approximately equals PF_VCPU case logic,
> 2. whether it helps for any future usages of yield_to against current
>    sole usage of virtualization.
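The %improvement column above can be reproduced from the base and patched columns. The sketch below is an illustration only: the function name and the sign convention (elapsed time for kernbench, where lower is better; throughput for dbench, where higher is better) are assumptions inferred from the two rows, not something stated in the thread.

```c
#include <stdbool.h>

/*
 * Percentage improvement of `patched` relative to `base`.
 * higher_is_better selects the sign convention:
 *   true  -> throughput-style result (assumed for dbench)
 *   false -> elapsed-time-style result (assumed for kernbench)
 * A negative return value means the patched run was worse.
 */
static double pct_improvement(double base, double patched,
                              bool higher_is_better)
{
    double delta = higher_is_better ? patched - base : base - patched;

    return 100.0 * delta / base;
}
```

With the numbers from the table, the kernbench row comes out at roughly -2.87 and the dbench row at roughly 4.97, which matches the reported values under these assumptions.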
Re: [PATCH RFC 0/2] assign each vcpu an owning thread and improve yielding
On 03/12/2014 13:12, David Hildenbrand wrote:
> This series improves yielding on architectures that cannot disable
> preemption while entering the guest and makes the creating thread of
> a VCPU the owning thread and therefore the yield target when yielding
> to that VCPU.
>
> We should focus on the case creating thread == executing thread and
> therefore remove the complicated handling of PIDs involving
> synchronize_rcus.
>
> This way we can speed up the creation of VCPUs and directly yield to
> the executing vcpu threads.
>
> Please note that - in theory - all VCPU ioctls should be triggered
> from the same VCPU thread, so changing threads is not a scenario we
> should optimize.
>
> David Hildenbrand (2):
>   KVM: don't check for PF_VCPU when yielding
>   KVM: thread creating a vcpu is the owner of that vcpu
>
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c      | 22 ++
>  2 files changed, 3 insertions(+), 20 deletions(-)
>
> Hi Paolo,
>
> would be good if you could have a look at these patches.

Sure. I think patch 1 is fine and I am applying it. For patch 2, what
about moving the ->pid assignment in the KVM_RUN case of
kvm_vcpu_ioctl?

Paolo
Re: [PATCH RFC 2/2] KVM: thread creating a vcpu is the owner of that vcpu
On 25/11/2014 17:04, David Hildenbrand wrote:
> @@ -124,15 +124,6 @@ int vcpu_load(struct kvm_vcpu *vcpu)
>  
>  	if (mutex_lock_killable(&vcpu->mutex))
>  		return -EINTR;
> -	if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
> -		/* The thread running this VCPU changed. */
> -		struct pid *oldpid = vcpu->pid;
> -		struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
> -		rcu_assign_pointer(vcpu->pid, newpid);
> -		if (oldpid)
> -			synchronize_rcu();
> -		put_pid(oldpid);
> -	}

I think it would make more sense to do this only for the KVM_RUN
ioctl.

Paolo
Re: [PATCH RFC 0/2] assign each vcpu an owning thread and improve yielding
> On 03/12/2014 13:12, David Hildenbrand wrote:
>> This series improves yielding on architectures that cannot disable
>> preemption while entering the guest and makes the creating thread of
>> a VCPU the owning thread and therefore the yield target when
>> yielding to that VCPU.
>>
>> We should focus on the case creating thread == executing thread and
>> therefore remove the complicated handling of PIDs involving
>> synchronize_rcus.
>>
>> This way we can speed up the creation of VCPUs and directly yield to
>> the executing vcpu threads.
>>
>> Please note that - in theory - all VCPU ioctls should be triggered
>> from the same VCPU thread, so changing threads is not a scenario we
>> should optimize.
>>
>> David Hildenbrand (2):
>>   KVM: don't check for PF_VCPU when yielding
>>   KVM: thread creating a vcpu is the owner of that vcpu
>>
>>  include/linux/kvm_host.h |  1 +
>>  virt/kvm/kvm_main.c      | 22 ++
>>  2 files changed, 3 insertions(+), 20 deletions(-)
>>
>> Hi Paolo,
>>
>> would be good if you could have a look at these patches.
>
> Sure. I think patch 1 is fine and I am applying it. For patch 2,
> what about moving the ->pid assignment in the KVM_RUN case of
> kvm_vcpu_ioctl?

Thanks Paolo!

Well, do we have any known user that relies on this thread-switching
in case of KVM_RUN? If yes, I am totally with you.

If not, I'd prefer to get this code completely out, as it contains
some unnecessary complexity. And maintaining such code that already
had a couple of bugs in it without any benefit doesn't make much
sense.

What do you think?

David

> Paolo
Re: [PATCH RFC 1/2] KVM: don't check for PF_VCPU when yielding
On 25/11/2014 17:04, David Hildenbrand wrote:
> As some architectures (e.g. s390) can't disable preemption while
> entering/leaving the guest, they won't receive the yield in all
> situations.
>
> kvm_enter_guest() has to be called with preemption_disabled and will
> set PF_VCPU. After that point e.g. s390 reenables preemption and
> starts to execute the guest. The thread might therefore be scheduled
> out between kvm_enter_guest() and kvm_exit_guest(), resulting in
> PF_VCPU being set but not being run.
>
> Please note that preemption has to stay enabled in order to correctly
> process page faults on s390.
>
> Current code takes PF_VCPU as a hint that the VCPU thread is running
> and therefore needs no yield. yield_to() checks whether the target
> thread is running, so let's use the inbuilt functionality to make it
> independent of PF_VCPU and preemption.
>
> Signed-off-by: David Hildenbrand <d...@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c | 4
>  1 file changed, 4 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5b45330..184f52e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1782,10 +1782,6 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target)
>  	rcu_read_unlock();
>  	if (!task)
>  		return ret;
> -	if (task->flags & PF_VCPU) {
> -		put_task_struct(task);
> -		return ret;
> -	}
>  	ret = yield_to(task, 1);
>  	put_task_struct(task);

Applied with a rewritten commit message:

    KVM: don't check for PF_VCPU when yielding

    kvm_enter_guest() has to be called with preemption disabled and
    will set PF_VCPU. Current code takes PF_VCPU as a hint that the
    VCPU thread is running and therefore needs no yield.

    However, the check on PF_VCPU is wrong on s390, where preemption
    has to stay enabled on s390 in order to correctly process page
    faults. Thus, s390 reenables preemption and starts to execute the
    guest. The thread might be scheduled out between kvm_enter_guest()
    and kvm_exit_guest(), resulting in PF_VCPU being set but not being
    run. When this happens, the opportunity for directed yield is
    missed.

    However, this check is done already in kvm_vcpu_on_spin before
    calling kvm_vcpu_yield_loop:

        if (!ACCESS_ONCE(vcpu->preempted))
                continue;

    so the check on PF_VCPU is superfluous in general, and this patch
    removes it.

    Signed-off-by: David Hildenbrand <d...@linux.vnet.ibm.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Re: [PATCH RFC 0/2] assign each vcpu an owning thread and improve yielding
On 03.12.2014 at 13:54, Paolo Bonzini wrote:
> On 03/12/2014 13:12, David Hildenbrand wrote:
>> This series improves yielding on architectures that cannot disable
>> preemption while entering the guest and makes the creating thread of
>> a VCPU the owning thread and therefore the yield target when
>> yielding to that VCPU.
>>
>> We should focus on the case creating thread == executing thread and
>> therefore remove the complicated handling of PIDs involving
>> synchronize_rcus.
>>
>> This way we can speed up the creation of VCPUs and directly yield to
>> the executing vcpu threads.
>>
>> Please note that - in theory - all VCPU ioctls should be triggered
>> from the same VCPU thread, so changing threads is not a scenario we
>> should optimize.
>>
>> David Hildenbrand (2):
>>   KVM: don't check for PF_VCPU when yielding
>>   KVM: thread creating a vcpu is the owner of that vcpu
>>
>>  include/linux/kvm_host.h |  1 +
>>  virt/kvm/kvm_main.c      | 22 ++
>>  2 files changed, 3 insertions(+), 20 deletions(-)
>>
>> Hi Paolo,
>>
>> would be good if you could have a look at these patches.
>
> Sure. I think patch 1 is fine and I am applying it. For patch 2,
> what about moving the ->pid assignment in the KVM_RUN case of
> kvm_vcpu_ioctl?

That was my initial patch for the rcu specific latencies (do you
remember?) But IMHO patch 2 is actually the proper thing to do, no?

Christian
Re: [PATCH RFC 1/2] KVM: don't check for PF_VCPU when yielding
> Applied with a rewritten commit message:
>
>     KVM: don't check for PF_VCPU when yielding
>
>     kvm_enter_guest() has to be called with preemption disabled and
>     will set PF_VCPU. Current code takes PF_VCPU as a hint that the
>     VCPU thread is running and therefore needs no yield.
>
>     However, the check on PF_VCPU is wrong on s390, where preemption
>     has to stay enabled on s390 in order to correctly process page
>     faults. Thus, s390 reenables preemption and starts to execute the
>     guest. The thread might be scheduled out between
>     kvm_enter_guest() and kvm_exit_guest(), resulting in PF_VCPU
>     being set but not being run. When this happens, the opportunity
>     for directed yield is missed.
>
>     However, this check is done already in kvm_vcpu_on_spin before
>     calling kvm_vcpu_yield_loop:
>
>         if (!ACCESS_ONCE(vcpu->preempted))
>                 continue;
>
>     so the check on PF_VCPU is superfluous in general, and this patch
>     removes it.
>
>     Signed-off-by: David Hildenbrand <d...@linux.vnet.ibm.com>
>     Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

Perfect, thanks!

David
Re: [PATCH RFC 0/2] assign each vcpu an owning thread and improve yielding
On 03/12/2014 14:00, Christian Borntraeger wrote:
> On 03.12.2014 at 13:54, Paolo Bonzini wrote:
>> On 03/12/2014 13:12, David Hildenbrand wrote:
>>> This series improves yielding on architectures that cannot disable
>>> preemption while entering the guest and makes the creating thread
>>> of a VCPU the owning thread and therefore the yield target when
>>> yielding to that VCPU.
>>>
>>> We should focus on the case creating thread == executing thread and
>>> therefore remove the complicated handling of PIDs involving
>>> synchronize_rcus.
>>>
>>> This way we can speed up the creation of VCPUs and directly yield
>>> to the executing vcpu threads.
>>>
>>> Please note that - in theory - all VCPU ioctls should be triggered
>>> from the same VCPU thread, so changing threads is not a scenario we
>>> should optimize.
>>>
>>> David Hildenbrand (2):
>>>   KVM: don't check for PF_VCPU when yielding
>>>   KVM: thread creating a vcpu is the owner of that vcpu
>>>
>>>  include/linux/kvm_host.h |  1 +
>>>  virt/kvm/kvm_main.c      | 22 ++
>>>  2 files changed, 3 insertions(+), 20 deletions(-)
>>>
>>> Hi Paolo,
>>>
>>> would be good if you could have a look at these patches.
>>
>> Sure. I think patch 1 is fine and I am applying it. For patch 2,
>> what about moving the ->pid assignment in the KVM_RUN case of
>> kvm_vcpu_ioctl?
>
> That was my initial patch for the rcu specific latencies (do you
> remember?) But IMHO patch 2 is actually the proper thing to do, no?

Was it? :) Found it:
http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/125694

I don't know... I think it's feasible to have a userspace that
creates all VCPUs in the main thread, and then runs them from
multiple threads. Sure, those people would not have read the docs
carefully, but it has worked until now.

Wrongly-directed yields are a much better reason to apply your patch
than QEMU slowness, and I've done that now.

Paolo
Re: [PATCH/RFC] KVM: track pid for VCPU only on KVM_RUN ioctl
On 05/08/2014 16:44, Christian Borntraeger wrote:
> We currently track the pid of the task that runs the VCPU in
> vcpu_load. Since we call vcpu_load for all kinds of ioctls on a CPU,
> this causes hiccups due to synchronize_rcu if one CPU is modified by
> another CPU or the main thread (e.g. initialization, reset).
> We track the pid only for the purpose of yielding, so let's update
> the pid only in the KVM_RUN ioctl.
>
> In addition, don't do a synchronize_rcu on startup (pid == 0).
>
> This speeds up guest boot time on s390 noticeably for some configs,
> e.g. HZ=100, no full state tracking, 64 guest cpus 32 host cpus.
>
> Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
> CC: Rik van Riel <r...@redhat.com>
> CC: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>
> CC: Michael Mueller <m...@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9ae9135..ebc8f54 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -124,14 +124,6 @@ int vcpu_load(struct kvm_vcpu *vcpu)
>  
>  	if (mutex_lock_killable(&vcpu->mutex))
>  		return -EINTR;
> -	if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
> -		/* The thread running this VCPU changed. */
> -		struct pid *oldpid = vcpu->pid;
> -		struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
> -		rcu_assign_pointer(vcpu->pid, newpid);
> -		synchronize_rcu();
> -		put_pid(oldpid);
> -	}
>  	cpu = get_cpu();
>  	preempt_notifier_register(&vcpu->preempt_notifier);
>  	kvm_arch_vcpu_load(vcpu, cpu);
> @@ -1991,6 +1983,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
>  		r = -EINVAL;
>  		if (arg)
>  			goto out;
> +		if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
> +			/* The thread running this VCPU changed. */
> +			struct pid *oldpid = vcpu->pid;
> +			struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
> +			rcu_assign_pointer(vcpu->pid, newpid);
> +			if (oldpid)
> +				synchronize_rcu();
> +			put_pid(oldpid);
> +		}
>  		r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
>  		trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
>  		break;

Applied with rewritten commit message:

    KVM: track pid for VCPU only on KVM_RUN ioctl

    We currently track the pid of the task that runs the VCPU in
    vcpu_load. If a yield to that VCPU is triggered while the PID of
    the wrong thread is active, the wrong thread might receive a
    yield, but this will most likely not help the executing thread at
    all. Instead, if we only track the pid on the KVM_RUN ioctl, there
    are two possibilities:

    1) the thread that did a non-KVM_RUN ioctl is holding a mutex that
    the VCPU thread is waiting for. In this case, the VCPU thread is
    not runnable, but we also do not do a wrong yield.

    2) the thread that did a non-KVM_RUN ioctl is sleeping, or doing
    something that does not block the VCPU thread. In this case, the
    VCPU thread can receive the directed yield correctly.

    Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
    CC: Rik van Riel <r...@redhat.com>
    CC: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>
    CC: Michael Mueller <m...@linux.vnet.ibm.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

Thanks,

Paolo
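The effect of tracking the owner only on KVM_RUN can be illustrated with a small userspace model. This is a sketch, not kernel code: the struct, the enum and the function names are hypothetical, plain integers stand in for struct pid, and a counter stands in for the synchronize_rcu() calls that the patch avoids.

```c
/* Hypothetical stand-ins for the vCPU ioctls discussed in the thread. */
enum vcpu_ioctl_sketch {
    VCPU_IOCTL_RUN,      /* models KVM_RUN */
    VCPU_IOCTL_OTHER     /* models any other vCPU ioctl, e.g. init or reset */
};

struct vcpu_sketch {
    int owner_tid;       /* 0 means no thread has run this vCPU yet */
    int grace_periods;   /* models how often synchronize_rcu() would run */
};

/*
 * Update the yield target only when the ioctl is the RUN ioctl, and
 * only count a grace period when a previous owner is being replaced
 * (the "pid == 0 on startup" case is free).
 */
static void vcpu_ioctl_sketch(struct vcpu_sketch *v, int caller_tid,
                              enum vcpu_ioctl_sketch cmd)
{
    if (cmd != VCPU_IOCTL_RUN)
        return;

    if (v->owner_tid != caller_tid) {
        if (v->owner_tid != 0)
            v->grace_periods++;
        v->owner_tid = caller_tid;
    }
}

/* A directed yield would go to whichever thread last did the RUN ioctl. */
static int yield_target_sketch(const struct vcpu_sketch *v)
{
    return v->owner_tid;
}
```

In this model, a main thread that only initializes or resets the vCPU never becomes the yield target, so a later directed yield reaches the thread that actually executes the guest.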
[PATCH 2/2] KVM: cpuid: set CPUID(EAX=0xd,ECX=1).EBX correctly
This is the size of the XSAVES area. This completes guest support for
XSAVES (with no support yet for supervisor states, i.e. XSS == 0
always in guests for now).

Suggested-by: Radim Krčmář <rkrc...@redhat.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 arch/x86/kvm/cpuid.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2c561dba81c0..646e6e830ac3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -23,7 +23,7 @@
 #include "mmu.h"
 #include "trace.h"
 
-static u32 xstate_required_size(u64 xstate_bv)
+static u32 xstate_required_size(u64 xstate_bv, bool compacted)
 {
 	int feature_bit = 0;
 	u32 ret = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
@@ -31,9 +31,10 @@ static u32 xstate_required_size(u64 xstate_bv)
 	xstate_bv &= XSTATE_EXTEND_MASK;
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
-			u32 eax, ebx, ecx, edx;
+			u32 eax, ebx, ecx, edx, offset;
 			cpuid_count(0xD, feature_bit, &eax, &ebx, &ecx, &edx);
-			ret = max(ret, eax + ebx);
+			offset = compacted ? ret : ebx;
+			ret = max(ret, offset + eax);
 		}
 
 		xstate_bv >>= 1;
@@ -87,9 +88,13 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 			(best->eax | ((u64)best->edx << 32)) &
 			kvm_supported_xcr0();
 		vcpu->arch.guest_xstate_size = best->ebx =
-			xstate_required_size(vcpu->arch.xcr0);
+			xstate_required_size(vcpu->arch.xcr0, false);
 	}
 
+	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
+	if (best && (best->eax & F(XSAVES)))
+		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+
 	/*
 	 * The existing code assumes virtual address is 48-bit in the canonical
 	 * address checks; exit if it is ever changed.
-- 
1.8.3.1
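The difference between the two size computations in the patch above can be tried out in userspace with a small sketch. It is only a model: instead of calling CPUID it reads a hard-coded table, and the table values (AVX and the two MPX components) are illustrative assumptions about what CPUID leaf 0xD reports on one kind of machine. The function and table names are hypothetical. Like the patch, the sketch packs compacted components back to back and ignores any alignment requirement.

```c
#include <stdbool.h>
#include <stdint.h>

/* 512-byte legacy area plus 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HDR 576u

/* Per-component size and standard-format offset (stand-in for CPUID 0xD). */
struct xcomp_sketch {
    uint32_t size;
    uint32_t offset;
};

/* Illustrative values only; a real implementation would query CPUID. */
static const struct xcomp_sketch xcomp_table[8] = {
    [2] = { 256, 576 },   /* AVX (assumed values) */
    [3] = { 64, 960 },    /* MPX bound registers (assumed values) */
    [4] = { 64, 1024 },   /* MPX bound config/status (assumed values) */
};

static uint32_t xstate_size_sketch(uint64_t xstate_bv, bool compacted)
{
    uint32_t ret = XSAVE_LEGACY_AND_HDR;
    int bit;

    /* x87 and SSE (bits 0 and 1) live in the legacy area. */
    xstate_bv &= ~UINT64_C(0x3);

    for (bit = 0; bit < 8; bit++) {
        uint32_t offset, end;

        if (!(xstate_bv & (UINT64_C(1) << bit)))
            continue;

        /* Compacted: component starts where the previous one ended. */
        offset = compacted ? ret : xcomp_table[bit].offset;
        end = offset + xcomp_table[bit].size;
        if (end > ret)
            ret = end;
    }
    return ret;
}
```

With x87, SSE and AVX enabled (0x7) both formats need 832 bytes in this model. With only x87, SSE and the two MPX components enabled (0x1b), the standard format still needs 1088 bytes because of the fixed offsets, while the compacted format needs 704.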
[PATCH 0/2] kvm: x86: final XSAVES bits
The final thing to do, besides adding support for XSS != 0, is to set
CPUID(EAX=0xd,ECX=1).EBX to the size of the XSAVES area.

Paolo Bonzini (2):
  KVM: x86: use F() macro throughout cpuid.c
  KVM: x86: set CPUID(EAX=0xd,ECX=1).EBX correctly

 arch/x86/kvm/cpuid.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

-- 
1.8.3.1
[PATCH 1/2] KVM: x86: use F() macro throughout cpuid.c
For code that deals with cpuid, this makes things a bit more readable.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 arch/x86/kvm/cpuid.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index d82204ac555e..2c561dba81c0 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -53,6 +53,8 @@ u64 kvm_supported_xcr0(void)
 	return xcr0;
 }
 
+#define F(x) bit(X86_FEATURE_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -64,13 +66,13 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 
 	/* Update OSXSAVE bit */
 	if (cpu_has_xsave && best->function == 0x1) {
-		best->ecx &= ~(bit(X86_FEATURE_OSXSAVE));
+		best->ecx &= ~F(OSXSAVE);
 		if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE))
-			best->ecx |= bit(X86_FEATURE_OSXSAVE);
+			best->ecx |= F(OSXSAVE);
 	}
 
 	if (apic) {
-		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
+		if (best->ecx & F(TSC_DEADLINE_TIMER))
 			apic->lapic_timer.timer_mode_mask = 3 << 17;
 		else
 			apic->lapic_timer.timer_mode_mask = 1 << 17;
@@ -122,8 +124,8 @@ static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
 			break;
 		}
 	}
-	if (entry && (entry->edx & bit(X86_FEATURE_NX)) && !is_efer_nx()) {
-		entry->edx &= ~bit(X86_FEATURE_NX);
+	if (entry && (entry->edx & F(NX)) && !is_efer_nx()) {
+		entry->edx &= ~F(NX);
 		printk(KERN_INFO "kvm: guest NX capability removed\n");
 	}
 }
@@ -227,8 +229,6 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	entry->flags = 0;
 }
 
-#define F(x) bit(X86_FEATURE_##x)
-
 static int __do_cpuid_ent_emulated(struct kvm_cpuid_entry2 *entry,
 				   u32 func, u32 index, int *nent, int maxnent)
 {
-- 
1.8.3.1
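What the F() shorthand expands to can be seen in a standalone sketch. The definitions below are local to the example and only mirror the kernel convention as an assumption: a feature number is taken to encode "word * 32 + bit", and bit() keeps the low five bits to form a mask within one 32-bit CPUID register. The helper update_osxsave_sketch() is hypothetical and models the OSXSAVE update from the patch.

```c
#include <stdint.h>

/* Local stand-ins; assumed encoding is word * 32 + bit position. */
#define X86_FEATURE_NX       (1 * 32 + 20)
#define X86_FEATURE_OSXSAVE  (4 * 32 + 27)

/* Mask for the feature's bit inside its 32-bit CPUID register. */
static inline uint32_t bit(int bitno)
{
    return UINT32_C(1) << (bitno & 31);
}

#define F(x) bit(X86_FEATURE_##x)

/*
 * Clear OSXSAVE in a CPUID.1:ECX value, then set it again only if the
 * guest has enabled CR4.OSXSAVE (passed in here as a plain flag).
 */
static uint32_t update_osxsave_sketch(uint32_t ecx, int cr4_osxsave)
{
    ecx &= ~F(OSXSAVE);
    if (cr4_osxsave)
        ecx |= F(OSXSAVE);
    return ecx;
}
```

The macro only saves typing: F(OSXSAVE) and bit(X86_FEATURE_OSXSAVE) produce the same mask, which is why the patch is a pure readability change.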
Re: [PATCH 2/2] KVM: cpuid: set CPUID(EAX=0xd,ECX=1).EBX correctly
2014-12-03 14:40+0100, Paolo Bonzini:
> This is the size of the XSAVES area. This completes guest support
> for XSAVES (with no support yet for supervisor states, i.e.
> XSS == 0 always in guests for now).
>
> Suggested-by: Radim Krčmář <rkrc...@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---

Reviewed-by: Radim Krčmář <rkrc...@redhat.com>

(The first one is ok too.)
Re: [CFT PATCH v2 2/2] KVM: x86: support XSAVES usage in the host
Paolo Bonzini <pbonz...@redhat.com> wrote:
> Userspace is expecting non-compacted format for KVM_GET_XSAVE, but
> struct xsave_struct might be using the compacted format. Convert
> in order to preserve userspace ABI.
>
> Likewise, userspace is passing non-compacted format for KVM_SET_XSAVE
> but the kernel will pass it to XRSTORS, and we need to convert back.
>
> Fixes: f31a9f7c71691569359fa7fb8b0acaa44bce0324
> Cc: Fenghua Yu <fenghua...@intel.com>
> Cc: H. Peter Anvin <h...@linux.intel.com>
> Cc: Nadav Amit <na...@cs.technion.ac.il>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 87 +-
>  1 file changed, 80 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 08b5657e57ed..373b0ab9a32e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3132,15 +3132,89 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
>  	return 0;
>  }
>  
> +#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
> +
> +static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
> +{
> +	struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state->xsave;
> +	u64 xstate_bv = vcpu->arch.guest_supported_xcr0 | XSTATE_FPSSE;
> +	u64 valid;
> +
> +	/*
> +	 * Copy legacy XSAVE area, to avoid complications with CPUID
> +	 * leaves 0 and 1 in the loop below.
> +	 */
> +	memcpy(dest, xsave, XSAVE_HDR_OFFSET);
> +
> +	/* Set XSTATE_BV */
> +	*(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;

I have a problem with this line. I ran some experiments and it has a
side-effect of causing XINUSE (an internal register which tracks which
state components are not in the initial state) to be all set. As a
result, after load_xsave runs, when the guest runs the xsave
instruction, initialised xsave state components are marked as
not-initialised in the guest's xstate_bv.

This causes transparency issues (the VM does not behave as a
bare-metal machine). In addition, it may cause performance overhead,
since from this point on, xsave and xrstor instructions would save and
load state which is in fact in the initial state.

I think it is better just to replace the last line with:

	*(u64 *)(dest + XSAVE_HDR_OFFSET) = xsave->xsave_hdr.xstate_bv

Thanks,
Nadav
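The difference between the two ways of writing XSTATE_BV into the userspace buffer can be sketched outside the kernel. Everything below is a simplified model with hypothetical names: a byte buffer stands in for the KVM_GET_XSAVE area, and two helper functions stand in for the original line in fill_xsave and for the suggested replacement.

```c
#include <stdint.h>
#include <string.h>

/* The XSAVE header, whose first 8 bytes hold XSTATE_BV, follows the
 * 512-byte legacy area. */
#define XSAVE_HDR_OFFSET_SKETCH 512

/* Original approach: advertise every supported component as in use. */
static void set_bv_all_supported(uint8_t *dest, uint64_t supported_xcr0)
{
    memcpy(dest + XSAVE_HDR_OFFSET_SKETCH, &supported_xcr0,
           sizeof(supported_xcr0));
}

/* Suggested approach: pass through the XSTATE_BV the guest state
 * actually had when it was saved. */
static void set_bv_from_saved(uint8_t *dest, uint64_t saved_xstate_bv)
{
    memcpy(dest + XSAVE_HDR_OFFSET_SKETCH, &saved_xstate_bv,
           sizeof(saved_xstate_bv));
}

static uint64_t read_bv(const uint8_t *buf)
{
    uint64_t bv;

    memcpy(&bv, buf + XSAVE_HDR_OFFSET_SKETCH, sizeof(bv));
    return bv;
}
```

If the guest supports x87, SSE and AVX (0x7) but its AVX state is still in the initial configuration (saved XSTATE_BV of 0x3), the first helper reports AVX as in use after a save/restore round trip, while the second preserves the fact that it is not.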
Re: Windows 7 VM BSOD
Thanks for the link. Here is my qemu command line:

qemu-system-x86_64 -enable-kvm -name test_server_Windows1 -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu SandyBridge,+erms,+smep,+fsgsbase,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme,hv_relaxed -m 16000 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -uuid bdab5b38-855d-3a47-136d-42e2ca9ea86e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test_server_Windows1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-reboot -boot menu=off,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/test_server_Windows1.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/libvirt/images/test_server_Windows1-1.img,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:cf:5f,bus=pci.0,addr=0x3 -netdev tap,fd=29,id=hostnet1,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:85:66:fe,bus=pci.0,addr=0x6 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0,password -device VGA,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

The number of cores is a mistake; I will change it to 2 sockets, 4 cores, 1 thread tomorrow after the overnight test is done.
-- 
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong
Re: [CFT PATCH v2 2/2] KVM: x86: support XSAVES usage in the host
On 03/12/2014 15:23, Nadav Amit wrote:
> I think it is better just to replace the last line with:
>
> 	*(u64 *)(dest + XSAVE_HDR_OFFSET) = xsave->xsave_hdr.xstate_bv

Right, this matches

	u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
	...
	xsave->xsave_hdr.xstate_bv = xstate_bv;

in load_xsave.

Paolo
Re: freebsd 10 under linux with kvm and apicv
On 01/12/2014 14:40, Vasiliy Tolstov wrote:
> Hello. I found some issues with enable_apicv=Y and FreeBSD. Has this
> problem been solved or not? I have the latest Linux 3.10.x.
> Also, I can successfully run FreeBSD 10 with 1 GB of memory, but it
> fails to boot with 4 GB of memory. But in this case I think that is
> a FreeBSD problem.

Hi, I think the problem was already reported, but I haven't
reproduced it yet (no machine with apicv).

Paolo
Re: Windows 7 VM BSOD
the link that I was trying to follow and Win7 bootup stuck is this: http://blog.wikichoon.com/2014/07/enabling-hyper-v-enlightenments-with-kvm.html On Wed, Dec 3, 2014 at 10:51 PM, Thomas Lau t...@tetrioncapital.com wrote: Thanks for the link, Here is my qemu command line: qemu-system-x86_64 -enable-kvm -name test_server_Windows1 -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu SandyBridge,+erms,+smep,+fsgsbase,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme,hv_relaxed -m 16000 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -uuid bdab5b38-855d-3a47-136d-42e2ca9ea86e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test_server_Windows1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-reboot -boot menu=off,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/test_server_Windows1.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/libvirt/images/test_server_Windows1-1.img,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:cf:5f,bus=pci.0,addr=0x3 -netdev tap,fd=29,id=hostnet1,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:85:66:fe,bus=pci.0,addr=0x6 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0,password -device VGA,id=video0,bus=pci.0,addr=0x2 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 the number of cores have mistake, I will change to 2 sockets, 4 cores, 1 thread tomorrow after overnight test is done. On Wed, Dec 3, 2014 at 8:14 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Wed, 2014-12-03 at 19:36 +0800, t...@tetrioncapital.com wrote: Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online. http://msdn.microsoft.com/en-us/library/windows/hardware/ff557211% 28v=vs.85%29.aspx I tried to use hv_ other options but Win7 can't boot up properly and stucked at starting Windows screen. Can you post the qemu command line? Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? 
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: usb audio device troubles
On 12/3/2014 3:52 AM, Hans de Goede wrote:
> Eric are you using usb-host redirection, or Spice's usb network redir ?

I spent a little bit of time this morning learning about Spice and the
network redirection. It worked for about half an hour and then failed in
the same way the host redirection failed: the audio device would appear
for a while, I would try to use it, and then it would disappear.

The Spice model has some very nice features. I could, in theory, have a
working speech recognition engine somewhere on my "cloud" and then be able
to use it via Spice on any desktop I happen to be sitting in front of. It
would also work nicely with my original idea of putting a working KVM
virtual machine on an e-SATA SSD external drive, so that I can bring my
working speech recognition environment with me without having to cart a
laptop around.

I hope you can see that this could be generalized into a nicely portable
accessibility solution, where the accessibility environment moves with the
disabled user and removes the need for every machine to have user-specific
accessibility software and configuration. Yes, it does impose a
requirement that KVM runs everywhere, but we know that's the future anyway,
so why fight it :-)

Anyway, I think if we can solve this USB audio device problem then I'll be
very happy and can make further progress towards my goal. Thank you so
very much for the help so far, and I hope we can fix this USB problem.
[kvm-unit-tests PATCH v2] x86: emulator: Fix h_mem usage in tests_smsw
In emulator.c/tests_smsw, smsw (3) fails because h_mem isn't being set
correctly before smsw is called. By using the "+" constraint modifier for
memory we can ensure the compiler no longer optimizes out the assignment
before smsw.

Signed-off-by: Chris J Arges <chris.j.ar...@canonical.com>
---
 x86/emulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/emulator.c b/x86/emulator.c
index 5aa4dbf..1e05574 100644
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -359,7 +359,7 @@ void test_smsw(uint64_t *h_mem)

 	/* Trigger exit on smsw */
 	*h_mem = 0x12345678abcdeful;
-	asm volatile("smsw %0" : "=m"(*h_mem));
+	asm volatile("smsw %0" : "+m"(*h_mem));
 	report("smsw (3)", msw == (unsigned short)*h_mem &&
 		(*h_mem & ~0xfffful) == 0x12345678ab0000ul);
 }
--
2.1.3
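The constraint change in this patch can be illustrated outside the test suite. The following is a minimal sketch, not the kvm-unit-tests code: it assumes GCC or Clang extended asm, uses movw as a stand-in for smsw (both write only 16 bits of a memory operand), and the function name store_low16 is hypothetical. Because "+m" declares the operand as both read and written, the compiler has to perform the preceding C store before the asm statement runs.

```c
#include <stdint.h>

/*
 * Write a known 64-bit pattern from C, then overwrite only its low 16 bits
 * from inline asm. With "+m" the operand is an input as well as an output,
 * so the C store above the asm statement cannot be dropped as dead.
 */
uint64_t store_low16(uint64_t *p)
{
	*p = 0x12345678abcdefULL;
#if defined(__x86_64__) || defined(__i386__)
	/* movw stands in for smsw here: a 16-bit store to memory. */
	__asm__ volatile("movw $0x1234, %0" : "+m"(*p));
#else
	/* Portable stand-in so the sketch still builds on non-x86 hosts. */
	*p = (*p & ~0xffffULL) | 0x1234;
#endif
	return *p;
}
```

With an "=m" constraint the operand would be output-only, and the compiler would be free to treat the earlier assignment as a dead store, which is the failure the commit message describes.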
Re: [kvm-unit-tests PATCH v2] x86: emulator: Fix h_mem usage in tests_smsw
On 03/12/2014 16:44, Chris J Arges wrote:
> In emulator.c/tests_smsw, smsw (3) fails because h_mem isn't being set
> correctly before smsw is called. By using the "+" constraint modifier for
> memory we can ensure the compiler no longer optimizes out the assignment
> before smsw.
>
> Signed-off-by: Chris J Arges <chris.j.ar...@canonical.com>
> ---
>  x86/emulator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/x86/emulator.c b/x86/emulator.c
> index 5aa4dbf..1e05574 100644
> --- a/x86/emulator.c
> +++ b/x86/emulator.c
> @@ -359,7 +359,7 @@ void test_smsw(uint64_t *h_mem)
>
>  	/* Trigger exit on smsw */
>  	*h_mem = 0x12345678abcdeful;
> -	asm volatile("smsw %0" : "=m"(*h_mem));
> +	asm volatile("smsw %0" : "+m"(*h_mem));
>  	report("smsw (3)", msw == (unsigned short)*h_mem &&
>  		(*h_mem & ~0xfffful) == 0x12345678ab0000ul);
>  }

Applied, thanks.

Paolo
[PATCH v5 3/5] KVM: arm/arm64: implement kvm_arch_is_virtual_intc_initialized
On arm/arm64 the VGIC is dynamically instantiated and it is useful
to expose its state, especially for irqfd setup.

This patch defines __KVM_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED and
implements kvm_arch_is_virtual_intc_initialized.

Signed-off-by: Eric Auger <eric.au...@linaro.org>
---
 arch/arm/include/asm/kvm_host.h   | 6 ++
 arch/arm/kvm/arm.c                | 5 +
 arch/arm64/include/asm/kvm_host.h | 5 +
 3 files changed, 16 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 53036e2..fe2c89b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -27,6 +27,8 @@
 #include <asm/fpstate.h>
 #include <kvm/arm_arch_timer.h>

+#define __KVM_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED
+
 #if defined(CONFIG_KVM_ARM_MAX_VCPUS)
 #define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
 #else
@@ -242,4 +244,8 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}

+/* returns true if the vgic dynamic initialization is done*/
+bool kvm_arch_is_virtual_intc_initialized(struct kvm *kvm);
+
+
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..5309e4b 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -439,6 +439,11 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 	return 0;
 }

+bool kvm_arch_is_virtual_intc_initialized(struct kvm *kvm)
+{
+	return vgic_initialized(kvm);
+}
+
 static void vcpu_pause(struct kvm_vcpu *vcpu)
 {
 	wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..5badd30 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -28,6 +28,8 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmio.h>

+#define __KVM_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED
+
 #if defined(CONFIG_KVM_ARM_MAX_VCPUS)
 #define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
 #else
@@ -254,4 +256,7 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}

+/* returns true if the vgic dynamic initialization is done*/
+bool kvm_arch_is_virtual_intc_initialized(struct kvm *kvm);
+
 #endif /* __ARM64_KVM_HOST_H__ */
--
1.9.1
[PATCH v5 4/5] KVM: irqfd: use kvm_arch_is_virtual_intc_initialized
On arm/arm64, the interrupt controller is dynamically instantiated.
There is a risk that user space assigns an irqfd before the interrupt
controller is initialized and ready to accept virtual irq injection.
On such an attempt, the IRQFD setup is rejected and -EAGAIN is returned.

Signed-off-by: Eric Auger <eric.au...@linaro.org>
---
 virt/kvm/eventfd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index b0fb390..f837c83 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -314,6 +314,9 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	unsigned int events;
 	int idx;

+	if (!kvm_arch_is_virtual_intc_initialized(kvm))
+		return -EAGAIN;
+
 	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
 	if (!irqfd)
 		return -ENOMEM;
--
1.9.1
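The ordering in this patch (readiness check first, allocation second) can be modelled in user space. The sketch below is only an illustration under assumptions: struct vm_model, struct irqfd_model and irqfd_assign_model are hypothetical stand-ins, not kernel types or functions, and calloc replaces kzalloc.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm: only the state the check needs. */
struct vm_model {
	bool intc_initialized;
};

/* Hypothetical stand-in for the object allocated on irqfd assignment. */
struct irqfd_model {
	int gsi;
};

int irqfd_assign_model(struct vm_model *vm, int gsi, struct irqfd_model **out)
{
	struct irqfd_model *irqfd;

	assert(vm && out);

	/* Reject before allocating anything, so no cleanup path is needed. */
	if (!vm->intc_initialized)
		return -EAGAIN;

	irqfd = calloc(1, sizeof(*irqfd));
	if (!irqfd)
		return -ENOMEM;

	irqfd->gsi = gsi;
	*out = irqfd;
	return 0;
}
```

A caller that sees -EAGAIN would be expected to finish initializing the interrupt controller and then retry the assignment; the model leaves *out untouched in that case.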
[PATCH v5 5/5] KVM: arm/arm64: add irqfd support
This patch enables irqfd on arm/arm64.

Both irqfd and resamplefd are supported. Injection is implemented
in vgic.c without routing.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

KVM_CAP_IRQFD is now advertised. KVM_CAP_IRQFD_RESAMPLE capability
automatically is advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.

Irqfd injection is restricted to SPI. The rationale behind not
supporting PPI irqfd injection is that any device using a PPI would
be a private-to-the-CPU device (timer for instance), so its state
would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for
irqfds.

Signed-off-by: Eric Auger <eric.au...@linaro.org>

---

v4 -> v5:
- squash [PATCH v4 3/3] KVM: arm64: add irqfd support into this patch
- some rewording in Documentation/virtual/kvm/api.txt and in vgic
  vgic_process_maintenance unlock comment.
- move explanation of why not supporting PPI into commit message
- in case of injection before gic readiness, -ENODEV is returned. It is
  up to the user space to avoid this situation.

v3 -> v4:
- reword commit message
- explain why we unlock the distributor before calling
  kvm_notify_acked_irq
- rename is_assigned_irq into has_notifier
- change EOI and injection kvm_debug format string
- remove error local variable in kvm_set_irq
- Move HAVE_KVM_IRQCHIP unset in a separate patch
- handle case were the irqfd injection is attempted before the vgic is
  ready. in such a case the notifier, if any, is called immediatly
- use nr_irqs to test spi is within correct range

v2 -> v3:
- removal of irq.h from eventfd.c put in a separate patch to increase
  visibility
- properly expose KVM_CAP_IRQFD capability in arm.c
- remove CONFIG_HAVE_KVM_IRQCHIP meaningfull only if irq_comm.c is used

v1 -> v2:
- rebase on 3.17rc1
- move of the dist unlock in process_maintenance
- remove of dist lock in __kvm_vgic_sync_hwstate
- rewording of the commit message (add resamplefd reference)
- remove irq.h
---
 Documentation/virtual/kvm/api.txt | 6 +++-
 arch/arm/include/uapi/asm/kvm.h   | 3 ++
 arch/arm/kvm/Kconfig              | 2 ++
 arch/arm/kvm/Makefile             | 2 +-
 arch/arm/kvm/arm.c                | 3 ++
 arch/arm64/include/uapi/asm/kvm.h | 3 ++
 arch/arm64/kvm/Kconfig            | 2 ++
 arch/arm64/kvm/Makefile           | 2 +-
 virt/kvm/arm/vgic.c               | 63 ---
 9 files changed, 79 insertions(+), 7 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 7610eaa..8993556 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2206,7 +2206,7 @@ into the hash PTE second double word).
 4.75 KVM_IRQFD

 Capability: KVM_CAP_IRQFD
-Architectures: x86 s390
+Architectures: x86 s390 arm arm64
 Type: vm ioctl
 Parameters: struct kvm_irqfd (in)
 Returns: 0 on success, -1 on error
@@ -2232,6 +2232,10 @@ Note that closing the resamplefd is not sufficient to disable the
 irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
 and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.

+On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
+Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
+given by gsi + 32.
+
 4.76 KVM_PPC_ALLOCATE_HTAB

 Capability: KVM_CAP_PPC_ALLOC_HTAB
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 09ee408..77547bb 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -196,6 +196,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127

+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS		1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE		0x95c1ba5e
 #define KVM_PSCI_FN(n)			(KVM_PSCI_FN_BASE + (n))
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 9f581b1..e519a40 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	depends on ARM_VIRT_EXT && ARM_LPAE
+	select HAVE_KVM_EVENTFD
 	---help---
 	  Support hosting virtualized guest machines. You will also
 	  need to select one or more of the processor modules below.
@@ -55,6 +56,7 @@ config KVM_ARM_MAX_VCPUS
 config KVM_ARM_VGIC
 	bool "KVM support for Virtual GIC"
 	depends on KVM_ARM_HOST && OF
+	select HAVE_KVM_IRQFD
 	default y
 	---help---
 	  Adds support for a hardware assisted, in-kernel GIC emulation.
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index f7057ed..859db09 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
 AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)

 KVM := ../../../virt/kvm
-kvm-arm-y = $(KVM)/kvm_main.o
Re: [question] lots of interrupts injected to vm when pressing some key w/o releasing
On 28/11/2014 12:59, Zhang, Yang Z wrote:
> According to the feedback from Haoyu on my test patch, which skips the
> interrupt injection if the irq line is active (see another thread), it
> seems QEMU does not follow the rule. But my patch is just a workaround.
> I guess we should give this issue more thought to find a better
> solution.

Yes, I agree.

Paolo
[PATCH v5 0/5] irqfd support for arm/arm64
This patch series enables irqfd on arm and arm64.

The irqfd framework makes it possible to inject a virtual IRQ into a
guest upon an eventfd trigger. User space uses the KVM_IRQFD VM ioctl to
provide KVM with a kvm_irqfd struct that associates a VM, an eventfd and
a virtual IRQ number (aka. the gsi). When an actor signals the eventfd
(typically a VFIO platform driver), the KVM irqfd subsystem injects the
gsi into the VM.

Resamplefd is also supported for level-sensitive interrupts, i.e. the
user can provide another eventfd that is triggered when the completion
of the virtual IRQ (gsi) is detected by the GIC.

The gsi must correspond to a shared peripheral interrupt (SPI), i.e. the
GIC interrupt ID is gsi + 32.

The rationale behind not supporting PPI irqfd injection is that any
device using a PPI would be a private-to-the-CPU device (timer for
instance), so its state would have to be context-switched along with the
VCPU and would require in-kernel wiring anyhow. It is not a relevant use
case for irqfds.

This series enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

No IRQ routing table is used, which makes it possible to remove
CONFIG_HAVE_KVM_IRQCHIP.

The ARM virtual interrupt controller, the VGIC, is dynamically
instantiated. User space may attempt to assign an irqfd before the
virtual interrupt controller is ready. For that reason a check is added
in the generic irqfd code to test whether the virtual interrupt
controller is ready. This is new functionality in v5.

This work was tested with the Calxeda Midway xgmac main interrupt, with
qemu-system-arm and the QEMU VFIO platform device. irqfd was also proven
functional on several vhost-net prototypes.

v4 -> v5:
- add the capability to check whether vgic is initialized when assigning
  an irqfd. objective is to avoid injecting IRQ before this vgic is
  ready: this corresponds to new patch files 2, 3, 4.
- do not specifically handle early virtual IRQ injections in
  kvm_set_irq. In case of injection when vgic is not yet ready, simply
  return an error. User-space now has means to force vgic init and get
  notified if irqfd assign takes place too early.
- squash [PATCH v4 2/3] KVM: arm: add irqfd support and
  [PATCH v4 3/3] KVM: arm64: add irqfd support
- add Acked-by's in KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP
- some comment rewording in vgic

v3 -> v4:
- rebase on 3.18rc5
- vgic dynamic instantiation brought new challenges: handling of irqfd
  injection when vgic is not ready
- unset of CONFIG_HAVE_KVM_IRQCHIP in a separate patch
- add arm64 enable
- vgic.c style modifications according to Christoffer comments

v2 -> v3:
- removal of irq.h from eventfd.c put in a separate patch to increase
  visibility
- properly expose KVM_CAP_IRQFD capability in arm.c
- remove CONFIG_HAVE_KVM_IRQCHIP meaningful only if irq_comm.c is used

v1 -> v2:
- rebase on 3.17rc1
- move of the dist unlock in process_maintenance
- remove of dist lock in __kvm_vgic_sync_hwstate
- rewording of the commit message (add resamplefd reference)
- remove irq.h

Eric Auger (5):
  KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP
  KVM: introduce kvm_arch_is_virtual_intc_initialized
  KVM: arm/arm64: implement kvm_arch_is_virtual_intc_initialized
  KVM: irqfd: use kvm_arch_is_virtual_intc_initialized
  KVM: arm/arm64: add irqfd support

 Documentation/virtual/kvm/api.txt | 6 +++-
 arch/arm/include/asm/kvm_host.h   | 6
 arch/arm/include/uapi/asm/kvm.h   | 3 ++
 arch/arm/kvm/Kconfig              | 4 +--
 arch/arm/kvm/Makefile             | 2 +-
 arch/arm/kvm/arm.c                | 8 +
 arch/arm64/include/asm/kvm_host.h | 5
 arch/arm64/include/uapi/asm/kvm.h | 3 ++
 arch/arm64/kvm/Kconfig            | 3 +-
 arch/arm64/kvm/Makefile           | 2 +-
 include/linux/kvm_host.h          | 12
 virt/kvm/arm/vgic.c               | 63 ---
 virt/kvm/eventfd.c                | 3 ++
 13 files changed, 110 insertions(+), 10 deletions(-)

--
1.9.1
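The gsi convention described in this cover letter (GIC interrupt ID = gsi + 32, SPIs only) can be written down as a small helper. This is a sketch under assumptions: spi_gsi_to_gic_id and GIC_NR_PRIVATE_IRQS are hypothetical names, and nr_irqs stands for the total number of interrupt IDs the virtual GIC is configured with; the series itself only states that nr_irqs is used for the range test.

```c
#include <stdbool.h>
#include <stdint.h>

/* GIC IDs 0-15 are SGIs and 16-31 are PPIs; SPIs start at ID 32. */
#define GIC_NR_PRIVATE_IRQS 32u

/*
 * Translate an irqfd gsi (an SPI index) into a GIC interrupt ID.
 * Returns false, leaving *id untouched, when the resulting ID would
 * fall outside the nr_irqs interrupt IDs of the modelled VGIC.
 */
bool spi_gsi_to_gic_id(uint32_t gsi, uint32_t nr_irqs, uint32_t *id)
{
	if (nr_irqs < GIC_NR_PRIVATE_IRQS ||
	    gsi >= nr_irqs - GIC_NR_PRIVATE_IRQS)
		return false;

	*id = gsi + GIC_NR_PRIVATE_IRQS;
	return true;
}
```

For a VGIC with 128 interrupt IDs, gsi 0 maps to ID 32 and gsi 95 maps to ID 127, while gsi 96 is rejected. PPIs and SGIs are unreachable by construction, since no gsi maps below ID 32.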
[PATCH v5 2/5] KVM: introduce kvm_arch_is_virtual_intc_initialized
Introduce the __KVM_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED define and the
associated kvm_arch_is_virtual_intc_initialized function. The latter
makes it possible to test whether the virtual interrupt controller is
initialized and ready to accept virtual IRQ injection. On some
architectures the virtual interrupt controller is dynamically
instantiated, which justifies this kind of check.

Signed-off-by: Eric Auger <eric.au...@linaro.org>
---
 include/linux/kvm_host.h | 12
 1 file changed, 12 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ea53b04..45fea3c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -696,6 +696,18 @@ static inline wait_queue_head_t *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
 #endif
 }

+#ifndef __KVM_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED
+/*
+ * returns trues if the virtual interrupt controller is initialized and
+ * ready to accept virtual IRQ. On some architectures the virtual interrupt
+ * controller is dynamically instantiated and this is not always true.
+ */
+static inline bool kvm_arch_is_virtual_intc_initialized(struct kvm *kvm)
+{
+	return true;
+}
+#endif
+
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type);
 void kvm_arch_destroy_vm(struct kvm *kvm);
 void kvm_arch_sync_events(struct kvm *kvm);
--
1.9.1
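The mechanism in this patch is the usual "generic default unless the architecture opts out" pattern. Here is a minimal single-file sketch of it; the MODEL_ macro, struct vm_model and the function name are hypothetical, chosen so they cannot be mistaken for the kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm. */
struct vm_model {
	bool vgic_ready;
};

/*
 * "Architecture" side: defining the macro announces that a real
 * implementation exists. Deleting this macro and function selects
 * the generic default below instead.
 */
#define MODEL_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED
static inline bool model_is_virtual_intc_initialized(struct vm_model *vm)
{
	return vm->vgic_ready;
}

/* "Generic" side: the fallback is compiled only when no override exists. */
#ifndef MODEL_HAVE_ARCH_VIRTUAL_INTC_INITIALIZED
static inline bool model_is_virtual_intc_initialized(struct vm_model *vm)
{
	(void)vm;
	return true;	/* statically instantiated controller: always ready */
}
#endif
```

The benefit of this shape is that generic callers need no #ifdef of their own: they call one function, and architectures with a statically instantiated controller pay nothing for the check.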
[PATCH v5 1/5] KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP
CONFIG_HAVE_KVM_IRQCHIP is needed to support IRQ routing (along
with irq_comm.c and irqchip.c usage). This is not the case for
arm/arm64 currently.

This patch unsets the flag for both arm and arm64.

Signed-off-by: Eric Auger <eric.au...@linaro.org>
Acked-by: Christoffer Dall <christoffer.d...@linaro.org>
Acked-by: Will Deacon <will.dea...@arm.com>
---
 arch/arm/kvm/Kconfig   | 2 --
 arch/arm64/kvm/Kconfig | 1 -
 2 files changed, 3 deletions(-)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 466bd29..9f581b1 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -55,7 +55,6 @@ config KVM_ARM_MAX_VCPUS
 config KVM_ARM_VGIC
 	bool "KVM support for Virtual GIC"
 	depends on KVM_ARM_HOST && OF
-	select HAVE_KVM_IRQCHIP
 	default y
 	---help---
 	  Adds support for a hardware assisted, in-kernel GIC emulation.
@@ -63,7 +62,6 @@ config KVM_ARM_VGIC
 config KVM_ARM_TIMER
 	bool "KVM support for Architected Timers"
 	depends on KVM_ARM_VGIC && ARM_ARCH_TIMER
-	select HAVE_KVM_IRQCHIP
 	default y
 	---help---
 	  Adds support for the Architected Timers in virtual machines
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8ba85e9..279e1a0 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -50,7 +50,6 @@ config KVM_ARM_MAX_VCPUS
 config KVM_ARM_VGIC
 	bool
 	depends on KVM_ARM_HOST && OF
-	select HAVE_KVM_IRQCHIP
 	---help---
 	  Adds support for a hardware assisted, in-kernel GIC emulation.
--
1.9.1
Re: [CFT PATCH v2 2/2] KVM: x86: support XSAVES usage in the host
2014-12-03 15:26+0100, Paolo Bonzini:
> On 03/12/2014 15:23, Nadav Amit wrote:
> > I think it is better just to replace the last line with:
> >
> >   *(u64 *)(dest + XSAVE_HDR_OFFSET) = xsave->xsave_hdr.xstate_bv
>
> Yeah, or we can use this value for xstate_bv to save some copying too,
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 19e5e8f..ba2b7bd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3137,7 +3137,7 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
>  static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
>  {
>  	struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state->xsave;
> -	u64 xstate_bv = vcpu->arch.guest_supported_xcr0 | XSTATE_FPSSE;
> +	u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
>  	u64 valid;
>
>  	/*

Right, this matches

  u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
  ...
  xsave->xsave_hdr.xstate_bv = xstate_bv;

in load_xsave.

Btw, we don't care about crashers from userspace?

---8<---
KVM: x86: prevent #GP with malicious xsave

XRSTORS throws #GP when XSTATE_BV isn't a subset of XCOMP_BV. Make it so.

SDM: XRSTORS Exceptions
  #GP If a bit in the XCOMP_BV field in the XSAVE header is 0 and the
      corresponding bit in the XSTATE_BV field is 1.
(Also in SDM: 13.11 OPERATION OF XRSTORS)

Signed-off-by: Radim Krčmář <rkrc...@redhat.com>
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca26681..19e5e8f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3184,8 +3184,10 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)

 	/* Set XSTATE_BV and possibly XCOMP_BV. */
 	xsave->xsave_hdr.xstate_bv = xstate_bv;
-	if (cpu_has_xsaves)
+	if (cpu_has_xsaves) {
 		xsave->xsave_hdr.xcomp_bv = host_xcr0 | XSTATE_COMPACTION_ENABLED;
+		xsave->xsave_hdr.xstate_bv &= xsave->xsave_hdr.xcomp_bv;
+	}

 	/*
 	 * Copy each region from the non-compacted offset to the
--
2.2.0
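The #GP condition quoted from the SDM in this message reduces to a subset test on two 64-bit masks. The following is a small sketch of that test only; xrstors_header_ok is a hypothetical helper, and XSTATE_COMPACTION_ENABLED is defined locally as bit 63 (the compacted-format flag of XCOMP_BV) so the block is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of XCOMP_BV marks the compacted XSAVE format. */
#define XSTATE_COMPACTION_ENABLED (1ULL << 63)

/*
 * XRSTORS raises #GP if any bit is 1 in XSTATE_BV while the same bit is
 * 0 in XCOMP_BV, i.e. if XSTATE_BV is not a subset of XCOMP_BV.
 */
bool xrstors_header_ok(uint64_t xstate_bv, uint64_t xcomp_bv)
{
	return (xstate_bv & ~xcomp_bv) == 0;
}
```

Masking a userspace-supplied XSTATE_BV with XCOMP_BV, as the patch in this message aims to do, makes this predicate hold by construction.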
Re: [Xen-devel] [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT
On Wed, Dec 03, 2014 at 05:37:51AM +0100, Juergen Gross wrote: On 12/03/2014 03:28 AM, Luis R. Rodriguez wrote: On Tue, Dec 02, 2014 at 11:11:18AM +, David Vrabel wrote: On 01/12/14 22:36, Luis R. Rodriguez wrote: Then I do agree its a fair analogy (and find this obviously odd that how widespread cond_resched() is), we just don't have an equivalent for IRQ context, why not avoid the special check then and use this all the time in the middle of a hypercall on the return from an interrupt (e.g., the timer interrupt)? http://lists.xen.org/archives/html/xen-devel/2014-02/msg01101.html OK thanks! That explains why we need some asm code but in that submission you still also had used is_preemptible_hypercall(regs) and in the new implementation you use a CPU variable xen_in_preemptible_hcall prior to calling preempt_schedule_irq(). I believe you added the CPU variable because preempt_schedule_irq() will preempt first without any checks if it should, I'm asking why not do something like cond_resched_irq() where we check with should_resched() prior to preempting and that way we can avoid having to use the CPU variable? Because that could preempt at any asynchronous interrupt making the no-preempt kernel fully preemptive. OK yeah I see. That still doesn't negate the value of using something like cond_resched_irq() with a should_resched() on only critical hypercalls. The current implementation (patch by David) forces preemption without checking for should_resched() so it would preempt unnecessarily at least once. How would you know you are just doing a critical hypercall which should be preempted? You would not, you're right. I was just trying to see if we could generalize an API for this to avoid having users having to create their own CPU variables but this all seems very specialized as we want to use this on the timer so if we do generalize a cond_resched_irq() perhaps the documentation can warn about this type of case or abuse. 
Luis
[PATCH v2 3/6] arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu
When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also
reset the HCR, because we now modify the HCR dynamically to
enable/disable trapping of guest accesses to the VM registers.

This is crucial for reboot of VMs working since otherwise we will not be
doing the necessary cache maintenance operations when faulting in pages
with the guest MMU off.

Signed-off-by: Christoffer Dall <christoffer.d...@linaro.org>
---
 arch/arm/include/asm/kvm_emulate.h   | 5 +
 arch/arm/kvm/arm.c                   | 2 ++
 arch/arm/kvm/guest.c                 | 1 -
 arch/arm64/include/asm/kvm_emulate.h | 5 +
 arch/arm64/kvm/guest.c               | 1 -
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index b9db269..66ce176 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -33,6 +33,11 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu);
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);

+static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.hcr = HCR_GUEST_MASK;
+}
+
 static inline bool vcpu_mode_is_32bit(struct kvm_vcpu *vcpu)
 {
 	return 1;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index edc1964..24c9ca4 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -658,6 +658,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	if (ret)
 		return ret;

+	vcpu_reset_hcr(vcpu);
+
 	/*
 	 * Handle the start in power-off case by marking the VCPU as paused.
 	 */
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index cc0b787..8c97208 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {

 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.hcr = HCR_GUEST_MASK;
 	return 0;
 }

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5674a55..8127e45 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -38,6 +38,11 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu);
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);

+static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
 	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 7679469..84d5959 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {

 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
 	return 0;
 }

--
2.1.2.330.g565301e.dirty
[PATCH v2 4/6] arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABI
It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Documentation/virtual/kvm/api.txt | 5 + arch/arm/include/asm/kvm_host.h | 2 -- arch/arm/kvm/arm.c| 43 +++ arch/arm/kvm/guest.c | 25 --- arch/arm64/include/asm/kvm_host.h | 2 -- arch/arm64/kvm/guest.c| 25 --- 6 files changed, 48 insertions(+), 54 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index bb82a90..81f1b97 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2453,6 +2453,11 @@ return ENOEXEC for that vcpu. Note that because some registers reflect machine topology, all vcpus should be created before this ioctl is invoked. +Userspace can call this function multiple times for a given vcpu, including +after the vcpu has been run. This will reset the vcpu to its initial +state. All calls to this function after the initial call must use the same +target and same set of feature flags, otherwise EINVAL will be returned. + Possible features: - KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state. Depends on KVM_CAP_ARM_PSCI. 
If not set, the CPU will be powered on

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 53036e2..254e065 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -150,8 +150,6 @@ struct kvm_vcpu_stat {
 	u32 halt_wakeup;
 };
 
-int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
-			const struct kvm_vcpu_init *init);
 int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init);
 unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 24c9ca4..4043769 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -263,6 +263,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
 	/* Force users to call KVM_ARM_VCPU_INIT */
 	vcpu->arch.target = -1;
+	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
 
 	/* Set up the timer */
 	kvm_timer_vcpu_init(vcpu);
@@ -649,6 +650,48 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
 	return -EINVAL;
 }
 
+static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
+			       const struct kvm_vcpu_init *init)
+{
+	unsigned int i;
+	int phys_target = kvm_target_cpu();
+
+	if (init->target != phys_target)
+		return -EINVAL;
+
+	/*
+	 * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
+	 * use the same target.
+	 */
+	if (vcpu->arch.target != -1 && vcpu->arch.target != init->target)
+		return -EINVAL;
+
+	/* -ENOENT for unknown features, -EINVAL for invalid combinations. */
+	for (i = 0; i < sizeof(init->features) * 8; i++) {
+		bool set = (init->features[i / 32] & (1 << (i % 32)));
+
+		if (set && i >= KVM_VCPU_MAX_FEATURES)
+			return -ENOENT;
+
+		/*
+		 * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
+		 * use the same feature set.
+		 */
+		if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES &&
+		    test_bit(i, vcpu->arch.features) != set)
+			return -EINVAL;
+
+		if (set)
+			set_bit(i, vcpu->arch.features);
+	}
+
+	vcpu->arch.target = phys_target;
+
+	/* Now we know what it is, we can reset it. */
+	return kvm_reset_vcpu(vcpu);
+}
+
+
 static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 					 struct kvm_vcpu_init *init)
 {
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 8c97208..384bab6 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -273,31 +273,6 @@ int __attribute_const__ kvm_target_cpu(void)
 	}
 }
 
-int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
-			const struct kvm_vcpu_init *init)
-{
-	unsigned int i;
-
-	/* We can only cope with guest==host and only on A15/A7 (for now). */
-	if (init->target != kvm_target_cpu())
-		return -EINVAL;
-
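The "same target, same feature flags" rule from the patch above can be tried out in plain userspace C. The following is only a sketch under stated assumptions: `vcpu_model`, `init_model`, `model_set_target`, `MAX_FEATURES` and `FEATURE_WORDS` are hypothetical stand-ins, not kernel names, and the model commits the feature bits only once every check has passed, whereas the patch sets them while it walks the bitmap.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_FEATURES  3 /* hypothetical stand-in for KVM_VCPU_MAX_FEATURES */
#define FEATURE_WORDS 7 /* assumed size of the features array in the init struct */

struct vcpu_model {
	int target;        /* -1 until the first successful init */
	uint32_t features; /* bit i set => feature i was requested */
};

struct init_model {
	int target;
	uint32_t features[FEATURE_WORDS];
};

/* Returns 0 on success, -EINVAL on a changed request, -ENOENT on an unknown bit. */
static int model_set_target(struct vcpu_model *vcpu,
			    const struct init_model *init, int phys_target)
{
	uint32_t requested = 0;
	unsigned int i;

	if (init->target != phys_target)
		return -EINVAL;

	/* Later calls must name the same target as the first call. */
	if (vcpu->target != -1 && vcpu->target != init->target)
		return -EINVAL;

	for (i = 0; i < sizeof(init->features) * 8; i++) {
		bool set = (init->features[i / 32] >> (i % 32)) & 1u;

		if (set && i >= MAX_FEATURES)
			return -ENOENT;

		if (i < MAX_FEATURES) {
			bool was_set = (vcpu->features >> i) & 1u;

			/* Later calls must request exactly the same feature bits. */
			if (vcpu->target != -1 && was_set != set)
				return -EINVAL;
			if (set)
				requested |= 1u << i;
		}
	}

	vcpu->features = requested;
	vcpu->target = phys_target;
	return 0;
}
```

Calling `model_set_target()` twice with identical arguments succeeds both times, while a second call that adds or drops a known feature bit is rejected with `-EINVAL`.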
[PATCH v2 1/6] arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag
If a VCPU was originally started with power off (typically to be brought up
by PSCI in SMP configurations), there is no need to clear the POWER_OFF flag
in the kernel, as this flag is only tested during the init ioctl itself.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 arch/arm/kvm/arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..b160bea 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -661,7 +661,7 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	/*
 	 * Handle the start in power-off case by marking the VCPU as paused.
 	 */
-	if (__test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
+	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
 		vcpu->arch.pause = true;
 
 	return 0;
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
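The one-line change above swaps a destructive read for a plain read. A small userspace sketch of the difference follows; `flag_test`, `flag_test_and_clear` and `POWER_OFF_BIT` are hypothetical stand-ins for the kernel's `test_bit()`, `__test_and_clear_bit()` and `KVM_ARM_VCPU_POWER_OFF`, written out only to illustrate the semantics.

```c
#include <stdbool.h>

#define POWER_OFF_BIT 0 /* hypothetical stand-in for KVM_ARM_VCPU_POWER_OFF */

/* Non-destructive read: the stored feature word is left untouched. */
static bool flag_test(const unsigned long *flags, unsigned int bit)
{
	return (*flags >> bit) & 1UL;
}

/* Destructive read: reports the old value and then clears the bit. */
static bool flag_test_and_clear(unsigned long *flags, unsigned int bit)
{
	bool was_set = flag_test(flags, bit);

	*flags &= ~(1UL << bit);
	return was_set;
}
```

With the non-destructive variant the stored feature word still records that the vcpu was created powered off, which presumably matters once a later call to the init ioctl is compared against the stored feature set.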
[PATCH v2 6/6] arm/arm64: KVM: Introduce stage2_unmap_vm
Introduce a new function to unmap user RAM regions in the stage2 page
tables. This is needed on reboot (or when the guest turns off the MMU)
to ensure we fault in pages again and make the dcache, RAM, and icache
coherent.

Using unmap_stage2_range for the whole guest physical range does not
work, because that unmaps IO regions (such as the GIC) which will not
be recreated or in the best case faulted in on a page-by-page basis.

Call this function on secondary and subsequent calls to the
KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest
Stage-1 MMU is off when faulting in pages and make the caches coherent.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 arch/arm/include/asm/kvm_mmu.h   |  1 +
 arch/arm/kvm/arm.c               |  7 +
 arch/arm/kvm/mmu.c               | 65 +
 arch/arm64/include/asm/kvm_mmu.h |  1 +
 4 files changed, 74 insertions(+)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index acb0d57..4654c42 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -52,6 +52,7 @@ int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
 void free_boot_hyp_pgd(void);
 void free_hyp_pgds(void);
 
+void stage2_unmap_vm(struct kvm *kvm);
 int kvm_alloc_stage2_pgd(struct kvm *kvm);
 void kvm_free_stage2_pgd(struct kvm *kvm);
 int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 4043769..da87c07 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -701,6 +701,13 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	if (ret)
 		return ret;
 
+	/*
+	 * Ensure a rebooted VM will fault in RAM pages and detect if the
+	 * guest MMU is turned off and flush the caches as needed.
+	 */
+	if (vcpu->arch.has_run_once)
+		stage2_unmap_vm(vcpu->kvm);
+
 	vcpu_reset_hcr(vcpu);
 
 	/*
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 57a403a..b1f3c9a 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -611,6 +611,71 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
 	unmap_range(kvm, kvm->arch.pgd, start, size);
 }
 
+static void stage2_unmap_memslot(struct kvm *kvm,
+				 struct kvm_memory_slot *memslot)
+{
+	hva_t hva = memslot->userspace_addr;
+	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
+	phys_addr_t size = PAGE_SIZE * memslot->npages;
+	hva_t reg_end = hva + size;
+
+	/*
+	 * A memory region could potentially cover multiple VMAs, and any holes
+	 * between them, so iterate over all of them to find out if we should
+	 * unmap any of them.
+	 *
+	 *     +--------------------------------------------+
+	 * +---------------+----------------+   +----------------+
+	 * |   : VMA 1     |      VMA 2     |   |    VMA 3  :    |
+	 * +---------------+----------------+   +----------------+
+	 *     |               memory region                |
+	 *     +--------------------------------------------+
+	 */
+	do {
+		struct vm_area_struct *vma = find_vma(current->mm, hva);
+		hva_t vm_start, vm_end;
+
+		if (!vma || vma->vm_start >= reg_end)
+			break;
+
+		/*
+		 * Take the intersection of this VMA with the memory region
+		 */
+		vm_start = max(hva, vma->vm_start);
+		vm_end = min(reg_end, vma->vm_end);
+
+		if (!(vma->vm_flags & VM_PFNMAP)) {
+			gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
+			unmap_stage2_range(kvm, gpa, vm_end - vm_start);
+		}
+		hva = vm_end;
+	} while (hva < reg_end);
+}
+
+/**
+ * stage2_unmap_vm - Unmap Stage-2 RAM mappings
+ * @kvm: The struct kvm pointer
+ *
+ * Go through the memregions and unmap any regular RAM
+ * backing memory already mapped to the VM.
+ */
+void stage2_unmap_vm(struct kvm *kvm)
+{
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+	int idx;
+
+	idx = srcu_read_lock(&kvm->srcu);
+	spin_lock(&kvm->mmu_lock);
+
+	slots = kvm_memslots(kvm);
+	kvm_for_each_memslot(memslot, slots)
+		stage2_unmap_memslot(kvm, memslot);
+
+	spin_unlock(&kvm->mmu_lock);
+	srcu_read_unlock(&kvm->srcu, idx);
+}
+
 /**
  * kvm_free_stage2_pgd - free all stage-2 tables
  * @kvm:	The KVM struct pointer for the VM.
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 0caf7a5..061fed7 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -83,6 +83,7 @@ int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
 void free_boot_hyp_pgd(void);
 void
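The loop in stage2_unmap_memslot is an interval intersection: each VMA is clipped to the memory region, and only the clipped parts that are not PFN-mapped IO get unmapped. The sketch below replays that arithmetic in userspace. `struct range_model` and `unmappable_bytes` are hypothetical names, the VMAs are assumed to be sorted by start address and non-overlapping, and instead of unmapping anything the function just adds up the bytes that would be unmapped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end); pfnmap models the VM_PFNMAP flag. */
struct range_model {
	uint64_t start;
	uint64_t end;
	bool pfnmap;
};

/* Sum the bytes of non-PFNMAP VMAs that fall inside [reg_start, reg_end). */
static uint64_t unmappable_bytes(const struct range_model *vmas, size_t n,
				 uint64_t reg_start, uint64_t reg_end)
{
	uint64_t total = 0;

	for (size_t k = 0; k < n; k++) {
		uint64_t s, e;

		if (vmas[k].end <= reg_start)
			continue; /* VMA lies entirely below the region */
		if (vmas[k].start >= reg_end)
			break;    /* sorted input: nothing further can overlap */

		/* Intersection of this VMA with the memory region. */
		s = vmas[k].start > reg_start ? vmas[k].start : reg_start;
		e = vmas[k].end < reg_end ? vmas[k].end : reg_end;

		if (!vmas[k].pfnmap)
			total += e - s;
	}
	return total;
}
```

For a region 0x2000..0x8000 overlapped by a RAM VMA at 0x1000..0x3000, an IO VMA at 0x3000..0x5000 and a RAM VMA at 0x6000..0x9000, only 0x2000..0x3000 and 0x6000..0x8000 count; the IO range and the hole at 0x5000..0x6000 are skipped.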
[PATCH v2 2/6] arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option
The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

Implement the expected functionality and clarify the ABI.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 Documentation/virtual/kvm/api.txt | 3 ++-
 arch/arm/kvm/arm.c                | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 7610eaa..bb82a90 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2455,7 +2455,8 @@ should be created before this ioctl is invoked.
 
 Possible features:
 	- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
-	  Depends on KVM_CAP_ARM_PSCI.
+	  Depends on KVM_CAP_ARM_PSCI.  If not set, the CPU will be powered on
+	  and execute guest code when KVM_RUN is called.
 	- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
 	  Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
 	- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index b160bea..edc1964 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -663,6 +663,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	 */
 	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
 		vcpu->arch.pause = true;
+	else
+		vcpu->arch.pause = false;
 
 	return 0;
 }
-- 
2.1.2.330.g565301e.dirty
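To see why the added else branch matters, here is a tiny userspace model of the pause state. The names `vcpu_power_model`, `init_power_state`, `psci_cpu_off` and `POWER_OFF_BIT` are hypothetical and only illustrate the behaviour described in the commit message.

```c
#include <stdbool.h>

#define POWER_OFF_BIT 0 /* hypothetical stand-in for KVM_ARM_VCPU_POWER_OFF */

struct vcpu_power_model {
	unsigned long features; /* feature bits requested at init time */
	bool pause;             /* true while the vcpu is powered off */
};

/* Models the end of the init ioctl after this patch: set or clear pause. */
static void init_power_state(struct vcpu_power_model *vcpu)
{
	vcpu->pause = (vcpu->features >> POWER_OFF_BIT) & 1UL;
}

/* Models a guest turning one of its vcpus off through PSCI. */
static void psci_cpu_off(struct vcpu_power_model *vcpu)
{
	vcpu->pause = true;
}
```

Without the else branch, a vcpu that the guest had switched off through PSCI would stay paused across a re-init even though userspace never asked for a powered-off start.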
[PATCH v2 0/6] Improve PSCI system events and fix reboot bugs
Several people have reported problems with rebooting ARM VMs, especially
on 32-bit ARM. This is mainly due to the same reason we were seeing
boot errors in the past, namely that the RAM, dcache, and icache weren't
coherent on guest boot with the guest (stage-1) MMU disabled. We solved
this by ensuring coherency when we fault in pages, but since most memory
is already mapped after a reboot, we don't do anything.

The solution is to unmap the regular RAM on VCPU init, but we must take
care to not unmap the GIC or other IO regions, hence the somewhat
complicated solution.

As part of figuring this out, it became clear that some semantics around
the KVM_ARM_VCPU_INIT ABI and system event ABI were unclear (what is
userspace expected to do when it receives a system event). This series
also clarifies the ABI and changes the kernel functionality to do what
userspace expects (turn off VCPUs on a system shutdown event).

The code is available here as well:
http://git.linaro.org/people/christoffer.dall/linux-kvm-arm.git vcpu_init_fixes-v2

There is an alternative version with more code-reuse for the unmapping
implementation for the previous version of this patch series available
in the following git repo:
http://git.linaro.org/people/christoffer.dall/linux-kvm-arm.git vcpu_init_fixes-alternative

Testing
-------
This has been tested on CubieBoard, Arndale, TC2, and Juno. On Arndale
and TC2 it was extremely easy to reproduce the problem (just start a VM
that runs reboot from /etc/rc.local or similar) and this series clearly
fixes the behavior. For the previous version of this series, I was
seeing some problems on Juno, but it turned out to be because I wasn't
limiting my testing to one of the clusters, and since we don't support
re-initing a VCPU on a different physical host CPU (big.LITTLE), it was
failing. For this version of the patch series, it has been running a
reboot loop on Juno for hours.

Changelog
---------
Changes v1->v2:
 - New patch to not clear the VCPU_POWER_OFF flag
 - Fixed spelling error in commit message
 - Adapted ABI texts based on Peter's feedback
 - Check for changed parameters to KVM_ARM_VCPU_INIT
 - Now unmap the Stage-2 RAM mappings at VCPU init instead of at PSCI
   system event time.

Christoffer Dall (6):
  arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag
  arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option
  arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu
  arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABI
  arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
  arm/arm64: KVM: Introduce stage2_unmap_vm

 Documentation/virtual/kvm/api.txt    | 17 +-
 arch/arm/include/asm/kvm_emulate.h   |  5 +++
 arch/arm/include/asm/kvm_host.h      |  2 --
 arch/arm/include/asm/kvm_mmu.h       |  1 +
 arch/arm/kvm/arm.c                   | 56 ++-
 arch/arm/kvm/guest.c                 | 26 ---
 arch/arm/kvm/mmu.c                   | 65 
 arch/arm/kvm/psci.c                  | 19 +++
 arch/arm64/include/asm/kvm_emulate.h |  5 +++
 arch/arm64/include/asm/kvm_host.h    |  3 +-
 arch/arm64/include/asm/kvm_mmu.h     |  1 +
 arch/arm64/kvm/guest.c               | 26 ---
 12 files changed, 168 insertions(+), 58 deletions(-)

-- 
2.1.2.330.g565301e.dirty
[PATCH v2 5/6] arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot
When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
should really be turned off for the VM adhering to the suggestions in
the PSCI spec, and it's the sane thing to do.

Also, clarify the behavior and expectations for exits to user space with
the KVM_EXIT_SYSTEM_EVENT case.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 Documentation/virtual/kvm/api.txt |  9 +
 arch/arm/kvm/psci.c               | 19 +++
 arch/arm64/include/asm/kvm_host.h |  1 +
 3 files changed, 29 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 81f1b97..228f9cf 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
 the system-level event type. The 'flags' field describes architecture
 specific flags for the system-level event.
 
+Valid values for 'type' are:
+  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
+   VM. Userspace is not obliged to honour this, and if it does honour
+   this does not need to destroy the VM synchronously (ie it may call
+   KVM_RUN again before shutdown finally occurs).
+  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
+   As with SHUTDOWN, userspace can choose to ignore the request, or
+   to schedule the reset to occur in the future and may call KVM_RUN again.
+
 		/* Fix the size of the union. */
 		char padding[256];
 	};
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 09cf377..ae0bb91 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -15,6 +15,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/preempt.h>
 #include <linux/kvm_host.h>
 #include <linux/wait.h>
 
@@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 
 static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
 {
+	int i;
+	struct kvm_vcpu *tmp;
+
+	/*
+	 * The KVM ABI specifies that a system event exit may call KVM_RUN
+	 * again and may perform shutdown/reboot at a later time than when the
+	 * actual request is made.  Since we are implementing PSCI and a
+	 * caller of PSCI reboot and shutdown expects that the system shuts
+	 * down or reboots immediately, let's make sure that VCPUs are not run
+	 * after this call is handled and before the VCPUs have been
+	 * re-initialized.
+	 */
+	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
+		tmp->arch.pause = true;
+	preempt_disable();
+	force_vm_exit(cpu_all_mask);
+	preempt_enable();
+
 	memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
 	vcpu->run->system_event.type = type;
 	vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65c6152..0b7dfdb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -198,6 +198,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void force_vm_exit(const cpumask_t *mask);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
-- 
2.1.2.330.g565301e.dirty
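The interplay between the system event and a later re-init can be modelled in a few lines of userspace C. This is a sketch under assumptions: `vm_model`, `model_system_event`, `model_vcpu_init`, `model_can_run_guest` and `NR_VCPUS` are hypothetical names, and the model tracks only the pause flag, not the forced VM exit.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_VCPUS 4 /* hypothetical VM size for the model */

struct vm_model {
	bool pause[NR_VCPUS]; /* mirrors the per-vcpu pause flag */
};

/* Kernel side: a PSCI SYSTEM_OFF or SYSTEM_RESET pauses every vcpu. */
static void model_system_event(struct vm_model *vm)
{
	for (size_t i = 0; i < NR_VCPUS; i++)
		vm->pause[i] = true;
}

/* Userspace side: re-running the init ioctl decides the power state again. */
static void model_vcpu_init(struct vm_model *vm, size_t idx, bool power_off)
{
	vm->pause[idx] = power_off;
}

/* A paused vcpu does not execute guest code even if userspace calls run. */
static bool model_can_run_guest(const struct vm_model *vm, size_t idx)
{
	return !vm->pause[idx];
}
```

In this model userspace may still enter the run loop after the event, as the ABI text allows, but no vcpu makes guest progress until it has been re-initialized without the power-off flag.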
Re: [PATCH] kvm: x86: vmx: add checks on guest RIP
On 12/01/2014 08:27 AM, Paolo Bonzini wrote:
> On 29/11/2014 16:27, Eugene Korenevsky wrote:
>> Signed-off-by: Eugene Korenevsky ekorenev...@gmail.com
>> ---
>>
>> Notes:
>>     This patch adds checks on Guest RIP specified in Intel Software
>>     Developer Manual.
>>
>>     The following checks are performed on processors that support
>>     Intel 64 architecture:
>>     - Bits 63:32 must be 0 if the IA-32e mode guest VM-entry control
>>       is 0 or if the L bit (bit 13) in the access-rights field for CS
>>       is 0.
>>     - If the processor supports N < 64 linear-address bits, bits 63:N
>>       must be identical if the IA-32e mode guest VM-entry control is 1
>>       and the L bit in the access-rights field for CS is 1. (No check
>>       applies if the processor supports 64 linear-address bits.)
>>
>>  arch/x86/kvm/vmx.c | 27 ++-
>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 6a951d8..e2da83b 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3828,6 +3828,28 @@ static bool cs_ss_rpl_check(struct kvm_vcpu *vcpu)
>>  		(ss.selector & SELECTOR_RPL_MASK));
>>  }
>>
>> +#ifdef CONFIG_X86_64
>> +static bool rip_valid(struct kvm_vcpu *vcpu)
>> +{
>> +	unsigned long rip;
>> +	struct kvm_segment cs;
>> +	bool longmode;
>> +
>> +	/* RIP must be canonical in long mode
>> +	 * Bits 63:32 of RIP must be zero in other processor modes */
>> +	longmode = false;
>> +	if (vm_entry_controls_get(to_vmx(vcpu)) & VM_ENTRY_IA32E_MODE) {
>> +		vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
>> +		longmode = (cs.l != 0);
>> +	}
>> +	rip = kvm_register_read(vcpu, VCPU_REGS_RIP);
>> +	if (longmode)
>> +		return !is_noncanonical_address(rip);
>
> This check is off by one. It is checking bits 63:47 instead of bits
> 63:48 (this quirk is intentionally part of the specification, so that
> you can reenter a guest at 0x8000 after e.g. a VMCALL vmexit and cause
> a general protection fault).

Seriously?  Intel did that for vmcall but not sysret?

> However, I am not sure how this can occur. A #GP should have been
> injected as part of the instruction that caused RIP to become invalid.
> Perhaps you should check in nested_vmx_run instead?

For syscall/sysret, at least, if you put a syscall at the highest
possible non-negative canonical address, the sysret will fault on the
way back.

--Andy

> Paolo
>
>> +	else
>> +		return (rip >> 32) == 0;
>> +}
>> +#endif
>> +
>>  /*
>>   * Check if guest state is valid. Returns true if valid, false if
>>   * not.
>> @@ -3873,8 +3895,11 @@ static bool guest_state_valid(struct kvm_vcpu *vcpu)
>>  		if (!ldtr_valid(vcpu))
>>  			return false;
>>  	}
>> +#ifdef CONFIG_X86_64
>> +	if (!rip_valid(vcpu))
>> +		return false;
>> +#endif
>>  	/* TODO:
>> -	 * - Add checks on RIP
>>  	 * - Add checks on RFLAGS
>>  	 */
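The off-by-one that Paolo describes is easier to see with the bit arithmetic written out. This is a minimal sketch, assuming N = 48 linear-address bits; `upper_bits_identical` and `rip_passes_entry_check` are hypothetical helper names, and the address 0x0000800000000000 is an illustrative value chosen here, not one taken from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits 63:low_bit of addr are all zero or all one (low_bit < 64). */
static bool upper_bits_identical(uint64_t addr, unsigned int low_bit)
{
	uint64_t upper = addr >> low_bit;
	uint64_t all_ones = UINT64_MAX >> low_bit;

	return upper == 0 || upper == all_ones;
}

/* The two RIP rules quoted in the patch notes, for n linear-address bits. */
static bool rip_passes_entry_check(uint64_t rip, bool long_mode, unsigned int n)
{
	if (!long_mode)
		return (rip >> 32) == 0;       /* bits 63:32 must be 0 */
	if (n >= 64)
		return true;                   /* no check applies */
	return upper_bits_identical(rip, n);   /* bits 63:N must be identical */
}
```

A canonical-address test for 48 bits corresponds to `upper_bits_identical(addr, 47)`, while the rule as quoted corresponds to `upper_bits_identical(addr, 48)`. The value 0x0000800000000000 fails the first and passes the second, which is the one-bit gap the review comment points at.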
[PATCH] KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions
This patch adds trace points in the guest entry and exit code and also
for exceptions handled by the host in kernel mode - hypercalls and page
faults. The new events are added to /sys/kernel/debug/tracing/events
under a new subsystem called kvm_hv.

Acked-by: Paul Mackerras pau...@samba.org
Signed-off-by: Suresh Warrier warr...@linux.vnet.ibm.com
---
Added new include file for common trace defines for kvm_pr and kvm_hv.
Replaced hand-written numbers with defines in trace_hv.h.

 arch/powerpc/kvm/book3s_64_mmu_hv.c |  12 +-
 arch/powerpc/kvm/book3s_hv.c        |  19 ++
 arch/powerpc/kvm/trace_book3s.h     |  32 +++
 arch/powerpc/kvm/trace_hv.h         | 477 
 arch/powerpc/kvm/trace_pr.h         |  25 +-
 5 files changed, 538 insertions(+), 27 deletions(-)
 create mode 100644 arch/powerpc/kvm/trace_book3s.h
 create mode 100644 arch/powerpc/kvm/trace_hv.h

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 8190e36..52e8fa1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -39,6 +39,7 @@
 #include <asm/cputable.h>
 
 #include "book3s_hv_cma.h"
+#include "trace_hv.h"
 
 /* POWER7 has 10-bit LPIDs, PPC970 has 6-bit LPIDs */
 #define MAX_LPID_970	63
@@ -628,6 +629,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	gfn = gpa >> PAGE_SHIFT;
 	memslot = gfn_to_memslot(kvm, gfn);
 
+	trace_kvm_page_fault_enter(vcpu, hpte, memslot, ea, dsisr);
+
 	/* No memslot means it's an emulated MMIO region */
 	if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID)) {
 		gpa |= (ea & (psize - 1));
@@ -642,6 +645,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	mmu_seq = kvm->mmu_notifier_seq;
 	smp_rmb();
 
+	ret = -EFAULT;
 	is_io = 0;
 	pfn = 0;
 	page = NULL;
@@ -665,7 +669,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		}
 		up_read(&current->mm->mmap_sem);
 		if (!pfn)
-			return -EFAULT;
+			goto out_put;
 	} else {
 		page = pages[0];
 		if (PageHuge(page)) {
@@ -693,14 +697,14 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		pfn = page_to_pfn(page);
 	}
 
-	ret = -EFAULT;
 	if (psize > pte_size)
 		goto out_put;
 
 	/* Check WIMG vs. the actual page we're accessing */
 	if (!hpte_cache_flags_ok(r, is_io)) {
 		if (is_io)
-			return -EFAULT;
+			goto out_put;
+
 		/*
 		 * Allow guest to map emulated device memory as
 		 * uncacheable, but actually make it cacheable.
@@ -756,6 +760,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		SetPageDirty(page);
 
  out_put:
+	trace_kvm_page_fault_exit(vcpu, hpte, ret);
+
 	if (page) {
 		/*
 		 * We drop pages[0] here, not page because page might
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c2d2535..40615ab 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -58,6 +58,9 @@
 
 #include "book3s.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace_hv.h"
+
 /* #define EXIT_DEBUG */
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define EXIT_DEBUG_INT */
@@ -1721,6 +1724,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
 		kvmppc_start_thread(vcpu);
 		kvmppc_create_dtl_entry(vcpu, vc);
+		trace_kvm_guest_enter(vcpu);
 	}
 
 	/* Set this explicitly in case thread 0 doesn't have a vcpu */
@@ -1729,6 +1733,9 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 
 	vc->vcore_state = VCORE_RUNNING;
 	preempt_disable();
+
+	trace_kvmppc_run_core(vc, 0);
+
 	spin_unlock(&vc->lock);
 
 	kvm_guest_enter();
@@ -1774,6 +1781,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 		    kvmppc_core_pending_dec(vcpu))
 			kvmppc_core_dequeue_dec(vcpu);
 
+		trace_kvm_guest_exit(vcpu);
+
 		ret = RESUME_GUEST;
 		if (vcpu->arch.trap)
 			ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
@@ -1799,6 +1808,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 			wake_up(&vcpu->arch.cpu_run);
 		}
 	}
+
+	trace_kvmppc_run_core(vc, 1);
 }
 
 /*
@@ -1845,11 +1856,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	}
 
 	vc->vcore_state = VCORE_SLEEPING;
+	trace_kvmppc_vcore_blocked(vc, 0);
 	spin_unlock(&vc->lock);
 	schedule();
 	finish_wait(&vc->wq,
Re: Windows 7 VM BSOD
Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online. I tried to use hv_ other options but Win7 can't boot up properly and stucked at starting Windows screen. Could you confirm that the stuck was caused by vhich hv feature? The commit fc57ac2c9ca can resolve a stuck caused by hv_vapic which I encountered before. https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
Hi, I don't want to recompile stuff, does it matter to have hv_vapic on at all? On Thu, Dec 4, 2014 at 9:24 AM, Zhang Haoyu zhan...@sangfor.com wrote: Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online. I tried to use hv_ other options but Win7 can't boot up properly and stucked at starting Windows screen. Could you confirm that the stuck was caused by vhich hv feature? The commit fc57ac2c9ca can resolve a stuck caused by hv_vapic which I encountered before. https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? 
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu -- Thomas Lau Director of Infrastructure Tetrion Capital Limited Direct: +852-3976-8903 Mobile: +852-9323-9670 Address: 20/F, IFC 1, Central district, Hong Kong
Re: [PATCH 0/3] x86_64,entry: Rearrange the syscall exit optimizations
[adding potentially interested people]

On Fri, Nov 7, 2014 at 3:58 PM, Andy Lutomirski l...@amacapital.net wrote:
> The syscall exit asm is a big mess. There's a really fast path, some
> kind of fast path code (with a hard-coded optimization for audit), and
> the really slow path. The result is that it's very hard to work with
> this code. There are some asm paths that are much slower than they
> should be (context tracking is a major offender), but no one really
> wants to add even more asm to speed them up.
>
> This series takes a different, unorthodox approach. Rather than trying
> to avoid entering the very slow iret path, it adds a way back out of
> the iret path. The result is a dramatic speedup for context tracking,
> user return notification, and similar code, at the cost of a few lines
> of tricky asm. Nonetheless, it's barely a net addition of asm code,
> because we get to remove the fast path optimizations for audit and
> rescheduling.
>
> Thoughts? If this works, it opens the door for a lot of further
> consolidation of the exit code.
>
> Note: patch 1 in this series has been floating around on the list for
> quite a while. It's mandatory for this series to work, because the
> buglet that it fixes almost completely defeats the optimization that
> I'm introducing.

It turns out that sysret_audit may be rather buggy. I think it leaves
edx and edi in a confused state, and it interacts badly with
SCHEDULE_USER if context tracking is on. My preferred long-term
solution is to delete sysret_audit entirely, which this patch set does.

Can you (x86 people and people who, for reasons that escape me, enjoy
reviewing this stuff) take a look? This clearly isn't 3.18 material,
and it may want to soak in -next (can -tip do that? I can do it myself,
I suppose), but it might also be a good idea to try to do this for 3.19
to get rid of sysret_audit.

For those who haven't followed all the recent threads: the asm that's
deleted in patch 3 currently has a nasty RCU + context tracking + audit
bug that has become much easier to trigger as a result of the seccomp
changes in 3.18. This isn't directly a bug in the seccomp changes --
it's just that the seccomp changes make it much easier to cause the
offending asm to be executed.

--Andy

> Andy Lutomirski (3):
>   x86_64,entry: Fix RCX for traced syscalls
>   x86_64,entry: Use sysret to return to userspace when possible
>   x86_64,entry: Remove the syscall exit audit and schedule optimizations
>
>  arch/x86/kernel/entry_64.S | 103 -
>  1 file changed, 55 insertions(+), 48 deletions(-)
>
> --
> 1.9.3

-- 
Andy Lutomirski
AMA Capital Management, LLC
Re: Windows 7 VM BSOD
I just confirmed that vapic is causing win7 stuck. On Thu, Dec 4, 2014 at 9:34 AM, Thomas Lau t...@tetrioncapital.com wrote: Hi, I don't want to recompile stuff, does it matter to have hv_vapic on at all? On Thu, Dec 4, 2014 at 9:24 AM, Zhang Haoyu zhan...@sangfor.com wrote: Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online. I tried to use hv_ other options but Win7 can't boot up properly and stucked at starting Windows screen. Could you confirm that the stuck was caused by vhich hv feature? The commit fc57ac2c9ca can resolve a stuck caused by hv_vapic which I encountered before. https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? 
On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. Thanks, Zhang Haoyu -- Thomas Lau Director of Infrastructure Tetrion Capital Limited Direct: +852-3976-8903 Mobile: +852-9323-9670 Address: 20/F, IFC 1, Central district, Hong Kong -- Thomas Lau Director of Infrastructure Tetrion Capital Limited Direct: +852-3976-8903 Mobile: +852-9323-9670 Address: 20/F, IFC 1, Central district, Hong Kong
Re: Windows 7 VM BSOD
> I just confirmed that vapic is causing win7 stuck.
You'd better try the commit fc57ac2c9ca :-)

On Thu, Dec 4, 2014 at 9:34 AM, Thomas Lau t...@tetrioncapital.com wrote:
> Hi, I don't want to recompile stuff, does it matter to have hv_vapic on at all?

On Thu, Dec 4, 2014 at 9:24 AM, Zhang Haoyu zhan...@sangfor.com wrote:
>> Oh I see, so the 101 BSOD problem is well known? I can't find any document mentioning the 101 BSOD online. I tried to use other hv_ options, but Win7 can't boot up properly and gets stuck at the Starting Windows screen.
> Could you confirm which hv feature caused the hang? The commit fc57ac2c9ca can resolve a hang caused by hv_vapic, which I encountered before.
> https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc

Original Message
From: Vadim Rozenfeld
Sent: Wednesday, 3 December, 2014 7:30 PM
To: Thomas Lau
Cc: Zhang Haoyu; kvm; imammedo
Subject: Re: Windows 7 VM BSOD
> If you run WS2008(R2) or Win7, always turn on relaxed timing. Otherwise it's just a matter of time until you hit the 101 BSOD. Bugcheck 78 is quite a rare one. What is your setup, and how easily can it be reproduced?
> Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
> "It works on your side", meaning that you had this issue but afterwards it was all fixed by applying hv_relaxed?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=893857
>> In fact I am testing now, but are we fixing one problem and introducing another?!
> I'm not sure about this, but it works on my side. I think the BSOD (error 0x0078) has been fixed; please show your environment.
> Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
> Hi, how do I know if my qemu-kvm version supports this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>> Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
> Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
> Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
Sure, but I am a little confused: KVM is part of the Linux kernel now, so if I want to try it, should I just upgrade the kernel or compile the kvm kernel module myself?

On Thu, Dec 4, 2014 at 10:01 AM, Zhang Haoyu zhan...@sangfor.com wrote:
>> I just confirmed that vapic is causing win7 stuck.
> You'd better try the commit fc57ac2c9ca :-)
Re: Windows 7 VM BSOD
> Sure, but I am a little confused: KVM is part of the Linux kernel now, so if I want to try it, should I just upgrade the kernel or compile the kvm kernel module myself?
You can just apply the patch to the kvm module and rebuild it.
[ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM
Hi all,

We are pleased to announce the first release of the KVMGT project. KVMGT is the implementation of Intel GVT-g technology, a full GPU virtualization solution. Under Intel GVT-g, a virtual GPU instance is maintained for each VM, with part of the performance-critical resources directly assigned. The capability of running a native graphics driver inside a VM, without hypervisor intervention in performance-critical paths, achieves a good balance of performance, features, and sharing capability.

KVMGT is still in the early stage:

- Basic functions of full GPU virtualization work; the guest can see a full-featured vGPU. We ran several 3D workloads such as lightsmark, nexuiz, urbanterror and warsow.
- Only Linux guests are supported so far, and PPGTT must be disabled in the guest through a kernel parameter (see README.kvmgt in QEMU).
- This drop also includes some Xen-specific changes, which will be cleaned up later.
- Our end goal is to upstream both XenGT and KVMGT, which share ~90% of the logic for the vGPU device model (it will be part of the i915 driver), with the only difference being the hypervisor-specific services.
- Test coverage is insufficient, so please bear with stability issues :)

There are things that need to be improved, especially the KVM interfacing part:

1 A domid was added to each KVMGT guest.

  An ID is needed for foreground OS switching, e.g.

    # echo <domid> > /sys/kernel/vgt/control/foreground_vm

  domid 0 is reserved for the host OS.

2 SRCU workarounds.

  Some KVM functions, such as

    kvm_io_bus_register_dev
    install_new_memslots

  must be called *without* kvm->srcu read-locked. Otherwise they hang.

  In KVMGT, we need to register an iodev only *after* the BAR registers are written by the guest. That means we already hold kvm->srcu at that point: trapping/emulating PIO (BAR registers) puts us in that condition, which makes kvm_io_bus_register_dev hang.

  Currently we have to disable rcu_assign_pointer() in such functions.

  These are dirty workarounds; your suggestions are highly welcome!

3 Syscalls are called to access /dev/mem from the kernel.

  An in-kernel memslot was added for the aperture, but it uses syscalls like open and mmap to open and access the character device /dev/mem, for pass-through.

The source code (kernel, qemu as well as seabios) is available at github:

  git://github.com/01org/KVMGT-kernel
  git://github.com/01org/KVMGT-qemu
  git://github.com/01org/KVMGT-seabios

In the KVMGT-qemu repository, there is a README.kvmgt to refer to.

More information about Intel GVT-g and KVMGT can be found at:

  https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
  http://events.linuxfoundation.org/sites/events/files/slides/KVMGT-a%20Full%20GPU%20Virtualization%20Solution_1.pdf

We appreciate your comments, bug reports, and contributions!

--
Thanks,
Jike
Re: Windows 7 VM BSOD
What effect does it have on Windows 7 if I disable vapic? If it is just a minor performance drop, I am fine with that.

On Thu, Dec 4, 2014 at 10:06 AM, Zhang Haoyu zhan...@sangfor.com wrote:
>> Sure, but I am a little confused: KVM is part of the Linux kernel now, so if I want to try it, should I just upgrade the kernel or compile the kvm kernel module myself?
> You can just apply the patch to the kvm module and rebuild it.
Re: Windows 7 VM BSOD
> What effect does it have on Windows 7 if I disable vapic? If it is just a minor performance drop, I am fine with that.
hv_vapic provides accelerated MSR access to the frequently used memory-mapped APIC registers EOI, ICR, and TPR. You can gain some performance from it, though not much, and the gain depends on how often the guest accesses those three APIC registers.
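The outcome of this exchange is a -cpu argument that keeps hv_relaxed and leaves hv_vapic off until the host kernel carries the fix. A minimal sketch of how such an argument string could be assembled is below. The helper name build_cpu_arg is hypothetical and is not part of QEMU; only the kvm64 model name and the hv_relaxed and hv_vapic flags come from the thread.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a QEMU API): build the value passed to QEMU's
 * -cpu option from a CPU model name and the two Hyper-V enlightenments
 * named in this thread. Returns the number of characters written, or -1
 * if buf is too small to hold the whole string.
 */
static inline int build_cpu_arg(char *buf, size_t len, const char *model,
                                int relaxed, int vapic)
{
    int n = snprintf(buf, len, "%s%s%s", model,
                     relaxed ? ",hv_relaxed" : "",
                     vapic ? ",hv_vapic" : "");

    /* snprintf reports the untruncated length; treat truncation as failure. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Calling build_cpu_arg(buf, sizeof(buf), "kvm64", 1, 0) fills buf with kvm64,hv_relaxed, the form suggested at the start of the thread.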
[PATCH] KVM: PPC: Book3S HV: Simplify locking around stolen time calculations
Currently the calculation of stolen time for PPC Book3S HV guests uses fields in both the vcpu struct and the kvmppc_vcore struct. The fields in the kvmppc_vcore struct are protected by the vcpu->arch.tbacct_lock of the vcpu that has taken responsibility for running the virtual core. This works correctly but confuses lockdep, because it sees that the code takes the tbacct_lock for a vcpu in kvmppc_remove_runnable() and then takes another vcpu's tbacct_lock in vcore_stolen_time(), and it thinks there is a possibility of deadlock, causing it to print reports like this:

=============================================
[ INFO: possible recursive locking detected ]
3.18.0-rc7-kvm-00016-g8db4bc6 #89 Not tainted
---------------------------------------------
qemu-system-ppc/6188 is trying to acquire lock:
 (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [decb1fe8] .vcore_stolen_time+0x48/0xd0 [kvm_hv]

but task is already holding lock:
 (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [decb25a0] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&vcpu->arch.tbacct_lock)->rlock);
  lock(&(&vcpu->arch.tbacct_lock)->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by qemu-system-ppc/6188:
 #0: (&vcpu->mutex){+.+.+.}, at: [deb93f98] .vcpu_load+0x28/0xe0 [kvm]
 #1: (&(&vcore->lock)->rlock){+.+...}, at: [decb41b0] .kvmppc_vcpu_run_hv+0x530/0x1530 [kvm_hv]
 #2: (&(&vcpu->arch.tbacct_lock)->rlock){..}, at: [decb25a0] .kvmppc_remove_runnable.part.3+0x30/0xd0 [kvm_hv]

stack backtrace:
CPU: 40 PID: 6188 Comm: qemu-system-ppc Not tainted 3.18.0-rc7-kvm-00016-g8db4bc6 #89
Call Trace:
[c00b2754f3f0] [c0b31b6c] .dump_stack+0x88/0xb4 (unreliable)
[c00b2754f470] [c00faeb8] .__lock_acquire+0x1878/0x2190
[c00b2754f600] [c00fbf0c] .lock_acquire+0xcc/0x1a0
[c00b2754f6d0] [c0b2954c] ._raw_spin_lock_irq+0x4c/0x70
[c00b2754f760] [decb1fe8] .vcore_stolen_time+0x48/0xd0 [kvm_hv]
[c00b2754f7f0] [decb25b4] .kvmppc_remove_runnable.part.3+0x44/0xd0 [kvm_hv]
[c00b2754f880] [decb43ec] .kvmppc_vcpu_run_hv+0x76c/0x1530 [kvm_hv]
[c00b2754f9f0] [deb9f46c] .kvmppc_vcpu_run+0x2c/0x40 [kvm]
[c00b2754fa60] [deb9c9a4] .kvm_arch_vcpu_ioctl_run+0x54/0x160 [kvm]
[c00b2754faf0] [deb94538] .kvm_vcpu_ioctl+0x498/0x760 [kvm]
[c00b2754fcb0] [c0267eb4] .do_vfs_ioctl+0x444/0x770
[c00b2754fd90] [c02682a4] .SyS_ioctl+0xc4/0xe0
[c00b2754fe30] [c00092e4] syscall_exit+0x0/0x98

In order to make the locking easier to analyse, we change the code to use a spinlock in the kvmppc_vcore struct to protect the stolen_tb and preempt_tb fields. This lock needs to be an irq-safe lock since it is used in the kvmppc_core_vcpu_load_hv() and kvmppc_core_vcpu_put_hv() functions, which are called with the scheduler rq lock held, which is an irq-safe lock.

Signed-off-by: Paul Mackerras pau...@samba.org
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s_hv.c        | 60 +++--
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 19ff9ee..63a66dd 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -278,6 +278,7 @@ struct kvmppc_vcore {
 	struct list_head runnable_threads;
 	spinlock_t lock;
 	wait_queue_head_t wq;
+	spinlock_t stoltb_lock;	/* protects stolen_tb and preempt_tb */
 	u64 stolen_tb;
 	u64 preempt_tb;
 	struct kvm_vcpu *runner;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b404cc6..02fbf5d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -135,11 +135,10 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
  * stolen.
  *
  * Updates to busy_stolen are protected by arch.tbacct_lock;
- * updates to vc->stolen_tb are protected by the arch.tbacct_lock
- * of the vcpu that has taken responsibility for running the vcore
- * (i.e. vc->runner).  The stolen times are measured in units of
- * timebase ticks.  (Note that the != TB_NIL checks below are
- * purely defensive; they should never fail.)
+ * updates to vc->stolen_tb are protected by the vcore->stoltb_lock
+ * lock.  The stolen times are measured in units of timebase ticks.
+ * (Note that the != TB_NIL checks below are purely defensive;
+ * they should never fail.)
  */
 
 static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
@@ -147,12 +146,21 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	unsigned long flags;
 
-	spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
-	if (vc->runner == vcpu
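The locking change described in this patch can be illustrated with a small userspace sketch. It is only an analogy under stated assumptions: a pthread mutex stands in for the irq-safe kernel spinlock, the struct holds just the two fields the new lock protects, TB_NIL is assumed to be an all-ones marker, and the function bodies are illustrative rather than copied from the patch. The point it shows is that the stolen-time fields carry their own lock inside the vcore, so a caller that already holds one vcpu's tbacct_lock never needs a second lock of the same class.

```c
#include <pthread.h>
#include <stdint.h>

/* Assumed marker for "no preemption timestamp recorded". */
#define TB_NIL (~(uint64_t)0)

/* Userspace stand-in for the vcore fields touched by the patch. */
struct vcore {
    pthread_mutex_t stoltb_lock; /* protects stolen_tb and preempt_tb */
    uint64_t stolen_tb;          /* accumulated stolen time, in ticks */
    uint64_t preempt_tb;         /* tick at which the vcore was preempted */
};

#define VCORE_INIT { PTHREAD_MUTEX_INITIALIZER, 0, TB_NIL }

/* The vcore is preempted at time now: remember when it happened. */
static inline void vcore_preempt(struct vcore *vc, uint64_t now)
{
    pthread_mutex_lock(&vc->stoltb_lock);
    vc->preempt_tb = now;
    pthread_mutex_unlock(&vc->stoltb_lock);
}

/* The vcore runs again at time now: fold the gap into stolen_tb. */
static inline void vcore_resume(struct vcore *vc, uint64_t now)
{
    pthread_mutex_lock(&vc->stoltb_lock);
    if (vc->preempt_tb != TB_NIL) {
        vc->stolen_tb += now - vc->preempt_tb;
        vc->preempt_tb = TB_NIL;
    }
    pthread_mutex_unlock(&vc->stoltb_lock);
}

/* Stolen time so far, including a preemption that is still in progress. */
static inline uint64_t vcore_stolen_time(struct vcore *vc, uint64_t now)
{
    uint64_t p;

    pthread_mutex_lock(&vc->stoltb_lock);
    p = vc->stolen_tb;
    if (vc->preempt_tb != TB_NIL)
        p += now - vc->preempt_tb;
    pthread_mutex_unlock(&vc->stoltb_lock);
    return p;
}
```

Because vcore_stolen_time() takes only the vcore's own lock, a lock checker sees two distinct lock classes on this path instead of two instances of the same one.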
Re: Windows 7 VM BSOD
I see, so it's a minor performance gain and not a stability-related option, which is good. I am checking http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11.5-trusty/; the changelog shows that the lsr function is included, but when I download the package, extract kvm.ko, and run nm kvm.ko | grep lsr, nothing is found.

On Thu, Dec 4, 2014 at 12:32 PM, Zhang Haoyu zhan...@sangfor.com wrote:
>> What effect does it have on Windows 7 if I disable vapic? If it is just a minor performance drop, I am fine with that.
> hv_vapic provides accelerated MSR access to the frequently used memory-mapped APIC registers EOI, ICR, and TPR. You can gain some performance from it, though not much, and the gain depends on how often the guest accesses those three APIC registers.
Re: Windows 7 VM BSOD
Is this the correct function? kvm_lapic_set_eoi
I found that one though.

On Thu, Dec 4, 2014 at 2:22 PM, Thomas Lau t...@tetrioncapital.com wrote:
> I see, so it's a minor performance gain and not a stability-related option, which is good. I am checking http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11.5-trusty/; the changelog shows that the lsr function is included, but when I download the package, extract kvm.ko, and run nm kvm.ko | grep lsr, nothing is found.
Re: Windows 7 VM BSOD
Is this the correct function? kvm_lapic_set_eoi No, see the detail of commit fc57ac2c9ca, https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc I found that one tho. On Thu, Dec 4, 2014 at 2:22 PM, Thomas Lau t...@tetrioncapital.com wrote: I see, so it's minor performance gain, and not stability related option which is good. I am checking http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11.5-trusty/, changelog showing lsr function is included, but when I download and extract kvm.ko out then run nm kvm.ko | grep lsr, nothing found. On Thu, Dec 4, 2014 at 12:32 PM, Zhang Haoyu zhan...@sangfor.com wrote: what does vapic affect Windows 7 at all if I disable it? if it just a minor performance drop, I am fine with that. hv_vapic provides accelerated MSR access to high usage memory mapped APIC registers, EOI, ICR, TPR. You can gain performance promotion from it, not too much, but it also depends on the frequency of access to above three apic regs. On Thu, Dec 4, 2014 at 10:06 AM, Zhang Haoyu zhan...@sangfor.com wrote: Sure, but I am little confused as KVM is part of linux kernel now, if I want to try it, should I just upgrade kernel or compile kvm kernel module by myself ?! You can just apply the patch to kvm module and rebuild it. On Thu, Dec 4, 2014 at 10:01 AM, Zhang Haoyu zhan...@sangfor.com wrote: I just confirmed that vapic is causing win7 stuck. You'd better try the commit fc57ac2c9ca :-) On Thu, Dec 4, 2014 at 9:34 AM, Thomas Lau t...@tetrioncapital.com wrote: Hi, I don't want to recompile stuff, does it matter to have hv_vapic on at all? On Thu, Dec 4, 2014 at 9:24 AM, Zhang Haoyu zhan...@sangfor.com wrote: Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online. I tried to use hv_ other options but Win7 can't boot up properly and stucked at starting Windows screen. Could you confirm that the stuck was caused by vhich hv feature? 
The commit fc57ac2c9ca can resolve a stuck caused by hv_vapic which I encountered before. https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc Sent from my BlackBerry 10 smartphone. Original Message From: Vadim Rozenfeld Sent: Wednesday, 3 December, 2014 7:30 PM To: Thomas Lau Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. 
Thanks,
Zhang Haoyu

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong

--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: x86: vmx: add checks on guest RIP
On 03/12/2014 23:56, Andy Lutomirski wrote:

This check is off by one. It is checking bits 63:47 instead of bits 63:48 (this quirk is intentionally part of the specification, so that you can reenter a guest at 0x8000 after e.g. a VMCALL vmexit and cause a general protection fault).

Seriously? Intel did that for vmcall but not sysret?

Yes, it is even tested by kvm-unit-tests. :)

Paolo
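For readers following along, the difference between the two bit ranges can be sketched in plain C. This is only an illustration, not the code from the patch under review: it assumes 48 linear-address bits, and the helper names (upper_bits_identical, is_canonical, passes_entry_rip_check) are made up for this example. A canonical address needs bits 63:47 to be identical, while the check described above only requires bits 63:48 to be identical, so an address with only bit 47 set passes the weaker check even though it is not canonical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: the CPU implements 48 linear-address bits. */
#define LINEAR_ADDR_BITS 48

/*
 * Illustrative helper (not a KVM function): returns true if bits 63:lo
 * of addr are either all zeros or all ones.
 */
static bool upper_bits_identical(uint64_t addr, unsigned int lo)
{
	assert(lo > 0 && lo < 64);

	uint64_t upper = addr >> lo;
	uint64_t all_ones = UINT64_MAX >> lo;

	return upper == 0 || upper == all_ones;
}

/* Canonical form: bits 63:47 identical, i.e. bit 47 is sign-extended. */
static bool is_canonical(uint64_t addr)
{
	return upper_bits_identical(addr, LINEAR_ADDR_BITS - 1);
}

/* The weaker check described in the thread: only bits 63:48 identical. */
static bool passes_entry_rip_check(uint64_t addr)
{
	return upper_bits_identical(addr, LINEAR_ADDR_BITS);
}
```

With these helpers, a hypothetical RIP of 0x0000800000000000 (only bit 47 set) satisfies passes_entry_rip_check() but not is_canonical(), which is the off-by-one window being discussed.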
[PATCH] KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions
This patch adds trace points in the guest entry and exit code and also
for exceptions handled by the host in kernel mode - hypercalls and page
faults. The new events are added to /sys/kernel/debug/tracing/events
under a new subsystem called kvm_hv.

Acked-by: Paul Mackerras pau...@samba.org
Signed-off-by: Suresh Warrier warr...@linux.vnet.ibm.com
---
Added new include file for common trace defines for kvm_pr and kvm_hv.
Replaced hand-written numbers with defines in trace_hv.h.

 arch/powerpc/kvm/book3s_64_mmu_hv.c |  12 +-
 arch/powerpc/kvm/book3s_hv.c        |  19 ++
 arch/powerpc/kvm/trace_book3s.h     |  32 +++
 arch/powerpc/kvm/trace_hv.h         | 477
 arch/powerpc/kvm/trace_pr.h         |  25 +-
 5 files changed, 538 insertions(+), 27 deletions(-)
 create mode 100644 arch/powerpc/kvm/trace_book3s.h
 create mode 100644 arch/powerpc/kvm/trace_hv.h

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 8190e36..52e8fa1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -39,6 +39,7 @@
 #include <asm/cputable.h>
 
 #include "book3s_hv_cma.h"
+#include "trace_hv.h"
 
 /* POWER7 has 10-bit LPIDs, PPC970 has 6-bit LPIDs */
 #define MAX_LPID_970	63
@@ -628,6 +629,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	gfn = gpa >> PAGE_SHIFT;
 	memslot = gfn_to_memslot(kvm, gfn);
 
+	trace_kvm_page_fault_enter(vcpu, hpte, memslot, ea, dsisr);
+
 	/* No memslot means it's an emulated MMIO region */
 	if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID)) {
 		gpa |= (ea & (psize - 1));
@@ -642,6 +645,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	mmu_seq = kvm->mmu_notifier_seq;
 	smp_rmb();
 
+	ret = -EFAULT;
 	is_io = 0;
 	pfn = 0;
 	page = NULL;
@@ -665,7 +669,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			}
 			up_read(&current->mm->mmap_sem);
 			if (!pfn)
-				return -EFAULT;
+				goto out_put;
 	} else {
 		page = pages[0];
 		if (PageHuge(page)) {
@@ -693,14 +697,14 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		pfn = page_to_pfn(page);
 	}
 
-	ret = -EFAULT;
 	if (psize > pte_size)
 		goto out_put;
 
 	/* Check WIMG vs. the actual page we're accessing */
 	if (!hpte_cache_flags_ok(r, is_io)) {
 		if (is_io)
-			return -EFAULT;
+			goto out_put;
+
 		/*
 		 * Allow guest to map emulated device memory as
 		 * uncacheable, but actually make it cacheable.
@@ -756,6 +760,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		SetPageDirty(page);
 
  out_put:
+	trace_kvm_page_fault_exit(vcpu, hpte, ret);
+
 	if (page) {
 		/*
 		 * We drop pages[0] here, not page because page might
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c2d2535..40615ab 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -58,6 +58,9 @@
 
 #include "book3s.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace_hv.h"
+
 /* #define EXIT_DEBUG */
 /* #define EXIT_DEBUG_SIMPLE */
 /* #define EXIT_DEBUG_INT */
@@ -1721,6 +1724,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
 		kvmppc_start_thread(vcpu);
 		kvmppc_create_dtl_entry(vcpu, vc);
+		trace_kvm_guest_enter(vcpu);
 	}
 
 	/* Set this explicitly in case thread 0 doesn't have a vcpu */
@@ -1729,6 +1733,9 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 
 	vc->vcore_state = VCORE_RUNNING;
 	preempt_disable();
+
+	trace_kvmppc_run_core(vc, 0);
+
 	spin_unlock(&vc->lock);
 
 	kvm_guest_enter();
@@ -1774,6 +1781,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 			    kvmppc_core_pending_dec(vcpu))
 				kvmppc_core_dequeue_dec(vcpu);
 
+			trace_kvm_guest_exit(vcpu);
+
 			ret = RESUME_GUEST;
 			if (vcpu->arch.trap)
 				ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
@@ -1799,6 +1808,8 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 			wake_up(&vcpu->arch.cpu_run);
 		}
 	}
+
+	trace_kvmppc_run_core(vc, 1);
 }
 
 /*
@@ -1845,11 +1856,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 	}
 
 	vc->vcore_state = VCORE_SLEEPING;
+	trace_kvmppc_vcore_blocked(vc, 0);
 	spin_unlock(&vc->lock);
 	schedule();
 	finish_wait(&vc->wq,