Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. Signed-off-by: Yang Zhang yang.z.zh...@intel.com Signed-off-by: Kevin Tian kevin.t...@intel.com --- arch/x86/kvm/vmx.c | 62 ++-- 1 files changed, 55 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index be66c3e..9b5e7a2 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx) spin_unlock(vmx_vpid_lock); } -static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr) +#define MSR_TYPE_R 1 +#define MSR_TYPE_W 2 +static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +u32 msr, int type) { int f = sizeof(unsigned long); @@ -3786,20 +3789,52 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr) * We can control MSRs 0x-0x1fff and 0xc000-0xc0001fff.*/ if (msr = 0x1fff) { - __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */ - __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */ + if (type MSR_TYPE_R) + /* read-low */ + __clear_bit(msr, msr_bitmap + 0x000 / f); + + if (type MSR_TYPE_W) + /* write-low */ + __clear_bit(msr, msr_bitmap + 0x800 / f); + } else if ((msr = 0xc000) (msr = 0xc0001fff)) { msr = 0x1fff; - __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */ - __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */ + if (type MSR_TYPE_R) + /* read-high */ + __clear_bit(msr, msr_bitmap + 0x400 / f); + + if (type MSR_TYPE_W) + /* write-high */ + __clear_bit(msr, msr_bitmap + 0xc00 / f); + } } static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only) { if (!longmode_only) - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr); - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_R | MSR_TYPE_W); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_R | MSR_TYPE_W); +} + +static void vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only) +{ + if (!longmode_only) + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_R); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_R); +} + +static void vmx_disable_intercept_for_msr_write(u32 msr, bool longmode_only) +{ + if (!longmode_only) + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_W); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_W); } /* @@ -7633,6 +7668,19 @@ static int __init vmx_init(void) vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false); vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false); + if (enable_apicv_reg_vid) { + int msr; + for (msr = 0x800; msr = 0x8ff; msr++) + vmx_disable_intercept_for_msr_read(msr, false); + + /* TPR */ + vmx_disable_intercept_for_msr_write(0x808, false); + /* EOI */ + vmx_disable_intercept_for_msr_write(0x80b, false); + /* SELF-IPI */ + vmx_disable_intercept_for_msr_write(0x83f, false); + } + if (enable_ept) { kvm_mmu_set_mask_ptes(0ull, (enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull, -- 1.7.1 -- Gleb. Best regards, Yang
Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging
On Tue, Dec 18, 2012 at 04:26:47PM +0900, Takuya Yoshikawa wrote: This is needed to make kvm_mmu_slot_remove_write_access() rmap based: otherwise we may end up using invalid rmap's. Signed-off-by: Takuya Yoshikawa yoshikawa_takuya...@lab.ntt.co.jp --- arch/x86/kvm/x86.c |9 - virt/kvm/kvm_main.c |1 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1c9c834..9451efa 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, spin_lock(kvm-mmu_lock); if (nr_mmu_pages) kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages); - kvm_mmu_slot_remove_write_access(kvm, mem-slot); + /* + * Write protect all pages for dirty logging. + * Existing largepage mappings are destroyed here and new ones will + * not be created until the end of the logging. + */ + if ((mem-flags KVM_MEM_LOG_DIRTY_PAGES) + !(old.flags KVM_MEM_LOG_DIRTY_PAGES)) + kvm_mmu_slot_remove_write_access(kvm, mem-slot); We should not check old slot flags here or at least check that old.npages is not zero. Userspace may delete a slot using old flags, then, if new memslot is created with dirty log enabled, it will not be protected. spin_unlock(kvm-mmu_lock); /* * If memory slot is created, or moved, we need to clear all diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index bd31096..0ef5daa 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -805,7 +805,6 @@ int __kvm_set_memory_region(struct kvm *kvm, if ((new.flags KVM_MEM_LOG_DIRTY_PAGES) !new.dirty_bitmap) { if (kvm_create_dirty_bitmap(new) 0) goto out_free; - /* destroy any largepage mappings for dirty tracking */ } if (!npages || base_gfn != old.base_gfn) { -- 1.7.5.4 -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] qemu-kvm/pci-assign: 64 bits bar emulation
On Sun, 2012-12-23 at 10:06 +0200, Gleb Natapov wrote: On Thu, Dec 20, 2012 at 11:07:23AM +0800, Xudong Hao wrote: Enable 64 bits bar emulation. v3 changes from v2: - Leave original error string and drop the leading 016. v2 changes from v1: - Change 0lx% to 0x%016 when print a 64 bit variable. Test pass with the current seabios which already support 64bit pci bars. Signed-off-by: Xudong Hao xudong@intel.com Alex, is this OK with you now? Yep Reviewed-by: Alex Williamson alex.william...@redhat.com --- hw/kvm/pci-assign.c | 14 ++ 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c index 7a0998c..2271a2e 100644 --- a/hw/kvm/pci-assign.c +++ b/hw/kvm/pci-assign.c @@ -46,6 +46,7 @@ #define IORESOURCE_IRQ 0x0400 #define IORESOURCE_DMA 0x0800 #define IORESOURCE_PREFETCH 0x2000 /* No side effects */ +#define IORESOURCE_MEM_64 0x0010 //#define DEVICE_ASSIGNMENT_DEBUG @@ -442,9 +443,13 @@ static int assigned_dev_register_regions(PCIRegion *io_regions, /* handle memory io regions */ if (cur_region-type IORESOURCE_MEM) { -int t = cur_region-type IORESOURCE_PREFETCH -? PCI_BASE_ADDRESS_MEM_PREFETCH -: PCI_BASE_ADDRESS_SPACE_MEMORY; +int t = PCI_BASE_ADDRESS_SPACE_MEMORY; +if (cur_region-type IORESOURCE_PREFETCH) { +t |= PCI_BASE_ADDRESS_MEM_PREFETCH; +} +if (cur_region-type IORESOURCE_MEM_64) { +t |= PCI_BASE_ADDRESS_MEM_TYPE_64; +} /* map physical memory */ pci_dev-v_addrs[i].u.r_virtbase = mmap(NULL, cur_region-size, @@ -632,7 +637,8 @@ again: rp-valid = 0; rp-resource_fd = -1; size = end - start + 1; -flags = IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH; +flags = IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH + | IORESOURCE_MEM_64; if (size == 0 || (flags ~IORESOURCE_PREFETCH) == 0) { continue; } -- 1.5.5 -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: Fix memory slot generation updates
On Sun, 2012-12-23 at 10:55 +0200, Gleb Natapov wrote: On Fri, Dec 21, 2012 at 08:20:16AM -0700, Alex Williamson wrote: Previous patch kvm: Minor memory slot optimization overlooked the generation field of the memory slots. Re-using the original memory slots left us with with two slightly different memory slots with the same generation. To fix this, make update_memslots() take a new parameter to specify the last generation. This also makes generation management more explicit to avoid such problems in the future. Reported-by: Takuya Yoshikawa yoshikawa_takuya...@lab.ntt.co.jp Signed-off-by: Alex Williamson alex.william...@redhat.com Applied, thanks. What about this small cleanup on to of the patch: Nice cleanup Reviewed-by: Alex Williamson alex.william...@redhat.com diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 14cbae8..e45c20c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -702,6 +702,17 @@ static int check_memory_region_flags(struct kvm_userspace_memory_region *mem) return 0; } +static struct kvm_memslots *install_new_memslots(struct kvm *kvm, + struct kvm_memslots *slots, struct kvm_memory_slot *new) +{ + struct kvm_memslots *old_memslots = kvm-memslots; + + update_memslots(slots, new, kvm-memslots-generation); + rcu_assign_pointer(kvm-memslots, slots); + synchronize_srcu_expedited(kvm-srcu); + return old_memslots; +} + /* * Allocate some memory and give it an address in the guest physical address * space. @@ -820,11 +831,8 @@ int __kvm_set_memory_region(struct kvm *kvm, slot = id_to_memslot(slots, mem-slot); slot-flags |= KVM_MEMSLOT_INVALID; - update_memslots(slots, NULL, kvm-memslots-generation); + old_memslots = install_new_memslots(kvm, slots, NULL); - old_memslots = kvm-memslots; - rcu_assign_pointer(kvm-memslots, slots); - synchronize_srcu_expedited(kvm-srcu); /* slot was deleted or moved, clear iommu mapping */ kvm_iommu_unmap_pages(kvm, old); /* From this point no new shadow pages pointing to a deleted, @@ -868,10 +876,7 @@ int __kvm_set_memory_region(struct kvm *kvm, memset(new.arch, 0, sizeof(new.arch)); } - update_memslots(slots, new, kvm-memslots-generation); - old_memslots = kvm-memslots; - rcu_assign_pointer(kvm-memslots, slots); - synchronize_srcu_expedited(kvm-srcu); + old_memslots = install_new_memslots(kvm, slots, new); kvm_arch_commit_memory_region(kvm, mem, old, user_alloc); -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. Signed-off-by: Yang Zhang yang.z.zh...@intel.com Signed-off-by: Kevin Tian kevin.t...@intel.com --- arch/x86/kvm/vmx.c | 62 ++-- 1 files changed, 55 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index be66c3e..9b5e7a2 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx) spin_unlock(vmx_vpid_lock); } -static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr) +#define MSR_TYPE_R 1 +#define MSR_TYPE_W 2 +static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +u32 msr, int type) { int f = sizeof(unsigned long); @@ -3786,20 +3789,52 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr) * We can control MSRs 0x-0x1fff and 0xc000-0xc0001fff.*/ if (msr = 0x1fff) { - __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */ - __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */ + if (type MSR_TYPE_R) + /* read-low */ + __clear_bit(msr, msr_bitmap + 0x000 / f); + + if (type MSR_TYPE_W) + /* write-low */ + __clear_bit(msr, msr_bitmap + 0x800 / f); + } else if ((msr = 0xc000) (msr = 0xc0001fff)) { msr = 0x1fff; - __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */ - __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */ + if (type MSR_TYPE_R) + /* read-high */ + __clear_bit(msr, msr_bitmap + 0x400 / f); + + if (type MSR_TYPE_W) + /* write-high */ + __clear_bit(msr, msr_bitmap + 0xc00 / f); + } } static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only) { if (!longmode_only) - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr); - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_R | MSR_TYPE_W); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_R | MSR_TYPE_W); +} + +static void vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only) +{ + if (!longmode_only) + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_R); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_R); +} + +static void vmx_disable_intercept_for_msr_write(u32 msr, bool longmode_only) +{ + if (!longmode_only) + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, + msr, MSR_TYPE_W); + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, + msr, MSR_TYPE_W); } /* @@ -7633,6 +7668,19 @@ static int __init vmx_init(void) vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false); vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false); + if (enable_apicv_reg_vid) { + int msr; + for (msr = 0x800; msr = 0x8ff; msr++) + vmx_disable_intercept_for_msr_read(msr, false); + + /* TPR */ + vmx_disable_intercept_for_msr_write(0x808, false); + /* EOI */ + vmx_disable_intercept_for_msr_write(0x80b, false); + /* SELF-IPI */ + vmx_disable_intercept_for_msr_write(0x83f, false); + } + if (enable_ept) {
Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging
On Mon, 24 Dec 2012 15:27:17 +0200 Gleb Natapov g...@redhat.com wrote: @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, spin_lock(kvm-mmu_lock); if (nr_mmu_pages) kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages); - kvm_mmu_slot_remove_write_access(kvm, mem-slot); + /* +* Write protect all pages for dirty logging. +* Existing largepage mappings are destroyed here and new ones will +* not be created until the end of the logging. +*/ + if ((mem-flags KVM_MEM_LOG_DIRTY_PAGES) + !(old.flags KVM_MEM_LOG_DIRTY_PAGES)) + kvm_mmu_slot_remove_write_access(kvm, mem-slot); We should not check old slot flags here or at least check that old.npages is not zero. Userspace may delete a slot using old flags, then, if new memslot is created with dirty log enabled, it will not be protected. The flag, KVM_MEM_LOG_DIRTY_PAGES, is explicitely cleared when we delete a slot to free dirty_bitmap: if (!npages) mem-flags = ~KVM_MEM_LOG_DIRTY_PAGES; So when old.npages is not zero, the second condition must be true. Other parts are doing if (!slot-dirty_bitmap) to see if the slot is in dirty logging mode. If you prefer, we can do the same here. Thanks, Takuya -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging
On Tue, Dec 25, 2012 at 01:08:40PM +0900, Takuya Yoshikawa wrote: On Mon, 24 Dec 2012 15:27:17 +0200 Gleb Natapov g...@redhat.com wrote: @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, spin_lock(kvm-mmu_lock); if (nr_mmu_pages) kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages); - kvm_mmu_slot_remove_write_access(kvm, mem-slot); + /* + * Write protect all pages for dirty logging. + * Existing largepage mappings are destroyed here and new ones will + * not be created until the end of the logging. + */ + if ((mem-flags KVM_MEM_LOG_DIRTY_PAGES) + !(old.flags KVM_MEM_LOG_DIRTY_PAGES)) + kvm_mmu_slot_remove_write_access(kvm, mem-slot); We should not check old slot flags here or at least check that old.npages is not zero. Userspace may delete a slot using old flags, then, if new memslot is created with dirty log enabled, it will not be protected. The flag, KVM_MEM_LOG_DIRTY_PAGES, is explicitely cleared when we delete a slot to free dirty_bitmap: if (!npages) mem-flags = ~KVM_MEM_LOG_DIRTY_PAGES; So when old.npages is not zero, the second condition must be true. Right. I was looking for where it is cleared and missed that. Other parts are doing if (!slot-dirty_bitmap) to see if the slot is in dirty logging mode. If you prefer, we can do the same here. Does not matter to me. If you think it will be more consistent do it. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging
On Tue, 25 Dec 2012 07:05:55 +0200 Gleb Natapov g...@redhat.com wrote: Other parts are doing if (!slot-dirty_bitmap) to see if the slot is in dirty logging mode. If you prefer, we can do the same here. Does not matter to me. If you think it will be more consistent do it. No preference at the moment. But since .dirty_bitmap and .rmap may be changed to control free_physmem(new, old) behaviour, I'll keep the code as is. I'll send some patches to clean up the current mem, new, old, slot games in __kvm_set_memory_region() later. I was confused many times by these. In addition, passing old slot by value to kvm_arch_commit_memory_region() should be fixed as well. The structure is too big to copy. Thanks, Takuya -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Best regards, Yang -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Without third patch vid will not work properly if a guest is in x2apic mode. Actually second and third patches need to be reordered to not have a windows where x2apic is broken. The problem is that this patch itself is buggy since it does not set virtualize x2apic mode flag. It should set the flag if vid is enabled and if the flag cannot be set vid should be forced off. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Without third patch vid will not work properly if a guest is in x2apic mode. Actually second and third patches need to be reordered to not have a windows where x2apic is broken. The problem is that this patch itself is buggy since it does not set virtualize x2apic mode flag. It should set the flag if vid is enabled and if the flag cannot be set vid should be forced off. In what conditions this flag cannot be set? I think the only case is that KVM doesn't expose the x2apic capability to guest, if this is true, the guest will never use x2apic and we still can use vid. Best regards, Yang -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Without third patch vid will not work properly if a guest is in x2apic mode. Actually second and third patches need to be reordered to not have a windows where x2apic is broken. The problem is that this patch itself is buggy since it does not set virtualize x2apic mode flag. It should set the flag if vid is enabled and if the flag cannot be set vid should be forced off. In what conditions this flag cannot be set? I think the only case is that KVM doesn't expose the x2apic capability to guest, if this is true, the guest will never use x2apic and we still can use vid. We can indeed set virtualize x2apic mode unconditionally since it does not take any effect if x2apic MSRs are intercepted. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Without third patch vid will not work properly if a guest is in x2apic mode. Actually second and third patches need to be reordered to not have a windows where x2apic is broken. The problem is that this patch itself is buggy since it does not set virtualize x2apic mode flag. It should set the flag if vid is enabled and if the flag cannot be set vid should be forced off. In what conditions this flag cannot be set? I think the only case is that KVM doesn't expose the x2apic capability to guest, if this is true, the guest will never use x2apic and we still can use vid. We can indeed set virtualize x2apic mode unconditionally since it does not take any effect if x2apic MSRs are intercepted. No. Since Virtual APIC access must be cleared if virtualize x2apic mode is set, and if guest still use xAPIC, then there should be lots of ept violations for apic access emulation. This will hurt performance. We should only set virtualize x2apic mode when guest really uses x2apic(guest set bit 11 of APIC_BASE_MSR). Best regards, Yang -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support
Marcelo Tosatti wrote on 2012-12-21: On Thu, Dec 20, 2012 at 03:12:32PM +0200, Gleb Natapov wrote: On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote: On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote: On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote: On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote: From: Yang Zhang yang.z.zh...@intel.com Virtual interrupt delivery avoids KVM to inject vAPIC interrupts manually, which is fully taken care of by the hardware. This needs some special awareness into existing interrupr injection path: - for pending interrupt, instead of direct injection, we may need update architecture specific indicators before resuming to guest. - A pending interrupt, which is masked by ISR, should be also considered in above update action, since hardware will decide when to inject it at right time. Current has_interrupt and get_interrupt only returns a valid vector from injection p.o.v. Signed-off-by: Kevin Tian kevin.t...@intel.com Signed-off-by: Yang Zhang yang.z.zh...@intel.com Resuming previous discussion: How about to recaculate irr_pending according the VIRR on each vmexit? No need really. Since HW can only clear VIRR the only situation that may happen is that irr_pending will be true but VIRR is empty and apic_find_highest_irr() will return correct result in this case. Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI Virtualization. True. But as I said later in that discussion once irr_pending is set to true it never becomes false, so the optimization is effectively disable. We can set it to true doing apic initialization to make it explicit. Its just confusing, to have a variable which has different meanings in different configurations. I would rather have it explicit that its not used rather than check every time the i read the code. if (apic_vid() == 0 !apic-irr_pending) return -1; I'd prefer to avoid this additional if() especially as its sole purpose is documentation. We can add comment instead. Note that irr_pending is just a hint anyway. It can be true when no interrupt is pending in irr. We can even rename it to irr_pending_hint or something. Works for me (documentation). Not sure if you can skip it, its probably necessary to calculate it before HW does so (say migration etc). kvm_apic_has_interrupt() is not called during migration and kvm_apic_post_state_restore() calls apic_update_ppr() explicitly. I am not sure it is needed though since migrated value should be already correct anyway. Ok, best force isr_count to 1 if apic vintr enabled (and add a comment, please). Sorry for the later reply. I will address those problems according your comments. Best regards, Yang -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support
On Tue, Dec 25, 2012 at 07:46:53AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-25: On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote: Gleb Natapov wrote on 2012-12-24: On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote: Zhang, Yang Z wrote on 2012-12-24: Gleb Natapov wrote on 2012-12-20: On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote: basically to benefit from apicv, we need clear MSR bitmap for corresponding x2apic MSRs: 0x800 - 0x8ff: no read intercept for apicv register virtualization TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery We do not set Virtualize x2APIC mode bit in secondary execution control. If I read the spec correctly without that those MSR read/writes will go straight to physical local APIC. Right. Now it cannot get benefit, but we may enable it in future and then we can benefit from it. Without enabling it you cannot disable MSR intercept for x2apic MSRs. how about to add the following check: if (apicv_enabled virtual_x2apic_enabled) clear_msr(); I do not understand what do you mean here. In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As you said, since kvm doesn't set virtualize x2apic mode, APIC register virtualization never take effect. So we need to clear MSR bitmap only when apicv enabled and virtualize x2apic mode set. But currently it is never set. So you think the third patch is not necessary currently? Unless we enabled virtualize x2apic mode. Without third patch vid will not work properly if a guest is in x2apic mode. Actually second and third patches need to be reordered to not have a windows where x2apic is broken. The problem is that this patch itself is buggy since it does not set virtualize x2apic mode flag. It should set the flag if vid is enabled and if the flag cannot be set vid should be forced off. In what conditions this flag cannot be set? I think the only case is that KVM doesn't expose the x2apic capability to guest, if this is true, the guest will never use x2apic and we still can use vid. We can indeed set virtualize x2apic mode unconditionally since it does not take any effect if x2apic MSRs are intercepted. No. Since Virtual APIC access must be cleared if virtualize x2apic mode is set, and if guest still use xAPIC, then there should be lots of ept violations for apic access emulation. This will hurt performance. Stupid HW, why this pointless limitation? Can you point me where SDM says that? We should only set virtualize x2apic mode when guest really uses x2apic(guest set bit 11 of APIC_BASE_MSR). Looks like SDM force us to. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html