Re: Fwd: Re: [RFC] kvm tools: Implement multiple VQ for virtio-net
Hi Stephen,

Benjamin forwarded me your email stating:

> I have been playing with userspace-rcu which has a number of neat
> lockless routines for queuing and hashing. But there aren't kernel versions
> and several of them may require cmpxchg to work.

Just FYI, I made sure a few years ago that cmpxchg is implemented on all architectures within the Linux kernel (using an interrupt-disable fallback on UP-only architectures where it is not supported architecturally), so on this front we should be good to use the lock-free structures as-is in the kernel.

As for the RCU use by these structures, userspace RCU has very much the same semantics as the kernel, so we can implement and test these structures in userspace and eventually port them to the kernel as needed. Lai Jiangshan is actively working on making sure the user-level implementation of the RCU lock-free hash table (currently in a development branch of the userspace RCU git tree: urcu/ht-shrink, not yet in master) is suitable for use in the Linux kernel too.

Best regards,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
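To make the cmpxchg dependency concrete, here is a minimal user-space sketch of a lock-free stack of the kind the urcu structures build on. It uses GCC's __sync_bool_compare_and_swap builtin as a stand-in for the kernel's cmpxchg(); this is not the urcu code itself, and as the comment notes, safe memory reclamation on the pop side is exactly where RCU (or hazard pointers) would come in.

```c
/* Hypothetical sketch: cmpxchg-based lock-free stack (Treiber stack).
 * __sync_bool_compare_and_swap stands in for the kernel's cmpxchg(). */
#include <stddef.h>
#include <assert.h>

struct lf_node {
    struct lf_node *next;
    int value;
};

static struct lf_node *lf_top;  /* stack head, only updated via cmpxchg */

static void lf_push(struct lf_node *node)
{
    struct lf_node *old;
    do {
        old = lf_top;
        node->next = old;
    } while (!__sync_bool_compare_and_swap(&lf_top, old, node));
}

static struct lf_node *lf_pop(void)
{
    struct lf_node *old;
    do {
        old = lf_top;
        if (!old)
            return NULL;
        /* A real implementation needs RCU (or hazard pointers) here so
         * that dereferencing old->next is safe against concurrent free. */
    } while (!__sync_bool_compare_and_swap(&lf_top, old, old->next));
    return old;
}
```

Push and pop each retry until the cmpxchg on the head succeeds, which is why a working cmpxchg on every architecture is the prerequisite being discussed.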
[PATCH] PCI: allow continually adding funcs after adding func0
Boot a KVM guest and hotplug multifunction devices (func1, func2, func0, func3) into the guest:

for i in 1 2 0 3; do
    qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
    (qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
    (qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done

In the Linux kernel, when func0 of the slot is hot-added, the whole slot is marked 'enabled', and the driver then ignores other newly hot-added funcs. In Win7 & WinXP, however, we can continually add other funcs after adding func0, and all funcs are added in the guest.

drivers/pci/hotplug/acpiphp_glue.c:

static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
        for (slot = bridge->slots; slot; slot = slot->next) {
                if (slot->flags & SLOT_ENABLED)
                        acpiphp_disable_slot()
                else
                        acpiphp_enable_slot()
                                |
                                v
                        enable_device()
                                |
                                v
                        // only refuses to enable the slot if func0 is not added
                        list_for_each_entry(func, &slot->funcs, sibling) {
                                ...
                        }
                        slot->flags |= SLOT_ENABLED;  // mark slot 'enabled'

This patch lets the PCI driver continually add funcs after func0 has been added. The slot is only marked 'enabled' once all funcs are added. For PCI multifunction hotplug, functions can be added one by one (func0 is necessary), and all functions are removed at one time.
Signed-off-by: Amos Kong
---
 drivers/pci/hotplug/acpiphp_glue.c | 29 +++++++++++++---------------
 1 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 596172b..a1b0afc 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -789,20 +789,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 	if (slot->flags & SLOT_ENABLED)
 		goto err_exit;

-	/* sanity check: dev should be NULL when hot-plugged in */
-	dev = pci_get_slot(bus, PCI_DEVFN(slot->device, 0));
-	if (dev) {
-		/* This case shouldn't happen */
-		err("pci_dev structure already exists.\n");
-		pci_dev_put(dev);
-		retval = -1;
-		goto err_exit;
-	}
-
 	num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
 	if (num == 0) {
-		err("No new device found\n");
-		retval = -1;
+		/* Maybe only part of funcs are added. */
+		dbg("No new device found\n");
 		goto err_exit;
 	}

@@ -837,11 +827,16 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 	pci_bus_add_devices(bus);

+	slot->flags |= SLOT_ENABLED;
 	list_for_each_entry(func, &slot->funcs, sibling) {
 		dev = pci_get_slot(bus, PCI_DEVFN(slot->device,
 						  func->function));
-		if (!dev)
+		if (!dev) {
+			/* Do not set SLOT_ENABLED flag if some funcs
+			   are not added. */
+			slot->flags &= (~SLOT_ENABLED);
 			continue;
+		}

 		if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE &&
 		    dev->hdr_type != PCI_HEADER_TYPE_CARDBUS) {
@@ -856,7 +851,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 		pci_dev_put(dev);
 	}

-	slot->flags |= SLOT_ENABLED;

 err_exit:
 	return retval;
@@ -881,9 +875,12 @@ static int disable_device(struct acpiphp_slot *slot)
 {
 	struct acpiphp_func *func;
 	struct pci_dev *pdev;
+	struct pci_bus *bus = slot->bridge->pci_bus;

-	/* is this slot already disabled? */
-	if (!(slot->flags & SLOT_ENABLED))
+	/* The slot will be enabled when func 0 is added, so check
+	   func 0 before disable the slot. */
+	pdev = pci_get_slot(bus, PCI_DEVFN(slot->device, 0));
+	if (!pdev)
 		goto err_exit;

 	list_for_each_entry(func, &slot->funcs, sibling) {
--
1.7.7.3
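The policy change in the patch can be modelled in a few lines of plain C: mark the slot enabled optimistically, then clear the flag again if any expected function is still missing, so a later hot-add rescans. The structure and names below are illustrative stand-ins for this discussion, not the acpiphp code.

```c
/* Hypothetical model of the patch's policy: a slot counts as enabled
 * only once every expected function is present. */
#include <stdbool.h>
#include <assert.h>

#define SLOT_ENABLED 0x01
#define MAX_FUNCS    8

struct model_slot {
    unsigned int flags;
    bool present[MAX_FUNCS];  /* which functions have been hot-added */
    int nfuncs;               /* functions the device advertises */
};

static void model_enable_slot(struct model_slot *slot)
{
    int fn;

    slot->flags |= SLOT_ENABLED;
    for (fn = 0; fn < slot->nfuncs; fn++) {
        if (!slot->present[fn]) {
            /* same idea as the patch: drop the flag again if any
             * function is missing, so later hot-adds are not ignored */
            slot->flags &= ~SLOT_ENABLED;
            return;
        }
    }
}
```

With this invariant, hot-adding funcs in any order (1, 2, 0, 3) works: each add re-runs the enable path until the set is complete.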
[PATCH 1/2] kvm: make vcpu life cycle separated from kvm instance
From: Liu Ping Fan

Currently, a vcpu can only be destroyed when the kvm instance is destroyed. Change this so that the vcpu holds a reference to kvm; the vcpu then MUST and CAN be destroyed before kvm's destruction. Qemu will take advantage of this to exit a vcpu thread when the thread is no longer in use by the guest.

Signed-off-by: Liu Ping Fan
---
 arch/x86/kvm/x86.c       | 28 ++++++++--------------------
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c      | 31 +++++++++++++++++++++++++++++--
 3 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c38efd7..ea2315a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6560,27 +6560,16 @@ static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
 	vcpu_put(vcpu);
 }

-static void kvm_free_vcpus(struct kvm *kvm)
+void kvm_arch_vcpu_zap(struct kref *ref)
 {
-	unsigned int i;
-	struct kvm_vcpu *vcpu;
-
-	/*
-	 * Unpin any mmu pages first.
-	 */
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		kvm_clear_async_pf_completion_queue(vcpu);
-		kvm_unload_vcpu_mmu(vcpu);
-	}
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_arch_vcpu_free(vcpu);
-
-	mutex_lock(&kvm->lock);
-	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
+	struct kvm_vcpu *vcpu = container_of(ref, struct kvm_vcpu, refcount);
+	struct kvm *kvm = vcpu->kvm;

-	atomic_set(&kvm->online_vcpus, 0);
-	mutex_unlock(&kvm->lock);
+	printk(KERN_INFO "%s, zap vcpu:0x%x\n", __func__, vcpu->vcpu_id);
+	kvm_clear_async_pf_completion_queue(vcpu);
+	kvm_unload_vcpu_mmu(vcpu);
+	kvm_arch_vcpu_free(vcpu);
+	kvm_put_kvm(kvm);
 }

 void kvm_arch_sync_events(struct kvm *kvm)
@@ -6594,7 +6583,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_iommu_unmap_guest(kvm);
 	kfree(kvm->arch.vpic);
 	kfree(kvm->arch.vioapic);
-	kvm_free_vcpus(kvm);
 	if (kvm->arch.apic_access_page)
 		put_page(kvm->arch.apic_access_page);
 	if (kvm->arch.ept_identity_pagetable)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d526231..fe35078 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -113,6 +113,7 @@ enum {

 struct kvm_vcpu {
 	struct kvm *kvm;
+	struct kref refcount;
 #ifdef CONFIG_PREEMPT_NOTIFIERS
 	struct preempt_notifier preempt_notifier;
 #endif
@@ -460,6 +461,7 @@ void kvm_arch_exit(void);

 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_zap(struct kref *ref);
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d9cfb78..f166bc8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -580,6 +580,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	kvm_arch_free_vm(kvm);
 	hardware_disable_all();
 	mmdrop(mm);
+	printk(KERN_INFO "%s finished\n", __func__);
 }

 void kvm_get_kvm(struct kvm *kvm)
@@ -1503,6 +1504,16 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 	mark_page_dirty_in_slot(kvm, memslot, gfn);
 }

+void kvm_vcpu_get(struct kvm_vcpu *vcpu)
+{
+	kref_get(&vcpu->refcount);
+}
+
+void kvm_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	kref_put(&vcpu->refcount, kvm_arch_vcpu_zap);
+}
+
 /*
  * The vCPU has executed a HLT instruction with in-kernel mode enabled.
  */
@@ -1623,8 +1634,13 @@ static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
 static int kvm_vcpu_release(struct inode *inode, struct file *filp)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
+	struct kvm *kvm = vcpu->kvm;

-	kvm_put_kvm(vcpu->kvm);
+	filp->private_data = NULL;
+	mutex_lock(&kvm->lock);
+	atomic_sub(1, &kvm->online_vcpus);
+	mutex_unlock(&kvm->lock);
+	kvm_vcpu_put(vcpu);
 	return 0;
 }

@@ -1646,6 +1662,17 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu)
 	return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR);
 }

+static struct kvm_vcpu *kvm_vcpu_create(struct kvm *kvm, u32 id)
+{
+	struct kvm_vcpu *vcpu;
+
+	vcpu = kvm_arch_vcpu_create(kvm, id);
+	if (IS_ERR(vcpu))
+		return vcpu;
+
+	kref_init(&vcpu->refcount);
+	return vcpu;
+}
+
 /*
  * Creates some virtual cpus.  Good luck creating more than one.
  */
@@ -1654,7 +1681,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 	int r;
 	struct kvm_vcpu *vcpu, *v;

-	vcpu = kvm_arch_vcpu_create(kvm, id);
+	vcpu = kvm_vcpu_create(kvm, id);
 	if (IS_ERR(vcpu))
 		return PTR_ERR(vcpu);
--
1.7.4.4
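The lifecycle the patch introduces is the standard kref pattern: the refcount starts at 1 on creation, users take and drop references, and the release callback (here kvm_arch_vcpu_zap) runs exactly once when the last reference is dropped. Below is a user-space analogue of that pattern; the names mirror the kernel API but this is a plain-int sketch (the kernel uses atomic_t), not kernel code.

```c
/* User-space analogue of the kref pattern applied to vcpus above. */
#include <assert.h>

struct ukref {
    int refcount;                  /* the kernel uses atomic_t here */
    void (*release)(struct ukref *ref);
};

static void ukref_init(struct ukref *ref, void (*release)(struct ukref *))
{
    ref->refcount = 1;             /* creator holds the first reference */
    ref->release = release;
}

static void ukref_get(struct ukref *ref)
{
    assert(ref->refcount > 0);     /* taking a dead reference is a bug */
    ref->refcount++;
}

static int ukref_put(struct ukref *ref)
{
    if (--ref->refcount == 0) {
        ref->release(ref);         /* runs exactly once, on the last put */
        return 1;
    }
    return 0;
}

static int zap_count;                    /* how many times release ran */
static void vcpu_zap(struct ukref *ref)  /* stand-in for kvm_arch_vcpu_zap */
{
    (void)ref;
    zap_count++;
}
```

In the patch, kvm_vcpu_create() plays the ukref_init() role and kvm_vcpu_release() the final ukref_put(), so the vcpu can be torn down independently of the kvm instance.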
[PATCH 0] A series of patches for kvm & qemu to enable vcpu destruction in kvm
A series of patches spanning kvm, qemu and the guest. Together they enable vcpu destruction in a kvm instance and let the vcpu thread exit in qemu.

Currently, the vcpu-online feature allows dynamic creation of a vcpu and its vcpu thread, but the offline feature cannot destroy the vcpu or make its thread exit; the vcpu just halts in kvm, because today a vcpu is only destroyed when the kvm instance is destroyed. We can change the vcpu to hold a reference to the kvm instance, and then the vcpu's destruction MUST and CAN come before kvm's destruction.

These patches use a guest driver to notify qemu of the CPU_DEAD event; qemu then asks kvm to release the dead vcpu and finally exits the thread.

The usage is:

  qemu$ cpu_set n online
  qemu$ cpu_set n zap      -- destroys vcpu-n in kvm and lets the vcpu thread exit

or

  qemu$ cpu_set n offline  -- just blocks vcpu-n in kvm

Any comments and suggestions are welcome. Patches include:

|-- guest
|   `-- 0001-virtio-add-a-pci-driver-to-notify-host-the-CPU_DEAD-.patch
|-- kvm
|   |-- 0001-kvm-make-vcpu-life-cycle-separated-from-kvm-instance.patch
|   `-- 0002-kvm-exit-to-userspace-with-reason-KVM_EXIT_VCPU_DEAD.patch
`-- qemu
    |-- 0001-Add-cpu_phyid_to_cpu-to-map-cpu-phyid-to-CPUState.patch
    |-- 0002-Add-cpu_free-to-support-arch-related-CPUState-releas.patch
    |-- 0003-Introduce-a-pci-device-cpustate-to-get-CPU_DEAD-even.patch
    |-- 0004-Release-vcpu-and-finally-exit-vcpu-thread-safely.patch
    `-- 0005-tmp-patches-for-linux-header-files.patch
Re: [Qemu-devel] [PATCH] ivshmem: fix guest unable to start with ioeventfd
On Thu, Nov 24, 2011 at 3:05 AM, wrote:
> From: Hongyong Zang
>
> When a guest boots with ioeventfd, an error (by gdb) occurs:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x006009cc in setup_ioeventfds (s=0x171dc40)
>     at /home/louzhengwei/git_source/qemu-kvm/hw/ivshmem.c:363
> 363	for (j = 0; j < s->peers[i].nb_eventfds; j++) {
>
> The bug is due to accessing s->peers which is NULL.

Can you share the command-line that caused the fault?

> This patch uses the memory region API to replace the old one
> kvm_set_ioeventfd_mmio_long().
> And this patch makes memory_region_add_eventfd() called in ivshmem_read()
> when qemu receives eventfd information from ivshmem_server.

Should this patch be split into two patches, to separate the bug fix from the other changes related to the Memory API? Unless I misunderstand how the two are necessarily related.

Cam

> Signed-off-by: Hongyong Zang
> ---
>  hw/ivshmem.c | 41 ++++++++++++++---------------------------
>  1 files changed, 14 insertions(+), 27 deletions(-)
>
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 242fbea..be26f03 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -58,7 +58,6 @@ typedef struct IVShmemState {
>      CharDriverState *server_chr;
>      MemoryRegion ivshmem_mmio;
>
> -    pcibus_t mmio_addr;
>      /* We might need to register the BAR before we actually have the memory.
>       * So prepare a container MemoryRegion for the BAR immediately and
>       * add a subregion when we have the memory.
>       */
> @@ -346,8 +345,14 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
>      guest_curr_max = s->peers[posn].nb_eventfds;
>
>      for (i = 0; i < guest_curr_max; i++) {
> -        kvm_set_ioeventfd_mmio_long(s->peers[posn].eventfds[i],
> -                    s->mmio_addr + DOORBELL, (posn << 16) | i, 0);
> +        if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> +            memory_region_del_eventfd(&s->ivshmem_mmio,
> +                                      DOORBELL,
> +                                      4,
> +                                      true,
> +                                      (posn << 16) | i,
> +                                      s->peers[posn].eventfds[i]);
> +        }
>          close(s->peers[posn].eventfds[i]);
>      }
>
> @@ -355,22 +360,6 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
>      s->peers[posn].nb_eventfds = 0;
>  }
>
> -static void setup_ioeventfds(IVShmemState *s) {
> -
> -    int i, j;
> -
> -    for (i = 0; i <= s->max_peer; i++) {
> -        for (j = 0; j < s->peers[i].nb_eventfds; j++) {
> -            memory_region_add_eventfd(&s->ivshmem_mmio,
> -                                      DOORBELL,
> -                                      4,
> -                                      true,
> -                                      (i << 16) | j,
> -                                      s->peers[i].eventfds[j]);
> -        }
> -    }
> -}
> -
>  /* this function increase the dynamic storage need to store data about other
>   * guests */
>  static void increase_dynamic_storage(IVShmemState *s, int new_min_size) {
> @@ -491,10 +480,12 @@ static void ivshmem_read(void *opaque, const uint8_t * buf, int flags)
>      }
>
>      if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> -        if (kvm_set_ioeventfd_mmio_long(incoming_fd, s->mmio_addr + DOORBELL,
> -                (incoming_posn << 16) | guest_max_eventfd, 1) < 0) {
> -            fprintf(stderr, "ivshmem: ioeventfd not available\n");
> -        }
> +        memory_region_add_eventfd(&s->ivshmem_mmio,
> +                                  DOORBELL,
> +                                  4,
> +                                  true,
> +                                  (incoming_posn << 16) | guest_max_eventfd,
> +                                  incoming_fd);
>      }
>
>      return;
> @@ -659,10 +650,6 @@ static int pci_ivshmem_init(PCIDevice *dev)
>      memory_region_init_io(&s->ivshmem_mmio, &ivshmem_mmio_ops, s,
>                            "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
>
> -    if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> -        setup_ioeventfds(s);
> -    }
> -
>      /* region for registers*/
>      pci_register_bar(&s->dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
>                       &s->ivshmem_mmio);
> --
> 1.7.1
Re: Changing IOMMU-API for generic DMA-mapping supported by the hardware
On Thu, Nov 24, 2011 at 01:52:33PM +0100, Marek Szyprowski wrote:
> In my DMA-mapping IOMMU integration I've used a dma_iommu_mapping structure,
> which contains a pointer to iommu domain, a bitmap and a lock. Maybe we
> should consider extending iommu domain with allocation bitmap (or other
> structure that hold information about used/unused iova ranges)? From the
> DMA-mapping (as a IOMMU client) perspective we only need 2 more callbacks
> in IOMMU API: alloc_iova_range() and free_iova_range().
>
> Each IOMMU implementation can provide these calls based on internal bitmap
> allocator which will also cover the issue with reserved ranges. What do you
> think about such solution?

Hmm, the main point of a generic DMA-mapping implementation is that a common address allocator will be used. Today every IOMMU driver that implements the DMA-API has its own allocator; this is something to unify across all drivers. The allocator information can be stored in the default iommu_domain. We need a user-private pointer there, but that is easy to add.

	Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
Re: nested virtualization on Intel and needed cpu flags to pass
Resend, because it probably didn't reach the ml due to attachment size... I'm posting links now instead...

On Wed, Nov 23, 2011 at 12:01 PM, Nadav Har'El wrote:
> Unfortunately, this is a known bug - which I promised to work on, but
> haven't yet got around to :(
> nested-vmx.txt explicitly lists under "known limitations" that: "The
> current code supports running Linux guests under KVM guests."
[snip]
> I don't think there are any such guidelines. The only thing you really
> need is "-cpu qemu64,+vmx" (replace qemu64 by whatever you want) to
> advertise the existence of VMX.

Ok, thanks for the answer. Right now I tested this config:

Host: F16 with these packages:
kernel-3.1.1-2.fc16.x86_64
virt-manager-0.9.0-7.fc16.noarch
qemu-kvm-0.15.1-3.fc16.x86_64
libvirt-0.9.6-2.fc16.x86_64

L1 guest f16vm with the same virtualization-related packages as the host.
L2 guest c56 configured as "Red Hat 5.4 or above" and configured to boot from cd; the cd is an iso of CentOS 5.6 live x86_64.

The L1 guest is configured with "copy host cpu configuration" in virt-manager; its cpuinfo gives:

[root@f16vm ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
stepping        : 11
cpu MHz         : 2693.880
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc up nopl pni vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm
bogomips        : 5387.76
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

The qemu command line on my host for f16vm is:

/usr/bin/qemu-kvm -S -M pc-0.14 -cpu core2duo,+lahf_lm,+rdtscp,+popcnt,+x2apic,+sse4.2,+sse4.1,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds -enable-kvm -m 3192 -smp 1,sockets=1,cores=1,threads=1 -name f16vm

On L1 f16vm, the qemu command line for its L2 guest c56 is:

[root@f16vm ~]# ps -ef|grep qemu
qemu 1834 1 10 12:49 ? 00:01:07 /usr/bin/qemu-kvm -S -M pc-0.14 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name c56 -uuid 15526957-51c9-1958-8a15-dea8f2626e5d -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/c56.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -drive file=/var/lib/libvirt/images/CentOS-5.6-x86_64-LiveCD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

[root@f16vm ~]# virsh start c56

After a while c56 goes into a paused state:

[root@f16vm ~]# virsh domstate c56
paused
[root@f16vm ~]# virsh domstate c56 --reason
paused (unknown)

I got only up to these lines with virt-dmesg run from f16vm against c56:
https://docs.google.com/open?id=0BwoPbcrMv8mvMWVmYTRkNDMtMjMzMi00OWViLWI1NTctYjA1YzU2NmM0ZmU5

I also post the image of what I see on the console of the c56 L2 guest before it gets paused:
https://docs.google.com/open?id=0BwoPbcrMv8mvMWVjNGNmYTUtNTcxOC00MzBkLWI5YWYtZDhmNDkxMzM1OTEx

Thanks,
Gianluca
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 11/23/2011 05:44 PM, Nadav Har'El wrote:
> On Wed, Nov 23, 2011, Nadav Har'El wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> > > +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
> > > +{
> > > +	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
> > > +
> > > +	vcpu->arch.nested_mmu.gva_to_gpa = EPT_gva_to_gpa_nested;
> > > +
> > > +	return r;
> > > +}
> > ..
> > I didn't see you actually call this function anywhere - how is it
> > supposed to work?
> > ..
> > It seems we need a fifth case in that function.
> > ..
>
> On second thought, why is this modifying nested_mmu.gva_to_gpa, and not
> mmu.gva_to_gpa? Isn't the nested_mmu the L2 CR3, which is *not* in EPT
> format, and what we really want to change is the outer mmu, which is
> EPT12 and is indeed in EPT format?
> Or am I missing something?

I think you're right. The key is to look at what ->walk_mmu points at.

--
error compiling committee.c: too many arguments to function
RE: Changing IOMMU-API for generic DMA-mapping supported by the hardware
Hello,

On Friday, November 11, 2011 2:17 PM Joerg Roedel wrote:
> Okay, separate thread for this one.

If possible, I would like to be CC'ed on the next mails in this topic. For the last few months I've been working on DMA-mapping changes on the ARM architecture in order to add support for an IOMMU-aware DMA mapper. The last version of my patches is available here:
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-October/000745.html
The next version will be posted soon.

> On Thu, Nov 10, 2011 at 07:28:39PM +, David Woodhouse wrote:
> > > The plan is to have a single DMA-API implementation for all IOMMU
> > > drivers (X86 and ARM) which just uses the IOMMU-API. But to make this
> > > performing reasonably well a few changes to the IOMMU-API are required.
> > > I already have some ideas which we can discuss if you want.
> >
> > Yeah, that sounds useful.
>
> As I said some changes to the IOMMU-API are required in my opinion.
> These changes should also allow it to move over old-style IOMMUs like
> Calgary or GART later.
>
> The basic idea is that IOMMU drivers should be required to put every
> device they are responsible for into a default domain. The DMA mapping
> code can query this default domain for each device.

Good idea.

> Also the default domain has capabilities that can be queried. Those
> capabilities include the size and offset of the address space they can
> re-map. For GART and Calgary this will be the aperture, for VT-d and AMD
> IOMMU the whole 64bit address space. Another capability is whether
> addresses outside of that area are 1-1 mapped or not accessible to the
> device.
>
> The generic DMA-mapping code will use that information to initialize its
> allocator and uses iommu_map/iommu_unmap to create and destroy mappings
> as requested by the DMA-API (but the DMA-mapping code does not need to
> create a domain of its own).
>
> The good thing about these default domains is that IOMMU drivers can
> implement their own optimizations on it. The AMD IOMMU driver for
> example already makes a distinction between dma-mapping domains and
> other protection-domains. The optimization for dma-mapping domains is
> that the leaf-pages of the page-table are kept in an array so that it
> is very easy to find the PTE for an address. Those optimizations are
> still possible with the default-domain concept.
>
> In short, the benefits of the default-domain concept are:
>
> 	1) It allows existing optimizations for the DMA-mapping code
> 	   paths to persist
> 	2) It also fits old-style IOMMUs like GART, Calgary and others
>
> An open problem is how to report reserved ranges of an address-space.
> These ranges might exist from a BIOS requirement for 1-1 mapping of
> certain address ranges (in AMD jargon: Unity mapped ranges, something
> similar exists on VT-d afaik) or hardware requirements like the reserved
> address range used for MSI interrupts.

In my DMA-mapping IOMMU integration I've used a dma_iommu_mapping structure, which contains a pointer to the iommu domain, a bitmap and a lock. Maybe we should consider extending the iommu domain with an allocation bitmap (or another structure that holds information about used/unused iova ranges)? From the DMA-mapping (as an IOMMU client) perspective we only need 2 more callbacks in the IOMMU API: alloc_iova_range() and free_iova_range().

Each IOMMU implementation can provide these calls based on an internal bitmap allocator, which will also cover the issue with reserved ranges. What do you think about such a solution?

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
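To make the proposal concrete, here is a toy sketch of the bitmap-based IOVA range allocator described above. alloc_iova_range()/free_iova_range() are the *proposed* callbacks, not an existing IOMMU-API; the page-granular 64-page aperture and the naming are purely illustrative. It also shows how reserved ranges (BIOS unity mappings, the MSI window) fall out naturally: pre-mark them in the bitmap and the allocator never hands them out.

```c
/* Toy bitmap IOVA allocator sketching the proposed
 * alloc_iova_range()/free_iova_range() callbacks. */
#include <assert.h>

#define IOVA_PAGES 64                   /* tiny aperture for illustration */

static unsigned long long iova_bitmap;  /* bit n set == page n in use */

/* Return the first page index able to hold 'npages' free pages, or -1. */
static int alloc_iova_range(int npages)
{
    int start, i;

    for (start = 0; start + npages <= IOVA_PAGES; start++) {
        for (i = 0; i < npages; i++)
            if (iova_bitmap & (1ULL << (start + i)))
                break;                  /* range collides, slide forward */
        if (i == npages) {
            for (i = 0; i < npages; i++)
                iova_bitmap |= 1ULL << (start + i);
            return start;
        }
    }
    return -1;                          /* aperture exhausted */
}

static void free_iova_range(int start, int npages)
{
    int i;
    for (i = 0; i < npages; i++)
        iova_bitmap &= ~(1ULL << (start + i));
}

/* Reserved ranges are simply pre-marked so they are never allocated. */
static void reserve_iova_range(int start, int npages)
{
    int i;
    for (i = 0; i < npages; i++)
        iova_bitmap |= 1ULL << (start + i);
}
```

A real implementation would of course scale past 64 pages (bitmap arrays, or a tree of free ranges) and take the lock mentioned in the mail, but the interface shape is the same.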
Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On 11/24/2011 01:53 PM, Sasha Levin wrote:
> On Thu, 2011-11-24 at 12:48 +0200, Avi Kivity wrote:
> > On 11/24/2011 12:45 PM, Sasha Levin wrote:
> > > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> > > fail with -E2BIG due to wrong comparisons.
> > >
> > > Cc: Avi Kivity
> > > Cc: Marcelo Tosatti
> > > Signed-off-by: Sasha Levin
> > > ---
> > >  arch/x86/kvm/x86.c | 2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 9eff4af..83fef71 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> > >  					cpuid->nent);
> > >
> > >  	r = -E2BIG;
> > > -	if (nent >= cpuid->nent)
> > > +	if (nent > cpuid->nent)
> > >  		goto out_free;
> > >
> >
> > This is just a landmine for the next entry to be added there; surely
> > whoever adds it will forget to correct the > back to >=.
>
> Slapping a big warning before that should do the trick? Or maybe add
> something similar to 'final_nent = nent - 1;'?

Refactor the whole thing so all the repetitive code goes away. Maybe make it table driven. But after my cpuid.c patch please, I'd hate to redo it.

--
error compiling committee.c: too many arguments to function
Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On Thu, 2011-11-24 at 12:48 +0200, Avi Kivity wrote:
> On 11/24/2011 12:45 PM, Sasha Levin wrote:
> > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> > fail with -E2BIG due to wrong comparisons.
> >
> > Cc: Avi Kivity
> > Cc: Marcelo Tosatti
> > Signed-off-by: Sasha Levin
> > ---
> >  arch/x86/kvm/x86.c | 2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9eff4af..83fef71 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> >  					cpuid->nent);
> >
> >  	r = -E2BIG;
> > -	if (nent >= cpuid->nent)
> > +	if (nent > cpuid->nent)
> >  		goto out_free;
> >
>
> This is just a landmine for the next entry to be added there; surely
> whoever adds it will forget to correct the > back to >=.

Slapping a big warning before that should do the trick? Or maybe add something similar to 'final_nent = nent - 1;'?

--
Sasha.
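The off-by-one under discussion is easy to reproduce in isolation: with `>=`, a caller that passes exactly the required number of entries is rejected even though everything fit. The sketch below models the check outside the kernel; get_supported() and its parameters are illustrative stand-ins, not the kvm code, with a flag to switch between the buggy and the fixed comparison.

```c
/* Model of the >= vs > capacity check fixed by the patch. */
#include <assert.h>
#include <errno.h>

/* Copy 'need' synthetic entries into a caller buffer of 'cap' slots.
 * Returns the count written, or -E2BIG. 'fixed' selects the corrected
 * comparison from the patch. */
static int get_supported(int *buf, int cap, int need, int fixed)
{
    int nent;

    for (nent = 0; nent < need; nent++) {
        if (nent == cap)              /* ran out of room mid-way */
            return -E2BIG;
        buf[nent] = nent;             /* stand-in for a cpuid entry */
    }

    /* buggy form: nent >= cap rejects the exact-fit case nent == cap */
    if (fixed ? (nent > cap) : (nent >= cap))
        return -E2BIG;
    return nent;
}
```

Avi's "landmine" point is visible here too: whoever later appends one more entry after the loop must remember that the boundary condition moved.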
Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function
On 11/24/2011 06:23 PM, Takuya Yoshikawa wrote:
> (2011/11/24 19:15), Takuya Yoshikawa wrote:
> > (2011/11/24 18:40), Xiao Guangrong wrote:
>
> > You can eliminate this if you use old_slot and new_slot for the two memory slots.
>
> Or old_bitmap and new_bitmap. Anyway, calling id_to_memslot() for getting the
> same slot twice is not good, IMO.

Sure. Thanks for your review, Takuya!

From: Xiao Guangrong
Subject: KVM: introduce id_to_memslot function

Introduce id_to_memslot to get a memslot by slot id

Signed-off-by: Xiao Guangrong
---
 arch/ia64/kvm/kvm-ia64.c  |  2 +-
 arch/powerpc/kvm/book3s.c |  2 +-
 arch/x86/kvm/vmx.c        |  6 ++++--
 arch/x86/kvm/x86.c        | 18 +++++++++---------
 include/linux/kvm_host.h  |  6 ++++++
 virt/kvm/kvm_main.c       | 13 +++++++++----
 6 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 42ad1f9..92d9f1e 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1818,7 +1818,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	if (log->slot >= KVM_MEMORY_SLOTS)
 		goto out;

-	memslot = &kvm->memslots->memslots[log->slot];
+	memslot = id_to_memslot(kvm->memslots, log->slot);
 	r = -ENOENT;
 	if (!memslot->dirty_bitmap)
 		goto out;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a459479..e41ac6f 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -498,7 +498,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	/* If nothing is dirty, don't bother messing with page tables. */
 	if (is_dirty) {
-		memslot = &kvm->memslots->memslots[log->slot];
+		memslot = id_to_memslot(kvm->memslots, log->slot);

 		ga = memslot->base_gfn << PAGE_SHIFT;
 		ga_end = ga + (memslot->npages << PAGE_SHIFT);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ba24022..8f19d91 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2711,11 +2711,13 @@ static gva_t rmode_tss_base(struct kvm *kvm)
 {
 	if (!kvm->arch.tss_addr) {
 		struct kvm_memslots *slots;
+		struct kvm_memory_slot *slot;
 		gfn_t base_gfn;

 		slots = kvm_memslots(kvm);
-		base_gfn = slots->memslots[0].base_gfn +
-			 kvm->memslots->memslots[0].npages - 3;
+		slot = id_to_memslot(slots, 0);
+		base_gfn = slot->base_gfn + slot->npages - 3;
+
 		return base_gfn << PAGE_SHIFT;
 	}
 	return kvm->arch.tss_addr;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9e5a59..886296e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	if (log->slot >= KVM_MEMORY_SLOTS)
 		goto out;

-	memslot = &kvm->memslots->memslots[log->slot];
+	memslot = id_to_memslot(kvm->memslots, log->slot);
 	r = -ENOENT;
 	if (!memslot->dirty_bitmap)
 		goto out;
@@ -3532,27 +3532,27 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	/* If nothing is dirty, don't bother messing with page tables. */
 	if (nr_dirty_pages) {
 		struct kvm_memslots *slots, *old_slots;
-		unsigned long *dirty_bitmap;
+		unsigned long *dirty_bitmap, *dirty_bitmap_head;

-		dirty_bitmap = memslot->dirty_bitmap_head;
-		if (memslot->dirty_bitmap == dirty_bitmap)
-			dirty_bitmap += n / sizeof(long);
-		memset(dirty_bitmap, 0, n);
+		dirty_bitmap = memslot->dirty_bitmap;
+		dirty_bitmap_head = memslot->dirty_bitmap_head;
+		if (dirty_bitmap == dirty_bitmap_head)
+			dirty_bitmap_head += n / sizeof(long);
+		memset(dirty_bitmap_head, 0, n);

 		r = -ENOMEM;
 		slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
 		if (!slots)
 			goto out;

 		memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
-		memslot = &slots->memslots[log->slot];
-		memslot->dirty_bitmap = dirty_bitmap;
+		memslot = id_to_memslot(slots, log->slot);
 		memslot->nr_dirty_pages = 0;
+		memslot->dirty_bitmap = dirty_bitmap_head;
 		update_memslots(slots, NULL);

 		old_slots = kvm->memslots;
 		rcu_assign_pointer(kvm->memslots, slots);
 		synchronize_srcu_expedited(&kvm->srcu);
-		dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap;
 		kfree(old_slots);

 		write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 392af47..123925c 100644
--- a/i
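What id_to_memslot() buys is a single indirection point in place of the open-coded `slots->memslots[id]` scattered through the callers above. A trimmed stand-in for the kernel structures makes that visible; the struct definitions here are illustrative, not the real kvm_memslots/kvm_memory_slot layout.

```c
/* Sketch of the lookup helper the patch introduces, over trimmed
 * stand-ins for the kvm memslot structures. */
#include <assert.h>

#define KVM_MEMORY_SLOTS 32

struct memory_slot {
    unsigned long base_gfn;
    unsigned long npages;
};

struct memslots {
    struct memory_slot memslots[KVM_MEMORY_SLOTS];
};

static struct memory_slot *id_to_memslot(struct memslots *slots, int id)
{
    /* one indirection point: if the backing layout later changes
     * (e.g. slots kept sorted), only this helper needs updating */
    return &slots->memslots[id];
}
```

The rmode_tss_base() hunk shows the payoff: two lookups of slot 0 collapse into one call, and the arithmetic reads off a single `slot` pointer.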
[Qemu-devel] [PATCH] ivshmem: fix guest unable to start with ioeventfd
From: Hongyong Zang When a guest boots with ioeventfd, an error (by gdb) occurs: Program received signal SIGSEGV, Segmentation fault. 0x006009cc in setup_ioeventfds (s=0x171dc40) at /home/louzhengwei/git_source/qemu-kvm/hw/ivshmem.c:363 363 for (j = 0; j < s->peers[i].nb_eventfds; j++) { The bug is due to accessing s->peers, which is NULL. This patch replaces the old kvm_set_ioeventfd_mmio_long() call with the memory region API, and makes memory_region_add_eventfd() be called in ivshmem_read() when qemu receives eventfd information from the ivshmem server. Signed-off-by: Hongyong Zang --- hw/ivshmem.c | 41 ++--- 1 files changed, 14 insertions(+), 27 deletions(-) diff --git a/hw/ivshmem.c b/hw/ivshmem.c index 242fbea..be26f03 100644 --- a/hw/ivshmem.c +++ b/hw/ivshmem.c @@ -58,7 +58,6 @@ typedef struct IVShmemState { CharDriverState *server_chr; MemoryRegion ivshmem_mmio; -pcibus_t mmio_addr; /* We might need to register the BAR before we actually have the memory. * So prepare a container MemoryRegion for the BAR immediately and * add a subregion when we have the memory.
@@ -346,8 +345,14 @@ static void close_guest_eventfds(IVShmemState *s, int posn) guest_curr_max = s->peers[posn].nb_eventfds; for (i = 0; i < guest_curr_max; i++) { -kvm_set_ioeventfd_mmio_long(s->peers[posn].eventfds[i], -s->mmio_addr + DOORBELL, (posn << 16) | i, 0); +if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) { +memory_region_del_eventfd(&s->ivshmem_mmio, + DOORBELL, + 4, + true, + (posn << 16) | i, + s->peers[posn].eventfds[i]); +} close(s->peers[posn].eventfds[i]); } @@ -355,22 +360,6 @@ static void close_guest_eventfds(IVShmemState *s, int posn) s->peers[posn].nb_eventfds = 0; } -static void setup_ioeventfds(IVShmemState *s) { - -int i, j; - -for (i = 0; i <= s->max_peer; i++) { -for (j = 0; j < s->peers[i].nb_eventfds; j++) { -memory_region_add_eventfd(&s->ivshmem_mmio, - DOORBELL, - 4, - true, - (i << 16) | j, - s->peers[i].eventfds[j]); -} -} -} - /* this function increase the dynamic storage need to store data about other * guests */ static void increase_dynamic_storage(IVShmemState *s, int new_min_size) { @@ -491,10 +480,12 @@ static void ivshmem_read(void *opaque, const uint8_t * buf, int flags) } if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) { -if (kvm_set_ioeventfd_mmio_long(incoming_fd, s->mmio_addr + DOORBELL, -(incoming_posn << 16) | guest_max_eventfd, 1) < 0) { -fprintf(stderr, "ivshmem: ioeventfd not available\n"); -} +memory_region_add_eventfd(&s->ivshmem_mmio, + DOORBELL, + 4, + true, + (incoming_posn << 16) | guest_max_eventfd, + incoming_fd); } return; @@ -659,10 +650,6 @@ static int pci_ivshmem_init(PCIDevice *dev) memory_region_init_io(&s->ivshmem_mmio, &ivshmem_mmio_ops, s, "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE); -if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) { -setup_ioeventfds(s); -} - /* region for registers*/ pci_register_bar(&s->dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->ivshmem_mmio); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo 
info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On 11/24/2011 12:45 PM, Sasha Levin wrote: > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still > fail with -E2BIG due to wrong comparisons. > > Cc: Avi Kivity > Cc: Marcelo Tosatti > Signed-off-by: Sasha Levin > --- > arch/x86/kvm/x86.c |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 9eff4af..83fef71 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct > kvm_cpuid2 *cpuid, >cpuid->nent); > > r = -E2BIG; > - if (nent >= cpuid->nent) > + if (nent > cpuid->nent) > goto out_free; > > This is just a landmine for the next entry to be added there; surely whoever adds it will forget to correct the > back to >=. -- error compiling committee.c: too many arguments to function
[PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still fail with -E2BIG due to wrong comparisons. Cc: Avi Kivity Cc: Marcelo Tosatti Signed-off-by: Sasha Levin --- arch/x86/kvm/x86.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9eff4af..83fef71 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid, cpuid->nent); r = -E2BIG; - if (nent >= cpuid->nent) + if (nent > cpuid->nent) goto out_free; r = -EFAULT; -- 1.7.8.rc3
Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On 11/24/2011 12:37 PM, Sasha Levin wrote: > On Thu, 2011-11-24 at 12:33 +0200, Avi Kivity wrote: > > On 11/24/2011 12:31 PM, Sasha Levin wrote: > > > > > > > > The protocol goes like "try size x, if it fails with -E2BIG, increase x, > > > > try again". It's awkward. > > > > > > We can set nent to be the amount of entries required like we do in the > > > opposite case where we passed too many entries. > > > > There's no point, since userspace will want to support older kernels. > > In the case of old kernels the cpuid->nent value will not be modified, > so userspace can handle both cases easily: > > - If KVM_GET_SUPPORTED_CPUID returned -E2BIG, check cpuid->nent > - If zero, do same -E2BIG loop as we do now. > - If not, allocate amount needed and pass it to the ioctl again. > What's the point? The code becomes more complicated. Something like 'while (try_get_cpuid(x) == -E2BIG) { x *= 2; }' is simple and works everywhere. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On Thu, 2011-11-24 at 12:33 +0200, Avi Kivity wrote: > On 11/24/2011 12:31 PM, Sasha Levin wrote: > > > > > > The protocol goes like "try size x, if it fails with -E2BIG, increase x, > > > try again". It's awkward. > > > > We can set nent to be the amount of entries required like we do in the > > opposite case where we passed too many entries. > > There's no point, since userspace will want to support older kernels. In the case of old kernels the cpuid->nent value will not be modified, so userspace can handle both cases easily: - If KVM_GET_SUPPORTED_CPUID returned -E2BIG, check cpuid->nent - If zero, do same -E2BIG loop as we do now. - If not, allocate amount needed and pass it to the ioctl again. -- Sasha.
Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On 11/24/2011 12:31 PM, Sasha Levin wrote: > > > > The protocol goes like "try size x, if it fails with -E2BIG, increase x, > > try again". It's awkward. > > We can set nent to be the amount of entries required like we do in the > opposite case where we passed too many entries. There's no point, since userspace will want to support older kernels. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On Thu, 2011-11-24 at 08:09 -0200, Marcelo Tosatti wrote: > On Thu, Nov 17, 2011 at 12:18:44PM +0200, Sasha Levin wrote: > > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still > > fail with -E2BIG due to wrong comparisons. > > > > Cc: Avi Kivity > > Cc: Marcelo Tosatti > > Signed-off-by: Sasha Levin > > --- > > arch/x86/kvm/x86.c | 12 ++-- > > 1 files changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 9eff4af..460c49b 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -2664,7 +2664,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct > > kvm_cpuid2 *cpuid, > > do_cpuid_ent(&cpuid_entries[nent], func, 0, > > &nent, cpuid->nent); > > r = -E2BIG; > > - if (nent >= cpuid->nent) > > + if (nent > cpuid->nent) > > goto out_free; > > "int nent" variable contains the index into the array. > "__u32 cpuid->nent", from userspace, contains the number > of entries in the array. > > So the ">=" comparison is necessary to avoid overwriting past the end of > the array. Right, only the last comparison should be changed to ">" because in that case it's ok if nent (which points to the next entry) equals cpuid->nent. > > The protocol goes like "try size x, if it fails with -E2BIG, increase x, > try again". It's awkward. We can set nent to be the amount of entries required like we do in the opposite case where we passed too many entries. -- Sasha.
Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function
(2011/11/24 19:15), Takuya Yoshikawa wrote: (2011/11/24 18:40), Xiao Guangrong wrote: You can eliminate this if you use old_slot and new_slot for the two memory slots. Or old_bitmap and new_bitmap. Anyway, calling id_to_memslot() for getting the same slot twice is not good, IMO. Takuya
Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function
(2011/11/24 18:40), Xiao Guangrong wrote: ... --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, if (log->slot>= KVM_MEMORY_SLOTS) goto out; - memslot =&kvm->memslots->memslots[log->slot]; + memslot = id_to_memslot(kvm->memslots, log->slot); r = -ENOENT; if (!memslot->dirty_bitmap) goto out; @@ -3544,15 +3544,16 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, if (!slots) goto out; memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots)); - memslot =&slots->memslots[log->slot]; - memslot->dirty_bitmap = dirty_bitmap; + memslot = id_to_memslot(slots, log->slot); memslot->nr_dirty_pages = 0; + memslot->dirty_bitmap = dirty_bitmap; update_memslots(slots, NULL); old_slots = kvm->memslots; rcu_assign_pointer(kvm->memslots, slots); synchronize_srcu_expedited(&kvm->srcu); - dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap; + dirty_bitmap = id_to_memslot(old_slots, + log->slot)->dirty_bitmap; You can eliminate this if you use old_slot and new_slot for the two memory slots. Takuya kfree(old_slots); write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right
On Thu, Nov 17, 2011 at 12:18:44PM +0200, Sasha Levin wrote: > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still > fail with -E2BIG due to wrong comparisons. > > Cc: Avi Kivity > Cc: Marcelo Tosatti > Signed-off-by: Sasha Levin > --- > arch/x86/kvm/x86.c | 12 ++-- > 1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 9eff4af..460c49b 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2664,7 +2664,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct > kvm_cpuid2 *cpuid, > do_cpuid_ent(&cpuid_entries[nent], func, 0, >&nent, cpuid->nent); > r = -E2BIG; > - if (nent >= cpuid->nent) > + if (nent > cpuid->nent) > goto out_free; "int nent" variable contains the index into the array. "__u32 cpuid->nent", from userspace, contains the number of entries in the array. So the ">=" comparison is necessary to avoid overwriting past the end of the array. The protocol goes like "try size x, if it fails with -E2BIG, increase x, try again". It's awkward.
Re: [PATCH] kvm-tpr-opt: Fix instruction_is_ok() for push tpr
On Tue, Nov 22, 2011 at 10:35:59AM +0100, Markus Armbruster wrote: > Missing break spotted by Coverity. > > Signed-off-by: Markus Armbruster > --- > kvm-tpr-opt.c |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) Applied, thanks.
[PATCH] KVM: IA64: fix struct redefinition
From: Xiao Guangrong There is the same struct definition in ia64 and kvm common code: arch/ia64/kvm//kvm-ia64.c: At top level: arch/ia64/kvm//kvm-ia64.c:777:8: error: redefinition of ‘struct kvm_io_range’ include/linux/kvm_host.h:62:8: note: originally defined here So, rename kvm_io_range to kvm_ia64_io_range in the ia64 code. Signed-off-by: Xiao Guangrong --- arch/ia64/kvm/kvm-ia64.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 92d9f1e..4050520 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -774,13 +774,13 @@ struct kvm *kvm_arch_alloc_vm(void) return kvm; } -struct kvm_io_range { +struct kvm_ia64_io_range { unsigned long start; unsigned long size; unsigned long type; }; -static const struct kvm_io_range io_ranges[] = { +static const struct kvm_ia64_io_range io_ranges[] = { {VGA_IO_START, VGA_IO_SIZE, GPFN_FRAME_BUFFER}, {MMIO_START, MMIO_SIZE, GPFN_LOW_MMIO}, {LEGACY_IO_START, LEGACY_IO_SIZE, GPFN_LEGACY_IO}, -- 1.7.7.3
[PATCH v3 6/6] KVM: introduce a table to map slot id to index in memslots array
From: Xiao Guangrong The operation of getting dirty log is frequent when framebuffer-based displays are used(for example, Xwindow), so, we introduce a mapping table to speed up id_to_memslot() Signed-off-by: Xiao Guangrong --- include/linux/kvm_host.h | 13 +++-- virt/kvm/kvm_main.c |7 ++- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 9efdf5c..8c5c303 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -239,6 +239,8 @@ struct kvm_irq_routing_table {}; struct kvm_memslots { u64 generation; struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM]; + /* The mapping table from slot id to the index in memslots[]. */ + int id_to_index[KVM_MEM_SLOTS_NUM]; }; struct kvm { @@ -341,14 +343,13 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) static inline struct kvm_memory_slot * id_to_memslot(struct kvm_memslots *slots, int id) { - int i; + int index = slots->id_to_index[id]; + struct kvm_memory_slot *slot; - for (i = 0; i < KVM_MEM_SLOTS_NUM; i++) - if (slots->memslots[i].id == id) - return &slots->memslots[i]; + slot = &slots->memslots[index]; - WARN_ON(1); - return NULL; + WARN_ON(slot->id != id); + return slot; } #define HPA_MSB ((sizeof(hpa_t) * 8) - 1) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 6e8eb15..e289486 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -446,7 +446,7 @@ static void kvm_init_memslots_id(struct kvm *kvm) struct kvm_memslots *slots = kvm->memslots; for (i = 0; i < KVM_MEM_SLOTS_NUM; i++) - slots->memslots[i].id = i; + slots->id_to_index[i] = slots->memslots[i].id = i; } static struct kvm *kvm_create_vm(void) @@ -674,8 +674,13 @@ static int cmp_memslot(const void *slot1, const void *slot2) */ static void sort_memslots(struct kvm_memslots *slots) { + int i; + sort(slots->memslots, KVM_MEM_SLOTS_NUM, sizeof(struct kvm_memory_slot), cmp_memslot, NULL); + + for (i = 0; i < KVM_MEM_SLOTS_NUM; i++) + 
slots->id_to_index[slots->memslots[i].id] = i; } void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new) -- 1.7.7.3
[PATCH v3 5/6] KVM: sort memslots by size and use linear search
From: Xiao Guangrong Sort memslots base on its size and use line search to find it, so that the larger memslots have better fit The idea is from Avi Signed-off-by: Xiao Guangrong --- include/linux/kvm_host.h | 18 +-- virt/kvm/kvm_main.c | 79 +- 2 files changed, 72 insertions(+), 25 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 123925c..9efdf5c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -231,8 +231,12 @@ struct kvm_irq_routing_table {}; #define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS) #endif +/* + * Note: + * memslots are not sorted by id anymore, please use id_to_memslot() + * to get the memslot by its id. + */ struct kvm_memslots { - int nmemslots; u64 generation; struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM]; }; @@ -310,7 +314,8 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i) #define kvm_for_each_memslot(memslot, slots) \ for (memslot = &slots->memslots[0]; \ - memslot < slots->memslots + (slots)->nmemslots; memslot++) + memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\ + memslot++) int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id); void kvm_vcpu_uninit(struct kvm_vcpu *vcpu); @@ -336,7 +341,14 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) static inline struct kvm_memory_slot * id_to_memslot(struct kvm_memslots *slots, int id) { - return &slots->memslots[id]; + int i; + + for (i = 0; i < KVM_MEM_SLOTS_NUM; i++) + if (slots->memslots[i].id == id) + return &slots->memslots[i]; + + WARN_ON(1); + return NULL; } #define HPA_MSB ((sizeof(hpa_t) * 8) - 1) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 7b60849..6e8eb15 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -440,6 +440,15 @@ static int kvm_init_mmu_notifier(struct kvm *kvm) #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */ +static void kvm_init_memslots_id(struct kvm *kvm) +{ + int i; + struct 
kvm_memslots *slots = kvm->memslots; + + for (i = 0; i < KVM_MEM_SLOTS_NUM; i++) + slots->memslots[i].id = i; +} + static struct kvm *kvm_create_vm(void) { int r, i; @@ -465,6 +474,7 @@ static struct kvm *kvm_create_vm(void) kvm->memslots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL); if (!kvm->memslots) goto out_err_nosrcu; + kvm_init_memslots_id(kvm); if (init_srcu_struct(&kvm->srcu)) goto out_err_nosrcu; for (i = 0; i < KVM_NR_BUSES; i++) { @@ -630,15 +640,54 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot) } #endif /* !CONFIG_S390 */ +static struct kvm_memory_slot * +search_memslots(struct kvm_memslots *slots, gfn_t gfn) +{ + struct kvm_memory_slot *memslot; + + kvm_for_each_memslot(memslot, slots) + if (gfn >= memslot->base_gfn && + gfn < memslot->base_gfn + memslot->npages) + return memslot; + + return NULL; +} + +static int cmp_memslot(const void *slot1, const void *slot2) +{ + struct kvm_memory_slot *s1, *s2; + + s1 = (struct kvm_memory_slot *)slot1; + s2 = (struct kvm_memory_slot *)slot2; + + if (s1->npages < s2->npages) + return 1; + if (s1->npages > s2->npages) + return -1; + + return 0; +} + +/* + * Sort the memslots base on its size, so the larger slots + * will get better fit. 
+ */ +static void sort_memslots(struct kvm_memslots *slots) +{ + sort(slots->memslots, KVM_MEM_SLOTS_NUM, + sizeof(struct kvm_memory_slot), cmp_memslot, NULL); +} + void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new) { if (new) { int id = new->id; struct kvm_memory_slot *old = id_to_memslot(slots, id); + unsigned long npages = old->npages; *old = *new; - if (id >= slots->nmemslots) - slots->nmemslots = id + 1; + if (new->npages != npages) + sort_memslots(slots); } slots->generation++; @@ -980,14 +1029,7 @@ EXPORT_SYMBOL_GPL(kvm_is_error_hva); static struct kvm_memory_slot *__gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn) { - struct kvm_memory_slot *memslot; - - kvm_for_each_memslot(memslot, slots) - if (gfn >= memslot->base_gfn - && gfn < memslot->base_gfn + memslot->npages) - return memslot; - - return NULL; + return search_memslots(slots, gfn); } struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn) @@ -998,20 +
[PATCH v3 4/6] KVM: introduce id_to_memslot function
From: Xiao Guangrong Introduce id_to_memslot to get memslot by slot id Signed-off-by: Xiao Guangrong --- arch/ia64/kvm/kvm-ia64.c |2 +- arch/powerpc/kvm/book3s.c |2 +- arch/x86/kvm/vmx.c|6 -- arch/x86/kvm/x86.c|9 + include/linux/kvm_host.h |6 ++ virt/kvm/kvm_main.c | 13 + 6 files changed, 26 insertions(+), 12 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 42ad1f9..92d9f1e 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1818,7 +1818,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, if (log->slot >= KVM_MEMORY_SLOTS) goto out; - memslot = &kvm->memslots->memslots[log->slot]; + memslot = id_to_memslot(kvm->memslots, log->slot); r = -ENOENT; if (!memslot->dirty_bitmap) goto out; diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index a459479..e41ac6f 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -498,7 +498,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, /* If nothing is dirty, don't bother messing with page tables. 
*/ if (is_dirty) { - memslot = &kvm->memslots->memslots[log->slot]; + memslot = id_to_memslot(kvm->memslots, log->slot); ga = memslot->base_gfn << PAGE_SHIFT; ga_end = ga + (memslot->npages << PAGE_SHIFT); diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index ba24022..8f19d91 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2711,11 +2711,13 @@ static gva_t rmode_tss_base(struct kvm *kvm) { if (!kvm->arch.tss_addr) { struct kvm_memslots *slots; + struct kvm_memory_slot *slot; gfn_t base_gfn; slots = kvm_memslots(kvm); - base_gfn = slots->memslots[0].base_gfn + -kvm->memslots->memslots[0].npages - 3; + slot = id_to_memslot(slots, 0); + base_gfn = slot->base_gfn + slot->npages - 3; + return base_gfn << PAGE_SHIFT; } return kvm->arch.tss_addr; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a9e5a59..b26dd82 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, if (log->slot >= KVM_MEMORY_SLOTS) goto out; - memslot = &kvm->memslots->memslots[log->slot]; + memslot = id_to_memslot(kvm->memslots, log->slot); r = -ENOENT; if (!memslot->dirty_bitmap) goto out; @@ -3544,15 +3544,16 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, if (!slots) goto out; memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots)); - memslot = &slots->memslots[log->slot]; - memslot->dirty_bitmap = dirty_bitmap; + memslot = id_to_memslot(slots, log->slot); memslot->nr_dirty_pages = 0; + memslot->dirty_bitmap = dirty_bitmap; update_memslots(slots, NULL); old_slots = kvm->memslots; rcu_assign_pointer(kvm->memslots, slots); synchronize_srcu_expedited(&kvm->srcu); - dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap; + dirty_bitmap = id_to_memslot(old_slots, + log->slot)->dirty_bitmap; kfree(old_slots); write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 392af47..123925c 100644 --- a/include/linux/kvm_host.h +++ 
b/include/linux/kvm_host.h @@ -333,6 +333,12 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) || lockdep_is_held(&kvm->slots_lock)); } +static inline struct kvm_memory_slot * +id_to_memslot(struct kvm_memslots *slots, int id) +{ + return &slots->memslots[id]; +} + #define HPA_MSB ((sizeof(hpa_t) * 8) - 1) #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB) static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; } diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4c2900c..7b60849 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -634,8 +634,9 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new) { if (new) { int id = new->id; + struct kvm_memory_slot *old = id_to_memslot(slots, id); - slots->memslots[id] = *new; + *old = *new; if (id >= slots->nmemslots) slots->nmemslots = id + 1; } @@ -681,7 +682,7 @@ int __kvm_set_memory_region(struct kvm *kvm, if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr) goto ou
[PATCH v3 3/6] KVM: introduce kvm_for_each_memslot macro
From: Xiao Guangrong Introduce kvm_for_each_memslot to walk all valid memslot Signed-off-by: Xiao Guangrong --- arch/ia64/kvm/kvm-ia64.c |6 ++ arch/x86/kvm/mmu.c | 12 ++-- include/linux/kvm_host.h |4 virt/kvm/iommu.c | 17 + virt/kvm/kvm_main.c | 14 ++ 5 files changed, 27 insertions(+), 26 deletions(-) diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 43f4c92..42ad1f9 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1366,14 +1366,12 @@ static void kvm_release_vm_pages(struct kvm *kvm) { struct kvm_memslots *slots; struct kvm_memory_slot *memslot; - int i, j; + int j; unsigned long base_gfn; slots = kvm_memslots(kvm); - for (i = 0; i < slots->nmemslots; i++) { - memslot = &slots->memslots[i]; + kvm_for_each_memslot(memslot, slots) { base_gfn = memslot->base_gfn; - for (j = 0; j < memslot->npages; j++) { if (memslot->rmap[j]) put_page((struct page *)memslot->rmap[j]); diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 715dcb4..d737443 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1128,15 +1128,15 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva, int (*handler)(struct kvm *kvm, unsigned long *rmapp, unsigned long data)) { - int i, j; + int j; int ret; int retval = 0; struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; slots = kvm_memslots(kvm); - for (i = 0; i < slots->nmemslots; i++) { - struct kvm_memory_slot *memslot = &slots->memslots[i]; + kvm_for_each_memslot(memslot, slots) { unsigned long start = memslot->userspace_addr; unsigned long end; @@ -3985,15 +3985,15 @@ nomem: */ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm) { - int i; unsigned int nr_mmu_pages; unsigned int nr_pages = 0; struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; slots = kvm_memslots(kvm); - for (i = 0; i < slots->nmemslots; i++) - nr_pages += slots->memslots[i].npages; + kvm_for_each_memslot(memslot, slots) + nr_pages += memslot->npages; nr_mmu_pages = nr_pages * 
KVM_PERMILLE_MMU_PAGES / 1000; nr_mmu_pages = max(nr_mmu_pages, diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 23f795c..392af47 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -308,6 +308,10 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i) (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \ idx++) +#define kvm_for_each_memslot(memslot, slots) \ + for (memslot = &slots->memslots[0]; \ + memslot < slots->memslots + (slots)->nmemslots; memslot++) + int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id); void kvm_vcpu_uninit(struct kvm_vcpu *vcpu); diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c index a195c07..4e5f7b7 100644 --- a/virt/kvm/iommu.c +++ b/virt/kvm/iommu.c @@ -134,14 +134,15 @@ unmap_pages: static int kvm_iommu_map_memslots(struct kvm *kvm) { - int i, idx, r = 0; + int idx, r = 0; struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; idx = srcu_read_lock(&kvm->srcu); slots = kvm_memslots(kvm); - for (i = 0; i < slots->nmemslots; i++) { - r = kvm_iommu_map_pages(kvm, &slots->memslots[i]); + kvm_for_each_memslot(memslot, slots) { + r = kvm_iommu_map_pages(kvm, memslot); if (r) break; } @@ -311,16 +312,16 @@ static void kvm_iommu_put_pages(struct kvm *kvm, static int kvm_iommu_unmap_memslots(struct kvm *kvm) { - int i, idx; + int idx; struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; idx = srcu_read_lock(&kvm->srcu); slots = kvm_memslots(kvm); - for (i = 0; i < slots->nmemslots; i++) { - kvm_iommu_put_pages(kvm, slots->memslots[i].base_gfn, - slots->memslots[i].npages); - } + kvm_for_each_memslot(memslot, slots) + kvm_iommu_put_pages(kvm, memslot->base_gfn, memslot->npages); + srcu_read_unlock(&kvm->srcu, idx); return 0; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b5ed777..4c2900c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -547,11 +547,11 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free, void 
kvm_free_physmem(struct kvm *kvm) { - int i; struct kvm_memslots *slots = kvm->memslots; + struct kvm_memory_slot *memslot; -
[PATCH v3 2/6] KVM: introduce update_memslots function
From: Xiao Guangrong Introduce update_memslots to update the slot which will be committed to kvm->memslots Signed-off-by: Xiao Guangrong --- arch/x86/kvm/x86.c |2 +- include/linux/kvm_host.h |1 + virt/kvm/kvm_main.c | 22 +++--- 3 files changed, 17 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1985ea1..a9e5a59 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3547,7 +3547,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, memslot = &slots->memslots[log->slot]; memslot->dirty_bitmap = dirty_bitmap; memslot->nr_dirty_pages = 0; - slots->generation++; + update_memslots(slots, NULL); old_slots = kvm->memslots; rcu_assign_pointer(kvm->memslots, slots); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 924df0d..23f795c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -320,6 +320,7 @@ void kvm_exit(void); void kvm_get_kvm(struct kvm *kvm); void kvm_put_kvm(struct kvm *kvm); +void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new); static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9ad94c9..b5ed777 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -630,6 +630,19 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot) } #endif /* !CONFIG_S390 */ +void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new) +{ + if (new) { + int id = new->id; + + slots->memslots[id] = *new; + if (id >= slots->nmemslots) + slots->nmemslots = id + 1; + } + + slots->generation++; +} + /* * Allocate some memory and give it an address in the guest physical address * space.
@@ -780,10 +793,8 @@ skip_lpage: GFP_KERNEL); if (!slots) goto out_free; - if (mem->slot >= slots->nmemslots) - slots->nmemslots = mem->slot + 1; - slots->generation++; slots->memslots[mem->slot].flags |= KVM_MEMSLOT_INVALID; + update_memslots(slots, NULL); old_memslots = kvm->memslots; rcu_assign_pointer(kvm->memslots, slots); @@ -815,9 +826,6 @@ skip_lpage: GFP_KERNEL); if (!slots) goto out_free; - if (mem->slot >= slots->nmemslots) - slots->nmemslots = mem->slot + 1; - slots->generation++; /* actual memory is freed via old in kvm_free_physmem_slot below */ if (!npages) { @@ -827,7 +835,7 @@ skip_lpage: new.lpage_info[i] = NULL; } - slots->memslots[mem->slot] = new; + update_memslots(slots, &new); old_memslots = kvm->memslots; rcu_assign_pointer(kvm->memslots, slots); synchronize_srcu_expedited(&kvm->srcu); -- 1.7.7.3
[PATCH v3 1/6] KVM: introduce KVM_MEM_SLOTS_NUM macro
From: Xiao Guangrong

Introduce a KVM_MEM_SLOTS_NUM macro to replace the open-coded
KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS.

Signed-off-by: Xiao Guangrong
---
 arch/x86/include/asm/kvm_host.h |  4 +++-
 arch/x86/kvm/mmu.c              |  2 +-
 include/linux/kvm_host.h        |  7 +--
 virt/kvm/kvm_main.c             |  2 +-
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 69b6525..1769f3d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -31,6 +31,8 @@
 #define KVM_MEMORY_SLOTS 32
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+
 #define KVM_MMIO_SIZE 16
 
 #define KVM_PIO_PAGE_OFFSET 1
@@ -228,7 +230,7 @@ struct kvm_mmu_page {
 	 * One bit set per slot which has memory
 	 * in this shadow page.
 	 */
-	DECLARE_BITMAP(slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
+	DECLARE_BITMAP(slot_bitmap, KVM_MEM_SLOTS_NUM);
 	bool unsync;
 	int root_count;          /* Currently serving as active root */
 	unsigned int unsync_children;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index aecdea2..715dcb4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1349,7 +1349,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 					       PAGE_SIZE);
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-	bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
+	bitmap_zero(sp->slot_bitmap, KVM_MEM_SLOTS_NUM);
 	sp->parent_ptes = 0;
 	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 	kvm_mod_used_mmu_pages(vcpu->kvm, +1);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c654aa..924df0d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -227,11 +227,14 @@ struct kvm_irq_routing_table {};
 #endif
 
+#ifndef KVM_MEM_SLOTS_NUM
+#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+#endif
+
 struct kvm_memslots {
 	int nmemslots;
 	u64 generation;
-	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS +
-					KVM_PRIVATE_MEM_SLOTS];
+	struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
 };
 
 struct kvm {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index af5c988..9ad94c9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -663,7 +663,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
 			(void __user *)(unsigned long)mem->userspace_addr,
 			mem->memory_size)))
 		goto out;
-	if (mem->slot >= KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+	if (mem->slot >= KVM_MEM_SLOTS_NUM)
 		goto out;
 	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
 		goto out;
-- 
1.7.7.3
[PATCH v3 0/6] KVM: optimize memslots searching
Changelog:
- rebase it on the current kvm tree, plus some cleanups

This patchset is tested on x86 and build-tested on powerpc and ia64.