Re: Fwd: Re: [RFC] kvm tools: Implement multiple VQ for virtio-net

2011-11-24 Thread Mathieu Desnoyers
Hi Stephen,

Benjamin forwarded me your email stating:

> I have been playing with userspace-rcu which has a number of neat
> lockless routines for queuing and hashing. But there aren't kernel versions
> and several of them may require cmpxchg to work.

Just FYI, I made sure a few years ago that cmpxchg is implemented on all
architectures within the Linux kernel (using an interrupt-disable
fallback on UP-only architectures that lack native support), so we
should be good to use the lock-free structures as-is in the kernel on
this front. As for the RCU use by these structures, userspace RCU has
very much the same semantics as the kernel, so we can implement and
test these structures in userspace and eventually port them to the
kernel as needed.
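
For reference, the UP fallback mentioned above boils down to the
following pattern (a sketch modeled on the kernel's generic
cmpxchg-local helper, not a verbatim copy of any arch header): with no
SMP, disabling local interrupts is enough to make the
read-compare-write sequence atomic.

#include <linux/irqflags.h>

static inline unsigned long
cmpxchg_up_fallback(volatile unsigned long *ptr,
                    unsigned long old, unsigned long new)
{
        unsigned long flags, prev;

        local_irq_save(flags);  /* UP-only: nothing else can race us */
        prev = *ptr;
        if (prev == old)
                *ptr = new;
        local_irq_restore(flags);

        return prev;            /* caller checks prev == old for success */
}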

Lai Jiangshan is actively working on making sure the user-level
implementation of the RCU lock-free hash table (currently in a
development branch of the userspace RCU git tree: urcu/ht-shrink, not
yet in master) is suitable for use in the Linux kernel too.

Best regards,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


[PATCH] PCI: Can continually add funcs after adding func0

2011-11-24 Thread Amos Kong
Boot up a KVM guest, then hotplug multifunction
devices (func1, func2, func0, func3) into the guest:

for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done

In the Linux kernel, when func0 of a slot is hot-added, the whole
slot is marked 'enabled', and the driver then ignores any funcs
hot-added afterwards. In Win7 & WinXP, by contrast, we can continually
add other funcs after adding func0, and all funcs appear in the guest.

drivers/pci/hotplug/acpiphp_glue.c:

static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
        for (slot = bridge->slots; slot; slot = slot->next) {
                if (slot->flags & SLOT_ENABLED)
                        acpiphp_disable_slot()
                else
                        acpiphp_enable_slot()
                                |
                                v
                        enable_device()
                                |
                                v
                        /* the slot is only left un-enabled if
                           func0 has not been added */
                        list_for_each_entry(func, &slot->funcs, sibling) {
                                ...
                        }
                        slot->flags |= SLOT_ENABLED;  /* mark slot 'enabled' */

This patch makes the PCI driver able to keep adding funcs after func0
has been added; the slot is only marked 'enabled' once all funcs have
been added.

For PCI multifunction hotplug, functions can be added one by one
(func0 is required), and all functions are removed at once.

Signed-off-by: Amos Kong 
---
 drivers/pci/hotplug/acpiphp_glue.c |   29 +
 1 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 596172b..a1b0afc 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -789,20 +789,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
if (slot->flags & SLOT_ENABLED)
goto err_exit;
 
-   /* sanity check: dev should be NULL when hot-plugged in */
-   dev = pci_get_slot(bus, PCI_DEVFN(slot->device, 0));
-   if (dev) {
-   /* This case shouldn't happen */
-   err("pci_dev structure already exists.\n");
-   pci_dev_put(dev);
-   retval = -1;
-   goto err_exit;
-   }
-
num = pci_scan_slot(bus, PCI_DEVFN(slot->device, 0));
if (num == 0) {
-   err("No new device found\n");
-   retval = -1;
+   /* Maybe only part of funcs are added. */
+   dbg("No new device found\n");
goto err_exit;
}
 
@@ -837,11 +827,16 @@ static int __ref enable_device(struct acpiphp_slot *slot)
 
pci_bus_add_devices(bus);
 
+   slot->flags |= SLOT_ENABLED;
list_for_each_entry(func, &slot->funcs, sibling) {
dev = pci_get_slot(bus, PCI_DEVFN(slot->device,
  func->function));
-   if (!dev)
+   if (!dev) {
+   /* Do not set SLOT_ENABLED flag if some funcs
+  are not added. */
+   slot->flags &= (~SLOT_ENABLED);
continue;
+   }
 
if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE &&
dev->hdr_type != PCI_HEADER_TYPE_CARDBUS) {
@@ -856,7 +851,6 @@ static int __ref enable_device(struct acpiphp_slot *slot)
pci_dev_put(dev);
}
 
-   slot->flags |= SLOT_ENABLED;
 
  err_exit:
return retval;
@@ -881,9 +875,12 @@ static int disable_device(struct acpiphp_slot *slot)
 {
struct acpiphp_func *func;
struct pci_dev *pdev;
+   struct pci_bus *bus = slot->bridge->pci_bus;
 
-   /* is this slot already disabled? */
-   if (!(slot->flags & SLOT_ENABLED))
+   /* The slot will be enabled when func 0 is added, so check
+  func 0 before disable the slot. */
+   pdev = pci_get_slot(bus, PCI_DEVFN(slot->device, 0));
+   if (!pdev)
goto err_exit;
 
list_for_each_entry(func, &slot->funcs, sibling) {
-- 
1.7.7.3



[PATCH 1/2] kvm: make vcpu life cycle separated from kvm instance

2011-11-24 Thread Liu Ping Fan
From: Liu Ping Fan 

Currently, a vcpu can be destroyed only when the kvm instance is
destroyed. Change this so that the vcpu holds a reference to kvm; the
vcpu then MUST and CAN be destroyed before kvm is. Qemu will take
advantage of this to exit the vcpu thread once the thread is no longer
in use by the guest.
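
The lifecycle change follows the usual kref pattern; roughly (an
illustrative sketch built on the helpers this patch introduces, not an
excerpt from the patch itself):

static void use_vcpu(struct kvm_vcpu *vcpu)
{
        kvm_vcpu_get(vcpu);     /* pin: kref_get(&vcpu->refcount) */
        /* ... vcpu can safely be dereferenced here ... */
        kvm_vcpu_put(vcpu);     /* unpin: the last put invokes
                                   kvm_arch_vcpu_zap() exactly once */
}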

Signed-off-by: Liu Ping Fan 
---
 arch/x86/kvm/x86.c   |   28 
 include/linux/kvm_host.h |2 ++
 virt/kvm/kvm_main.c  |   31 +--
 3 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c38efd7..ea2315a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6560,27 +6560,16 @@ static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
vcpu_put(vcpu);
 }
 
-static void kvm_free_vcpus(struct kvm *kvm)
+void kvm_arch_vcpu_zap(struct kref *ref)
 {
-   unsigned int i;
-   struct kvm_vcpu *vcpu;
-
-   /*
-* Unpin any mmu pages first.
-*/
-   kvm_for_each_vcpu(i, vcpu, kvm) {
-   kvm_clear_async_pf_completion_queue(vcpu);
-   kvm_unload_vcpu_mmu(vcpu);
-   }
-   kvm_for_each_vcpu(i, vcpu, kvm)
-   kvm_arch_vcpu_free(vcpu);
-
-   mutex_lock(&kvm->lock);
-   for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-   kvm->vcpus[i] = NULL;
+   struct kvm_vcpu *vcpu = container_of(ref, struct kvm_vcpu, refcount);
+   struct kvm *kvm = vcpu->kvm;
 
-   atomic_set(&kvm->online_vcpus, 0);
-   mutex_unlock(&kvm->lock);
+   printk(KERN_INFO "%s, zap vcpu:0x%x\n", __func__, vcpu->vcpu_id);
+   kvm_clear_async_pf_completion_queue(vcpu);
+   kvm_unload_vcpu_mmu(vcpu);
+   kvm_arch_vcpu_free(vcpu);
+   kvm_put_kvm(kvm);
 }
 
 void kvm_arch_sync_events(struct kvm *kvm)
@@ -6594,7 +6583,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kvm_iommu_unmap_guest(kvm);
kfree(kvm->arch.vpic);
kfree(kvm->arch.vioapic);
-   kvm_free_vcpus(kvm);
if (kvm->arch.apic_access_page)
put_page(kvm->arch.apic_access_page);
if (kvm->arch.ept_identity_pagetable)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d526231..fe35078 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -113,6 +113,7 @@ enum {
 
 struct kvm_vcpu {
struct kvm *kvm;
+   struct kref refcount;
 #ifdef CONFIG_PREEMPT_NOTIFIERS
struct preempt_notifier preempt_notifier;
 #endif
@@ -460,6 +461,7 @@ void kvm_arch_exit(void);
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
+void kvm_arch_vcpu_zap(struct kref *ref);
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d9cfb78..f166bc8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -580,6 +580,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
kvm_arch_free_vm(kvm);
hardware_disable_all();
mmdrop(mm);
+   printk(KERN_INFO "%s finished\n", __func__);
 }
 
 void kvm_get_kvm(struct kvm *kvm)
@@ -1503,6 +1504,16 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
mark_page_dirty_in_slot(kvm, memslot, gfn);
 }
 
+void kvm_vcpu_get(struct kvm_vcpu *vcpu)
+{
+   kref_get(&vcpu->refcount);
+}
+
+void kvm_vcpu_put(struct kvm_vcpu *vcpu)
+{
+   kref_put(&vcpu->refcount, kvm_arch_vcpu_zap);
+}
+
 /*
  * The vCPU has executed a HLT instruction with in-kernel mode enabled.
  */
@@ -1623,8 +1634,13 @@ static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
 static int kvm_vcpu_release(struct inode *inode, struct file *filp)
 {
struct kvm_vcpu *vcpu = filp->private_data;
+   struct kvm *kvm = vcpu->kvm;
 
-   kvm_put_kvm(vcpu->kvm);
+   filp->private_data = NULL;
+   mutex_lock(&kvm->lock);
+   atomic_sub(1, &kvm->online_vcpus);
+   mutex_unlock(&kvm->lock);
+   kvm_vcpu_put(vcpu);
return 0;
 }
 
@@ -1646,6 +1662,17 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu)
return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR);
 }
 
+static struct kvm_vcpu *kvm_vcpu_create(struct kvm *kvm, u32 id)
+{
+   struct kvm_vcpu *vcpu;
+   vcpu = kvm_arch_vcpu_create(kvm, id);
+   if (IS_ERR(vcpu))
+   return vcpu;
+
+   kref_init(&vcpu->refcount);
+   return vcpu;
+}
+
 /*
  * Creates some virtual cpus.  Good luck creating more than one.
  */
@@ -1654,7 +1681,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
int r;
struct kvm_vcpu *vcpu, *v;
 
-   vcpu = kvm_arch_vcpu_create(kvm, id);
+   vcpu = kvm_vcpu_create(kvm, id);
if (IS_ERR(vcpu))
return PTR_ERR(vcpu);
 
-- 
1.7.4.4


[PATCH 0] A series patches for kvm&qemu to enable vcpu destruction in kvm

2011-11-24 Thread Liu Ping Fan
A series of patches spanning kvm, qemu, and the guest. Together they
enable vcpu destruction in the kvm instance and let the vcpu thread
exit in qemu.

Currently, the vcpu online feature allows dynamic creation of a vcpu
and its thread, but the offline feature cannot destroy the vcpu and
let its thread exit; the vcpu just halts in kvm. That is because,
today, a vcpu is destroyed only when the kvm instance is destroyed.
We can change the vcpu to hold a reference to the kvm instance, so
that the vcpu's destruction MUST and CAN come before kvm's destruction.

These patches use a guest driver to notify qemu of the CPU_DEAD event;
qemu then asks kvm to release the dead vcpu and finally exits the
thread.

The usage is:
qemu$ cpu_set n online
qemu$ cpu_set n zap      - destroys vcpu-n in kvm and lets the vcpu thread exit
  OR
qemu$ cpu_set n offline  - just blocks vcpu-n in kvm

Any comments and suggestions are welcome.


Patches include:
|-- guest
|   `-- 0001-virtio-add-a-pci-driver-to-notify-host-the-CPU_DEAD-.patch
|-- kvm
|   |-- 0001-kvm-make-vcpu-life-cycle-separated-from-kvm-instance.patch
|   `-- 0002-kvm-exit-to-userspace-with-reason-KVM_EXIT_VCPU_DEAD.patch
`-- qemu
|-- 0001-Add-cpu_phyid_to_cpu-to-map-cpu-phyid-to-CPUState.patch
|-- 0002-Add-cpu_free-to-support-arch-related-CPUState-releas.patch
|-- 0003-Introduce-a-pci-device-cpustate-to-get-CPU_DEAD-even.patch
|-- 0004-Release-vcpu-and-finally-exit-vcpu-thread-safely.patch
`-- 0005-tmp-patches-for-linux-header-files.patch



Re: [Qemu-devel] [PATCH] ivshmem: fix guest unable to start with ioeventfd

2011-11-24 Thread Cam Macdonell
On Thu, Nov 24, 2011 at 3:05 AM,   wrote:
> From: Hongyong Zang 
>
> When a guest boots with ioeventfd, an error (by gdb) occurs:
>  Program received signal SIGSEGV, Segmentation fault.
>  0x006009cc in setup_ioeventfds (s=0x171dc40)
>      at /home/louzhengwei/git_source/qemu-kvm/hw/ivshmem.c:363
>  363             for (j = 0; j < s->peers[i].nb_eventfds; j++) {
> The bug is due to accessing s->peers which is NULL.

Can you share the command-line that caused the fault?

>
> This patch uses the memory region API to replace the old
> kvm_set_ioeventfd_mmio_long(). It also makes
> memory_region_add_eventfd() be called in ivshmem_read() when qemu
> receives eventfd information from ivshmem_server.

Should this patch be split into two patches, to separate the bug fix
from the other changes related to the Memory API?  Unless I
misunderstand how the two are necessarily related.

Cam

>
> Signed-off-by: Hongyong Zang 
> ---
>  hw/ivshmem.c |   41 ++---
>  1 files changed, 14 insertions(+), 27 deletions(-)
>
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 242fbea..be26f03 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -58,7 +58,6 @@ typedef struct IVShmemState {
>     CharDriverState *server_chr;
>     MemoryRegion ivshmem_mmio;
>
> -    pcibus_t mmio_addr;
>     /* We might need to register the BAR before we actually have the memory.
>      * So prepare a container MemoryRegion for the BAR immediately and
>      * add a subregion when we have the memory.
> @@ -346,8 +345,14 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
>     guest_curr_max = s->peers[posn].nb_eventfds;
>
>     for (i = 0; i < guest_curr_max; i++) {
> -        kvm_set_ioeventfd_mmio_long(s->peers[posn].eventfds[i],
> -                    s->mmio_addr + DOORBELL, (posn << 16) | i, 0);
> +        if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> +            memory_region_del_eventfd(&s->ivshmem_mmio,
> +                                     DOORBELL,
> +                                     4,
> +                                     true,
> +                                     (posn << 16) | i,
> +                                     s->peers[posn].eventfds[i]);
> +        }
>         close(s->peers[posn].eventfds[i]);
>     }
>
> @@ -355,22 +360,6 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
>     s->peers[posn].nb_eventfds = 0;
>  }
>
> -static void setup_ioeventfds(IVShmemState *s) {
> -
> -    int i, j;
> -
> -    for (i = 0; i <= s->max_peer; i++) {
> -        for (j = 0; j < s->peers[i].nb_eventfds; j++) {
> -            memory_region_add_eventfd(&s->ivshmem_mmio,
> -                                      DOORBELL,
> -                                      4,
> -                                      true,
> -                                      (i << 16) | j,
> -                                      s->peers[i].eventfds[j]);
> -        }
> -    }
> -}
> -
>  /* this function increase the dynamic storage need to store data about other
>  * guests */
>  static void increase_dynamic_storage(IVShmemState *s, int new_min_size) {
> @@ -491,10 +480,12 @@ static void ivshmem_read(void *opaque, const uint8_t * buf, int flags)
>     }
>
>     if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> -        if (kvm_set_ioeventfd_mmio_long(incoming_fd, s->mmio_addr + DOORBELL,
> -                        (incoming_posn << 16) | guest_max_eventfd, 1) < 0) {
> -            fprintf(stderr, "ivshmem: ioeventfd not available\n");
> -        }
> +        memory_region_add_eventfd(&s->ivshmem_mmio,
> +                                  DOORBELL,
> +                                  4,
> +                                  true,
> +                                  (incoming_posn << 16) | guest_max_eventfd,
> +                                  incoming_fd);
>     }
>
>     return;
> @@ -659,10 +650,6 @@ static int pci_ivshmem_init(PCIDevice *dev)
>     memory_region_init_io(&s->ivshmem_mmio, &ivshmem_mmio_ops, s,
>                           "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
>
> -    if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
> -        setup_ioeventfds(s);
> -    }
> -
>     /* region for registers*/
>     pci_register_bar(&s->dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
>                      &s->ivshmem_mmio);
> --
> 1.7.1
>
>


Re: Changing IOMMU-API for generic DMA-mapping supported by the hardware

2011-11-24 Thread 'Joerg Roedel'
On Thu, Nov 24, 2011 at 01:52:33PM +0100, Marek Szyprowski wrote:
> In my DMA-mapping IOMMU integration I've used a dma_iommu_mapping structure,
> which contains a pointer to the iommu domain, a bitmap and a lock. Maybe we
> should consider extending the iommu domain with an allocation bitmap (or
> another structure that holds information about used/unused iova ranges)?
> From the DMA-mapping (as an IOMMU client) perspective we only need 2 more
> callbacks in the IOMMU API: alloc_iova_range() and free_iova_range().
> 
> Each IOMMU implementation can provide these calls based on an internal
> bitmap allocator, which will also cover the issue with reserved ranges.
> What do you think about such a solution?

Hmm, the main point of a generic DMA-mapping implementation is that a
common address allocator will be used. Today every IOMMU driver that
implements the DMA-API has its own allocator; this is something to
unify across all drivers.

The allocator information can be stored in the default iommu_domain. We
need a user-private pointer there, but that is easy to add.
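
As a rough illustration of that idea (the field name below is
hypothetical, not an agreed-upon API), the default domain would just
carry an opaque cookie owned by the generic DMA-mapping code:

struct iommu_domain {
        /* ... existing members ... */
        void *dma_cookie;       /* new: allocator state owned by the
                                   generic DMA-mapping code */
};

The generic map/unmap paths would then look up the cookie through the
device's default domain instead of each IOMMU driver keeping a private
allocator.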


Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



Re: nested virtualization on Intel and needed cpu flags to pass

2011-11-24 Thread Gianluca Cecchi
Resending, because the original probably didn't reach the ml due to
attachment size... I'm posting links instead...

On Wed, Nov 23, 2011 at 12:01 PM, Nadav Har'El wrote:

> Unfortunately, this is a known bug - which I promised to work on, but
> haven't yet got around to :(
> nested-vmx.txt explicitly lists under "known limitations" that: "The
> current code supports running Linux guests under KVM guests."
[snip]
> I don't think there is any such guidelines. The only thing you really
> need is "-cpu qemu64,+vmx" (replace qemu64 by whatever you want) to
> advertise the existence of VMX.
>

Ok, thanks for the answer.
Right now I tested this config:

Host F16 with these packages
kernel-3.1.1-2.fc16.x86_64
virt-manager-0.9.0-7.fc16.noarch
qemu-kvm-0.15.1-3.fc16.x86_64
libvirt-0.9.6-2.fc16.x86_64

L1 guest f16vm with the same virtualization-related packages as the host
L2 guest c56, configured as "Red Hat 5.4 or above" and set to boot from CD;
the CD is an ISO of CentOS 5.6 Live x86_64

L1 guest configured with "Copy host CPU configuration" in virt-manager;
its cpuinfo gives:
[root@f16vm ~]# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Duo CPU T7700  @ 2.40GHz
stepping: 11
cpu MHz : 2693.880
cache size  : 4096 KB
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm
constant_tsc up nopl pni vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt
hypervisor lahf_lm
bogomips: 5387.76
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

The qemu command line on my host for f16vm is:
/usr/bin/qemu-kvm -S -M pc-0.14 -cpu
core2duo,+lahf_lm,+rdtscp,+popcnt,+x2apic,+sse4.2,+sse4.1,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds
-enable-kvm -m 3192 -smp 1,sockets=1,cores=1,threads=1 -name f16vm


On L1 f16vm, the qemu command line for its L2 guest c56 is:
[root@f16vm ~]# ps -ef|grep qemu
qemu  1834 1 10 12:49 ?00:01:07 /usr/bin/qemu-kvm -S
-M pc-0.14 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1
-name c56 -uuid 15526957-51c9-1958-8a15-dea8f2626e5d -nodefconfig
-nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/c56.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -drive
file=/var/lib/libvirt/images/CentOS-5.6-x86_64-LiveCD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -vga
cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

[root@f16vm ~]# virsh start c56

After a while c56 goes into a paused state:
[root@f16vm ~]# virsh domstate c56
paused

[root@f16vm ~]# virsh domstate c56 --reason
paused (unknown)

I only got up to these lines with virt-dmesg, run from f16vm
against c56:
https://docs.google.com/open?id=0BwoPbcrMv8mvMWVmYTRkNDMtMjMzMi00OWViLWI1NTctYjA1YzU2NmM0ZmU5

I'm also posting an image of what I see on the console of the c56 L2
guest before it gets paused:
https://docs.google.com/open?id=0BwoPbcrMv8mvMWVjNGNmYTUtNTcxOC00MzBkLWI5YWYtZDhmNDkxMzM1OTEx

Thanks,
Gianluca


Re: [PATCH 02/10] nEPT: MMU context for nested EPT

2011-11-24 Thread Avi Kivity
On 11/23/2011 05:44 PM, Nadav Har'El wrote:
> On Wed, Nov 23, 2011, Nadav Har'El wrote about "Re: [PATCH 02/10] nEPT: MMU 
> context for nested EPT":
> > > +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
> > > +{
> > > + int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
> > > +
> > > + vcpu->arch.nested_mmu.gva_to_gpa = EPT_gva_to_gpa_nested;
> > > +
> > > + return r;
> > > +}
> >..
> > I didn't see you actually call this function anywhere - how is it
> > supposed to work?
> >..
> > It seems we need a fifth case in that function.
> >..
>
> On second thought, why is this modifying nested_mmu.gva_to_gpa, and not
> mmu.gva_to_gpa? Isn't the nested_mmu the L2 CR3, which is *not* in EPT
> format, and what we really want to change is the outer mmu, which is
> EPT12 and is indeed in EPT format?
> Or am I missing something?

I think you're right.  The key is to look at what ->walk_mmu points at.

-- 
error compiling committee.c: too many arguments to function



RE: Changing IOMMU-API for generic DMA-mapping supported by the hardware

2011-11-24 Thread Marek Szyprowski
Hello,

On Friday, November 11, 2011 2:17 PM Joerg Roedel wrote:

> Okay, seperate thread for this one.

If possible, I would like to be CCed on the next mails in this topic.

For the last few months I've been working on DMA-mapping changes for
the ARM architecture in order to add support for an IOMMU-aware DMA
mapper. The latest version of my patches is available here:
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-October/000745.html

The next version will be posted soon.
 
> On Thu, Nov 10, 2011 at 07:28:39PM +, David Woodhouse wrote:
> > > The plan is to have a single DMA-API implementation for all IOMMU
> > > drivers (X86 and ARM) which just uses the IOMMU-API. But to make this
> > > perform reasonably well a few changes to the IOMMU-API are required.
> > > I already have some ideas which we can discuss if you want.
> >
> > Yeah, that sounds useful.
> 
> As I said some changes to the IOMMU-API are required in my opinion.
> These changes should also allow it to move over old-style IOMMUs like
> Calgary or GART later.
> 
> The basic idea is that IOMMU drivers should be required to put every
> device they are responsible for into a default domain. The DMA mapping
> code can query this default domain for each device.

Good idea.
 
> Also the default domain has capabilities that can be queried. Those
> capabilities include the size and offset of the address space they can
> re-map. For GART and Calgary this will be the aperture, for VT-d and AMD
> IOMMU the whole 64bit address space. Another capability is whether
> addresses outside of that area are 1-1 mapped or not accessible to the
> device.
>
> The generic DMA-mapping code will use that information to initialize its
> allocator and uses iommu_map/iommu_unmap to create and destroy mappings
> as requested by the DMA-API (but the DMA-mapping code does not need to
> create a domain of its own).
> 
> The good thing about these default domains is that IOMMU drivers can
> implement their own optimizations on it. The AMD IOMMU driver for
> example already makes a distinction between dma-mapping domains and
> other protection-domains. The optimization for dma-mapping domains is
> that the leaf-pages of the page-table are kept in an array so that it
> is very easy to find the PTE for an address. Those optimizations are
> still possible with the default-domain concept.
> 
> In short, the benefits of the default-domain concept are:
> 
>   1) It allows existing optimizations for the DMA-mapping code
>  paths to persist
>   2) It also fits old-style IOMMUs like GART, Calgary and others
> 
> An open problem is how to report reserved ranges of an address-space.
> These ranges might exist from a BIOS requirement for 1-1 mapping of
> certain address ranges (in AMD jargon: Unity mapped ranges, something
> similar exists on VT-d afaik) or hardware requirements like the reserved
> address range used for MSI interrupts.

In my DMA-mapping IOMMU integration I've used a dma_iommu_mapping
structure, which contains a pointer to the iommu domain, a bitmap and
a lock. Maybe we should consider extending the iommu domain with an
allocation bitmap (or another structure that holds information about
used/unused iova ranges)? From the DMA-mapping (as an IOMMU client)
perspective we only need 2 more callbacks in the IOMMU API:
alloc_iova_range() and free_iova_range().

Each IOMMU implementation can provide these calls based on an internal
bitmap allocator, which will also cover the issue with reserved
ranges. What do you think about such a solution?
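
To make the proposal concrete (the exact signatures are not defined
anywhere in this thread, so the prototypes below are assumptions):

struct iommu_ops {
        /* ... existing attach/detach/map/unmap callbacks ... */
        dma_addr_t (*alloc_iova_range)(struct iommu_domain *domain,
                                       size_t size, unsigned long align);
        void (*free_iova_range)(struct iommu_domain *domain,
                                dma_addr_t iova, size_t size);
};

The DMA-mapping core would call alloc_iova_range() to reserve a block,
iommu_map() to back it, and free_iova_range() after iommu_unmap().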

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center





Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Avi Kivity
On 11/24/2011 01:53 PM, Sasha Levin wrote:
> On Thu, 2011-11-24 at 12:48 +0200, Avi Kivity wrote:
> > On 11/24/2011 12:45 PM, Sasha Levin wrote:
> > > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> > > fail with -E2BIG due to wrong comparisons.
> > >
> > > Cc: Avi Kivity 
> > > Cc: Marcelo Tosatti 
> > > Signed-off-by: Sasha Levin 
> > > ---
> > >  arch/x86/kvm/x86.c |2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 9eff4af..83fef71 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> > >cpuid->nent);
> > >  
> > >   r = -E2BIG;
> > > - if (nent >= cpuid->nent)
> > > + if (nent > cpuid->nent)
> > >   goto out_free;
> > >  
> > >
> > 
> > This is just a landmine for the next entry to be added there; surely
> > whoever adds it will forget to correct the > back to >=.
> > 
>
> Slapping a big warning before that should do the trick? Or maybe add
> something similar to 'final_nent = nent - 1;'?

Refactor the whole thing so all the repetitive code goes away.  Maybe
make it table driven.

But after my cpuid.c patch please, I'd hate to redo it.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Sasha Levin
On Thu, 2011-11-24 at 12:48 +0200, Avi Kivity wrote:
> On 11/24/2011 12:45 PM, Sasha Levin wrote:
> > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> > fail with -E2BIG due to wrong comparisons.
> >
> > Cc: Avi Kivity 
> > Cc: Marcelo Tosatti 
> > Signed-off-by: Sasha Levin 
> > ---
> >  arch/x86/kvm/x86.c |2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9eff4af..83fef71 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> >  cpuid->nent);
> >  
> > r = -E2BIG;
> > -   if (nent >= cpuid->nent)
> > +   if (nent > cpuid->nent)
> > goto out_free;
> >  
> >
> 
> This is just a landmine for the next entry to be added there; surely
> whoever adds it will forget to correct the > back to >=.
> 

Slapping a big warning before that should do the trick? Or maybe add
something similar to 'final_nent = nent - 1;'?

-- 

Sasha.



Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function

2011-11-24 Thread Xiao Guangrong
On 11/24/2011 06:23 PM, Takuya Yoshikawa wrote:

> (2011/11/24 19:15), Takuya Yoshikawa wrote:
>> (2011/11/24 18:40), Xiao Guangrong wrote:
> 
>> You can eliminate this if you use old_slot and new_slot for the two memory 
>> slots.
> 
> Or old_bitmap and new_bitmap.  Anyway, calling id_to_memslot() for getting the
> same slot twice is not good, IMO.
> 


Sure. Thanks for your review, Takuya!

From: Xiao Guangrong 
Subject: KVM: introduce id_to_memslot function

Introduce id_to_memslot to get memslot by slot id

Signed-off-by: Xiao Guangrong 
---
 arch/ia64/kvm/kvm-ia64.c  |2 +-
 arch/powerpc/kvm/book3s.c |2 +-
 arch/x86/kvm/vmx.c|6 --
 arch/x86/kvm/x86.c|   18 +-
 include/linux/kvm_host.h  |6 ++
 virt/kvm/kvm_main.c   |   13 +
 6 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 42ad1f9..92d9f1e 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1818,7 +1818,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (log->slot >= KVM_MEMORY_SLOTS)
goto out;

-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a459479..e41ac6f 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -498,7 +498,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,

/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);

ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ba24022..8f19d91 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2711,11 +2711,13 @@ static gva_t rmode_tss_base(struct kvm *kvm)
 {
if (!kvm->arch.tss_addr) {
struct kvm_memslots *slots;
+   struct kvm_memory_slot *slot;
gfn_t base_gfn;

slots = kvm_memslots(kvm);
-   base_gfn = slots->memslots[0].base_gfn +
-kvm->memslots->memslots[0].npages - 3;
+   slot = id_to_memslot(slots, 0);
+   base_gfn = slot->base_gfn + slot->npages - 3;
+
return base_gfn << PAGE_SHIFT;
}
return kvm->arch.tss_addr;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9e5a59..886296e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (log->slot >= KVM_MEMORY_SLOTS)
goto out;

-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;
@@ -3532,27 +3532,27 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
/* If nothing is dirty, don't bother messing with page tables. */
if (nr_dirty_pages) {
struct kvm_memslots *slots, *old_slots;
-   unsigned long *dirty_bitmap;
+   unsigned long *dirty_bitmap, *dirty_bitmap_head;

-   dirty_bitmap = memslot->dirty_bitmap_head;
-   if (memslot->dirty_bitmap == dirty_bitmap)
-   dirty_bitmap += n / sizeof(long);
-   memset(dirty_bitmap, 0, n);
+   dirty_bitmap = memslot->dirty_bitmap;
+   dirty_bitmap_head = memslot->dirty_bitmap_head;
+   if (dirty_bitmap == dirty_bitmap_head)
+   dirty_bitmap_head += n / sizeof(long);
+   memset(dirty_bitmap_head, 0, n);

r = -ENOMEM;
slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!slots)
goto out;
memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
-   memslot = &slots->memslots[log->slot];
-   memslot->dirty_bitmap = dirty_bitmap;
+   memslot = id_to_memslot(slots, log->slot);
memslot->nr_dirty_pages = 0;
+   memslot->dirty_bitmap = dirty_bitmap_head;
update_memslots(slots, NULL);

old_slots = kvm->memslots;
rcu_assign_pointer(kvm->memslots, slots);
synchronize_srcu_expedited(&kvm->srcu);
-   dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap;
kfree(old_slots);

write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 392af47..123925c 100644
--- a/i

[Qemu-devel] [PATCH] ivshmem: fix guest unable to start with ioeventfd

2011-11-24 Thread zanghongyong
From: Hongyong Zang 

When a guest boots with ioeventfd, an error (by gdb) occurs:
  Program received signal SIGSEGV, Segmentation fault.
  0x006009cc in setup_ioeventfds (s=0x171dc40)
  at /home/louzhengwei/git_source/qemu-kvm/hw/ivshmem.c:363
  363 for (j = 0; j < s->peers[i].nb_eventfds; j++) {
The bug is due to accessing s->peers which is NULL.

This patch uses the memory region API to replace the old
kvm_set_ioeventfd_mmio_long(). It also makes
memory_region_add_eventfd() be called in ivshmem_read() when qemu
receives eventfd information from ivshmem_server.

Signed-off-by: Hongyong Zang 
---
 hw/ivshmem.c |   41 ++---
 1 files changed, 14 insertions(+), 27 deletions(-)

diff --git a/hw/ivshmem.c b/hw/ivshmem.c
index 242fbea..be26f03 100644
--- a/hw/ivshmem.c
+++ b/hw/ivshmem.c
@@ -58,7 +58,6 @@ typedef struct IVShmemState {
 CharDriverState *server_chr;
 MemoryRegion ivshmem_mmio;
 
-pcibus_t mmio_addr;
 /* We might need to register the BAR before we actually have the memory.
  * So prepare a container MemoryRegion for the BAR immediately and
  * add a subregion when we have the memory.
@@ -346,8 +345,14 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
 guest_curr_max = s->peers[posn].nb_eventfds;
 
 for (i = 0; i < guest_curr_max; i++) {
-kvm_set_ioeventfd_mmio_long(s->peers[posn].eventfds[i],
-s->mmio_addr + DOORBELL, (posn << 16) | i, 0);
+if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
+memory_region_del_eventfd(&s->ivshmem_mmio,
+ DOORBELL,
+ 4,
+ true,
+ (posn << 16) | i,
+ s->peers[posn].eventfds[i]);
+}
 close(s->peers[posn].eventfds[i]);
 }
 
@@ -355,22 +360,6 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
 s->peers[posn].nb_eventfds = 0;
 }
 
-static void setup_ioeventfds(IVShmemState *s) {
-
-int i, j;
-
-for (i = 0; i <= s->max_peer; i++) {
-for (j = 0; j < s->peers[i].nb_eventfds; j++) {
-memory_region_add_eventfd(&s->ivshmem_mmio,
-  DOORBELL,
-  4,
-  true,
-  (i << 16) | j,
-  s->peers[i].eventfds[j]);
-}
-}
-}
-
 /* this function increase the dynamic storage need to store data about other
  * guests */
 static void increase_dynamic_storage(IVShmemState *s, int new_min_size) {
@@ -491,10 +480,12 @@ static void ivshmem_read(void *opaque, const uint8_t * buf, int flags)
 }
 
 if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
-if (kvm_set_ioeventfd_mmio_long(incoming_fd, s->mmio_addr + DOORBELL,
-(incoming_posn << 16) | guest_max_eventfd, 1) < 0) {
-fprintf(stderr, "ivshmem: ioeventfd not available\n");
-}
+memory_region_add_eventfd(&s->ivshmem_mmio,
+  DOORBELL,
+  4,
+  true,
+  (incoming_posn << 16) | guest_max_eventfd,
+  incoming_fd);
 }
 
 return;
@@ -659,10 +650,6 @@ static int pci_ivshmem_init(PCIDevice *dev)
 memory_region_init_io(&s->ivshmem_mmio, &ivshmem_mmio_ops, s,
   "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
 
-if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
-setup_ioeventfds(s);
-}
-
 /* region for registers*/
 pci_register_bar(&s->dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
  &s->ivshmem_mmio);
-- 
1.7.1



Re: [PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Avi Kivity
On 11/24/2011 12:45 PM, Sasha Levin wrote:
> If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> fail with -E2BIG due to wrong comparisons.
>
> Cc: Avi Kivity 
> Cc: Marcelo Tosatti 
> Signed-off-by: Sasha Levin 
> ---
>  arch/x86/kvm/x86.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9eff4af..83fef71 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
>cpuid->nent);
>  
>   r = -E2BIG;
> - if (nent >= cpuid->nent)
> + if (nent > cpuid->nent)
>   goto out_free;
>  
>

This is just a landmine for the next entry to be added there; surely
whoever adds it will forget to correct the > back to >=.

-- 
error compiling committee.c: too many arguments to function



[PATCH v2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Sasha Levin
If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
fail with -E2BIG due to wrong comparisons.

Cc: Avi Kivity 
Cc: Marcelo Tosatti 
Signed-off-by: Sasha Levin 
---
 arch/x86/kvm/x86.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9eff4af..83fef71 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2710,7 +2710,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 cpuid->nent);
 
r = -E2BIG;
-   if (nent >= cpuid->nent)
+   if (nent > cpuid->nent)
goto out_free;
 
r = -EFAULT;
-- 
1.7.8.rc3



Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Avi Kivity
On 11/24/2011 12:37 PM, Sasha Levin wrote:
> On Thu, 2011-11-24 at 12:33 +0200, Avi Kivity wrote:
> > On 11/24/2011 12:31 PM, Sasha Levin wrote:
> > > > 
> > > > The protocol goes like "try size x, if it fails with -E2BIG, increase x,
> > > > try again". It's awkward.
> > >
> > > We can set nent to the number of entries required, like we do in the
> > > opposite case where we passed too many entries.
> > 
> > There's no point, since userspace will want to support older kernels.
>
> In the case of old kernels the cpuid->nent value will not be modified,
> so userspace can handle both cases easily:
>
> - If KVM_GET_SUPPORTED_CPUID returned -E2BIG, check cpuid->nent
>   - If zero, do the same -E2BIG loop as we do now.
>   - If not, allocate the amount needed and pass it to the ioctl again.
>

What's the point?  The code becomes more complicated.

Something like 'while (try_get_cpuid(x) == -E2BIG) { x *= 2; }' is
simple and works everywhere.
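
Spelled out, that userspace loop looks something like this (a sketch
with allocation-failure handling trimmed, not taken from any
particular VMM):

#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static struct kvm_cpuid2 *get_supported_cpuid(int kvm_fd)
{
        int nent = 8;
        struct kvm_cpuid2 *cpuid;

        for (;;) {
                cpuid = calloc(1, sizeof(*cpuid) +
                               nent * sizeof(struct kvm_cpuid_entry2));
                cpuid->nent = nent;
                if (ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, cpuid) == 0)
                        return cpuid;   /* kernel set cpuid->nent to the count */
                free(cpuid);
                if (errno != E2BIG)
                        return NULL;    /* a real error, give up */
                nent *= 2;              /* too small: grow and retry */
        }
}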

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Sasha Levin
On Thu, 2011-11-24 at 12:33 +0200, Avi Kivity wrote:
> On 11/24/2011 12:31 PM, Sasha Levin wrote:
> > > 
> > > The protocol goes like "try size x, if it fails with -E2BIG, increase x,
> > try again". It's awkward.
> >
> > We can set nent to the number of entries required, like we do in the
> > opposite case where we passed too many entries.
> 
> There's no point, since userspace will want to support older kernels.

In the case of old kernels the cpuid->nent value will not be modified,
so userspace can handle both cases easily:

- If KVM_GET_SUPPORTED_CPUID returned -E2BIG, check cpuid->nent
- If zero, do the same -E2BIG loop as we do now.
- If not, allocate the amount needed and pass it to the ioctl again.

-- 

Sasha.



Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Avi Kivity
On 11/24/2011 12:31 PM, Sasha Levin wrote:
> > 
> > The protocol goes like "try size x, if it fails with -E2BIG, increase x,
> > try again". It's awkward.
>
> We can set nent to the number of entries required, like we do in the
> opposite case where we passed too many entries.

There's no point, since userspace will want to support older kernels.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Sasha Levin
On Thu, 2011-11-24 at 08:09 -0200, Marcelo Tosatti wrote:
> On Thu, Nov 17, 2011 at 12:18:44PM +0200, Sasha Levin wrote:
> > If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> > fail with -E2BIG due to wrong comparisons.
> > 
> > Cc: Avi Kivity 
> > Cc: Marcelo Tosatti 
> > Signed-off-by: Sasha Levin 
> > ---
> >  arch/x86/kvm/x86.c |   12 ++--
> >  1 files changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9eff4af..460c49b 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2664,7 +2664,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> > do_cpuid_ent(&cpuid_entries[nent], func, 0,
> >  &nent, cpuid->nent);
> > r = -E2BIG;
> > -   if (nent >= cpuid->nent)
> > +   if (nent > cpuid->nent)
> > goto out_free;
> 
> "int nent" variable contains the index into the array. 
> "__u32 cpuid->nent", from userspace, contains the number
> of entries in the array.
> 
> So the ">=" comparison is necessary to avoid overwriting past the end of
> the array.

Right, only the last comparison should be changed to ">", because in
that case it's ok if nent (which points to the next entry) equals
cpuid->nent.

> 
> The protocol goes like "try size x, if it fails with -E2BIG, increase x,
> try again". It's awkward.

We can set nent to the number of entries required, like we do in the
opposite case where we passed too many entries.

-- 

Sasha.



Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function

2011-11-24 Thread Takuya Yoshikawa

(2011/11/24 19:15), Takuya Yoshikawa wrote:
> (2011/11/24 18:40), Xiao Guangrong wrote:
>
> You can eliminate this if you use old_slot and new_slot for the two
> memory slots.

Or old_bitmap and new_bitmap.  Anyway, calling id_to_memslot() for getting the
same slot twice is not good, IMO.

Takuya


Re: [PATCH v3 4/6] KVM: introduce id_to_memslot function

2011-11-24 Thread Takuya Yoshikawa

(2011/11/24 18:40), Xiao Guangrong wrote:
> ...
>
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>         if (log->slot >= KVM_MEMORY_SLOTS)
>                 goto out;
>
> -       memslot = &kvm->memslots->memslots[log->slot];
> +       memslot = id_to_memslot(kvm->memslots, log->slot);
>         r = -ENOENT;
>         if (!memslot->dirty_bitmap)
>                 goto out;
> @@ -3544,15 +3544,16 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>         if (!slots)
>                 goto out;
>         memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
> -       memslot = &slots->memslots[log->slot];
> -       memslot->dirty_bitmap = dirty_bitmap;
> +       memslot = id_to_memslot(slots, log->slot);
>         memslot->nr_dirty_pages = 0;
> +       memslot->dirty_bitmap = dirty_bitmap;
>         update_memslots(slots, NULL);
>
>         old_slots = kvm->memslots;
>         rcu_assign_pointer(kvm->memslots, slots);
>         synchronize_srcu_expedited(&kvm->srcu);
> -       dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap;
> +       dirty_bitmap = id_to_memslot(old_slots,
> +                                     log->slot)->dirty_bitmap;

You can eliminate this if you use old_slot and new_slot for the two
memory slots.

Takuya

>         kfree(old_slots);
>
>         write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);



Re: [PATCH 1/2] KVM: Don't fail KVM_GET_SUPPORTED_CPUID if nent is just right

2011-11-24 Thread Marcelo Tosatti
On Thu, Nov 17, 2011 at 12:18:44PM +0200, Sasha Levin wrote:
> If we pass just enough entries to KVM_GET_SUPPORTED_CPUID, we would still
> fail with -E2BIG due to wrong comparisons.
> 
> Cc: Avi Kivity 
> Cc: Marcelo Tosatti 
> Signed-off-by: Sasha Levin 
> ---
>  arch/x86/kvm/x86.c |   12 ++--
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9eff4af..460c49b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2664,7 +2664,7 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
>   do_cpuid_ent(&cpuid_entries[nent], func, 0,
>&nent, cpuid->nent);
>   r = -E2BIG;
> - if (nent >= cpuid->nent)
> + if (nent > cpuid->nent)
>   goto out_free;

"int nent" variable contains the index into the array. 
"__u32 cpuid->nent", from userspace, contains the number
of entries in the array.

So the ">=" comparison is necessary to avoid overwriting past the end of
the array.
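
A standalone illustration of that invariant (not kernel code): nent
indexes the next free entry, while capacity plays the role of
cpuid->nent.

static void fill(int *entries, int capacity)
{
        int nent = 0;

        while (nent < capacity) {       /* writing at nent >= capacity
                                           would overflow */
                entries[nent] = 0;      /* stands in for do_cpuid_ent() */
                nent++;
        }
        /* nent == capacity here means "exactly full", not overflow, so
         * only the final result check can safely use nent > capacity;
         * the intermediate checks before further writes need >=. */
}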

The protocol goes like "try size x, if it fails with -E2BIG, increase x,
try again". Its awkward.



Re: [PATCH] kvm-tpr-opt: Fix instruction_is_ok() for push tpr

2011-11-24 Thread Marcelo Tosatti
On Tue, Nov 22, 2011 at 10:35:59AM +0100, Markus Armbruster wrote:
> Missing break spotted by Coverity.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  kvm-tpr-opt.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)

Applied, thanks.



[PATCH] KVM: IA64: fix struct redefinition

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

There is the same struct definition in ia64 and kvm common code:
arch/ia64/kvm//kvm-ia64.c: At top level:
arch/ia64/kvm//kvm-ia64.c:777:8: error: redefinition of ‘struct kvm_io_range’
include/linux/kvm_host.h:62:8: note: originally defined here

So rename kvm_io_range to kvm_ia64_io_range in the ia64 code.

Signed-off-by: Xiao Guangrong 
---
 arch/ia64/kvm/kvm-ia64.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 92d9f1e..4050520 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -774,13 +774,13 @@ struct kvm *kvm_arch_alloc_vm(void)
return kvm;
 }

-struct kvm_io_range {
+struct kvm_ia64_io_range {
unsigned long start;
unsigned long size;
unsigned long type;
 };

-static const struct kvm_io_range io_ranges[] = {
+static const struct kvm_ia64_io_range io_ranges[] = {
{VGA_IO_START, VGA_IO_SIZE, GPFN_FRAME_BUFFER},
{MMIO_START, MMIO_SIZE, GPFN_LOW_MMIO},
{LEGACY_IO_START, LEGACY_IO_SIZE, GPFN_LEGACY_IO},
-- 
1.7.7.3


[PATCH v3 6/6] KVM: introduce a table to map slot id to index in memslots array

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Getting the dirty log is a frequent operation when framebuffer-based
displays are used (for example, X Window), so we introduce a mapping
table to speed up id_to_memslot().

Signed-off-by: Xiao Guangrong 
---
 include/linux/kvm_host.h |   13 +++--
 virt/kvm/kvm_main.c  |7 ++-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9efdf5c..8c5c303 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -239,6 +239,8 @@ struct kvm_irq_routing_table {};
 struct kvm_memslots {
u64 generation;
struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
+   /* The mapping table from slot id to the index in memslots[]. */
+   int id_to_index[KVM_MEM_SLOTS_NUM];
 };

 struct kvm {
@@ -341,14 +343,13 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
 static inline struct kvm_memory_slot *
 id_to_memslot(struct kvm_memslots *slots, int id)
 {
-   int i;
+   int index = slots->id_to_index[id];
+   struct kvm_memory_slot *slot;

-   for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
-   if (slots->memslots[i].id == id)
-   return &slots->memslots[i];
+   slot = &slots->memslots[index];

-   WARN_ON(1);
-   return NULL;
+   WARN_ON(slot->id != id);
+   return slot;
 }

 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6e8eb15..e289486 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -446,7 +446,7 @@ static void kvm_init_memslots_id(struct kvm *kvm)
struct kvm_memslots *slots = kvm->memslots;

for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
-   slots->memslots[i].id = i;
+   slots->id_to_index[i] = slots->memslots[i].id = i;
 }

 static struct kvm *kvm_create_vm(void)
@@ -674,8 +674,13 @@ static int cmp_memslot(const void *slot1, const void *slot2)
  */
 static void sort_memslots(struct kvm_memslots *slots)
 {
+   int i;
+
sort(slots->memslots, KVM_MEM_SLOTS_NUM,
  sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
+
+   for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
+   slots->id_to_index[slots->memslots[i].id] = i;
 }

 void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
-- 
1.7.7.3



[PATCH v3 5/6] KVM: sort memslots by its size and use line search

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Sort memslots by size and use a linear search to find them, so that
larger memslots get a better fit.

The idea is from Avi.

Signed-off-by: Xiao Guangrong 
---
 include/linux/kvm_host.h |   18 +--
 virt/kvm/kvm_main.c  |   79 +-
 2 files changed, 72 insertions(+), 25 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 123925c..9efdf5c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -231,8 +231,12 @@ struct kvm_irq_routing_table {};
 #define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
 #endif

+/*
+ * Note:
+ * memslots are not sorted by id anymore, please use id_to_memslot()
+ * to get the memslot by its id.
+ */
 struct kvm_memslots {
-   int nmemslots;
u64 generation;
struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
 };
@@ -310,7 +314,8 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)

 #define kvm_for_each_memslot(memslot, slots)   \
for (memslot = &slots->memslots[0]; \
- memslot < slots->memslots + (slots)->nmemslots; memslot++)
+ memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\
+   memslot++)

 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id);
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
@@ -336,7 +341,14 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
 static inline struct kvm_memory_slot *
 id_to_memslot(struct kvm_memslots *slots, int id)
 {
-   return &slots->memslots[id];
+   int i;
+
+   for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
+   if (slots->memslots[i].id == id)
+   return &slots->memslots[i];
+
+   WARN_ON(1);
+   return NULL;
 }

 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7b60849..6e8eb15 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -440,6 +440,15 @@ static int kvm_init_mmu_notifier(struct kvm *kvm)

 #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */

+static void kvm_init_memslots_id(struct kvm *kvm)
+{
+   int i;
+   struct kvm_memslots *slots = kvm->memslots;
+
+   for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
+   slots->memslots[i].id = i;
+}
+
 static struct kvm *kvm_create_vm(void)
 {
int r, i;
@@ -465,6 +474,7 @@ static struct kvm *kvm_create_vm(void)
kvm->memslots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!kvm->memslots)
goto out_err_nosrcu;
+   kvm_init_memslots_id(kvm);
if (init_srcu_struct(&kvm->srcu))
goto out_err_nosrcu;
for (i = 0; i < KVM_NR_BUSES; i++) {
@@ -630,15 +640,54 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 }
 #endif /* !CONFIG_S390 */

+static struct kvm_memory_slot *
+search_memslots(struct kvm_memslots *slots, gfn_t gfn)
+{
+   struct kvm_memory_slot *memslot;
+
+   kvm_for_each_memslot(memslot, slots)
+   if (gfn >= memslot->base_gfn &&
+ gfn < memslot->base_gfn + memslot->npages)
+   return memslot;
+
+   return NULL;
+}
+
+static int cmp_memslot(const void *slot1, const void *slot2)
+{
+   struct kvm_memory_slot *s1, *s2;
+
+   s1 = (struct kvm_memory_slot *)slot1;
+   s2 = (struct kvm_memory_slot *)slot2;
+
+   if (s1->npages < s2->npages)
+   return 1;
+   if (s1->npages > s2->npages)
+   return -1;
+
+   return 0;
+}
+
+/*
+ * Sort the memslots base on its size, so the larger slots
+ * will get better fit.
+ */
+static void sort_memslots(struct kvm_memslots *slots)
+{
+   sort(slots->memslots, KVM_MEM_SLOTS_NUM,
+ sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
+}
+
 void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
 {
if (new) {
int id = new->id;
struct kvm_memory_slot *old = id_to_memslot(slots, id);
+   unsigned long npages = old->npages;

*old = *new;
-   if (id >= slots->nmemslots)
-   slots->nmemslots = id + 1;
+   if (new->npages != npages)
+   sort_memslots(slots);
}

slots->generation++;
@@ -980,14 +1029,7 @@ EXPORT_SYMBOL_GPL(kvm_is_error_hva);
 static struct kvm_memory_slot *__gfn_to_memslot(struct kvm_memslots *slots,
gfn_t gfn)
 {
-   struct kvm_memory_slot *memslot;
-
-   kvm_for_each_memslot(memslot, slots)
-   if (gfn >= memslot->base_gfn
-   && gfn < memslot->base_gfn + memslot->npages)
-   return memslot;
-
-   return NULL;
+   return search_memslots(slots, gfn);
 }

 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
@@ -998,20 +

[PATCH v3 4/6] KVM: introduce id_to_memslot function

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Introduce id_to_memslot to get memslot by slot id

Signed-off-by: Xiao Guangrong 
---
 arch/ia64/kvm/kvm-ia64.c  |2 +-
 arch/powerpc/kvm/book3s.c |2 +-
 arch/x86/kvm/vmx.c|6 --
 arch/x86/kvm/x86.c|9 +
 include/linux/kvm_host.h  |6 ++
 virt/kvm/kvm_main.c   |   13 +
 6 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 42ad1f9..92d9f1e 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1818,7 +1818,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (log->slot >= KVM_MEMORY_SLOTS)
goto out;

-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a459479..e41ac6f 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -498,7 +498,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,

/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);

ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ba24022..8f19d91 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2711,11 +2711,13 @@ static gva_t rmode_tss_base(struct kvm *kvm)
 {
if (!kvm->arch.tss_addr) {
struct kvm_memslots *slots;
+   struct kvm_memory_slot *slot;
gfn_t base_gfn;

slots = kvm_memslots(kvm);
-   base_gfn = slots->memslots[0].base_gfn +
-kvm->memslots->memslots[0].npages - 3;
+   slot = id_to_memslot(slots, 0);
+   base_gfn = slot->base_gfn + slot->npages - 3;
+
return base_gfn << PAGE_SHIFT;
}
return kvm->arch.tss_addr;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9e5a59..b26dd82 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3521,7 +3521,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (log->slot >= KVM_MEMORY_SLOTS)
goto out;

-   memslot = &kvm->memslots->memslots[log->slot];
+   memslot = id_to_memslot(kvm->memslots, log->slot);
r = -ENOENT;
if (!memslot->dirty_bitmap)
goto out;
@@ -3544,15 +3544,16 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
if (!slots)
goto out;
memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
-   memslot = &slots->memslots[log->slot];
-   memslot->dirty_bitmap = dirty_bitmap;
+   memslot = id_to_memslot(slots, log->slot);
memslot->nr_dirty_pages = 0;
+   memslot->dirty_bitmap = dirty_bitmap;
update_memslots(slots, NULL);

old_slots = kvm->memslots;
rcu_assign_pointer(kvm->memslots, slots);
synchronize_srcu_expedited(&kvm->srcu);
-   dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap;
+   dirty_bitmap = id_to_memslot(old_slots,
+ log->slot)->dirty_bitmap;
kfree(old_slots);

write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 392af47..123925c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -333,6 +333,12 @@ static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
|| lockdep_is_held(&kvm->slots_lock));
 }

+static inline struct kvm_memory_slot *
+id_to_memslot(struct kvm_memslots *slots, int id)
+{
+   return &slots->memslots[id];
+}
+
 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
 #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
 static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4c2900c..7b60849 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -634,8 +634,9 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
 {
if (new) {
int id = new->id;
+   struct kvm_memory_slot *old = id_to_memslot(slots, id);

-   slots->memslots[id] = *new;
+   *old = *new;
if (id >= slots->nmemslots)
slots->nmemslots = id + 1;
}
@@ -681,7 +682,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
	goto out;

[PATCH v3 3/6] KVM: introduce kvm_for_each_memslot macro

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Introduce kvm_for_each_memslot to walk all valid memslots
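
A compilable userspace sketch of the iteration pattern (structures and values
are simplified stand-ins for the kernel ones): the macro bounds a pointer walk
by nmemslots, so callers stop open-coding the index loop.

#include <stdio.h>

struct kvm_memory_slot {
	unsigned long base_gfn;
	unsigned long npages;
};

struct kvm_memslots {
	int nmemslots;
	struct kvm_memory_slot memslots[8];
};

/* Walk only the valid slots, by pointer rather than by index. */
#define kvm_for_each_memslot(memslot, slots)			\
	for (memslot = &(slots)->memslots[0];			\
	     memslot < (slots)->memslots + (slots)->nmemslots;	\
	     memslot++)

int main(void)
{
	struct kvm_memslots slots = {
		.nmemslots = 2,
		.memslots = { { 0x0, 16 }, { 0x10000, 32 } },
	};
	struct kvm_memory_slot *memslot;
	unsigned long nr_pages = 0;

	/* The same shape as the kvm_mmu_calculate_mmu_pages() hunk below. */
	kvm_for_each_memslot(memslot, &slots)
		nr_pages += memslot->npages;

	printf("%lu pages\n", nr_pages);	/* prints 48 */
	return 0;
}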

Signed-off-by: Xiao Guangrong 
---
 arch/ia64/kvm/kvm-ia64.c |6 ++
 arch/x86/kvm/mmu.c   |   12 ++--
 include/linux/kvm_host.h |4 
 virt/kvm/iommu.c |   17 +
 virt/kvm/kvm_main.c  |   14 ++
 5 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 43f4c92..42ad1f9 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1366,14 +1366,12 @@ static void kvm_release_vm_pages(struct kvm *kvm)
 {
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
-   int i, j;
+   int j;
unsigned long base_gfn;

slots = kvm_memslots(kvm);
-   for (i = 0; i < slots->nmemslots; i++) {
-   memslot = &slots->memslots[i];
+   kvm_for_each_memslot(memslot, slots) {
base_gfn = memslot->base_gfn;
-
for (j = 0; j < memslot->npages; j++) {
if (memslot->rmap[j])
put_page((struct page *)memslot->rmap[j]);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 715dcb4..d737443 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1128,15 +1128,15 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
  int (*handler)(struct kvm *kvm, unsigned long *rmapp,
 unsigned long data))
 {
-   int i, j;
+   int j;
int ret;
int retval = 0;
struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;

slots = kvm_memslots(kvm);

-   for (i = 0; i < slots->nmemslots; i++) {
-   struct kvm_memory_slot *memslot = &slots->memslots[i];
+   kvm_for_each_memslot(memslot, slots) {
unsigned long start = memslot->userspace_addr;
unsigned long end;

@@ -3985,15 +3985,15 @@ nomem:
  */
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
 {
-   int i;
unsigned int nr_mmu_pages;
unsigned int  nr_pages = 0;
struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;

slots = kvm_memslots(kvm);

-   for (i = 0; i < slots->nmemslots; i++)
-   nr_pages += slots->memslots[i].npages;
+   kvm_for_each_memslot(memslot, slots)
+   nr_pages += memslot->npages;

nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
nr_mmu_pages = max(nr_mmu_pages,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 23f795c..392af47 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -308,6 +308,10 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
 idx++)

+#define kvm_for_each_memslot(memslot, slots)   \
+   for (memslot = &slots->memslots[0]; \
+ memslot < slots->memslots + (slots)->nmemslots; memslot++)
+
 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id);
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);

diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index a195c07..4e5f7b7 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -134,14 +134,15 @@ unmap_pages:

 static int kvm_iommu_map_memslots(struct kvm *kvm)
 {
-   int i, idx, r = 0;
+   int idx, r = 0;
struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;

idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);

-   for (i = 0; i < slots->nmemslots; i++) {
-   r = kvm_iommu_map_pages(kvm, &slots->memslots[i]);
+   kvm_for_each_memslot(memslot, slots) {
+   r = kvm_iommu_map_pages(kvm, memslot);
if (r)
break;
}
@@ -311,16 +312,16 @@ static void kvm_iommu_put_pages(struct kvm *kvm,

 static int kvm_iommu_unmap_memslots(struct kvm *kvm)
 {
-   int i, idx;
+   int idx;
struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;

idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);

-   for (i = 0; i < slots->nmemslots; i++) {
-   kvm_iommu_put_pages(kvm, slots->memslots[i].base_gfn,
-   slots->memslots[i].npages);
-   }
+   kvm_for_each_memslot(memslot, slots)
+   kvm_iommu_put_pages(kvm, memslot->base_gfn, memslot->npages);
+
srcu_read_unlock(&kvm->srcu, idx);

return 0;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b5ed777..4c2900c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -547,11 +547,11 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,

 void kvm_free_physmem(struct kvm *kvm)
 {
-   int i;
struct kvm_memslots *slots = kvm->memslots;
+   struct kvm_memory_slot *memslot;

-  

[PATCH v3 2/6] KVM: introduce update_memslots function

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Introduce update_memslots to update a slot which will then be installed
into kvm->memslots
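
A userspace sketch of the intent (simplified stand-in structures): every
modification of a slots copy funnels through one helper, so the nmemslots and
generation bookkeeping lives in a single place. The helper body mirrors the
hunk below; passing new == NULL bumps only the generation, which is what the
dirty-log path wants after updating a slot in place.

#define KVM_MEM_SLOTS_NUM 36

struct kvm_memory_slot {
	int id;
	unsigned long npages;
};

struct kvm_memslots {
	int nmemslots;
	unsigned long long generation;
	struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
};

static void update_memslots(struct kvm_memslots *slots,
			    struct kvm_memory_slot *new)
{
	if (new) {
		int id = new->id;

		slots->memslots[id] = *new;
		if (id >= slots->nmemslots)
			slots->nmemslots = id + 1;
	}

	slots->generation++;
}

int main(void)
{
	static struct kvm_memslots slots;
	struct kvm_memory_slot new = { .id = 5, .npages = 64 };

	update_memslots(&slots, &new);	/* installs slot 5, nmemslots = 6 */
	update_memslots(&slots, NULL);	/* generation-only bump */
	return !(slots.nmemslots == 6 && slots.generation == 2);
}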

Signed-off-by: Xiao Guangrong 
---
 arch/x86/kvm/x86.c   |2 +-
 include/linux/kvm_host.h |1 +
 virt/kvm/kvm_main.c  |   22 +++---
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1985ea1..a9e5a59 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3547,7 +3547,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
memslot = &slots->memslots[log->slot];
memslot->dirty_bitmap = dirty_bitmap;
memslot->nr_dirty_pages = 0;
-   slots->generation++;
+   update_memslots(slots, NULL);

old_slots = kvm->memslots;
rcu_assign_pointer(kvm->memslots, slots);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 924df0d..23f795c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -320,6 +320,7 @@ void kvm_exit(void);

 void kvm_get_kvm(struct kvm *kvm);
 void kvm_put_kvm(struct kvm *kvm);
+void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new);

 static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9ad94c9..b5ed777 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -630,6 +630,19 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 }
 #endif /* !CONFIG_S390 */

+void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
+{
+   if (new) {
+   int id = new->id;
+
+   slots->memslots[id] = *new;
+   if (id >= slots->nmemslots)
+   slots->nmemslots = id + 1;
+   }
+
+   slots->generation++;
+}
+
 /*
  * Allocate some memory and give it an address in the guest physical address
  * space.
@@ -780,10 +793,8 @@ skip_lpage:
GFP_KERNEL);
if (!slots)
goto out_free;
-   if (mem->slot >= slots->nmemslots)
-   slots->nmemslots = mem->slot + 1;
-   slots->generation++;
slots->memslots[mem->slot].flags |= KVM_MEMSLOT_INVALID;
+   update_memslots(slots, NULL);

old_memslots = kvm->memslots;
rcu_assign_pointer(kvm->memslots, slots);
@@ -815,9 +826,6 @@ skip_lpage:
GFP_KERNEL);
if (!slots)
goto out_free;
-   if (mem->slot >= slots->nmemslots)
-   slots->nmemslots = mem->slot + 1;
-   slots->generation++;

/* actual memory is freed via old in kvm_free_physmem_slot below */
if (!npages) {
@@ -827,7 +835,7 @@ skip_lpage:
new.lpage_info[i] = NULL;
}

-   slots->memslots[mem->slot] = new;
+   update_memslots(slots, &new);
old_memslots = kvm->memslots;
rcu_assign_pointer(kvm->memslots, slots);
synchronize_srcu_expedited(&kvm->srcu);
-- 
1.7.7.3



[PATCH v3 1/6] KVM: introduce KVM_MEM_SLOTS_NUM macro

2011-11-24 Thread Xiao Guangrong
From: Xiao Guangrong 

Introduce the KVM_MEM_SLOTS_NUM macro to replace open-coded uses of
KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS
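
A short sketch of the override pattern this sets up (values are illustrative,
taken from the x86 hunk): an arch header may define KVM_MEM_SLOTS_NUM itself,
and the generic header supplies the default only when the arch did not.

/* arch header, e.g. arch/x86/include/asm/kvm_host.h: */
#define KVM_MEMORY_SLOTS 32
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)

/* generic header, include/linux/kvm_host.h: on an arch that does not
 * define KVM_MEM_SLOTS_NUM itself, this default takes effect. */
#ifndef KVM_MEM_SLOTS_NUM
#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
#endif

struct kvm_memory_slot {
	unsigned long base_gfn;
	unsigned long npages;
};

struct kvm_memslots {
	int nmemslots;
	struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
};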

Signed-off-by: Xiao Guangrong 
---
 arch/x86/include/asm/kvm_host.h |4 +++-
 arch/x86/kvm/mmu.c  |2 +-
 include/linux/kvm_host.h|7 +--
 virt/kvm/kvm_main.c |2 +-
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 69b6525..1769f3d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -31,6 +31,8 @@
 #define KVM_MEMORY_SLOTS 32
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+
 #define KVM_MMIO_SIZE 16

 #define KVM_PIO_PAGE_OFFSET 1
@@ -228,7 +230,7 @@ struct kvm_mmu_page {
 * One bit set per slot which has memory
 * in this shadow page.
 */
-   DECLARE_BITMAP(slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
+   DECLARE_BITMAP(slot_bitmap, KVM_MEM_SLOTS_NUM);
bool unsync;
int root_count;  /* Currently serving as active root */
unsigned int unsync_children;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index aecdea2..715dcb4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1349,7 +1349,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
  PAGE_SIZE);
set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-   bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
+   bitmap_zero(sp->slot_bitmap, KVM_MEM_SLOTS_NUM);
sp->parent_ptes = 0;
mmu_page_add_parent_pte(vcpu, sp, parent_pte);
kvm_mod_used_mmu_pages(vcpu->kvm, +1);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c654aa..924df0d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -227,11 +227,14 @@ struct kvm_irq_routing_table {};

 #endif

+#ifndef KVM_MEM_SLOTS_NUM
+#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+#endif
+
 struct kvm_memslots {
int nmemslots;
u64 generation;
-   struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS +
-   KVM_PRIVATE_MEM_SLOTS];
+   struct kvm_memory_slot memslots[KVM_MEM_SLOTS_NUM];
 };

 struct kvm {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index af5c988..9ad94c9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -663,7 +663,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
(void __user *)(unsigned long)mem->userspace_addr,
mem->memory_size)))
goto out;
-   if (mem->slot >= KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
+   if (mem->slot >= KVM_MEM_SLOTS_NUM)
goto out;
if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
goto out;
-- 
1.7.7.3



[PATCH v3 0/6] KVM: optimize memslots searching

2011-11-24 Thread Xiao Guangrong
Changelog:
- rebased on the current kvm tree, plus some cleanups

This patchset is tested on x86 and build-tested on powerpc and ia64.
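
A self-contained userspace sketch of the core idea of the series (names are
simplified stand-ins for the kernel structures): keep the slot array sorted
by npages in descending order, so that a linear gfn lookup tests the largest,
and statistically hottest, slots first.

#include <stdio.h>
#include <stdlib.h>

struct memslot {
	unsigned long base_gfn;
	unsigned long npages;
};

/* Bigger slots sort first, mirroring cmp_memslot() in the series. */
static int cmp_memslot(const void *a, const void *b)
{
	const struct memslot *s1 = a, *s2 = b;

	if (s1->npages < s2->npages)
		return 1;
	if (s1->npages > s2->npages)
		return -1;
	return 0;
}

static struct memslot *search_memslots(struct memslot *slots, int n,
				       unsigned long gfn)
{
	int i;

	for (i = 0; i < n; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn < slots[i].base_gfn + slots[i].npages)
			return &slots[i];
	return NULL;
}

int main(void)
{
	struct memslot slots[] = {
		{ 0x0, 16 }, { 0x10000, 4096 }, { 0x20000, 64 },
	};
	int n = sizeof(slots) / sizeof(slots[0]);
	struct memslot *s;

	qsort(slots, n, sizeof(slots[0]), cmp_memslot);

	/* A gfn in the 4096-page slot now matches on the first probe. */
	s = search_memslots(slots, n, 0x10080);
	if (s)
		printf("found: base 0x%lx, %lu pages\n",
		       s->base_gfn, s->npages);
	return 0;
}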