[PATCH] KVM: x86: fix memory leak in vmx_init

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Free vmx_msr_bitmap_longmode_x2apic and vmx_msr_bitmap_longmode if
kvm_init() fails.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/vmx.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03f5746..c6da8c5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7741,7 +7741,7 @@ static int __init vmx_init(void)
r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
 __alignof__(struct vcpu_vmx), THIS_MODULE);
if (r)
-   goto out3;
+   goto out5;
 
 #ifdef CONFIG_KEXEC
rcu_assign_pointer(crash_vmclear_loaded_vmcss,
@@ -7789,6 +7789,8 @@ static int __init vmx_init(void)
 
return 0;
 
+out5:
+   free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic);
 out4:
free_page((unsigned long)vmx_msr_bitmap_longmode);
 out3:
-- 
1.7.1
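The fix follows the kernel's goto-unwind idiom: each successfully allocated resource gets a cleanup label, and an error jumps to the label that frees everything allocated so far, in reverse order. A minimal user-space sketch of the idiom (hypothetical resources and names, not the actual vmx_init code):

```c
#include <assert.h>
#include <stdlib.h>

static int frees; /* counts unwind steps, for illustration only */

/* Acquire three resources; fail_at simulates allocation N failing. */
static int init_resources(int fail_at)
{
	void *a, *b, *c;

	a = (fail_at == 1) ? NULL : malloc(16);
	if (!a)
		goto out1;
	b = (fail_at == 2) ? NULL : malloc(16);
	if (!b)
		goto out2;
	c = (fail_at == 3) ? NULL : malloc(16);
	if (!c)
		goto out3;

	/* success: in real code the caller would now own a, b and c */
	free(c);
	free(b);
	free(a);
	return 0;

out3:			/* unwind strictly in reverse allocation order */
	free(b);
	frees++;
out2:
	free(a);
	frees++;
out1:
	return -1;
}
```

The original bug was a failure jumping to a label too far down the chain (out3 instead of out5), skipping the free_page() calls for the two most recently allocated bitmaps.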

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] virtio-serial: set up vqs on demand

2013-04-08 Thread Amit Shah
On (Thu) 12 Jan 2012 [09:20:05], zanghongy...@huawei.com wrote:
> From: Hongyong Zang 
> 
> Virtio-serial sets up (max_ports+1)*2 vqs when the device probes, but not
> all io_ports may be used.
> These patches create the vqs of port0 and the control port when probing the
> device, then create the io-vqs when add_port() is called.

Hi,

Can you resurrect this series?  I think last time we were waiting for
a v2, but looks like it got lost.

Let me know if you're not interested anymore, and I can do something
about it.

Thanks,

Amit


Re: [PATCH] KVM: x86: fix memory leak in vmx_init

2013-04-08 Thread Gleb Natapov
On Mon, Apr 08, 2013 at 03:26:33PM +0800, Yang Zhang wrote:
> From: Yang Zhang 
> 
> Free vmx_msr_bitmap_longmode_x2apic and vmx_msr_bitmap_longmode if
> kvm_init() fails.
> 
> Signed-off-by: Yang Zhang 
> ---
Applied, thanks.

>  arch/x86/kvm/vmx.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 03f5746..c6da8c5 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -7741,7 +7741,7 @@ static int __init vmx_init(void)
>   r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
>__alignof__(struct vcpu_vmx), THIS_MODULE);
>   if (r)
> - goto out3;
> + goto out5;
>  
>  #ifdef CONFIG_KEXEC
>   rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> @@ -7789,6 +7789,8 @@ static int __init vmx_init(void)
>  
>   return 0;
>  
> +out5:
> + free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic);
>  out4:
>   free_page((unsigned long)vmx_msr_bitmap_longmode);
>  out3:
> -- 
> 1.7.1
> 

--
Gleb.


Re: [PATCH V3 0/2] tcm_vhost endpoint

2013-04-08 Thread Michael S. Tsirkin
On Wed, Apr 03, 2013 at 02:17:36PM +0800, Asias He wrote:
> Hello mst,
> 
> How about this one?
> 
> Asias He (2):
>   tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
>   tcm_vhost: Initialize vq->last_used_idx when set endpoint
> 
>  drivers/vhost/tcm_vhost.c | 145 
> --
>  1 file changed, 102 insertions(+), 43 deletions(-)

Looks good to me. 

Acked-by: Michael S. Tsirkin 



> -- 
> 1.8.1.4


Re: [PATCH V3 1/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup

2013-04-08 Thread Michael S. Tsirkin
On Wed, Apr 03, 2013 at 02:17:37PM +0800, Asias He wrote:
> Currently, vs->vs_endpoint is used to indicate whether the endpoint is set up
> not. It is set or cleared in vhost_scsi_set_endpoint() or
> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> we check it in vhost_scsi_handle_vq(), we ignored the lock.
> 
> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> indicate the status of the endpoint, we use per virtqueue
> vq->private_data to indicate it. In this way, we only need to take the
> per-queue vq->mutex lock, which gives concurrent multiqueue
> processing less lock contention. Further, on the read side of
> vq->private_data, we can even avoid taking the lock if it is accessed in
> the vhost worker thread, because it is protected by "vhost rcu".
> 
> Signed-off-by: Asias He 

Not strictly 3.9 material itself but needed for the next one.

Acked-by: Michael S. Tsirkin 

> ---
>  drivers/vhost/tcm_vhost.c | 144 
> --
>  1 file changed, 101 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 61f9eab..11121ea 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -65,9 +65,8 @@ enum {
>  
>  struct vhost_scsi {
>   /* Protected by vhost_scsi->dev.mutex */
> - struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> + struct tcm_vhost_tpg **vs_tpg;
>   char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> - bool vs_endpoint;
>  
>   struct vhost_dev dev;
>   struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> @@ -573,6 +572,7 @@ static void tcm_vhost_submission_work(struct work_struct 
> *work)
>  static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>   struct vhost_virtqueue *vq)
>  {
> + struct tcm_vhost_tpg **vs_tpg;
>   struct virtio_scsi_cmd_req v_req;
>   struct tcm_vhost_tpg *tv_tpg;
>   struct tcm_vhost_cmd *tv_cmd;
> @@ -581,8 +581,16 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>   int head, ret;
>   u8 target;
>  
> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> - if (unlikely(!vs->vs_endpoint))
> + /*
> +  * We can handle the vq only after the endpoint is setup by calling the
> +  * VHOST_SCSI_SET_ENDPOINT ioctl.
> +  *
> +  * TODO: Check that we are running from vhost_worker which acts
> +  * as read-side critical section for vhost kind of RCU.
> +  * See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> +  */
> + vs_tpg = rcu_dereference_check(vq->private_data, 1);
> + if (!vs_tpg)
>   return;
>  
>   mutex_lock(&vq->mutex);
> @@ -652,7 +660,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>  
>   /* Extract the tpgt */
>   target = v_req.lun[1];
> - tv_tpg = ACCESS_ONCE(vs->vs_tpg[target]);
> + tv_tpg = ACCESS_ONCE(vs_tpg[target]);
>  
>   /* Target does not exist, fail the request */
>   if (unlikely(!tv_tpg)) {
> @@ -771,6 +779,20 @@ static void vhost_scsi_handle_kick(struct vhost_work 
> *work)
>   vhost_scsi_handle_vq(vs, vq);
>  }
>  
> +static void vhost_scsi_flush_vq(struct vhost_scsi *vs, int index)
> +{
> + vhost_poll_flush(&vs->dev.vqs[index].poll);
> +}
> +
> +static void vhost_scsi_flush(struct vhost_scsi *vs)
> +{
> + int i;
> +
> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++)
> + vhost_scsi_flush_vq(vs, i);
> + vhost_work_flush(&vs->dev, &vs->vs_completion_work);
> +}
> +
>  /*
>   * Called from vhost_scsi_ioctl() context to walk the list of available
>   * tcm_vhost_tpg with an active struct tcm_vhost_nexus
> @@ -781,8 +803,10 @@ static int vhost_scsi_set_endpoint(
>  {
>   struct tcm_vhost_tport *tv_tport;
>   struct tcm_vhost_tpg *tv_tpg;
> + struct tcm_vhost_tpg **vs_tpg;
> + struct vhost_virtqueue *vq;
> + int index, ret, i, len;
>   bool match = false;
> - int index, ret;
>  
>   mutex_lock(&vs->dev.mutex);
>   /* Verify that ring has been setup correctly. */
> @@ -794,6 +818,15 @@ static int vhost_scsi_set_endpoint(
>   }
>   }
>  
> + len = sizeof(vs_tpg[0]) * VHOST_SCSI_MAX_TARGET;
> + vs_tpg = kzalloc(len, GFP_KERNEL);
> + if (!vs_tpg) {
> + mutex_unlock(&vs->dev.mutex);
> + return -ENOMEM;
> + }
> + if (vs->vs_tpg)
> + memcpy(vs_tpg, vs->vs_tpg, len);
> +
>   mutex_lock(&tcm_vhost_mutex);
>   list_for_each_entry(tv_tpg, &tcm_vhost_list, tv_tpg_list) {
>   mutex_lock(&tv_tpg->tv_tpg_mutex);
> @@ -808,14 +841,15 @@ static int vhost_scsi_set_endpoint(
>   tv_tport = tv_tpg->tport;
>  
>   if (!strcmp(tv_tport->tport_name, t->vhost_wwpn)) {
> - if (vs->vs_tpg[tv_tpg->tport_tpgt]) {
> + if (vs->vs_tpg && vs->vs_tpg[tv_tpg->tport_tpgt]) {
>  
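The core of the patch is an RCU-style publish/read pattern: vhost_scsi_set_endpoint() builds the new vs_tpg array and publishes it with rcu_assign_pointer() under the mutex, while the datapath does a single rcu_dereference() of vq->private_data and treats NULL as "endpoint not set up". Outside the kernel, the same shape can be sketched with C11 release/acquire atomics standing in for the RCU primitives (hypothetical names, not the tcm_vhost code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct tpg { int target; };

/* 'vq->private_data' slot: NULL means the endpoint is not set up yet. */
static _Atomic(struct tpg **) private_data;

/* Writer side: build the array first, then publish it in one step
 * (the rcu_assign_pointer() analog). */
static void set_endpoint(struct tpg **vs_tpg)
{
	atomic_store_explicit(&private_data, vs_tpg, memory_order_release);
}

/* Reader side: one acquire load (the rcu_dereference() analog),
 * bailing out early when no endpoint has been published. */
static struct tpg *handle_vq(int target)
{
	struct tpg **vs_tpg =
		atomic_load_explicit(&private_data, memory_order_acquire);

	if (!vs_tpg)		/* endpoint not configured: nothing to do */
		return NULL;
	return vs_tpg[target];
}
```

A real RCU reader also needs a read-side critical section (as the TODO comment in the patch notes, the vhost worker itself plays that role) so the old array can be reclaimed safely after a grace period; the atomics above only model the pointer publication.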

Re: [PATCH V3 2/2] tcm_vhost: Initialize vq->last_used_idx when set endpoint

2013-04-08 Thread Michael S. Tsirkin
On Wed, Apr 03, 2013 at 02:17:38PM +0800, Asias He wrote:
> This patch fixes guest hang when booting seabios and guest.
> 
>   [0.576238] scsi0 : Virtio SCSI HBA
>   [0.616754] virtio_scsi virtio1: request:id 0 is not a head!
> 
> vq->last_used_idx is initialized only when /dev/vhost-scsi is
> opened or closed.
> 
>vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
>vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()
> 
> So, when the guest talks to tcm_vhost after seabios does, vq->last_used_idx
> still contains the old value from seabios. This confuses the guest.
> 
> Fix this by calling vhost_init_used() to init vq->last_used_idx when
> we set endpoint.
> 
> Signed-off-by: Asias He 

Please apply for 3.9.

Acked-by: Michael S. Tsirkin 

> ---
>  drivers/vhost/tcm_vhost.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 11121ea..a10efd3 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -865,6 +865,7 @@ static int vhost_scsi_set_endpoint(
>   /* Flushing the vhost_work acts as synchronize_rcu */
>   mutex_lock(&vq->mutex);
>   rcu_assign_pointer(vq->private_data, vs_tpg);
> + vhost_init_used(vq);
>   mutex_unlock(&vq->mutex);
>   }
>   ret = 0;
> -- 
> 1.8.1.4
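The underlying issue is that vq->last_used_idx is cumulative device-side state: if it survives from the seabios driver into the guest kernel's freshly reset driver, the device publishes used entries starting at the stale index while the new driver expects them from 0, hence the "request:id 0 is not a head!" error. A toy model of the mismatch and the resync (hypothetical names, not the vhost code):

```c
#include <assert.h>

/* Device-side cursor into the used ring (vq->last_used_idx analog). */
struct vq_state {
	unsigned short last_used_idx;
};

/* A freshly reset driver always starts expecting used entries at 0. */
static unsigned short driver_expected_idx(void)
{
	return 0;
}

/* Device completes one request at its current cursor position. */
static unsigned short device_complete(struct vq_state *vq)
{
	return vq->last_used_idx++;
}

/* vhost_init_used() analog: resync the cursor when the endpoint is set. */
static void init_used(struct vq_state *vq)
{
	vq->last_used_idx = 0;
}
```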


[PATCH] KVM: VMX: Add missing braces to avoid redundant error check

2013-04-08 Thread Jan Kiszka
The code was already properly aligned, now also add the braces to avoid
that err is checked even if alloc_apic_access_page didn't run and change
it. Found via Coccinelle by Fengguang Wu.

Signed-off-by: Jan Kiszka 
---
 arch/x86/kvm/vmx.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index cf1aa8f..656b0fa 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6797,10 +6797,11 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm 
*kvm, unsigned int id)
put_cpu();
if (err)
goto free_vmcs;
-   if (vm_need_virtualize_apic_accesses(kvm))
+   if (vm_need_virtualize_apic_accesses(kvm)) {
err = alloc_apic_access_page(kvm);
if (err)
goto free_vmcs;
+   }
 
if (enable_ept) {
if (!kvm->arch.ept_identity_map_addr)
-- 
1.7.3.4
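Reduced to user space, the two shapes look like this; in vmx_create_vcpu the stray check was merely redundant (err was already known to be zero on that path), but the same indentation pattern becomes a real bug as soon as a stale nonzero err can reach it (hypothetical helpers, not the kvm code):

```c
#include <assert.h>

static int alloc_page_stub(void) { return -1; }	/* always "fails" */

/* Buggy shape: the indentation suggests both statements are guarded,
 * but only the assignment is, so the leftover 'err' is re-checked
 * even when need == 0. */
static int create_buggy(int need, int stale_err)
{
	int err = stale_err;
	if (need)
		err = alloc_page_stub();
		if (err)	/* NOT guarded by 'need' */
			return err;
	return 0;
}

/* Fixed shape: braces make the guard cover the error check too. */
static int create_fixed(int need, int stale_err)
{
	int err = stale_err;
	if (need) {
		err = alloc_page_stub();
		if (err)
			return err;
	}
	return 0;
}
```

Tools like Coccinelle (as used for this patch) and compiler warnings about misleading indentation flag exactly this class of mismatch between indentation and control flow.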


Re: [PATCH] KVM: x86: Fix memory leak in vmx.c

2013-04-08 Thread Gleb Natapov
On Thu, Apr 04, 2013 at 12:39:47PM -0700, Andrew Honig wrote:
> If userspace creates and destroys multiple VMs within the same process
> we leak 20k of memory in the userspace process context per VM.  This
> patch frees the memory in kvm_arch_destroy_vm.  If the process exits
> without closing the VM file descriptor or the file descriptor has been
> shared with another process then we don't need to free the memory.
> 
> Messing with user space memory from an fd is not ideal, but other changes
> would require user space changes and this is consistent with how the
> memory is currently allocated.
> 
> Tested: Test ran several VMs and ran against test program meant to 
> demonstrate the leak (www.spinics.net/lists/kvm/msg83734.html).
> 
> Signed-off-by: Andrew Honig 
> 
> ---
>  arch/x86/include/asm/kvm_host.h |3 +++
>  arch/x86/kvm/vmx.c  |3 +++
>  arch/x86/kvm/x86.c  |   11 +++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4979778..975a74d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -553,6 +553,9 @@ struct kvm_arch {
>   struct page *ept_identity_pagetable;
>   bool ept_identity_pagetable_done;
>   gpa_t ept_identity_map_addr;
> + unsigned long ept_ptr;
> + unsigned long apic_ptr;
> + unsigned long tss_ptr;
>  
Better to use __kvm_set_memory_region() with memory_size = 0 to delete
the slot and fix kvm_arch_prepare_memory_region() to unmap if
change == KVM_MR_DELETE.

>   unsigned long irq_sources_bitmap;
>   s64 kvmclock_offset;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 6667042..8aa5d81 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3703,6 +3703,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>   }
>  
>   kvm->arch.apic_access_page = page;
> + kvm->arch.apic_ptr = kvm_userspace_mem.userspace_addr;
>  out:
>   mutex_unlock(&kvm->slots_lock);
>   return r;
> @@ -3733,6 +3734,7 @@ static int alloc_identity_pagetable(struct kvm *kvm)
>   }
>  
>   kvm->arch.ept_identity_pagetable = page;
> + kvm->arch.ept_ptr = kvm_userspace_mem.userspace_addr;
>  out:
>   mutex_unlock(&kvm->slots_lock);
>   return r;
> @@ -4366,6 +4368,7 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned 
> int addr)
>   if (ret)
>   return ret;
>   kvm->arch.tss_addr = addr;
> + kvm->arch.tss_ptr = tss_mem.userspace_addr;
>   if (!init_rmode_tss(kvm))
>   return  -ENOMEM;
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f19ac0a..411ff2a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6812,6 +6812,16 @@ void kvm_arch_sync_events(struct kvm *kvm)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> + if (current->mm == kvm->mm) {
> + /*
> +  * Free pages allocated on behalf of userspace, unless the
> +  * the memory map has changed due to process exit or fd
> +  * copying.
> +  */
Why does mm change during process exit? And what do you mean by fd copying?
One process creates the kvm fd and passes it to another? In this case I think
the leak will still be there, since all of the addresses below are
mapped after the kvm fd is created: apic_access_page and identity_pagetable
during first vcpu creation, and tss when the KVM_SET_TSS_ADDR ioctl is
called. Vcpu creation and the ioctl call can be done by a different process
from the one that created the kvm fd.


> + vm_munmap(kvm->arch.apic_ptr, PAGE_SIZE);
> + vm_munmap(kvm->arch.ept_ptr, PAGE_SIZE);
> + vm_munmap(kvm->arch.tss_ptr, PAGE_SIZE * 3);
> + }
>   kvm_iommu_unmap_guest(kvm);
>   kfree(kvm->arch.vpic);
>   kfree(kvm->arch.vioapic);
> @@ -6929,6 +6939,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>   return PTR_ERR((void *)userspace_addr);
>  
>   memslot->userspace_addr = userspace_addr;
> + mem->userspace_addr = userspace_addr;
>   }
>  
>   return 0;
> -- 
> 1.7.10.4
> 

--
Gleb.


Working of VXLAN

2013-04-08 Thread normal user
Hi all,
I am not sure whether this is the correct list to ask this question; if
not, please redirect me to the right one.

I want to know how I can use the VXLAN driver for KVM guests.
I am looking for the standard setup steps to set up a working environment.

I am not interested in the OVS implementation of VXLAN. I want to use
a standard bridge.
Please help.

Regards,
timber
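Not an authoritative answer, but the usual non-OVS setup is: create a vxlan interface on each host, enslave it to a plain Linux bridge, and attach the guest's tap device to that bridge. A sketch, assuming eth0 is the physical uplink, tap0 is the tap device QEMU/libvirt created for the guest, and VNI 42 / group 239.1.1.1 are arbitrary choices (exact option support depends on your kernel and iproute2 versions):

```shell
# VXLAN interface with VNI 42; the multicast group handles peer
# discovery among hosts attached to the same physical segment.
ip link add vxlan42 type vxlan id 42 group 239.1.1.1 dev eth0

# Plain Linux bridge (bridge-utils) instead of OVS.
brctl addbr br-vx42
brctl addif br-vx42 vxlan42
brctl addif br-vx42 tap0    # the KVM guest's tap device

ip link set vxlan42 up
ip link set br-vx42 up
ip link set tap0 up
```

Repeat on each host with the same VNI and group, and guests on the bridges see each other as if on one L2 segment.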


Re: [PATCH] KVM: VMX: Add missing braces to avoid redundant error check

2013-04-08 Thread Gleb Natapov
On Mon, Apr 08, 2013 at 11:07:46AM +0200, Jan Kiszka wrote:
> The code was already properly aligned, now also add the braces to avoid
Are you saying kvm is not written in Python?

> that err is checked even if alloc_apic_access_page didn't run and change
> it. Found via Coccinelle by Fengguang Wu.
> 
> Signed-off-by: Jan Kiszka 
Applied, thanks.

> ---
>  arch/x86/kvm/vmx.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index cf1aa8f..656b0fa 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6797,10 +6797,11 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm 
> *kvm, unsigned int id)
>   put_cpu();
>   if (err)
>   goto free_vmcs;
> - if (vm_need_virtualize_apic_accesses(kvm))
> + if (vm_need_virtualize_apic_accesses(kvm)) {
>   err = alloc_apic_access_page(kvm);
>   if (err)
>   goto free_vmcs;
> + }
>  
>   if (enable_ept) {
>   if (!kvm->arch.ept_identity_map_addr)
> -- 
> 1.7.3.4

--
Gleb.


Re: [PATCH v2 0/4] KVM minor fixups

2013-04-08 Thread Gleb Natapov
On Fri, Apr 05, 2013 at 07:20:30PM +, Geoff Levand wrote:
> Hi Paolo,
> 
> I fixed up the series as requested.
> 
> -Geoff
> 
> V2:
> o Removed arm patches.
> o Moved kvm_spurious_fault to arch/x86/kvm/x86.c.
> o Fixed commit comments.
> 
> 
> The following changes since commit 07961ac7c0ee8b546658717034fe692fd12eefa9:
> 
>   Linux 3.9-rc5 (2013-03-31 15:12:43 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/geoff/kvm.git for-kvm
> 
> for you to fetch changes up to 753c819ec3c3c55d6ca0eeddb4c6c2a32be7219b:
> 
>   KVM: Move vm_list kvm_lock declarations out of x86 (2013-04-05 11:43:22 
> -0700)
> 
> Geoff Levand (4):
>   KVM: Make local routines static
>   KVM: Move kvm_spurious_fault to x86.c
>   KVM: Move kvm_rebooting declaration out of x86
>   KVM: Move vm_list kvm_lock declarations out of x86
> 
>  arch/x86/include/asm/kvm_host.h |4 
>  arch/x86/kvm/x86.c  |7 +++
>  include/linux/kvm_host.h|5 +
>  virt/kvm/kvm_main.c |   16 
>  4 files changed, 16 insertions(+), 16 deletions(-)
> 
Applied all, thanks.

--
Gleb.


Re: [PATCH 01/11] KVM: nVMX: Stats counters for nVMX

2013-04-08 Thread Gleb Natapov
On Sun, Mar 10, 2013 at 06:03:55PM +0200, Abel Gordon wrote:
> Add new counters to measure how many vmread/vmwrite/vmlaunch/vmresume/vmclear
> instructions were trapped and emulated by L0
> 
Stat counters are deprecated in favor of trace points. Adding a kvmnested
trace system is very welcome, though.

> Signed-off-by: Abel Gordon 
> ---
>  arch/x86/include/asm/kvm_host.h |6 ++
>  arch/x86/kvm/vmx.c  |7 +++
>  arch/x86/kvm/x86.c  |6 ++
>  3 files changed, 19 insertions(+)
> 
> --- .before/arch/x86/include/asm/kvm_host.h   2013-03-10 18:00:54.0 
> +0200
> +++ .after/arch/x86/include/asm/kvm_host.h2013-03-10 18:00:54.0 
> +0200
> @@ -619,6 +619,12 @@ struct kvm_vcpu_stat {
>   u32 hypercalls;
>   u32 irq_injections;
>   u32 nmi_injections;
> + u32 nvmx_vmreads;
> + u32 nvmx_vmwrites;
> + u32 nvmx_vmptrlds;
> + u32 nvmx_vmlaunchs;
> + u32 nvmx_vmresumes;
> + u32 nvmx_vmclears;
>  };
>  
>  struct x86_instruction_info;
> --- .before/arch/x86/kvm/vmx.c2013-03-10 18:00:54.0 +0200
> +++ .after/arch/x86/kvm/vmx.c 2013-03-10 18:00:54.0 +0200
> @@ -5545,6 +5545,7 @@ static int handle_vmclear(struct kvm_vcp
>   struct vmcs12 *vmcs12;
>   struct page *page;
>   struct x86_exception e;
> + ++vcpu->stat.nvmx_vmclears;
>  
>   if (!nested_vmx_check_permission(vcpu))
>   return 1;
> @@ -5601,12 +5602,14 @@ static int nested_vmx_run(struct kvm_vcp
>  /* Emulate the VMLAUNCH instruction */
>  static int handle_vmlaunch(struct kvm_vcpu *vcpu)
>  {
> + ++vcpu->stat.nvmx_vmlaunchs;
>   return nested_vmx_run(vcpu, true);
>  }
>  
>  /* Emulate the VMRESUME instruction */
>  static int handle_vmresume(struct kvm_vcpu *vcpu)
>  {
> + ++vcpu->stat.nvmx_vmresumes;
>  
>   return nested_vmx_run(vcpu, false);
>  }
> @@ -5689,6 +5692,7 @@ static int handle_vmread(struct kvm_vcpu
>   u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
>   gva_t gva = 0;
>  
> + ++vcpu->stat.nvmx_vmreads;
>   if (!nested_vmx_check_permission(vcpu) ||
>   !nested_vmx_check_vmcs12(vcpu))
>   return 1;
> @@ -5741,6 +5745,8 @@ static int handle_vmwrite(struct kvm_vcp
>   u64 field_value = 0;
>   struct x86_exception e;
>  
> + ++vcpu->stat.nvmx_vmwrites;
> +
>   if (!nested_vmx_check_permission(vcpu) ||
>   !nested_vmx_check_vmcs12(vcpu))
>   return 1;
> @@ -5807,6 +5813,7 @@ static int handle_vmptrld(struct kvm_vcp
>   gva_t gva;
>   gpa_t vmptr;
>   struct x86_exception e;
> + ++vcpu->stat.nvmx_vmptrlds;
>  
>   if (!nested_vmx_check_permission(vcpu))
>   return 1;
> --- .before/arch/x86/kvm/x86.c2013-03-10 18:00:54.0 +0200
> +++ .after/arch/x86/kvm/x86.c 2013-03-10 18:00:54.0 +0200
> @@ -145,6 +145,12 @@ struct kvm_stats_debugfs_item debugfs_en
>   { "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
>   { "irq_injections", VCPU_STAT(irq_injections) },
>   { "nmi_injections", VCPU_STAT(nmi_injections) },
> + { "nvmx_vmreads", VCPU_STAT(nvmx_vmreads) },
> + { "nvmx_vmwrites", VCPU_STAT(nvmx_vmwrites) },
> + { "nvmx_vmptrlds", VCPU_STAT(nvmx_vmptrlds) },
> + { "nvmx_vmlaunchs", VCPU_STAT(nvmx_vmlaunchs) },
> + { "nvmx_vmresumes", VCPU_STAT(nvmx_vmresumes) },
> + { "nvmx_vmclears", VCPU_STAT(nvmx_vmclears) },
>   { "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
>   { "mmu_pte_write", VM_STAT(mmu_pte_write) },
>   { "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
> 

--
Gleb.


Re: [PATCH] KVM: VMX: Add missing braces to avoid redundant error check

2013-04-08 Thread Jan Kiszka
On 2013-04-08 11:47, Gleb Natapov wrote:
> On Mon, Apr 08, 2013 at 11:07:46AM +0200, Jan Kiszka wrote:
>> The code was already properly aligned, now also add the braces to avoid
> Are you saying kvm is not written in Python?

# python arch/x86/kvm/vmx.c
  File "arch/x86/kvm/vmx.c", line 1
/*
^
SyntaxError: invalid syntax


Hmm, indeed. That explains our problems...

Jan

> 
>> that err is checked even if alloc_apic_access_page didn't run and change
>> it. Found via Coccinelle by Fengguang Wu.
>>
>> Signed-off-by: Jan Kiszka 
> Applied, thanks.
> 
>> ---
>>  arch/x86/kvm/vmx.c |3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index cf1aa8f..656b0fa 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6797,10 +6797,11 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm 
>> *kvm, unsigned int id)
>>  put_cpu();
>>  if (err)
>>  goto free_vmcs;
>> -if (vm_need_virtualize_apic_accesses(kvm))
>> +if (vm_need_virtualize_apic_accesses(kvm)) {
>>  err = alloc_apic_access_page(kvm);
>>  if (err)
>>  goto free_vmcs;
>> +}
>>  
>>  if (enable_ept) {
>>  if (!kvm->arch.ept_identity_map_addr)
>> -- 
>> 1.7.3.4
> 
> --
>   Gleb.
> 

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


[PATCH 0/7 v3] KVM :PPC: Userspace Debug support

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

This patchset adds the userspace debug support for booke/bookehv.
This is tested on powerpc e500v2/e500mc devices.

We are now assuming that the debug resources will not be used by the
kernel for its own debugging. They will be used only for debugging
user processes. So the kernel debug load interface during
context_to is used to load the debug context for the selected process.

v2->v3
 - We are now assuming that debug resource will not be used by
   kernel for its own debugging.
   It will be used for only kernel user process debugging.
   So the kernel debug load interface during context_to is
   used to load the debug context for that selected process.

v1->v2
 - Debug registers are save/restore in vcpu_put/vcpu_get.
   Earlier the debug registers are saved/restored in guest entry/exit

Bharat Bhushan (7):
  KVM: PPC: debug stub interface parameter defined
  Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
  KVM: extend EMULATE_EXIT_USER to support different exit reasons
  booke: exit to user space if emulator request
  KVM: PPC: exit to user space on "ehpriv" instruction
  powerpc: export debug register save function for KVM
  KVM: PPC: Add userspace debug stub support

 arch/powerpc/include/asm/kvm_host.h  |8 +
 arch/powerpc/include/asm/kvm_ppc.h   |2 +-
 arch/powerpc/include/asm/switch_to.h |4 +
 arch/powerpc/include/uapi/asm/kvm.h  |   37 +
 arch/powerpc/kernel/process.c|3 +-
 arch/powerpc/kvm/book3s.c|6 +
 arch/powerpc/kvm/book3s_emulate.c|4 +-
 arch/powerpc/kvm/book3s_pr.c |4 +-
 arch/powerpc/kvm/booke.c |  239 --
 arch/powerpc/kvm/booke.h |5 +
 arch/powerpc/kvm/e500_emulate.c  |   10 ++
 arch/powerpc/kvm/powerpc.c   |6 -
 12 files changed, 304 insertions(+), 24 deletions(-)




[PATCH 1/7 v3] KVM: PPC: debug stub interface parameter defined

2013-04-08 Thread Bharat Bhushan
This patch defines the interface parameters for KVM_SET_GUEST_DEBUG
ioctl support. Follow-up patches will use this for setting up
hardware breakpoints, watchpoints and software breakpoints.

Also, kvm_arch_vcpu_ioctl_set_guest_debug() is moved one level down,
into the booke and book3s code. This is because I am not sure what is
required for book3s, so this ioctl's behaviour will not change for book3s.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/uapi/asm/kvm.h |   23 +++
 arch/powerpc/kvm/book3s.c   |6 ++
 arch/powerpc/kvm/booke.c|6 ++
 arch/powerpc/kvm/powerpc.c  |6 --
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..c0c38ed 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {
 
 /* for KVM_SET_GUEST_DEBUG */
 struct kvm_guest_debug_arch {
+   struct {
+   /* H/W breakpoint/watchpoint address */
+   __u64 addr;
+   /*
+* Type denotes h/w breakpoint, read watchpoint, write
+* watchpoint or watchpoint (both read and write).
+*/
+#define KVMPPC_DEBUG_NONE  0x0
+#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE   (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ(1UL << 3)
+   __u32 type;
+   __u32 reserved;
+   } bp[16];
 };
 
+/* Debug related defines */
+/*
+ * kvm_guest_debug->control is a 32 bit field. The lower 16 bits are generic
+ * and upper 16 bits are architecture specific. Architecture specific defines
+ * that ioctl is for setting hardware breakpoint or software breakpoint.
+ */
+#define KVM_GUESTDBG_USE_SW_BP 0x0001
+#define KVM_GUESTDBG_USE_HW_BP 0x0002
+
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 };
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 2d32ae4..128ed3a 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -612,6 +612,12 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return 0;
 }
 
+int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
+   struct kvm_guest_debug *dbg)
+{
+   return -EINVAL;
+}
+
 void kvmppc_decrementer_func(unsigned long data)
 {
struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a49a68a..a3e2db0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1526,6 +1526,12 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
return r;
 }
 
+int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
+struct kvm_guest_debug *dbg)
+{
+   return -EINVAL;
+}
+
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 {
return -ENOTSUPP;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 16b4595..716c2d4 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -531,12 +531,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 #endif
 }
 
-int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
-struct kvm_guest_debug *dbg)
-{
-   return -EINVAL;
-}
-
 static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu,
  struct kvm_run *run)
 {
-- 
1.7.0.4




[PATCH 2/7 v3] Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER

2013-04-08 Thread Bharat Bhushan
Instruction emulation returns EMULATE_DO_PAPR when it requires an
exit to userspace on book3s. A similar return is required for booke.
EMULATE_DO_PAPR reads confusingly, so it is renamed to
EMULATE_EXIT_USER.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/asm/kvm_ppc.h |2 +-
 arch/powerpc/kvm/book3s_emulate.c  |2 +-
 arch/powerpc/kvm/book3s_pr.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..8c2c8ef 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -44,7 +44,7 @@ enum emulation_result {
EMULATE_DO_DCR,   /* kvm_run filled with DCR request */
EMULATE_FAIL, /* can't emulate this instruction */
EMULATE_AGAIN,/* something went wrong. go again */
-   EMULATE_DO_PAPR,  /* kvm_run filled with PAPR request */
+   EMULATE_EXIT_USER,/* emulation requires exit to user-space */
 };
 
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 836c569..cdd19d6 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -194,7 +194,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
run->papr_hcall.args[i] = gpr;
}
 
-   emulated = EMULATE_DO_PAPR;
+   emulated = EMULATE_EXIT_USER;
break;
}
 #endif
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 286e23e..b960faf 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -762,7 +762,7 @@ program_interrupt:
run->exit_reason = KVM_EXIT_MMIO;
r = RESUME_HOST_NV;
break;
-   case EMULATE_DO_PAPR:
+   case EMULATE_EXIT_USER:
run->exit_reason = KVM_EXIT_PAPR_HCALL;
vcpu->arch.hcall_needed = 1;
r = RESUME_HOST_NV;
-- 
1.7.0.4




[PATCH 3/7 v3] KVM: extend EMULATE_EXIT_USER to support different exit reasons

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

Currently the instruction emulator code returns EMULATE_EXIT_USER
and common code initializes "run->exit_reason = .." and
"vcpu->arch.hcall_needed = .." with one fixed reason.
But there can be different reasons when the emulator needs to exit
to user space. To support that, the "run->exit_reason = .."
and "vcpu->arch.hcall_needed = .." initialization is moved up a
level into the emulator.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/book3s_emulate.c |2 ++
 arch/powerpc/kvm/book3s_pr.c  |2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index cdd19d6..1f6344c 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -194,6 +194,8 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
run->papr_hcall.args[i] = gpr;
}
 
+   run->exit_reason = KVM_EXIT_PAPR_HCALL;
+   vcpu->arch.hcall_needed = 1;
emulated = EMULATE_EXIT_USER;
break;
}
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index b960faf..c1cffa8 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -763,8 +763,6 @@ program_interrupt:
r = RESUME_HOST_NV;
break;
case EMULATE_EXIT_USER:
-   run->exit_reason = KVM_EXIT_PAPR_HCALL;
-   vcpu->arch.hcall_needed = 1;
r = RESUME_HOST_NV;
break;
default:
-- 
1.7.0.4




[PATCH 4/7 v3] booke: exit to user space if emulator request

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

This allows exiting to user space when the emulator requests it by
returning EMULATE_EXIT_USER. This will be used in subsequent patches
in this series.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/booke.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a3e2db0..97ae158 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -745,6 +745,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
kvmppc_core_queue_program(vcpu, ESR_PIL);
return RESUME_HOST;
 
+   case EMULATE_EXIT_USER:
+   return RESUME_HOST;
+
default:
BUG();
}
-- 
1.7.0.4




[PATCH 5/7 v3] KVM: PPC: exit to user space on "ehpriv" instruction

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

The "ehpriv" instruction is used by user space for setting software
breakpoints. This patch adds support to exit to user space with
"run->debug" carrying the relevant information.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/e500_emulate.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index e78f353..cefdd38 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -26,6 +26,7 @@
 #define XOP_TLBRE   946
 #define XOP_TLBWE   978
 #define XOP_TLBILX  18
+#define XOP_EHPRIV  270
 
 #ifdef CONFIG_KVM_E500MC
 static int dbell2prio(ulong param)
@@ -130,6 +131,15 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
emulated = kvmppc_e500_emul_tlbivax(vcpu, ea);
break;
 
+   case XOP_EHPRIV:
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.address = vcpu->arch.pc;
+   run->debug.arch.status = 0;
+   kvmppc_account_exit(vcpu, DEBUG_EXITS);
+   emulated = EMULATE_EXIT_USER;
+   *advance = 0;
+   break;
+
default:
emulated = EMULATE_FAIL;
}
-- 
1.7.0.4




[PATCH 6/7 v3] powerpc: export debug register save function for KVM

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

KVM needs this function when switching from the vcpu to the user-space
thread. A subsequent patch will use this function.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/asm/switch_to.h |4 
 arch/powerpc/kernel/process.c|3 ++-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 200d763..50b357f 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -30,6 +30,10 @@ extern void enable_kernel_spe(void);
 extern void giveup_spe(struct task_struct *);
 extern void load_up_spe(struct task_struct *);
 
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+extern void switch_booke_debug_regs(struct thread_struct *new_thread);
+#endif
+
 #ifndef CONFIG_SMP
 extern void discard_lazy_cpu_state(void);
 #else
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 59dd545..7b2296b 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -362,12 +362,13 @@ static void prime_debug_regs(struct thread_struct *thread)
  * debug registers, set the debug registers from the values
  * stored in the new thread.
  */
-static void switch_booke_debug_regs(struct thread_struct *new_thread)
+void switch_booke_debug_regs(struct thread_struct *new_thread)
 {
if ((current->thread.dbcr0 & DBCR0_IDM)
|| (new_thread->dbcr0 & DBCR0_IDM))
prime_debug_regs(new_thread);
 }
+EXPORT_SYMBOL(switch_booke_debug_regs);
 #else  /* !CONFIG_PPC_ADV_DEBUG_REGS */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
 static void set_debug_reg_defaults(struct thread_struct *thread)
-- 
1.7.0.4




[PATCH 7/7 v3] KVM: PPC: Add userspace debug stub support

2013-04-08 Thread Bharat Bhushan
From: Bharat Bhushan 

This patch adds the debug stub support on booke/bookehv.
Now the QEMU debug stub can use hardware breakpoints, watchpoints and
software breakpoints to debug the guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get().
Also, the debug registers are saved/restored only if the guest
is using debug resources.

Currently we do not support emulating debug resources for the guest,
so we always exit to user space irrespective of whether user space is
expecting the debug exception or not. This is an unexpected event, so
we leave the decision on how to act to user space. This is similar to
the behaviour before this patch; the only difference is that proper
exit state is now available to user space.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/asm/kvm_host.h |8 +
 arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
 arch/powerpc/kvm/booke.c|  242 ---
 arch/powerpc/kvm/booke.h|5 +
 4 files changed, 255 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..b9ad20f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -505,7 +505,15 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 epr;
u32 crit_save;
+
+   /* Flag indicating that debug registers are used by guest */
+   bool debug_active;
+   /* for save/restore thread->dbcr0 on vcpu run/heavyweight_exit */
+   u32 saved_dbcr0;
+   /* guest debug registers*/
struct kvmppc_booke_debug_reg dbg_reg;
+   /* shadow debug registers */
+   struct kvmppc_booke_debug_reg shadow_dbg_reg;
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c0c38ed..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in  */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG
 
 struct kvm_regs {
__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
 };
 
+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as "type" in KVM_SET_GUEST_DEBUG ioctl and "status"
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE  0x0
+#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE   (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ(1UL << 3)
 struct kvm_debug_exit_arch {
+   __u64 address;
+   /*
+* exiting to userspace because of h/w breakpoint, watchpoint
+* (read, write or both) and software breakpoint.
+*/
+   __u32 status;
+   __u32 reserved;
 };
 
 /* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
-#define KVMPPC_DEBUG_NONE  0x0
-#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE   (1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ(1UL << 3)
__u32 type;
__u32 reserved;
} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 97ae158..0e93416 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+   /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+   vcpu->arch.shadow_msr &= ~MSR_DE;
+   vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+   /* Force enable debug interrupts when user space wants to debug */
+   if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+   /*
+* Since there is no shadow MSR, sync MSR_DE into the guest
+* visible MSR.
+*/
+   vcpu->arch.shared->msr |= MSR_DE;
+#else
+   vcpu->arch.shadow_msr |= MSR_DE;
+   vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+   }
+}
+
 /*
  * Helper function for "full" MSR writes.  No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
kvmppc_mmu_msr_notify(vcpu, old_msr);
kvmppc_vcpu_sync_spe(vcpu);
kvmppc_vcpu_sync_fpu(vcpu);
+   kvmppc_vcpu_sync_debug(vcpu);
 }
 
 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
@@ -646,6 +670,46 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
return r;
 }
 
+static void kvmppc_load_userspace_debug(void)
+{
+   switch_booke_debug_regs(&current->thread);
+}
+
+static void kvmp

Re: [RFC PATCH v3 1/6] kvm: add device control API

2013-04-08 Thread Gleb Natapov
On Thu, Apr 04, 2013 at 06:47:45PM -0500, Scott Wood wrote:
> On 04/04/2013 05:41:35 AM, Gleb Natapov wrote:
> >On Tue, Apr 02, 2013 at 08:57:48PM -0500, Scott Wood wrote:
> >> +struct kvm_device_attr {
> >> +  __u32   flags;  /* no flags currently defined */
> >> +  __u32   group;  /* device-defined */
> >> +  __u64   attr;   /* group-defined */
> >> +  __u64   addr;   /* userspace address of attr data */
> >> +};
> >> +
> >Since now each device has its own fd is it an advantage to enforce
> >common interface between different devices?
> 
> I think so, even if only to avoid repeating the various pains
> surrounding adding ioctls.  Not necessarily "enforce", just enable.
> If a device has some sort of command that does not fit neatly into
> the "set or get" model, it could still add a new ioctl.
> 
Make sense.

> >If we do so though why not handle file creation, ioctl and file
> >descriptor lifetime in the
> >common code. Common code will have "struct kvm_device" with "struct
> >kvm_device_arch" and "struct kvm_device_ops" members. Instead of
> >kvm_mpic_ioctl there will be kvm_device_ioctl which will despatch
> >ioctls
> >to a device using kvm_device->ops->(set|get|has)_attr pointers.
> 
> So make it more like the pre-fd version, except for the actual fd
> usage?  It would make destruction a bit simpler (assuming there's no
> need for vcpu destruction code to access a device).  Hopefully
> nobody asks me to change it back again, though. :-)
> 

--
Gleb.


Re: [RFC PATCH v3 1/6] kvm: add device control API

2013-04-08 Thread Gleb Natapov
On Fri, Apr 05, 2013 at 12:02:06PM +1100, Paul Mackerras wrote:
> On Thu, Apr 04, 2013 at 01:41:35PM +0300, Gleb Natapov wrote:
> 
> > Since now each device has its own fd is it an advantage to enforce
> > common interface between different devices? If we do so though why
> > not handle file creation, ioctl and file descriptor lifetime in the
> > common code. Common code will have "struct kvm_device" with "struct
> > kvm_device_arch" and "struct kvm_device_ops" members. Instead of
> > kvm_mpic_ioctl there will be kvm_device_ioctl which will despatch ioctls
> > to a device using kvm_device->ops->(set|get|has)_attr pointers.
> 
> I thought about making the same request, but when I looked at it, the
> amount of code that could be made common in this way is pretty tiny,
> and doing that involves a bit of extra complexity, so I thought that
> on the whole it wouldn't be worthwhile.
> 
The value of doing so is not only in making some code common, but also
moving fd lifetime management into the common code where it can be
debugged once and for all potential users. I also expect the amount of
shared code to grow as the interface is used by more architectures.

--
Gleb.


Re: [RFC PATCH v3 5/6] kvm/ppc/mpic: in-kernel MPIC emulation

2013-04-08 Thread Gleb Natapov
On Thu, Apr 04, 2013 at 06:33:38PM -0500, Scott Wood wrote:
> On 04/04/2013 12:59:02 AM, Gleb Natapov wrote:
> >On Wed, Apr 03, 2013 at 03:58:04PM -0500, Scott Wood wrote:
> >> KVM_DEV_MPIC_* could go elsewhere if you want to avoid cluttering
> >> the main kvm.h.  The arch header would be OK, since the non-arch
> >> header includes the arch header, and thus it wouldn't be visible to
> >> userspace where it is -- if there later is a need for MPIC (or
> >> whatever other device follows MPIC's example) on another
> >> architecture, it could be moved without breaking anything.  Or, we
> >> could just have a header for each device type.
> >>
> >If device will be used by more than one arch it will move into
> >virt/kvm
> >and will have its own header, like ioapic.
> 
> virt/kvm/ioapic.h is not uapi.  The ioapic uapi component (e.g.
> struct kvm_ioapic_state) is duplicated between x86 and ia64, which
> is the sort of thing I'd like to avoid.  I'm OK with putting it in
> the PPC header if, upon a later need for multi-architecture support,
> it could move into either the main uapi header or a separate uapi
> header that the main uapi header includes (i.e. no userspace-visible
> change in which header needs to be included).
> 
Agreed, it makes sense to have a separate uapi header for a device that
is used by more than one arch.

--
Gleb.


Re: Virtualbox svga card in KVM

2013-04-08 Thread Stefan Hajnoczi
On Fri, Apr 05, 2013 at 04:52:05PM -0700, Sriram Murthy wrote:
> For starters, VirtualBox has better SVGA WDDM drivers that allow for a much
> richer display when the VM display is local.

What does "much richer display" mean?

Stefan


Re: Reply: [Qemu-devel] qemu crashed when starting vm(kvm) with vnc connect

2013-04-08 Thread Stefan Hajnoczi
On Sun, Apr 07, 2013 at 04:58:07AM +, Zhanghaoyu (A) wrote:
> >> I start a kvm VM with vnc(using the zrle protocol) connect, sometimes qemu 
> >> program crashed during starting period, received signal SIGABRT.
> >> Trying about 20 times, this crash may be reproduced.
> >> I guess the cause is memory corruption or a double free.
> >
> > Which version of QEMU are you running?
> > 
> > Please try qemu.git/master.
> > 
> > Stefan
> 
> I used the QEMU download from qemu.git (http://git.qemu.org/git/qemu.git).

Great, thanks!  Can you please post a backtrace?

The easiest way is:

  $ ulimit -c unlimited
  $ qemu-system-x86_64 -enable-kvm -m 1024 ...
  ...crash...
  $ gdb -c qemu-system-x86_64.core
  (gdb) bt

Depending on how your system is configured the core file might have a
different filename, but there should be a file named *core* in the
current working directory after the crash.

The backtrace will make it possible to find out where the crash
occurred.

Thanks,
Stefan


Re: [PATCH 03/11] KVM: nVMX: Detect shadow-vmcs capability

2013-04-08 Thread Gleb Natapov
On Sun, Mar 10, 2013 at 06:04:55PM +0200, Abel Gordon wrote:
> Add logic required to detect if shadow-vmcs is supported by the
> processor. Introduce a new kernel module parameter to specify if L0 should use
> shadow vmcs (or not) to run L1.
> 
> Signed-off-by: Abel Gordon 
> ---
>  arch/x86/kvm/vmx.c |   25 -
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> --- .before/arch/x86/kvm/vmx.c2013-03-10 18:00:54.0 +0200
> +++ .after/arch/x86/kvm/vmx.c 2013-03-10 18:00:54.0 +0200
> @@ -86,6 +86,8 @@ module_param(fasteoi, bool, S_IRUGO);
>  
>  static bool __read_mostly enable_apicv_reg_vid;
>  
> +static bool __read_mostly enable_shadow_vmcs = 1;
> +module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO);
>  /*
>   * If nested=1, nested virtualization is supported, i.e., guests may use
>   * VMX and be a hypervisor for its own guests. If nested=0, guests may not
> @@ -895,6 +897,18 @@ static inline bool cpu_has_vmx_wbinvd_ex
>   SECONDARY_EXEC_WBINVD_EXITING;
>  }
>  
> +static inline bool cpu_has_vmx_shadow_vmcs(void)
> +{
> + u64 vmx_msr;
> + rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
> + /* check if the cpu supports writing r/o exit information fields */
> + if (!(vmx_msr & (1u << 29)))
Define please.

> + return false;
> +
> + return vmcs_config.cpu_based_2nd_exec_ctrl &
> + SECONDARY_EXEC_SHADOW_VMCS;
> +}
> +
>  static inline bool report_flexpriority(void)
>  {
>   return flexpriority_enabled;
> @@ -2582,7 +2596,8 @@ static __init int setup_vmcs_config(stru
>   SECONDARY_EXEC_RDTSCP |
>   SECONDARY_EXEC_ENABLE_INVPCID |
>   SECONDARY_EXEC_APIC_REGISTER_VIRT |
> - SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
> + SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> + SECONDARY_EXEC_SHADOW_VMCS;
>   if (adjust_vmx_controls(min2, opt2,
>   MSR_IA32_VMX_PROCBASED_CTLS2,
>   &_cpu_based_2nd_exec_control) < 0)
> @@ -2771,6 +2786,8 @@ static __init int hardware_setup(void)
>  
>   if (!cpu_has_vmx_vpid())
>   enable_vpid = 0;
> + if (!cpu_has_vmx_shadow_vmcs())
> + enable_shadow_vmcs = 0;
>  
>   if (!cpu_has_vmx_ept() ||
>   !cpu_has_vmx_ept_4levels()) {
> @@ -3982,6 +3999,12 @@ static u32 vmx_secondary_exec_control(st
>   exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
> SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
>   exec_control &= ~SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
> + /* SECONDARY_EXEC_SHADOW_VMCS is enabled when L1 executes VMPTRLD
> +(handle_vmptrld).
> +We can NOT enable shadow_vmcs here because we don't have yet
> +a current VMCS12
> + */
> + exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
>   return exec_control;
>  }
>  
> 

--
Gleb.


RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 01:16:51PM +, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 01:05:02PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 12:39:32PM +, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 02:30:15AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2013-04-04:
> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |    9 +
>>  arch/x86/kvm/lapic.h |    2 ++
>>  virt/kvm/ioapic.c    |   43 +++
>>  virt/kvm/ioapic.h    |    1 +
>>  4 files changed, 55 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 96ab160..9c041fa 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>  apic->highest_isr_cache = -1;
>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_rtc_irq_restore(vcpu);
>>  }
>>  
>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 967519c..004d2ad 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>  return vcpu->arch.apic->pending_events;
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> +
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 8664812..0b12b17 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.pending_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, pending_eoi = 0;
>> +
>> +if (RTC_GSI >= IOAPIC_NUM_PINS)
>> +return;
>> +
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +pending_eoi++;
>> +__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
> You should clear dest_map at the beginning to get rid of stale bits.
 I thought kvm_set_ioapic is called only after save/restore or
 migration. And the ioapic should be reset successfully before
 calling it. So the dest_map is empty before calling rtc_irq_restore().
 But it is possible kvm_set_ioapic is called outside of save/restore or
 migration. Right?
 
>>> First of all, userspace should not care when it calls kvm_set_ioapic();
>>> the kernel needs to do the right thing. Second, believe it or not,
>>> kvm_ioapic_reset() is not called during system reset. Instead userspace
>>> resets it by calling kvm_set_ioapic() with the ioapic state after reset.
>> Ok. I see. As the logic you suggested, it will clear dest_map if no
>> pending eoi in vcpu, so we don't need to do it again.
>> 
> You again rely on userspace doing things in a certain manner. What if
> set_lapic() is never called? Kernel internal state has to be correct
> after each ioctl call.
 Sorry. I cannot figure out what the problem is if we don't clear dest_map.
 Can you elaborate?
 
>>> What is not obvious about it? If there is a b

Re: [PATCH 04/11] KVM: nVMX: Introduce vmread and vmwrite bitmaps

2013-04-08 Thread Gleb Natapov
On Sun, Mar 10, 2013 at 06:05:25PM +0200, Abel Gordon wrote:
> Prepare vmread and vmwrite bitmaps according to a pre-specified list of 
> fields.
> These lists are intended to specify the most frequently accessed fields so we
> can minimize the number of fields that are copied from/to the software
> controlled VMCS12 format to/from the processor-specific shadow vmcs. The lists
> were built by measuring the VMCS field access rate after L2 Ubuntu 12.04
> booted while running on top of L1 KVM, also Ubuntu 12.04. Note that during
> boot there were additional fields which were frequently modified, but they
> were not added to these lists because after boot they were no longer accessed
> by L1.
> 
> Signed-off-by: Abel Gordon 
> ---
>  arch/x86/kvm/vmx.c |   75 ++-
>  1 file changed, 74 insertions(+), 1 deletion(-)
> 
> --- .before/arch/x86/kvm/vmx.c2013-03-10 18:00:54.0 +0200
> +++ .after/arch/x86/kvm/vmx.c 2013-03-10 18:00:54.0 +0200
> @@ -453,6 +453,51 @@ static inline struct vcpu_vmx *to_vmx(st
>  #define FIELD64(number, name)[number] = VMCS12_OFFSET(name), \
>   [number##_HIGH] = VMCS12_OFFSET(name)+4
>  
> +
> +static const unsigned long shadow_read_only_fields[] = {
> + VM_EXIT_REASON,
> + VM_EXIT_INTR_INFO,
> + VM_EXIT_INSTRUCTION_LEN,
> + IDT_VECTORING_INFO_FIELD,
> + IDT_VECTORING_ERROR_CODE,
> + VM_EXIT_INTR_ERROR_CODE,
> + EXIT_QUALIFICATION,
> + GUEST_LINEAR_ADDRESS,
> + GUEST_PHYSICAL_ADDRESS
> +};
> +static const int max_shadow_read_only_fields = 
> ARRAY_SIZE(shadow_read_only_fields);
> +
> +static const unsigned long shadow_read_write_fields[] = {
> + GUEST_RIP,
> + GUEST_RSP,
> + GUEST_CR0,
> + GUEST_CR3,
> + GUEST_CR4,
> + GUEST_INTERRUPTIBILITY_INFO,
> + GUEST_RFLAGS,
> + GUEST_CS_SELECTOR,
> + GUEST_CS_AR_BYTES,
> + GUEST_CS_LIMIT,
> + GUEST_CS_BASE,
> + GUEST_ES_BASE,
> + CR0_GUEST_HOST_MASK,
> + CR0_READ_SHADOW,
> + CR4_READ_SHADOW,
> + TSC_OFFSET,
> + EXCEPTION_BITMAP,
> + CPU_BASED_VM_EXEC_CONTROL,
> + VM_ENTRY_EXCEPTION_ERROR_CODE,
> + VM_ENTRY_INTR_INFO_FIELD,
> + VM_ENTRY_INSTRUCTION_LEN,
> + VM_ENTRY_EXCEPTION_ERROR_CODE,
> + HOST_FS_BASE,
> + HOST_GS_BASE,
> + HOST_FS_SELECTOR,
> + HOST_GS_SELECTOR
> +};
> +static const int max_shadow_read_write_fields =
> + ARRAY_SIZE(shadow_read_write_fields);
> +
>  static const unsigned short vmcs_field_to_offset_table[] = {
>   FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
>   FIELD(GUEST_ES_SELECTOR, guest_es_selector),
> @@ -642,6 +687,8 @@ static unsigned long *vmx_msr_bitmap_leg
>  static unsigned long *vmx_msr_bitmap_longmode;
>  static unsigned long *vmx_msr_bitmap_legacy_x2apic;
>  static unsigned long *vmx_msr_bitmap_longmode_x2apic;
> +static unsigned long *vmx_vmread_bitmap;
> +static unsigned long *vmx_vmwrite_bitmap;
>  
>  static bool cpu_has_load_ia32_efer;
>  static bool cpu_has_load_perf_global_ctrl;
> @@ -4033,6 +4080,8 @@ static int vmx_vcpu_setup(struct vcpu_vm
>   vmcs_write64(IO_BITMAP_A, __pa(vmx_io_bitmap_a));
>   vmcs_write64(IO_BITMAP_B, __pa(vmx_io_bitmap_b));
>  
> + vmcs_write64(VMREAD_BITMAP, __pa(vmx_vmread_bitmap));
> + vmcs_write64(VMWRITE_BITMAP, __pa(vmx_vmwrite_bitmap));
Why are you doing it without checking that shadow vmcs is supported and
enabled?

>   if (cpu_has_vmx_msr_bitmap())
>   vmcs_write64(MSR_BITMAP, __pa(vmx_msr_bitmap_legacy));
>  
> @@ -7764,6 +7813,24 @@ static int __init vmx_init(void)
>   (unsigned long *)__get_free_page(GFP_KERNEL);
>   if (!vmx_msr_bitmap_longmode_x2apic)
>   goto out4;
> + vmx_vmread_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
> + if (!vmx_vmread_bitmap)
> + goto out4;
> +
> + vmx_vmwrite_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
> + if (!vmx_vmwrite_bitmap)
> + goto out5;
> +
We need to clean up these bitmap allocations some day to allocate only when
the feature is supported and used.

> + memset(vmx_vmread_bitmap, 0xff, PAGE_SIZE);
> + memset(vmx_vmwrite_bitmap, 0xff, PAGE_SIZE);
> + /* shadowed read/write fields */
> + for (i = 0; i < max_shadow_read_write_fields; i++) {
> + clear_bit(shadow_read_write_fields[i], vmx_vmwrite_bitmap);
> + clear_bit(shadow_read_write_fields[i], vmx_vmread_bitmap);
> + }
> + /* shadowed read only fields */
> + for (i = 0; i < max_shadow_read_only_fields; i++)
> + clear_bit(shadow_read_only_fields[i], vmx_vmread_bitmap);
>  
>   /*
>* Allow direct access to the PC debug port (it is often used for I/O
> @@ -7782,7 +7849,7 @@ static int __init vmx_init(void)
>   r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
>__alignof__(struct vcpu_vmx), THIS_MO

Re: [Qemu-devel] [PATCH uq/master v2 1/2] kvm: reset state from the CPU's reset method

2013-04-08 Thread Gleb Natapov
On Tue, Apr 02, 2013 at 04:29:32PM +0300, Gleb Natapov wrote:
> >  static void kvm_sw_tlb_put(PowerPCCPU *cpu)
> >  {
> >  CPUPPCState *env = &cpu->env;
> > diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
> > index 23fe51f..6321384 100644
> > --- a/target-s390x/cpu.c
> > +++ b/target-s390x/cpu.c
> > @@ -84,6 +84,10 @@ static void s390_cpu_reset(CPUState *s)
> >   * after incrementing the cpu counter */
> >  #if !defined(CONFIG_USER_ONLY)
> >  s->halted = 1;
> > +
> > +if (kvm_enabled()) {
> > +kvm_arch_reset_vcpu(s);
> Does this compile with kvm support disabled?
> 
Well, it does not:
  CC    s390x-softmmu/target-s390x/cpu.o
/users/gleb/work/qemu/target-s390x/cpu.c: In function 's390_cpu_reset':
/users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: implicit
declaration of function 'kvm_arch_reset_vcpu'
[-Werror=implicit-function-declaration]
/users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: nested extern
declaration of 'kvm_arch_reset_vcpu' [-Werror=nested-externs]
cc1: all warnings being treated as errors

I wonder if it is portable between compilers to rely on code in if (0) {}
being dropped at all levels of optimization.

--
Gleb.


reply: reply: [Qemu-devel] qemu crashed when starting vm(kvm) with vnc connect

2013-04-08 Thread Zhanghaoyu (A)
On Sun, Apr 07, 2013 at 04:58:07AM +, Zhanghaoyu (A) wrote:
> >>> I start a kvm VM with vnc(using the zrle protocol) connect, sometimes 
> >>> qemu program crashed during starting period, received signal SIGABRT.
> >>> Trying about 20 times, this crash may be reproduced.
> >>> I guess the cause is memory corruption or a double free.
> >>
> >> Which version of QEMU are you running?
> >> 
> >> Please try qemu.git/master.
> >> 
> >> Stefan
> >
> >I used the QEMU download from qemu.git (http://git.qemu.org/git/qemu.git).
>
> Great, thanks!  Can you please post a backtrace?
> 
> The easiest way is:
> 
>  $ ulimit -c unlimited
>  $ qemu-system-x86_64 -enable-kvm -m 1024 ...
>  ...crash...
>  $ gdb -c qemu-system-x86_64.core
>  (gdb) bt
> 
> Depending on how your system is configured the core file might have a 
> different filename but there should be a file name *core* the current working 
> directory
after the crash.
> 
> The backtrace will make it possible to find out where the crash occurred.
> 
> Thanks,
> Stefan

The backtrace from the core file is shown below:

Program received signal SIGABRT, Aborted.
0x7f32eda3dd95 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x7f32eda3dd95 in raise () from /lib64/libc.so.6
#1  0x7f32eda3f2ab in abort () from /lib64/libc.so.6
#2  0x7f32eda77ece in __libc_message () from /lib64/libc.so.6
#3  0x7f32eda7dc06 in malloc_printerr () from /lib64/libc.so.6
#4  0x7f32eda7ecda in _int_free () from /lib64/libc.so.6
#5  0x7f32efd3452c in free_and_trace (mem=0x7f329cd0) at vl.c:2880
#6  0x7f32efd251a1 in buffer_free (buffer=0x7f32f0c82890) at ui/vnc.c:505
#7  0x7f32efd20c56 in vnc_zrle_clear (vs=0x7f32f0c762d0)
at ui/vnc-enc-zrle.c:364
#8  0x7f32efd26d07 in vnc_disconnect_finish (vs=0x7f32f0c762d0)
at ui/vnc.c:1050
#9  0x7f32efd275c5 in vnc_client_read (opaque=0x7f32f0c762d0)
at ui/vnc.c:1349
#10 0x7f32efcb397c in qemu_iohandler_poll (readfds=0x7f32f074d020,
writefds=0x7f32f074d0a0, xfds=0x7f32f074d120, ret=1) at iohandler.c:124
#11 0x7f32efcb46e8 in main_loop_wait (nonblocking=0) at main-loop.c:417
#12 0x7f32efd31159 in main_loop () at vl.c:2133
#13 0x7f32efd38070 in main (argc=46, argv=0x7fff7f5df178,
envp=0x7fff7f5df2f0) at vl.c:4481

Zhang Haoyu


Monitoring MMIO to PCI Passthrough devices?

2013-04-08 Thread Andre Richter
Hi all,

I'm quite new to KVM/QEMU internals.
On recent x86 setups (Sandy/Ivy Bridge with vt-x and vt-d), if I
attach a PCI device via PCI-Passthrough to a VM,
I can directly do MMIO with the device's registers or whatever hides
behind its BAR addresses.

I wonder if there is a way for the Host/VMM to monitor/trap guest
access to such areas.
I think this is not what PCI passthrough is intended for, because it
wants to get the host out of the way for I/O with the device.
But what if a guest goes nuts and starts to flood the interconnect
with useless transactions to the PCI device?

I would be very thankful for some hints / pointers :)

Greetings,
Andre


Re: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Gleb Natapov
On Mon, Apr 08, 2013 at 11:21:34AM +, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-04-07:
> > On Sun, Apr 07, 2013 at 01:16:51PM +, Zhang, Yang Z wrote:
> >> Gleb Natapov wrote on 2013-04-07:
> >>> On Sun, Apr 07, 2013 at 01:05:02PM +, Zhang, Yang Z wrote:
>  Gleb Natapov wrote on 2013-04-07:
> > On Sun, Apr 07, 2013 at 12:39:32PM +, Zhang, Yang Z wrote:
> >> Gleb Natapov wrote on 2013-04-07:
> >>> On Sun, Apr 07, 2013 at 02:30:15AM +, Zhang, Yang Z wrote:
>  Gleb Natapov wrote on 2013-04-04:
> > On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
> >> From: Yang Zhang 
> >> 
> >> Signed-off-by: Yang Zhang 
> >> ---
> >>  arch/x86/kvm/lapic.c |9 + arch/x86/kvm/lapic.h | 2
> >>  ++ virt/kvm/ioapic.c|   43
> >>  +++ virt/kvm/ioapic.h
> >>  | 1 + 4 files changed, 55 insertions(+), 0 deletions(-)
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> index 96ab160..9c041fa 100644
> >> --- a/arch/x86/kvm/lapic.c
> >> +++ b/arch/x86/kvm/lapic.c
> >> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void
> > *bitmap)
> >>return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> >>  }
> >> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
> >> +{
> >> +  struct kvm_lapic *apic = vcpu->arch.apic;
> >> +
> >> +  return apic_test_vector(vector, apic->regs + APIC_ISR) ||
> >> +  apic_test_vector(vector, apic->regs + APIC_IRR);
> >> +}
> >> +
> >>  static inline void apic_set_vector(int vec, void *bitmap)
> >>  {
> >>set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> >> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct
> >>> kvm_vcpu
> > *vcpu,
> >>apic->highest_isr_cache = -1;
> >>kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
> >>  apic_find_highest_isr(apic)); kvm_make_request(KVM_REQ_EVENT,
> >>  vcpu); +  kvm_rtc_irq_restore(vcpu); }
> >>  
> >>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
> >> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> >> index 967519c..004d2ad 100644
> >> --- a/arch/x86/kvm/lapic.h
> >> +++ b/arch/x86/kvm/lapic.h
> >> @@ -170,4 +170,6 @@ static inline bool
> > kvm_apic_has_events(struct
> > kvm_vcpu *vcpu)
> >>return vcpu->arch.apic->pending_events;
> >>  }
> >> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
> >> +
> >>  #endif
> >> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> >> index 8664812..0b12b17 100644
> >> --- a/virt/kvm/ioapic.c
> >> +++ b/virt/kvm/ioapic.c
> >> @@ -90,6 +90,47 @@ static unsigned long
> > ioapic_read_indirect(struct
> > kvm_ioapic *ioapic,
> >>return result;
> >>  }
> >> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
> >> +{
> >> +  ioapic->rtc_status.pending_eoi = 0;
> >> +  bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
> >> +}
> >> +
> >> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
> >> +{
> >> +  struct kvm_vcpu *vcpu;
> >> +  int vector, i, pending_eoi = 0;
> >> +
> >> +  if (RTC_GSI >= IOAPIC_NUM_PINS)
> >> +  return;
> >> +
> >> +  vector = ioapic->redirtbl[RTC_GSI].fields.vector;
> >> +  kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
> >> +  if (kvm_apic_pending_eoi(vcpu, vector)) {
> >> +  pending_eoi++;
> >> +  __set_bit(vcpu->vcpu_id,
> >>> ioapic->rtc_status.dest_map);
> You should clear dest_map at the beginning to get rid of stale bits.
>  I thought kvm_set_ioapic is called only after save/restore or 
>  migration.
> >>> And
> > the
> >>> ioapic should be reset successfully before calling it. So the
> >>> dest_map is empty before calling rtc_irq_restore().
> But it is possible kvm_set_ioapic is called besides save/restore or
>  migration. Right?
>  
> >>> First of all, userspace should not care when it calls kvm_set_ioapic();
> >>> the kernel needs to do the right thing. Second, believe it or not,
> >>> kvm_ioapic_reset() is not called during system reset. Instead userspace
> >>> resets it by calling kvm_set_ioapic() with the ioapic state after reset.
> >> Ok, I see. With the logic you suggested, it will clear dest_map if there
> >> is no pending EOI in the vcpu, so we don't need to do it again.
> >> 
> > You again rely on userspace doing things in a certain manner. What is
> > set_lapic() is 

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-08:
> On Mon, Apr 08, 2013 at 11:21:34AM +, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 01:16:51PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 01:05:02PM +, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 12:39:32PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 02:30:15AM +, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-04:
>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
 From: Yang Zhang 
 
 Signed-off-by: Yang Zhang 
 ---
  arch/x86/kvm/lapic.c |9 + arch/x86/kvm/lapic.h |
  2 ++ virt/kvm/ioapic.c|   43
  +++
  virt/kvm/ioapic.h | 1 + 4 files changed, 55 insertions(+), 0
  deletions(-)
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index 96ab160..9c041fa 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec,
> void
>>> *bitmap)
return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
  }
 +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
 +{
 +  struct kvm_lapic *apic = vcpu->arch.apic;
 +
 +  return apic_test_vector(vector, apic->regs + APIC_ISR) ||
 +  apic_test_vector(vector, apic->regs + APIC_IRR);
 +}
 +
  static inline void apic_set_vector(int vec, void *bitmap)
  {
set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
 @@ -1665,6 +1673,7 @@ void
> kvm_apic_post_state_restore(struct
> kvm_vcpu
>>> *vcpu,
apic->highest_isr_cache = -1;
kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
  apic_find_highest_isr(apic));
kvm_make_request(KVM_REQ_EVENT, vcpu);
  + kvm_rtc_irq_restore(vcpu); }
  
  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
 diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
 index 967519c..004d2ad 100644
 --- a/arch/x86/kvm/lapic.h
 +++ b/arch/x86/kvm/lapic.h
 @@ -170,4 +170,6 @@ static inline bool
>>> kvm_apic_has_events(struct
>>> kvm_vcpu *vcpu)
return vcpu->arch.apic->pending_events;
  }
 +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 +
  #endif
 diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
 index 8664812..0b12b17 100644
 --- a/virt/kvm/ioapic.c
 +++ b/virt/kvm/ioapic.c
 @@ -90,6 +90,47 @@ static unsigned long
>>> ioapic_read_indirect(struct
>>> kvm_ioapic *ioapic,
return result;
  }
 +static void rtc_irq_reset(struct kvm_ioapic *ioapic) +{
 +  ioapic->rtc_status.pending_eoi = 0;
 +  bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS); +}
 + +static void rtc_irq_restore(struct kvm_ioapic *ioapic) +{
 +  struct kvm_vcpu *vcpu; +int vector, i, pending_eoi = 0; 
 +
 +  if (RTC_GSI >= IOAPIC_NUM_PINS) +   return; + + 
 vector =
 ioapic->redirtbl[RTC_GSI].fields.vector;
 +  kvm_for_each_vcpu(i, vcpu, ioapic->kvm) { + if
 (kvm_apic_pending_eoi(vcpu, vector)) { +   
 pending_eoi++;
 +  __set_bit(vcpu->vcpu_id,
> ioapic->rtc_status.dest_map);
>>> You should clear dest_map at the beginning to get rid of stale bits.
>> I thought kvm_set_ioapic is called only after save/restore or
> migration.
> And
>>> the
> ioapic should be reset successfully before calling it. So the
> dest_map is empty before calling rtc_irq_restore().
>> But it is possible kvm_set_ioapic is called besides save/restore or
>> migration. Right?
>> 
> First of all, userspace should not care when it calls
> kvm_set_ioapic(); the kernel needs to do the right thing. Second,
> believe it or not, kvm_ioapic_reset() is not called during
> system reset. Instead userspace resets it by calling
> kvm_set_ioapic() with the ioapic state after reset.
 Ok, I see. With the logic you suggested, it will clear dest_map if there
 is no pending EOI in the vcpu, so we don't need to do it again.
 
>>> You again rely on userspace doing things in a certain manner. What is
>>> set_lapic() is ne

Re: [Qemu-devel] [PATCH uq/master v2 1/2] kvm: reset state from the CPU's reset method

2013-04-08 Thread Paolo Bonzini
On 08/04/2013 14:19, Gleb Natapov wrote:
>> > Does this compile with kvm support disabled?

Oops, sorry, I thought I had replied to this email (with "hmm, let me
check").

> Well, it does not:
>   CCs390x-softmmu/target-s390x/cpu.o
> /users/gleb/work/qemu/target-s390x/cpu.c: In function 's390_cpu_reset':
> /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: implicit
> declaration of function 'kvm_arch_reset_vcpu'
> [-Werror=implicit-function-declaration]
> /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: nested extern
> declaration of 'kvm_arch_reset_vcpu' [-Werror=nested-externs]
> cc1: all warnings being treated as errors
> 
> I wonder if it is portable between compilers to rely on code in if(0){} to
> be dropped in all levels of optimizations.

It generally is okay to assume it (I think early GCC 3.x releases had no
-O0 dead-code optimization, but it was a long time ago).  However:

* in QEMU only some files have kvm_enabled() as 0 when KVM is disabled.
  Files that are shared among multiple targets have it defined to
  kvm_allowed. This is not the problem here.

* you still need to define the prototypes for anything you call, of course.

Paolo


KVM call agenda for 2013-04-09

2013-04-08 Thread Juan Quintela

Hi

Please send in any agenda topics you are interested in.

Later, Juan.


Re: Monitoring MMIO to PCI Passthrough devices?

2013-04-08 Thread Alex Williamson
On Mon, 2013-04-08 at 14:34 +0200, Andre Richter wrote:
> Hi all,
> 
> I'm quite new to KVM/QEMU internals.
> On recent x86 setups (Sandy/Ivy Bridge with vt-x and vt-d), if I
> attach a PCI device via PCI-Passthrough to a VM,
> I can directly do MMIO with the device's registers or whatever hides
> behind its BAR addresses.
> 
> I wonder if there is a way for the Host/VMM to monitor/trap guest
> access to such areas.
> I think this is not what PCI passthrough is intended for, because it
> wants to get the host out of the way for I/O with the device.
> But what if a guest goes nuts and starts to flood the interconnect
> with useless transactions to the PCI device?
> 
> I would be very thankful for some hints / pointers :)

This is exactly how we debug devices that don't work with PCI
passthrough.  If you use vfio-pci to do PCI assignment (recommended) I
just added code to make it easy to turn this on.  Use the latest
qemu.git and edit hw/vfio_pci.c.  Uncomment /* #define DEBUG_VFIO */ and
change "#define VFIO_ALLOW_MMAP 1" to 0 and rebuild.  All accesses to
the device will be printed to stderr.  Thanks,

Alex
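For reference, the two edits Alex describes would look roughly like this in hw/vfio_pci.c (an illustrative fragment only — verify against your qemu.git checkout, since these defines may have moved or been renamed):

```c
/* hw/vfio_pci.c (illustrative fragment; check the actual file) */
#define DEBUG_VFIO          /* was commented out: enables per-access tracing */
#define VFIO_ALLOW_MMAP 0   /* was 1: trap every BAR access instead of
                             * mmap'ing it, so each MMIO is logged to stderr */
```

With mmap disabled, every guest access to the device's BARs bounces through QEMU, which is also why this mode is only suitable for debugging, not production performance.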



[PATCH v8 0/7] Use eoi to track RTC interrupt delivery status

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Current interrupt coalescing logic, which is only used by the RTC, conflicts
with Posted Interrupt.

This patch introduces a new mechanism that uses EOI to track interrupts:
when delivering an interrupt to a vcpu, pending_eoi is set to the number of
vcpus that received the interrupt, and it is decreased as each vcpu writes
EOI. No subsequent RTC interrupt can be delivered to a vcpu until all vcpus
have written EOI.

Changes from v7 to v8
* Revamping restore code.
* Add BUG_ON to check pending_eoi.
* Rebase on top of KVM.

Changes from v6 to v7
* Only track the RTC interrupt when userspace uses *_LINE_* ioctl.
* Call rtc_irq_restore() after lapic is restored.
* Rebase on top of KVM.

Changes from v5 to v6
* Move set dest_map logic into __apic_accept_irq().
* Use RTC_GSI to distinguish different platform, and drop all CONFIG_X86.
* Rebase on top of KVM.

Changes from v4 to v5
* Calculate destination vcpu on interrupt injection instead of hooking into
  ioapic modification.
* Rebase on top of KVM.

Yang Zhang (7):
  KVM: Add vcpu info to ioapic_update_eoi()
  KVM: Introduce struct rtc_status
  KVM: Return destination vcpu on interrupt injection
  KVM: Add reset/restore rtc_status support
  KVM: Force vmexit with virtual interrupt delivery
  KVM: Let ioapic know the irq line status
  KVM: Use eoi to track RTC interrupt delivery status

 arch/x86/kvm/i8254.c |4 +-
 arch/x86/kvm/lapic.c |   36 +
 arch/x86/kvm/lapic.h |7 ++-
 arch/x86/kvm/x86.c   |6 ++-
 include/linux/kvm_host.h |   11 +++--
 virt/kvm/assigned-dev.c  |   13 +++--
 virt/kvm/eventfd.c   |   15 +++--
 virt/kvm/ioapic.c|  133 --
 virt/kvm/ioapic.h|   20 ++-
 virt/kvm/irq_comm.c  |   31 ++-
 virt/kvm/kvm_main.c  |3 +-
 11 files changed, 214 insertions(+), 65 deletions(-)



[PATCH v8 1/7] KVM: Add vcpu info to ioapic_update_eoi()

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Add vcpu info to ioapic_update_eoi, so we can know which vcpu
issued this EOI.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/lapic.c |2 +-
 virt/kvm/ioapic.c|   12 ++--
 virt/kvm/ioapic.h|3 ++-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e227474..3e22536 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -739,7 +739,7 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int 
vector)
trigger_mode = IOAPIC_LEVEL_TRIG;
else
trigger_mode = IOAPIC_EDGE_TRIG;
-   kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
+   kvm_ioapic_update_eoi(apic->vcpu, vector, trigger_mode);
}
 }
 
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 914cbe0..1d8906d 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -264,8 +264,8 @@ void kvm_ioapic_clear_all(struct kvm_ioapic *ioapic, int 
irq_source_id)
spin_unlock(&ioapic->lock);
 }
 
-static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
-int trigger_mode)
+static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
+   struct kvm_ioapic *ioapic, int vector, int trigger_mode)
 {
int i;
 
@@ -304,12 +304,12 @@ bool kvm_ioapic_handles_vector(struct kvm *kvm, int 
vector)
return test_bit(vector, ioapic->handled_vectors);
 }
 
-void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode)
+void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector, int trigger_mode)
 {
-   struct kvm_ioapic *ioapic = kvm->arch.vioapic;
+   struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
 
spin_lock(&ioapic->lock);
-   __kvm_ioapic_update_eoi(ioapic, vector, trigger_mode);
+   __kvm_ioapic_update_eoi(vcpu, ioapic, vector, trigger_mode);
spin_unlock(&ioapic->lock);
 }
 
@@ -407,7 +407,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, 
gpa_t addr, int len,
break;
 #ifdef CONFIG_IA64
case IOAPIC_REG_EOI:
-   __kvm_ioapic_update_eoi(ioapic, data, IOAPIC_LEVEL_TRIG);
+   __kvm_ioapic_update_eoi(NULL, ioapic, data, IOAPIC_LEVEL_TRIG);
break;
 #endif
 
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 0400a46..2fc61a5 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -70,7 +70,8 @@ static inline struct kvm_ioapic *ioapic_irqchip(struct kvm 
*kvm)
 int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
int short_hand, int dest, int dest_mode);
 int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2);
-void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode);
+void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector,
+   int trigger_mode);
 bool kvm_ioapic_handles_vector(struct kvm *kvm, int vector);
 int kvm_ioapic_init(struct kvm *kvm);
 void kvm_ioapic_destroy(struct kvm *kvm);
-- 
1.7.1



[PATCH v8 2/7] KVM: Introduce struct rtc_status

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

rtc_status is used to track RTC interrupt delivery status. The pending_eoi
count is increased for each vcpu that receives the RTC interrupt and is
decreased when that vcpu writes EOI for the interrupt.
Also, we use dest_map to record the destination vcpus, to avoid the case
where a vcpu that didn't get the RTC interrupt issues an EOI with the same
vector as the RTC and decreases pending_eoi by mistake.

Signed-off-by: Yang Zhang 
---
 virt/kvm/ioapic.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 2fc61a5..87cd94b 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -34,6 +34,17 @@ struct kvm_vcpu;
 #defineIOAPIC_INIT 0x5
 #defineIOAPIC_EXTINT   0x7
 
+#ifdef CONFIG_X86
+#define RTC_GSI 8
+#else
+#define RTC_GSI -1U
+#endif
+
+struct rtc_status {
+   int pending_eoi;
+   DECLARE_BITMAP(dest_map, KVM_MAX_VCPUS);
+};
+
 struct kvm_ioapic {
u64 base_address;
u32 ioregsel;
@@ -47,6 +58,7 @@ struct kvm_ioapic {
void (*ack_notifier)(void *opaque, int irq);
spinlock_t lock;
DECLARE_BITMAP(handled_vectors, 256);
+   struct rtc_status rtc_status;
 };
 
 #ifdef DEBUG
-- 
1.7.1



[PATCH v8 3/7] KVM: Return destination vcpu on interrupt injection

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Add a new parameter to record which vcpus received the interrupt.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/lapic.c |   25 -
 arch/x86/kvm/lapic.h |5 +++--
 virt/kvm/ioapic.c|2 +-
 virt/kvm/ioapic.h|2 +-
 virt/kvm/irq_comm.c  |   12 ++--
 5 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3e22536..0b73402 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -384,14 +384,16 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
 }
 
 static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
-int vector, int level, int trig_mode);
+int vector, int level, int trig_mode,
+unsigned long *dest_map);
 
-int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
+int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
+   unsigned long *dest_map)
 {
struct kvm_lapic *apic = vcpu->arch.apic;
 
return __apic_accept_irq(apic, irq->delivery_mode, irq->vector,
-   irq->level, irq->trig_mode);
+   irq->level, irq->trig_mode, dest_map);
 }
 
 static int pv_eoi_put_user(struct kvm_vcpu *vcpu, u8 val)
@@ -564,7 +566,7 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct 
kvm_lapic *source,
 }
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
-   struct kvm_lapic_irq *irq, int *r)
+   struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map)
 {
struct kvm_apic_map *map;
unsigned long bitmap = 1;
@@ -575,7 +577,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct 
kvm_lapic *src,
*r = -1;
 
if (irq->shorthand == APIC_DEST_SELF) {
-   *r = kvm_apic_set_irq(src->vcpu, irq);
+   *r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
return true;
}
 
@@ -620,7 +622,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct 
kvm_lapic *src,
continue;
if (*r < 0)
*r = 0;
-   *r += kvm_apic_set_irq(dst[i]->vcpu, irq);
+   *r += kvm_apic_set_irq(dst[i]->vcpu, irq, dest_map);
}
 
ret = true;
@@ -634,7 +636,8 @@ out:
  * Return 1 if successfully added and 0 if discarded.
  */
 static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
-int vector, int level, int trig_mode)
+int vector, int level, int trig_mode,
+unsigned long *dest_map)
 {
int result = 0;
struct kvm_vcpu *vcpu = apic->vcpu;
@@ -647,6 +650,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int 
delivery_mode,
if (unlikely(!apic_enabled(apic)))
break;
 
+   if (dest_map)
+   __set_bit(vcpu->vcpu_id, dest_map);
+
if (trig_mode) {
apic_debug("level trig mode for vector %d", vector);
apic_set_vector(vector, apic->regs + APIC_TMR);
@@ -805,7 +811,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
   irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
   irq.vector);
 
-   kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq);
+   kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
 }
 
 static u32 apic_get_tmcct(struct kvm_lapic *apic)
@@ -1441,7 +1447,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int 
lvt_type)
vector = reg & APIC_VECTOR_MASK;
mode = reg & APIC_MODE_MASK;
trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
-   return __apic_accept_irq(apic, mode, vector, 1, trig_mode);
+   return __apic_accept_irq(apic, mode, vector, 1, trig_mode,
+   NULL);
}
return 0;
 }
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index baa20cf..3e5a431 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -55,11 +55,12 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu);
 
 int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
 int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
-int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
+int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
+   unsigned long *dest_map);
 int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
-   struct kvm_lapic_irq *irq, int *r);
+   struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map);
 
 u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
 void kvm_set_apic_base(struct kvm_vcpu *vcpu

[PATCH v8 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/lapic.c |9 +++
 arch/x86/kvm/lapic.h |2 +
 virt/kvm/ioapic.c|   60 ++
 virt/kvm/ioapic.h|1 +
 4 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0b73402..6796218 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
 }
 
+bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
+{
+   struct kvm_lapic *apic = vcpu->arch.apic;
+
+   return apic_test_vector(vector, apic->regs + APIC_ISR) ||
+   apic_test_vector(vector, apic->regs + APIC_IRR);
+}
+
 static inline void apic_set_vector(int vec, void *bitmap)
 {
set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
@@ -1618,6 +1626,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
apic->highest_isr_cache = -1;
kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
kvm_make_request(KVM_REQ_EVENT, vcpu);
+   kvm_rtc_eoi_tracking_restore_one(vcpu);
 }
 
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 3e5a431..16304b1 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -166,4 +166,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu 
*vcpu)
return vcpu->arch.apic->pending_events;
 }
 
+bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
+
 #endif
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 27ae8dd..4699180 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -90,6 +90,64 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic 
*ioapic,
return result;
 }
 
+static void rtc_irq_reset(struct kvm_ioapic *ioapic)
+{
+   ioapic->rtc_status.pending_eoi = 0;
+   bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
+}
+
+static void __rtc_irq_eoi_tracking_restore_one(struct kvm_vcpu *vcpu,
+   int vector)
+{
+   bool new_val, old_val;
+   struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
+   union kvm_ioapic_redirect_entry *e;
+
+   e = &ioapic->redirtbl[RTC_GSI];
+   if (!kvm_apic_match_dest(vcpu, NULL, 0, e->fields.dest_id,
+   e->fields.dest_mode))
+   return;
+
+   new_val = kvm_apic_pending_eoi(vcpu, vector);
+   old_val = test_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
+
+   if (new_val == old_val)
+   return;
+
+   if (new_val) {
+   __set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
+   ioapic->rtc_status.pending_eoi++;
+   } else {
+   __clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
+   ioapic->rtc_status.pending_eoi--;
+   }
+}
+
+void kvm_rtc_eoi_tracking_restore_one(struct kvm_vcpu *vcpu)
+{
+   struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
+   int vector;
+
+   vector = ioapic->redirtbl[RTC_GSI].fields.vector;
+   spin_lock(&ioapic->lock);
+   __rtc_irq_eoi_tracking_restore_one(vcpu, vector);
+   spin_unlock(&ioapic->lock);
+}
+
+static void kvm_rtc_eoi_tracking_restore_all(struct kvm_ioapic *ioapic)
+{
+   struct kvm_vcpu *vcpu;
+   int i, vector;
+
+   if (RTC_GSI >= IOAPIC_NUM_PINS)
+   return;
+
+   rtc_irq_reset(ioapic);
+   vector = ioapic->redirtbl[RTC_GSI].fields.vector;
+   kvm_for_each_vcpu(i, vcpu, ioapic->kvm)
+   __rtc_irq_eoi_tracking_restore_one(vcpu, vector);
+}
+
 static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
 {
union kvm_ioapic_redirect_entry *pent;
@@ -428,6 +486,7 @@ void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
ioapic->ioregsel = 0;
ioapic->irr = 0;
ioapic->id = 0;
+   rtc_irq_reset(ioapic);
update_handled_vectors(ioapic);
 }
 
@@ -494,6 +553,7 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state 
*state)
memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
update_handled_vectors(ioapic);
kvm_ioapic_make_eoibitmap_request(kvm);
+   kvm_rtc_eoi_tracking_restore_all(ioapic);
spin_unlock(&ioapic->lock);
return 0;
 }
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 761e5b5..313fc4e 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -79,6 +79,7 @@ static inline struct kvm_ioapic *ioapic_irqchip(struct kvm 
*kvm)
return kvm->arch.vioapic;
 }
 
+void kvm_rtc_eoi_tracking_restore_one(struct kvm_vcpu *vcpu);
 int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
int short_hand, int dest, int dest_mode);
 int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2);
-- 
1.7.1


[PATCH v8 5/7] KVM: Force vmexit with virtual interrupt delivery

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

We need the EOI to track interrupt delivery status, so force a vmexit
on EOI for the RTC interrupt when virtual interrupt delivery is enabled.

Signed-off-by: Yang Zhang 
---
 virt/kvm/ioapic.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 4699180..662d0a9 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -191,7 +191,7 @@ void kvm_ioapic_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
if (!e->fields.mask &&
(e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
 kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
-index))) {
+index) || index == RTC_GSI)) {
if (kvm_apic_match_dest(vcpu, NULL, 0,
e->fields.dest_id, e->fields.dest_mode))
__set_bit(e->fields.vector, (unsigned long 
*)eoi_exit_bitmap);
-- 
1.7.1



[PATCH v8 6/7] KVM: Let ioapic know the irq line status

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Userspace may deliver the RTC interrupt without querying the status, so we
want to track RTC EOI for this case.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/i8254.c |4 ++--
 arch/x86/kvm/x86.c   |6 --
 include/linux/kvm_host.h |   11 +++
 virt/kvm/assigned-dev.c  |   13 +++--
 virt/kvm/eventfd.c   |   15 +--
 virt/kvm/ioapic.c|   18 ++
 virt/kvm/ioapic.h|2 +-
 virt/kvm/irq_comm.c  |   19 ---
 virt/kvm/kvm_main.c  |3 ++-
 9 files changed, 54 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index c1d30b2..412a5aa 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -290,8 +290,8 @@ static void pit_do_work(struct kthread_work *work)
}
spin_unlock(&ps->inject_lock);
if (inject) {
-   kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
-   kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
+   kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1, false);
+   kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0, false);
 
/*
 * Provides NMI watchdog support via Virtual Wire mode.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2aaba81..5e85d8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3484,13 +3484,15 @@ out:
return r;
 }
 
-int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event)
+int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
+   bool line_status)
 {
if (!irqchip_in_kernel(kvm))
return -ENXIO;
 
irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
-   irq_event->irq, irq_event->level);
+   irq_event->irq, irq_event->level,
+   line_status);
return 0;
 }
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1c0be23..7bcdb6b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -289,7 +289,8 @@ struct kvm_kernel_irq_routing_entry {
u32 gsi;
u32 type;
int (*set)(struct kvm_kernel_irq_routing_entry *e,
-  struct kvm *kvm, int irq_source_id, int level);
+  struct kvm *kvm, int irq_source_id, int level,
+  bool line_status);
union {
struct {
unsigned irqchip;
@@ -588,7 +589,8 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 
 int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
   struct kvm_userspace_memory_region *mem);
-int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level);
+int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
+   bool line_status);
 long kvm_arch_vm_ioctl(struct file *filp,
   unsigned int ioctl, unsigned long arg);
 
@@ -719,10 +721,11 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic 
*ioapic,
   union kvm_ioapic_redirect_entry *entry,
   unsigned long *deliver_bitmask);
 #endif
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
+   bool line_status);
 int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int 
level);
 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *irq_entry, struct kvm 
*kvm,
-   int irq_source_id, int level);
+   int irq_source_id, int level, bool line_status);
 bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin);
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
 void kvm_register_irq_ack_notifier(struct kvm *kvm,
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 3642239..f4c7f59 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -80,11 +80,12 @@ kvm_assigned_dev_raise_guest_irq(struct 
kvm_assigned_dev_kernel *assigned_dev,
spin_lock(&assigned_dev->intx_mask_lock);
if (!(assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX))
kvm_set_irq(assigned_dev->kvm,
-   assigned_dev->irq_source_id, vector, 1);
+   assigned_dev->irq_source_id, vector, 1,
+   false);
spin_unlock(&assigned_dev->intx_mask_lock);
} else
kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
-   vector, 1);
+   vector, 1, false);
 }
 
 static irqreturn_t kvm_assigned_dev_thread_intx(int irq, void *dev_id)
@@ -165,7 +166,7 @@ static void kvm_assigne

[PATCH v8 7/7] KVM: Use eoi to track RTC interrupt delivery status

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

The current interrupt coalescing logic, which is only used by the RTC,
conflicts with Posted Interrupt.
This patch introduces a new mechanism that uses EOI to track interrupt
delivery: when delivering an interrupt to a vcpu, pending_eoi is set to the
number of vcpus that received the interrupt, and it is decremented as each
vcpu writes EOI. No subsequent RTC interrupt can be delivered to a vcpu
until all vcpus have written EOI.
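The counting scheme described above can be sketched as a small userspace simulation (illustrative only; `rtc_deliver()` and `rtc_ack_eoi()` are hypothetical stand-ins for KVM's delivery and EOI paths, not the kernel API):

```c
#include <assert.h>

#define MAX_VCPUS 32

/* Simplified stand-in for the ioapic's rtc_status */
struct rtc_status {
    int pending_eoi;                   /* EOIs still owed for the last irq */
    unsigned char dest_map[MAX_VCPUS]; /* vcpus that received the irq */
};

/* Try to deliver the RTC irq to n vcpus; returns 0 if coalesced. */
static int rtc_deliver(struct rtc_status *s, const int *vcpus, int n)
{
    if (s->pending_eoi > 0)
        return 0;                      /* previous irq not fully acked */
    for (int i = 0; i < n; i++)
        s->dest_map[vcpus[i]] = 1;
    s->pending_eoi = n;
    return n;
}

/* A vcpu writes EOI for the RTC vector. */
static void rtc_ack_eoi(struct rtc_status *s, int vcpu)
{
    if (s->dest_map[vcpu]) {
        s->dest_map[vcpu] = 0;
        --s->pending_eoi;
    }
}
```

Until every destination vcpu has acked, a new RTC interrupt is reported as coalesced, which is exactly the property the real pending_eoi counter provides.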

Signed-off-by: Yang Zhang 
---
 virt/kvm/ioapic.c |   41 -
 1 files changed, 40 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 8d1f662..197ef97 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -149,6 +149,29 @@ static void kvm_rtc_eoi_tracking_restore_all(struct 
kvm_ioapic *ioapic)
__rtc_irq_eoi_tracking_restore_one(vcpu, vector);
 }
 
+static void rtc_irq_ack_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu,
+   int irq)
+{
+   if (irq != RTC_GSI)
+   return;
+
+   if (test_and_clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map))
+   --ioapic->rtc_status.pending_eoi;
+
+   WARN_ON(ioapic->rtc_status.pending_eoi < 0);
+}
+
+static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq, bool line_status)
+{
+   if (irq != RTC_GSI || !line_status)
+   return false;
+
+   if (ioapic->rtc_status.pending_eoi > 0)
+   return true; /* coalesced */
+
+   return false;
+}
+
 static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx,
bool line_status)
 {
@@ -262,6 +285,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int 
irq, bool line_status)
 {
union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
struct kvm_lapic_irq irqe;
+   int ret;
 
ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
 "vector=%x trig_mode=%x\n",
@@ -277,7 +301,15 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int 
irq, bool line_status)
irqe.level = 1;
irqe.shorthand = 0;
 
-   return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
+   if (irq == RTC_GSI && line_status) {
+   BUG_ON(ioapic->rtc_status.pending_eoi != 0);
+   ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
+   ioapic->rtc_status.dest_map);
+   ioapic->rtc_status.pending_eoi = ret;
+   } else
+   ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
+
+   return ret;
 }
 
 int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
@@ -301,6 +333,11 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, 
int irq_source_id,
ret = 1;
} else {
int edge = (entry.fields.trig_mode == IOAPIC_EDGE_TRIG);
+
+   if (rtc_irq_check(ioapic, irq, line_status)) {
+   ret = 0; /* coalesced */
+   goto out;
+   }
ioapic->irr |= mask;
if ((edge && old_irr != ioapic->irr) ||
(!edge && !entry.fields.remote_irr))
@@ -308,6 +345,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, 
int irq_source_id,
else
ret = 0; /* report coalesced interrupt */
}
+out:
trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
spin_unlock(&ioapic->lock);
 
@@ -335,6 +373,7 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
if (ent->fields.vector != vector)
continue;
 
+   rtc_irq_ack_eoi(ioapic, vcpu, i);
/*
 * We are dropping lock while calling ack notifiers because ack
 * notifier callbacks for assigned devices call into IOAPIC
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v8 0/7] KVM: VMX: Add Posted Interrupt supporting

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

The following patches add Posted Interrupt support to KVM:
The first patch enables the feature 'acknowledge interrupt on vmexit'. Since
it is required by Posted Interrupt, we need to enable it first.

The subsequent patches add the posted interrupt support itself:
Posted Interrupt allows APIC interrupts to be injected into the guest
directly, without any vmexit.

- When delivering an interrupt to the guest, if the target vcpu is running,
  update the posted-interrupt requests bitmap and send a notification event
  to the vcpu. The vcpu will then handle the interrupt automatically,
  without any software involvement.

- If the target vcpu is not running, or a notification event is already
  pending in the vcpu, do nothing. The interrupt will be handled at the
  next vm entry.
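The two bullets above amount to a simple decision made by the sender. Here is a hedged, non-atomic userspace sketch of that decision (`pi_post_interrupt()` is a hypothetical stand-in; the kernel uses locked bit operations on the real descriptor):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified posted-interrupt descriptor (cf. struct pi_desc in vmx.c). */
struct pi_desc {
    uint64_t pir[4];   /* 256 posted-interrupt request bits */
    uint64_t control;  /* bit 0: outstanding notification (ON) */
};

static bool test_and_set64(uint64_t *word, int bit)
{
    bool old = (*word >> bit) & 1;
    *word |= (uint64_t)1 << bit;
    return old;
}

/* Record a pending vector; returns true if a notification IPI is needed. */
static bool pi_post_interrupt(struct pi_desc *pi, int vector, bool in_guest_mode)
{
    test_and_set64(&pi->pir[vector / 64], vector % 64); /* record request */
    if (test_and_set64(&pi->control, 0))
        return false;        /* notification already outstanding */
    return in_guest_mode;    /* otherwise handled at the next vm entry */
}
```

Posting a second vector while a notification is still outstanding sets only the PIR bit; the single pending notification covers both vectors.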

Changes from v7 to v8:
* Remove unused member 'on' from struct pi_desc.
* Register a dummy function for sync_pir_to_irr if apicv is disabled.
* Minor fixup.
* Rebase on top of KVM upstream + RTC eoi tracking patch.

Changes from v6 to v7:
* Update TMR when the ioapic/lapic's id/ldr/dfr changes. According to the
  SDM, software should not touch the virtual APIC page while the target
  vcpu is in non-root mode. Obviously, setting the TMR when delivering an
  interrupt breaks that rule, so only update the TMR in the target vcpu's
  context.
* Clear the outstanding notification bit before syncing pir to irr.
* Sync pir to irr before touching irr.

Changes from v5 to v6:
* Split sync_pir_to_irr into two functions: one to query whether PIR is
  empty and the other to perform the sync.
* Add comments to explain how vmx_sync_pir_to_irr() work.
* Rebase on top of KVM upstream.

Yang Zhang (7):
  KVM: VMX: Enable acknowledge interrupt on vmexit
  KVM: VMX: Register a new IPI for posted interrupt
  KVM: VMX: Check the posted interrupt capability
  KVM: Call common update function when ioapic entry changed.
  KVM: Set TMR when programming ioapic entry
  KVM: VMX: Add the algorithm of deliver posted interrupt
  KVM: VMX: Use posted interrupt to deliver virtual interrupt

 arch/ia64/kvm/lapic.h  |6 -
 arch/x86/include/asm/entry_arch.h  |4 +
 arch/x86/include/asm/hardirq.h |3 +
 arch/x86/include/asm/hw_irq.h  |1 +
 arch/x86/include/asm/irq_vectors.h |5 +
 arch/x86/include/asm/kvm_host.h|3 +
 arch/x86/include/asm/vmx.h |4 +
 arch/x86/kernel/entry_64.S |5 +
 arch/x86/kernel/irq.c  |   22 
 arch/x86/kernel/irqinit.c  |4 +
 arch/x86/kvm/lapic.c   |   61 +++
 arch/x86/kvm/lapic.h   |3 +
 arch/x86/kvm/svm.c |   12 ++
 arch/x86/kvm/vmx.c |  209 +++-
 arch/x86/kvm/x86.c |   19 +++-
 include/linux/kvm_host.h   |4 +-
 virt/kvm/ioapic.c  |   32 --
 virt/kvm/ioapic.h  |7 +-
 virt/kvm/irq_comm.c|4 +-
 virt/kvm/kvm_main.c|5 +-
 20 files changed, 337 insertions(+), 76 deletions(-)



[PATCH v8 1/7] KVM: VMX: Enable acknowledge interrupt on vmexit

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

The "acknowledge interrupt on exit" feature controls processor behavior
for external interrupt acknowledgement. When this control is set, the
processor acknowledges the interrupt controller to acquire the
interrupt vector on VM exit.

After enabling this feature, an interrupt that arrives while the target cpu is
running in VMX non-root mode will be handled by the VMX handler instead of the
handler in the IDT. Currently, the VMX handler only fakes an interrupt stack
frame and jumps to the IDT to let the real handler handle it. Later, we will
recognize the interrupt and deliver only the interrupts that do not belong to
the current vcpu through the IDT; interrupts that belong to the current vcpu
will be handled inside the VMX handler. This will reduce KVM's interrupt
handling cost.

Also, the interrupt enable logic changes when this feature is turned on:
before this patch, the hypervisor called local_irq_enable() to enable
interrupts directly. Now the IF bit is set in the interrupt stack frame, and
interrupts are re-enabled on return from the interrupt handler if an external
interrupt exists. If there is no external interrupt, local_irq_enable() is
still called.

Refer to Intel SDM volume 3, chapter 33.2.
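The dispatch decision described above can be reduced to a pure function over the exit interrupt info field (a sketch under stated assumptions: the valid bit and type-field encoding below follow the usual VM_EXIT_INTR_INFO layout, and the function names are illustrative, not KVM's):

```c
#include <assert.h>
#include <stdint.h>

#define INTR_INFO_VALID     (1u << 31)
#define INTR_INFO_TYPE_MASK (7u << 8)
#define INTR_TYPE_EXT_INTR  (0u << 8)   /* external interrupt */

enum intr_action { CALL_IDT_HANDLER, ENABLE_IRQS };

/* On vmexit with "ack interrupt on exit" set: if the exit was caused by an
 * external interrupt, its vector sits in VM_EXIT_INTR_INFO and the host
 * handler is invoked via the IDT from the vmexit path (IF is set in the
 * faked stack frame); otherwise interrupts are simply re-enabled. */
static enum intr_action external_intr_action(uint32_t exit_intr_info)
{
    if ((exit_intr_info & INTR_INFO_VALID) &&
        (exit_intr_info & INTR_INFO_TYPE_MASK) == INTR_TYPE_EXT_INTR)
        return CALL_IDT_HANDLER;
    return ENABLE_IRQS;     /* no external interrupt: local_irq_enable() */
}
```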

Signed-off-by: Yang Zhang 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/svm.c  |6 
 arch/x86/kvm/vmx.c  |   58 ---
 arch/x86/kvm/x86.c  |4 ++-
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b5a6462..8e95512 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -730,6 +730,7 @@ struct kvm_x86_ops {
int (*check_intercept)(struct kvm_vcpu *vcpu,
   struct x86_instruction_info *info,
   enum x86_intercept_stage stage);
+   void (*handle_external_intr)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7a46c1f..2f8fe3f 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4233,6 +4233,11 @@ out:
return ret;
 }
 
+static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
+{
+   local_irq_enable();
+}
+
 static struct kvm_x86_ops svm_x86_ops = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -4328,6 +4333,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.set_tdp_cr3 = set_tdp_cr3,
 
.check_intercept = svm_check_intercept,
+   .handle_external_intr = svm_handle_external_intr,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03f5746..7408d93 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -378,6 +378,7 @@ struct vcpu_vmx {
struct shared_msr_entry *guest_msrs;
int   nmsrs;
int   save_nmsrs;
+   unsigned long host_idt_base;
 #ifdef CONFIG_X86_64
u64   msr_host_kernel_gs_base;
u64   msr_guest_kernel_gs_base;
@@ -2627,7 +2628,8 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
 #ifdef CONFIG_X86_64
min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
-   opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
+   opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
+   VM_EXIT_ACK_INTR_ON_EXIT;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;
@@ -3879,7 +3881,7 @@ static void vmx_disable_intercept_msr_write_x2apic(u32 
msr)
  * Note that host-state that does change is set elsewhere. E.g., host-state
  * that is set differently for each CPU is set in vmx_vcpu_load(), not here.
  */
-static void vmx_set_constant_host_state(void)
+static void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 {
u32 low32, high32;
unsigned long tmpl;
@@ -3907,6 +3909,7 @@ static void vmx_set_constant_host_state(void)
 
native_store_idt(&dt);
vmcs_writel(HOST_IDTR_BASE, dt.address);   /* 22.2.4 */
+   vmx->host_idt_base = dt.address;
 
vmcs_writel(HOST_RIP, vmx_return); /* 22.2.5 */
 
@@ -4039,7 +4042,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 
vmcs_write16(HOST_FS_SELECTOR, 0);/* 22.2.4 */
vmcs_write16(HOST_GS_SELECTOR, 0);/* 22.2.4 */
-   vmx_set_constant_host_state();
+   vmx_set_constant_host_state(vmx);
 #ifdef CONFIG_X86_64
rdmsrl(MSR_FS_BASE, a);
vmcs_writel(HOST_FS_BASE, a); /* 22.2.4 */
@@ -6400,6 +6403,52 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx 
*vmx)
}
 }
 
+static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
+{
+   u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+
+   /*
+* If external interrupt exists, IF bit is set in rflags/eflags on the
+* interrupt stack frame

[PATCH v8 2/7] KVM: VMX: Register a new IPI for posted interrupt

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

The Posted Interrupt feature requires a special IPI to deliver posted
interrupts to the guest, and it should have a high priority so the interrupt
will not be blocked by others.
Normally, the posted interrupt is consumed by the vcpu if the target vcpu is
running, and is transparent to the OS. But in some cases the interrupt will
arrive when the target vcpu is scheduled out, and the host will see it. So
we need to register a dummy handler to handle it.

Signed-off-by: Yang Zhang 
---
 arch/x86/include/asm/entry_arch.h  |4 
 arch/x86/include/asm/hardirq.h |3 +++
 arch/x86/include/asm/hw_irq.h  |1 +
 arch/x86/include/asm/irq_vectors.h |5 +
 arch/x86/kernel/entry_64.S |5 +
 arch/x86/kernel/irq.c  |   22 ++
 arch/x86/kernel/irqinit.c  |4 
 7 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/entry_arch.h 
b/arch/x86/include/asm/entry_arch.h
index 40afa00..9bd4eca 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -19,6 +19,10 @@ BUILD_INTERRUPT(reboot_interrupt,REBOOT_VECTOR)
 
 BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 
+#ifdef CONFIG_HAVE_KVM
+BUILD_INTERRUPT(kvm_posted_intr_ipi, POSTED_INTR_VECTOR)
+#endif
+
 /*
  * every pentium local APIC has two 'local interrupts', with a
  * soft-definable vector attached to both interrupts, one of
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 81f04ce..ab0ae1a 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -12,6 +12,9 @@ typedef struct {
unsigned int irq_spurious_count;
unsigned int icr_read_retry_count;
 #endif
+#ifdef CONFIG_HAVE_KVM
+   unsigned int kvm_posted_intr_ipis;
+#endif
unsigned int x86_platform_ipis; /* arch dependent */
unsigned int apic_perf_irqs;
unsigned int apic_irq_work_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 10a78c3..1da97ef 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -28,6 +28,7 @@
 /* Interrupt handlers registered during init_IRQ */
 extern void apic_timer_interrupt(void);
 extern void x86_platform_ipi(void);
+extern void kvm_posted_intr_ipi(void);
 extern void error_interrupt(void);
 extern void irq_work_interrupt(void);
 
diff --git a/arch/x86/include/asm/irq_vectors.h 
b/arch/x86/include/asm/irq_vectors.h
index aac5fa6..5702d7e 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,6 +102,11 @@
  */
 #define X86_PLATFORM_IPI_VECTOR0xf7
 
+/* Vector for KVM to deliver posted interrupt IPI */
+#ifdef CONFIG_HAVE_KVM
+#define POSTED_INTR_VECTOR 0xf2
+#endif
+
 /*
  * IRQ work vector:
  */
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c1d01e6..7272089 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1166,6 +1166,11 @@ apicinterrupt LOCAL_TIMER_VECTOR \
 apicinterrupt X86_PLATFORM_IPI_VECTOR \
x86_platform_ipi smp_x86_platform_ipi
 
+#ifdef CONFIG_HAVE_KVM
+apicinterrupt POSTED_INTR_VECTOR \
+   kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+#endif
+
 apicinterrupt THRESHOLD_APIC_VECTOR \
threshold_interrupt smp_threshold_interrupt
 apicinterrupt THERMAL_APIC_VECTOR \
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e4595f1..6ae6ea1 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -228,6 +228,28 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
set_irq_regs(old_regs);
 }
 
+#ifdef CONFIG_HAVE_KVM
+/*
+ * Handler for POSTED_INTERRUPT_VECTOR.
+ */
+void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
+{
+   struct pt_regs *old_regs = set_irq_regs(regs);
+
+   ack_APIC_irq();
+
+   irq_enter();
+
+   exit_idle();
+
+   inc_irq_stat(kvm_posted_intr_ipis);
+
+   irq_exit();
+
+   set_irq_regs(old_regs);
+}
+#endif
+
 EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq);
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index 7dc4e45..a2a1fbc 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -172,6 +172,10 @@ static void __init apic_intr_init(void)
 
/* IPI for X86 platform specific use */
alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
+#ifdef CONFIG_HAVE_KVM
+   /* IPI for KVM to deliver posted interrupt */
+   alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+#endif
 
/* IPI vectors for APIC spurious and error interrupts */
alloc_intr_gate(SPURIOUS_APIC_VECTOR, spurious_interrupt);
-- 
1.7.1



[PATCH v8 3/7] KVM: VMX: Check the posted interrupt capability

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Detect the posted interrupt feature. If it exists, then set it in vmcs_config.

Signed-off-by: Yang Zhang 
---
 arch/x86/include/asm/vmx.h |4 ++
 arch/x86/kvm/vmx.c |   82 +---
 2 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index fc1c313..6f07f19 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -71,6 +71,7 @@
 #define PIN_BASED_NMI_EXITING   0x0008
 #define PIN_BASED_VIRTUAL_NMIS  0x0020
 #define PIN_BASED_VMX_PREEMPTION_TIMER  0x0040
+#define PIN_BASED_POSTED_INTR   0x0080
 
 #define PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR0x0016
 
@@ -102,6 +103,7 @@
 /* VMCS Encodings */
 enum vmcs_field {
VIRTUAL_PROCESSOR_ID= 0x,
+   POSTED_INTR_NV  = 0x0002,
GUEST_ES_SELECTOR   = 0x0800,
GUEST_CS_SELECTOR   = 0x0802,
GUEST_SS_SELECTOR   = 0x0804,
@@ -136,6 +138,8 @@ enum vmcs_field {
VIRTUAL_APIC_PAGE_ADDR_HIGH = 0x2013,
APIC_ACCESS_ADDR= 0x2014,
APIC_ACCESS_ADDR_HIGH   = 0x2015,
+   POSTED_INTR_DESC_ADDR   = 0x2016,
+   POSTED_INTR_DESC_ADDR_HIGH  = 0x2017,
EPT_POINTER = 0x201a,
EPT_POINTER_HIGH= 0x201b,
EOI_EXIT_BITMAP0= 0x201c,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7408d93..05da991 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -84,7 +84,8 @@ module_param(vmm_exclusive, bool, S_IRUGO);
 static bool __read_mostly fasteoi = 1;
 module_param(fasteoi, bool, S_IRUGO);
 
-static bool __read_mostly enable_apicv_reg_vid;
+static bool __read_mostly enable_apicv;
+module_param(enable_apicv, bool, S_IRUGO);
 
 /*
  * If nested=1, nested virtualization is supported, i.e., guests may use
@@ -366,6 +367,14 @@ struct nested_vmx {
struct page *apic_access_page;
 };
 
+#define POSTED_INTR_ON  0
+/* Posted-Interrupt Descriptor */
+struct pi_desc {
+   u32 pir[8]; /* Posted interrupt requested */
+   u32 control;/* bit 0 of control is outstanding notification bit */
+   u32 rsvd[7];
+} __aligned(64);
+
 struct vcpu_vmx {
struct kvm_vcpu   vcpu;
unsigned long host_rsp;
@@ -430,6 +439,9 @@ struct vcpu_vmx {
 
bool rdtscp_enabled;
 
+   /* Posted interrupt descriptor */
+   struct pi_desc pi_desc;
+
/* Support for a guest hypervisor (nested VMX) */
struct nested_vmx nested;
 };
@@ -785,6 +797,18 @@ static inline bool cpu_has_vmx_virtual_intr_delivery(void)
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
 }
 
+static inline bool cpu_has_vmx_posted_intr(void)
+{
+   return vmcs_config.pin_based_exec_ctrl & PIN_BASED_POSTED_INTR;
+}
+
+static inline bool cpu_has_vmx_apicv(void)
+{
+   return cpu_has_vmx_apic_register_virt() &&
+   cpu_has_vmx_virtual_intr_delivery() &&
+   cpu_has_vmx_posted_intr();
+}
+
 static inline bool cpu_has_vmx_flexpriority(void)
 {
return cpu_has_vmx_tpr_shadow() &&
@@ -2552,12 +2576,6 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
 
-   min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-   opt = PIN_BASED_VIRTUAL_NMIS;
-   if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
-   &_pin_based_exec_control) < 0)
-   return -EIO;
-
min = CPU_BASED_HLT_EXITING |
 #ifdef CONFIG_X86_64
  CPU_BASED_CR8_LOAD_EXITING |
@@ -2634,6 +2652,17 @@ static __init int setup_vmcs_config(struct vmcs_config 
*vmcs_conf)
&_vmexit_control) < 0)
return -EIO;
 
+   min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
+   opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR;
+   if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
+   &_pin_based_exec_control) < 0)
+   return -EIO;
+
+   if (!(_cpu_based_2nd_exec_control &
+   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) ||
+   !(_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT))
+   _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
+
min = 0;
opt = VM_ENTRY_LOAD_IA32_PAT;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
@@ -2812,11 +2841,10 @@ static __init int hardware_setup(void)
if (!cpu_has_vmx_ple())
ple_gap = 0;
 
-   if (!cpu_has_vmx_apic_register_virt() ||
-   !cpu_has_vmx_virtual_intr_delivery())
-   enable_apicv_reg_vid = 0;
+   if (!cpu_has_vmx_apicv())
+  

[PATCH v8 4/7] KVM: Call common update function when ioapic entry changed.

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Both the TMR and the EOI exit bitmap need to be updated when the ioapic
changes or a vcpu's id/ldr/dfr changes. So use a common update function
instead of the EOI-exit-bitmap-specific one.

Signed-off-by: Yang Zhang 
---
 arch/ia64/kvm/lapic.h|6 --
 arch/x86/kvm/lapic.c |4 ++--
 arch/x86/kvm/lapic.h |1 +
 arch/x86/kvm/vmx.c   |3 +++
 arch/x86/kvm/x86.c   |   11 +++
 include/linux/kvm_host.h |4 ++--
 virt/kvm/ioapic.c|   22 +-
 virt/kvm/ioapic.h|6 ++
 virt/kvm/irq_comm.c  |4 ++--
 virt/kvm/kvm_main.c  |4 ++--
 10 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
index c3e2935..c5f92a9 100644
--- a/arch/ia64/kvm/lapic.h
+++ b/arch/ia64/kvm/lapic.h
@@ -27,10 +27,4 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
kvm_lapic_irq *irq);
 #define kvm_apic_present(x) (true)
 #define kvm_lapic_enabled(x) (true)
 
-static inline bool kvm_apic_vid_enabled(void)
-{
-   /* IA64 has no apicv supporting, do nothing here */
-   return false;
-}
-
 #endif
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6796218..6c83969 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -136,7 +136,7 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, 
u32 val)
apic_set_reg(apic, APIC_SPIV, val);
 }
 
-static inline int apic_enabled(struct kvm_lapic *apic)
+int apic_enabled(struct kvm_lapic *apic)
 {
return kvm_apic_sw_enabled(apic) && kvm_apic_hw_enabled(apic);
 }
@@ -217,7 +217,7 @@ out:
if (old)
kfree_rcu(old, rcu);
 
-   kvm_ioapic_make_eoibitmap_request(kvm);
+   kvm_vcpu_request_scan_ioapic(kvm);
 }
 
 static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 16304b1..a2e2c6a 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -167,5 +167,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu 
*vcpu)
 }
 
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
+int apic_enabled(struct kvm_lapic *apic);
 
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 05da991..5637a8a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6415,6 +6415,9 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, 
int max_irr)
 
 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
+   if (!vmx_vm_has_apicv(vcpu->kvm))
+   return;
+
vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
vmcs_write64(EOI_EXIT_BITMAP1, eoi_exit_bitmap[1]);
vmcs_write64(EOI_EXIT_BITMAP2, eoi_exit_bitmap[2]);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5b146d2..53dc96f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5649,13 +5649,16 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
 #endif
 }
 
-static void update_eoi_exitmap(struct kvm_vcpu *vcpu)
+static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
u64 eoi_exit_bitmap[4];
 
+   if (!apic_enabled(vcpu->arch.apic))
+   return;
+
memset(eoi_exit_bitmap, 0, 32);
 
-   kvm_ioapic_calculate_eoi_exitmap(vcpu, eoi_exit_bitmap);
+   kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap);
kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
 }
 
@@ -5712,8 +5715,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
kvm_handle_pmu_event(vcpu);
if (kvm_check_request(KVM_REQ_PMI, vcpu))
kvm_deliver_pmi(vcpu);
-   if (kvm_check_request(KVM_REQ_EOIBITMAP, vcpu))
-   update_eoi_exitmap(vcpu);
+   if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
+   vcpu_scan_ioapic(vcpu);
}
 
if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7bcdb6b..6f49d9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -126,7 +126,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_MASTERCLOCK_UPDATE 19
 #define KVM_REQ_MCLOCK_INPROGRESS 20
 #define KVM_REQ_EPR_EXIT  21
-#define KVM_REQ_EOIBITMAP 22
+#define KVM_REQ_SCAN_IOAPIC   22
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID   1
@@ -572,7 +572,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_flush_remote_tlbs(struct kvm *kvm);
 void kvm_reload_remote_mmus(struct kvm *kvm);
 void kvm_make_mclock_inprogress_request(struct kvm *kvm);
-void kvm_make_update_eoibitmap_request(struct kvm *kvm);
+void kvm_make_scan_ioapic_request(struct kvm *kvm);
 
 long kvm_arch_dev_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 47896ee..40ad96d 100644
--- a/virt/kvm

[PATCH v8 7/7] KVM: VMX: Use posted interrupt to deliver virtual interrupt

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

If posted interrupt is available, use it to inject virtual interrupts into
the guest.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/lapic.c |   29 ++---
 arch/x86/kvm/vmx.c   |2 +-
 arch/x86/kvm/x86.c   |1 +
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8948979..46a4cca 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -353,6 +353,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic 
*apic)
if (!apic->irr_pending)
return -1;
 
+   kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
result = apic_search_irr(apic);
ASSERT(result == -1 || result >= 16);
 
@@ -683,18 +684,24 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int 
delivery_mode,
if (dest_map)
__set_bit(vcpu->vcpu_id, dest_map);
 
-   result = !apic_test_and_set_irr(vector, apic);
-   trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
- trig_mode, vector, !result);
-   if (!result) {
-   if (trig_mode)
-   apic_debug("level trig mode repeatedly for "
-   "vector %d", vector);
-   break;
-   }
+   if (kvm_x86_ops->deliver_posted_interrupt) {
+   result = 1;
+   kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
+   } else {
+   result = !apic_test_and_set_irr(vector, apic);
+
+   trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
+   trig_mode, vector, !result);
+   if (!result) {
+   if (trig_mode)
+   apic_debug("level trig mode repeatedly "
+   "for vector %d", vector);
+   break;
+   }
 
-   kvm_make_request(KVM_REQ_EVENT, vcpu);
-   kvm_vcpu_kick(vcpu);
+   kvm_make_request(KVM_REQ_EVENT, vcpu);
+   kvm_vcpu_kick(vcpu);
+   }
break;
 
case APIC_DM_REMRD:
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3de2d7f..cd1c6ff 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -84,7 +84,7 @@ module_param(vmm_exclusive, bool, S_IRUGO);
 static bool __read_mostly fasteoi = 1;
 module_param(fasteoi, bool, S_IRUGO);
 
-static bool __read_mostly enable_apicv;
+static bool __read_mostly enable_apicv = 1;
 module_param(enable_apicv, bool, S_IRUGO);
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 72be079..486f627 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2685,6 +2685,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s)
 {
+   kvm_x86_ops->sync_pir_to_irr(vcpu);
memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
 
return 0;
-- 
1.7.1



[PATCH v8 6/7] KVM: VMX: Add the algorithm of deliver posted interrupt

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

Only deliver the posted interrupt when the target vcpu is running
and there is no previous interrupt pending in the PIR.
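The PIR-to-IRR sync that this patch adds (kvm_apic_update_irr in the diff below) hinges on atomically draining each PIR word. A minimal userspace analogue, using the GCC/Clang `__atomic_exchange_n` builtin in place of the kernel's xchg():

```c
#include <assert.h>
#include <stdint.h>

/* Move pending bits from pir into irr, clearing pir as we go. The atomic
 * exchange ensures a concurrent sender setting a PIR bit never loses it:
 * the bit is either drained here or remains set for the next sync. */
static void sync_pir_to_irr(uint32_t pir[8], uint32_t irr[8])
{
    for (int i = 0; i < 8; i++) {
        uint32_t val = __atomic_exchange_n(&pir[i], 0, __ATOMIC_SEQ_CST);
        if (val)
            irr[i] |= val;
    }
}
```

Bits already present in IRR are preserved (OR-merged), matching the in-kernel update of the APIC_IRR registers.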

Signed-off-by: Yang Zhang 
---
 arch/x86/include/asm/kvm_host.h |2 +
 arch/x86/kvm/lapic.c|   13 
 arch/x86/kvm/lapic.h|1 +
 arch/x86/kvm/svm.c  |6 +++
 arch/x86/kvm/vmx.c  |   66 ++-
 virt/kvm/kvm_main.c |1 +
 6 files changed, 88 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8e95512..842ea5a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -704,6 +704,8 @@ struct kvm_x86_ops {
void (*hwapic_isr_update)(struct kvm *kvm, int isr);
void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
+   void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+   void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
int (*get_tdp_level)(void);
u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index f7b5e51..8948979 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -318,6 +318,19 @@ static u8 count_vectors(void *bitmap)
return count;
 }
 
+void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
+{
+   u32 i, pir_val;
+   struct kvm_lapic *apic = vcpu->arch.apic;
+
+   for (i = 0; i <= 7; i++) {
+   pir_val = xchg(&pir[i], 0);
+   if (pir_val)
+   *((u32 *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val;
+   }
+}
+EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
+
 static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
 {
apic->irr_pending = true;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index b067a08..f09f8d5 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -54,6 +54,7 @@ u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
 void kvm_apic_set_version(struct kvm_vcpu *vcpu);
 
 void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
+void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
 int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
 int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
 int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2f8fe3f..d6713e1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3577,6 +3577,11 @@ static void svm_hwapic_isr_update(struct kvm *kvm, int 
isr)
return;
 }
 
+static void svm_sync_pir_to_irr(struct kvm_vcpu *vcpu)
+{
+   return;
+}
+
 static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4305,6 +4310,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.vm_has_apicv = svm_vm_has_apicv,
.load_eoi_exitmap = svm_load_eoi_exitmap,
.hwapic_isr_update = svm_hwapic_isr_update,
+   .sync_pir_to_irr = svm_sync_pir_to_irr,
 
.set_tss_addr = svm_set_tss_addr,
.get_tdp_level = get_npt_level,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5637a8a..3de2d7f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -375,6 +375,23 @@ struct pi_desc {
u32 rsvd[7];
 } __aligned(64);
 
+static bool pi_test_and_set_on(struct pi_desc *pi_desc)
+{
+   return test_and_set_bit(POSTED_INTR_ON,
+   (unsigned long *)&pi_desc->control);
+}
+
+static bool pi_test_and_clear_on(struct pi_desc *pi_desc)
+{
+   return test_and_clear_bit(POSTED_INTR_ON,
+   (unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
+{
+   return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
+}
+
 struct vcpu_vmx {
struct kvm_vcpu   vcpu;
unsigned long host_rsp;
@@ -639,6 +656,7 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
struct kvm_segment *var, int seg);
 static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
+static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -2846,8 +2864,11 @@ static __init int hardware_setup(void)
 
if (enable_apicv)
kvm_x86_ops->update_cr8_intercept = NULL;
-   else
+   else {
kvm_x86_ops->hwapic_irr_update = NULL;
+   kvm_x86_ops->deliver_posted_interrupt = NULL;
+   kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
+   }
 
if (nested)
nested_vmx_setup_ctls_msrs()

[PATCH v8 5/7] KVM: Set TMR when programming ioapic entry

2013-04-08 Thread Yang Zhang
From: Yang Zhang 

We already know the trigger mode of a given interrupt when programming
the ioapic entry, so it's not necessary to set it on each interrupt
delivery.

Signed-off-by: Yang Zhang 
---
 arch/x86/kvm/lapic.c |   15 +--
 arch/x86/kvm/lapic.h |1 +
 arch/x86/kvm/x86.c   |5 -
 virt/kvm/ioapic.c|   12 +---
 virt/kvm/ioapic.h|3 ++-
 5 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6c83969..f7b5e51 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -468,6 +468,15 @@ static inline int apic_find_highest_isr(struct kvm_lapic 
*apic)
return result;
 }
 
+void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr)
+{
+   struct kvm_lapic *apic = vcpu->arch.apic;
+   int i;
+
+   for (i = 0; i < 8; i++)
+   apic_set_reg(apic, APIC_TMR + 0x10 * i, tmr[i]);
+}
+
 static void apic_update_ppr(struct kvm_lapic *apic)
 {
u32 tpr, isrv, ppr, old_ppr;
@@ -661,12 +670,6 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int 
delivery_mode,
if (dest_map)
__set_bit(vcpu->vcpu_id, dest_map);
 
-   if (trig_mode) {
-   apic_debug("level trig mode for vector %d", vector);
-   apic_set_vector(vector, apic->regs + APIC_TMR);
-   } else
-   apic_clear_vector(vector, apic->regs + APIC_TMR);
-
result = !apic_test_and_set_irr(vector, apic);
trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
  trig_mode, vector, !result);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index a2e2c6a..b067a08 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -53,6 +53,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
 u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
 void kvm_apic_set_version(struct kvm_vcpu *vcpu);
 
+void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
 int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
 int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
 int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 53dc96f..72be079 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5652,14 +5652,17 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
u64 eoi_exit_bitmap[4];
+   u32 tmr[8];
 
if (!apic_enabled(vcpu->arch.apic))
return;
 
memset(eoi_exit_bitmap, 0, 32);
+   memset(tmr, 0, 32);
 
-   kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap);
+   kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap, tmr);
kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
+   kvm_apic_update_tmr(vcpu, tmr);
 }
 
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 40ad96d..c6e9ff1 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -196,7 +196,8 @@ static void update_handled_vectors(struct kvm_ioapic 
*ioapic)
smp_wmb();
 }
 
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
+   u32 *tmr)
 {
struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
union kvm_ioapic_redirect_entry *e;
@@ -210,8 +211,13 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 
*eoi_exit_bitmap)
 kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
 index) || index == RTC_GSI)) {
if (kvm_apic_match_dest(vcpu, NULL, 0,
-   e->fields.dest_id, e->fields.dest_mode))
-   __set_bit(e->fields.vector, (unsigned long 
*)eoi_exit_bitmap);
+   e->fields.dest_id, e->fields.dest_mode)) {
+   __set_bit(e->fields.vector,
+   (unsigned long *)eoi_exit_bitmap);
+   if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG)
+   __set_bit(e->fields.vector,
+   (unsigned long *)tmr);
+   }
}
}
spin_unlock(&ioapic->lock);
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 674a388..615d8c9 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -97,6 +97,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct 
kvm_lapic *src,
 int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 void kvm_vcpu_request_scan_ioapic(struct kvm *kvm);
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_b

Re: [PATCH V7 0/5] virtio-scsi multiqueue

2013-04-08 Thread Rusty Russell
Asias He  writes:
> On Sat, Apr 06, 2013 at 09:40:13AM +0100, James Bottomley wrote:
>> Well, I haven't had time to look at anything other than the patch I
>> commented on.  I'm happy with your fix, so you can add my acked by to
>> that one.  Since it's going through the virtio tree, don't wait for me,
>> put it in and I'll make you fix up anything I find later that I'm
>> unhappy with.
>
> So, Rusty, could you pick this up in your virtio-next tree?

Done!

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH uq/master v2 1/2] kvm: reset state from the CPU's reset method

2013-04-08 Thread Andreas Färber
Am 08.04.2013 14:19, schrieb Gleb Natapov:
> On Tue, Apr 02, 2013 at 04:29:32PM +0300, Gleb Natapov wrote:
>>>  static void kvm_sw_tlb_put(PowerPCCPU *cpu)
>>>  {
>>>  CPUPPCState *env = &cpu->env;
>>> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
>>> index 23fe51f..6321384 100644
>>> --- a/target-s390x/cpu.c
>>> +++ b/target-s390x/cpu.c
>>> @@ -84,6 +84,10 @@ static void s390_cpu_reset(CPUState *s)
>>>   * after incrementing the cpu counter */
>>>  #if !defined(CONFIG_USER_ONLY)
>>>  s->halted = 1;
>>> +
>>> +if (kvm_enabled()) {
>>> +kvm_arch_reset_vcpu(s);
>> Does this compile with kvm support disabled?
>>
> Well, it does not:
>   CCs390x-softmmu/target-s390x/cpu.o
> /users/gleb/work/qemu/target-s390x/cpu.c: In function 's390_cpu_reset':
> /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: implicit
> declaration of function 'kvm_arch_reset_vcpu'
> [-Werror=implicit-function-declaration]
> /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: nested extern
> declaration of 'kvm_arch_reset_vcpu' [-Werror=nested-externs]
> cc1: all warnings being treated as errors
> 
> I wonder if it is portable across compilers to rely on code inside
> if (0) { } being dropped at all optimization levels.

No, we had a previous case where --enable-debug broke if (kvm_enabled())
{...} but regular builds worked.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


Warning with vhost_net kref.h:42 handle_tx

2013-04-08 Thread Juhani Rautiainen
Hi!

I've just built a server to use with KVM. Almost immediately I got this 
warning from one of my virtual servers (two currently).

Apr  7 04:44:02 base kernel: [ cut here ]
Apr  7 04:44:02 base kernel: WARNING: at include/linux/kref.h:42 
handle_tx+0x613/0x680 [vhost_net]()
Apr  7 04:44:02 base kernel: Hardware name: System Product Name
Apr  7 04:44:02 base kernel: Modules linked in: tcm_qla2xxx tcm_fc libfc 
tcm_loop target_core_mod configfs ebtable_nat ebtables ipt_MASQUERADE 
iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc 
autofs4 sunrpc cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 
nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 
nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 
vhost_net macvtap macvlan tun iTCO_wdt iTCO_vendor_support e1000 
acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel 
ghash_clmulni_intel microcode i2c_i801 raid456 async_raid6_recov async_pq 
lpc_ich mfd_core raid6_pq async_xor xor async_memcpy async_tx e1000e 
xhci_hcd sg wmi ext4(F) mbcache(F) jbd2(F) firewire_ohci(F) 
firewire_core(F) crc_itu_t(F) sd_mod(F) crc_t10dif(F) aesni_intel(F) 
ablk_helper(F) cryptd(F) lrw(F) aes_x86_64(F) xts(F) gf128mul(F) ahci(F) 
libahci(F) qla2xxx(F) scsi_transport_fc(F) scsi_tgt(F) mvsas(F) libsas(F) 
scsi_transport_sas(F) i915(F) drm_kms_helper(F) drm(F) i2c_
Apr  7 04:44:02 base kernel: algo_bit(F) i2c_core(F) video(F) dm_mirror(F) 
dm_region_hash(F) dm_log(F) dm_mod(F) [last unloaded: i2c_dev]
Apr  7 04:44:02 base kernel: Pid: 17583, comm: vhost-17562 Tainted: GF   
W3.8.5 #4
Apr  7 04:44:02 base kernel: Call Trace:
Apr  7 04:44:02 base kernel: [] 
warn_slowpath_common+0x7f/0xc0
Apr  7 04:44:02 base kernel: [] 
warn_slowpath_null+0x1a/0x20
Apr  7 04:44:02 base kernel: [] handle_tx+0x613/0x680 
[vhost_net]
Apr  7 04:44:02 base kernel: [] ? handle_tx+0x1/0x680 
[vhost_net]
Apr  7 04:44:02 base kernel: [] handle_tx_kick+0x15/0x20 
[vhost_net]
Apr  7 04:44:02 base kernel: [] vhost_worker+0x10c/0x1c0 
[vhost_net]
Apr  7 04:44:02 base kernel: [] ? 
vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
Apr  7 04:44:02 base kernel: [] ? 
vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
Apr  7 04:44:02 base kernel: [] kthread+0xce/0xe0
Apr  7 04:44:02 base kernel: [] ? 
kthread_freezable_should_stop+0x70/0x70
Apr  7 04:44:02 base kernel: [] ret_from_fork+0x7c/0xb0
Apr  7 04:44:02 base kernel: [] ? 
kthread_freezable_should_stop+0x70/0x70
Apr  7 04:44:02 base kernel: ---[ end trace b9b0bac0c9fdebba ]---

This kept filling my logs until I noticed it. It also prevented virtual 
machine & host shutdown; I had to reset the server by force. I think 
these messages are related:

Apr  7 23:46:55 base kernel: INFO: task qemu-kvm:17562 blocked for more 
than 120 seconds.
Apr  7 23:46:55 base kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr  7 23:46:55 base kernel: qemu-kvmD 816105e0 0 
17562  1 0x0084
Apr  7 23:46:55 base kernel: 88040d6d5d88 0086 
88040d6d5fd8 00013a80
Apr  7 23:46:55 base kernel: 88040d6d4010 00013a80 
00013a80 00013a80
Apr  7 23:46:55 base kernel: 88040d6d5fd8 00013a80 
81a14420 88040c878b40
Apr  7 23:46:55 base kernel: Call Trace:
Apr  7 23:46:55 base kernel: [] schedule+0x29/0x70
Apr  7 23:46:55 base kernel: [] 
vhost_ubuf_put_and_wait+0x65/0xa0 [vhost_net]
Apr  7 23:46:55 base kernel: [] ? wake_up_bit+0x40/0x40
Apr  7 23:46:55 base kernel: [] 
vhost_net_set_backend+0x1b5/0x280 [vhost_net]
Apr  7 23:46:55 base kernel: [] 
vhost_net_ioctl+0x149/0x1a0 [vhost_net]
Apr  7 23:46:55 base kernel: [] do_vfs_ioctl+0x8c/0x340
Apr  7 23:46:55 base kernel: [] ? vfs_write+0xc5/0x130
Apr  7 23:46:55 base kernel: [] ? sys_futex+0x7b/0x180
Apr  7 23:46:55 base kernel: [] sys_ioctl+0xa1/0xb0
Apr  7 23:46:55 base kernel: [] 
system_call_fastpath+0x16/0x1b


Is there anything I can do to help debug these further? That is, if I can 
get them to occur again. The machine has been running without incident 
since the last reboot (18 hours).

I noticed that there is a taint flag but I can't figure out how that 
happened. I tried to check licenses with modinfo but can't see any modules 
without a GPL or related license. There shouldn't be any tainted modules 
since I compiled the kernel from vanilla 3.8.5.

Thanks,
Juhani
-- 
Juhani Rautiainen   jra...@iki.fi


Re: Virtualbox svga card in KVM

2013-04-08 Thread Sriram Murthy
By "richer display", I meant support for different resolutions and color 
depths (including nonstandard resolutions).
-Sriram




- Original Message -
From: Stefan Hajnoczi 
To: Sriram Murthy 
Cc: "kvm@vger.kernel.org" ; qemu list 

Sent: Monday, April 8, 2013 3:46 AM
Subject: Re: Virtualbox svga card in KVM

On Fri, Apr 05, 2013 at 04:52:05PM -0700, Sriram Murthy wrote:
> For starters, virtual box has better SVGA WDDM drivers that allows for a much 
> richer display when the VM display is local.

What does "much richer display" mean?

Stefan



Re: [Qemu-devel] Virtualbox svga card in KVM

2013-04-08 Thread Peter Maydell
On 6 April 2013 00:52, Sriram Murthy  wrote:
> (actually, the virtualbox SVGA card is based off of the KVM VGA card)

Is it possible to implement it as an extension to the VGA
card device, or has it diverged incompatibly such that it
has to be its own separate device model?

thanks
-- PMM


Re: [Qemu-devel] [PATCH uq/master v2 0/2] Add some tracepoints for clarification of the cause of troubles

2013-04-08 Thread Stefan Hajnoczi
On Fri, Mar 29, 2013 at 01:24:25PM +0900, Kazuya Saito wrote:
> This series adds tracepoints to help clarify the cause of problems.
> Virtualization on Linux is composed of several components, such as
> qemu, kvm, and libvirt, so it is very important to determine quickly
> which component a problem originates in. Although qemu has useful
> information here because it sits between kvm, libvirt and the guest,
> it currently doesn't expose that information through a trace or log
> system.
> These patches add tracepoints that reduce the time needed for such
> clarification. We'd like to add these tracepoints as a first set
> because, based on our experience, they will be useful for future
> investigations. Without them, we had a really hard time investigating
> a problem: its reproducibility was quite low and there was no clue in
> the qemu dump.
> 
> Changes from v1:
> Add arg to kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints.
> Add cpu_index to kvm_vcpu_ioctl, kvm_run_exit tracepoints.
> 
> Kazuya Saito (2):
>   kvm-all: add kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints
>   kvm-all: add kvm_run_exit tracepoint
> 
>  kvm-all.c|5 +
>  trace-events |7 +++
>  2 files changed, 12 insertions(+), 0 deletions(-)
> 
> 
> 

Thanks, applied to my tracing tree:
https://github.com/stefanha/qemu/commits/tracing

Stefan


Re: [Qemu-devel] Virtualbox svga card in KVM

2013-04-08 Thread Sriram Murthy
The Virtualbox SVGA card was derived from the KVM VGA card, so there are 
quite a few similarities (I am deliberately being vague here, as I am still 
in the process of fully discovering the features of both cards). Having 
said that, the APIs and the data structures themselves have been modified 
to add new features (like displaying a custom bmp as the VGA boot logo), 
and it has a custom vga bios as well.
Also, it is better that it be its own separate device model, so that 
maintenance of the vbox code becomes easier later. Further, I am thinking 
along the lines of retaining the Virtualbox SVGA card code as is and 
writing a small KVM abstraction layer, so that it will be easy to port bug 
fixes into the vbox SVGA card later on. 
Any comments/suggestions welcome here.

-Sriram



- Original Message -
From: Peter Maydell 
To: Sriram Murthy 
Cc: Stefan Hajnoczi ; qemu list ; 
"kvm@vger.kernel.org" 
Sent: Monday, April 8, 2013 8:11 AM
Subject: Re: [Qemu-devel] Virtualbox svga card in KVM

On 6 April 2013 00:52, Sriram Murthy  wrote:
> (actually, the virtualbox SVGA card is based off of the KVM VGA card)

Is it possible to implement it as an extension to the VGA
card device, or has it diverged incompatibly such that it
has to be its own separate device model?

thanks
-- PMM



[PATCH v3 00/32] Port of KVM to arm64

2013-04-08 Thread Marc Zyngier
This series contains the third version of KVM for arm64.

It depends on the following branches/series:
- git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git 
soc-armv8-model
  Catalin's platform support branch for v8 models
- git://github.com/columbia/linux-kvm-arm.git kvm-arm-fixes
  mostly reworking the 32bit port to accomodate for arm64
- git://github.com/columbia/linux-kvm-arm.git kvm-arm-next
  adding perf support
- http://lists.infradead.org/pipermail/linux-arm-kernel/2013-April/161381.html
  reworking the whole init procedure for KVM/ARM
- http://lists.infradead.org/pipermail/linux-arm-kernel/2013-April/161395.html
  more 32bit rework

The code is unsurprisingly extremely similar to the KVM/arm code, and
a lot of it is actually shared with the 32bit version. Some of the
include files are duplicated though (I'm definitely willing to fix
that).

In terms of features:
- Support for 4k and 64k pages
- Support for 32bit and 64bit guests
- PSCI support for SMP booting

Testing has been done on both AEMv8 and Foundation models, with
various 32 and 64bit guests running a variety of distributions (OE,
Ubuntu and openSUSE for 64bit, Debian and Ubuntu on 32bit).

The patches are also available on the following branch:
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
kvm-arm64/kvm

As we do not have a 64bit QEMU port, it has been tested using
kvmtools. Note that some of the changes have broken the userspace ABI
in v2, and you must update and rebuild your kvmtools (the previous
version won't work anymore):

git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
kvm-arm64/kvmtool

From v2:
- Dropped the idmap code and use the new KVM/ARM boot protocol
- New KVM_CAP_ARM_EL1_32BIT capability to let userspace detect if EL1
  is 32bit capable
- Fixed a bug in arch/arm/entry.S, where EL0/EL1 breakpoint handling was
  mixed up (spotted by Catalin Marinas)
- Some sparse fixes (courtesy of Geoff Levand)
- Dropped the "shared" attribute from device mappings (spotted by Catalin)
- Add API documentation
- Add MAINTAINERS entry

From v1:
- Rework of the world-switch to use common structure between host and
  guests (suggested by Christopher Covington)
- Some additional constants to make the EL1 fault injection clearer
  (suggested by Christopher Covington)
- Use of __u64 instead of "unsigned long" in the userspace API
  (suggested by Michael S. Tsirkin)
- Move the FP/SIMD registers into the "core" registers, dropping the
  specific accessors.
- Generic MPIDR implementation (suggested by Christopher Covington)
- Cleaner handling of the various host implementations

Marc Zyngier (32):
  arm64: add explicit symbols to ESR_EL1 decoding
  arm64: KVM: define HYP and Stage-2 translation page flags
  arm64: KVM: HYP mode idmap support
  arm64: KVM: EL2 register definitions
  arm64: KVM: system register definitions for 64bit guests
  arm64: KVM: Basic ESR_EL2 helpers and vcpu register access
  arm64: KVM: fault injection into a guest
  arm64: KVM: architecture specific MMU backend
  arm64: KVM: user space interface
  arm64: KVM: system register handling
  arm64: KVM: CPU specific system registers handling
  arm64: KVM: virtual CPU reset
  arm64: KVM: kvm_arch and kvm_vcpu_arch definitions
  arm64: KVM: MMIO access backend
  arm64: KVM: guest one-reg interface
  arm64: KVM: hypervisor initialization code
  arm64: KVM: HYP mode world switch implementation
  arm64: KVM: Exit handling
  arm64: KVM: Plug the VGIC
  arm64: KVM: Plug the arch timer
  arm64: KVM: PSCI implementation
  arm64: KVM: Build system integration
  arm64: KVM: define 32bit specific registers
  arm64: KVM: 32bit GP register access
  arm64: KVM: 32bit conditional execution emulation
  arm64: KVM: 32bit handling of coprocessor traps
  arm64: KVM: CPU specific 32bit coprocessor access
  arm64: KVM: 32bit specific register world switch
  arm64: KVM: 32bit guest fault injection
  arm64: KVM: enable initialization of a 32bit vcpu
  arm64: KVM: userspace API documentation
  arm64: KVM: MAINTAINERS update

 Documentation/virtual/kvm/api.txt   |   55 +-
 MAINTAINERS |9 +
 arch/arm/kvm/arch_timer.c   |1 +
 arch/arm64/Kconfig  |2 +
 arch/arm64/Makefile |2 +-
 arch/arm64/include/asm/esr.h|   55 ++
 arch/arm64/include/asm/kvm_arch_timer.h |   58 ++
 arch/arm64/include/asm/kvm_arm.h|  243 
 arch/arm64/include/asm/kvm_asm.h|  104 
 arch/arm64/include/asm/kvm_coproc.h |   56 ++
 arch/arm64/include/asm/kvm_emulate.h|  185 ++
 arch/arm64/include/asm/kvm_host.h   |  202 ++
 arch/arm64/include/asm/kvm_mmio.h   |   59 ++
 arch/arm64/include/asm/kvm_mmu.h|  136 
 arch/arm64/include/asm/kvm_psci.h   |   23 +
 arch/arm64/include/asm/kvm_vgic.h   |  156 +
 arch/arm64/include/asm/pgtable-hwdef.h  |   13 +
 arch/arm64/include/asm/pgtable.h|   12 

[PATCH v3 01/32] arm64: add explicit symbols to ESR_EL1 decoding

2013-04-08 Thread Marc Zyngier
The ESR_EL1 decoding process is a bit cryptic, and KVM also needs
the same constants.

Add a new esr.h file containing the appropriate exception classes
constants, and change entry.S to use it. Fix a small bug in the
EL1 breakpoint check while we're at it.

Signed-off-by: Marc Zyngier 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/esr.h | 55 
 arch/arm64/kernel/entry.S| 53 +-
 2 files changed, 82 insertions(+), 26 deletions(-)
 create mode 100644 arch/arm64/include/asm/esr.h

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
new file mode 100644
index 000..7883412
--- /dev/null
+++ b/arch/arm64/include/asm/esr.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ASM_ESR_H
+#define __ASM_ESR_H
+
+#define ESR_EL1_EC_SHIFT   (26)
+#define ESR_EL1_IL (1U << 25)
+
+#define ESR_EL1_EC_UNKNOWN (0x00)
+#define ESR_EL1_EC_WFI (0x01)
+#define ESR_EL1_EC_CP15_32 (0x03)
+#define ESR_EL1_EC_CP15_64 (0x04)
+#define ESR_EL1_EC_CP14_MR (0x05)
+#define ESR_EL1_EC_CP14_LS (0x06)
+#define ESR_EL1_EC_FP_ASIMD(0x07)
+#define ESR_EL1_EC_CP10_ID (0x08)
+#define ESR_EL1_EC_CP14_64 (0x0C)
+#define ESR_EL1_EC_ILL_ISS (0x0E)
+#define ESR_EL1_EC_SVC32   (0x11)
+#define ESR_EL1_EC_SVC64   (0x15)
+#define ESR_EL1_EC_SYS64   (0x18)
+#define ESR_EL1_EC_IABT_EL0(0x20)
+#define ESR_EL1_EC_IABT_EL1(0x21)
+#define ESR_EL1_EC_PC_ALIGN(0x22)
+#define ESR_EL1_EC_DABT_EL0(0x24)
+#define ESR_EL1_EC_DABT_EL1(0x25)
+#define ESR_EL1_EC_SP_ALIGN(0x26)
+#define ESR_EL1_EC_FP_EXC32(0x28)
+#define ESR_EL1_EC_FP_EXC64(0x2C)
+#define ESR_EL1_EC_SERRROR (0x2F)
+#define ESR_EL1_EC_BREAKPT_EL0 (0x30)
+#define ESR_EL1_EC_BREAKPT_EL1 (0x31)
+#define ESR_EL1_EC_SOFTSTP_EL0 (0x32)
+#define ESR_EL1_EC_SOFTSTP_EL1 (0x33)
+#define ESR_EL1_EC_WATCHPT_EL0 (0x34)
+#define ESR_EL1_EC_WATCHPT_EL1 (0x35)
+#define ESR_EL1_EC_BKPT32  (0x38)
+#define ESR_EL1_EC_BRK64   (0x3C)
+
+#endif /* __ASM_ESR_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 514d609..c7e0470 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -239,18 +240,18 @@ ENDPROC(el1_error_invalid)
 el1_sync:
kernel_entry 1
mrs x1, esr_el1 // read the syndrome register
-   lsr x24, x1, #26// exception class
-   cmp x24, #0x25  // data abort in EL1
+   lsr x24, x1, #ESR_EL1_EC_SHIFT  // exception class
+   cmp x24, #ESR_EL1_EC_DABT_EL1   // data abort in EL1
b.eqel1_da
-   cmp x24, #0x18  // configurable trap
+   cmp x24, #ESR_EL1_EC_SYS64  // configurable trap
b.eqel1_undef
-   cmp x24, #0x26  // stack alignment exception
+   cmp x24, #ESR_EL1_EC_SP_ALIGN   // stack alignment exception
b.eqel1_sp_pc
-   cmp x24, #0x22  // pc alignment exception
+   cmp x24, #ESR_EL1_EC_PC_ALIGN   // pc alignment exception
b.eqel1_sp_pc
-   cmp x24, #0x00  // unknown exception in EL1
+   cmp x24, #ESR_EL1_EC_UNKNOWN// unknown exception in EL1
b.eqel1_undef
-   cmp x24, #0x30  // debug exception in EL1
+   cmp x24, #ESR_EL1_EC_BREAKPT_EL1// debug exception in EL1
b.geel1_dbg
b   el1_inv
 el1_da:
@@ -346,27 +347,27 @@ el1_preempt:
 el0_sync:
kernel_entry 0
mrs x25, esr_el1// read the syndrome register
-   lsr x24, x25, #26   // exception class
-   cmp x24, #0x15  // SVC in 64-bit state
+   lsr x24, x25, #ESR_EL1_EC_SHIFT // exception class
+   cmp x24, #ESR_EL1_EC_SVC64  // SVC in 64-bit state
b.eqel0_svc
adr lr, ret_from_exception
-   cmp x24, #0x24  // data abort in EL0
+   cmp x24, #ESR_EL1_EC_DABT_EL0   // data abor

[PATCH v3 02/32] arm64: KVM: define HYP and Stage-2 translation page flags

2013-04-08 Thread Marc Zyngier
Add HYP and S2 page flags, for both normal and device memory.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/pgtable-hwdef.h | 13 +
 arch/arm64/include/asm/pgtable.h   | 12 
 2 files changed, 25 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 75fd13d..acb4ee5 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -35,6 +35,7 @@
 /*
  * Section
  */
+#define PMD_SECT_USER  (_AT(pteval_t, 1) << 6) /* AP[1] */
 #define PMD_SECT_S (_AT(pmdval_t, 3) << 8)
 #define PMD_SECT_AF(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_NG(_AT(pmdval_t, 1) << 11)
@@ -68,6 +69,18 @@
 #define PTE_ATTRINDX_MASK  (_AT(pteval_t, 7) << 2)
 
 /*
+ * 2nd stage PTE definitions
+ */
+#define PTE_S2_RDONLY   (_AT(pteval_t, 1) << 6)   /* HAP[1]   */
+#define PTE_S2_RDWR (_AT(pteval_t, 2) << 6)   /* HAP[2:1] */
+
+/*
+ * EL2/HYP PTE/PMD definitions
+ */
+#define PMD_HYP		PMD_SECT_USER
+#define PTE_HYP		PTE_USER
+
+/*
  * 40-bit physical address supported.
  */
 #define PHYS_MASK_SHIFT(40)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e333a24..7c84ab4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -76,6 +76,12 @@ extern pgprot_t pgprot_default;
 #define PAGE_KERNEL		_MOD_PROT(pgprot_default, PTE_PXN | PTE_UXN | PTE_DIRTY)
 #define PAGE_KERNEL_EXEC   _MOD_PROT(pgprot_default, PTE_UXN | PTE_DIRTY)
 
+#define PAGE_HYP   _MOD_PROT(pgprot_default, PTE_HYP)
+#define PAGE_HYP_DEVICE	_MOD_PROT(__pgprot(PROT_DEVICE_nGnRE), PTE_HYP)
+
+#define PAGE_S2		_MOD_PROT(pgprot_default, PTE_USER | PTE_S2_RDONLY)
+#define PAGE_S2_DEVICE	_MOD_PROT(__pgprot(PROT_DEVICE_nGnRE), PTE_USER | PTE_S2_RDWR)
+
 #define __PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_TYPE_MASK) | PTE_PROT_NONE)
 #define __PAGE_SHARED  __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | 
PTE_PXN | PTE_UXN)
 #define __PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | 
PTE_PXN)
@@ -197,6 +203,12 @@ extern pgprot_t phys_mem_access_prot(struct file *file, 
unsigned long pfn,
 
 #define pmd_bad(pmd)   (!(pmd_val(pmd) & 2))
 
+#define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
+PMD_TYPE_TABLE)
+#define pmd_sect(pmd)  ((pmd_val(pmd) & PMD_TYPE_MASK) == \
+PMD_TYPE_SECT)
+
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
*pmdp = pmd;
-- 
1.8.1.4




[PATCH v3 06/32] arm64: KVM: Basic ESR_EL2 helpers and vcpu register access

2013-04-08 Thread Marc Zyngier
Implements helpers for dealing with the EL2 syndrome register as
well as accessing the vcpu registers.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_emulate.h | 163 +++
 1 file changed, 163 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_emulate.h

diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
new file mode 100644
index 000..2dcfa74
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -0,0 +1,163 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/include/kvm_emulate.h
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_EMULATE_H__
+#define __ARM64_KVM_EMULATE_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+void kvm_inject_undefined(struct kvm_vcpu *vcpu);
+void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
+void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
+
+static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
+{
+   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
+}
+
+static inline unsigned long *vcpu_elr_el1(const struct kvm_vcpu *vcpu)
+{
+   return (unsigned long *)&vcpu_gp_regs(vcpu)->elr_el1;
+}
+
+static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
+{
+   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate;
+}
+
+static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
+{
+   return false;   /* 32bit? Bahhh... */
+}
+
+static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
+{
+   return true;/* No conditionals on arm64 */
+}
+
+static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
+{
+   *vcpu_pc(vcpu) += 4;
+}
+
+static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
+{
+}
+
+static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
+{
+   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
+}
+
+/* Get vcpu SPSR for current mode */
+static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
+{
+   return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1];
+}
+
+static inline bool kvm_vcpu_reg_is_pc(const struct kvm_vcpu *vcpu, int reg)
+{
+   return false;
+}
+
+static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu)
+{
+   u32 mode = *vcpu_cpsr(vcpu) & PSR_MODE_MASK;
+
+   return mode != PSR_MODE_EL0t;
+}
+
+static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
+{
+   return vcpu->arch.fault.esr_el2;
+}
+
+static inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu)
+{
+   return vcpu->arch.fault.far_el2;
+}
+
+static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
+{
+   return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
+}
+
+static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_ISV);
+}
+
+static inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_WNR);
+}
+
+static inline bool kvm_vcpu_dabt_issext(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_SSE);
+}
+
+static inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
+{
+   return (kvm_vcpu_get_hsr(vcpu) & ESR_EL2_SRT_MASK) >> ESR_EL2_SRT_SHIFT;
+}
+
+static inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_EA);
+}
+
+static inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_S1PTW);
+}
+
+static inline int kvm_vcpu_dabt_get_as(const struct kvm_vcpu *vcpu)
+{
+   return 1 << ((kvm_vcpu_get_hsr(vcpu) & ESR_EL2_SAS) >> ESR_EL2_SAS_SHIFT);
+}
+
+/* This one is not specific to Data Abort */
+static inline bool kvm_vcpu_trap_il_is32bit(const struct kvm_vcpu *vcpu)
+{
+   return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_IL);
+}
+
+static inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu)
+{
+   return kvm_vcpu_get_hsr(vcpu) >> ESR_EL2_EC_SHIFT;
+}
+
+static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
+{
+   return kvm_vcpu_trap_get_class(vcpu) == ESR_EL2_EC_IABT;
+}

[PATCH v3 05/32] arm64: KVM: system register definitions for 64bit guests

2013-04-08 Thread Marc Zyngier
Define the saved/restored registers for 64bit guests.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_asm.h | 68 
 1 file changed, 68 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_asm.h

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
new file mode 100644
index 0000000..591ac21
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -0,0 +1,68 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ARM_KVM_ASM_H__
+#define __ARM_KVM_ASM_H__
+
+/*
+ * 0 is reserved as an invalid value.
+ * Order *must* be kept in sync with the hyp switch code.
+ */
+#define	MPIDR_EL1	1	/* MultiProcessor Affinity Register */
+#define	CSSELR_EL1	2	/* Cache Size Selection Register */
+#define	SCTLR_EL1	3	/* System Control Register */
+#define	ACTLR_EL1	4	/* Auxiliary Control Register */
+#define	CPACR_EL1	5	/* Coprocessor Access Control */
+#define	TTBR0_EL1	6	/* Translation Table Base Register 0 */
+#define	TTBR1_EL1	7	/* Translation Table Base Register 1 */
+#define	TCR_EL1		8	/* Translation Control Register */
+#define	ESR_EL1		9	/* Exception Syndrome Register */
+#define	AFSR0_EL1	10	/* Auxiliary Fault Status Register 0 */
+#define	AFSR1_EL1	11	/* Auxiliary Fault Status Register 1 */
+#define	FAR_EL1		12	/* Fault Address Register */
+#define	MAIR_EL1	13	/* Memory Attribute Indirection Register */
+#define	VBAR_EL1	14	/* Vector Base Address Register */
+#define	CONTEXTIDR_EL1	15	/* Context ID Register */
+#define	TPIDR_EL0	16	/* Thread ID, User R/W */
+#define	TPIDRRO_EL0	17	/* Thread ID, User R/O */
+#define	TPIDR_EL1	18	/* Thread ID, Privileged */
+#define	AMAIR_EL1	19	/* Aux Memory Attribute Indirection Register */
+#define	CNTKCTL_EL1	20	/* Timer Control Register (EL1) */
+#define	NR_SYS_REGS	21
+
+#define ARM_EXCEPTION_IRQ	0
+#define ARM_EXCEPTION_TRAP   1
+
+#ifndef __ASSEMBLY__
+struct kvm;
+struct kvm_vcpu;
+
+extern char __kvm_hyp_init[];
+extern char __kvm_hyp_init_end[];
+
+extern char __kvm_hyp_vector[];
+
+extern char __kvm_hyp_code_start[];
+extern char __kvm_hyp_code_end[];
+
+extern void __kvm_flush_vm_context(void);
+extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+
+extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
+#endif
+
+#endif /* __ARM_KVM_ASM_H__ */
-- 
1.8.1.4


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 03/32] arm64: KVM: HYP mode idmap support

2013-04-08 Thread Marc Zyngier
Add the necessary infrastructure for identity-mapped HYP page
tables. Idmap-ed code must be in the ".hyp.idmap.text" linker
section.

The rest of the HYP ends up in ".hyp.text".

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kernel/vmlinux.lds.S | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 3fae2be..855d43d 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -17,6 +17,15 @@ ENTRY(stext)
 
 jiffies = jiffies_64;
 
+#define HYPERVISOR_TEXT	\
+   . = ALIGN(2048);\
+   VMLINUX_SYMBOL(__hyp_idmap_text_start) = .; \
+   *(.hyp.idmap.text)  \
+   VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;   \
+   VMLINUX_SYMBOL(__hyp_text_start) = .;   \
+   *(.hyp.text)\
+   VMLINUX_SYMBOL(__hyp_text_end) = .;
+
 SECTIONS
 {
/*
@@ -49,6 +58,7 @@ SECTIONS
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
+   HYPERVISOR_TEXT
*(.fixup)
*(.gnu.warning)
. = ALIGN(16);
@@ -124,3 +134,9 @@ SECTIONS
STABS_DEBUG
.comment 0 : { *(.comment) }
 }
+
+/*
+ * The HYP init code can't be more than a page long.
+ */
+ASSERT(((__hyp_idmap_text_start + PAGE_SIZE) >= __hyp_idmap_text_end),
+   "HYP init code too big")
-- 
1.8.1.4




[PATCH v3 04/32] arm64: KVM: EL2 register definitions

2013-04-08 Thread Marc Zyngier
Define all the useful bitfields for EL2 registers.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_arm.h | 243 +++
 1 file changed, 243 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_arm.h

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
new file mode 100644
index 0000000..8ced0ca
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -0,0 +1,243 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ARM64_KVM_ARM_H__
+#define __ARM64_KVM_ARM_H__
+
+#include 
+
+/* Hyp Configuration Register (HCR) bits */
+#define HCR_ID (1 << 33)
+#define HCR_CD (1 << 32)
+#define HCR_RW_SHIFT   31
+#define HCR_RW (1 << HCR_RW_SHIFT)
+#define HCR_TRVM   (1 << 30)
+#define HCR_HCD(1 << 29)
+#define HCR_TDZ(1 << 28)
+#define HCR_TGE(1 << 27)
+#define HCR_TVM(1 << 26)
+#define HCR_TTLB   (1 << 25)
+#define HCR_TPU(1 << 24)
+#define HCR_TPC(1 << 23)
+#define HCR_TSW(1 << 22)
+#define HCR_TAC(1 << 21)
+#define HCR_TIDCP  (1 << 20)
+#define HCR_TSC(1 << 19)
+#define HCR_TID3   (1 << 18)
+#define HCR_TID2   (1 << 17)
+#define HCR_TID1   (1 << 16)
+#define HCR_TID0   (1 << 15)
+#define HCR_TWE(1 << 14)
+#define HCR_TWI(1 << 13)
+#define HCR_DC (1 << 12)
+#define HCR_BSU(3 << 10)
+#define HCR_BSU_IS (1 << 10)
+#define HCR_FB (1 << 9)
+#define HCR_VA (1 << 8)
+#define HCR_VI (1 << 7)
+#define HCR_VF (1 << 6)
+#define HCR_AMO(1 << 5)
+#define HCR_IMO(1 << 4)
+#define HCR_FMO(1 << 3)
+#define HCR_PTW(1 << 2)
+#define HCR_SWIO   (1 << 1)
+#define HCR_VM (1)
+
+/*
+ * The bits we set in HCR:
+ * RW:		64bit by default, can be overridden for 32bit VMs
+ * TAC:	Trap ACTLR
+ * TSC:	Trap SMC
+ * TSW:	Trap cache operations by set/way
+ * TWI:	Trap WFI
+ * TIDCP:	Trap L2CTLR/L2ECTLR
+ * BSU_IS:	Upgrade barriers to the inner shareable domain
+ * FB:		Force broadcast of all maintenance operations
+ * AMO:	Override CPSR.A and enable signaling with VA
+ * IMO:	Override CPSR.I and enable signaling with VI
+ * FMO:	Override CPSR.F and enable signaling with VF
+ * SWIO:	Turn set/way invalidates into set/way clean+invalidate
+ */
+#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | \
+HCR_FB | HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | \
+HCR_SWIO | HCR_TIDCP | HCR_RW)
+#define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
+
+/* Hyp System Control Register (SCTLR_EL2) bits */
+#define SCTLR_EL2_EE   (1 << 25)
+#define SCTLR_EL2_WXN  (1 << 19)
+#define SCTLR_EL2_I(1 << 12)
+#define SCTLR_EL2_SA   (1 << 3)
+#define SCTLR_EL2_C(1 << 2)
+#define SCTLR_EL2_A(1 << 1)
+#define SCTLR_EL2_M	1
+#define SCTLR_EL2_FLAGS	(SCTLR_EL2_M | SCTLR_EL2_A | SCTLR_EL2_C |	\
+			 SCTLR_EL2_SA | SCTLR_EL2_I)
+
+/* TCR_EL2 Registers bits */
+#define TCR_EL2_TBI(1 << 20)
+#define TCR_EL2_PS (7 << 16)
+#define TCR_EL2_PS_40B (2 << 16)
+#define TCR_EL2_TG0(1 << 14)
+#define TCR_EL2_SH0(3 << 12)
+#define TCR_EL2_ORGN0  (3 << 10)
+#define TCR_EL2_IRGN0  (3 << 8)
+#define TCR_EL2_T0SZ   0x3f
+#define TCR_EL2_MASK   (TCR_EL2_TG0 | TCR_EL2_SH0 | \
+TCR_EL2_ORGN0 | TCR_EL2_IRGN0 | TCR_EL2_T0SZ)
+
+#define TCR_EL2_FLAGS  (TCR_EL2_PS_40B)
+
+/* VTCR_EL2 Registers bits */
+#define VTCR_EL2_PS_MASK   (7 << 16)
+#define VTCR_EL2_PS_40B(2 << 16)
+#define VTCR_EL2_TG0_MASK  (1 << 14)
+#define VTCR_EL2_TG0_4K(0 << 14)
+#define VTCR_EL2_TG0_64K   (1 << 14)
+#define VTCR_EL2_SH0_MASK  (3 << 12)
+#define VTCR_EL2_SH0_INNER (3 << 12)
+#define VTCR_EL2_ORGN0_MASK(3 << 10)
+#define VTCR_EL2_ORGN0_WBWA(3 << 10)
+#define VTCR_EL2_IRGN0_MASK(3 << 8)
+#define VTCR_EL2_IRGN0_WBWA(3 << 8)
+#define VTCR_EL2_SL0_MASK  (3 << 6)
+#define VTCR_EL2_SL0_LVL1	(1 << 6)

[PATCH v3 29/32] arm64: KVM: 32bit guest fault injection

2013-04-08 Thread Marc Zyngier
Add fault injection capability for 32bit guests.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/inject_fault.c | 79 ++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index 2ff3b78..083cfd5 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -1,5 +1,5 @@
 /*
- * Fault injection for 64bit guests.
+ * Fault injection for both 32 and 64bit guests.
  *
  * Copyright (C) 2012,2013 - ARM Ltd
  * Author: Marc Zyngier 
@@ -25,6 +25,74 @@
 #include 
 #include 
 
+static void prepare_fault32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
+{
+   unsigned long cpsr;
+   unsigned long new_spsr_value = *vcpu_cpsr(vcpu);
+   bool is_thumb = (new_spsr_value & COMPAT_PSR_T_BIT);
+   u32 return_offset = (is_thumb) ? 4 : 0;
+   u32 sctlr = vcpu_cp15(vcpu, c1_SCTLR);
+
+   cpsr = mode | COMPAT_PSR_I_BIT;
+
+   if (sctlr & (1 << 30))
+   cpsr |= COMPAT_PSR_T_BIT;
+   if (sctlr & (1 << 25))
+   cpsr |= COMPAT_PSR_E_BIT;
+
+   *vcpu_cpsr(vcpu) = cpsr;
+
+   /* Note: These now point to the banked copies */
+   *vcpu_spsr(vcpu) = new_spsr_value;
+   *vcpu_reg(vcpu, 14) = *vcpu_pc(vcpu) + return_offset;
+
+   /* Branch to exception vector */
+   if (sctlr & (1 << 13))
+   vect_offset += 0xffff0000;
+   else /* always have security exceptions */
+   vect_offset += vcpu_cp15(vcpu, c12_VBAR);
+
+   *vcpu_pc(vcpu) = vect_offset;
+}
+
+static void inject_undef32(struct kvm_vcpu *vcpu)
+{
+   prepare_fault32(vcpu, COMPAT_PSR_MODE_UND, 4);
+}
+
+/*
+ * Modelled after TakeDataAbortException() and TakePrefetchAbortException
+ * pseudocode.
+ */
+static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt,
+unsigned long addr)
+{
+   u32 vect_offset;
+   u32 *far, *fsr;
+   bool is_lpae;
+
+   if (is_pabt) {
+   vect_offset = 12;
+   far = &vcpu_cp15(vcpu, c6_IFAR);
+   fsr = &vcpu_cp15(vcpu, c5_IFSR);
+   } else { /* !iabt */
+   vect_offset = 16;
+   far = &vcpu_cp15(vcpu, c6_DFAR);
+   fsr = &vcpu_cp15(vcpu, c5_DFSR);
+   }
+
+   prepare_fault32(vcpu, COMPAT_PSR_MODE_ABT | COMPAT_PSR_A_BIT, vect_offset);
+
+   *far = addr;
+
+   /* Give the guest an IMPLEMENTATION DEFINED exception */
+   is_lpae = (vcpu_cp15(vcpu, c2_TTBCR) >> 31);
+   if (is_lpae)
+   *fsr = 1 << 9 | 0x34;
+   else
+   *fsr = 0x14;
+}
+
 static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
 {
unsigned long cpsr = *vcpu_cpsr(vcpu);
@@ -90,6 +158,9 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
  */
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
 {
+   if (!(vcpu->arch.hcr_el2 & HCR_RW))
+   inject_abt32(vcpu, false, addr);
+
inject_abt64(vcpu, false, addr);
 }
 
@@ -103,6 +174,9 @@ void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
  */
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
 {
+   if (!(vcpu->arch.hcr_el2 & HCR_RW))
+   inject_abt32(vcpu, true, addr);
+
inject_abt64(vcpu, true, addr);
 }
 
@@ -114,5 +188,8 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
  */
 void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 {
+   if (!(vcpu->arch.hcr_el2 & HCR_RW))
+   inject_undef32(vcpu);
+
inject_undef64(vcpu);
 }
-- 
1.8.1.4




[PATCH v3 28/32] arm64: KVM: 32bit specific register world switch

2013-04-08 Thread Marc Zyngier
Allow registers specific to 32bit guests to be saved/restored
during the world switch.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/hyp.S | 70 
 1 file changed, 70 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 25da0b5..bf305c1 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -266,6 +266,74 @@ __kvm_hyp_code_start:
msr cntkctl_el1,x23
 .endm
 
+.macro skip_32bit_state tmp, target
+   // Skip 32bit state if not needed
+   mrs \tmp, hcr_el2
+   tbnz\tmp, #HCR_RW_SHIFT, \target
+.endm
+
+.macro skip_tee_state tmp, target
+   // Skip ThumbEE state if not needed
+   mrs \tmp, id_pfr0_el1
+   tbz \tmp, #12, \target
+.endm
+
+.macro save_guest_32bit_state
+   skip_32bit_state x3, 1f
+
+   add x3, x2, #CPU_SPSR_OFFSET(KVM_SPSR_ABT)
+   mrs x4, spsr_abt
+   mrs x5, spsr_und
+   mrs x6, spsr_irq
+   mrs x7, spsr_fiq
+   stp x4, x5, [x3]
+   stp x6, x7, [x3, #16]
+
+   add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
+   mrs x4, dacr32_el2
+   mrs x5, ifsr32_el2
+   mrs x6, fpexc32_el2
+   mrs x7, dbgvcr32_el2
+   stp x4, x5, [x3]
+   stp x6, x7, [x3, #16]
+
+   skip_tee_state x8, 1f
+
+   add x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
+   mrs x4, teecr32_el1
+   mrs x5, teehbr32_el1
+   stp x4, x5, [x3]
+1:
+.endm
+
+.macro restore_guest_32bit_state
+   skip_32bit_state x3, 1f
+
+   add x3, x2, #CPU_SPSR_OFFSET(KVM_SPSR_ABT)
+   ldp x4, x5, [x3]
+   ldp x6, x7, [x3, #16]
+   msr spsr_abt, x4
+   msr spsr_und, x5
+   msr spsr_irq, x6
+   msr spsr_fiq, x7
+
+   add x3, x2, #CPU_SYSREG_OFFSET(DACR32_EL2)
+   ldp x4, x5, [x3]
+   ldp x6, x7, [x3, #16]
+   msr dacr32_el2, x4
+   msr ifsr32_el2, x5
+   msr fpexc32_el2, x6
+   msr dbgvcr32_el2, x7
+
+   skip_tee_state x8, 1f
+
+   add x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
+   ldp x4, x5, [x3]
+   msr teecr32_el1, x4
+   msr teehbr32_el1, x5
+1:
+.endm
+
 .macro activate_traps
ldr x2, [x0, #VCPU_IRQ_LINES]
ldr x1, [x0, #VCPU_HCR_EL2]
@@ -494,6 +562,7 @@ ENTRY(__kvm_vcpu_run)
 
bl __restore_sysregs
bl __restore_fpsimd
+   restore_guest_32bit_state
restore_guest_regs
 
// That's it, no more messing around.
@@ -510,6 +579,7 @@ __kvm_vcpu_return:
save_guest_regs
bl __save_fpsimd
bl __save_sysregs
+   save_guest_32bit_state
 
save_timer_state
save_vgic_state
-- 
1.8.1.4




[PATCH v3 09/32] arm64: KVM: user space interface

2013-04-08 Thread Marc Zyngier
Provide the kvm.h file that defines the user space visible
interface.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/uapi/asm/kvm.h | 117 ++
 1 file changed, 117 insertions(+)
 create mode 100644 arch/arm64/include/uapi/asm/kvm.h

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
new file mode 100644
index 0000000..4e64570
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -0,0 +1,117 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/include/uapi/asm/kvm.h:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM_KVM_H__
+#define __ARM_KVM_H__
+
+#define KVM_SPSR_EL1	0
+#define KVM_NR_SPSR	1
+
+#ifndef __ASSEMBLY__
+#include 
+#include 
+
+#define __KVM_HAVE_GUEST_DEBUG
+#define __KVM_HAVE_IRQ_LINE
+
+#define KVM_REG_SIZE(id)   \
+   (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
+
+struct kvm_regs {
+   struct user_pt_regs regs;   /* sp = sp_el0 */
+
+   __u64   sp_el1;
+   __u64   elr_el1;
+
+   __u64   spsr[KVM_NR_SPSR];
+
+   struct user_fpsimd_state fp_regs;
+};
+
+/* Supported Processor Types */
+#define KVM_ARM_TARGET_AEM_V8  0
+#define KVM_ARM_TARGET_FOUNDATION_V8   1
+#define KVM_ARM_TARGET_CORTEX_A57  2
+
+#define KVM_ARM_NUM_TARGETS	3
+
+/* KVM_ARM_SET_DEVICE_ADDR ioctl id encoding */
+#define KVM_ARM_DEVICE_TYPE_SHIFT  0
+#define KVM_ARM_DEVICE_TYPE_MASK   (0xffff << KVM_ARM_DEVICE_TYPE_SHIFT)
+#define KVM_ARM_DEVICE_ID_SHIFT16
+#define KVM_ARM_DEVICE_ID_MASK (0xffff << KVM_ARM_DEVICE_ID_SHIFT)
+
+/* Supported device IDs */
+#define KVM_ARM_DEVICE_VGIC_V2 0
+
+/* Supported VGIC address types  */
+#define KVM_VGIC_V2_ADDR_TYPE_DIST 0
+#define KVM_VGIC_V2_ADDR_TYPE_CPU  1
+
+#define KVM_VGIC_V2_DIST_SIZE  0x1000
+#define KVM_VGIC_V2_CPU_SIZE   0x2000
+
+struct kvm_vcpu_init {
+   __u32 target;
+   __u32 features[7];
+};
+
+struct kvm_sregs {
+};
+
+struct kvm_fpu {
+};
+
+struct kvm_guest_debug_arch {
+};
+
+struct kvm_debug_exit_arch {
+};
+
+struct kvm_sync_regs {
+};
+
+struct kvm_arch_memory_slot {
+};
+
+/* KVM_IRQ_LINE irq field index values */
+#define KVM_ARM_IRQ_TYPE_SHIFT 24
+#define KVM_ARM_IRQ_TYPE_MASK  0xff
+#define KVM_ARM_IRQ_VCPU_SHIFT 16
+#define KVM_ARM_IRQ_VCPU_MASK  0xff
+#define KVM_ARM_IRQ_NUM_SHIFT  0
+#define KVM_ARM_IRQ_NUM_MASK   0xffff
+
+/* irq_type field */
+#define KVM_ARM_IRQ_TYPE_CPU   0
+#define KVM_ARM_IRQ_TYPE_SPI   1
+#define KVM_ARM_IRQ_TYPE_PPI   2
+
+/* out-of-kernel GIC cpu interrupt injection irq_number field */
+#define KVM_ARM_IRQ_CPU_IRQ	0
+#define KVM_ARM_IRQ_CPU_FIQ	1
+
+/* Highest supported SPI, from VGIC_NR_IRQS */
+#define KVM_ARM_IRQ_GIC_MAX	127
+
+#endif
+
+#endif /* __ARM_KVM_H__ */
-- 
1.8.1.4




[PATCH v3 32/32] arm64: KVM: MAINTAINERS update

2013-04-08 Thread Marc Zyngier
Elect myself as the KVM/arm64 maintainer.

Signed-off-by: Marc Zyngier 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 836a618..c6e0170 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4628,6 +4628,15 @@ F:   arch/arm/include/uapi/asm/kvm*
 F: arch/arm/include/asm/kvm*
 F: arch/arm/kvm/
 
+KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
+M: Marc Zyngier 
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+L: kvm...@lists.cs.columbia.edu
+S: Maintained
+F: arch/arm64/include/uapi/asm/kvm*
+F: arch/arm64/include/asm/kvm*
+F: arch/arm64/kvm/
+
 KEXEC
 M: Eric Biederman 
 W: http://kernel.org/pub/linux/utils/kernel/kexec/
-- 
1.8.1.4




[PATCH v3 31/32] arm64: KVM: userspace API documentation

2013-04-08 Thread Marc Zyngier
Unsurprisingly, the arm64 userspace API is extremely similar to
the 32bit one, the only significant difference being the ONE_REG
register mapping.

Signed-off-by: Marc Zyngier 
---
 Documentation/virtual/kvm/api.txt | 55 +--
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 119358d..7c3385e 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -280,7 +280,7 @@ kvm_run' (see below).
 4.11 KVM_GET_REGS
 
 Capability: basic
-Architectures: all except ARM
+Architectures: all except ARM, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_regs (out)
 Returns: 0 on success, -1 on error
@@ -301,7 +301,7 @@ struct kvm_regs {
 4.12 KVM_SET_REGS
 
 Capability: basic
-Architectures: all except ARM
+Architectures: all except ARM, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_regs (in)
 Returns: 0 on success, -1 on error
@@ -587,7 +587,7 @@ struct kvm_fpu {
 4.24 KVM_CREATE_IRQCHIP
 
 Capability: KVM_CAP_IRQCHIP
-Architectures: x86, ia64, ARM
+Architectures: x86, ia64, ARM, arm64
 Type: vm ioctl
 Parameters: none
 Returns: 0 on success, -1 on error
@@ -595,14 +595,14 @@ Returns: 0 on success, -1 on error
 Creates an interrupt controller model in the kernel.  On x86, creates a virtual
 ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
 local APIC.  IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
-only go to the IOAPIC.  On ia64, a IOSAPIC is created. On ARM, a GIC is
+only go to the IOAPIC.  On ia64, a IOSAPIC is created. On ARM/arm64, a GIC is
 created.
 
 
 4.25 KVM_IRQ_LINE
 
 Capability: KVM_CAP_IRQCHIP
-Architectures: x86, ia64, arm
+Architectures: x86, ia64, arm, arm64
 Type: vm ioctl
 Parameters: struct kvm_irq_level
 Returns: 0 on success, -1 on error
@@ -612,9 +612,10 @@ On some architectures it is required that an interrupt controller model has
 been previously created with KVM_CREATE_IRQCHIP.  Note that edge-triggered
 interrupts require the level to be set to 1 and then back to 0.
 
-ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip
-(GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for
-specific cpus.  The irq field is interpreted like this:
+ARM/arm64 can signal an interrupt either at the CPU level, or at the
+in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
+use PPIs designated for specific cpus.  The irq field is interpreted
+like this:
 
   bits:  | 31 ... 24 | 23  ... 16 | 15...0 |
   field: | irq_type  | vcpu_index | irq_id |
@@ -1802,6 +1803,19 @@ ARM 32-bit VFP control registers have the following id bit patterns:
 ARM 64-bit FP registers have the following id bit patterns:
   0x4002 0000 0012 0 <regno:12>
 
+
+arm64 registers are mapped using the lower 32 bits. The upper 16 of
+that is the register group type, or coprocessor number:
+
+arm64 core/FP-SIMD registers have the following id bit patterns:
+  0x6002 0000 0010 <index into the kvm_regs struct:16>
+
+arm64 CCSIDR registers are demultiplexed by CSSELR value:
+  0x6002 0000 0011 00 <csselr:8>
+
+arm64 system registers have the following id bit patterns:
+  0x6002 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
+
 4.69 KVM_GET_ONE_REG
 
 Capability: KVM_CAP_ONE_REG
@@ -2165,7 +2179,7 @@ valid entries found.
 4.77 KVM_ARM_VCPU_INIT
 
 Capability: basic
-Architectures: arm
+Architectures: arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_vcpu_init (in)
 Returns: 0 on success; -1 on error
@@ -2184,12 +2198,14 @@ should be created before this ioctl is invoked.
 Possible features:
- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
  Depends on KVM_CAP_ARM_PSCI.
+   - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
+ Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
 
 
 4.78 KVM_GET_REG_LIST
 
 Capability: basic
-Architectures: arm
+Architectures: arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_reg_list (in/out)
 Returns: 0 on success; -1 on error
@@ -2209,7 +2225,7 @@ KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
 4.80 KVM_ARM_SET_DEVICE_ADDR
 
 Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
-Architectures: arm
+Architectures: arm, arm64
 Type: vm ioctl
 Parameters: struct kvm_arm_device_address (in)
 Returns: 0 on success, -1 on error
@@ -2230,18 +2246,19 @@ can access emulated or directly exposed devices, which 
the host kernel needs
 to know about. The id field is an architecture specific identifier for a
 specific device.
 
-ARM divides the id field into two parts, a device id and an address type id
-specific to the individual device.
+ARM/arm64 divides the id field into two parts, a device id and an
+address type id specific to the individual device.
 
   bits:  | 63...   32 | 31...16 | 15...0 |
  field: | 0x00000000 | device id   |  addr type id  |
 
-ARM currently only require this when using the in-kernel GIC support for the
-hardware VGIC 

[PATCH v3 08/32] arm64: KVM: architecture specific MMU backend

2013-04-08 Thread Marc Zyngier
Define the arm64 specific MMU backend:
- HYP/kernel VA offset
- S2 4/64kB definitions
- S2 page table populating and flushing
- icache cleaning

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_mmu.h | 136 +++
 1 file changed, 136 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_mmu.h

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
new file mode 100644
index 0000000..2eb2230
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -0,0 +1,136 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ARM64_KVM_MMU_H__
+#define __ARM64_KVM_MMU_H__
+
+#include 
+#include 
+
+/*
+ * As we only have the TTBR0_EL2 register, we cannot express
+ * "negative" addresses. This makes it impossible to directly share
+ * mappings with the kernel.
+ *
+ * Instead, give the HYP mode its own VA region at a fixed offset from
+ * the kernel by just masking the top bits (which are all ones for a
+ * kernel address).
+ */
+#define HYP_PAGE_OFFSET_SHIFT  VA_BITS
+#define HYP_PAGE_OFFSET_MASK   ((UL(1) << HYP_PAGE_OFFSET_SHIFT) - 1)
+#define HYP_PAGE_OFFSET	(PAGE_OFFSET & HYP_PAGE_OFFSET_MASK)
+
+/*
+ * Our virtual mapping for the idmap-ed MMU-enable code. Must be
+ * shared across all the page-tables. Conveniently, we use the last
+ * possible page, where no kernel mapping will ever exist.
+ */
+#define TRAMPOLINE_VA  (HYP_PAGE_OFFSET_MASK & PAGE_MASK)
+
+#ifdef __ASSEMBLY__
+
+/*
+ * Convert a kernel VA into a HYP VA.
+ * reg: VA to be converted.
+ */
+.macro kern_hyp_va reg
+   and \reg, \reg, #HYP_PAGE_OFFSET_MASK
+.endm
+
+#else
+
+#include 
+
+#define KERN_TO_HYP(kva)   ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
+
+/*
+ * Align KVM with the kernel's view of physical memory. Should be
+ * 40bit IPA, with PGD being 8kB aligned.
+ */
+#define KVM_PHYS_SHIFT PHYS_MASK_SHIFT
+#define KVM_PHYS_SIZE  (1UL << KVM_PHYS_SHIFT)
+#define KVM_PHYS_MASK  (KVM_PHYS_SIZE - 1UL)
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#define PAGE_LEVELS	2
+#define BITS_PER_LEVEL 13
+#else  /* 4kB pages */
+#define PAGE_LEVELS	3
+#define BITS_PER_LEVEL 9
+#endif
+
+/* Make sure we get the right size, and thus the right alignment */
+#define BITS_PER_S2_PGD (KVM_PHYS_SHIFT - (PAGE_LEVELS - 1) * BITS_PER_LEVEL - PAGE_SHIFT)
+#define PTRS_PER_S2_PGD (1 << max(BITS_PER_LEVEL, BITS_PER_S2_PGD))
+#define S2_PGD_ORDER   get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
+#define S2_PGD_SIZE	(1 << S2_PGD_ORDER)
+
+int create_hyp_mappings(void *from, void *to);
+int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
+void free_hyp_pgds(void);
+
+int kvm_alloc_stage2_pgd(struct kvm *kvm);
+void kvm_free_stage2_pgd(struct kvm *kvm);
+int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
+ phys_addr_t pa, unsigned long size);
+
+int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
+
+void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
+
+phys_addr_t kvm_mmu_get_httbr(void);
+phys_addr_t kvm_mmu_get_boot_httbr(void);
+phys_addr_t kvm_get_idmap_vector(void);
+int kvm_mmu_init(void);
+void kvm_clear_hyp_idmap(void);
+
+#define	kvm_set_pte(ptep, pte)	set_pte(ptep, pte)
+
+static inline bool kvm_is_write_fault(unsigned long esr)
+{
+   unsigned long esr_ec = esr >> ESR_EL2_EC_SHIFT;
+
+   if (esr_ec == ESR_EL2_EC_IABT)
+   return false;
+
+   if ((esr & ESR_EL2_ISV) && !(esr & ESR_EL2_WNR))
+   return false;
+
+   return true;
+}
+
+static inline void kvm_clean_pgd(pgd_t *pgd) {}
+static inline void kvm_clean_pmd_entry(pmd_t *pmd) {}
+static inline void kvm_clean_pte(pte_t *pte) {}
+
+static inline void kvm_set_s2pte_writable(pte_t *pte)
+{
+   pte_val(*pte) |= PTE_S2_RDWR;
+}
+
+struct kvm;
+
+static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
+{
+   unsigned long hva = gfn_to_hva(kvm, gfn);
+   flush_icache_range(hva, hva + PAGE_SIZE);
+}
+
+#define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
+
+#endif /* __ASSEMBLY__ */
+#endif /* __ARM64_KVM_MMU_H__ */
-- 
1.8.1.4


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
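The stage-2 PGD sizing arithmetic in the hunk above can be sanity-checked with a small user-space sketch. KVM_PHYS_SHIFT (a 40-bit IPA space) is assumed here; it is defined elsewhere in the series, not in this hunk:

```c
#include <assert.h>

/* Assumed: KVM_PHYS_SHIFT is 40 (40-bit IPA space) as elsewhere in
 * this series; it is not defined in the hunk above. */
#define KVM_PHYS_SHIFT 40

static int ptrs_per_s2_pgd(int page_shift, int page_levels, int bits_per_level)
{
	/* BITS_PER_S2_PGD from the patch */
	int bits = KVM_PHYS_SHIFT - (page_levels - 1) * bits_per_level - page_shift;

	/* PTRS_PER_S2_PGD takes max(BITS_PER_LEVEL, BITS_PER_S2_PGD) */
	if (bits < bits_per_level)
		bits = bits_per_level;
	return 1 << bits;
}
```

With 4kB pages (PAGE_SHIFT=12, 3 levels, 9 bits/level) this gives 1024 stage-2 PGD entries; with 64kB pages (PAGE_SHIFT=16, 2 levels, 13 bits/level) the max() kicks in and gives 8192.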

[PATCH v3 14/32] arm64: KVM: MMIO access backend

2013-04-08 Thread Marc Zyngier
Define the necessary structures to perform an MMIO access.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_mmio.h | 59 +++
 1 file changed, 59 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_mmio.h

diff --git a/arch/arm64/include/asm/kvm_mmio.h b/arch/arm64/include/asm/kvm_mmio.h
new file mode 100644
index 000..fc2f689
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_mmio.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_MMIO_H__
+#define __ARM64_KVM_MMIO_H__
+
+#include 
+#include 
+#include 
+
+/*
+ * This is annoying. The mmio code requires this, even if we don't
+ * need any decoding. To be fixed.
+ */
+struct kvm_decode {
+   unsigned long rt;
+   bool sign_extend;
+};
+
+/*
+ * The in-kernel MMIO emulation code wants to use a copy of run->mmio,
+ * which is an anonymous type. Use our own type instead.
+ */
+struct kvm_exit_mmio {
+   phys_addr_t phys_addr;
+   u8  data[8];
+   u32 len;
+   boolis_write;
+};
+
+static inline void kvm_prepare_mmio(struct kvm_run *run,
+   struct kvm_exit_mmio *mmio)
+{
+   run->mmio.phys_addr = mmio->phys_addr;
+   run->mmio.len   = mmio->len;
+   run->mmio.is_write  = mmio->is_write;
+   memcpy(run->mmio.data, mmio->data, mmio->len);
+   run->exit_reason= KVM_EXIT_MMIO;
+}
+
+int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
+phys_addr_t fault_ipa);
+
+#endif /* __ARM64_KVM_MMIO_H__ */
-- 
1.8.1.4


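What kvm_prepare_mmio() does can be modelled in a few lines of user-space C: copy the address, length and direction, then only `len` bytes of payload. The struct layouts below are local stand-ins for the sketch, not the kernel's definitions:

```c
#include <assert.h>
#include <string.h>

typedef unsigned long long phys_addr_t;

/* Local stand-in for struct kvm_exit_mmio from the patch. */
struct kvm_exit_mmio {
	phys_addr_t	phys_addr;
	unsigned char	data[8];
	unsigned int	len;
	int		is_write;
};

/* Local stand-in for the anonymous run->mmio record. */
struct run_mmio {
	phys_addr_t	phys_addr;
	unsigned char	data[8];
	unsigned int	len;
	int		is_write;
};

/* Mirrors kvm_prepare_mmio(): copy addr/len/direction, then only
 * `len` bytes of the 8-byte payload buffer. */
static void prepare_mmio(struct run_mmio *run, const struct kvm_exit_mmio *mmio)
{
	run->phys_addr = mmio->phys_addr;
	run->len       = mmio->len;
	run->is_write  = mmio->is_write;
	memcpy(run->data, mmio->data, mmio->len);
}

/* Round-trip used by the sanity check: a 2-byte write at a made-up
 * guest physical address. */
static int prepare_mmio_roundtrip_ok(void)
{
	struct kvm_exit_mmio m = { 0x10003000ULL, { 0xde, 0xad }, 2, 1 };
	struct run_mmio r = { 0 };

	prepare_mmio(&r, &m);
	return r.phys_addr == 0x10003000ULL && r.len == 2 &&
	       r.is_write == 1 && r.data[0] == 0xde && r.data[1] == 0xad;
}
```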


[PATCH v3 12/32] arm64: KVM: virtual CPU reset

2013-04-08 Thread Marc Zyngier
Provide the reset code for a virtual CPU booted in 64bit mode.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/reset.c | 76 ++
 1 file changed, 76 insertions(+)
 create mode 100644 arch/arm64/kvm/reset.c

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
new file mode 100644
index 000..bc33e76
--- /dev/null
+++ b/arch/arm64/kvm/reset.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/kvm/reset.c
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * ARMv8 Reset Values
+ */
+static struct kvm_regs default_regs_reset = {
+   .regs.pstate = PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
+};
+
+int kvm_arch_dev_ioctl_check_extention(long ext)
+{
+   int r;
+
+   switch (ext) {
+   default:
+   r = 0;
+   }
+
+   return r;
+}
+
+
+/**
+ * kvm_reset_vcpu - sets core registers and sys_regs to reset value
+ * @vcpu: The VCPU pointer
+ *
+ * This function finds the right table above and sets the registers on
+ * the virtual CPU struct to their architecturally defined reset
+ * values.
+ */
+int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
+{
+   struct kvm_regs *cpu_reset;
+
+   switch (vcpu->arch.target) {
+   default:
+   cpu_reset = &default_regs_reset;
+   break;
+   }
+
+   /* Reset core registers */
+   memcpy(vcpu_gp_regs(vcpu), cpu_reset, sizeof(*cpu_reset));
+
+   /* Reset system registers */
+   kvm_reset_sys_regs(vcpu);
+
+   return 0;
+}
-- 
1.8.1.4


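The reset flow above is a template-select-then-memcpy pattern: pick the per-target register template (only one exists so far) and copy it over the vCPU's core registers. A cut-down model, with PSTATE bit values taken from the AArch64 PSR layout but best treated as illustrative here:

```c
#include <assert.h>
#include <string.h>

/* PSTATE bits as combined in default_regs_reset above. */
#define PSR_MODE_EL1h	0x5UL
#define PSR_F_BIT	0x40UL
#define PSR_I_BIT	0x80UL
#define PSR_A_BIT	0x100UL

/* Cut-down model of struct kvm_regs. */
struct regs {
	unsigned long pstate;
	unsigned long pc;
};

static const struct regs default_regs_reset = {
	.pstate = PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
};

/* Mirrors kvm_reset_vcpu(): select the template for the target,
 * memcpy it into the vCPU register file, return the resulting PSTATE. */
static unsigned long reset_vcpu_pstate(int target)
{
	const struct regs *cpu_reset;
	struct regs vcpu_regs;

	switch (target) {
	default:	/* only one template so far, as in the patch */
		cpu_reset = &default_regs_reset;
		break;
	}
	memcpy(&vcpu_regs, cpu_reset, sizeof(vcpu_regs));
	return vcpu_regs.pstate;
}
```

Every target currently resets to EL1h with the A/I/F exception bits masked; the switch leaves room for per-CPU templates later.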


[PATCH v3 18/32] arm64: KVM: Exit handling

2013-04-08 Thread Marc Zyngier
Handle the exit of a VM, decoding the exit reason from HYP mode
and calling the corresponding handler.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/handle_exit.c | 119 +++
 1 file changed, 119 insertions(+)
 create mode 100644 arch/arm64/kvm/handle_exit.c

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
new file mode 100644
index 000..c65d115
--- /dev/null
+++ b/arch/arm64/kvm/handle_exit.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/kvm/handle_exit.c:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
+
+static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   /*
+* Guest called HVC instruction:
+* Let it know we don't want that by injecting an undefined exception.
+*/
+   kvm_debug("hvc: %x (at %08lx)", kvm_vcpu_get_hsr(vcpu) & ((1 << 16) - 1),
+ *vcpu_pc(vcpu));
+   kvm_debug(" HSR: %8x", kvm_vcpu_get_hsr(vcpu));
+   kvm_inject_undefined(vcpu);
+   return 1;
+}
+
+static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   /* We don't support SMC; don't do that. */
+   kvm_debug("smc: at %08lx", *vcpu_pc(vcpu));
+   kvm_inject_undefined(vcpu);
+   return 1;
+}
+
+/**
+ * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
+ * @vcpu:  the vcpu pointer
+ *
+ * Simply call kvm_vcpu_block(), which will halt execution of
+ * world-switches and schedule other host processes until there is an
+ * incoming IRQ or FIQ to the VM.
+ */
+static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   kvm_vcpu_block(vcpu);
+   return 1;
+}
+
+static exit_handle_fn arm_exit_handlers[] = {
+   [ESR_EL2_EC_WFI]= kvm_handle_wfi,
+   [ESR_EL2_EC_HVC64]  = handle_hvc,
+   [ESR_EL2_EC_SMC64]  = handle_smc,
+   [ESR_EL2_EC_SYS64]  = kvm_handle_sys_reg,
+   [ESR_EL2_EC_IABT]   = kvm_handle_guest_abort,
+   [ESR_EL2_EC_DABT]   = kvm_handle_guest_abort,
+};
+
+static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
+{
+   u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
+
+   if (hsr_ec >= ARRAY_SIZE(arm_exit_handlers) ||
+   !arm_exit_handlers[hsr_ec]) {
+   kvm_err("Unknown exception class: hsr: %#08x\n",
+   (unsigned int)kvm_vcpu_get_hsr(vcpu));
+   BUG();
+   }
+
+   return arm_exit_handlers[hsr_ec];
+}
+
+/*
+ * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
+ * proper exit to userspace.
+ */
+int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
+  int exception_index)
+{
+   exit_handle_fn exit_handler;
+
+   switch (exception_index) {
+   case ARM_EXCEPTION_IRQ:
+   return 1;
+   case ARM_EXCEPTION_TRAP:
+   /*
+* See ARM ARM B1.14.1: "Hyp traps on instructions
+* that fail their condition code check"
+*/
+   if (!kvm_condition_valid(vcpu)) {
+   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+   return 1;
+   }
+
+   exit_handler = kvm_get_exit_handler(vcpu);
+
+   return exit_handler(vcpu, run);
+   default:
+   kvm_pr_unimpl("Unsupported exception type: %d",
+ exception_index);
+   run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+   return 0;
+   }
+}
-- 
1.8.1.4


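kvm_get_exit_handler() is a sparse, EC-indexed function-pointer table with a bounds/NULL check in front. The shape is easy to see in a user-space sketch (EC numbers invented for the sketch; the kernel BUG()s on an unknown class where this returns NULL):

```c
#include <assert.h>
#include <stddef.h>

typedef int (*exit_handle_fn)(void);

static int handle_wfi(void)   { return 1; }
static int handle_abort(void) { return 1; }

/* Invented exception-class numbers for the sketch. */
enum { EC_WFI = 1, EC_IABT = 32, EC_MAX = 64 };

/* Sparse designated-initializer table, as in arm_exit_handlers[]. */
static exit_handle_fn arm_exit_handlers[EC_MAX] = {
	[EC_WFI]  = handle_wfi,
	[EC_IABT] = handle_abort,
};

static exit_handle_fn get_exit_handler(unsigned int ec)
{
	/* the kernel BUG()s here; return NULL for the sketch */
	if (ec >= EC_MAX || !arm_exit_handlers[ec])
		return NULL;
	return arm_exit_handlers[ec];
}
```

The designated initializers leave all unlisted classes NULL, so one comparison covers both "out of range" and "no handler registered".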


[PATCH v3 16/32] arm64: KVM: hypervisor initialization code

2013-04-08 Thread Marc Zyngier
Provide EL2 with page tables and stack, and set the vectors
to point to the full blown world-switch code.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_host.h |  13 +
 arch/arm64/kvm/hyp-init.S | 112 ++
 2 files changed, 125 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp-init.S

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a0279ff..8655de4 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -183,4 +183,17 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+static inline void __cpu_init_hyp_mode(unsigned long long boot_pgd_ptr,
+  unsigned long long pgd_ptr,
+  unsigned long hyp_stack_ptr,
+  unsigned long vector_ptr)
+{
+   /*
+* Call initialization code, and switch to the full blown
+* HYP code.
+*/
+   kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
+hyp_stack_ptr, vector_ptr);
+}
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
new file mode 100644
index 000..4a9cfd2
--- /dev/null
+++ b/arch/arm64/kvm/hyp-init.S
@@ -0,0 +1,112 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+
+   .text
+   .pushsection.hyp.idmap.text, "ax"
+
+   .align  11
+
+__kvm_hyp_init:
+   .global __kvm_hyp_init
+
+ENTRY(__kvm_hyp_init_vector)
+   ventry  __invalid   // Synchronous EL2t
+   ventry  __invalid   // IRQ EL2t
+   ventry  __invalid   // FIQ EL2t
+   ventry  __invalid   // Error EL2t
+
+   ventry  __invalid   // Synchronous EL2h
+   ventry  __invalid   // IRQ EL2h
+   ventry  __invalid   // FIQ EL2h
+   ventry  __invalid   // Error EL2h
+
+   ventry  __do_hyp_init   // Synchronous 64-bit EL1
+   ventry  __invalid   // IRQ 64-bit EL1
+   ventry  __invalid   // FIQ 64-bit EL1
+   ventry  __invalid   // Error 64-bit EL1
+
+   ventry  __invalid   // Synchronous 32-bit EL1
+   ventry  __invalid   // IRQ 32-bit EL1
+   ventry  __invalid   // FIQ 32-bit EL1
+   ventry  __invalid   // Error 32-bit EL1
+ENDPROC(__kvm_hyp_init_vector)
+
+__invalid:
+   b   .
+
+   /*
+* x0: HYP boot pgd
+* x1: HYP pgd
+* x2: HYP stack
+* x3: HYP vectors
+*/
+__do_hyp_init:
+
+   msr ttbr0_el2, x0
+
+   mrs x4, tcr_el1
+   ldr x5, =TCR_EL2_MASK
+   and x4, x4, x5
+   ldr x5, =TCR_EL2_FLAGS
+   orr x4, x4, x5
+   msr tcr_el2, x4
+
+   ldr x4, =VTCR_EL2_FLAGS
+   msr vtcr_el2, x4
+
+   mrs x4, mair_el1
+   msr mair_el2, x4
+   isb
+
+   mov x4, #SCTLR_EL2_FLAGS
+   msr sctlr_el2, x4
+   isb
+
+   /* MMU is now enabled. Get ready for the trampoline dance */
+   ldr x4, =TRAMPOLINE_VA
+   adr x5, target
+   bfi x4, x5, #0, #PAGE_SHIFT
+   br  x4
+
+   nop
+
+target: /* We're now in the trampoline code, switch page tables */
+   msr ttbr0_el2, x1
+   isb
+
+   /* Invalidate the old TLBs */
+   tlbialle2
+   dsb sy
+
+   /* Set the stack and new vectors */
+   kern_hyp_va x2
+   mov sp, x2
+   kern_hyp_va x3
+   msr vbar_el2, x3
+
+   /* Hello, World! */
+   eret
+
+   .ltorg
+
+   .popsection
-- 
1.8.1.4


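The trampoline dance pivots on `bfi x4, x5, #0, #PAGE_SHIFT`: keep the page-aligned bits of TRAMPOLINE_VA and splice in the in-page offset of the `target` label, so execution continues at the same code through the new mapping. The bitfield insert is easy to model in C (addresses below are made up):

```c
#include <assert.h>
#include <stdint.h>

/* C model of `bfi dst, src, #0, #width`: replace the low `width`
 * bits of dst with the low `width` bits of src. */
static uint64_t bfi_low(uint64_t dst, uint64_t src, unsigned int width)
{
	uint64_t mask = (1ULL << width) - 1;

	return (dst & ~mask) | (src & mask);
}
```

With a 4kB PAGE_SHIFT of 12: a trampoline VA of 0xffff0000 combined with a label at physical 0x01234abc lands at 0xffff0abc, i.e. the same page offset inside the trampoline mapping.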


[PATCH v3 17/32] arm64: KVM: HYP mode world switch implementation

2013-04-08 Thread Marc Zyngier
The HYP mode world switch in all its glory.

Implements save/restore of host/guest registers, EL2 trapping,
IPA resolution, and additional services (tlb invalidation).

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kernel/asm-offsets.c |  34 +++
 arch/arm64/kvm/hyp.S| 602 
 2 files changed, 636 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp.S

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a2a4d81..666e231 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -104,5 +105,38 @@ int main(void)
   BLANK();
   DEFINE(TZ_MINWEST,   offsetof(struct timezone, tz_minuteswest));
   DEFINE(TZ_DSTTIME,   offsetof(struct timezone, tz_dsttime));
+  BLANK();
+#ifdef CONFIG_KVM_ARM_HOST
+  DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt));
+  DEFINE(CPU_GP_REGS,  offsetof(struct kvm_cpu_context, gp_regs));
+  DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_regs, regs));
+  DEFINE(CPU_FP_REGS,  offsetof(struct kvm_regs, fp_regs));
+  DEFINE(CPU_SP_EL1,   offsetof(struct kvm_regs, sp_el1));
+  DEFINE(CPU_ELR_EL1,  offsetof(struct kvm_regs, elr_el1));
+  DEFINE(CPU_SPSR, offsetof(struct kvm_regs, spsr));
+  DEFINE(CPU_SYSREGS,  offsetof(struct kvm_cpu_context, sys_regs));
+  DEFINE(VCPU_ESR_EL2, offsetof(struct kvm_vcpu, arch.fault.esr_el2));
+  DEFINE(VCPU_FAR_EL2, offsetof(struct kvm_vcpu, arch.fault.far_el2));
+  DEFINE(VCPU_HPFAR_EL2,   offsetof(struct kvm_vcpu, arch.fault.hpfar_el2));
+  DEFINE(VCPU_HCR_EL2, offsetof(struct kvm_vcpu, arch.hcr_el2));
+  DEFINE(VCPU_IRQ_LINES,   offsetof(struct kvm_vcpu, arch.irq_lines));
+  DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, arch.host_cpu_context));
+  DEFINE(VCPU_TIMER_CNTV_CTL,  offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
+  DEFINE(VCPU_TIMER_CNTV_CVAL, offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_cval));
+  DEFINE(KVM_TIMER_CNTVOFF,offsetof(struct kvm, arch.timer.cntvoff));
+  DEFINE(KVM_TIMER_ENABLED,offsetof(struct kvm, arch.timer.enabled));
+  DEFINE(VCPU_KVM, offsetof(struct kvm_vcpu, kvm));
+  DEFINE(VCPU_VGIC_CPU,offsetof(struct kvm_vcpu, arch.vgic_cpu));
+  DEFINE(VGIC_CPU_HCR, offsetof(struct vgic_cpu, vgic_hcr));
+  DEFINE(VGIC_CPU_VMCR,offsetof(struct vgic_cpu, vgic_vmcr));
+  DEFINE(VGIC_CPU_MISR,offsetof(struct vgic_cpu, vgic_misr));
+  DEFINE(VGIC_CPU_EISR,offsetof(struct vgic_cpu, vgic_eisr));
+  DEFINE(VGIC_CPU_ELRSR,   offsetof(struct vgic_cpu, vgic_elrsr));
+  DEFINE(VGIC_CPU_APR, offsetof(struct vgic_cpu, vgic_apr));
+  DEFINE(VGIC_CPU_LR,  offsetof(struct vgic_cpu, vgic_lr));
+  DEFINE(VGIC_CPU_NR_LR,   offsetof(struct vgic_cpu, nr_lr));
+  DEFINE(KVM_VTTBR,offsetof(struct kvm, arch.vttbr));
+  DEFINE(KVM_VGIC_VCTRL,   offsetof(struct kvm, arch.vgic.vctrl_base));
+#endif
   return 0;
 }
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
new file mode 100644
index 000..c745d20
--- /dev/null
+++ b/arch/arm64/kvm/hyp.S
@@ -0,0 +1,602 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define CPU_GP_REG_OFFSET(x)   (CPU_GP_REGS + x)
+#define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
+#define CPU_SPSR_OFFSET(x) CPU_GP_REG_OFFSET(CPU_SPSR + 8*x)
+#define CPU_SYSREG_OFFSET(x)   (CPU_SYSREGS + 8*x)
+
+   .text
+   .pushsection.hyp.text, "ax"
+   .align  PAGE_SHIFT
+
+__kvm_hyp_code_start:
+   .globl __kvm_hyp_code_start
+
+.macro save_common_regs
+   // x2: base address for cpu context
+   // x3: tmp register
+
+   add x3, x2, #CPU_XREG_OFFSET(19)
+   stp x19, x20, [x3]
+   stp x21, x22, [x3, #16]
+   stp x23, x24, [x3, #32]
+   stp x25, x26, [x3, #48]
+   stp x27, x28, [x3, #64]
+   stp x29, lr, [x3, #80]
+
+   mrs x19, sp_el0
+   mrs x20, elr_el2// EL1 PC
+   mrs  

[PATCH v3 10/32] arm64: KVM: system register handling

2013-04-08 Thread Marc Zyngier
Provide 64bit system register handling, modeled after the cp15
handling for ARM.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_coproc.h |  51 +++
 arch/arm64/include/uapi/asm/kvm.h   |  29 ++
 arch/arm64/kvm/sys_regs.c   | 871 
 arch/arm64/kvm/sys_regs.h   | 138 ++
 include/uapi/linux/kvm.h|   1 +
 5 files changed, 1090 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_coproc.h
 create mode 100644 arch/arm64/kvm/sys_regs.c
 create mode 100644 arch/arm64/kvm/sys_regs.h

diff --git a/arch/arm64/include/asm/kvm_coproc.h b/arch/arm64/include/asm/kvm_coproc.h
new file mode 100644
index 000..9b4477a
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_coproc.h
@@ -0,0 +1,51 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/include/asm/kvm_coproc.h
+ * Copyright (C) 2012 Rusty Russell IBM Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_COPROC_H__
+#define __ARM64_KVM_COPROC_H__
+
+#include 
+
+void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
+
+struct kvm_sys_reg_table {
+   const struct sys_reg_desc *table;
+   size_t num;
+};
+
+struct kvm_sys_reg_target_table {
+   struct kvm_sys_reg_table table64;
+};
+
+void kvm_register_target_sys_reg_table(unsigned int target,
+  struct kvm_sys_reg_target_table *table);
+
+int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run);
+
+#define kvm_coproc_table_init kvm_sys_reg_table_init
+void kvm_sys_reg_table_init(void);
+
+struct kvm_one_reg;
+int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
+int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
+int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
+unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_COPROC_H__ */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 4e64570..ebac919 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -92,6 +92,35 @@ struct kvm_sync_regs {
 struct kvm_arch_memory_slot {
 };
 
+/* If you need to interpret the index values, here is the key: */
+#define KVM_REG_ARM_COPROC_MASK0x0FFF
+#define KVM_REG_ARM_COPROC_SHIFT   16
+
+/* Normal registers are mapped as coprocessor 16. */
+#define KVM_REG_ARM_CORE   (0x0010 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM_CORE_REG(name) (offsetof(struct kvm_regs, name) / sizeof(__u32))
+
+/* Some registers need more space to represent values. */
+#define KVM_REG_ARM_DEMUX  (0x0011 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM_DEMUX_ID_MASK  0xFF00
+#define KVM_REG_ARM_DEMUX_ID_SHIFT 8
+#define KVM_REG_ARM_DEMUX_ID_CCSIDR(0x00 << KVM_REG_ARM_DEMUX_ID_SHIFT)
+#define KVM_REG_ARM_DEMUX_VAL_MASK 0x00FF
+#define KVM_REG_ARM_DEMUX_VAL_SHIFT0
+
+/* AArch64 system registers */
+#define KVM_REG_ARM64_SYSREG   (0x0013 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM64_SYSREG_OP0_MASK  0xc000
+#define KVM_REG_ARM64_SYSREG_OP0_SHIFT 14
+#define KVM_REG_ARM64_SYSREG_OP1_MASK  0x3800
+#define KVM_REG_ARM64_SYSREG_OP1_SHIFT 11
+#define KVM_REG_ARM64_SYSREG_CRN_MASK  0x0780
+#define KVM_REG_ARM64_SYSREG_CRN_SHIFT 7
+#define KVM_REG_ARM64_SYSREG_CRM_MASK  0x0078
+#define KVM_REG_ARM64_SYSREG_CRM_SHIFT 3
+#define KVM_REG_ARM64_SYSREG_OP2_MASK  0x0007
+#define KVM_REG_ARM64_SYSREG_OP2_SHIFT 0
+
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
 #define KVM_ARM_IRQ_TYPE_MASK  0xff
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
new file mode 100644
index 000..9df3b32
--- /dev/null
+++ b/arch/arm64/kvm/sys_regs.c
@@ -0,0 +1,871 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/kvm/coproc.c:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Authors: Rusty Russell 
+ *  Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the 
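The KVM_REG_ARM64_SYSREG index encoding introduced in the uapi hunk above packs the Op0/Op1/CRn/CRm/Op2 tuple into the low bits of the register ID. A sketch of the packing, using the shift values from those defines (index bits only; the KVM_REG_ARM64_SYSREG cohort bits would be OR'd in on top):

```c
#include <assert.h>
#include <stdint.h>

/* Shifts from the KVM_REG_ARM64_SYSREG_* defines:
 * Op0 at bit 14, Op1 at 11, CRn at 7, CRm at 3, Op2 at 0. */
static uint32_t sysreg_index(unsigned int op0, unsigned int op1,
			     unsigned int crn, unsigned int crm,
			     unsigned int op2)
{
	return (op0 << 14) | (op1 << 11) | (crn << 7) | (crm << 3) | op2;
}
```

For example SCTLR_EL1 (Op0=3, Op1=0, CRn=1, CRm=0, Op2=0) packs to 0xc080.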

[PATCH v3 21/32] arm64: KVM: PSCI implementation

2013-04-08 Thread Marc Zyngier
Wire the PSCI backend into the exit handling code.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_host.h |  2 +-
 arch/arm64/include/asm/kvm_psci.h | 23 +++
 arch/arm64/include/uapi/asm/kvm.h | 16 
 arch/arm64/kvm/handle_exit.c  | 16 +++-
 4 files changed, 47 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_psci.h

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 8655de4..fdcb324 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -34,7 +34,7 @@
 #include 
 #include 
 
-#define KVM_VCPU_MAX_FEATURES 0
+#define KVM_VCPU_MAX_FEATURES 1
 
 /* We don't currently support large pages. */
 #define KVM_HPAGE_GFN_SHIFT(x) 0
diff --git a/arch/arm64/include/asm/kvm_psci.h b/arch/arm64/include/asm/kvm_psci.h
new file mode 100644
index 000..e301a48
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_psci.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_PSCI_H__
+#define __ARM64_KVM_PSCI_H__
+
+bool kvm_psci_call(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_PSCI_H__ */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index ebac919..fb60f90 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -69,6 +69,8 @@ struct kvm_regs {
 #define KVM_VGIC_V2_DIST_SIZE  0x1000
 #define KVM_VGIC_V2_CPU_SIZE   0x2000
 
+#define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
+
 struct kvm_vcpu_init {
__u32 target;
__u32 features[7];
@@ -141,6 +143,20 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX127
 
+/* PSCI interface */
+#define KVM_PSCI_FN_BASE   0x95c1ba5e
+#define KVM_PSCI_FN(n) (KVM_PSCI_FN_BASE + (n))
+
+#define KVM_PSCI_FN_CPU_SUSPENDKVM_PSCI_FN(0)
+#define KVM_PSCI_FN_CPU_OFFKVM_PSCI_FN(1)
+#define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
+#define KVM_PSCI_FN_MIGRATEKVM_PSCI_FN(3)
+
+#define KVM_PSCI_RET_SUCCESS   0
+#define KVM_PSCI_RET_NI((unsigned long)-1)
+#define KVM_PSCI_RET_INVAL ((unsigned long)-2)
+#define KVM_PSCI_RET_DENIED((unsigned long)-3)
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index c65d115..4766b7f 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -24,26 +24,24 @@
 #include 
 #include 
 #include 
+#include 
 
 typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
 
 static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-   /*
-* Guest called HVC instruction:
-* Let it know we don't want that by injecting an undefined exception.
-*/
-   kvm_debug("hvc: %x (at %08lx)", kvm_vcpu_get_hsr(vcpu) & ((1 << 16) - 1),
- *vcpu_pc(vcpu));
-   kvm_debug(" HSR: %8x", kvm_vcpu_get_hsr(vcpu));
+   if (kvm_psci_call(vcpu))
+   return 1;
+
kvm_inject_undefined(vcpu);
return 1;
 }
 
 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-   /* We don't support SMC; don't do that. */
-   kvm_debug("smc: at %08lx", *vcpu_pc(vcpu));
+   if (kvm_psci_call(vcpu))
+   return 1;
+
kvm_inject_undefined(vcpu);
return 1;
 }
-- 
1.8.1.4


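The PSCI function numbering from the uapi hunk above is simply a magic base value plus a small per-call index; the guest places one of these in x0 before issuing HVC/SMC, and the handler routes on it. A sketch (the range check is illustrative, not the kernel's dispatch):

```c
#include <assert.h>

/* The KVM PSCI function IDs from the uapi header. */
#define KVM_PSCI_FN_BASE	0x95c1ba5eUL
#define KVM_PSCI_FN(n)		(KVM_PSCI_FN_BASE + (n))

#define KVM_PSCI_FN_CPU_SUSPEND	KVM_PSCI_FN(0)
#define KVM_PSCI_FN_CPU_OFF	KVM_PSCI_FN(1)
#define KVM_PSCI_FN_CPU_ON	KVM_PSCI_FN(2)
#define KVM_PSCI_FN_MIGRATE	KVM_PSCI_FN(3)

/* Illustrative check: does x0 hold one of the four PSCI function IDs?
 * (kvm_psci_call() itself switches on the exact values.) */
static int is_psci_fn(unsigned long x0)
{
	return x0 >= KVM_PSCI_FN_CPU_SUSPEND && x0 <= KVM_PSCI_FN_MIGRATE;
}
```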


[PATCH v3 15/32] arm64: KVM: guest one-reg interface

2013-04-08 Thread Marc Zyngier
Let userspace play with the guest registers.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/guest.c | 254 +
 1 file changed, 254 insertions(+)
 create mode 100644 arch/arm64/kvm/guest.c

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
new file mode 100644
index 000..47d3729
--- /dev/null
+++ b/arch/arm64/kvm/guest.c
@@ -0,0 +1,254 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/kvm/guest.c:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct kvm_stats_debugfs_item debugfs_entries[] = {
+   { NULL }
+};
+
+int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
+{
+   vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
+   return 0;
+}
+
+static u64 core_reg_offset_from_id(u64 id)
+{
+   return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
+}
+
+static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+   __u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
+   struct kvm_regs *regs = vcpu_gp_regs(vcpu);
+   int nr_regs = sizeof(*regs) / sizeof(__u32);
+   u32 off;
+
+   /* Our ID is an index into the kvm_regs struct. */
+   off = core_reg_offset_from_id(reg->id);
+   if (off >= nr_regs ||
+   (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
+   return -ENOENT;
+
+   if (copy_to_user(uaddr, ((u32 *)regs) + off, KVM_REG_SIZE(reg->id)))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+   __u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
+   struct kvm_regs *regs = vcpu_gp_regs(vcpu);
+   int nr_regs = sizeof(*regs) / sizeof(__u32);
+   void *valp;
+   u64 off;
+   int err = 0;
+
+   /* Our ID is an index into the kvm_regs struct. */
+   off = core_reg_offset_from_id(reg->id);
+   if (off >= nr_regs ||
+   (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
+   return -ENOENT;
+
+   valp = kmalloc(KVM_REG_SIZE(reg->id), GFP_KERNEL);
+   if (!valp)
+   return -ENOMEM;
+
+   if (copy_from_user(valp, uaddr, KVM_REG_SIZE(reg->id))) {
+   err = -EFAULT;
+   goto out;
+   }
+
+   if (off == KVM_REG_ARM_CORE_REG(regs.pstate)) {
+   unsigned long mode = (*(unsigned long *)valp) & COMPAT_PSR_MODE_MASK;
+   switch (mode) {
+   case PSR_MODE_EL0t:
+   case PSR_MODE_EL1t:
+   case PSR_MODE_EL1h:
+   break;
+   default:
+   err = -EINVAL;
+   goto out;
+   }
+   }
+
+   memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
+out:
+   kfree(valp);
+   return err;
+}
+
+int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+   return -EINVAL;
+}
+
+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
+   return -EINVAL;
+}
+
+static unsigned long num_core_regs(void)
+{
+   return sizeof(struct kvm_regs) / sizeof(unsigned long);
+}
+
+/**
+ * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
+ *
+ * This is for all registers.
+ */
+unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
+{
+   return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu);
+}
+
+/**
+ * kvm_arm_copy_reg_indices - get indices of all registers.
+ *
+ * We do core registers right here, then we append system regs.
+ */
+int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
+{
+   unsigned int i;
+   const u64 core_reg = KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE;
+
+   for (i = 0; i < sizeof(struct kvm_regs)/sizeof(unsigned long); i++) {
+   if (put_user(core_reg | i, uindices))
+   return -EFAULT;
+   uindices++;
+   }
+
+   return kvm_arm_copy_sys_reg_indices(vcpu, uindices);
+}
+
+int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_r

[PATCH v3 23/32] arm64: KVM: define 32bit specific registers

2013-04-08 Thread Marc Zyngier
Define the 32bit specific registers (SPSRs, cp15...).

Most CPU registers are directly mapped to a 64bit register
(r0->x0...). Only the SPSRs have separate registers.

cp15 registers are also mapped into their 64bit counterpart in most
cases.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_asm.h  | 38 +-
 arch/arm64/include/asm/kvm_host.h |  5 -
 arch/arm64/include/uapi/asm/kvm.h |  7 ++-
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 591ac21..c92de41 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -42,7 +42,43 @@
 #define TPIDR_EL1   18  /* Thread ID, Privileged */
 #define AMAIR_EL1   19  /* Aux Memory Attribute Indirection Register */
 #define CNTKCTL_EL1 20  /* Timer Control Register (EL1) */
-#define NR_SYS_REGS 21
+/* 32bit specific registers. Keep them at the end of the range */
+#define DACR32_EL2  21  /* Domain Access Control Register */
+#define IFSR32_EL2  22  /* Instruction Fault Status Register */
+#define FPEXC32_EL2 23  /* Floating-Point Exception Control Register */
+#define DBGVCR32_EL2    24  /* Debug Vector Catch Register */
+#define TEECR32_EL1 25  /* ThumbEE Configuration Register */
+#define TEEHBR32_EL1    26  /* ThumbEE Handler Base Register */
+#define NR_SYS_REGS 27
+
+/* 32bit mapping */
+#define c0_MPIDR   (MPIDR_EL1 * 2) /* MultiProcessor ID Register */
+#define c0_CSSELR  (CSSELR_EL1 * 2)/* Cache Size Selection Register */
+#define c1_SCTLR   (SCTLR_EL1 * 2) /* System Control Register */
+#define c1_ACTLR   (ACTLR_EL1 * 2) /* Auxiliary Control Register */
+#define c1_CPACR   (CPACR_EL1 * 2) /* Coprocessor Access Control */
+#define c2_TTBR0   (TTBR0_EL1 * 2) /* Translation Table Base Register 0 */
+#define c2_TTBR0_high  (c2_TTBR0 + 1)  /* TTBR0 top 32 bits */
+#define c2_TTBR1   (TTBR1_EL1 * 2) /* Translation Table Base Register 1 */
+#define c2_TTBR1_high  (c2_TTBR1 + 1)  /* TTBR1 top 32 bits */
+#define c2_TTBCR   (TCR_EL1 * 2)   /* Translation Table Base Control R. */
+#define c3_DACR     (DACR32_EL2 * 2)    /* Domain Access Control Register */
+#define c5_DFSR     (ESR_EL1 * 2)   /* Data Fault Status Register */
+#define c5_IFSR     (IFSR32_EL2 * 2)    /* Instruction Fault Status Register */
+#define c5_ADFSR   (AFSR0_EL1 * 2) /* Auxiliary Data Fault Status R */
+#define c5_AIFSR   (AFSR1_EL1 * 2) /* Auxiliary Instr Fault Status R */
+#define c6_DFAR     (FAR_EL1 * 2)   /* Data Fault Address Register */
+#define c6_IFAR     (c6_DFAR + 1)   /* Instruction Fault Address Register */
+#define c10_PRRR   (MAIR_EL1 * 2)  /* Primary Region Remap Register */
+#define c10_NMRR   (c10_PRRR + 1)  /* Normal Memory Remap Register */
+#define c12_VBAR   (VBAR_EL1 * 2)  /* Vector Base Address Register */
+#define c13_CID     (CONTEXTIDR_EL1 * 2)    /* Context ID Register */
+#define c13_TID_URW    (TPIDR_EL0 * 2) /* Thread ID, User R/W */
+#define c13_TID_URO    (TPIDRRO_EL0 * 2)/* Thread ID, User R/O */
+#define c13_TID_PRIV   (TPIDR_EL1 * 2) /* Thread ID, Privileged */
+#define c10_AMAIR  (AMAIR_EL1 * 2) /* Aux Memory Attr Indirection Reg */
+#define c14_CNTKCTL    (CNTKCTL_EL1 * 2) /* Timer Control Register (PL1) */
+#define NR_CP15_REGS   (NR_SYS_REGS * 2)
 
 #define ARM_EXCEPTION_IRQ0
 #define ARM_EXCEPTION_TRAP   1
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index fdcb324..d44064d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -84,7 +84,10 @@ struct kvm_vcpu_fault_info {
 
 struct kvm_cpu_context {
struct kvm_regs gp_regs;
-   u64 sys_regs[NR_SYS_REGS];
+   union {
+   u64 sys_regs[NR_SYS_REGS];
+   u32 cp15[NR_CP15_REGS];
+   };
 };
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index fb60f90..5b1110c 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -23,7 +23,12 @@
 #define __ARM_KVM_H__
 
 #define KVM_SPSR_EL1   0
-#define KVM_NR_SPSR1
+#define KVM_SPSR_SVC   KVM_SPSR_EL1
+#define KVM_SPSR_ABT   1
+#define KVM_SPSR_UND   2
+#define KVM_SPSR_IRQ   3
+#define KVM_SPSR_FIQ   4
+#define KVM_NR_SPSR5
 
 #ifndef __ASSEMBLY__
 #include 
-- 
1.8.1.4


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 22/32] arm64: KVM: Build system integration

2013-04-08 Thread Marc Zyngier
Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/Kconfig  |  2 ++
 arch/arm64/Makefile |  2 +-
 arch/arm64/kvm/Kconfig  | 59 +
 arch/arm64/kvm/Makefile | 19 
 4 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/Kconfig
 create mode 100644 arch/arm64/kvm/Makefile

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 43b0e9f..d984a46 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -240,6 +240,8 @@ source "drivers/Kconfig"
 
 source "fs/Kconfig"
 
+source "arch/arm64/kvm/Kconfig"
+
 source "arch/arm64/Kconfig.debug"
 
 source "security/Kconfig"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index c95c5cb..ae89e63 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -36,7 +36,7 @@ TEXT_OFFSET := 0x0008
 
 export TEXT_OFFSET GZFLAGS
 
-core-y += arch/arm64/kernel/ arch/arm64/mm/
+core-y += arch/arm64/kernel/ arch/arm64/mm/ arch/arm64/kvm/
 libs-y := arch/arm64/lib/ $(libs-y)
 libs-y += $(LIBGCC)
 
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
new file mode 100644
index 000..a76be8b
--- /dev/null
+++ b/arch/arm64/kvm/Kconfig
@@ -0,0 +1,59 @@
+#
+# KVM configuration
+#
+
+source "virt/kvm/Kconfig"
+
+menuconfig VIRTUALIZATION
+   bool "Virtualization"
+   ---help---
+ Say Y here to get to see options for using your Linux host to run
+ other operating systems inside virtual machines (guests).
+ This option alone does not add any kernel code.
+
+ If you say N, all options in this submenu will be skipped and
+ disabled.
+
+if VIRTUALIZATION
+
+config KVM
+   bool "Kernel-based Virtual Machine (KVM) support"
+   select PREEMPT_NOTIFIERS
+   select ANON_INODES
+   select KVM_MMIO
+   select KVM_ARM_HOST
+   select KVM_ARM_VGIC
+   select KVM_ARM_TIMER
+   ---help---
+ Support hosting virtualized guest machines.
+
+ This module provides access to the hardware capabilities through
+ a character device node named /dev/kvm.
+
+ If unsure, say N.
+
+config KVM_ARM_HOST
+   bool
+   depends on KVM
+   depends on MMU
+   select  MMU_NOTIFIER
+   ---help---
+ Provides host support for ARM processors.
+
+config KVM_ARM_VGIC
+bool
+   depends on KVM_ARM_HOST && OF
+   select HAVE_KVM_IRQCHIP
+   ---help---
+ Adds support for a hardware assisted, in-kernel GIC emulation.
+
+config KVM_ARM_TIMER
+bool
+   depends on KVM_ARM_VGIC
+   select HAVE_KVM_IRQCHIP
+   ---help---
+ Adds support for the Architected Timers in virtual machines
+
+source drivers/virtio/Kconfig
+
+endif # VIRTUALIZATION
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
new file mode 100644
index 000..be9eb3833
--- /dev/null
+++ b/arch/arm64/kvm/Makefile
@@ -0,0 +1,19 @@
+#
+# Makefile for Kernel-based Virtual Machine module
+#
+
+ccflags-y += -Ivirt/kvm -Iarch/arm64/kvm
+CFLAGS_arm.o := -I.
+CFLAGS_mmu.o := -I.
+
+obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
+
kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../arch/arm/kvm/, arm.o mmu.o mmio.o psci.o perf.o)
+
+kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o
+kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
+kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
+
+kvm-$(CONFIG_KVM_ARM_VGIC) += $(addprefix ../../../arch/arm/kvm/, vgic.o)
kvm-$(CONFIG_KVM_ARM_TIMER) += $(addprefix ../../../arch/arm/kvm/, arch_timer.o)
-- 
1.8.1.4




[PATCH v3 11/32] arm64: KVM: CPU specific system registers handling

2013-04-08 Thread Marc Zyngier
Add the support code for CPU specific system registers. Not much
here yet.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/sys_regs_generic_v8.c | 85 
 1 file changed, 85 insertions(+)
 create mode 100644 arch/arm64/kvm/sys_regs_generic_v8.c

diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c b/arch/arm64/kvm/sys_regs_generic_v8.c
new file mode 100644
index 000..d4e8039
--- /dev/null
+++ b/arch/arm64/kvm/sys_regs_generic_v8.c
@@ -0,0 +1,85 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Based on arch/arm/kvm/coproc_a15.c:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Authors: Rusty Russell 
+ *  Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sys_regs.h"
+
+static bool access_actlr(struct kvm_vcpu *vcpu,
+const struct sys_reg_params *p,
+const struct sys_reg_desc *r)
+{
+   if (p->is_write)
+   return ignore_write(vcpu, p);
+
+   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, ACTLR_EL1);
+   return true;
+}
+
+static void reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
+{
+   u64 actlr;
+
+   asm volatile("mrs %0, actlr_el1\n" : "=r" (actlr));
+   vcpu_sys_reg(vcpu, ACTLR_EL1) = actlr;
+}
+
+/*
+ * Implementation specific sys-reg registers.
+ * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
+ */
+static const struct sys_reg_desc genericv8_sys_regs[] = {
+   /* ACTLR_EL1 */
+   { Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b001),
+ access_actlr, reset_actlr, ACTLR_EL1 },
+};
+
+static struct kvm_sys_reg_target_table genericv8_target_table = {
+   .table64 = {
+   .table = genericv8_sys_regs,
+   .num = ARRAY_SIZE(genericv8_sys_regs),
+   },
+};
+
+static int __init sys_reg_genericv8_init(void)
+{
+   unsigned int i;
+
+   for (i = 1; i < ARRAY_SIZE(genericv8_sys_regs); i++)
+   BUG_ON(cmp_sys_reg(&genericv8_sys_regs[i-1],
+  &genericv8_sys_regs[i]) >= 0);
+
+   kvm_register_target_sys_reg_table(KVM_ARM_TARGET_AEM_V8,
+ &genericv8_target_table);
+   kvm_register_target_sys_reg_table(KVM_ARM_TARGET_FOUNDATION_V8,
+ &genericv8_target_table);
+   kvm_register_target_sys_reg_table(KVM_ARM_TARGET_CORTEX_A57,
+ &genericv8_target_table);
+   return 0;
+}
+late_initcall(sys_reg_genericv8_init);
-- 
1.8.1.4




[PATCH v3 20/32] arm64: KVM: Plug the arch timer

2013-04-08 Thread Marc Zyngier
Add support for the in-kernel timer emulation. The include file
is a complete duplicate of the 32bit one - something to fix
at one point.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/arch_timer.c   |  1 +
 arch/arm64/include/asm/kvm_arch_timer.h | 58 +
 arch/arm64/kvm/hyp.S| 56 +++
 3 files changed, 115 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_arch_timer.h

diff --git a/arch/arm/kvm/arch_timer.c b/arch/arm/kvm/arch_timer.c
index c55b608..49a7516 100644
--- a/arch/arm/kvm/arch_timer.c
+++ b/arch/arm/kvm/arch_timer.c
@@ -195,6 +195,7 @@ static struct notifier_block kvm_timer_cpu_nb = {
 
 static const struct of_device_id arch_timer_of_match[] = {
{ .compatible   = "arm,armv7-timer",},
+   { .compatible   = "arm,armv8-timer",},
{},
 };
 
diff --git a/arch/arm64/include/asm/kvm_arch_timer.h b/arch/arm64/include/asm/kvm_arch_timer.h
new file mode 100644
index 000..eb02273
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_arch_timer.h
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_ARCH_TIMER_H
+#define __ARM64_KVM_ARCH_TIMER_H
+
+#include 
+#include 
+#include 
+
+struct arch_timer_kvm {
+   /* Is the timer enabled */
+   boolenabled;
+
+   /* Virtual offset, restored only */
+   cycle_t cntvoff;
+};
+
+struct arch_timer_cpu {
+   /* Background timer used when the guest is not running */
+   struct hrtimer  timer;
+
+   /* Work queued with the above timer expires */
+   struct work_struct  expired;
+
+   /* Background timer active */
+   boolarmed;
+
+   /* Timer IRQ */
+   const struct kvm_irq_level  *irq;
+
+   /* Registers: control register, timer value */
+   u32 cntv_ctl;   /* Saved/restored */
+   cycle_t cntv_cval;  /* Saved/restored */
+};
+
+int kvm_timer_hyp_init(void);
+int kvm_timer_init(struct kvm *kvm);
+void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
+void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
+void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
+void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
+
+#endif
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index cc3192e..25da0b5 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -390,6 +390,60 @@ __kvm_hyp_code_start:
 2:
 .endm
 
+.macro save_timer_state
+   // x0: vcpu pointer
+   ldr x2, [x0, #VCPU_KVM]
+   kern_hyp_va x2
+   ldr w3, [x2, #KVM_TIMER_ENABLED]
+   cbz w3, 1f
+
+   mrs x3, cntv_ctl_el0
+   and x3, x3, #3
+   str w3, [x0, #VCPU_TIMER_CNTV_CTL]
+   bic x3, x3, #1  // Clear Enable
+   msr cntv_ctl_el0, x3
+
+   isb
+
+   mrs x3, cntv_cval_el0
+   str x3, [x0, #VCPU_TIMER_CNTV_CVAL]
+
+1:
+   // Allow physical timer/counter access for the host
+   mrs x2, cnthctl_el2
+   orr x2, x2, #3
+   msr cnthctl_el2, x2
+
+   // Clear cntvoff for the host
+   msr cntvoff_el2, xzr
+.endm
+
+.macro restore_timer_state
+   // x0: vcpu pointer
+   // Disallow physical timer access for the guest
+   // Physical counter access is allowed
+   mrs x2, cnthctl_el2
+   orr x2, x2, #1
+   bic x2, x2, #2
+   msr cnthctl_el2, x2
+
+   ldr x2, [x0, #VCPU_KVM]
+   kern_hyp_va x2
+   ldr w3, [x2, #KVM_TIMER_ENABLED]
+   cbz w3, 1f
+
+   ldr x3, [x2, #KVM_TIMER_CNTVOFF]
+   msr cntvoff_el2, x3
+   ldr x2, [x0, #VCPU_TIMER_CNTV_CVAL]
+   msr cntv_cval_el0, x2
+   isb
+
+   ldr w2, [x0, #VCPU_TIMER_CNTV_CTL]
+   and x2, x2, #3
+   msr cntv_ctl_el0, x2
+1:
+.endm
+
 __save_sysregs:
save_sysregs
ret
@@ -433,6 +487,7 @@ ENTRY(__kvm_vcpu_run)
activate_vm
 
restore_vgic_state
+   restore_timer_state
 
// Guest context
add x2, x0, #VCPU_CONTEXT
@@ -456,6 +511,7 @@ __kvm_vcpu_return:
bl __save_fpsimd
bl __save_sysregs
 
+   save_timer_state
save_vgic_st

[PATCH v3 07/32] arm64: KVM: fault injection into a guest

2013-04-08 Thread Marc Zyngier
Implement the injection of a fault (undefined, data abort or
prefetch abort) into a 64bit guest.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/inject_fault.c | 118 ++
 1 file changed, 118 insertions(+)
 create mode 100644 arch/arm64/kvm/inject_fault.c

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
new file mode 100644
index 000..2ff3b78
--- /dev/null
+++ b/arch/arm64/kvm/inject_fault.c
@@ -0,0 +1,118 @@
+/*
+ * Fault injection for 64bit guests.
+ *
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Based on arch/arm/kvm/emulate.c
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+
+static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
+{
+   unsigned long cpsr = *vcpu_cpsr(vcpu);
+   int is_aarch32;
+   u32 esr = 0;
+
+   is_aarch32 = vcpu_mode_is_32bit(vcpu);
+
+   *vcpu_spsr(vcpu) = cpsr;
+   *vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);
+
+   *vcpu_cpsr(vcpu) = PSR_MODE_EL1h | PSR_A_BIT | PSR_F_BIT | PSR_I_BIT;
+   *vcpu_pc(vcpu) = vcpu_sys_reg(vcpu, VBAR_EL1) + 0x200;
+
+   vcpu_sys_reg(vcpu, FAR_EL1) = addr;
+
+   /*
+* Build an {i,d}abort, depending on the level and the
+* instruction set. Report an external synchronous abort.
+*/
+   if (kvm_vcpu_trap_il_is32bit(vcpu))
+   esr |= ESR_EL1_IL;
+
+   if (is_aarch32 || (cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t)
+   esr |= (ESR_EL1_EC_IABT_EL0 << ESR_EL1_EC_SHIFT);
+   else
+   esr |= (ESR_EL1_EC_IABT_EL1 << ESR_EL1_EC_SHIFT);
+
+   if (!is_iabt)
+   esr |= ESR_EL1_EC_DABT_EL0;
+
+   vcpu_sys_reg(vcpu, ESR_EL1) = esr | 0x10; /* External abort */
+}
+
+static void inject_undef64(struct kvm_vcpu *vcpu)
+{
+   unsigned long cpsr = *vcpu_cpsr(vcpu);
+   u32 esr = (ESR_EL1_EC_UNKNOWN << ESR_EL1_EC_SHIFT);
+
+   *vcpu_spsr(vcpu) = cpsr;
+   *vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);
+
+   *vcpu_cpsr(vcpu) = PSR_MODE_EL1h | PSR_F_BIT | PSR_I_BIT;
+   *vcpu_pc(vcpu) = vcpu_sys_reg(vcpu, VBAR_EL1) + 0x200;
+
+   /*
+* Build an unknown exception, depending on the instruction
+* set.
+*/
+   if (kvm_vcpu_trap_il_is32bit(vcpu))
+   esr |= ESR_EL1_IL;
+
+   vcpu_sys_reg(vcpu, ESR_EL1) = esr;
+}
+
+/**
+ * kvm_inject_dabt - inject a data abort into the guest
+ * @vcpu: The VCPU to receive the undefined exception
+ * @addr: The address to report in the DFAR
+ *
+ * It is assumed that this code is called from the VCPU thread and that the
+ * VCPU therefore is not currently executing guest code.
+ */
+void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
+{
+   inject_abt64(vcpu, false, addr);
+}
+
+/**
+ * kvm_inject_pabt - inject a prefetch abort into the guest
+ * @vcpu: The VCPU to receive the undefined exception
+ * @addr: The address to report in the DFAR
+ *
+ * It is assumed that this code is called from the VCPU thread and that the
+ * VCPU therefore is not currently executing guest code.
+ */
+void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
+{
+   inject_abt64(vcpu, true, addr);
+}
+
+/**
+ * kvm_inject_undefined - inject an undefined instruction into the guest
+ *
+ * It is assumed that this code is called from the VCPU thread and that the
+ * VCPU therefore is not currently executing guest code.
+ */
+void kvm_inject_undefined(struct kvm_vcpu *vcpu)
+{
+   inject_undef64(vcpu);
+}
-- 
1.8.1.4




[PATCH v3 19/32] arm64: KVM: Plug the VGIC

2013-04-08 Thread Marc Zyngier
Add support for the in-kernel GIC emulation. The include file
is a complete duplicate of the 32bit one - something to fix
at one point.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_vgic.h | 156 ++
 arch/arm64/kvm/hyp.S  |  88 +
 2 files changed, 244 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_vgic.h

diff --git a/arch/arm64/include/asm/kvm_vgic.h b/arch/arm64/include/asm/kvm_vgic.h
new file mode 100644
index 000..f353f22
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_vgic.h
@@ -0,0 +1,156 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_VGIC_H
+#define __ARM64_KVM_VGIC_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define VGIC_NR_IRQS   128
+#define VGIC_NR_SGIS   16
+#define VGIC_NR_PPIS   16
+#define VGIC_NR_PRIVATE_IRQS   (VGIC_NR_SGIS + VGIC_NR_PPIS)
+#define VGIC_NR_SHARED_IRQS(VGIC_NR_IRQS - VGIC_NR_PRIVATE_IRQS)
+#define VGIC_MAX_CPUS  KVM_MAX_VCPUS
+
+/* Sanity checks... */
+#if (VGIC_MAX_CPUS > 8)
+#error Invalid number of CPU interfaces
+#endif
+
+#if (VGIC_NR_IRQS & 31)
+#error "VGIC_NR_IRQS must be a multiple of 32"
+#endif
+
+#if (VGIC_NR_IRQS > 1024)
+#error "VGIC_NR_IRQS must be <= 1024"
+#endif
+
+/*
+ * The GIC distributor registers describing interrupts have two parts:
+ * - 32 per-CPU interrupts (SGI + PPI)
+ * - a bunch of shared interrupts (SPI)
+ */
+struct vgic_bitmap {
+   union {
+   u32 reg[VGIC_NR_PRIVATE_IRQS / 32];
+   DECLARE_BITMAP(reg_ul, VGIC_NR_PRIVATE_IRQS);
+   } percpu[VGIC_MAX_CPUS];
+   union {
+   u32 reg[VGIC_NR_SHARED_IRQS / 32];
+   DECLARE_BITMAP(reg_ul, VGIC_NR_SHARED_IRQS);
+   } shared;
+};
+
+struct vgic_bytemap {
+   u32 percpu[VGIC_MAX_CPUS][VGIC_NR_PRIVATE_IRQS / 4];
+   u32 shared[VGIC_NR_SHARED_IRQS  / 4];
+};
+
+struct vgic_dist {
+   spinlock_t  lock;
+   boolready;
+
+   /* Virtual control interface mapping */
+   void __iomem*vctrl_base;
+
+   /* Distributor and vcpu interface mapping in the guest */
+   phys_addr_t vgic_dist_base;
+   phys_addr_t vgic_cpu_base;
+
+   /* Distributor enabled */
+   u32 enabled;
+
+   /* Interrupt enabled (one bit per IRQ) */
+   struct vgic_bitmap  irq_enabled;
+
+   /* Interrupt 'pin' level */
+   struct vgic_bitmap  irq_state;
+
+   /* Level-triggered interrupt in progress */
+   struct vgic_bitmap  irq_active;
+
+   /* Interrupt priority. Not used yet. */
+   struct vgic_bytemap irq_priority;
+
+   /* Level/edge triggered */
+   struct vgic_bitmap  irq_cfg;
+
+   /* Source CPU per SGI and target CPU */
+   u8  irq_sgi_sources[VGIC_MAX_CPUS][16];
+
+   /* Target CPU for each IRQ */
+   u8  irq_spi_cpu[VGIC_NR_SHARED_IRQS];
+   struct vgic_bitmap  irq_spi_target[VGIC_MAX_CPUS];
+
+   /* Bitmap indicating which CPU has something pending */
+   unsigned long   irq_pending_on_cpu;
+};
+
+struct vgic_cpu {
+   /* per IRQ to LR mapping */
+   u8  vgic_irq_lr_map[VGIC_NR_IRQS];
+
+   /* Pending interrupts on this VCPU */
+   DECLARE_BITMAP( pending_percpu, VGIC_NR_PRIVATE_IRQS);
+   DECLARE_BITMAP( pending_shared, VGIC_NR_SHARED_IRQS);
+
+   /* Bitmap of used/free list registers */
+   DECLARE_BITMAP( lr_used, 64);
+
+   /* Number of list registers on this CPU */
+   int nr_lr;
+
+   /* CPU vif control registers for world switch */
+   u32 vgic_hcr;
+   u32 vgic_vmcr;
+   u32 vgic_misr;  /* Saved only */
+   u32 vgic_eisr[2];   /* Saved only */
+   u32 vgic_elrsr[2];  /* Saved only */
+   u32 vgic_apr;
+   u32 vgic_lr[64];/* A15 has only 4... */
+};
+
+#define LR_EMPTY   0xff
+
+struct kvm;
+struct kvm_vcpu;
+struct kvm_run;
+struct kvm_exit_mmio;
+
+int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
+int kvm_vgic_hyp_init(void);
+int kvm_vgic_init(struct kvm *

[PATCH v3 13/32] arm64: KVM: kvm_arch and kvm_vcpu_arch definitions

2013-04-08 Thread Marc Zyngier
Provide the architecture dependent structures for VM and
vcpu abstractions.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_host.h | 186 ++
 1 file changed, 186 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_host.h

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
new file mode 100644
index 000..a0279ff
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -0,0 +1,186 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/include/asm/kvm_host.h:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __ARM64_KVM_HOST_H__
+#define __ARM64_KVM_HOST_H__
+
+#include 
+#include 
+#include 
+
+#define KVM_MAX_VCPUS 4
+#define KVM_USER_MEM_SLOTS 32
+#define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
+#include 
+#include 
+
+#define KVM_VCPU_MAX_FEATURES 0
+
+/* We don't currently support large pages. */
+#define KVM_HPAGE_GFN_SHIFT(x) 0
+#define KVM_NR_PAGE_SIZES  1
+#define KVM_PAGES_PER_HPAGE(x) (1UL<<31)
+
+struct kvm_vcpu;
+int kvm_target_cpu(void);
+int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
+int kvm_arch_dev_ioctl_check_extension(long ext);
+
+struct kvm_arch {
+   /* The VMID generation used for the virt. memory system */
+   u64vmid_gen;
+   u32vmid;
+
+   /* 1-level 2nd stage table and lock */
+   spinlock_t pgd_lock;
+   pgd_t *pgd;
+
+   /* VTTBR value associated with above pgd and vmid */
+   u64vttbr;
+
+   /* Interrupt controller */
+   struct vgic_distvgic;
+
+   /* Timer */
+   struct arch_timer_kvm   timer;
+};
+
+#define KVM_NR_MEM_OBJS 40
+
+/*
+ * We don't want allocation failures within the mmu code, so we preallocate
+ * enough memory for a single page fault in a cache.
+ */
+struct kvm_mmu_memory_cache {
+   int nobjs;
+   void *objects[KVM_NR_MEM_OBJS];
+};
+
+struct kvm_vcpu_fault_info {
+   u32 esr_el2;/* Hyp Syndrom Register */
+   u64 far_el2;/* Hyp Fault Address Register */
+   u64 hpfar_el2;  /* Hyp IPA Fault Address Register */
+};
+
+struct kvm_cpu_context {
+   struct kvm_regs gp_regs;
+   u64 sys_regs[NR_SYS_REGS];
+};
+
+typedef struct kvm_cpu_context kvm_cpu_context_t;
+
+struct kvm_vcpu_arch {
+   struct kvm_cpu_context ctxt;
+
+   /* HYP configuration */
+   u64 hcr_el2;
+
+   /* Exception Information */
+   struct kvm_vcpu_fault_info fault;
+
+   /* Pointer to host CPU context */
+   kvm_cpu_context_t *host_cpu_context;
+
+   /* VGIC state */
+   struct vgic_cpu vgic_cpu;
+   struct arch_timer_cpu timer_cpu;
+
+   /*
+* Anything that is not used directly from assembly code goes
+* here.
+*/
+   /* dcache set/way operation pending */
+   int last_pcpu;
+   cpumask_t require_dcache_flush;
+
+   /* Don't run the guest */
+   bool pause;
+
+   /* IO related fields */
+   struct kvm_decode mmio_decode;
+
+   /* Interrupt related fields */
+   u64 irq_lines;  /* IRQ and FIQ levels */
+
+   /* Cache some mmu pages needed inside spinlock regions */
+   struct kvm_mmu_memory_cache mmu_page_cache;
+
+   /* Target CPU and feature flags */
+   u32 target;
+   DECLARE_BITMAP(features, KVM_VCPU_MAX_FEATURES);
+
+   /* Detect first run of a vcpu */
+   bool has_run_once;
+};
+
+#define vcpu_gp_regs(v)(&(v)->arch.ctxt.gp_regs)
+#define vcpu_sys_reg(v,r)  ((v)->arch.ctxt.sys_regs[(r)])
+#define vcpu_cp15(v,r) ((v)->arch.ctxt.cp15[(r)])
+
+struct kvm_vm_stat {
+   u32 remote_tlb_flush;
+};
+
+struct kvm_vcpu_stat {
+   u32 halt_wakeup;
+};
+
+struct kvm_vcpu_init;
+int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
+   const struct kvm_vcpu_init *init);
+unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
+int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
+struct kvm_one_reg;
+int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+
+#define KVM_ARCH_WANT_MMU_NOTIFIER
+struct kvm;
+int kvm_unmap_hv

[PATCH v3 26/32] arm64: KVM: 32bit handling of coprocessor traps

2013-04-08 Thread Marc Zyngier
Provide the necessary infrastructure to trap coprocessor accesses that
occur when running 32bit guests.

Also wire up the SMC and HVC traps for 32bit mode while we're at it.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_coproc.h |   5 +
 arch/arm64/kvm/handle_exit.c|   7 ++
 arch/arm64/kvm/sys_regs.c   | 178 ++--
 3 files changed, 183 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_coproc.h b/arch/arm64/include/asm/kvm_coproc.h
index 9b4477a..9a59301 100644
--- a/arch/arm64/include/asm/kvm_coproc.h
+++ b/arch/arm64/include/asm/kvm_coproc.h
@@ -32,11 +32,16 @@ struct kvm_sys_reg_table {
 
 struct kvm_sys_reg_target_table {
struct kvm_sys_reg_table table64;
+   struct kvm_sys_reg_table table32;
 };
 
 void kvm_register_target_sys_reg_table(unsigned int target,
   struct kvm_sys_reg_target_table *table);
 
+int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run);
 
 #define kvm_coproc_table_init kvm_sys_reg_table_init
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 4766b7f..9beaca03 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -62,6 +62,13 @@ static int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 static exit_handle_fn arm_exit_handlers[] = {
[ESR_EL2_EC_WFI]= kvm_handle_wfi,
+   [ESR_EL2_EC_CP15_32]= kvm_handle_cp15_32,
+   [ESR_EL2_EC_CP15_64]= kvm_handle_cp15_64,
+   [ESR_EL2_EC_CP14_MR]= kvm_handle_cp14_access,
+   [ESR_EL2_EC_CP14_LS]= kvm_handle_cp14_load_store,
+   [ESR_EL2_EC_CP14_64]= kvm_handle_cp14_access,
+   [ESR_EL2_EC_HVC32]  = handle_hvc,
+   [ESR_EL2_EC_SMC32]  = handle_smc,
[ESR_EL2_EC_HVC64]  = handle_hvc,
[ESR_EL2_EC_SMC64]  = handle_smc,
[ESR_EL2_EC_SYS64]  = kvm_handle_sys_reg,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 9df3b32..0303218 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -38,6 +38,10 @@
  * types are different. My gut feeling is that it should be pretty
  * easy to merge, but that would be an ABI breakage -- again. VFP
  * would also need to be abstracted.
+ *
+ * For AArch32, we only take care of what is being trapped. Anything
+ * that has to do with init and userspace access has to go via the
+ * 64bit interface.
  */
 
 /* 3 bits per cache level, as per CLIDR, but non-existent caches always 0 */
@@ -163,6 +167,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010),
  access_dcsw },
 
+   /* TEECR32_EL1 */
+   { Op0(0b10), Op1(0b010), CRn(0b), CRm(0b), Op2(0b000),
+ NULL, reset_val, TEECR32_EL1, 0 },
+   /* TEEHBR32_EL1 */
+   { Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b), Op2(0b000),
+ NULL, reset_val, TEEHBR32_EL1, 0 },
+   /* DBGVCR32_EL2 */
+   { Op0(0b10), Op1(0b100), CRn(0b), CRm(0b0111), Op2(0b000),
+ NULL, reset_val, DBGVCR32_EL2, 0 },
+
/* MPIDR_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b), CRm(0b), Op2(0b101),
  NULL, reset_mpidr, MPIDR_EL1 },
@@ -273,6 +287,39 @@ static const struct sys_reg_desc sys_reg_descs[] = {
/* TPIDRRO_EL0 */
{ Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b), Op2(0b011),
  NULL, reset_unknown, TPIDRRO_EL0 },
+
+   /* DACR32_EL2 */
+   { Op0(0b11), Op1(0b100), CRn(0b0011), CRm(0b), Op2(0b000),
+ NULL, reset_unknown, DACR32_EL2 },
+   /* IFSR32_EL2 */
+   { Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b), Op2(0b001),
+ NULL, reset_unknown, IFSR32_EL2 },
+   /* FPEXC32_EL2 */
+   { Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0011), Op2(0b000),
+ NULL, reset_val, FPEXC32_EL2, 0x70 },
+};
+
+/* Trapped cp15 registers */
+static const struct sys_reg_desc cp15_regs[] = {
+   /*
+* DC{C,I,CI}SW operations:
+*/
+   { Op1( 0), CRn( 7), CRm( 6), Op2( 2), access_dcsw },
+   { Op1( 0), CRn( 7), CRm(10), Op2( 2), access_dcsw },
+   { Op1( 0), CRn( 7), CRm(14), Op2( 2), access_dcsw },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 0), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 1), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 2), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 3), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 5), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 6), pm_fake },
+   { Op1( 0), CRn( 9), CRm(12), Op2( 7), pm_fake },
+   

[PATCH v3 25/32] arm64: KVM: 32bit conditional execution emulation

2013-04-08 Thread Marc Zyngier
As conditional instructions can trap on AArch32, add the thinnest
possible emulation layer to keep 32bit guests happy.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_emulate.h |  13 ++-
 arch/arm64/kvm/Makefile  |   2 +-
 arch/arm64/kvm/emulate.c | 154 +++
 3 files changed, 166 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/kvm/emulate.c

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 37a6567..8d4ab33 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -31,6 +31,9 @@
 unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu);
 
+bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
+void kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr);
+
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
@@ -57,12 +60,18 @@ static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
 
 static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
 {
-   return true;/* No conditionals on arm64 */
+   if (vcpu_mode_is_32bit(vcpu))
+   return kvm_condition_valid32(vcpu);
+
+   return true;
 }
 
 static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
 {
-   *vcpu_pc(vcpu) += 4;
+   if (vcpu_mode_is_32bit(vcpu))
+   kvm_skip_instr32(vcpu, is_wide_instr);
+   else
+   *vcpu_pc(vcpu) += 4;
 }
 
 static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 1668448..88c6639 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -11,7 +11,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
 kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../arch/arm/kvm/, arm.o mmu.o mmio.o psci.o perf.o)
 
-kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
+kvm-$(CONFIG_KVM_ARM_HOST) += emulate.o inject_fault.o regmap.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
 kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
 
diff --git a/arch/arm64/kvm/emulate.c b/arch/arm64/kvm/emulate.c
new file mode 100644
index 0000000..01d4713
--- /dev/null
+++ b/arch/arm64/kvm/emulate.c
@@ -0,0 +1,154 @@
+/*
+ * (not much of an) Emulation layer for 32bit guests.
+ *
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include 
+#include 
+
+/*
+ * stolen from arch/arm/kernel/opcodes.c
+ *
+ * condition code lookup table
+ * index into the table is test code: EQ, NE, ... LT, GT, AL, NV
+ *
+ * bit position in short is condition code: NZCV
+ */
+static const unsigned short cc_map[16] = {
+   0xF0F0, /* EQ == Z set*/
+   0x0F0F, /* NE */
+   0xCCCC, /* CS == C set*/
+   0x3333, /* CC */
+   0xFF00, /* MI == N set*/
+   0x00FF, /* PL */
+   0xAAAA, /* VS == V set*/
+   0x5555, /* VC */
+   0x0C0C, /* HI == C set && Z clear */
+   0xF3F3, /* LS == C clear || Z set */
+   0xAA55, /* GE == (N==V)   */
+   0x55AA, /* LT == (N!=V)   */
+   0x0A05, /* GT == (!Z && (N==V))   */
+   0xF5FA, /* LE == (Z || (N!=V))*/
+   0xFFFF, /* AL always  */
+   0   /* NV */
+};
+
+static int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
+{
+   u32 esr = kvm_vcpu_get_hsr(vcpu);
+
+   if (esr & ESR_EL2_CV)
+   return (esr & ESR_EL2_COND) >> ESR_EL2_COND_SHIFT;
+
+   return -1;
+}
+
+/*
+ * Check if a trapped instruction should have been executed or not.
+ */
+bool kvm_condition_valid32(const struct kvm_vcpu
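The cc_map table encodes, for each of the 16 ARM condition codes, which NZCV flag nibbles satisfy it: kvm_condition_valid32() indexes the table with the trapped instruction's condition and tests the bit selected by the guest's current flags. A minimal userspace sketch of that lookup (the helper name and CPSR handling are illustrative, not the kernel's exact code):

```c
#include <assert.h>
#include <stdbool.h>

/* Condition code lookup table, as in the patch (originally from
 * arch/arm/kernel/opcodes.c): index is the condition code, bit
 * position is the NZCV flag nibble. */
static const unsigned short cc_map[16] = {
	0xF0F0, /* EQ */ 0x0F0F, /* NE */ 0xCCCC, /* CS */ 0x3333, /* CC */
	0xFF00, /* MI */ 0x00FF, /* PL */ 0xAAAA, /* VS */ 0x5555, /* VC */
	0x0C0C, /* HI */ 0xF3F3, /* LS */ 0xAA55, /* GE */ 0x55AA, /* LT */
	0x0A05, /* GT */ 0xF5FA, /* LE */ 0xFFFF, /* AL */ 0x0000, /* NV */
};

/* Decide whether an instruction with condition 'cond' would execute,
 * given a CPSR whose top nibble holds the NZCV flags. */
bool cond_passes(unsigned cond, unsigned cpsr)
{
	unsigned flags = cpsr >> 28;	/* NZCV nibble */

	return (cc_map[cond & 0xf] >> flags) & 1;
}
```

For example, EQ (condition 0) passes exactly when Z is set, and AL (condition 14) passes for every flag state.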

[PATCH v3 24/32] arm64: KVM: 32bit GP register access

2013-04-08 Thread Marc Zyngier
Allow access to the 32bit register file through the usual API.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_emulate.h |  17 +++-
 arch/arm64/kvm/Makefile  |   2 +-
 arch/arm64/kvm/regmap.c  | 168 +++
 3 files changed, 184 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/kvm/regmap.c

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 2dcfa74..37a6567 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -28,6 +28,9 @@
 #include 
 #include 
 
+unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
+unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu);
+
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
@@ -49,7 +52,7 @@ static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
 
 static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
 {
-   return false;   /* 32bit? Bahhh... */
+   return !!(*vcpu_cpsr(vcpu) & PSR_MODE32_BIT);
 }
 
 static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
@@ -64,28 +67,38 @@ static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
 
 static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 {
+   *vcpu_cpsr(vcpu) |= COMPAT_PSR_T_BIT;
 }
 
 static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
 {
+   if (vcpu_mode_is_32bit(vcpu))
+   return vcpu_reg32(vcpu, reg_num);
+
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
 }
 
 /* Get vcpu SPSR for current mode */
 static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
 {
+   if (vcpu_mode_is_32bit(vcpu))
+   return vcpu_spsr32(vcpu);
+
return (unsigned long *)&vcpu_gp_regs(vcpu)->spsr[KVM_SPSR_EL1];
 }
 
 static inline bool kvm_vcpu_reg_is_pc(const struct kvm_vcpu *vcpu, int reg)
 {
-   return false;
+   return (vcpu_mode_is_32bit(vcpu)) && reg == 15;
 }
 
 static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu)
 {
u32 mode = *vcpu_cpsr(vcpu) & PSR_MODE_MASK;
 
+   if (vcpu_mode_is_32bit(vcpu))
+   return mode > COMPAT_PSR_MODE_USR;
+
return mode != PSR_MODE_EL0t;
 }
 
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index be9eb3833..1668448 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -11,7 +11,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += kvm.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
 kvm-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../arch/arm/kvm/, arm.o mmu.o mmio.o psci.o perf.o)
 
-kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o
+kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
 kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
 
diff --git a/arch/arm64/kvm/regmap.c b/arch/arm64/kvm/regmap.c
new file mode 100644
index 0000000..bbc6ae3
--- /dev/null
+++ b/arch/arm64/kvm/regmap.c
@@ -0,0 +1,168 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier 
+ *
+ * Derived from arch/arm/kvm/emulate.c:
+ * Copyright (C) 2012 - Virtual Open Systems and Columbia University
+ * Author: Christoffer Dall 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define VCPU_NR_MODES 6
+#define REG_OFFSET(_reg) \
+   (offsetof(struct user_pt_regs, _reg) / sizeof(unsigned long))
+
+#define USR_REG_OFFSET(R) REG_OFFSET(compat_usr(R))
+
+static const unsigned long vcpu_reg_offsets[VCPU_NR_MODES][16] = {
+   /* USR Registers */
+   {
+   USR_REG_OFFSET(0), USR_REG_OFFSET(1), USR_REG_OFFSET(2),
+   USR_REG_OFFSET(3), USR_REG_OFFSET(4), USR_REG_OFFSET(5),
+   USR_REG_OFFSET(6), USR_REG_OFFSET(7), USR_REG_OFFSET(8),
+   USR_REG_OFFSET(9), USR_REG_OFFSET(10), USR_REG_OFFSET(11),
+   USR_REG_OFFSET(12), USR_REG_OFFSET(13), USR_REG_OFFSET(14),
+   REG_OFFSET(pc)
+   },
+
+   /* FIQ Registers */
+   {
+   USR_REG_OFFSET(0), USR_REG_OFFSET(1), USR_REG_OFFSET(2),
+   USR_REG_OFFSET(3), USR_REG_OFFSET(4), 
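The offset table above maps a 32bit register number in a given mode to a slot inside the 64bit register file, expressed in units of unsigned long so the struct can be indexed as a flat array. A simplified userspace sketch of the same technique, using a mock register struct (field and helper names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct user_pt_regs: 31 flat
 * general-purpose registers plus the pc. */
struct mock_pt_regs {
	unsigned long regs[31];
	unsigned long pc;
};

/* As in the patch: offsets are expressed in units of unsigned long so
 * a register can be reached by indexing the struct as a flat array. */
#define REG_OFFSET(_reg) \
	(offsetof(struct mock_pt_regs, _reg) / sizeof(unsigned long))

/* USR-mode row of the banked-register table: r0-r14 map straight onto
 * regs[0..14], and r15 maps onto pc. */
static const unsigned long usr_offsets[16] = {
	REG_OFFSET(regs[0]),  REG_OFFSET(regs[1]),  REG_OFFSET(regs[2]),
	REG_OFFSET(regs[3]),  REG_OFFSET(regs[4]),  REG_OFFSET(regs[5]),
	REG_OFFSET(regs[6]),  REG_OFFSET(regs[7]),  REG_OFFSET(regs[8]),
	REG_OFFSET(regs[9]),  REG_OFFSET(regs[10]), REG_OFFSET(regs[11]),
	REG_OFFSET(regs[12]), REG_OFFSET(regs[13]), REG_OFFSET(regs[14]),
	REG_OFFSET(pc)
};

/* vcpu_reg32()-style lookup: treat the struct as an array of
 * unsigned long and index it with the table entry. */
unsigned long *reg32(struct mock_pt_regs *r, unsigned reg_num)
{
	return (unsigned long *)r + usr_offsets[reg_num & 0xf];
}
```

The full kernel table adds further rows for FIQ, IRQ, SVC, ABT, and UND so banked registers resolve to their own storage.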

[PATCH v3 27/32] arm64: KVM: CPU specific 32bit coprocessor access

2013-04-08 Thread Marc Zyngier
Enable handling of CPU specific 32bit coprocessor access. Not much
here either.

Reviewed-by: Christopher Covington 
Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/sys_regs_generic_v8.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c b/arch/arm64/kvm/sys_regs_generic_v8.c
index d4e8039..4268ab9 100644
--- a/arch/arm64/kvm/sys_regs_generic_v8.c
+++ b/arch/arm64/kvm/sys_regs_generic_v8.c
@@ -59,11 +59,21 @@ static const struct sys_reg_desc genericv8_sys_regs[] = {
  access_actlr, reset_actlr, ACTLR_EL1 },
 };
 
+static const struct sys_reg_desc genericv8_cp15_regs[] = {
+   /* ACTLR */
+   { Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b001),
+ access_actlr },
+};
+
 static struct kvm_sys_reg_target_table genericv8_target_table = {
.table64 = {
.table = genericv8_sys_regs,
.num = ARRAY_SIZE(genericv8_sys_regs),
},
+   .table32 = {
+   .table = genericv8_cp15_regs,
+   .num = ARRAY_SIZE(genericv8_cp15_regs),
+   },
 };
 
 static int __init sys_reg_genericv8_init(void)
-- 
1.8.1.4


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 30/32] arm64: KVM: enable initialization of a 32bit vcpu

2013-04-08 Thread Marc Zyngier
Wire the init of a 32bit vcpu by allowing 32bit modes in pstate,
and providing sensible defaults out of reset state.

This feature is of course conditional on the presence of 32bit
capability on the physical CPU, and is reported through the KVM_CAP_ARM_EL1_32BIT
capability.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/include/asm/kvm_host.h |  2 +-
 arch/arm64/include/uapi/asm/kvm.h |  1 +
 arch/arm64/kvm/guest.c|  6 ++
 arch/arm64/kvm/reset.c| 25 -
 include/uapi/linux/kvm.h  |  1 +
 5 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d44064d..c3ec107 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -34,7 +34,7 @@
 #include 
 #include 
 
-#define KVM_VCPU_MAX_FEATURES 1
+#define KVM_VCPU_MAX_FEATURES 2
 
 /* We don't currently support large pages. */
 #define KVM_HPAGE_GFN_SHIFT(x) 0
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 5b1110c..5031f42 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -75,6 +75,7 @@ struct kvm_regs {
 #define KVM_VGIC_V2_CPU_SIZE   0x2000
 
 #define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
+#define KVM_ARM_VCPU_EL1_32BIT 1 /* CPU running a 32bit VM */
 
 struct kvm_vcpu_init {
__u32 target;
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 47d3729..74ef7d5 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -93,6 +93,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if (off == KVM_REG_ARM_CORE_REG(regs.pstate)) {
unsigned long mode = (*(unsigned long *)valp) & COMPAT_PSR_MODE_MASK;
switch (mode) {
+   case COMPAT_PSR_MODE_USR:
+   case COMPAT_PSR_MODE_FIQ:
+   case COMPAT_PSR_MODE_IRQ:
+   case COMPAT_PSR_MODE_SVC:
+   case COMPAT_PSR_MODE_ABT:
+   case COMPAT_PSR_MODE_UND:
case PSR_MODE_EL0t:
case PSR_MODE_EL1t:
case PSR_MODE_EL1h:
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index bc33e76..a282d35 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -35,11 +35,27 @@ static struct kvm_regs default_regs_reset = {
.regs.pstate = PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
 };
 
+static struct kvm_regs default_regs_reset32 = {
+   .regs.pstate = (COMPAT_PSR_MODE_SVC | COMPAT_PSR_A_BIT |
+   COMPAT_PSR_I_BIT | COMPAT_PSR_F_BIT),
+};
+
+static bool cpu_has_32bit_el1(void)
+{
+   u64 pfr0;
+
+   pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+   return !!(pfr0 & 0x20);
+}
+
 int kvm_arch_dev_ioctl_check_extention(long ext)
 {
int r;
 
switch (ext) {
+   case KVM_CAP_ARM_EL1_32BIT:
+   r = cpu_has_32bit_el1();
+   break;
default:
r = 0;
}
@@ -62,7 +78,14 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 
switch (vcpu->arch.target) {
default:
-   cpu_reset = &default_regs_reset;
+   if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
+   if (!cpu_has_32bit_el1())
+   return -EINVAL;
+   cpu_reset = &default_regs_reset32;
+   vcpu->arch.hcr_el2 &= ~HCR_RW;
+   } else {
+   cpu_reset = &default_regs_reset;
+   }
break;
}
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2bf42b0..ce05763 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -665,6 +665,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_EPR 86
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
+#define KVM_CAP_ARM_EL1_32BIT 89
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.8.1.4


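The cpu_has_32bit_el1() check above tests bit 5 of ID_AA64PFR0_EL1. The EL1 field occupies bits [7:4], where 0b0001 means EL1 runs in AArch64 state only and 0b0010 means AArch64 and AArch32 are both supported, so pfr0 & 0x20 is a compact test for 32bit support at EL1. A small userspace sketch of the decode (field layout per the ARMv8 architecture; helper names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64PFR0_EL1.EL1 lives in bits [7:4]: 0b0001 means EL1 is
 * AArch64 only, 0b0010 means AArch64 and AArch32 are both supported. */
unsigned el1_field(uint64_t pfr0)
{
	return (pfr0 >> 4) & 0xf;
}

/* The patch's test, pfr0 & 0x20, picks out bit 1 of that field, which
 * is set exactly in the "AArch64 + AArch32" encoding. */
bool cpu_has_32bit_el1(uint64_t pfr0)
{
	return !!(pfr0 & 0x20);
}
```

On a CPU without 32bit EL1 the field reads 0b0001, the bit is clear, and KVM_ARM_VCPU_EL1_32BIT init is refused with -EINVAL as in kvm_reset_vcpu() above.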


Re: [Qemu-devel] [PATCH uq/master v2 1/2] kvm: reset state from the CPU's reset method

2013-04-08 Thread Gleb Natapov
On Mon, Apr 08, 2013 at 04:36:47PM +0200, Andreas Färber wrote:
> On 08.04.2013 14:19, Gleb Natapov wrote:
> > On Tue, Apr 02, 2013 at 04:29:32PM +0300, Gleb Natapov wrote:
> >>>  static void kvm_sw_tlb_put(PowerPCCPU *cpu)
> >>>  {
> >>>  CPUPPCState *env = &cpu->env;
> >>> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
> >>> index 23fe51f..6321384 100644
> >>> --- a/target-s390x/cpu.c
> >>> +++ b/target-s390x/cpu.c
> >>> @@ -84,6 +84,10 @@ static void s390_cpu_reset(CPUState *s)
> >>>   * after incrementing the cpu counter */
> >>>  #if !defined(CONFIG_USER_ONLY)
> >>>  s->halted = 1;
> >>> +
> >>> +if (kvm_enabled()) {
> >>> +kvm_arch_reset_vcpu(s);
> >> Does this compile with kvm support disabled?
> >>
> > Well, it does not:
> >   CC    s390x-softmmu/target-s390x/cpu.o
> > /users/gleb/work/qemu/target-s390x/cpu.c: In function 's390_cpu_reset':
> > /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: implicit
> > declaration of function 'kvm_arch_reset_vcpu'
> > [-Werror=implicit-function-declaration]
> > /users/gleb/work/qemu/target-s390x/cpu.c:89:9: error: nested extern
> > declaration of 'kvm_arch_reset_vcpu' [-Werror=nested-externs]
> > cc1: all warnings being treated as errors
> > 
> > I wonder if it is portable across compilers to rely on code in if(0){} to
> > be dropped at all optimization levels.
> 
> No, we had a previous case where --enable-debug broke if (kvm_enabled())
> {...} but regular builds worked.
> 
Can you recall which compiler it was? 4.7.2 works with -O0.

--
Gleb.


Re: [PATCH] KVM: x86: Fix memory leak in vmx.c

2013-04-08 Thread Andrew Honig
On Mon, Apr 8, 2013 at 2:24 AM, Gleb Natapov  wrote:
> On Thu, Apr 04, 2013 at 12:39:47PM -0700, Andrew Honig wrote:
>> If userspace creates and destroys multiple VMs within the same process
>> we leak 20k of memory in the userspace process context per VM.  This
>> patch frees the memory in kvm_arch_destroy_vm.  If the process exits
>> without closing the VM file descriptor or the file descriptor has been
>> shared with another process then we don't need to free the memory.
>>
>> Messing with user space memory from an fd is not ideal, but other changes
>> would require user space changes and this is consistent with how the
>> memory is currently allocated.
>>
>> Tested: Test ran several VMs and ran against test program meant to
>> demonstrate the leak (www.spinics.net/lists/kvm/msg83734.html).
>>
>> Signed-off-by: Andrew Honig 
>>
>> ---
>>  arch/x86/include/asm/kvm_host.h |3 +++
>>  arch/x86/kvm/vmx.c  |3 +++
>>  arch/x86/kvm/x86.c  |   11 +++
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 4979778..975a74d 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -553,6 +553,9 @@ struct kvm_arch {
>>   struct page *ept_identity_pagetable;
>>   bool ept_identity_pagetable_done;
>>   gpa_t ept_identity_map_addr;
>> + unsigned long ept_ptr;
>> + unsigned long apic_ptr;
>> + unsigned long tss_ptr;
>>
> Better to use __kvm_set_memory_region() with memory_size = 0 to delete
> the slot and fix kvm_arch_prepare_memory_region() to unmap if
> change == KVM_MR_DELETE.
>
Will do in the next version.

>>   unsigned long irq_sources_bitmap;
>>   s64 kvmclock_offset;
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 6667042..8aa5d81 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3703,6 +3703,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>>   }
>>
>>   kvm->arch.apic_access_page = page;
>> + kvm->arch.apic_ptr = kvm_userspace_mem.userspace_addr;
>>  out:
>>   mutex_unlock(&kvm->slots_lock);
>>   return r;
>> @@ -3733,6 +3734,7 @@ static int alloc_identity_pagetable(struct kvm *kvm)
>>   }
>>
>>   kvm->arch.ept_identity_pagetable = page;
>> + kvm->arch.ept_ptr = kvm_userspace_mem.userspace_addr;
>>  out:
>>   mutex_unlock(&kvm->slots_lock);
>>   return r;
>> @@ -4366,6 +4368,7 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
>>   if (ret)
>>   return ret;
>>   kvm->arch.tss_addr = addr;
>> + kvm->arch.tss_ptr = tss_mem.userspace_addr;
>>   if (!init_rmode_tss(kvm))
>>   return  -ENOMEM;
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index f19ac0a..411ff2a 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6812,6 +6812,16 @@ void kvm_arch_sync_events(struct kvm *kvm)
>>
>>  void kvm_arch_destroy_vm(struct kvm *kvm)
>>  {
>> + if (current->mm == kvm->mm) {
>> + /*
>> +  * Free pages allocated on behalf of userspace, unless
>> +  * the memory map has changed due to process exit or fd
>> +  * copying.
>> +  */
> Why does mm change during process exit? And what do you mean by fd copying?
> One process creates a kvm fd and passes it to another? In this case I think
> the leak will still be there since all of the addresses below are
> mapped after the kvm fd is created. apic_access_page and identity_pagetable
> during first vcpu creation and tss when the KVM_SET_TSS_ADDR ioctl is
> called. Vcpu creation and the ioctl call can be done by a different process
> from the one that created the kvm fd.
>
>

The mm changes during process exit because exit_mm is called to clean
up process memory before exit_files is called.  If the process exits
without closing the fds then the vm will be closed after the mm is
destroyed and set to null.  Without checking for that, we'd access the
invalid mm and panic.

By fd copying I mean passing the fd to another process over unix
domain sockets or by using fork().  My understanding was that vcpu
creation and the ioctl call could not be done by a different process than
the one that created the kvm fd.  Both kvm_vm_ioctl and kvm_vcpu_ioctl
start with a check to prevent this:
if (kvm->mm != current->mm)
  return -EIO;
There are already lots of requirements in the code that the mm being
used by a VCPU is the same as the mm used by the VM.

I agree that a process could still fork or otherwise pass the fd
between processes to induce the 5 pages to leak, but this couldn't
happen unless the processes were badly misusing the API in a way that
wouldn't work anyway.  This is of no use to a malicious user
space application either, because the 5 pages are in the user space
context.

These issues not withstanding this is a huge improvement over the
current situation where there's no w

Re: [PATCH] KVM: x86: Fix memory leak in vmx.c

2013-04-08 Thread Gleb Natapov
On Mon, Apr 08, 2013 at 10:11:52AM -0700, Andrew Honig wrote:
> On Mon, Apr 8, 2013 at 2:24 AM, Gleb Natapov  wrote:
> > On Thu, Apr 04, 2013 at 12:39:47PM -0700, Andrew Honig wrote:
> >> If userspace creates and destroys multiple VMs within the same process
> >> we leak 20k of memory in the userspace process context per VM.  This
> >> patch frees the memory in kvm_arch_destroy_vm.  If the process exits
> >> without closing the VM file descriptor or the file descriptor has been
> >> shared with another process then we don't need to free the memory.
> >>
> >> Messing with user space memory from an fd is not ideal, but other changes
> >> would require user space changes and this is consistent with how the
> >> memory is currently allocated.
> >>
> >> Tested: Test ran several VMs and ran against test program meant to
> >> demonstrate the leak (www.spinics.net/lists/kvm/msg83734.html).
> >>
> >> Signed-off-by: Andrew Honig 
> >>
> >> ---
> >>  arch/x86/include/asm/kvm_host.h |3 +++
> >>  arch/x86/kvm/vmx.c  |3 +++
> >>  arch/x86/kvm/x86.c  |   11 +++
> >>  3 files changed, 17 insertions(+)
> >>
> >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >> index 4979778..975a74d 100644
> >> --- a/arch/x86/include/asm/kvm_host.h
> >> +++ b/arch/x86/include/asm/kvm_host.h
> >> @@ -553,6 +553,9 @@ struct kvm_arch {
> >>   struct page *ept_identity_pagetable;
> >>   bool ept_identity_pagetable_done;
> >>   gpa_t ept_identity_map_addr;
> >> + unsigned long ept_ptr;
> >> + unsigned long apic_ptr;
> >> + unsigned long tss_ptr;
> >>
> > Better to use __kvm_set_memory_region() with memory_size = 0 to delete
> > the slot and fix kvm_arch_prepare_memory_region() to unmap if
> > change == KVM_MR_DELETE.
> >
> Will do in the next version.
> 
> >>   unsigned long irq_sources_bitmap;
> >>   s64 kvmclock_offset;
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index 6667042..8aa5d81 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -3703,6 +3703,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
> >>   }
> >>
> >>   kvm->arch.apic_access_page = page;
> >> + kvm->arch.apic_ptr = kvm_userspace_mem.userspace_addr;
> >>  out:
> >>   mutex_unlock(&kvm->slots_lock);
> >>   return r;
> >> @@ -3733,6 +3734,7 @@ static int alloc_identity_pagetable(struct kvm *kvm)
> >>   }
> >>
> >>   kvm->arch.ept_identity_pagetable = page;
> >> + kvm->arch.ept_ptr = kvm_userspace_mem.userspace_addr;
> >>  out:
> >>   mutex_unlock(&kvm->slots_lock);
> >>   return r;
> >> @@ -4366,6 +4368,7 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
> >>   if (ret)
> >>   return ret;
> >>   kvm->arch.tss_addr = addr;
> >> + kvm->arch.tss_ptr = tss_mem.userspace_addr;
> >>   if (!init_rmode_tss(kvm))
> >>   return  -ENOMEM;
> >>
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index f19ac0a..411ff2a 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -6812,6 +6812,16 @@ void kvm_arch_sync_events(struct kvm *kvm)
> >>
> >>  void kvm_arch_destroy_vm(struct kvm *kvm)
> >>  {
> >> + if (current->mm == kvm->mm) {
> >> + /*
> >> +  * Free pages allocated on behalf of userspace, unless
> >> +  * the memory map has changed due to process exit or fd
> >> +  * copying.
> >> +  */
> > Why does mm change during process exit? And what do you mean by fd copying?
> > One process creates a kvm fd and passes it to another? In this case I think
> > the leak will still be there since all of the addresses below are
> > mapped after the kvm fd is created. apic_access_page and identity_pagetable
> > during first vcpu creation and tss when the KVM_SET_TSS_ADDR ioctl is
> > called. Vcpu creation and the ioctl call can be done by a different process
> > from the one that created the kvm fd.
> >
> >
> 
> The mm changes during process exit because exit_mm is called to clean
> up process memory before exit_files is called.  If the process exits
> without closing the fds then the vm will be closed after the mm is
> destroyed and set to null.  Without checking for that, we'd access the
> invalid mm and panic.
> 
OK, thanks for explanation.

> By fd copying I mean passing the fd to another process over unix
> domain sockets or by using fork().  My understanding was that vcpu
> creation and the ioctl call could not be done by a different process than
> the one that created the kvm fd.  Both kvm_vm_ioctl and kvm_vcpu_ioctl
> start with a check to prevent this:
> if (kvm->mm != current->mm)
>   return -EIO;
> There are already lots of requirements in the code that the mm being
> used by a VCPU is the same as the mm used by the VM.
> 
Right you are.

> I agree that a process could still fork or otherwise pass the fd
> between processes to induce the 5 pages to lea
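The guard quoted in this exchange — if (kvm->mm != current->mm) return -EIO; — is what pins VM and vcpu ioctls to the address space of the creating process. A toy userspace analogue of that check (stand-in types; only the comparison mirrors the kernel):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal stand-ins: the kernel uses struct mm_struct and struct kvm. */
struct mm {
	int id;
};

struct kvm {
	struct mm *mm;
};

/* Analogue of the guard at the top of kvm_vm_ioctl()/kvm_vcpu_ioctl():
 * the operation is refused when the caller's mm is not the one the VM
 * was created under, keeping vcpu creation and ioctls in the creating
 * process. */
int vm_ioctl(struct kvm *kvm, struct mm *current_mm)
{
	if (kvm->mm != current_mm)
		return -EIO;
	return 0;	/* the real ioctl would be dispatched here */
}
```

A forked child shares the fd but, after exec or in another process, runs with a different mm, so every ioctl on the copied fd fails with -EIO.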

Re: [PULL 0/7] ppc patch queue 2013-03-22

2013-04-08 Thread Scott Wood

On 03/31/2013 06:05:40 AM, Alexander Graf wrote:


On 31.03.2013, at 12:49, Gleb Natapov wrote:

> On Tue, Mar 26, 2013 at 11:37:42AM -0500, Scott Wood wrote:
>> On 03/25/2013 08:33:12 PM, Gleb Natapov wrote:
>>> On Tue, Mar 26, 2013 at 12:35:09AM +0100, Alexander Graf wrote:

 On 26.03.2013, at 00:16, Scott Wood wrote:

> On 03/25/2013 05:59:39 PM, Alexander Graf wrote:
>> On 25.03.2013, at 23:54, Scott Wood wrote:
>>> On 03/25/2013 05:32:11 PM, Alexander Graf wrote:
 On 25.03.2013, at 23:21, Scott Wood wrote:
> -next?  These are bugfixes, at least partially for
>>> regressions from 3.8 (that I pointed out before the bugs were
>>> merged!), that should go into master.
>
> Also, what about:
> http://patchwork.ozlabs.org/patch/226227/
>
> You've got all four patches in kvm-ppc-3.9 as of a few
>>> weeks ago -- will you be requesting a pull for that soon?
 Sigh. I guess I've screwed up the whole "let's make -next
>>> an unusable tree and fix regressions in a separate one" workflow
>>> again. Sorry for that.
 Since the patches already trickled into kvm's next branch,
>>> all we can do now is to wait for them to come back through stable,
>>> right? Marcelo, Gleb?
>>>
>>> Well, you can still submit that kvm-ppc-3.9 pull request. :-)
>> I can, but nobody would pull it, as it'd create ugly merge
>>> commits when 3.10 opens
>
> That's a lousy excuse for leaving bugs unfixed.

 I agree. So if it doesn't hurt to have the same commits in
>>> kvm/next and kvm/master, I'd be more than happy to send another
>>> pull request with the important fixes against kvm/master as well.

>>> If it will result in the same commit showing twice in the Linus
>>> tree in 3.10 we cannot do that.
>>
>> Why?
>>
> Because Linus dislikes it and may refuse to pull. There is a way to avoid
> such double commits: push fix to Linus tree and merge it back to next.


Yes, that's the normal workflow. But what if we screw up (like I  
did)? Does having a working 3.9 kernel win over double commits in the  
tree? I'd say yes, but it might be best to just ask Linus directly.


Linus, I accidentally sent a pull request including fixes that were  
meant for master for kvm/next which got accepted. Now we have those  
commits in there. However, I would prefer if we could have them in  
master, so that we have a known good 3.9 kernel for kvm on powerpc.


I could send another pull request against master, but that would mean  
that after merging things back on the next merge window, there would  
be a few duplicate commits in the history.


Do you think that's a big no-go, or would you be ok with duplicate  
commits in case of an occasional screwup?


It doesn't look like there's much time left before 3.9 is released (rc6  
was released yesterday, and Linus said he expects rc7 to be the last),  
so could we come to a conclusion on this soon?  While I think it's  
ridiculous that "the same commit showing twice" would be a reason to  
let regressions go unfixed[1], at the very least please request a pull  
for the fourth bugfix patch, which should also go into 3.8 stable, and  
which did not go into the "next" branch (so no "duplicate commit" issue  
there).  If that doesn't make it into 3.9, it will likely never make it  
into 3.8 stable because there will be no more 3.8 stable releases at  
that point.


-Scott

[1] It doesn't help that the bugfix patches were posted almost two  
months ago, before the patches that introduced the bug were merged...



We've been accepted to Google Summer of Code 2013

2013-04-08 Thread Stefan Hajnoczi
Good news!  QEMU.org has been accepted to Google Summer of Code 2013.

This means students can begin considering our list of QEMU, kvm kernel
module, and libvirt project ideas:

http://qemu-project.org/Google_Summer_of_Code_2013

Student applications open April 22 at 19:00 UTC.  You can already view
the application template here:

http://www.google-melange.com/gsoc/org/google/gsoc2013/qemu

If you are an interested student, please take a look at the project
ideas and get in touch with the mentor for that project.  They can
help clarify the scope of the project and what skills are necessary.

You are invited to join the #qemu-gsoc IRC channel on irc.oftc.net
where questions about Google Summer of Code with QEMU.org are welcome.

Stefan


Re: KVM Guest Lock up (100%) again!

2013-04-08 Thread Phil Daws
Hello all,

Another lock-up again this evening :( I am wondering whether I should consider 
upgrading the kernel to 3.7.10 and the latest version of KVM. Thoughts?

Thanks.

- Original Message -
To: kvm@vger.kernel.org
Sent: Thursday, 4 April, 2013 3:36:11 PM
Subject: KVM Guest Lock up (100%) again!

One of my KVM guests locked up again at 100% CPU!  Any thoughts on how I can 
diagnose it? We would love to put it into production but I am very concerned about 
the current stability. I have tried to redirect the console, through screen, 
to see whether there is a spinlock that is causing the problem; though all I 
got in the log file was a login prompt. What is the correct way of redirecting 
the console in KVM on a CentOS 6.4 system, please? Thanks.
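For reference, the usual way to capture a guest console on a libvirt/KVM host such as CentOS 6.4 is a serial console rather than the graphical one: point the guest kernel at ttyS0 and give the domain a serial/console device. A sketch only, assuming a libvirt-managed guest; the domain name `guest1` and the baud rate are placeholders:

```
# In the guest's kernel command line (grub), keep tty0 and add a serial console:
#   console=tty0 console=ttyS0,115200

# In the domain XML ("virsh edit guest1"), under <devices>:
<serial type='pty'>
  <target port='0'/>
</serial>
<console type='pty'>
  <target type='serial' port='0'/>
</console>

# Then attach from the host:
#   virsh console guest1
```

With this in place, kernel messages (including a soft-lockup or spinlock splat, if one is printed) go to the serial console instead of only the VGA console, so they can be captured on the host.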


Re: [PATCH V3 1/2] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup

2013-04-08 Thread Nicholas A. Bellinger
On Mon, 2013-04-08 at 10:10 +0300, Michael S. Tsirkin wrote:
> On Wed, Apr 03, 2013 at 02:17:37PM +0800, Asias He wrote:
> > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > we check it in vhost_scsi_handle_vq(), we ignored the lock.
> > 
> > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > indicate the status of the endpoint, we use per virtqueue
> > vq->private_data to indicate it. In this way, we can only take the
> > vq->mutex lock which is per queue and make the concurrent multiqueue
> > process having less lock contention. Further, in the read side of
> > vq->private_data, we can even do not take the lock if it is accessed in
> > the vhost worker thread, because it is protected by "vhost rcu".
> > 
> > Signed-off-by: Asias He 
> 
> Not strictly 3.9 material itself but needed for the next one.
> 
> Acked-by: Michael S. Tsirkin 
> 

Applied to target-pending/master with a small change to
s/VHOST_FEATURES/VHOST_SCSI_FEATURES

Thanks Asias and MST!

--nab


