Re: [PATCH] Test case of multibyte NOP in emulation mode
On Wed, Jun 05, 2013 at 10:16:46AM +0800, 李春奇 wrote:
> Add multibyte NOP test case to kvm-unit-tests. This case can test one
> of the bugs seen when booting RHEL5.9 64-bit.

Adding the test to x86/realmode.c will be much easier.

> Signed-off-by: Arthur Chunqi Li
> ---
>  x86/emulator.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/x86/emulator.c b/x86/emulator.c
> index 96576e5..f26c70f 100644
> --- a/x86/emulator.c
> +++ b/x86/emulator.c
> @@ -901,6 +901,37 @@ static void test_simplealu(u32 *mem)
>  	report("test", *mem == 0x8400);
>  }
>
> +static void test_nopl(uint64_t *mem, uint8_t *insn_page,
> +		uint8_t *alt_insn_page, void *insn_ram)
> +{
> +	ulong *cr3 = (ulong *)read_cr3();
> +
> +	// Pad with RET instructions
> +	memset(insn_page, 0xc3, 4096);
> +	memset(alt_insn_page, 0xc3, 4096);
> +	// Place a trapping instruction in the page to trigger a VMEXIT
> +	insn_page[0] = 0x89; // mov %eax, (%rax)
> +	insn_page[1] = 0x00;
> +	insn_page[2] = 0x90; // nop
> +	// Place nopl 0x0(%eax) in alt_insn_page for the emulator to execute
> +	alt_insn_page[0] = 0x0f; // nop DWORD ptr [EAX]
> +	alt_insn_page[1] = 0x1f;
> +	alt_insn_page[2] = 0x00;
> +
> +	// Load the code TLB with insn_page, but point the page tables at
> +	// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
> +	// This will make the CPU trap on the insn_page instruction but the
> +	// hypervisor will see alt_insn_page.
> +	install_page(cr3, virt_to_phys(insn_page), insn_ram);
> +	// Load code TLB
> +	invlpg(insn_ram);
> +	asm volatile("call *%0" : : "r"(insn_ram + 3));
> +	// Trap, let hypervisor emulate at alt_insn_page
> +	install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
> +	asm volatile("call *%0" : : "r"(insn_ram), "a"(mem));
> +	report("nopl", 1);
> +}
> +
>  int main()
>  {
>  	void *mem;
> @@ -964,6 +995,8 @@ int main()
>
>  	test_string_io_mmio(mem);
>
> +	test_nopl(mem, insn_page, alt_insn_page, insn_ram);
> +
>  	printf("\nSUMMARY: %d tests, %d failures\n", tests, fails);
>  	return fails ? 1 : 0;
>  }
> --
> 1.7.9.5

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Intercepting task switches in svm/vmx with tdp enabled
On Wed, Jun 05, 2013 at 12:51:29AM -0500, Leo Prasath wrote:
> Hi,
>
> I am interested in intercepting task switches in vmx/svm in 64 bit
> mode with ept/npt enabled.
> However, I am not seeing the exit code due to task switch (9 for vmx
> and 125 for svm) in the list of vm exits that I see in a typical
> guest run.

I do not think the "task switch" exit means what you think it means.
This is not OS context switches, but an x86 CPU concept of a task that
can be switched by a hardware mechanism. No modern OS uses it, and in
64-bit mode it does not exist at all.

> I log the vm exit codes in the x86/svm.c:handle_exit method for svm
> and x86/vmx.c:vmx_handle_exit for vmx.
>
> Any pointers regarding this are very much appreciated.
>
> On a related note, does cr3 write interception approximate task switch
> interception?

That depends on how the OS works. For Linux it is probably true (if the
cr3 value changes).

> (I was able to intercept cr3 writes with svm while npt was enabled,
> but with vmx, I could intercept cr3 writes only with ept disabled.)
>
> Thanks,
> Leo
>
> Looking through the manuals, svm has a control bit in the VMCB for
> enabling / disabling task switch interception while vmx does not seem
> to have such a control bit.

Again, this is not the "task switch" you are looking for.

> -
> Excerpts from the manuals:
>
> Intel
> -----
>
> Exit reason #9 indicates a vm exit due to task switch.
>
> Vol. 3C 24-9: Some instructions cause VM exits regardless of the
> settings of the processor-based VM-execution controls (see Section
> 25.1.2), as do task switches (see Section 25.2).
>
> Vol. 3C 25-6: Task switches. Task switches are not allowed in VMX
> non-root operation. Any attempt to effect a task switch in VMX
> non-root operation causes a VM exit. See Section 25.4.2.
>
> AMD
> ---
>
> Intercept code to look for is: 7Dh VMEXIT_TASK_SWITCH task switch
>
> 15.14 AMD64 Technology Miscellaneous Intercepts: The SVM architecture
> includes intercepts to handle task switches, processor freezes due to
> FERR, and shutdown operations.
> Task switches can modify several resources that a VMM may want to
> protect (CR3, EFLAGS, LDT). However, instead of checking various
> intercepts (e.g., CR3 Write, LDTR Write) individually, task switches
> check only a single intercept bit.
>
> Page 581: Layout of VMCB says byte offset 00Ch: bit 29 intercepts
> task switches.

--
Gleb.
[PATCH 4/4] KVM: PPC: Add hugepage support for IOMMU in-kernel handling
This adds special support for huge pages (16MB). The reference counting
cannot easily be done for such pages in real mode (when the MMU is off),
so we add a list of huge pages. It is populated in virtual mode, and
get_page is called just once per huge page. Real mode handlers check
whether the requested page is huge and in the list; if so, no reference
counting is done, otherwise an exit to virtual mode happens. The list is
released at KVM exit.

At the moment the fastest card available for tests uses up to 9 huge
pages, so walking through this list is not very expensive. However this
can change and we may want to optimize it.

Cc: David Gibson
Signed-off-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
---
Changes:
2013/06/05:
* fixed compile error when CONFIG_IOMMU_API=n

2013/05/20:
* the real mode handler now searches for a huge page by gpa (used to be pte)
* the virtual mode handler prints a warning if it is called twice for the
same huge page, as the real mode handler is expected to fail just once -
when a huge page is not in the list yet
* the huge page is refcounted twice - when added to the hugepage list and
when used in the virtual mode hcall handler (can be optimized but it would
make the patch less nice)
---
 arch/powerpc/include/asm/kvm_host.h |  2 +
 arch/powerpc/include/asm/kvm_ppc.h  | 22 +
 arch/powerpc/kvm/book3s_64_vio.c    | 88 +--
 arch/powerpc/kvm/book3s_64_vio_hv.c | 40 ++--
 4 files changed, 146 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index ac0e2fe..4fc0865 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -181,6 +181,8 @@ struct kvmppc_spapr_tce_table {
 	u64 liobn;
 	u32 window_size;
 	struct iommu_group *grp;	/* used for IOMMU groups */
+	struct list_head hugepages;	/* used for IOMMU groups */
+	spinlock_t hugepages_lock;	/* used for IOMMU groups */
 	struct page *pages[0];
 };

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 934e01d..9054df0 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -149,6 +149,28 @@ extern long kvmppc_virtmode_h_put_tce_indirect(struct kvm_vcpu *vcpu,
 extern long kvmppc_virtmode_h_stuff_tce(struct kvm_vcpu *vcpu,
 		unsigned long liobn, unsigned long ioba,
 		unsigned long tce_value, unsigned long npages);
+
+/*
+ * The KVM guest can be backed with 16MB pages (qemu switch
+ * -mem-path /var/lib/hugetlbfs/global/pagesize-16MB/).
+ * In this case, we cannot do page counting from the real mode
+ * as the compound pages are used - they are linked in a list
+ * with pointers as virtual addresses which are inaccessible
+ * in real mode.
+ *
+ * The code below keeps a 16MB pages list and uses page struct
+ * in real mode if it is already locked in RAM and inserted into
+ * the list or switches to the virtual mode where it can be
+ * handled in a usual manner.
+ */
+struct kvmppc_iommu_hugepage {
+	struct list_head list;
+	pte_t pte;		/* Huge page PTE */
+	unsigned long gpa;	/* Guest physical address */
+	struct page *page;	/* page struct of the very first subpage */
+	unsigned long size;	/* Huge page size (always 16MB at the moment) */
+};
+
 extern long kvm_vm_ioctl_allocate_rma(struct kvm *kvm,
 				struct kvm_allocate_rma *rma);
 extern struct kvmppc_linear_info *kvm_alloc_rma(void);
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index ffb4698..9e2ba4d 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -45,6 +45,71 @@
 #define TCES_PER_PAGE	(PAGE_SIZE / sizeof(u64))
 #define ERROR_ADDR	((void *)~(unsigned long)0x0)

+#ifdef CONFIG_IOMMU_API
+/* Adds a new huge page descriptor to the list */
+static long kvmppc_iommu_hugepage_try_add(
+		struct kvmppc_spapr_tce_table *tt,
+		pte_t pte, unsigned long hva, unsigned long gpa,
+		unsigned long pg_size)
+{
+	long ret = 0;
+	struct kvmppc_iommu_hugepage *hp;
+	struct page *p;
+
+	spin_lock(&tt->hugepages_lock);
+	list_for_each_entry(hp, &tt->hugepages, list) {
+		if (hp->pte == pte)
+			goto unlock_exit;
+	}
+
+	hva = hva & ~(pg_size - 1);
+	ret = get_user_pages_fast(hva, 1, true/*write*/, &p);
+	if ((ret != 1) || !p) {
+		ret = -EFAULT;
+		goto unlock_exit;
+	}
+	ret = 0;
+
+	hp = kzalloc(sizeof(*hp), GFP_KERNEL);
+	if (!hp) {
+		ret = -ENOMEM;
+		goto unlock_exit;
+	}
+
+	hp->page = p;
+	hp->pte = pte;
+	hp->gpa = gpa & ~(pg_size - 1);
+	hp->size = pg_size;
+
+	list_add(&hp->list, &t
[PATCH 2/4] powerpc: Prepare to support kernel handling of IOMMU map/unmap
The current VFIO-on-POWER implementation supports only user mode driven
mapping, i.e. QEMU is sending requests to map/unmap pages. However this
approach is really slow, so we want to move that to KVM. Since H_PUT_TCE
can be extremely performance sensitive (especially with network adapters
where each packet needs to be mapped/unmapped), we chose to implement it
as a "fast" hypercall directly in "real mode" (processor still in the
guest context but MMU off).

To be able to do that, we need to provide some facilities to access the
struct page count within that real mode environment, as things like the
sparsemem vmemmap mappings aren't accessible.

This adds an API to increment/decrement the page counter, as the
get_user_pages API used for user mode mapping does not work in real mode.

CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.

Signed-off-by: Alexey Kardashevskiy
Reviewed-by: Paul Mackerras
Cc: David Gibson
Signed-off-by: Paul Mackerras
---
Changes:
2013-05-20:
* PageTail() is replaced by PageCompound() in order to have the same
checks for whether the page is huge in realmode_get_page() and
realmode_put_page()
---
 arch/powerpc/include/asm/pgtable-ppc64.h |  4 ++
 arch/powerpc/mm/init_64.c                | 77 +-
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index e3d55f6f..7b46e5f 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -376,6 +376,10 @@ static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
 }
 #endif /* !CONFIG_HUGETLB_PAGE */

+struct page *realmode_pfn_to_page(unsigned long pfn);
+int realmode_get_page(struct page *page);
+int realmode_put_page(struct page *page);
+
 #endif /* __ASSEMBLY__ */

 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index a90b9c4..ce3d8d4 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -297,5 +297,80 @@ void vmemmap_free(unsigned long start, unsigned long end)
 {
 }

-#endif /* CONFIG_SPARSEMEM_VMEMMAP */
+/*
+ * We do not have access to the sparsemem vmemmap, so we fall back to
+ * walking the list of sparsemem blocks which we already maintain for
+ * the sake of crashdump. In the long run, we might want to maintain
+ * a tree if performance of that linear walk becomes a problem.
+ *
+ * Any of the realmode_ functions can fail due to:
+ * 1) As real sparsemem blocks do not lie in RAM continuously (they
+ * are in virtual address space which is not available in the real mode),
+ * the requested page struct can be split between blocks so get_page/put_page
+ * may fail.
+ * 2) When huge pages are used, the get_page/put_page API will fail
+ * in real mode as the linked addresses in the page struct are virtual
+ * too.
+ * When 1) or 2) takes place, the API returns an error code to cause
+ * an exit to kernel virtual mode where the operation will be completed.
+ */
+struct page *realmode_pfn_to_page(unsigned long pfn)
+{
+	struct vmemmap_backing *vmem_back;
+	struct page *page;
+	unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
+	unsigned long pg_va = (unsigned long) pfn_to_page(pfn);
+
+	for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->list) {
+		if (pg_va < vmem_back->virt_addr)
+			continue;
+		/* Check that page struct is not split between real pages */
+		if ((pg_va + sizeof(struct page)) >
+				(vmem_back->virt_addr + page_size))
+			return NULL;
+
+		page = (struct page *) (vmem_back->phys + pg_va -
+				vmem_back->virt_addr);
+		return page;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
+
+#elif defined(CONFIG_FLATMEM)
+
+struct page *realmode_pfn_to_page(unsigned long pfn)
+{
+	struct page *page = pfn_to_page(pfn);
+	return page;
+}
+EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
+
+#endif /* CONFIG_SPARSEMEM_VMEMMAP/CONFIG_FLATMEM */
+
+#if defined(CONFIG_SPARSEMEM_VMEMMAP) || defined(CONFIG_FLATMEM)
+int realmode_get_page(struct page *page)
+{
+	if (PageCompound(page))
+		return -EAGAIN;
+
+	get_page(page);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(realmode_get_page);
+
+int realmode_put_page(struct page *page)
+{
+	if (PageCompound(page))
+		return -EAGAIN;
+
+	if (!atomic_add_unless(&page->_count, -1, 1))
+		return -EAGAIN;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(realmode_put_page);
+#endif
--
1.7.10.4
[PATCH 3/4] KVM: PPC: Add support for IOMMU in-kernel handling
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and
H_STUFF_TCE requests without passing them to QEMU, which should save time
on switching to QEMU and back.

Both real and virtual modes are supported: whenever the kernel fails to
handle a TCE request, it passes it to the virtual mode. If the virtual
mode handler fails as well, the request is passed to user mode, for
example, to QEMU.

This adds a new KVM_CAP_SPAPR_TCE_IOMMU ioctl to associate a virtual
PCI bus ID (LIOBN) with an IOMMU group, which enables in-kernel handling
of IOMMU map/unmap.

Tests show that this patch increases transmission speed from 220MB/s to
750..1020MB/s on a 10Gb network (Chelsio CXGB3 10Gb ethernet card).

Cc: David Gibson
Signed-off-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
---
Changes:
2013/06/05:
* changed capability number
* changed ioctl number
* updated the doc article number

2013/05/20:
* removed get_user() from real mode handlers
* kvm_vcpu_arch::tce_tmp usage extended. Now the real mode handler puts
translated TCEs there, tries realmode_get_page() on them and, if that
fails, passes control over to the virtual mode handler which tries to
finish the request handling
* kvmppc_lookup_pte() now does realmode_get_page() protected by BUSY bit
on a page
* the only reason to pass the request to user mode now is when user mode
did not register a TCE table in the kernel; in all other cases the
virtual mode handler is expected to do the job
---
 Documentation/virtual/kvm/api.txt   |  28 +
 arch/powerpc/include/asm/kvm_host.h |   3 +
 arch/powerpc/include/asm/kvm_ppc.h  |   2 +
 arch/powerpc/include/uapi/asm/kvm.h |   7 ++
 arch/powerpc/kvm/book3s_64_vio.c    | 198 ++-
 arch/powerpc/kvm/book3s_64_vio_hv.c | 193 +-
 arch/powerpc/kvm/powerpc.c          |  12 +++
 include/uapi/linux/kvm.h            |   2 +
 8 files changed, 439 insertions(+), 6 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 6c082ff..e962e3b 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2379,6 +2379,34 @@ the guest.
 Otherwise it might be better for the guest to continue using H_PUT_TCE
 hypercall (if KVM_CAP_SPAPR_TCE or KVM_CAP_SPAPR_TCE_IOMMU are present).

+4.84 KVM_CREATE_SPAPR_TCE_IOMMU
+
+Capability: KVM_CAP_SPAPR_TCE_IOMMU
+Architectures: powerpc
+Type: vm ioctl
+Parameters: struct kvm_create_spapr_tce_iommu (in)
+Returns: 0 on success, -1 on error
+
+This creates a link between an IOMMU group and a hardware TCE (translation
+control entry) table. This link lets the host kernel know what IOMMU
+group (i.e. TCE table) to use for the LIOBN number passed with
+the H_PUT_TCE, H_PUT_TCE_INDIRECT, H_STUFF_TCE hypercalls.
+
+/* for KVM_CAP_SPAPR_TCE_IOMMU */
+struct kvm_create_spapr_tce_iommu {
+	__u64 liobn;
+	__u32 iommu_id;
+	__u32 flags;
+};
+
+No flag is supported at the moment.
+
+When the guest issues a TCE call on a liobn for which a TCE table has been
+registered, the kernel will handle it in real mode, updating the hardware
+TCE table. TCE table calls for other liobns will cause a vm exit and must
+be handled by userspace.
+
+
 5. The kvm_run structure

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 85d8f26..ac0e2fe 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -180,6 +180,7 @@ struct kvmppc_spapr_tce_table {
 	struct kvm *kvm;
 	u64 liobn;
 	u32 window_size;
+	struct iommu_group *grp;	/* used for IOMMU groups */
 	struct page *pages[0];
 };

@@ -611,6 +612,8 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;

 	unsigned long *tce_tmp;		/* TCE cache for TCE_PUT_INDIRECT hcall */
+	unsigned long tce_tmp_num;	/* Number of handled TCEs in the cache */
+	unsigned long tce_reason;	/* The reason of switching to the virtmode */
 #endif
 };

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index e852921b..934e01d 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -133,6 +133,8 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);
 extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 				struct kvm_create_spapr_tce *args);
+extern long kvm_vm_ioctl_create_spapr_tce_iommu(struct kvm *kvm,
+				struct kvm_create_spapr_tce_iommu *args);
 extern struct kvmppc_spapr_tce_table *kvmppc_find_tce_table(
 		struct kvm_vcpu *vcpu, unsigned long liobn);
 extern long kvmppc_emulated_validate_tce(unsigned long tce);
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 0fb1a6e..cf82af4 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -319,6 +3
[PATCH 0/4 v3] KVM: PPC: IOMMU in-kernel handling
Ben, ping! :)

This series has tiny fixes (capability and ioctl numbers, changed
documentation, compile errors in some configurations). More details are
in the commit messages.

Rebased on v3.10-rc4.

Alexey Kardashevskiy (4):
  KVM: PPC: Add support for multiple-TCE hcalls
  powerpc: Prepare to support kernel handling of IOMMU map/unmap
  KVM: PPC: Add support for IOMMU in-kernel handling
  KVM: PPC: Add hugepage support for IOMMU in-kernel handling

 Documentation/virtual/kvm/api.txt        |  45 +++
 arch/powerpc/include/asm/kvm_host.h      |   7 +
 arch/powerpc/include/asm/kvm_ppc.h       |  40 ++-
 arch/powerpc/include/asm/pgtable-ppc64.h |   4 +
 arch/powerpc/include/uapi/asm/kvm.h      |   7 +
 arch/powerpc/kvm/book3s_64_vio.c         | 398 -
 arch/powerpc/kvm/book3s_64_vio_hv.c      | 471 --
 arch/powerpc/kvm/book3s_hv.c             |  39 +++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |   6 +
 arch/powerpc/kvm/book3s_pr_papr.c        |  37 ++-
 arch/powerpc/kvm/powerpc.c               |  15 +
 arch/powerpc/mm/init_64.c                |  77 -
 include/uapi/linux/kvm.h                 |   3 +
 13 files changed, 1121 insertions(+), 28 deletions(-)

--
1.7.10.4
[PATCH 1/4] KVM: PPC: Add support for multiple-TCE hcalls
This adds real mode handlers for the H_PUT_TCE_INDIRECT and H_STUFF_TCE
hypercalls for QEMU emulated devices such as IBMVIO devices or emulated
PCI. These calls allow adding multiple entries (up to 512) into the TCE
table in one call, which saves time on transitions to/from real mode.

This adds a tce_tmp cache to kvm_vcpu_arch to save valid TCEs (copied
from user and verified) before writing the whole list into the TCE table.
This cache will be utilized more in the upcoming VFIO/IOMMU support to
continue TCE list processing in virtual mode in the case where the real
mode handler failed for some reason.

This adds a guest physical to host real address converter and calls the
existing H_PUT_TCE handler. The converting function is going to be fully
utilized by upcoming VFIO supporting patches.

This also implements the KVM_CAP_PPC_MULTITCE capability, so in order to
support the functionality of this patch, QEMU needs to query for this
capability and set the "hcall-multi-tce" hypertas property only if the
capability is present; otherwise there will be serious performance
degradation.

Cc: David Gibson
Signed-off-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
---
Changelog:
2013/06/05:
* fixed typo about IBMVIO in the commit message
* updated doc and moved it to another section
* changed capability number

2013/05/21:
* added kvm_vcpu_arch::tce_tmp
* removed cleanup if put_indirect failed; instead we do not even start
writing to the TCE table if we cannot get TCEs from the user or they
are invalid
* kvmppc_emulated_h_put_tce is split into kvmppc_emulated_put_tce and
kvmppc_emulated_validate_tce (for the previous item)
* fixed bug with fallthrough for H_IPI
* removed all get_user() from real mode handlers
* kvmppc_lookup_pte() added (instead of making lookup_linux_pte public)
---
 Documentation/virtual/kvm/api.txt       |  17 ++
 arch/powerpc/include/asm/kvm_host.h     |   2 +
 arch/powerpc/include/asm/kvm_ppc.h      |  16 +-
 arch/powerpc/kvm/book3s_64_vio.c        | 118 ++
 arch/powerpc/kvm/book3s_64_vio_hv.c     | 266 +++
 arch/powerpc/kvm/book3s_hv.c            |  39 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   6 +
 arch/powerpc/kvm/book3s_pr_papr.c       |  37 -
 arch/powerpc/kvm/powerpc.c              |   3 +
 include/uapi/linux/kvm.h                |   1 +
 10 files changed, 473 insertions(+), 32 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 5f91eda..6c082ff 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2362,6 +2362,23 @@ calls by the guest for that service will be passed to userspace to be
 handled.

+4.83 KVM_CAP_PPC_MULTITCE
+
+Capability: KVM_CAP_PPC_MULTITCE
+Architectures: ppc
+Type: vm
+
+This capability tells the guest that multiple TCE entry add/remove
+hypercalls handling is supported by the kernel. This significantly
+accelerates DMA operations for PPC KVM guests.
+
+Unlike other capabilities in this section, this one does not have an ioctl.
+Instead, when the capability is present, the H_PUT_TCE_INDIRECT and
+H_STUFF_TCE hypercalls are to be handled in the host kernel and not passed to
+the guest.
+Otherwise it might be better for the guest to continue using H_PUT_TCE
+hypercall (if KVM_CAP_SPAPR_TCE or KVM_CAP_SPAPR_TCE_IOMMU are present).
+
+
 5. The kvm_run structure

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index af326cd..85d8f26 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -609,6 +609,8 @@ struct kvm_vcpu_arch {
 	spinlock_t tbacct_lock;
 	u64 busy_stolen;
 	u64 busy_preempt;
+
+	unsigned long *tce_tmp;		/* TCE cache for TCE_PUT_INDIRECT hcall */
 #endif
 };

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index a5287fe..e852921b 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -133,8 +133,20 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);
 extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 				struct kvm_create_spapr_tce *args);
-extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
-			     unsigned long ioba, unsigned long tce);
+extern struct kvmppc_spapr_tce_table *kvmppc_find_tce_table(
+		struct kvm_vcpu *vcpu, unsigned long liobn);
+extern long kvmppc_emulated_validate_tce(unsigned long tce);
+extern void kvmppc_emulated_put_tce(struct kvmppc_spapr_tce_table *tt,
+		unsigned long ioba, unsigned long tce);
+extern long kvmppc_virtmode_h_put_tce(struct kvm_vcpu *vcpu,
+		unsigned long liobn, unsigned long ioba,
+		unsigned long tce);
+extern long kvmppc_virtmode_h_put_tce_indirect(struct kvm_vcpu *vcpu,
+		unsigned long liobn, un
[PATCH] vfio: fix crash on rmmod
devtmpfs_delete_node() calls the devnode() callback with mode == NULL,
but vfio still tries to write there. This patch fixes the resulting
crash.

Signed-off-by: Alexey Kardashevskiy
---
Steps to reproduce on a freshly booted system with no devices given to
VFIO:

modprobe vfio
rmmod vfio_iommu_spapr_tce
rmmod vfio
---
 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 523c121..259ad28 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1360,7 +1360,7 @@ static const struct file_operations vfio_device_fops = {
  */
 static char *vfio_devnode(struct device *dev, umode_t *mode)
 {
-	if (MINOR(dev->devt) == 0)
+	if (mode && (MINOR(dev->devt) == 0))
 		*mode = S_IRUGO | S_IWUGO;

 	return kasprintf(GFP_KERNEL, "vfio/%s", dev_name(dev));
--
1.7.10.4
Re: Planning the merge of KVM/arm64
On Tue, Jun 04, 2013 at 10:57:32PM -0700, Christoffer Dall wrote:
> On 4 June 2013 09:37, Gleb Natapov wrote:
> > On Tue, Jun 04, 2013 at 05:51:41PM +0200, Paolo Bonzini wrote:
> >> Il 04/06/2013 17:43, Christoffer Dall ha scritto:
> >> > Hi Paolo,
> >> >
> >> > I don't think this is an issue. Gleb and Marcelo for example pulled
> >> > RMK's stable tree for my KVM/ARM updates for the 3.10 merge window
> >> > and that wasn't an issue. If Linus pulls the kvm/next tree first the
> >> > diffstat should be similar and everything clean enough, no?
> >> >
> >> > Catalin has previously expressed his wish to upstream the kvm/arm64
> >> > patches directly through him given the churn in a completely new
> >> > architecture and he wants to make sure that everything looks right.
> >> >
> >> > It's a pretty clean implementation with quite few dependencies and
> >> > merging as a working series should be a priority instead of the
> >> > Kconfig hack, imho.
> >>
> >> Ok, let's see what Gleb says.
> >>
> > I have no objection to merging arm64 kvm through Catalin if it means
> > less churn for everyone. That's what we did with arm and mips. Arm64
> > kvm has a dependency on kvm.git next though, so how does Catalin make
> > sure that everything looks right? Will he merge kvm.git/next into the
> > arm64 tree?
> >
> Yes, that was the idea. Everything in kvm/next is considered stable, right?
>
Right. Catalin should wait for kvm.git to be pulled by Linus next merge
window before sending his pull request, then.

--
Gleb.
Re: Planning the merge of KVM/arm64
On 4 June 2013 09:37, Gleb Natapov wrote:
> On Tue, Jun 04, 2013 at 05:51:41PM +0200, Paolo Bonzini wrote:
>> Il 04/06/2013 17:43, Christoffer Dall ha scritto:
>> > Hi Paolo,
>> >
>> > I don't think this is an issue. Gleb and Marcelo for example pulled
>> > RMK's stable tree for my KVM/ARM updates for the 3.10 merge window
>> > and that wasn't an issue. If Linus pulls the kvm/next tree first the
>> > diffstat should be similar and everything clean enough, no?
>> >
>> > Catalin has previously expressed his wish to upstream the kvm/arm64
>> > patches directly through him given the churn in a completely new
>> > architecture and he wants to make sure that everything looks right.
>> >
>> > It's a pretty clean implementation with quite few dependencies and
>> > merging as a working series should be a priority instead of the
>> > Kconfig hack, imho.
>>
>> Ok, let's see what Gleb says.
>>
> I have no objection to merging arm64 kvm through Catalin if it means
> less churn for everyone. That's what we did with arm and mips. Arm64
> kvm has a dependency on kvm.git next though, so how does Catalin make
> sure that everything looks right? Will he merge kvm.git/next into the
> arm64 tree?
>
Yes, that was the idea. Everything in kvm/next is considered stable, right?

-Christoffer
Intercepting task switches in svm/vmx with tdp enabled
Hi,

I am interested in intercepting task switches in vmx/svm in 64 bit mode
with ept/npt enabled. However, I am not seeing the exit code due to task
switch (9 for vmx and 125 for svm) in the list of vm exits that I see in
a typical guest run.

I log the vm exit codes in the x86/svm.c:handle_exit method for svm and
x86/vmx.c:vmx_handle_exit for vmx.

Any pointers regarding this are very much appreciated.

On a related note, does cr3 write interception approximate task switch
interception? (I was able to intercept cr3 writes with svm while npt was
enabled, but with vmx, I could intercept cr3 writes only with ept
disabled.)

Thanks,
Leo

Looking through the manuals, svm has a control bit in the VMCB for
enabling / disabling task switch interception while vmx does not seem to
have such a control bit.

-
Excerpts from the manuals:

Intel
-----

Exit reason #9 indicates a vm exit due to task switch.

Vol. 3C 24-9: Some instructions cause VM exits regardless of the settings
of the processor-based VM-execution controls (see Section 25.1.2), as do
task switches (see Section 25.2).

Vol. 3C 25-6: Task switches. Task switches are not allowed in VMX
non-root operation. Any attempt to effect a task switch in VMX non-root
operation causes a VM exit. See Section 25.4.2.

AMD
---

Intercept code to look for is: 7Dh VMEXIT_TASK_SWITCH task switch

15.14 AMD64 Technology Miscellaneous Intercepts: The SVM architecture
includes intercepts to handle task switches, processor freezes due to
FERR, and shutdown operations.
Task switches can modify several resources that a VMM may want to protect
(CR3, EFLAGS, LDT). However, instead of checking various intercepts
(e.g., CR3 Write, LDTR Write) individually, task switches check only a
single intercept bit.

Page 581: Layout of VMCB says byte offset 00Ch: bit 29 intercepts task
switches.
[PATCH] Test case of multibyte NOP in emulation mode
Add multibyte NOP test case to kvm-unit-tests. This case can test one of
the bugs seen when booting RHEL5.9 64-bit.

Signed-off-by: Arthur Chunqi Li
---
 x86/emulator.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/x86/emulator.c b/x86/emulator.c
index 96576e5..f26c70f 100644
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -901,6 +901,37 @@ static void test_simplealu(u32 *mem)
 	report("test", *mem == 0x8400);
 }

+static void test_nopl(uint64_t *mem, uint8_t *insn_page,
+		uint8_t *alt_insn_page, void *insn_ram)
+{
+	ulong *cr3 = (ulong *)read_cr3();
+
+	// Pad with RET instructions
+	memset(insn_page, 0xc3, 4096);
+	memset(alt_insn_page, 0xc3, 4096);
+	// Place a trapping instruction in the page to trigger a VMEXIT
+	insn_page[0] = 0x89; // mov %eax, (%rax)
+	insn_page[1] = 0x00;
+	insn_page[2] = 0x90; // nop
+	// Place nopl 0x0(%eax) in alt_insn_page for the emulator to execute
+	alt_insn_page[0] = 0x0f; // nop DWORD ptr [EAX]
+	alt_insn_page[1] = 0x1f;
+	alt_insn_page[2] = 0x00;
+
+	// Load the code TLB with insn_page, but point the page tables at
+	// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
+	// This will make the CPU trap on the insn_page instruction but the
+	// hypervisor will see alt_insn_page.
+	install_page(cr3, virt_to_phys(insn_page), insn_ram);
+	// Load code TLB
+	invlpg(insn_ram);
+	asm volatile("call *%0" : : "r"(insn_ram + 3));
+	// Trap, let hypervisor emulate at alt_insn_page
+	install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
+	asm volatile("call *%0" : : "r"(insn_ram), "a"(mem));
+	report("nopl", 1);
+}
+
 int main()
 {
 	void *mem;
@@ -964,6 +995,8 @@ int main()

 	test_string_io_mmio(mem);

+	test_nopl(mem, insn_page, alt_insn_page, insn_ram);
+
 	printf("\nSUMMARY: %d tests, %d failures\n", tests, fails);
 	return fails ? 1 : 0;
 }
--
1.7.9.5
Re: [PATCH v8 00/11] KVM: MMU: fast zap all shadow pages
On Fri, May 31, 2013 at 08:36:19AM +0800, Xiao Guangrong wrote: > Hi Gleb, Paolo, Marcelo, > > I have put the potentially controversial patches at the end; those are > patches 8 ~ 10, and patch 11 depends on patch 9. The other patches are fully reviewed; > I think they are ready to be merged. If we are not lucky enough and further discussion > is needed, could you please apply those patches first? :) > > Thank you in advance! Looks good to me.
Re: [PATCH v2 1/2] kvm: make vendor_intel a generic function
Il 04/06/2013 18:02, Bandan Das ha scritto: > Make vendor_intel generic so that functions in x86.c > can use it. > > v2: > Change vendor_intel function signature because the emulator > shouldn't be dealing with struct vcpu > > Signed-off-by: Bandan Das > --- > arch/x86/include/asm/kvm_emulate.h | 13 - > arch/x86/include/asm/kvm_host.h| 20 > arch/x86/kvm/emulate.c | 16 > 3 files changed, 24 insertions(+), 25 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_emulate.h > b/arch/x86/include/asm/kvm_emulate.h > index 15f960c..611a55f 100644 > --- a/arch/x86/include/asm/kvm_emulate.h > +++ b/arch/x86/include/asm/kvm_emulate.h > @@ -319,19 +319,6 @@ struct x86_emulate_ctxt { > #define REPE_PREFIX 0xf3 > #define REPNE_PREFIX 0xf2 > > -/* CPUID vendors */ > -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 > -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 > -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 > - > -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ebx 0x69444d41 > -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ecx 0x21726574 > -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_edx 0x74656273 > - > -#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 > -#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e > -#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 > - > enum x86_intercept_stage { > X86_ICTP_NONE = 0, /* Allow zero-init to not match anything */ > X86_ICPT_PRE_EXCEPT, > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 3741c65..ce9a44f 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -144,6 +144,19 @@ enum { > > #include > > +/* CPUID vendors */ > +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 > +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 > +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 > + > +#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ebx 0x69444d41 > +#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ecx 0x21726574 > 
+#define X86EMUL_CPUID_VENDOR_AMDisbetterI_edx 0x74656273 > + > +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 > +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e > +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 > + > #define KVM_NR_MEM_OBJS 40 > > #define KVM_NR_DB_REGS 4 > @@ -942,6 +955,13 @@ static inline unsigned long read_msr(unsigned long msr) > } > #endif > > +static inline bool vendor_intel(u32 ebx, u32 ecx, u32 edx) > +{ > + return ebx == X86EMUL_CPUID_VENDOR_GenuineIntel_ebx > + && ecx == X86EMUL_CPUID_VENDOR_GenuineIntel_ecx > + && edx == X86EMUL_CPUID_VENDOR_GenuineIntel_edx; > +} > + > static inline u32 get_rdx_init_val(void) > { > return 0x600; /* P6 family */ > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index 8db0010..87f12fc 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -2280,17 +2280,6 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt, > ss->avl = 0; > } > > -static bool vendor_intel(struct x86_emulate_ctxt *ctxt) > -{ > - u32 eax, ebx, ecx, edx; > - > - eax = ecx = 0; > - ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); > - return ebx == X86EMUL_CPUID_VENDOR_GenuineIntel_ebx > - && ecx == X86EMUL_CPUID_VENDOR_GenuineIntel_ecx > - && edx == X86EMUL_CPUID_VENDOR_GenuineIntel_edx; > -} > - > static bool em_syscall_is_enabled(struct x86_emulate_ctxt *ctxt) > { > const struct x86_emulate_ops *ops = ctxt->ops; > @@ -2400,6 +2389,7 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt) > u64 msr_data; > u16 cs_sel, ss_sel; > u64 efer = 0; > + u32 eax, ebx, ecx, edx; > > ops->get_msr(ctxt, MSR_EFER, &efer); > /* inject #GP if in real mode */ > @@ -2410,8 +2400,10 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt) >* Not recognized on AMD in compat mode (but is recognized in legacy >* mode). 
>*/ > + eax = ecx = 0; > + ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); > if ((ctxt->mode == X86EMUL_MODE_PROT32) && (efer & EFER_LMA) > - && !vendor_intel(ctxt)) > + && !vendor_intel(ebx, ecx, edx)) > return emulate_ud(ctxt); > > /* XXX sysenter/sysexit have not been tested in 64bit mode. > Reviewed-by: Paolo Bonzini -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 6/6] KVM: PPC: Book3E: Enhance FPU laziness
On 06/03/2013 03:54:28 PM, Mihai Caraman wrote: Adopt the AltiVec approach to increase laziness by calling kvmppc_load_guest_fp() just before returning to the guest instead of on each sched-in. Signed-off-by: Mihai Caraman If you did this *before* adding Altivec it would have saved a question in an earlier patch. :-) --- arch/powerpc/kvm/booke.c |1 + arch/powerpc/kvm/e500mc.c |2 -- 2 files changed, 1 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 019496d..5382238 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1258,6 +1258,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, } else { kvmppc_lazy_ee_enable(); kvmppc_load_guest_altivec(vcpu); + kvmppc_load_guest_fp(vcpu); } } You should probably do these before kvmppc_lazy_ee_enable(). Actually, I don't think this is a good idea at all. As I understand it, you're not supposed to take kernel ownership of floating point in non-atomic context, because an interrupt could itself call enable_kernel_fp(). Do you have benchmarks showing it's even worthwhile? -Scott
Re: [RFC PATCH 5/6] KVM: PPC: Book3E: Add ONE_REG AltiVec support
On 06/03/2013 03:54:27 PM, Mihai Caraman wrote: Add ONE_REG support for AltiVec on Book3E. Signed-off-by: Mihai Caraman --- arch/powerpc/kvm/booke.c | 32 1 files changed, 32 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 01eb635..019496d 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1570,6 +1570,22 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) case KVM_REG_PPC_DEBUG_INST: val = get_reg_val(reg->id, KVMPPC_INST_EHPRIV); break; +#ifdef CONFIG_ALTIVEC + case KVM_REG_PPC_VR0 ... KVM_REG_PPC_VR31: + if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { + r = -ENXIO; + break; + } + val.vval = vcpu->arch.vr[reg->id - KVM_REG_PPC_VR0]; + break; + case KVM_REG_PPC_VSCR: + if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { + r = -ENXIO; + break; + } + val = get_reg_val(reg->id, vcpu->arch.vscr.u[3]); + break; Why u[3]? -Scott -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 4/6] KVM: PPC: Book3E: Add AltiVec support
On 06/03/2013 03:54:26 PM, Mihai Caraman wrote: KVM Book3E FPU support gracefully reuse host infrastructure so we do the same for AltiVec. To keep AltiVec lazy call kvmppc_load_guest_altivec() just when returning to guest instead of each sched in. Signed-off-by: Mihai Caraman --- arch/powerpc/kvm/booke.c | 74 +++- arch/powerpc/kvm/e500mc.c |8 + 2 files changed, 80 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index c08b04b..01eb635 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -134,6 +134,23 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu) } /* + * Simulate AltiVec unavailable fault to load guest state + * from thread to AltiVec unit. + * It requires to be called with preemption disabled. + */ +static inline void kvmppc_load_guest_altivec(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_ALTIVEC + if (cpu_has_feature(CPU_FTR_ALTIVEC)) { + if (!(current->thread.regs->msr & MSR_VEC)) { + load_up_altivec(NULL); + current->thread.regs->msr |= MSR_VEC; + } + } +#endif Why not use kvmppc_supports_altivec()? In fact, there's nothing KVM-specific about these functions... +/* + * Always returns true is AltiVec unit is present, see + * kvmppc_core_check_processor_compat(). + */ +static inline bool kvmppc_supports_altivec(void) +{ +#ifdef CONFIG_ALTIVEC + if (cpu_has_feature(CPU_FTR_ALTIVEC)) + return true; +#endif + return false; +} Whitespace static inline bool kvmppc_supports_spe(void) { #ifdef CONFIG_SPE @@ -947,7 +1016,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, */ bool handled = false; - if (kvmppc_supports_spe()) { + if (kvmppc_supports_altivec() || kvmppc_supports_spe()) { #ifdef CONFIG_SPE if (cpu_has_feature(CPU_FTR_SPE)) if (vcpu->arch.shared->msr & MSR_SPE) { @@ -976,7 +1045,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, * The interrupt is shared, KVM support for the featured unit * is detected at run-time. 
*/ - if (kvmppc_supports_spe()) { + if (kvmppc_supports_altivec() || kvmppc_supports_spe()) { kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SPE_FP_DATA_ALTIVEC_ASSIST); r = RESUME_GUEST; The distinction between how you're handling SPE and Altivec here doesn't really have anything to do with SPE versus Altivec -- it's PR-mode versus HV-mode. @@ -1188,6 +1257,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, r = (s << 2) | RESUME_HOST | (r & RESUME_FLAG_NV); } else { kvmppc_lazy_ee_enable(); + kvmppc_load_guest_altivec(vcpu); } } Why do you need to call an Altivec function here if we don't need to call an ordinary FPU function here? -Scott -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 3/6] KVM: PPC: Book3E: Rename IRQPRIO names to accommodate ALTIVEC
On 06/03/2013 03:54:25 PM, Mihai Caraman wrote: Rename BOOKE_IRQPRIO_SPE_UNAVAIL and BOOKE_IRQPRIO_SPE_FP_DATA names to accommodate ALTIVEC. Replace BOOKE_INTERRUPT_SPE_UNAVAIL and BOOKE_INTERRUPT_SPE_FP_DATA with the common version. Signed-off-by: Mihai Caraman --- arch/powerpc/kvm/booke.c | 12 ++-- arch/powerpc/kvm/booke.h |4 ++-- arch/powerpc/kvm/bookehv_interrupts.S |8 arch/powerpc/kvm/e500.c | 10 ++ arch/powerpc/kvm/e500_emulate.c |8 5 files changed, 22 insertions(+), 20 deletions(-) Can you remove the TODO separate definitions from 1/6 now? And/or combine 1/6 with this patch? -Scott
Re: [RFC PATCH 2/6] KVM: PPC: Book3E: Refactor SPE_FP exit handling
On 06/03/2013 03:54:24 PM, Mihai Caraman wrote: SPE_FP interrupts are shared with ALTIVEC. Refactor SPE_FP exit handling to detect KVM support for the featured unit at run-time, in order to accommodate ALTIVEC later. Signed-off-by: Mihai Caraman --- arch/powerpc/kvm/booke.c | 80 ++ 1 files changed, 59 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 1020119..d082bbc 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -822,6 +822,15 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu, } } +static inline bool kvmppc_supports_spe(void) +{ +#ifdef CONFIG_SPE + if (cpu_has_feature(CPU_FTR_SPE)) + return true; +#endif + return false; +} Whitespace /** * kvmppc_handle_exit * @@ -931,42 +940,71 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; -#ifdef CONFIG_SPE case BOOKE_INTERRUPT_SPE_UNAVAIL: { - if (vcpu->arch.shared->msr & MSR_SPE) - kvmppc_vcpu_enable_spe(vcpu); - else - kvmppc_booke_queue_irqprio(vcpu, - BOOKE_IRQPRIO_SPE_UNAVAIL); + /* + * The interrupt is shared, KVM support for the featured unit +* is detected at run-time. +*/ This is a decent comment for the changelog, but for the code itself it seems fairly obvious if you look at the definition of kvmppc_supports_spe(). + bool handled = false; + + if (kvmppc_supports_spe()) { +#ifdef CONFIG_SPE + if (cpu_has_feature(CPU_FTR_SPE)) Didn't you already check this using kvmppc_supports_spe()? case BOOKE_INTERRUPT_SPE_FP_ROUND: +#ifdef CONFIG_SPE kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SPE_FP_ROUND); r = RESUME_GUEST; break; Why not use kvmppc_supports_spe() here, for consistency? -Scott -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/6] KVM: PPC: Book3E: AltiVec support
On 06/03/2013 03:54:22 PM, Mihai Caraman wrote: Mihai Caraman (6): KVM: PPC: Book3E: Fix AltiVec interrupt numbers and build breakage KVM: PPC: Book3E: Refactor SPE_FP exit handling KVM: PPC: Book3E: Rename IRQPRIO names to accommodate ALTIVEC KVM: PPC: Book3E: Add AltiVec support KVM: PPC: Book3E: Add ONE_REG AltiVec support KVM: PPC: Book3E: Enhance FPU laziness arch/powerpc/include/asm/kvm_asm.h| 16 ++- arch/powerpc/kvm/booke.c | 189 arch/powerpc/kvm/booke.h |4 +- arch/powerpc/kvm/bookehv_interrupts.S |8 +- arch/powerpc/kvm/e500.c | 10 +- arch/powerpc/kvm/e500_emulate.c |8 +- arch/powerpc/kvm/e500mc.c | 10 ++- 7 files changed, 199 insertions(+), 46 deletions(-) This looks like a bit much for 3.10 (certainly, subject lines like "refactor" and "enhance" and "add support" aren't going to make Linus happy given that we're past rc4) so I think we should apply http://patchwork.ozlabs.org/patch/242896/ for 3.10. Then for 3.11, revert it after applying this patchset. -Scott
vhost && kernel BUG at /build/linux/mm/slub.c:3352!
Hello, Hit this right after killing trinity with Ctrl-C. Was fuzzing v3.10-rc4-0-gd683b96 in a qemu virtual machine as the root user. Tommi [29175] Random reseed: 3970521611 [29175] Random reseed: 202886419 [29175] Random reseed: 2930978521 [179904.099501] binder: 29175:2539 ioctl 4010630e fff returned -22 [29175] Random reseed: 2776471322 [29175] Random reseed: 3086119361 child 2606 exiting [29175] Bailing main loop. Exit reason: ctrl-c [179906.393060] [ cut here ] [179906.396341] kernel BUG at /build/linux/mm/slub.c:3352! [179906.399693] invalid opcode: [#1] SMP DEBUG_PAGEALLOC [179906.403272] CPU: 0 PID: 29175 Comm: trinity-main Not tainted 3.10.0-rc4 #1 [179906.407692] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [179906.411475] task: 8800b69e47c0 ti: 880092f2e000 task.ti: 880092f2e000 [179906.416305] RIP: 0010:[] [] kfree+0x155/0x2c0 [179906.421462] RSP: :880092f2fdb0 EFLAGS: 00010246 [179906.424983] RAX: 0100 RBX: 88009e588000 RCX: [179906.429746] RDX: 8800b69e47c0 RSI: 000a0004 RDI: 88009e588000 [179906.434499] RBP: 880092f2fdd8 R08: 0001 R09: [179906.439226] R10: R11: 0001 R12: [179906.443835] R13: ea0002796200 R14: 8800b9a960f8 R15: 8800ba06f6a0 [179906.448470] FS: 7f04cd25c700() GS:8800bf60() knlGS: [179906.453857] CS: 0010 DS: ES: CR0: 80050033 [179906.456956] CR2: 7f98e29d8f50 CR3: 9294a000 CR4: 06f0 [179906.460558] DR0: DR1: DR2: [179906.464059] DR3: DR6: 0ff0 DR7: 0400 [179906.467617] Stack: [179906.468704] 88001a7c 8800b9a960f8 [179906.472638] 8800ba06f6a0 880092f2fdf0 81c1c6df 88001a7c [179906.476583] 880092f2fe18 81c1c771 8800b69718c0 0008 [179906.480377] Call Trace: [179906.481636] [] vhost_net_vq_reset+0x7f/0xb0 [179906.484611] [] vhost_net_release+0x61/0xb0 [179906.487481] [] __fput+0x12a/0x230 [179906.489968] [] fput+0x9/0x10 [179906.492422] [] task_work_run+0xae/0xf0 [179906.495169] [] do_exit+0x44c/0xb40 [179906.497789] [] ? 
retint_swapgs+0x13/0x1b [179906.500652] [] do_group_exit+0x84/0xd0 [179906.503348] [] SyS_exit_group+0x12/0x20 [179906.506146] [] system_call_fastpath+0x16/0x1b [179906.509147] Code: 49 c1 ed 0c 49 c1 e5 06 49 01 c5 49 8b 45 00 f6 c4 80 74 0a 4d 8b 6d 30 66 0f 1f 44 00 00 49 8b 45 00 a8 80 75 28 f6 c4 c0 75 02 <0f> 0b 49 8b 45 00 31 f6 f6 c4 40 74 04 41 8b 75 68 4c 89 ef e8 [179906.522213] RIP [] kfree+0x155/0x2c0 [179906.524937] RSP [179906.575627] ---[ end trace 3d4ce10faaa29990 ]--- [179906.577103] Fixing recursive fault but reboot is needed! [29174] Watchdog exiting -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Fwd: VirtIO and BSOD On Windows Server 2003
On Tue, Jun 4, 2013 at 7:29 AM, Vadim Rozenfeld wrote: > > If IDE works fine, try adding another disk as virtio and see if the secondary > disk works smoothly as well. That's what I did, and the instant the virtio block driver initialized it BSODed on me with 0x007f (reason code 0x805000f). According to Microsoft "this problem occurs because the NTFS driver incorrectly locks the resource when the NTFS driver tries to access the resource." I've attempted to start the guest with the virtio driver cache disabled, but with no success. -- Aaron Clausen mightymartia...@gmail.com
Fwd: VirtIO and BSOD On Windows Server 2003
On Tue, Jun 4, 2013 at 7:55 AM, Aaron Clausen wrote: > On Tue, Jun 4, 2013 at 7:29 AM, Vadim Rozenfeld wrote: >> >> If IDE works fine, try adding another disk as virtio and see if the secondary >> disk works smoothly as well. > > That's what I did, and the instant the virtio block driver initialized > it BSODed on me with 0x007f (reason code 0x805000f). According to > Microsoft "this problem occurs because the NTFS driver incorrectly > locks the resource when the NTFS driver tries to access the resource." > > I've attempted to start the guest with the virtio driver cache > disabled, but with no success. As a further bit of disclosure, I'm using raw images right now. I'm going to convert one of my Server 2003 guests to qcow2 and see if that makes a difference. -- Aaron Clausen mightymartia...@gmail.com
Re: Planning the merge of KVM/arm64
On Tue, Jun 04, 2013 at 05:51:41PM +0200, Paolo Bonzini wrote: > Il 04/06/2013 17:43, Christoffer Dall ha scritto: > > Hi Paolo, > > > > I don't think this is an issue. Gleb and Marcelo for example pulled > > RMK's stable tree for my KVM/ARM updates for the 3.10 merge window and > > that wasn't an issue. If Linus pulls the kvm/next tree first the > > diffstat should be similar and everything clean enough, no? > > > > Catalin has previously expressed his wish to upstream the kvm/arm64 > > patches directly through him given the churn in a completely new > > architecture and he wants to make sure that everything looks right. > > > > It's a pretty clean implementation with quite few dependencies and > > merging as a working series should be a priority instead of the > > Kconfig hack, imho. > > Ok, let's see what Gleb says. > I have no objection to merging arm64 KVM through Catalin if it means less churn for everyone. That's what we did with arm and mips. Arm64 KVM has a dependency on kvm.git next though, so how will Catalin make sure that everything looks right? Will he merge kvm.git/next into the arm64 tree? -- Gleb.
Re: [PATCH 1/2] kvm: make vendor_intel a generic function
Paolo Bonzini writes: > Il 30/05/2013 08:07, Gleb Natapov ha scritto: >>> > Unfortunately, this is not acceptable. The emulator is not supposed to >>> > know about the vcpu. Everything has to go through the context; in >>> > principle, the emulator should be usable outside of KVM. >>> > >>> > I would just duplicate the code in kvm_guest_vcpu_model (perhaps you can >>> > rename it to kvm_cpuid_get_intel_model or something like that; having >>> > both "guest" and "vcpu" in the name is a pleonasm :)). >> >> I was thinking of an inline function that takes &eax, &ebx, &ecx, &edx as >> parameters and returns a vendor. > > That could work too. Bandan, whatever looks nicer to you. :) OK, I posted a v2. I changed vendor_intel() to just do the checking, and the call to cpuid now happens outside the function through whatever method is appropriate. As a follow-up cleanup, I think there should be a corresponding vendor_amd() too for the AMD-related checks that happen in emulate.c. I will send a separate change for it. Bandan > Paolo
[PATCH v2 0/2] kvm: x86: Emulate MSR_PLATFORM_INFO
These patches add an emulated MSR_PLATFORM_INFO that kvm guests can read, as described in section 14.3.2.4 of the Intel SDM. The relevant changes and details are in [2/2]; [1/2] makes vendor_intel generic. There are at least two known applications that fail to run because this MSR is missing - Sandra and vTune. v2: Addressed suggested changes Bandan Das (2): kvm: make vendor_intel a generic function kvm: x86: emulate MSR_PLATFORM_INFO arch/x86/include/asm/kvm_emulate.h| 13 -- arch/x86/include/asm/kvm_host.h | 20 +++ arch/x86/include/uapi/asm/msr-index.h | 2 ++ arch/x86/kvm/cpuid.c | 19 ++ arch/x86/kvm/cpuid.h | 16 arch/x86/kvm/emulate.c| 16 +++- arch/x86/kvm/x86.c| 48 +++ 7 files changed, 109 insertions(+), 25 deletions(-) -- 1.8.1.4
[PATCH v2 1/2] kvm: make vendor_intel a generic function
Make vendor_intel generic so that functions in x86.c can use it. v2: Change vendor_intel function signature because the emulator shouldn't be dealing with struct vcpu Signed-off-by: Bandan Das --- arch/x86/include/asm/kvm_emulate.h | 13 - arch/x86/include/asm/kvm_host.h| 20 arch/x86/kvm/emulate.c | 16 3 files changed, 24 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index 15f960c..611a55f 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -319,19 +319,6 @@ struct x86_emulate_ctxt { #define REPE_PREFIX0xf3 #define REPNE_PREFIX 0xf2 -/* CPUID vendors */ -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 -#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 - -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ebx 0x69444d41 -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ecx 0x21726574 -#define X86EMUL_CPUID_VENDOR_AMDisbetterI_edx 0x74656273 - -#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 -#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e -#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 - enum x86_intercept_stage { X86_ICTP_NONE = 0, /* Allow zero-init to not match anything */ X86_ICPT_PRE_EXCEPT, diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3741c65..ce9a44f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -144,6 +144,19 @@ enum { #include +/* CPUID vendors */ +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 + +#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ebx 0x69444d41 +#define X86EMUL_CPUID_VENDOR_AMDisbetterI_ecx 0x21726574 +#define X86EMUL_CPUID_VENDOR_AMDisbetterI_edx 0x74656273 + +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 +#define 
X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 + #define KVM_NR_MEM_OBJS 40 #define KVM_NR_DB_REGS 4 @@ -942,6 +955,13 @@ static inline unsigned long read_msr(unsigned long msr) } #endif +static inline bool vendor_intel(u32 ebx, u32 ecx, u32 edx) +{ + return ebx == X86EMUL_CPUID_VENDOR_GenuineIntel_ebx + && ecx == X86EMUL_CPUID_VENDOR_GenuineIntel_ecx + && edx == X86EMUL_CPUID_VENDOR_GenuineIntel_edx; +} + static inline u32 get_rdx_init_val(void) { return 0x600; /* P6 family */ diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 8db0010..87f12fc 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2280,17 +2280,6 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt, ss->avl = 0; } -static bool vendor_intel(struct x86_emulate_ctxt *ctxt) -{ - u32 eax, ebx, ecx, edx; - - eax = ecx = 0; - ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); - return ebx == X86EMUL_CPUID_VENDOR_GenuineIntel_ebx - && ecx == X86EMUL_CPUID_VENDOR_GenuineIntel_ecx - && edx == X86EMUL_CPUID_VENDOR_GenuineIntel_edx; -} - static bool em_syscall_is_enabled(struct x86_emulate_ctxt *ctxt) { const struct x86_emulate_ops *ops = ctxt->ops; @@ -2400,6 +2389,7 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt) u64 msr_data; u16 cs_sel, ss_sel; u64 efer = 0; + u32 eax, ebx, ecx, edx; ops->get_msr(ctxt, MSR_EFER, &efer); /* inject #GP if in real mode */ @@ -2410,8 +2400,10 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt) * Not recognized on AMD in compat mode (but is recognized in legacy * mode). */ + eax = ecx = 0; + ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); if ((ctxt->mode == X86EMUL_MODE_PROT32) && (efer & EFER_LMA) - && !vendor_intel(ctxt)) + && !vendor_intel(ebx, ecx, edx)) return emulate_ud(ctxt); /* XXX sysenter/sysexit have not been tested in 64bit mode. 
-- 1.8.1.4
[PATCH v2 2/2] kvm: x86: emulate MSR_PLATFORM_INFO
To emulate MSR_PLATFORM_INFO, we divide guest virtual_tsc_khz by the base clock - which is guest CPU model dependent. The relevant bits in this emulated MSR are: Max Non-Turbo Ratio (15:8) - virtual_tsc_khz/bclk Max Effi Ratio (47:40) - virtual_tsc_khz/bclk Prog Ratio Limit for Turbo (28) - 0 Prog TDC-TDP Limit for Turbo (29) - 0 v2: Change function name to kvm_cpuid_get_intel_model and call the new vendor_intel() Signed-off-by: Bandan Das --- arch/x86/include/uapi/asm/msr-index.h | 2 ++ arch/x86/kvm/cpuid.c | 19 ++ arch/x86/kvm/cpuid.h | 16 arch/x86/kvm/x86.c| 48 +++ 4 files changed, 85 insertions(+) diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h index 2af848d..b0268d5 100644 --- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -331,6 +331,8 @@ #define MSR_IA32_MISC_ENABLE 0x01a0 +#define MSR_IA32_PLATFORM_INFO 0x00ce + #define MSR_IA32_TEMPERATURE_TARGET0x01a2 #define MSR_IA32_ENERGY_PERF_BIAS 0x01b0 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index a20ecb5..ae970d3 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -608,6 +608,25 @@ struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu, } EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry); +u8 kvm_cpuid_get_intel_model(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + u8 cpuid_model, cpuid_extmodel; + + best = kvm_find_cpuid_entry(vcpu, 0, 0); + if (!vendor_intel(best->ebx, best->ecx, best->edx)) + return CPUID_MODEL_UNKNOWN; + + best = kvm_find_cpuid_entry(vcpu, 1, 0); + + cpuid_model = (best->eax >> 4) & 0xf; + cpuid_extmodel = (best->eax >> 16) & 0xf; + + cpuid_model += (cpuid_extmodel << 4); + + return cpuid_model; +} + int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index b7fd079..081461d 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -3,6 +3,21 @@ #include "x86.h" +#define
MODEL_NEHALEM_CLARKSFIELD0x1e /* Core i7 and Ex BCLK: 133Mhz */ +#define MODEL_NEHALEM_BLOOMFIELD 0x1a /* Core i7 Xeon 3000 BCLK: 133 Mhz */ +#define MODEL_NEHALEM_EX 0x2e /* Core i7 and i5 Nehalem BCLK: 133 Mhz */ +#define MODEL_WESTMERE_ARRANDALE 0x25 /* Celeron/Pentium/Core i3/i5/i7 BCLK: 133 Mhz */ +#define MODEL_WESTMERE_EX0x2f /* Xeon E7 BCLK: 133 Mhz */ +#define MODEL_WESTMERE_GULFTOWN 0x2c /* Core i7 Xeon 3000 BCLK: 133 Mhz */ +#define MODEL_SANDYBRIDGE_SANDY 0x2a /* Core/Celeron/Pentium/Xeon BCLK: 133 Mhz */ +#define MODEL_SANDYBRIDGE_E 0x2d /* Core i7 and Ex BCLK: 100 Mhz */ +#define MODEL_IVYBRIDGE_IVY 0x3a /* Core i3/i5/i7 (Ex) and Xeon E3 BCLK: 100 Mhz */ +#define MODEL_HASWELL_HASWELL0x3c /* BCLK: 100 Mhz */ +#define CPUID_MODEL_UNKNOWN 0x0 /* Everything else */ + +#define BCLK_133_DEFAULT(133 * 1000) +#define BCLK_100_DEFAULT(100 * 1000) + void kvm_update_cpuid(struct kvm_vcpu *vcpu); struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu, u32 function, u32 index); @@ -18,6 +33,7 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid, struct kvm_cpuid_entry2 __user *entries); void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx); +u8 kvm_cpuid_get_intel_model(struct kvm_vcpu *vcpu); static inline bool guest_cpuid_has_xsave(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 094b5d9..f9b2830 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -239,6 +239,49 @@ void kvm_set_shared_msr(unsigned slot, u64 value, u64 mask) } EXPORT_SYMBOL_GPL(kvm_set_shared_msr); +static u64 kvm_get_platform_info(struct kvm_vcpu *vcpu) +{ + u8 cpumodel; + u32 bclk; + + /* +* Programmable Ratio Limit for Turbo Mode (bit 28): 0 +* Programmable TDC-TDP Limit for Turbo Mode (bit 29): 0 +*/ + u64 platform_info = 0, max_nonturbo_ratio = 0, max_effi_ratio = 0; + + cpumodel = kvm_cpuid_get_intel_model(vcpu); + + switch (cpumodel) { + case MODEL_NEHALEM_CLARKSFIELD: + case 
MODEL_NEHALEM_BLOOMFIELD: + case MODEL_NEHALEM_EX: + case MODEL_WESTMERE_ARRANDALE: + case MODEL_WESTMERE_GULFTOWN: + case MODEL_WESTMERE_EX: + bclk = BCLK_133_DEFAULT; + break; + case MODEL_SANDYBRIDGE_SANDY: + case MODEL_SANDYBRIDGE_E: + case MODEL_IVYBRIDGE_IVY: + case MODEL_HASWELL_HASWELL: + bclk = BCLK_1
Re: Planning the merge of KVM/arm64
On Tue, Jun 04, 2013 at 04:40:23PM +0100, Will Deacon wrote: > On Tue, Jun 04, 2013 at 04:30:52PM +0100, Paolo Bonzini wrote: > > Il 04/06/2013 16:59, Marc Zyngier ha scritto: > > >>> >> - Either I can rely on a stable branch from both KVM and KVM/ARM > > >>> >> trees > > >>> >> on which I can base my tree for Catalin/Will to pull, > > >>> >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, > > >>> >> and > > >>> >> only merge this last bit when the dependencies are satisfied in > > >>> >> Linus' tree. > > >>> >> > > >>> >> What do you guys think? > > >>> >> > > >> > I would think you would prefer option (1) to get the code in cleaner. > > >> > Both the KVM/next tree is stable and I can provide you with a stable > > >> > KVM/ARM tree. But I really don't feel strongly about this. > > > That'd be my preferred choice too. Let's see what the KVM maintainers' > > > position on that. > > > > I wonder if Linus would complain about irrelevant KVM changes in > > Will/Catalin's pull request. The KVM/next tree has other patches below > > the ones you need. > > > > What we usually do for x86 is get an Acked-by from the other part. If > > there are no dependencies on other aarch64 core changes, it'd be better > > to go through the KVM tree. Otherwise separating the Kconfig change > > should be okay (perhaps add it with depends on BROKEN, and remove the > > dependency later?). > > Well you can certainly have my ack for the series but, as you say, it > depends whether there are further dependencies on patches queued for aarch64 > core. For 3.11, conflicts with Steve's (CC'd) hugetlb stuff are likely. > > Acked-by: Will Deacon > > Will > I'd be happy to rebase/test the aarch64 huge page code against a branch if that's helpful? Cheers, -- Steve -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Planning the merge of KVM/arm64
Il 04/06/2013 17:43, Christoffer Dall ha scritto: > Hi Paolo, > > I don't think this is an issue. Gleb and Marcelo for example pulled > RMK's stable tree for my KVM/ARM updates for the 3.10 merge window and > that wasn't an issue. If Linus pulls the kvm/next tree first the > diffstat should be similar and everything clean enough, no? > > Catalin has previously expressed his wish to upstream the kvm/arm64 > patches directly through him given the churn in a completely new > architecture and he wants to make sure that everything looks right. > > It's a pretty clean implementation with quite few dependencies and > merging as a working series should be a priority instead of the > Kconfig hack, imho. Ok, let's see what Gleb says. Paolo
Re: Planning the merge of KVM/arm64
On 4 June 2013 08:30, Paolo Bonzini wrote: > Il 04/06/2013 16:59, Marc Zyngier ha scritto: >> - Either I can rely on a stable branch from both KVM and KVM/ARM trees >> on which I can base my tree for Catalin/Will to pull, >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and >> only merge this last bit when the dependencies are satisfied in Linus' >> tree. >> >> What do you guys think? >> >>> > I would think you would prefer option (1) to get the code in cleaner. >>> > Both the KVM/next tree is stable and I can provide you with a stable >>> > KVM/ARM tree. But I really don't feel strongly about this. >> That'd be my preferred choice too. Let's see what the KVM maintainers' >> position on that. > > I wonder if Linus would complain about irrelevant KVM changes in > Will/Catalin's pull request. The KVM/next tree has other patches below > the ones you need. > > What we usually do for x86 is get an Acked-by from the other part. If > there are no dependencies on other aarch64 core changes, it'd be better > to go through the KVM tree. Otherwise separating the Kconfig change > should be okay (perhaps add it with depends on BROKEN, and remove the > dependency later?). > Hi Paolo, I don't think this is an issue. Gleb and Marcelo for example pulled RMK's stable tree for my KVM/ARM updates for the 3.10 merge window and that wasn't an issue. If Linus pulls the kvm/next tree first the diffstat should be similar and everything clean enough, no? Catalin has previously expressed his wish to upstream the kvm/arm64 patches directly through him given the churn in a completely new architecture and he wants to make sure that everything looks right. It's a pretty clean implementation with quite few dependencies and merging as a working series should be a priority instead of the Kconfig hack, imho. 
-Christoffer
Re: Planning the merge of KVM/arm64
On 04/06/13 16:30, Paolo Bonzini wrote: Hi Paolo, > Il 04/06/2013 16:59, Marc Zyngier ha scritto: >> - Either I can rely on a stable branch from both KVM and >> KVM/ARM trees on which I can base my tree for Catalin/Will >> to pull, - Or I ask Catalin to only pull the arm64 part >> *minus the Kconfig*, and only merge this last bit when the >> dependencies are satisfied in Linus' tree. >> >> What do you guys think? >> I would think you would prefer option (1) to get the code in cleaner. Both the KVM/next tree is stable and I can provide you with a stable KVM/ARM tree. But I really don't feel strongly about this. >> That'd be my preferred choice too. Let's see what the KVM >> maintainers' position on that. > > I wonder if Linus would complain about irrelevant KVM changes in > Will/Catalin's pull request. The KVM/next tree has other patches > below the ones you need. That's how the ARM tree is dealt with most of the time. We create stable branches (that we know for sure are going in at the next merge window) that are used as a base for others to base their own developments. KVM/ARM has been merged like this, using something crazy like half a dozen stable branches from different contributors... So far, Linus hasn't complained. KVM/arm64 is not that bad in that respect, but I'm inclined to follow the same process. > What we usually do for x86 is get an Acked-by from the other part. > If there are no dependencies on other aarch64 core changes, it'd be > better to go through the KVM tree. There is a number of potential additions to the arm64 tree that may conflict with KVM/arm64 (THP comes to my mind...). > Otherwise separating the Kconfig change should be okay (perhaps add > it with depends on BROKEN, and remove the dependency later?). Could do, yes. Thanks, M. -- Jazz is not dead. It just smells funny... 
Re: Planning the merge of KVM/arm64
On Tue, Jun 04, 2013 at 04:30:52PM +0100, Paolo Bonzini wrote: > Il 04/06/2013 16:59, Marc Zyngier ha scritto: > >>> >> - Either I can rely on a stable branch from both KVM and KVM/ARM trees > >>> >> on which I can base my tree for Catalin/Will to pull, > >>> >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and > >>> >> only merge this last bit when the dependencies are satisfied in Linus' > >>> >> tree. > >>> >> > >>> >> What do you guys think? > >>> >> > >> > I would think you would prefer option (1) to get the code in cleaner. > >> > Both the KVM/next tree is stable and I can provide you with a stable > >> > KVM/ARM tree. But I really don't feel strongly about this. > > That'd be my preferred choice too. Let's see what the KVM maintainers' > > position on that. > > I wonder if Linus would complain about irrelevant KVM changes in > Will/Catalin's pull request. The KVM/next tree has other patches below > the ones you need. > > What we usually do for x86 is get an Acked-by from the other part. If > there are no dependencies on other aarch64 core changes, it'd be better > to go through the KVM tree. Otherwise separating the Kconfig change > should be okay (perhaps add it with depends on BROKEN, and remove the > dependency later?). Well you can certainly have my ack for the series but, as you say, it depends whether there are further dependencies on patches queued for aarch64 core. For 3.11, conflicts with Steve's (CC'd) hugetlb stuff are likely. Acked-by: Will Deacon Will
Re: Planning the merge of KVM/arm64
Il 04/06/2013 16:59, Marc Zyngier ha scritto: >>> >> - Either I can rely on a stable branch from both KVM and KVM/ARM trees >>> >> on which I can base my tree for Catalin/Will to pull, >>> >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and >>> >> only merge this last bit when the dependencies are satisfied in Linus' >>> >> tree. >>> >> >>> >> What do you guys think? >>> >> >> > I would think you would prefer option (1) to get the code in cleaner. >> > Both the KVM/next tree is stable and I can provide you with a stable >> > KVM/ARM tree. But I really don't feel strongly about this. > That'd be my preferred choice too. Let's see what the KVM maintainers' > position on that. I wonder if Linus would complain about irrelevant KVM changes in Will/Catalin's pull request. The KVM/next tree has other patches below the ones you need. What we usually do for x86 is get an Acked-by from the other part. If there are no dependencies on other aarch64 core changes, it'd be better to go through the KVM tree. Otherwise separating the Kconfig change should be okay (perhaps add it with depends on BROKEN, and remove the dependency later?). Paolo
Re: Planning the merge of KVM/arm64
On 04/06/13 15:50, Christoffer Dall wrote: > On 4 June 2013 05:29, Marc Zyngier wrote: >> Guys, >> >> The KVM/arm64 code is now, as it seems, in good enough shape to be >> merged. I've so far addressed all the comments, and it doesn't seem any >> worse than what is queued for its 32bit counterpart. >> > > huh? That was supposed to be a joke. Obviously, my sense of humour has failed to impress you here. I'll improve on that in another version of the same email... ;-) >> For reference, it is sitting there: >> git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git >> kvm-arm64/kvm >> >> What is not defined yet is the merge path: >> - It is touching some of the arm64 core code, so it would be better if >> it was merged through the arm64 tree >> - It is depending on some of the patches in the core KVM queue (the >> vgic/timer move to virt/kvm/arm/) >> - It is also depending on some of the patches that are in the KVM/ARM >> queue (parametrized timer interrupt, some MMU/MMIO fixes) >> >> So I can see two possibilities: >> - Either I can rely on a stable branch from both KVM and KVM/ARM trees >> on which I can base my tree for Catalin/Will to pull, >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and >> only merge this last bit when the dependencies are satisfied in Linus' tree. >> >> What do you guys think? >> > I would think you would prefer option (1) to get the code in cleaner. > Both the KVM/next tree is stable and I can provide you with a stable > KVM/ARM tree. But I really don't feel strongly about this. That'd be my preferred choice too. Let's see what the KVM maintainers' position on that. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH RFC V9 12/19] xen: Enable PV ticketlocks on HVM Xen
On 06/04/2013 08:14 PM, Konrad Rzeszutek Wilk wrote: On Tue, Jun 04, 2013 at 12:46:53PM +0530, Raghavendra K T wrote: On 06/03/2013 09:27 PM, Konrad Rzeszutek Wilk wrote: On Sun, Jun 02, 2013 at 12:55:03AM +0530, Raghavendra K T wrote: xen: Enable PV ticketlocks on HVM Xen There is more to it. You should also revert 70dd4998cb85f0ecd6ac892cc7232abefa432efb Yes, true. Do you expect the revert to be folded into this patch itself? I can do them. I would drop this patch and just mention in the cover letter that Konrad would have to revert two git commits to re-enable it on PVHVM. Thanks. will do that.
Re: Planning the merge of KVM/arm64
On 4 June 2013 05:29, Marc Zyngier wrote: > Guys, > > The KVM/arm64 code is now, as it seems, in good enough shape to be > merged. I've so far addressed all the comments, and it doesn't seem any > worse than what is queued for its 32bit counterpart. > huh? > For reference, it is sitting there: > git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git > kvm-arm64/kvm > > What is not defined yet is the merge path: > - It is touching some of the arm64 core code, so it would be better if > it was merged through the arm64 tree > - It is depending on some of the patches in the core KVM queue (the > vgic/timer move to virt/kvm/arm/) > - It is also depending on some of the patches that are in the KVM/ARM > queue (parametrized timer interrupt, some MMU/MMIO fixes) > > So I can see two possibilities: > - Either I can rely on a stable branch from both KVM and KVM/ARM trees > on which I can base my tree for Catalin/Will to pull, > - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and > only merge this last bit when the dependencies are satisfied in Linus' tree. > > What do you guys think? > I would think you would prefer option (1) to get the code in cleaner. Both the KVM/next tree is stable and I can provide you with a stable KVM/ARM tree. But I really don't feel strongly about this. -Christoffer
Re: [PATCH RFC V9 12/19] xen: Enable PV ticketlocks on HVM Xen
On Tue, Jun 04, 2013 at 12:46:53PM +0530, Raghavendra K T wrote: > On 06/03/2013 09:27 PM, Konrad Rzeszutek Wilk wrote: > >On Sun, Jun 02, 2013 at 12:55:03AM +0530, Raghavendra K T wrote: > >>xen: Enable PV ticketlocks on HVM Xen > > > >There is more to it. You should also revert > >70dd4998cb85f0ecd6ac892cc7232abefa432efb > > > > Yes, true. Do you expect the revert to be folded into this patch itself? > I can do them. I would drop this patch and just mention in the cover letter that Konrad would have to revert two git commits to re-enable it on PVHVM.
reminder: no kvm developer call today
Reminder: we switched to a bi-weekly schedule. There's no kvm developer call today. -- MST
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
Il 04/06/2013 14:53, Gleb Natapov ha scritto: >> > > Yeah. What about: > if ((dr6 = guest_debug())) >return handle_guest_debug(); > else if ((dr6 = check_bp())) >return handle_bp(dr6); I'll try either this... > >> If you do not want EMULATE_PROCEED, I can just use -1 instead in > >> kvm_vcpu_check_breakpoint, and return if r < 0. > >> >>> > > But you need to know what to return EMULATE_DONE or EMULATE_USER_EXIT. >> > >> > Sorry, _not_ return if r < 0. >> > > Function that returns enum or -1? This is worse IMO. Return > EMULATE_DONE/EMULATE_USER_EXIT via a pointer will be better. ... or this, and see what looks nicer. But I like if (check_bp(&r)) return r; Thanks for the review. Paolo
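[Editor's note] The control flow proposed in the thread above — a checker that returns a DR6-style hit mask, with zero meaning "no breakpoint hit", so the same expression both tests for a hit and feeds the mask to the handler — can be sketched in isolation. The function names, types, and constants below are illustrative only (they are not the actual KVM functions), and the sketch omits the R/W-type and length matching that the real patch performs:

```c
#include <assert.h>
#include <stdint.h>

/* Value of the always-set bits in DR6, as in the x86 SDM. */
#define DR6_FIXED_1 0xffff0ff0u

/*
 * Return a DR6-style bit mask of the enabled breakpoints whose
 * address matches 'eip', or 0 if none hit. Bits 2*i..2*i+1 of
 * dr7 are the local/global enable bits for debug register i.
 */
static uint32_t check_bp(uint64_t eip, uint32_t dr7, const uint64_t db[4])
{
    uint32_t dr6 = 0, enable = dr7;
    int i;

    for (i = 0; i < 4; i++, enable >>= 2)
        if ((enable & 3) && db[i] == eip)
            dr6 |= 1u << i;
    return dr6;
}

/* Consume the mask computed by check_bp(). */
static uint32_t handle_bp(uint32_t dr6)
{
    /* What would be reported to the debugger/guest. */
    return dr6 | DR6_FIXED_1;
}

/* The caller's shape suggested in the discussion. */
static uint32_t emulate_step(uint64_t eip, uint32_t dr7, const uint64_t db[4])
{
    uint32_t dr6;

    if ((dr6 = check_bp(eip, dr7, db)))
        return handle_bp(dr6);
    return 0; /* no hit: proceed with emulation */
}
```

The point of the shape is that the checker's return value doubles as both the hit test and the data the handler needs, avoiding a separate out-parameter or an extra EMULATE_PROCEED result code.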
Re: Planning the merge of KVM/arm64
On Tue, Jun 04, 2013 at 02:13:52PM +0100, Anup Patel wrote: > Hi Marc, > > On Tue, Jun 4, 2013 at 5:59 PM, Marc Zyngier wrote: > > Guys, > > > > The KVM/arm64 code is now, as it seems, in good enough shape to be > > merged. I've so far addressed all the comments, and it doesn't seem any > > worse than what is queued for its 32bit counterpart. > > > > For reference, it is sitting there: > > git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git > > kvm-arm64/kvm > > > > What is not defined yet is the merge path: > > - It is touching some of the arm64 core code, so it would be better if > > it was merged through the arm64 tree > > - It is depending on some of the patches in the core KVM queue (the > > vgic/timer move to virt/kvm/arm/) > > - It is also depending on some of the patches that are in the KVM/ARM > > queue (parametrized timer interrupt, some MMU/MMIO fixes) > > > > So I can see two possibilities: > > - Either I can rely on a stable branch from both KVM and KVM/ARM trees > > on which I can base my tree for Catalin/Will to pull, > > - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and > > only merge this last bit when the dependencies are satisfied in Linus' tree. > > > > What do you guys think? > > I had quick look at your kvm-arm64/kvm branch. I agree with the approach > of going through arm64 tree. > > FYI, latest tested branch on APM ARMv8 board is kvm-arm64/kvm-3.10-rc3 > branch. > > From my side, +1 for the second option that is "pull the arm64 part *minus > the Kconfig*, and ..." +1 as well for the second option. -- Catalin
Re: [PATCH] MAINTAINERS: s/Marcelo/Paolo/
Il 04/06/2013 15:16, Michael S. Tsirkin ha scritto: > Marcelo doesn't maintain kvm anymore, > Paolo is taking over the job. > Update MAINTAINERS to stop flooding Marcelo with mail. > > Signed-off-by: Michael S. Tsirkin > --- > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index be02724..66e94da 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -155,7 +155,7 @@ Guest CPU Cores (KVM): > > Overall > M: Gleb Natapov > -M: Marcelo Tosatti > +M: Paolo Bonzini > L: kvm@vger.kernel.org > S: Supported > F: kvm-* > Acked-by: Paolo Bonzini
KVM call agenda for 2013-06-11
Juan is not available now, and Anthony asked for agenda to be sent early. So here comes: Agenda for the meeting Tue, June 11: - Generating acpi tables, redux Please, send any topic that you are interested in covering. Thanks, MST -- MST
Re: Planning the merge of KVM/arm64
On 04/06/13 14:13, Anup Patel wrote: Hi Anup, > On Tue, Jun 4, 2013 at 5:59 PM, Marc Zyngier wrote: >> Guys, >> >> The KVM/arm64 code is now, as it seems, in good enough shape to be >> merged. I've so far addressed all the comments, and it doesn't seem any >> worse than what is queued for its 32bit counterpart. >> >> For reference, it is sitting there: >> git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git >> kvm-arm64/kvm >> >> What is not defined yet is the merge path: >> - It is touching some of the arm64 core code, so it would be better if >> it was merged through the arm64 tree >> - It is depending on some of the patches in the core KVM queue (the >> vgic/timer move to virt/kvm/arm/) >> - It is also depending on some of the patches that are in the KVM/ARM >> queue (parametrized timer interrupt, some MMU/MMIO fixes) >> >> So I can see two possibilities: >> - Either I can rely on a stable branch from both KVM and KVM/ARM trees >> on which I can base my tree for Catalin/Will to pull, >> - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and >> only merge this last bit when the dependencies are satisfied in Linus' tree. >> >> What do you guys think? > > I had quick look at your kvm-arm64/kvm branch. I agree with the approach > of going through arm64 tree. > > FYI, latest tested branch on APM ARMv8 board is kvm-arm64/kvm-3.10-rc3 > branch. This is the exact same code, just a slightly different patch split to implement the separate Kconfig option. Thanks for testing, M. -- Jazz is not dead. It just smells funny...
[PATCH] MAINTAINERS: s/Marcelo/Paolo/
Marcelo doesn't maintain kvm anymore, Paolo is taking over the job. Update MAINTAINERS to stop flooding Marcelo with mail. Signed-off-by: Michael S. Tsirkin --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index be02724..66e94da 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -155,7 +155,7 @@ Guest CPU Cores (KVM): Overall M: Gleb Natapov -M: Marcelo Tosatti +M: Paolo Bonzini L: kvm@vger.kernel.org S: Supported F: kvm-* -- MST
Re: Planning the merge of KVM/arm64
Hi Marc, On Tue, Jun 4, 2013 at 5:59 PM, Marc Zyngier wrote: > Guys, > > The KVM/arm64 code is now, as it seems, in good enough shape to be > merged. I've so far addressed all the comments, and it doesn't seem any > worse than what is queued for its 32bit counterpart. > > For reference, it is sitting there: > git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git > kvm-arm64/kvm > > What is not defined yet is the merge path: > - It is touching some of the arm64 core code, so it would be better if > it was merged through the arm64 tree > - It is depending on some of the patches in the core KVM queue (the > vgic/timer move to virt/kvm/arm/) > - It is also depending on some of the patches that are in the KVM/ARM > queue (parametrized timer interrupt, some MMU/MMIO fixes) > > So I can see two possibilities: > - Either I can rely on a stable branch from both KVM and KVM/ARM trees > on which I can base my tree for Catalin/Will to pull, > - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and > only merge this last bit when the dependencies are satisfied in Linus' tree. > > What do you guys think? I had quick look at your kvm-arm64/kvm branch. I agree with the approach of going through arm64 tree. FYI, latest tested branch on APM ARMv8 board is kvm-arm64/kvm-3.10-rc3 branch. From my side, +1 for the second option that is "pull the arm64 part *minus the Kconfig*, and ..." > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny... > > > ___ > kvmarm mailing list > kvm...@lists.cs.columbia.edu > https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm Regards, Anup
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
On Tue, Jun 04, 2013 at 02:19:13PM +0200, Paolo Bonzini wrote: > Il 04/06/2013 13:47, Gleb Natapov ha scritto: > > On Tue, Jun 04, 2013 at 01:33:20PM +0200, Paolo Bonzini wrote: > >> Il 04/06/2013 13:28, Gleb Natapov ha scritto: > >>> On Thu, May 30, 2013 at 06:00:30PM +0200, Paolo Bonzini wrote: > This lets debugging work better during emulation of invalid > guest state. > > The check is done before emulating the instruction, and (in the case > of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG. > > Signed-off-by: Paolo Bonzini > --- > arch/x86/include/asm/kvm_host.h | 3 +- > arch/x86/kvm/x86.c | 65 > + > 2 files changed, 67 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/kvm_host.h > b/arch/x86/include/asm/kvm_host.h > index e2e09f3..aefd8c2 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -788,9 +788,10 @@ extern u32 kvm_min_guest_tsc_khz; > extern u32 kvm_max_guest_tsc_khz; > > enum emulation_result { > -EMULATE_DONE, /* no further processing */ > -EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ > +EMULATE_DONE, /* no further processing */ > +EMULATE_DO_MMIO, /* kvm_run ready for userspace exit */ > >>> If it no longer means MMIO (or PIO) lest rename it to something more > >>> meaningful. EMULATE_EXIT? EMULATE_USER_EXIT? > >> > >> I'll go with EMULATE_USER_EXIT. > >> > EMULATE_FAIL, /* can't emulate this instruction */ > +EMULATE_PROCEED, /* proceed with rest of emulation */ > >>> I think we can do without this. Have to function: check_bp(), > >>> handle_bp(). Do: > >>> > >>> if (check_bp()) > >>>return handle_bp(); > >> > >> I tried this, but it doesn't work because you need to pass the computed > >> dr6 from check_bp to handle_bp. It becomes really ugly. > >> > > Can't check_bp() return dr6? > > > > if ((dr6 = check_bp()) > > return handle_bp(dr6); > > It also needs to know if debugging the guest vs. in the guest. Thus > there is duplicate code between check and handle. 
> Yeah. What about: if ((dr6 = guest_debug())) return handle_guest_debug(); else if ((dr6 = check_bp())) return handle_bp(dr6); > >> If you do not want EMULATE_PROCEED, I can just use -1 instead in > >> kvm_vcpu_check_breakpoint, and return if r < 0. > >> > > But you need to know what to return EMULATE_DONE or EMULATE_USER_EXIT. > > Sorry, _not_ return if r < 0. > Function that returns enum or -1? This is worse IMO. Return EMULATE_DONE/EMULATE_USER_EXIT via a pointer will be better. > Paolo > > >> Paolo > >> > }; > > #define EMULTYPE_NO_DECODE (1 << 0) > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 1d928af..33b51bc 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4872,6 +4872,60 @@ static bool retry_instruction(struct > x86_emulate_ctxt *ctxt, > static int complete_emulated_mmio(struct kvm_vcpu *vcpu); > static int complete_emulated_pio(struct kvm_vcpu *vcpu); > > +static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7, > +unsigned long *db) > +{ > +u32 dr6 = 0; > +int i; > +u32 enable, rwlen; > + > +enable = dr7; > +rwlen = dr7 >> 16; > +for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4) > +if ((enable & 3) && (rwlen & 15) == type && db[i] == > addr) > +dr6 |= (1 << i); > +return dr6; > +} > + > +static int kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu) > +{ > +struct kvm_run *kvm_run = vcpu->run; > +unsigned long eip = vcpu->arch.emulate_ctxt.eip; > +u32 dr6 = 0; > + > +if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) && > +(vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) { > +dr6 = kvm_vcpu_check_hw_bp(eip, 0, > + vcpu->arch.guest_debug_dr7, > + vcpu->arch.eff_db); > + > +if (dr6 != 0) { > +kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1; > +kvm_run->debug.arch.pc = kvm_rip_read(vcpu) + > +get_segment_base(vcpu, VCPU_SREG_CS); > + > +kvm_run->debug.arch.exception = DB_VECTOR; > +kvm_run->exit_reason = KVM_EXIT_DEBUG; > +return EMULATE_DO_MMIO; > +} > +} >
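[Editor's note] For reference, the DR7 matching loop quoted in the patch above can be written as a standalone helper: bits 2*i and 2*i+1 of DR7 are the local/global enable bits for debug register i, and the nibble at bit 16+4*i holds its R/W type (low two bits) and length (high two bits), with type 0/length 0 meaning an instruction breakpoint. A user-space sketch of that logic — illustrative only, the real helper is kvm_vcpu_check_hw_bp() in the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone version of the DR7 matching loop from the patch.
 * For each of the four debug registers: check the enable bits
 * (DR7 bits 0..7, two per register), compare the R/W+LEN nibble
 * (DR7 bits 16..31, four per register) against the requested
 * 'type', and compare the breakpoint address. Returns a
 * DR6-style mask with one bit set per matching breakpoint.
 */
static uint32_t check_hw_bp(uint64_t addr, uint32_t type, uint32_t dr7,
                            const uint64_t db[4])
{
    uint32_t dr6 = 0;
    uint32_t enable = dr7;       /* enable bits start at bit 0 */
    uint32_t rwlen = dr7 >> 16;  /* R/W+LEN nibbles start at bit 16 */
    int i;

    for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4)
        if ((enable & 3) && (rwlen & 15) == type && db[i] == addr)
            dr6 |= 1u << i;
    return dr6;
}
```

For example, DR7 = 0x4 sets only the local-enable bit of debug register 1, so an instruction breakpoint (type 0) at db[1]'s address yields the mask 0x2.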
[nVMX w/ Haswell] KVM unit-tests in L1 - eventinj test fails trying to send NMI
Heya, So, I invoked this in L1 with: === [test@foo kvm-unit-tests]$ time qemu-system-x86_64 -enable-kvm -device pc-testdev -serial stdio -nographic -no-user-config -nodefaults -device isa-debug-exit,iobase=0xf4,iosize=0x4 -kernel ./x86/eventinj.flat | tee /var/tmp/eventinj-test.txt enabling apic paging enabled cr0 = 80010011 cr3 = 7fff000 cr4 = 20 Try to divide by 0 DE isr running divider is 0 Result is 150 DE exception: PASS Try int 3 BP isr running After int 3 BP exception: PASS Try send vec 33 to itself irq1 running After vec 33 to itself vec 33: PASS Try int $33 irq1 running After int $33 int $33: PASS Try send vec 32 and 33 to itself irq1 running irq0 running After vec 32 and 33 to itself vec 32/33: PASS Try send vec 32 and int $33 irq1 running irq0 running After vec 32 and int $33 vec 32/int $33: PASS Try send vec 33 and 62 and mask one with TPR irq1 running After 33/62 TPR test TPR: PASS irq0 running Try send NMI to itself After NMI to itself NMI: FAIL Try int 33 with shadowed stack irq1 running After int 33 with shadowed stack int 33 with shadowed stack: PASS summary: 9 tests, 1 failures real0m0.647s user0m0.164s sys 0m0.146s [test@foo kvm-unit-tests]$ === Any hints on further debugging this ? 
Other info: -- - L1's qemu-kvm CLI === # ps -ef | grep -i qemu qemu 5455 1 94 Jun02 ? 1-07:14:29 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu Haswell,+vmx -m 10240 -smp 4,sockets=4,cores=1,threads=1 -uuid 4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 root 12255 5419 0 08:41 pts/2 00:00:00 grep --color=auto -i qemu === - Setup details -- https://github.com/kashyapc/nvmx-haswell/blob/master/SETUP-nVMX.rst /kashyap
Planning the merge of KVM/arm64
Guys, The KVM/arm64 code is now, as it seems, in good enough shape to be merged. I've so far addressed all the comments, and it doesn't seem any worse than what is queued for its 32bit counterpart. For reference, it is sitting there: git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/kvm What is not defined yet is the merge path: - It is touching some of the arm64 core code, so it would be better if it was merged through the arm64 tree - It is depending on some of the patches in the core KVM queue (the vgic/timer move to virt/kvm/arm/) - It is also depending on some of the patches that are in the KVM/ARM queue (parametrized timer interrupt, some MMU/MMIO fixes) So I can see two possibilities: - Either I can rely on a stable branch from both KVM and KVM/ARM trees on which I can base my tree for Catalin/Will to pull, - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and only merge this last bit when the dependencies are satisfied in Linus' tree. What do you guys think? Thanks, M. -- Jazz is not dead. It just smells funny...
Re: VirtIO and BSOD On Windows Server 2003
- Original Message - From: "Stefan Hajnoczi" To: "Aaron Clausen" Cc: kvm@vger.kernel.org, vroze...@redhat.com Sent: Tuesday, June 4, 2013 10:10:50 PM Subject: Re: VirtIO and BSOD On Windows Server 2003 On Mon, Jun 03, 2013 at 09:56:41AM -0700, Aaron Clausen wrote: > I recently built a new kvm server with Debian Wheezy which comes with > KVM 1.1.2 and when I moved this guest over, I immediately started > getting BSODs (0x007). I disabled virtio block driver and then > attempted to upgrade to the latest with no luck. Stop code 0x7b "Inaccessible boot device"? How did you create the guest on the new server? Perhaps the hardware configuration changed - I suggest trying to make it as close to the original guest as possible (including the same PCI slots). Stefan It usually happens when the system is unable to find a bootable device. Check qemu options, you probably missed something. Vadim.
[GIT PULL] KVM fixes for 3.10-rc4
Linus, Please pull from git://git.kernel.org/pub/scm/virt/kvm/kvm.git fixes To receive KVM bug fixes. The bulk of the fixes is in the MIPS KVM kernel<->userspace ABI. MIPS KVM is new for 3.10 and some problems were found with the current ABI. It is better to fix them now and not have a kernel with a broken one. Andre Przywara (1): ARM: KVM: prevent NULL pointer dereferences with KVM VCPU ioctl David Daney (6): mips/kvm: Fix ABI for use of FPU. mips/kvm: Fix ABI for use of 64-bit registers. mips/kvm: Fix name of gpr field in struct kvm_regs. mips/kvm: Use ARRAY_SIZE() instead of hardcoded constants in kvm_arch_vcpu_ioctl_{s,g}et_regs mips/kvm: Fix ABI by moving manipulation of CP0 registers to KVM_{G,S}ET_ONE_REG mips/kvm: Use ENOIOCTLCMD to indicate unimplemented ioctls. Gleb Natapov (1): KVM: Fix race in apic->pending_events processing Marc Zyngier (1): ARM: KVM: be more thorough when invalidating TLBs Paolo Bonzini (2): KVM: Emulate multibyte NOP KVM: fix sil/dil/bpl/spl in the mod/rm fields arch/arm/kvm/arm.c | 15 +- arch/arm/kvm/mmu.c | 41 -- arch/mips/include/asm/kvm_host.h |4 - arch/mips/include/uapi/asm/kvm.h | 137 +++ arch/mips/kvm/kvm_mips.c | 280 +++--- arch/mips/kvm/kvm_trap_emul.c| 50 --- arch/x86/kvm/emulate.c |9 +- arch/x86/kvm/lapic.c |9 +- 8 files changed, 421 insertions(+), 124 deletions(-) -- Gleb.
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
Il 04/06/2013 13:47, Gleb Natapov ha scritto: > On Tue, Jun 04, 2013 at 01:33:20PM +0200, Paolo Bonzini wrote: >> Il 04/06/2013 13:28, Gleb Natapov ha scritto: >>> On Thu, May 30, 2013 at 06:00:30PM +0200, Paolo Bonzini wrote: This lets debugging work better during emulation of invalid guest state. The check is done before emulating the instruction, and (in the case of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 3 +- arch/x86/kvm/x86.c | 65 + 2 files changed, 67 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e2e09f3..aefd8c2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -788,9 +788,10 @@ extern u32 kvm_min_guest_tsc_khz; extern u32 kvm_max_guest_tsc_khz; enum emulation_result { - EMULATE_DONE, /* no further processing */ - EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ + EMULATE_DONE, /* no further processing */ + EMULATE_DO_MMIO, /* kvm_run ready for userspace exit */ >>> If it no longer means MMIO (or PIO) lest rename it to something more >>> meaningful. EMULATE_EXIT? EMULATE_USER_EXIT? >> >> I'll go with EMULATE_USER_EXIT. >> EMULATE_FAIL, /* can't emulate this instruction */ + EMULATE_PROCEED, /* proceed with rest of emulation */ >>> I think we can do without this. Have to function: check_bp(), >>> handle_bp(). Do: >>> >>> if (check_bp()) >>>return handle_bp(); >> >> I tried this, but it doesn't work because you need to pass the computed >> dr6 from check_bp to handle_bp. It becomes really ugly. >> > Can't check_bp() return dr6? > > if ((dr6 = check_bp()) > return handle_bp(dr6); It also needs to know if debugging the guest vs. in the guest. Thus there is duplicate code between check and handle. >> If you do not want EMULATE_PROCEED, I can just use -1 instead in >> kvm_vcpu_check_breakpoint, and return if r < 0. 
>> > But you need to know what to return EMULATE_DONE or EMULATE_USER_EXIT. Sorry, _not_ return if r < 0. Paolo >> Paolo >> }; #define EMULTYPE_NO_DECODE(1 << 0) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1d928af..33b51bc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4872,6 +4872,60 @@ static bool retry_instruction(struct x86_emulate_ctxt *ctxt, static int complete_emulated_mmio(struct kvm_vcpu *vcpu); static int complete_emulated_pio(struct kvm_vcpu *vcpu); +static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7, + unsigned long *db) +{ + u32 dr6 = 0; + int i; + u32 enable, rwlen; + + enable = dr7; + rwlen = dr7 >> 16; + for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4) + if ((enable & 3) && (rwlen & 15) == type && db[i] == addr) + dr6 |= (1 << i); + return dr6; +} + +static int kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu) +{ + struct kvm_run *kvm_run = vcpu->run; + unsigned long eip = vcpu->arch.emulate_ctxt.eip; + u32 dr6 = 0; + + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) && + (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) { + dr6 = kvm_vcpu_check_hw_bp(eip, 0, + vcpu->arch.guest_debug_dr7, + vcpu->arch.eff_db); + + if (dr6 != 0) { + kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1; + kvm_run->debug.arch.pc = kvm_rip_read(vcpu) + + get_segment_base(vcpu, VCPU_SREG_CS); + + kvm_run->debug.arch.exception = DB_VECTOR; + kvm_run->exit_reason = KVM_EXIT_DEBUG; + return EMULATE_DO_MMIO; + } + } + + if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK)) { + dr6 = kvm_vcpu_check_hw_bp(eip, 0, + vcpu->arch.dr7, + vcpu->arch.db); + + if (dr6 != 0) { + vcpu->arch.dr6 &= ~15; + vcpu->arch.dr6 |= dr6; + kvm_queue_exception(vcpu, DB_VECTOR); + return EMULATE_DONE; + } + } + + return EMULATE_PROCEED; +} + int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2, int emulation_type, @@ -4892,6 +4946,17 @@ int x86_emulate_instru
[PATCH 1/2] kvm: zero-initialize KVM_SET_GSI_ROUTING input
kvm_add_routing_entry makes an attempt to zero-initialize any new
routing entry. However, it fails to initialize padding within the u
field of the structure kvm_irq_routing_entry.

Other functions like kvm_irqchip_update_msi_route also fail to
initialize the padding field in kvm_irq_routing_entry.

While mostly harmless, this would prevent us from reusing these fields
for something useful in the future. It's better to just make sure all
input is initialized. Once it is, we can also drop the complex
field-by-field assignment and just do the simple *a = *b to update a
route entry.

Signed-off-by: Michael S. Tsirkin
---
 kvm-all.c | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 405480e..f119ce1 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1006,11 +1006,8 @@ static void kvm_add_routing_entry(KVMState *s,
     }
     n = s->irq_routes->nr++;
     new = &s->irq_routes->entries[n];
-    memset(new, 0, sizeof(*new));
-    new->gsi = entry->gsi;
-    new->type = entry->type;
-    new->flags = entry->flags;
-    new->u = entry->u;
+
+    *new = *entry;

     set_gsi(s, entry->gsi);

@@ -1029,9 +1026,7 @@ static int kvm_update_routing_entry(KVMState *s,
             continue;
         }

-        entry->type = new_entry->type;
-        entry->flags = new_entry->flags;
-        entry->u = new_entry->u;
+        *entry = *new_entry;

         kvm_irqchip_commit_routes(s);

@@ -1043,7 +1038,7 @@ static int kvm_update_routing_entry(KVMState *s,
 void kvm_irqchip_add_irq_route(KVMState *s, int irq, int irqchip, int pin)
 {
-    struct kvm_irq_routing_entry e;
+    struct kvm_irq_routing_entry e = {};

     assert(pin < s->gsi_count);

@@ -1156,7 +1151,7 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg)
         return virq;
     }

-    route = g_malloc(sizeof(KVMMSIRoute));
+    route = g_malloc0(sizeof(KVMMSIRoute));
     route->kroute.gsi = virq;
     route->kroute.type = KVM_IRQ_ROUTING_MSI;
     route->kroute.flags = 0;

@@ -1177,7 +1172,7 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg)
 int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage msg)
 {
-    struct kvm_irq_routing_entry kroute;
+    struct kvm_irq_routing_entry kroute = {};
     int virq;

     if (!kvm_gsi_routing_enabled()) {

@@ -1203,7 +1198,7 @@ int kvm_irqchip_add_msi_route(KVMState *s, MSIMessage msg)
 int kvm_irqchip_update_msi_route(KVMState *s, int virq, MSIMessage msg)
 {
-    struct kvm_irq_routing_entry kroute;
+    struct kvm_irq_routing_entry kroute = {};

     if (!kvm_irqchip_in_kernel()) {
         return -ENOSYS;
--
MST
[PATCH 2/2] kvm: skip system call when msi route is unchanged
Some guests do a large number of mask/unmask calls which currently
trigger expensive route update system calls. Detect that the route is
unchanged and skip the system call.

Reported-by: "Zhanghaoyu (A)"
Signed-off-by: Michael S. Tsirkin
---
 kvm-all.c | 4
 1 file changed, 4 insertions(+)

diff --git a/kvm-all.c b/kvm-all.c
index f119ce1..891722b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1026,6 +1026,10 @@ static int kvm_update_routing_entry(KVMState *s,
             continue;
         }

+        if (!memcmp(entry, new_entry, sizeof *entry)) {
+            return 0;
+        }
+
         *entry = *new_entry;

         kvm_irqchip_commit_routes(s);
--
MST
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
On Tue, Jun 04, 2013 at 01:33:20PM +0200, Paolo Bonzini wrote: > Il 04/06/2013 13:28, Gleb Natapov ha scritto: > > On Thu, May 30, 2013 at 06:00:30PM +0200, Paolo Bonzini wrote: > >> This lets debugging work better during emulation of invalid > >> guest state. > >> > >> The check is done before emulating the instruction, and (in the case > >> of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG. > >> > >> Signed-off-by: Paolo Bonzini > >> --- > >> arch/x86/include/asm/kvm_host.h | 3 +- > >> arch/x86/kvm/x86.c | 65 > >> + > >> 2 files changed, 67 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/x86/include/asm/kvm_host.h > >> b/arch/x86/include/asm/kvm_host.h > >> index e2e09f3..aefd8c2 100644 > >> --- a/arch/x86/include/asm/kvm_host.h > >> +++ b/arch/x86/include/asm/kvm_host.h > >> @@ -788,9 +788,10 @@ extern u32 kvm_min_guest_tsc_khz; > >> extern u32 kvm_max_guest_tsc_khz; > >> > >> enum emulation_result { > >> - EMULATE_DONE, /* no further processing */ > >> - EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ > >> + EMULATE_DONE, /* no further processing */ > >> + EMULATE_DO_MMIO, /* kvm_run ready for userspace exit */ > > If it no longer means MMIO (or PIO) lest rename it to something more > > meaningful. EMULATE_EXIT? EMULATE_USER_EXIT? > > I'll go with EMULATE_USER_EXIT. > > >>EMULATE_FAIL, /* can't emulate this instruction */ > >> + EMULATE_PROCEED, /* proceed with rest of emulation */ > > I think we can do without this. Have to function: check_bp(), > > handle_bp(). Do: > > > > if (check_bp()) > >return handle_bp(); > > I tried this, but it doesn't work because you need to pass the computed > dr6 from check_bp to handle_bp. It becomes really ugly. > Can't check_bp() return dr6? if ((dr6 = check_bp()) return handle_bp(dr6); > If you do not want EMULATE_PROCEED, I can just use -1 instead in > kvm_vcpu_check_breakpoint, and return if r < 0. > But you need to know what to return EMULATE_DONE or EMULATE_USER_EXIT. 
> Paolo > > >> }; > >> > >> #define EMULTYPE_NO_DECODE(1 << 0) > >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > >> index 1d928af..33b51bc 100644 > >> --- a/arch/x86/kvm/x86.c > >> +++ b/arch/x86/kvm/x86.c > >> @@ -4872,6 +4872,60 @@ static bool retry_instruction(struct > >> x86_emulate_ctxt *ctxt, > >> static int complete_emulated_mmio(struct kvm_vcpu *vcpu); > >> static int complete_emulated_pio(struct kvm_vcpu *vcpu); > >> > >> +static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7, > >> + unsigned long *db) > >> +{ > >> + u32 dr6 = 0; > >> + int i; > >> + u32 enable, rwlen; > >> + > >> + enable = dr7; > >> + rwlen = dr7 >> 16; > >> + for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4) > >> + if ((enable & 3) && (rwlen & 15) == type && db[i] == addr) > >> + dr6 |= (1 << i); > >> + return dr6; > >> +} > >> + > >> +static int kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu) > >> +{ > >> + struct kvm_run *kvm_run = vcpu->run; > >> + unsigned long eip = vcpu->arch.emulate_ctxt.eip; > >> + u32 dr6 = 0; > >> + > >> + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) && > >> + (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) { > >> + dr6 = kvm_vcpu_check_hw_bp(eip, 0, > >> + vcpu->arch.guest_debug_dr7, > >> + vcpu->arch.eff_db); > >> + > >> + if (dr6 != 0) { > >> + kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1; > >> + kvm_run->debug.arch.pc = kvm_rip_read(vcpu) + > >> + get_segment_base(vcpu, VCPU_SREG_CS); > >> + > >> + kvm_run->debug.arch.exception = DB_VECTOR; > >> + kvm_run->exit_reason = KVM_EXIT_DEBUG; > >> + return EMULATE_DO_MMIO; > >> + } > >> + } > >> + > >> + if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK)) { > >> + dr6 = kvm_vcpu_check_hw_bp(eip, 0, > >> + vcpu->arch.dr7, > >> + vcpu->arch.db); > >> + > >> + if (dr6 != 0) { > >> + vcpu->arch.dr6 &= ~15; > >> + vcpu->arch.dr6 |= dr6; > >> + kvm_queue_exception(vcpu, DB_VECTOR); > >> + return EMULATE_DONE; > >> + } > >> + } > >> + > >> + return EMULATE_PROCEED; > >> +} > >> + > >> 
int x86_emulate_instruction(struct kvm_vcpu *vcpu, > >>unsigned long cr2, > >>int emulation_type, > >> @@ -4892,6 +4946,17 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, > >> > >>if (!(emulation_type & EMULTYPE_NO_DECODE)) { > >>init_emulate_ctxt(vcpu); > >> + > >> + /* > >> + * We will reenter on the same instruction since
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
Il 04/06/2013 13:28, Gleb Natapov ha scritto: > On Thu, May 30, 2013 at 06:00:30PM +0200, Paolo Bonzini wrote: >> This lets debugging work better during emulation of invalid >> guest state. >> >> The check is done before emulating the instruction, and (in the case >> of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG. >> >> Signed-off-by: Paolo Bonzini >> --- >> arch/x86/include/asm/kvm_host.h | 3 +- >> arch/x86/kvm/x86.c | 65 >> + >> 2 files changed, 67 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/include/asm/kvm_host.h >> b/arch/x86/include/asm/kvm_host.h >> index e2e09f3..aefd8c2 100644 >> --- a/arch/x86/include/asm/kvm_host.h >> +++ b/arch/x86/include/asm/kvm_host.h >> @@ -788,9 +788,10 @@ extern u32 kvm_min_guest_tsc_khz; >> extern u32 kvm_max_guest_tsc_khz; >> >> enum emulation_result { >> -EMULATE_DONE, /* no further processing */ >> -EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ >> +EMULATE_DONE, /* no further processing */ >> +EMULATE_DO_MMIO, /* kvm_run ready for userspace exit */ > If it no longer means MMIO (or PIO) lest rename it to something more > meaningful. EMULATE_EXIT? EMULATE_USER_EXIT? I'll go with EMULATE_USER_EXIT. >> EMULATE_FAIL, /* can't emulate this instruction */ >> +EMULATE_PROCEED, /* proceed with rest of emulation */ > I think we can do without this. Have to function: check_bp(), > handle_bp(). Do: > > if (check_bp()) >return handle_bp(); I tried this, but it doesn't work because you need to pass the computed dr6 from check_bp to handle_bp. It becomes really ugly. If you do not want EMULATE_PROCEED, I can just use -1 instead in kvm_vcpu_check_breakpoint, and return if r < 0. 
Paolo >> }; >> >> #define EMULTYPE_NO_DECODE (1 << 0) >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 1d928af..33b51bc 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -4872,6 +4872,60 @@ static bool retry_instruction(struct x86_emulate_ctxt >> *ctxt, >> static int complete_emulated_mmio(struct kvm_vcpu *vcpu); >> static int complete_emulated_pio(struct kvm_vcpu *vcpu); >> >> +static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7, >> +unsigned long *db) >> +{ >> +u32 dr6 = 0; >> +int i; >> +u32 enable, rwlen; >> + >> +enable = dr7; >> +rwlen = dr7 >> 16; >> +for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4) >> +if ((enable & 3) && (rwlen & 15) == type && db[i] == addr) >> +dr6 |= (1 << i); >> +return dr6; >> +} >> + >> +static int kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu) >> +{ >> +struct kvm_run *kvm_run = vcpu->run; >> +unsigned long eip = vcpu->arch.emulate_ctxt.eip; >> +u32 dr6 = 0; >> + >> +if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) && >> +(vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) { >> +dr6 = kvm_vcpu_check_hw_bp(eip, 0, >> + vcpu->arch.guest_debug_dr7, >> + vcpu->arch.eff_db); >> + >> +if (dr6 != 0) { >> +kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1; >> +kvm_run->debug.arch.pc = kvm_rip_read(vcpu) + >> +get_segment_base(vcpu, VCPU_SREG_CS); >> + >> +kvm_run->debug.arch.exception = DB_VECTOR; >> +kvm_run->exit_reason = KVM_EXIT_DEBUG; >> +return EMULATE_DO_MMIO; >> +} >> +} >> + >> +if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK)) { >> +dr6 = kvm_vcpu_check_hw_bp(eip, 0, >> + vcpu->arch.dr7, >> + vcpu->arch.db); >> + >> +if (dr6 != 0) { >> +vcpu->arch.dr6 &= ~15; >> +vcpu->arch.dr6 |= dr6; >> +kvm_queue_exception(vcpu, DB_VECTOR); >> +return EMULATE_DONE; >> +} >> +} >> + >> +return EMULATE_PROCEED; >> +} >> + >> int x86_emulate_instruction(struct kvm_vcpu *vcpu, >> unsigned long cr2, >> int emulation_type, >> @@ -4892,6 +4946,17 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, 
>> >> if (!(emulation_type & EMULTYPE_NO_DECODE)) { >> init_emulate_ctxt(vcpu); >> + >> +/* >> + * We will reenter on the same instruction since >> + * we do not set complete_userspace_io. This does not >> + * handle watchpoints yet, those would be handled in >> + * the emulate_ops. >> + */ >> +r = kvm_vcpu_check_breakpoint(vcpu); >> +if (r != EMULATE_PROCEED) >> +return r; >> + >> ctxt->interruptibility =
Re: [PATCH 1/1] KVM: add kvm_para_available to asm-generic/kvm_para.h
On 4 June 2013 10:05, Gleb Natapov wrote:
> On Wed, May 22, 2013 at 12:29:22PM +0100, James Hogan wrote:
>> According to include/uapi/linux/kvm_para.h architectures should define
>> kvm_para_available, so add an implementation to asm-generic/kvm_para.h
>> which just returns false.
>>
> What is this fixing? The only user of kvm_para_available() that can
> benefit from this is in sound/pci/intel8x0.c, but I do not see a
> follow-up patch to use it there.

It was an unintentional config with mips + kvm + intel8x0 that hit it
(I think I accidentally based my mips config off an x86_64 config).
Kind of equivalent to a randconfig build failure, I suppose.

Cheers
James
Re: [PATCH 1/2] KVM: x86: handle hardware breakpoints during emulation
On Thu, May 30, 2013 at 06:00:30PM +0200, Paolo Bonzini wrote:
> This lets debugging work better during emulation of invalid
> guest state.
>
> The check is done before emulating the instruction, and (in the case
> of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG.
>
> Signed-off-by: Paolo Bonzini
> ---
>  arch/x86/include/asm/kvm_host.h |  3 +-
>  arch/x86/kvm/x86.c              | 65 +
>  2 files changed, 67 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e2e09f3..aefd8c2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -788,9 +788,10 @@ extern u32 kvm_min_guest_tsc_khz;
>  extern u32 kvm_max_guest_tsc_khz;
>
>  enum emulation_result {
> -	EMULATE_DONE,       /* no further processing */
> -	EMULATE_DO_MMIO,    /* kvm_run filled with mmio request */
> +	EMULATE_DONE,       /* no further processing */
> +	EMULATE_DO_MMIO,    /* kvm_run ready for userspace exit */

If it no longer means MMIO (or PIO) let's rename it to something more
meaningful. EMULATE_EXIT? EMULATE_USER_EXIT?

> 	EMULATE_FAIL,       /* can't emulate this instruction */
> +	EMULATE_PROCEED,    /* proceed with rest of emulation */

I think we can do without this. Have two functions, check_bp() and
handle_bp(). Do:

  if (check_bp())
     return handle_bp();

> };
>
>  #define EMULTYPE_NO_DECODE	(1 << 0)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1d928af..33b51bc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4872,6 +4872,60 @@ static bool retry_instruction(struct x86_emulate_ctxt *ctxt,
>  static int complete_emulated_mmio(struct kvm_vcpu *vcpu);
>  static int complete_emulated_pio(struct kvm_vcpu *vcpu);
>
> +static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
> +				unsigned long *db)
> +{
> +	u32 dr6 = 0;
> +	int i;
> +	u32 enable, rwlen;
> +
> +	enable = dr7;
> +	rwlen = dr7 >> 16;
> +	for (i = 0; i < 4; i++, enable >>= 2, rwlen >>= 4)
> +		if ((enable & 3) && (rwlen & 15) == type && db[i] == addr)
> +			dr6 |= (1 << i);
> +	return dr6;
> +}
> +
> +static int kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_run *kvm_run = vcpu->run;
> +	unsigned long eip = vcpu->arch.emulate_ctxt.eip;
> +	u32 dr6 = 0;
> +
> +	if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) &&
> +	    (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) {
> +		dr6 = kvm_vcpu_check_hw_bp(eip, 0,
> +					   vcpu->arch.guest_debug_dr7,
> +					   vcpu->arch.eff_db);
> +
> +		if (dr6 != 0) {
> +			kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1;
> +			kvm_run->debug.arch.pc = kvm_rip_read(vcpu) +
> +				get_segment_base(vcpu, VCPU_SREG_CS);
> +
> +			kvm_run->debug.arch.exception = DB_VECTOR;
> +			kvm_run->exit_reason = KVM_EXIT_DEBUG;
> +			return EMULATE_DO_MMIO;
> +		}
> +	}
> +
> +	if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK)) {
> +		dr6 = kvm_vcpu_check_hw_bp(eip, 0,
> +					   vcpu->arch.dr7,
> +					   vcpu->arch.db);
> +
> +		if (dr6 != 0) {
> +			vcpu->arch.dr6 &= ~15;
> +			vcpu->arch.dr6 |= dr6;
> +			kvm_queue_exception(vcpu, DB_VECTOR);
> +			return EMULATE_DONE;
> +		}
> +	}
> +
> +	return EMULATE_PROCEED;
> +}
> +
>  int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>  			    unsigned long cr2,
>  			    int emulation_type,
> @@ -4892,6 +4946,17 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>
>  	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
>  		init_emulate_ctxt(vcpu);
> +
> +		/*
> +		 * We will reenter on the same instruction since
> +		 * we do not set complete_userspace_io. This does not
> +		 * handle watchpoints yet, those would be handled in
> +		 * the emulate_ops.
> +		 */
> +		r = kvm_vcpu_check_breakpoint(vcpu);
> +		if (r != EMULATE_PROCEED)
> +			return r;
> +
>  		ctxt->interruptibility = 0;
>  		ctxt->have_exception = false;
>  		ctxt->perm_ok = false;
> --
> 1.8.1.4
>

--
			Gleb.
Re: [PATCH RFC V9 0/19] Paravirtualized ticket spinlocks
On 06/02/2013 01:44 AM, Andi Kleen wrote:
> FWIW I use the paravirt spinlock ops for adding lock elision
> to the spinlocks.
>
> This needs to be done at the top level (so the level you're removing)
>
> However I don't like the pv mechanism very much and would be fine with
> using a static key hook in the main path like I do for all the other
> lock types. It also uses interrupt ops patching, for that it would
> be still needed though.

Hi Andi, IIUC, you are okay with the current approach overall, right?
Re: [PATCH 1/1] KVM: add kvm_para_available to asm-generic/kvm_para.h
On Wed, May 22, 2013 at 12:29:22PM +0100, James Hogan wrote:
> According to include/uapi/linux/kvm_para.h architectures should define
> kvm_para_available, so add an implementation to asm-generic/kvm_para.h
> which just returns false.
>
What is this fixing? The only user of kvm_para_available() that can
benefit from this is in sound/pci/intel8x0.c, but I do not see a
follow-up patch to use it there.

> Signed-off-by: James Hogan
> Cc: Marcelo Tosatti
> Cc: Gleb Natapov
> Cc: Arnd Bergmann
> ---
>  include/asm-generic/kvm_para.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/asm-generic/kvm_para.h b/include/asm-generic/kvm_para.h
> index 9d96605..fa25bec 100644
> --- a/include/asm-generic/kvm_para.h
> +++ b/include/asm-generic/kvm_para.h
> @@ -18,4 +18,9 @@ static inline unsigned int kvm_arch_para_features(void)
>  	return 0;
>  }
>
> +static inline bool kvm_para_available(void)
> +{
> +	return false;
> +}
> +
>  #endif
> --
> 1.8.1.2
>

--
			Gleb.
Re: [PATCH] kvm: exclude ioeventfd from counting kvm_io_range limit
On Sat, May 25, 2013 at 06:44:15AM +0800, Amos Kong wrote:
> We can easily reach the 1000 limit by start VM with a couple
> hundred I/O devices (multifunction=on). The hardcode limit
> already been adjusted 3 times (6 ~ 200 ~ 300 ~ 1000).
>
> In userspace, we already have maximum file descriptor to
> limit ioeventfd count. But kvm_io_bus devices also are used
> for pit, pic, ioapic, coalesced_mmio. They couldn't be limited
> by maximum file descriptor.
>
> Currently only ioeventfds take too much kvm_io_bus devices,
> so just exclude it from counting kvm_io_range limit.
>
> Also fixed one indent issue in kvm_host.h
>
> Signed-off-by: Amos Kong
Applied, thanks.

> ---
>  include/linux/kvm_host.h | 3 ++-
>  virt/kvm/eventfd.c       | 2 ++
>  virt/kvm/kvm_main.c      | 3 ++-
>  3 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f0eea07..ef261ab 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -144,7 +144,8 @@ struct kvm_io_range {
>  #define NR_IOBUS_DEVS 1000
>
>  struct kvm_io_bus {
> -	int dev_count;
> +	int dev_count;
> +	int ioeventfd_count;
>  	struct kvm_io_range range[];
>  };
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 64ee720..1550637 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -753,6 +753,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	if (ret < 0)
>  		goto unlock_fail;
>
> +	kvm->buses[bus_idx]->ioeventfd_count++;
>  	list_add_tail(&p->list, &kvm->ioeventfds);
>
>  	mutex_unlock(&kvm->slots_lock);
> @@ -798,6 +799,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  		continue;
>
>  	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
> +	kvm->buses[bus_idx]->ioeventfd_count--;
>  	ioeventfd_release(p);
>  	ret = 0;
>  	break;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 302681c..c6d9baf 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2926,7 +2926,8 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
>  	struct kvm_io_bus *new_bus, *bus;
>
>  	bus = kvm->buses[bus_idx];
> -	if (bus->dev_count > NR_IOBUS_DEVS - 1)
> +	/* exclude ioeventfd which is limited by maximum fd */
> +	if (bus->dev_count - bus->ioeventfd_count > NR_IOBUS_DEVS - 1)
> 		return -ENOSPC;
>
> 	new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count + 1) *
> --
> 1.8.1.4

--
			Gleb.
Re: [PATCH RFC V9 19/19] kvm hypervisor: Add directed yield in vcpu block path
On 06/03/2013 09:35 PM, Konrad Rzeszutek Wilk wrote: On Sun, Jun 02, 2013 at 12:56:45AM +0530, Raghavendra K T wrote: kvm hypervisor: Add directed yield in vcpu block path From: Raghavendra K T We use the improved PLE handler logic in vcpu block patch for scheduling rather than plain schedule, so that we can make intelligent decisions You are missing '.' there, and Yep. Signed-off-by: Raghavendra K T --- arch/ia64/include/asm/kvm_host.h|5 + arch/powerpc/include/asm/kvm_host.h |5 + arch/s390/include/asm/kvm_host.h|5 + arch/x86/include/asm/kvm_host.h |2 +- arch/x86/kvm/x86.c |8 include/linux/kvm_host.h|2 +- virt/kvm/kvm_main.c |6 -- 7 files changed, 29 insertions(+), 4 deletions(-) diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h index 989dd3f..999ab15 100644 --- a/arch/ia64/include/asm/kvm_host.h +++ b/arch/ia64/include/asm/kvm_host.h @@ -595,6 +595,11 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu); int kvm_pal_emul(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run); void kvm_sal_emul(struct kvm_vcpu *vcpu); +static inline void kvm_do_schedule(struct kvm_vcpu *vcpu) +{ + schedule(); +} + #define __KVM_HAVE_ARCH_VM_ALLOC 1 struct kvm *kvm_arch_alloc_vm(void); void kvm_arch_free_vm(struct kvm *kvm); diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index af326cd..1aeecc0 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -628,4 +628,9 @@ struct kvm_vcpu_arch { #define __KVM_HAVE_ARCH_WQP #define __KVM_HAVE_CREATE_DEVICE +static inline void kvm_do_schedule(struct kvm_vcpu *vcpu) +{ + schedule(); +} + #endif /* __POWERPC_KVM_HOST_H__ */ diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index 16bd5d1..db09a56 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -266,4 +266,9 @@ struct kvm_arch{ }; extern int sie64a(struct kvm_s390_sie_block *, u64 *); +static inline void kvm_do_schedule(struct 
kvm_vcpu *vcpu)
+{
+	schedule();
+}
+
 #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 95702de..72ff791 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1042,5 +1042,5 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
 int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
 void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
 void kvm_deliver_pmi(struct kvm_vcpu *vcpu);
-
+void kvm_do_schedule(struct kvm_vcpu *vcpu);
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b963c86..d26c4be 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7281,6 +7281,14 @@ bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
 	kvm_x86_ops->interrupt_allowed(vcpu);
 }
+void kvm_do_schedule(struct kvm_vcpu *vcpu)
+{
+	/* We try to yield to a kikced vcpu else do a schedule */

s/kikced/kicked/ :(.

Thanks .. will change that.

[...]
Re: [PATCH RFC V9 18/19] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 06/03/2013 09:34 PM, Konrad Rzeszutek Wilk wrote: On Sun, Jun 02, 2013 at 12:56:24AM +0530, Raghavendra K T wrote: Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock From: Raghavendra K T KVM_HC_KICK_CPU hypercall added to wakeup halted vcpu in paravirtual spinlock enabled guest. KVM_FEATURE_PV_UNHALT enables guest to check whether pv spinlock can be enabled in guest. Thanks Vatsa for rewriting KVM_HC_KICK_CPU Signed-off-by: Srivatsa Vaddagiri Signed-off-by: Raghavendra K T --- Documentation/virtual/kvm/cpuid.txt |4 Documentation/virtual/kvm/hypercalls.txt | 13 + 2 files changed, 17 insertions(+) diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt index 83afe65..654f43c 100644 --- a/Documentation/virtual/kvm/cpuid.txt +++ b/Documentation/virtual/kvm/cpuid.txt @@ -43,6 +43,10 @@ KVM_FEATURE_CLOCKSOURCE2 || 3 || kvmclock available at msrs KVM_FEATURE_ASYNC_PF || 4 || async pf can be enabled by || || writing to msr 0x4b564d02 -- +KVM_FEATURE_PV_UNHALT || 6 || guest checks this feature bit + || || before enabling paravirtualized + || || spinlock support. +-- KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side || || per-cpu warps are expected in || || kvmclock. diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt index ea113b5..2a4da11 100644 --- a/Documentation/virtual/kvm/hypercalls.txt +++ b/Documentation/virtual/kvm/hypercalls.txt @@ -64,3 +64,16 @@ Purpose: To enable communication between the hypervisor and guest there is a shared page that contains parts of supervisor visible register state. The guest can map this shared page to access its supervisor register through memory using this hypercall. + +5. 
KVM_HC_KICK_CPU
+
+Architecture: x86
+Status: active
+Purpose: Hypercall used to wakeup a vcpu from HLT state
+Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
+kernel mode for an event to occur (ex: a spinlock to become available) can
+execute HLT instruction once it has busy-waited for more than a threshold
+time-interval. Execution of HLT instruction would cause the hypervisor to put
+the vcpu to sleep until occurence of an appropriate event. Another vcpu of the
+same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
+specifying APIC ID of the vcpu to be wokenup.

woken up.

Yep. :)
Re: [PATCH RFC V9 5/19] xen/pvticketlock: Xen implementation for PV ticket locks
On 06/03/2013 09:33 PM, Konrad Rzeszutek Wilk wrote:
> On Sat, Jun 01, 2013 at 12:23:14PM -0700, Raghavendra K T wrote:
>> xen/pvticketlock: Xen implementation for PV ticket locks
>>
>> From: Jeremy Fitzhardinge
>>
>> Replace the old Xen implementation of PV spinlocks with an
>> implementation of xen_lock_spinning and xen_unlock_kick.
>>
>> xen_lock_spinning simply registers the cpu in its entry in lock_waiting,
>> adds itself to the waiting_cpus set, and blocks on an event channel
>> until the channel becomes pending.
>>
>> xen_unlock_kick searches the cpus in waiting_cpus looking for the one
>> which next wants this lock with the next ticket, if any.  If found,
>> it kicks it by making its event channel pending, which wakes it up.
>>
>> We need to make sure interrupts are disabled while we're relying on the
>> contents of the per-cpu lock_waiting values, otherwise an interrupt
>> handler could come in, try to take some other lock, block, and overwrite
>> our values.
>>
>> Raghu: use function + enum instead of macro, cmpxchg for zero status reset
>>
>> Signed-off-by: Jeremy Fitzhardinge
>> Reviewed-by: Konrad Rzeszutek Wilk
>> Signed-off-by: Raghavendra K T
>> ---
>>  arch/x86/xen/spinlock.c | 347 +++
>>  1 file changed, 78 insertions(+), 269 deletions(-)
>>
>> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
>> index d6481a9..860e190 100644
>> --- a/arch/x86/xen/spinlock.c
>> +++ b/arch/x86/xen/spinlock.c
>> @@ -16,45 +16,44 @@
>>  #include "xen-ops.h"
>>  #include "debugfs.h"
>>
>> -#ifdef CONFIG_XEN_DEBUG_FS
>> -static struct xen_spinlock_stats
>> -{
>> -	u64 taken;
>> -	u32 taken_slow;
>> -	u32 taken_slow_nested;
>> -	u32 taken_slow_pickup;
>> -	u32 taken_slow_spurious;
>> -	u32 taken_slow_irqenable;
>> +enum xen_contention_stat {
>> +	TAKEN_SLOW,
>> +	TAKEN_SLOW_PICKUP,
>> +	TAKEN_SLOW_SPURIOUS,
>> +	RELEASED_SLOW,
>> +	RELEASED_SLOW_KICKED,
>> +	NR_CONTENTION_STATS
>> +};
>>
>> -	u64 released;
>> -	u32 released_slow;
>> -	u32 released_slow_kicked;
>> +#ifdef CONFIG_XEN_DEBUG_FS
>>  #define HISTO_BUCKETS 30
>> -	u32 histo_spin_total[HISTO_BUCKETS+1];
>> -	u32 histo_spin_spinning[HISTO_BUCKETS+1];
>> +static struct xen_spinlock_stats
>> +{
>> +	u32 contention_stats[NR_CONTENTION_STATS];
>>  	u32 histo_spin_blocked[HISTO_BUCKETS+1];
>> -
>> -	u64 time_total;
>> -	u64 time_spinning;
>>  	u64 time_blocked;
>>  } spinlock_stats;
>>
>>  static u8 zero_stats;
>>
>> -static unsigned lock_timeout = 1 << 10;
>> -#define TIMEOUT lock_timeout
>> -
>>  static inline void check_zero(void)
>>  {
>> -	if (unlikely(zero_stats)) {
>> -		memset(&spinlock_stats, 0, sizeof(spinlock_stats));
>> -		zero_stats = 0;
>> +	u8 ret;
>> +	u8 old = ACCESS_ONCE(zero_stats);
>> +	if (unlikely(old)) {
>> +		ret = cmpxchg(&zero_stats, old, 0);
>> +		/* This ensures only one fellow resets the stat */
>> +		if (ret == old)
>> +			memset(&spinlock_stats, 0, sizeof(spinlock_stats));
>>  	}
>>  }
>>
>> -#define ADD_STATS(elem, val)	\
>> -	do { check_zero(); spinlock_stats.elem += (val); } while(0)
>> +static inline void add_stats(enum xen_contention_stat var, u32 val)
>> +{
>> +	check_zero();
>> +	spinlock_stats.contention_stats[var] += val;
>> +}
>>
>>  static inline u64 spin_time_start(void)
>>  {
>> @@ -73,22 +72,6 @@ static void __spin_time_accum(u64 delta, u32 *array)
>>  		array[HISTO_BUCKETS]++;
>>  }
>>
>> -static inline void spin_time_accum_spinning(u64 start)
>> -{
>> -	u32 delta = xen_clocksource_read() - start;
>> -
>> -	__spin_time_accum(delta, spinlock_stats.histo_spin_spinning);
>> -	spinlock_stats.time_spinning += delta;
>> -}
>> -
>> -static inline void spin_time_accum_total(u64 start)
>> -{
>> -	u32 delta = xen_clocksource_read() - start;
>> -
>> -	__spin_time_accum(delta, spinlock_stats.histo_spin_total);
>> -	spinlock_stats.time_total += delta;
>> -}
>> -
>>  static inline void spin_time_accum_blocked(u64 start)
>>  {
>>  	u32 delta = xen_clocksource_read() - start;
>> @@ -98,19 +81,15 @@ static inline void spin_time_accum_blocked(u64 start)
>>  }
>>  #else  /* !CONFIG_XEN_DEBUG_FS */
>>  #define TIMEOUT (1 << 10)
>> -#define ADD_STATS(elem, val)	do { (void)(val); } while(0)
>> +static inline void add_stats(enum xen_contention_stat var, u32 val)
>> +{
>> +}
>>
>>  static inline u64 spin_time_start(void)
>>  {
>>  	return 0;
>>  }
>>
>> -static inline void spin_time_accum_total(u64 start)
>> -{
>> -}
>> -static inline void spin_time_accum_spinning(u64 start)
>> -{
>> -}
>>  static inline void spin_time_accum_blocked(u64 start)
>>  {
>>  }
>> @@ -133,229 +112,82 @@ typedef u16 xen_spinners_t;
>>  	asm(LOCK_PREFIX " decw %0" : "+m" ((xl)->spinners) : : "memory");
>>  #endif
>>
>> -struct xen_spinlock {
>> -	unsigned char lock;		/* 0 -> free; 1 -> locked */
>> -	xen_spinners_t spinners;	/* count of waiting cpus */
>> +struct xen_lock_waiting {
>> +	struct arch_spinlock *lock;
>> +	__ticket_t want;
>> };
>>
>>  static DEFINE_PER_CPU(int, lock_kicker_irq) = -1;
>> +static DEFINE_PER_
Re: [PATCH RFC V9 16/19] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
On 06/03/2013 09:30 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Jun 02, 2013 at 12:55:57AM +0530, Raghavendra K T wrote:
>> kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
>>
>> From: Srivatsa Vaddagiri
>>
>> During smp_boot_cpus a paravirtualized KVM guest detects whether the
>> hypervisor has the required feature (KVM_FEATURE_PV_UNHALT) to support
>> pv-ticketlocks. If so, support for pv-ticketlocks is registered via
>> pv_lock_ops. Use KVM_HC_KICK_CPU hypercall to wake up a waiting/halted
>> vcpu.
>>
>> Signed-off-by: Srivatsa Vaddagiri
>> Signed-off-by: Suzuki Poulose
>> [Raghu: check_zero race fix, enum for kvm_contention_stat, jumplabel
>> related changes]
>> Signed-off-by: Raghavendra K T
>> ---
>>  arch/x86/include/asm/kvm_para.h | 14 ++
>>  arch/x86/kernel/kvm.c | 256 +++
>>  2 files changed, 268 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
>> index 695399f..427afcb 100644
>> --- a/arch/x86/include/asm/kvm_para.h
>> +++ b/arch/x86/include/asm/kvm_para.h
>> @@ -118,10 +118,20 @@ void kvm_async_pf_task_wait(u32 token);
>>  void kvm_async_pf_task_wake(u32 token);
>>  u32 kvm_read_and_reset_pf_reason(void);
>>  extern void kvm_disable_steal_time(void);
>> -#else
>> -#define kvm_guest_init() do { } while (0)
>> +
>> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
>> +void __init kvm_spinlock_init(void);
>> +#else /* !CONFIG_PARAVIRT_SPINLOCKS */
>> +static inline void kvm_spinlock_init(void)
>> +{
>> +}
>> +#endif /* CONFIG_PARAVIRT_SPINLOCKS */
>> +
>> +#else /* CONFIG_KVM_GUEST */
>> +#define kvm_guest_init() do {} while (0)
>>  #define kvm_async_pf_task_wait(T) do {} while(0)
>>  #define kvm_async_pf_task_wake(T) do {} while(0)
>> +
>>  static inline u32 kvm_read_and_reset_pf_reason(void)
>>  {
>>  	return 0;
>>
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index cd6d9a5..2715b92 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -34,6 +34,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -419,6 +420,7 @@ static void __init kvm_smp_prepare_boot_cpu(void)
>>  	WARN_ON(kvm_register_clock("primary cpu clock"));
>>  	kvm_guest_cpu_init();
>>  	native_smp_prepare_boot_cpu();
>> +	kvm_spinlock_init();
>>  }
>>
>>  static void __cpuinit kvm_guest_cpu_online(void *dummy)
>> @@ -523,3 +525,257 @@ static __init int activate_jump_labels(void)
>>  	return 0;
>>  }
>>  arch_initcall(activate_jump_labels);
>> +
>> +/* Kick a cpu by its apicid. Used to wake up a halted vcpu */
>> +void kvm_kick_cpu(int cpu)
>> +{
>> +	int apicid;
>> +
>> +	apicid = per_cpu(x86_cpu_to_apicid, cpu);
>> +	kvm_hypercall1(KVM_HC_KICK_CPU, apicid);
>> +}
>> +
>> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
>> +
>> +enum kvm_contention_stat {
>> +	TAKEN_SLOW,
>> +	TAKEN_SLOW_PICKUP,
>> +	RELEASED_SLOW,
>> +	RELEASED_SLOW_KICKED,
>> +	NR_CONTENTION_STATS
>> +};
>> +
>> +#ifdef CONFIG_KVM_DEBUG_FS
>> +#define HISTO_BUCKETS 30
>> +
>> +static struct kvm_spinlock_stats
>> +{
>> +	u32 contention_stats[NR_CONTENTION_STATS];
>> +	u32 histo_spin_blocked[HISTO_BUCKETS+1];
>> +	u64 time_blocked;
>> +} spinlock_stats;
>> +
>> +static u8 zero_stats;
>> +
>> +static inline void check_zero(void)
>> +{
>> +	u8 ret;
>> +	u8 old;
>> +
>> +	old = ACCESS_ONCE(zero_stats);
>> +	if (unlikely(old)) {
>> +		ret = cmpxchg(&zero_stats, old, 0);
>> +		/* This ensures only one fellow resets the stat */
>> +		if (ret == old)
>> +			memset(&spinlock_stats, 0, sizeof(spinlock_stats));
>> +	}
>> +}
>> +
>> +static inline void add_stats(enum kvm_contention_stat var, u32 val)
>> +{
>> +	check_zero();
>> +	spinlock_stats.contention_stats[var] += val;
>> +}
>> +
>> +static inline u64 spin_time_start(void)
>> +{
>> +	return sched_clock();
>> +}
>> +
>> +static void __spin_time_accum(u64 delta, u32 *array)
>> +{
>> +	unsigned index;
>> +
>> +	index = ilog2(delta);
>> +	check_zero();
>> +
>> +	if (index < HISTO_BUCKETS)
>> +		array[index]++;
>> +	else
>> +		array[HISTO_BUCKETS]++;
>> +}
>> +
>> +static inline void spin_time_accum_blocked(u64 start)
>> +{
>> +	u32 delta;
>> +
>> +	delta = sched_clock() - start;
>> +	__spin_time_accum(delta, spinlock_stats.histo_spin_blocked);
>> +	spinlock_stats.time_blocked += delta;
>> +}
>> +
>> +static struct dentry *d_spin_debug;
>> +static struct dentry *d_kvm_debug;
>> +
>> +struct dentry *kvm_init_debugfs(void)
>> +{
>> +	d_kvm_debug = debugfs_create_dir("kvm", NULL);
>> +	if (!d_kvm_debug)
>> +		printk(KERN_WARNING "Could not create 'kvm' debugfs directory\n");
>> +
>> +	return d_kvm_debug;
>> +}
>> +
>> +static int __init kvm_spinlock_debugfs(void)
>> +{
>> +	struct dentry *d_kvm;
>> +
>> +	d_kvm = kvm_init_debugfs();
>> +	if (d_kvm == NULL)
>> +		return -ENOMEM;
>> +
>> +	d_spin_debug = debugfs_create_dir("spinlocks", d_kvm);
>> +
>> +	debugfs_create_u8("zero_stats", 0644, d_spin_debug, &zero_stats);
>> +
>> +	debugfs_create_u32("taken_slow", 0444, d_spin_debug,
>> +			&spinlock_stats.contention_stats[TAKEN_SLOW]);
>> +	debugf
Re: [PATCH RFC V9 12/19] xen: Enable PV ticketlocks on HVM Xen
On 06/03/2013 09:27 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Jun 02, 2013 at 12:55:03AM +0530, Raghavendra K T wrote:
>> xen: Enable PV ticketlocks on HVM Xen
>
> There is more to it. You should also revert
> 70dd4998cb85f0ecd6ac892cc7232abefa432efb

Yes, true. Do you expect the revert to be folded into this patch itself?
Re: [PATCH RFC V9 9/19] Split out rate limiting from jump_label.h
On 06/03/2013 09:26 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Jun 02, 2013 at 12:54:22AM +0530, Raghavendra K T wrote:
>> Split jumplabel ratelimit
>
> I would change the title a bit, perhaps prefix it with: "jump_label: "
>
>> From: Andrew Jones
>>
>> Commit b202952075f62603bea9bfb6ebc6b0420db11949 introduced rate limiting
>
> Also please add right after the git id this:
> ("perf, core: Rate limit perf_sched_events jump_label patching")

Agreed.