Re: dirty page tracking in kvm/qemu -- page faults inevitable?

2014-07-30 Thread Chris Friesen

On 07/30/2014 12:09 AM, Xiao Guangrong wrote:
> On 07/30/2014 06:12 AM, Chris Friesen wrote:
>> Hi,
>>
>> I've got an issue where we're hitting major performance penalties while
>> doing live migration, and it seems like it might be due to page faults
>> triggering hypervisor exits, and then we get stuck waiting for the
>> iothread lock which is held by the qemu dirty page scanning code.
>
> I am afraid that using the dirty bit instead of write protection may make
> the iothread-lock case even worse, because we need to walk all the sptes to
> find the dirty pages, whereas currently we only need to walk the pages set
> in the bitmap.

I found a document at
"http://ftp.software-sources.co.il/Processor_Architecture_Update-Bob_Valentine.pdf"
which talks about the benefits of Haswell.  One of the items reads:

"New Accessed and Dirty bits for Extended Page Tables (EPT) eliminates
major cause of vmexits"

Is that accurate?  If so, then it seems like it should allow for the VM
to run without trying to exit the hypervisor, and as long as it just
does in-memory operations it won't contend on the iothread lock.


Chris
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

2014-07-30 Thread Bharat Bhushan
These are not specific to e500hv but applicable to bookehv
(As per comment from Scott Wood on my patch
"kvm: ppc: bookehv: Added wrapper macros for shadow registers")

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/asm/kvm_ppc.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index cbee453..2ae2897 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -540,16 +540,16 @@ static inline bool kvmppc_shared_big_endian(struct kvm_vcpu *vcpu)
 #endif
 }
 
-#define SPRNG_WRAPPER_GET(reg, e500hv_spr) \
+#define SPRNG_WRAPPER_GET(reg, bookehv_spr)\
 static inline ulong kvmppc_get_##reg(struct kvm_vcpu *vcpu)\
 {  \
-   return mfspr(e500hv_spr);   \
+   return mfspr(bookehv_spr);  \
 }  \
 
-#define SPRNG_WRAPPER_SET(reg, e500hv_spr) \
+#define SPRNG_WRAPPER_SET(reg, bookehv_spr)\
 static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, ulong val)  \
 {  \
-   mtspr(e500hv_spr, val); \
+   mtspr(bookehv_spr, val);    \
 }  \
 
 #define SHARED_WRAPPER_GET(reg, size)  \
@@ -574,18 +574,18 @@ static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, u##size val)   \
SHARED_WRAPPER_GET(reg, size)   \
SHARED_WRAPPER_SET(reg, size)   \
 
-#define SPRNG_WRAPPER(reg, e500hv_spr) \
-   SPRNG_WRAPPER_GET(reg, e500hv_spr)  \
-   SPRNG_WRAPPER_SET(reg, e500hv_spr)  \
+#define SPRNG_WRAPPER(reg, bookehv_spr)    \
+   SPRNG_WRAPPER_GET(reg, bookehv_spr) \
+   SPRNG_WRAPPER_SET(reg, bookehv_spr) \
 
 #ifdef CONFIG_KVM_BOOKE_HV
 
-#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
-   SPRNG_WRAPPER(reg, e500hv_spr)  \
+#define SHARED_SPRNG_WRAPPER(reg, size, bookehv_spr)   \
+   SPRNG_WRAPPER(reg, bookehv_spr) \
 
 #else
 
-#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+#define SHARED_SPRNG_WRAPPER(reg, size, bookehv_spr)   \
SHARED_WRAPPER(reg, size)   \
 
 #endif
-- 
1.9.3
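
Note: the wrapper macros in this patch rely on preprocessor token pasting (`##`) to stamp out one getter and one setter per register. The following is a minimal standalone sketch of that pattern, not the kernel code: the `demo_` names, the array standing in for the SPR file, and the two register numbers are all assumptions made for illustration, and `demo_mfspr()`/`demo_mtspr()` are plain C stand-ins for the real `mfspr()`/`mtspr()` instructions.

```c
#include <assert.h>

typedef unsigned long ulong;

/* Hypothetical register numbers; a small array plays the SPR file. */
enum { DEMO_SPRN_GSRR0 = 0, DEMO_SPRN_GSRR1 = 1, DEMO_NUM_SPRS = 2 };
static ulong demo_spr_file[DEMO_NUM_SPRS];

/* Stand-in for mfspr(): read one "register" from the array. */
static ulong demo_mfspr(int spr)
{
	assert(spr >= 0 && spr < DEMO_NUM_SPRS);
	return demo_spr_file[spr];
}

/* Stand-in for mtspr(): write one "register" in the array. */
static void demo_mtspr(int spr, ulong val)
{
	assert(spr >= 0 && spr < DEMO_NUM_SPRS);
	demo_spr_file[spr] = val;
}

struct demo_vcpu { int id; };

/* reg is pasted into the function name; bookehv_spr selects the register. */
#define DEMO_SPRNG_WRAPPER_GET(reg, bookehv_spr)			\
static inline ulong demo_get_##reg(struct demo_vcpu *vcpu)		\
{									\
	(void)vcpu;							\
	return demo_mfspr(bookehv_spr);					\
}

#define DEMO_SPRNG_WRAPPER_SET(reg, bookehv_spr)			\
static inline void demo_set_##reg(struct demo_vcpu *vcpu, ulong val)	\
{									\
	(void)vcpu;							\
	demo_mtspr(bookehv_spr, val);					\
}

#define DEMO_SPRNG_WRAPPER(reg, bookehv_spr)				\
	DEMO_SPRNG_WRAPPER_GET(reg, bookehv_spr)			\
	DEMO_SPRNG_WRAPPER_SET(reg, bookehv_spr)

/* Each line expands to demo_get_<reg>() and demo_set_<reg>(). */
DEMO_SPRNG_WRAPPER(srr0, DEMO_SPRN_GSRR0)
DEMO_SPRNG_WRAPPER(srr1, DEMO_SPRN_GSRR1)
```

Since the macro parameter is only a placeholder name, renaming it from e500hv_spr to bookehv_spr changes nothing in the generated functions, which is why the patch is a pure rename.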



Re: [PATCH] KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

2014-07-30 Thread Alexander Graf


On 30.07.14 11:33, Bharat Bhushan wrote:
> These are not specific to e500hv but applicable to bookehv
> (As per comment from Scott Wood on my patch
> "kvm: ppc: bookehv: Added wrapper macros for shadow registers")
>
> Signed-off-by: Bharat Bhushan


Thanks, applied to kvm-ppc-queue.


Alex



Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-30 Thread Marc Zyngier
On Fri, Jul 25 2014 at  4:29:12 pm BST, Will Deacon  wrote:
> If the physical address of GICV isn't page-aligned, then we end up
> creating a stage-2 mapping of the page containing it, which causes us to
> map neighbouring memory locations directly into the guest.
>
> As an example, consider a platform with GICV at physical 0x2c02f000
> running a 64k-page host kernel. If qemu maps this into the guest at
> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> physical regions may cause UNPREDICTABLE behaviour, for example, on the
> Juno platform this will cause an SError exception to EL3, which brings
> down the entire physical CPU resulting in RCU stalls / HYP panics / host
> crashing / wasted weeks of debugging.
>
> SBSA recommends that systems alias the 4k GICV across the bounding 64k
> region, in which case GICV physical could be described as 0x2c020000 in
> the above scenario.
>
> This patch fixes the problem by failing the vgic probe if the physical
> base address or the size of GICV aren't page-aligned. Note that this
> generated a warning in dmesg about freeing enabled IRQs, so I had to
> move the IRQ enabling later in the probe.
>
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Gleb Natapov 
> Cc: Paolo Bonzini 
> Cc: Joel Schopp 
> Cc: Don Dutile 
> Acked-by: Peter Maydell 
> Signed-off-by: Will Deacon 

Looks good to me:

Acked-by: Marc Zyngier 

Christoffer, can you please take this as an urgent fix?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny.


[PATCH] KVM: nVMX: nested TPR shadow/threshold emulation

2014-07-30 Thread Wanpeng Li
This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=61411

The TPR shadow/threshold feature is important for speeding up Windows
guests. It is also a required feature for certain VMMs.

We map the virtual APIC page address and TPR threshold from the L1 VMCS. If
a TPR_BELOW_THRESHOLD VM exit is triggered by the L2 guest and L1 is
interested in it, we inject it into the L1 VMM for handling.

Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a3845b8..f60846c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2331,7 +2331,7 @@ static __init void nested_vmx_setup_ctls_msrs(void)
CPU_BASED_MOV_DR_EXITING | CPU_BASED_UNCOND_IO_EXITING |
CPU_BASED_USE_IO_BITMAPS | CPU_BASED_MONITOR_EXITING |
CPU_BASED_RDPMC_EXITING | CPU_BASED_RDTSC_EXITING |
-   CPU_BASED_PAUSE_EXITING |
+   CPU_BASED_PAUSE_EXITING | CPU_BASED_TPR_SHADOW |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
/*
 * We can allow some features even when not supported by the
@@ -6937,7 +6937,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
case EXIT_REASON_MCE_DURING_VMENTRY:
return 0;
case EXIT_REASON_TPR_BELOW_THRESHOLD:
-   return 1;
+   return nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW);
case EXIT_REASON_APIC_ACCESS:
return nested_cpu_has2(vmcs12,
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
@@ -7058,6 +7058,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
+   if (is_guest_mode(vcpu))
+   return;
+
if (irr == -1 || tpr < irr) {
vmcs_write32(TPR_THRESHOLD, 0);
return;
@@ -7962,14 +7965,14 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
if (!vmx->rdtscp_enabled)
exec_control &= ~SECONDARY_EXEC_RDTSCP;
/* Take the following fields only from vmcs12 */
-   exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+   exec_control &= ~(SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
   SECONDARY_EXEC_APIC_REGISTER_VIRT);
if (nested_cpu_has(vmcs12,
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS))
exec_control |= vmcs12->secondary_vm_exec_control;
 
if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) {
+   struct page *virtual_apic_page;
/*
 * Translate L1 physical address to host physical
 * address for vmcs02. Keep the page pinned, so this
@@ -7992,6 +7995,15 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
else
vmcs_write64(APIC_ACCESS_ADDR,
  page_to_phys(vmx->nested.apic_access_page));
+
+   virtual_apic_page = nested_get_page(vcpu,
+   vmcs12->virtual_apic_page_addr);
+   if (vmcs_read64(VIRTUAL_APIC_PAGE_ADDR) !=
+   page_to_phys(virtual_apic_page))
+   vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
+   page_to_phys(virtual_apic_page));
+   nested_release_page(virtual_apic_page);
+
} else if (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm)) {
exec_control |=
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
@@ -8002,6 +8014,8 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
}
 
+   if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW))
+   vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
 
/*
 * Set host-state according to L0's settings (vmcs12 is irrelevant here)
-- 
1.9.1



[PATCH] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-30 Thread Christoffer Dall
From: Will Deacon 

If the physical address of GICV isn't page-aligned, then we end up
creating a stage-2 mapping of the page containing it, which causes us to
map neighbouring memory locations directly into the guest.

As an example, consider a platform with GICV at physical 0x2c02f000
running a 64k-page host kernel. If qemu maps this into the guest at
0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
physical regions may cause UNPREDICTABLE behaviour, for example, on the
Juno platform this will cause an SError exception to EL3, which brings
down the entire physical CPU resulting in RCU stalls / HYP panics / host
crashing / wasted weeks of debugging.

SBSA recommends that systems alias the 4k GICV across the bounding 64k
region, in which case GICV physical could be described as 0x2c020000 in
the above scenario.

This patch fixes the problem by failing the vgic probe if the physical
base address or the size of GICV aren't page-aligned. Note that this
generated a warning in dmesg about freeing enabled IRQs, so I had to
move the IRQ enabling later in the probe.

Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Gleb Natapov 
Cc: Paolo Bonzini 
Cc: Joel Schopp 
Cc: Don Dutile 
Acked-by: Peter Maydell 
Acked-by: Joel Schopp 
Acked-by: Marc Zyngier 
Signed-off-by: Will Deacon 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 56ff9be..476d3bf 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1526,17 +1526,33 @@ int kvm_vgic_hyp_init(void)
goto out_unmap;
}
 
-   kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
-vctrl_res.start, vgic_maint_irq);
-   on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
-
if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
kvm_err("Cannot obtain VCPU resource\n");
ret = -ENXIO;
goto out_unmap;
}
+
+   if (!PAGE_ALIGNED(vcpu_res.start)) {
+   kvm_err("GICV physical address 0x%llx not page aligned\n",
+   (unsigned long long)vcpu_res.start);
+   ret = -ENXIO;
+   goto out_unmap;
+   }
+
+   if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
+   kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
+   (unsigned long long)resource_size(&vcpu_res),
+   PAGE_SIZE);
+   ret = -ENXIO;
+   goto out_unmap;
+   }
+
vgic_vcpu_base = vcpu_res.start;
 
+   kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
+vctrl_res.start, vgic_maint_irq);
+   on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
+
goto out;
 
 out_unmap:
-- 
2.0.0
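
Note: the two checks added by this patch come down to simple mask arithmetic. The sketch below is a standalone illustration under stated assumptions, not the kernel code: the page size is passed in as a parameter instead of using the kernel's PAGE_SIZE and PAGE_ALIGNED(), all `demo_` names are hypothetical, and -1 stands in for the -ENXIO returned by the real probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE_4K  0x1000ULL
#define DEMO_PAGE_SIZE_64K 0x10000ULL

/* True if value is a multiple of page_size (page_size: power of two). */
static bool demo_page_aligned(uint64_t value, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (value & (page_size - 1)) == 0;
}

/* Start of the page containing addr: what a stage-2 mapping would cover. */
static uint64_t demo_page_base(uint64_t addr, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Mirrors the two probe checks: reject the GICV region unless both its
 * base and its size are page aligned. Returns 0 on success, -1 on failure.
 */
static int demo_check_gicv(uint64_t start, uint64_t size, uint64_t page_size)
{
	if (!demo_page_aligned(start, page_size))
		return -1;	/* base not page aligned */
	if (!demo_page_aligned(size, page_size))
		return -1;	/* size not a multiple of the page size */
	return 0;
}
```

With 64k pages, a 4k GICV at 0x2c02f000 fails the first check, and demo_page_base() shows why mapping it anyway would be unsafe: the containing page starts at 0x2c020000, so the 60k below the GICV would be exposed to the guest as well. The same region passes with 4k pages.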



Re: [PATCH v2] kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

2014-07-30 Thread Christoffer Dall
On Wed, Jul 30, 2014 at 11:47:40AM +0100, Marc Zyngier wrote:
> On Fri, Jul 25 2014 at  4:29:12 pm BST, Will Deacon wrote:
> > If the physical address of GICV isn't page-aligned, then we end up
> > creating a stage-2 mapping of the page containing it, which causes us to
> > map neighbouring memory locations directly into the guest.
> >
> > As an example, consider a platform with GICV at physical 0x2c02f000
> > running a 64k-page host kernel. If qemu maps this into the guest at
> > 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> > map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> > physical regions may cause UNPREDICTABLE behaviour, for example, on the
> > Juno platform this will cause an SError exception to EL3, which brings
> > down the entire physical CPU resulting in RCU stalls / HYP panics / host
> > crashing / wasted weeks of debugging.
> >
> > SBSA recommends that systems alias the 4k GICV across the bounding 64k
> > region, in which case GICV physical could be described as 0x2c020000 in
> > the above scenario.
> >
> > This patch fixes the problem by failing the vgic probe if the physical
> > base address or the size of GICV aren't page-aligned. Note that this
> > generated a warning in dmesg about freeing enabled IRQs, so I had to
> > move the IRQ enabling later in the probe.
> >
> > Cc: Christoffer Dall 
> > Cc: Marc Zyngier 
> > Cc: Gleb Natapov 
> > Cc: Paolo Bonzini 
> > Cc: Joel Schopp 
> > Cc: Don Dutile 
> > Acked-by: Peter Maydell 
> > Signed-off-by: Will Deacon 
> 
> Looks good to me:
> 
> Acked-by: Marc Zyngier 
> 
> Christoffer, can you please take this as an urgent fix?
> 
Yes, sorry for the delay,

Applied to master and notified the KVM guys to try and get it into 3.16.

Thanks,
-Christoffer


[GIT PULL] KVM/ARM Urgent fix for 3.16

2014-07-30 Thread Christoffer Dall
Hi Paolo and Gleb,

Is there any chance you can get this urgent fix (for a bug which allows a
KVM guest to bring down the entire system on some 64K-enabled ARM64 hosts)
merged for 3.16?

The following changes since commit bb18b526a9d8d4a3fe56f234d5013b9f6036978d:

  Merge tag 'signed-for-3.16' of git://github.com/agraf/linux-2.6 into kvm-master (2014-07-08 12:08:58 +0200)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-3.16-rc7

for you to fetch changes up to 63afbe7a0ac184ef8485dac4914e87b211b5bfaa:

  kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform (2014-07-30 14:35:42 +0200)

---
Will Deacon (1):
  kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)


[PATCH] KVM: PPC: HV: Remove generic instruction emulation

2014-07-30 Thread Alexander Graf
Now that we have properly split load/store instruction emulation and generic
instruction emulation, we can move the generic one from kvm.ko to kvm-pr.ko
on book3s_64.

This reduces the attack surface and amount of code loaded on HV KVM kernels.

Signed-off-by: Alexander Graf 
---
 arch/powerpc/kvm/Makefile   |  2 +-
 arch/powerpc/kvm/trace_pr.h | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 1ccd7a1..2d590de 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -48,6 +48,7 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) := \
 
 kvm-pr-y := \
fpu.o \
+   emulate.o \
book3s_paired_singles.o \
book3s_pr.o \
book3s_pr_papr.o \
@@ -91,7 +92,6 @@ kvm-book3s_64-module-objs += \
$(KVM)/kvm_main.o \
$(KVM)/eventfd.o \
powerpc.o \
-   emulate.o \
emulate_loadstore.o \
book3s.o \
book3s_64_vio.o \
diff --git a/arch/powerpc/kvm/trace_pr.h b/arch/powerpc/kvm/trace_pr.h
index e1357cd..a674f09 100644
--- a/arch/powerpc/kvm/trace_pr.h
+++ b/arch/powerpc/kvm/trace_pr.h
@@ -291,6 +291,26 @@ TRACE_EVENT(kvm_unmap_hva,
TP_printk("unmap hva 0x%lx\n", __entry->hva)
 );
 
+TRACE_EVENT(kvm_ppc_instr,
+   TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
+   TP_ARGS(inst, _pc, emulate),
+
+   TP_STRUCT__entry(
+   __field(unsigned int,   inst)
+   __field(unsigned long,  pc  )
+   __field(unsigned int,   emulate )
+   ),
+
+   TP_fast_assign(
+   __entry->inst   = inst;
+   __entry->pc = _pc;
+   __entry->emulate= emulate;
+   ),
+
+   TP_printk("inst %u pc 0x%lx emulate %u\n",
+ __entry->inst, __entry->pc, __entry->emulate)
+);
+
 #endif /* _TRACE_KVM_H */
 
 /* This part must be outside protection */
-- 
1.8.1.4



[GIT PULL] Final KVM change for 3.16

2014-07-30 Thread Paolo Bonzini
Linus,

The following changes since commit bb18b526a9d8d4a3fe56f234d5013b9f6036978d:

  Merge tag 'signed-for-3.16' of git://github.com/agraf/linux-2.6 into kvm-master (2014-07-08 12:08:58 +0200)

are available in the git repository at:


  git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

for you to fetch changes up to 63afbe7a0ac184ef8485dac4914e87b211b5bfaa:

  kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform (2014-07-30 14:35:42 +0200)


Fix a bug which allows KVM guests to bring down the entire system
on some 64K enabled ARM64 hosts.


Will Deacon (1):
  kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform

 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)


Re: [GIT PULL] KVM/ARM Urgent fix for 3.16

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 14:55, Christoffer Dall wrote:
> Hi Paolo and Gleb,
> 
> Is there any chance you can get this urgent fix (for a bug which allows a
> KVM guest to bring down the entire system on some 64K-enabled ARM64 hosts)
> merged for 3.16?
> 
> The following changes since commit bb18b526a9d8d4a3fe56f234d5013b9f6036978d:
> 
>   Merge tag 'signed-for-3.16' of git://github.com/agraf/linux-2.6 into kvm-master (2014-07-08 12:08:58 +0200)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-3.16-rc7
> 
> for you to fetch changes up to 63afbe7a0ac184ef8485dac4914e87b211b5bfaa:
> 
>   kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform (2014-07-30 14:35:42 +0200)
> 
> ---
> Will Deacon (1):
>   kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform
> 
>  virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 

I think Gleb is on vacation now, but unfortunately I've already had
enough this year.

I re-sent the pull request from
git://git.kernel.org/pub/scm/virt/kvm/kvm.git, even though you had CCed
Linus here already.


Paolo


Re: [PATCH 2/3] watchdog: control hard lockup detection default

2014-07-30 Thread Don Zickus
On Fri, Jul 25, 2014 at 01:25:11PM +0200, Andrew Jones wrote:
> > to enable hard lockup detection explicitly.
> > 
> > I think changing the 'watchdog_thresh' while 'watchdog_running' is true
> > should _not_ enable hard lockup detection as a side-effect, because a
> > user may have a 'sysctl.conf' entry such as
> > 
> >    kernel.watchdog_thresh = ...
> > 
> > or may only want to change the 'watchdog_thresh' on the fly.
> > 
> > I think the following flow of execution could cause such undesired
> > side-effect.
> > 
> >proc_dowatchdog
> >  if (watchdog_user_enabled && watchdog_thresh) {
> > 
> >  watchdog_enable_hardlockup_detector
> >hardlockup_detector_enabled = true
> > 
> >  watchdog_enable_all_cpus
> >if (!watchdog_running) {
> >...
> >} else if (sample_period_changed)
> >   update_timers_all_cpus
> > for_each_online_cpu
> > update_timers
> >   watchdog_nmi_disable
> >   ...
> >   watchdog_nmi_enable
> > 
> > watchdog_hardlockup_detector_is_enabled
> >   return true
> > 
> > enable perf counter for hard lockup detection
> > 
> > Regards,
> > 
> > Uli
> 
> Nice catch. Looks like this will need a v2. Paolo, do we have a
> consensus on the proc echoing? Or should that be revisited in the v2 as
> well?

As discussed privately, how about something like this to handle that case:
(applied on top of these patches)

Cheers,
Don

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 34eca29..027fb6c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -666,7 +666,12 @@ int proc_dowatchdog(struct ctl_table *table, int write,
 * watchdog_*_all_cpus() function takes care of this.
 */
if (watchdog_user_enabled && watchdog_thresh) {
-   watchdog_enable_hardlockup_detector(true);
+   /*
+* Prevent a change in watchdog_thresh accidentally overriding
+* the enablement of the hardlockup detector.
+*/
+   if (watchdog_user_enabled != old_enabled)
+   watchdog_enable_hardlockup_detector(true);
err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh);
} else
watchdog_disable_all_cpus();



Re: [PATCH 2/3] watchdog: control hard lockup detection default

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 15:43, Don Zickus wrote:
>> > Nice catch. Looks like this will need a v2. Paolo, do we have a
>> > consensus on the proc echoing? Or should that be revisited in the v2 as
>> > well?
> As discussed privately, how about something like this to handle that case:
> (applied on top of these patches)

Don, what do you think about proc?

My opinion is still what I mentioned earlier in the thread, i.e. that if
the file says "1", writing "0" and then "1" should not constitute a
change with respect to the initial state.

Paolo
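
Note: the behaviour under discussion can be modelled in a few lines. This is a simplified sketch of the write path of proc_dowatchdog() with Don's proposed check applied, assuming a plain struct in place of the kernel's global variables; the `demo_` names are hypothetical, and the real function also starts and stops the per-CPU watchdog threads, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the state the kernel keeps in globals. */
struct demo_watchdog {
	int user_enabled;		/* models kernel.watchdog */
	int thresh;			/* models kernel.watchdog_thresh */
	bool hardlockup_enabled;	/* models hardlockup_detector_enabled */
};

/* One write to the proc interface, carrying the new values of both knobs. */
static void demo_proc_write(struct demo_watchdog *w, int new_enabled,
			    int new_thresh)
{
	int old_enabled;

	assert(w != NULL);
	old_enabled = w->user_enabled;
	w->user_enabled = new_enabled;
	w->thresh = new_thresh;

	if (w->user_enabled && w->thresh) {
		/*
		 * Proposed check: only a change of the enable knob turns
		 * the hard lockup detector on, not a change of the
		 * threshold alone.
		 */
		if (w->user_enabled != old_enabled)
			w->hardlockup_enabled = true;
	}
}
```

In this model, starting from "enabled, detector off", a threshold-only write leaves the detector off, which addresses the case Uli raised. Writing "0" and then "1" to the enable knob does turn the detector on, which is the state change Paolo is questioning.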


Re: [PATCH] KVM: nVMX: nested TPR shadow/threshold emulation

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 14:04, Wanpeng Li wrote:
> @@ -7962,14 +7965,14 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>   if (!vmx->rdtscp_enabled)
>   exec_control &= ~SECONDARY_EXEC_RDTSCP;
>   /* Take the following fields only from vmcs12 */
> - exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> -   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> + exec_control &= ~(SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>SECONDARY_EXEC_APIC_REGISTER_VIRT);

This change is wrong.  You don't have to take L0's "virtualize APIC
accesses" setting into account, because while running L2 you cannot
modify L1's CR8 (only the virtual nested one).

> +
> + virtual_apic_page = nested_get_page(vcpu,
> + vmcs12->virtual_apic_page_addr);
> + if (vmcs_read64(VIRTUAL_APIC_PAGE_ADDR) !=
> + page_to_phys(virtual_apic_page))
> + vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
> + page_to_phys(virtual_apic_page));
> + nested_release_page(virtual_apic_page);
> +

You cannot release this page here.  You need to do exactly the same
thing that is done for apic_access_page.

One thing:

> + if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW))
> + vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);

I think you can just do this write unconditionally, since most
hypervisors will enable this.  Also, you probably can add the tpr
threshold field to the read-write fields for shadow VMCS.

Paolo
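
Note: the routing rule the patch introduces for TPR_BELOW_THRESHOLD exits can be sketched outside the kernel as below. This is an illustration under stated assumptions only: the struct is a two-field stand-in for vmcs12, the `demo_` names are hypothetical, and the control bit value is defined locally for the sketch. The threshold helper follows the patch as posted (conditional write), not the unconditional write suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the "use TPR shadow" execution control bit. */
#define DEMO_CPU_BASED_TPR_SHADOW (1u << 21)

/* Two-field stand-in for the vmcs12 that L1 prepares for L2. */
struct demo_vmcs12 {
	uint32_t cpu_based_vm_exec_control;
	uint32_t tpr_threshold;
};

static bool demo_nested_cpu_has(const struct demo_vmcs12 *v, uint32_t bit)
{
	assert(v != NULL);
	return (v->cpu_based_vm_exec_control & bit) != 0;
}

/*
 * true:  the TPR_BELOW_THRESHOLD exit is reflected to L1;
 * false: L0 handles it itself.
 */
static bool demo_tpr_exit_goes_to_l1(const struct demo_vmcs12 *v)
{
	return demo_nested_cpu_has(v, DEMO_CPU_BASED_TPR_SHADOW);
}

/*
 * Threshold in effect while L2 runs: L1's value if L1 enabled the TPR
 * shadow, otherwise whatever was programmed before.
 */
static uint32_t demo_l2_tpr_threshold(const struct demo_vmcs12 *v,
				      uint32_t current)
{
	if (demo_nested_cpu_has(v, DEMO_CPU_BASED_TPR_SHADOW))
		return v->tpr_threshold;
	return current;
}
```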


Re: [PATCH] KVM: vmx: remove duplicate vmx_mpx_supported()

2014-07-30 Thread Paolo Bonzini
On 29/07/2014 23:14, Chris J Arges wrote:
> Remove a function which was added by both 93c4adc7afe and 36be0b9deb2.
> 
> Signed-off-by: Chris J Arges 
> ---
>  arch/x86/kvm/vmx.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 801332e..c4ea039 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -740,7 +740,6 @@ static u32 vmx_segment_access_rights(struct kvm_segment *var);
>  static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
>  static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
>  static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
> -static bool vmx_mpx_supported(void);
>  
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> 

Thanks, applying.

Paolo


Re: dirty page tracking in kvm/qemu -- page faults inevitable?

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 09:41, Chris Friesen wrote:
>> I am afraid that using the dirty bit instead of write protection may make
>> the iothread-lock case even worse, because we need to walk all the sptes to
>> find the dirty pages, whereas currently we only need to walk the pages set
>> in the bitmap.
> 
> I found a document at
> "http://ftp.software-sources.co.il/Processor_Architecture_Update-Bob_Valentine.pdf"
> which talks about the benefits of Haswell.  One of the items reads:
> 
> "New Accessed and Dirty bits for Extended Page Tables (EPT) eliminates
> major cause of vmexits"
> 
> Is that accurate?  If so, then it seems like it should allow for the VM
> to run without trying to exit the hypervisor, and as long as it just
> does in-memory operations it won't contend on the iothread lock.

True, but:

1) the problem is fishing the information out of the page tables and
passing it up to userspace.  You have to process the whole EPT tree one
page at a time, instead of doing it 64 bits at a time.  Also, one source
of bad performance is having to split all entries of the EPT page tables
down to 4K, and you get that anyway.

2) You should not get to userspace simply for marking a page as locked.
As you describe it, your problem seems to be contention between QEMU
threads; KVM is not involved.

3) What version of QEMU are you using?  Things have been improving
steadily, and we probably will get to using atomic operations instead of
the iothread lock to protect the migration dirty bitmap.

Paolo
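
Note: the "64 bits at a time" remark in point 1 refers to the way a dirty bitmap can be scanned. The following is a minimal standalone sketch of that idea, not QEMU or KVM code: the `demo_` name and the flat array of 64-bit words are assumptions for illustration. A word that is zero is dismissed with a single comparison, which covers 64 clean pages at once; individual bits are only examined in words that contain at least one dirty page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the page numbers of all set bits in bitmap (nwords 64-bit
 * words) into pages[], storing at most max_pages entries. Returns the
 * number of entries stored.
 */
static size_t demo_collect_dirty(const uint64_t *bitmap, size_t nwords,
				 uint64_t *pages, size_t max_pages)
{
	size_t found = 0;

	assert(bitmap != NULL && pages != NULL);

	for (size_t i = 0; i < nwords; i++) {
		uint64_t word = bitmap[i];

		if (word == 0)
			continue;	/* 64 clean pages skipped at once */

		for (unsigned int bit = 0; bit < 64 && found < max_pages; bit++) {
			if (word & (1ULL << bit))
				pages[found++] = (uint64_t)i * 64 + bit;
		}
	}
	return found;
}
```

A scan driven by dirty bits stored in the page tables has no such shortcut in this model: every entry has to be visited to learn whether its page is dirty.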


Re: dirty page tracking in kvm/qemu -- page faults inevitable?

2014-07-30 Thread Chris Friesen

On 07/30/2014 09:42 AM, Paolo Bonzini wrote:
> On 30/07/2014 09:41, Chris Friesen wrote:
>> I found a document at
>> "http://ftp.software-sources.co.il/Processor_Architecture_Update-Bob_Valentine.pdf"
>> which talks about the benefits of Haswell.  One of the items reads:
>>
>> "New Accessed and Dirty bits for Extended Page Tables (EPT) eliminates
>> major cause of vmexits"
>>
>> Is that accurate?  If so, then it seems like it should allow for the VM
>> to run without trying to exit the hypervisor, and as long as it just
>> does in-memory operations it won't contend on the iothread lock.
>
> 2) You should not get to userspace simply for marking a page as locked.
> As you describe it, your problem seems to be contention between QEMU
> threads, KVM is not involved.

What about writing to a page where we're tracking dirty pages?  Would
that get back up to qemu or would that be handled entirely in the kvm
kernel module?

I was assuming that it was due to the page faults since as far as I know
the app in the VM is just doing packet processing from/to memory-mapped
circular buffers--the qemu threads in question aren't doing "normal" I/O
but something is causing them to try to acquire the iothread lock.

> 3) What version of QEMU are you using?  Things have been improving
> steadily, and we probably will get to using atomic operations instead of
> the iothread lock to protect the migration dirty bitmap.

We're currently on 1.4.2.  We're looking at trying out 1.7 to see if
it's better, but we've got some local patches that would need to get ported.

Chris



[PATCH] KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table

2014-07-30 Thread Paolo Bonzini
Currently, the EOI exit bitmap (used for APICv) does not include
interrupts that are masked.  However, this can cause a bug that manifests
as an interrupt storm inside the guest.  Alex Williamson reported the
bug and is the one who really debugged this; I only wrote the patch. :)

The scenario involves a multi-function PCI device with OHCI and EHCI
USB functions and an audio function, all assigned to the guest, where
both USB functions use legacy INTx interrupts.

As soon as the guest boots, interrupts for these devices turn into an
interrupt storm in the guest; the host does not see the interrupt storm.
Basically the EOI path does not work, and the guest continues to see the
interrupt over and over, even after it attempts to mask it at the APIC.
The bug is only visible with older kernels (RHEL6.5, based on 2.6.32
with not many changes in the area of APIC/IOAPIC handling).

Alex then tried forcing bit 59 (corresponding to the USB functions' IRQ)
on in the eoi_exit_bitmap and TMR, and things then work.  What happens
is that VFIO asserts IRQ11, then KVM recomputes the EOI exit bitmap.
It does not have bit 59 set because the RTE was masked, so the IOAPIC
never sees the EOI and the interrupt continues to fire in the guest.

Probably, the guest is masking the interrupt in the redirection table in
the interrupt routine, i.e. while the interrupt is set in a LAPIC's ISR.
The simplest fix is to ignore the masking state; we would rather have
an unnecessary exit than a missed IRQ ACK, and anyway IOAPIC
interrupts are not as performance-sensitive as, for example, MSIs.

[Thanks to Alex for his precise description of the problem
 and initial debugging effort.  A lot of the text above is
 based on emails exchanged with him.]

Reported-by: Alex Williamson 
Cc: sta...@vger.kernel.org
Signed-off-by: Paolo Bonzini 
---
 virt/kvm/ioapic.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 2458a1dc2ba9..e8ce34c9db32 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -254,10 +254,9 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
spin_lock(&ioapic->lock);
for (index = 0; index < IOAPIC_NUM_PINS; index++) {
e = &ioapic->redirtbl[index];
-   if (!e->fields.mask &&
-   (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
-kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
-index) || index == RTC_GSI)) {
+   if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
+   kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index) ||
+   index == RTC_GSI) {
if (kvm_apic_match_dest(vcpu, NULL, 0,
e->fields.dest_id, e->fields.dest_mode)) {
__set_bit(e->fields.vector,
-- 
1.8.3.1



Re: dirty page tracking in kvm/qemu -- page faults inevitable?

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 18:02, Chris Friesen wrote:
> On 07/30/2014 09:42 AM, Paolo Bonzini wrote:
>> On 30/07/2014 09:41, Chris Friesen wrote:
>>> I found a document at
>>> "http://ftp.software-sources.co.il/Processor_Architecture_Update-Bob_Valentine.pdf"
>>>
>>> which talks about the benefits of Haswell.  One of the items reads:
>>>
>>> "New Accessed and Dirty bits for Extended Page Tables (EPT) eliminates
>>> major cause of vmexits"
>>>
>>> Is that accurate?  If so, then it seems like it should allow for the VM
>>> to run without trying to exit the hypervisor, and as long as it just
>>> does in-memory operations it won't contend on the iothread lock.
> 
>> 2) You should not get to userspace simply for marking a page as locked.
>>   As you describe it, your problem seems to be contention between QEMU
>> threads, KVM is not involved.
> 
> What about writing to a page where we're tracking dirty pages?  Would
> that get back up to qemu or would that be handled entirely in the kvm
> kernel module?

It's handled inside the kernel module.

Every now and then QEMU asks the kernel for the dirty pages and ORs the
bitmap returned by KVM with its own.  All this is done under the
iothread lock.
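
As a rough sketch of that merge step, assuming both bitmaps are arrays of
unsigned long with one bit per guest page: the names merge_dirty_bitmap,
own and from_kvm are made up for illustration, this is not QEMU's actual
code, and the locking is left out.

```c
#include <stddef.h>

/* Hypothetical helper: OR the bitmap returned by KVM into our own copy.
 * Returns the number of words that gained at least one new dirty bit. */
static size_t merge_dirty_bitmap(unsigned long *own,
                                 const unsigned long *from_kvm,
                                 size_t nwords)
{
    size_t changed = 0;

    for (size_t i = 0; i < nwords; i++) {
        /* Bits dirty in the kernel's view but not yet in ours. */
        unsigned long newly = from_kvm[i] & ~own[i];

        if (newly) {
            own[i] |= newly;
            changed++;
        }
    }
    return changed;
}
```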

> I was assuming that it was due to the page faults since as far as I know
> the app in the VM is just doing packet processing from/to memory-mapped
> circular buffers--the qemu threads in question aren't doing "normal" I/O
> but something is causing them to try to acquire the iothread lock.
> 
>> 3) What version of QEMU are you using?  Things have been improving
>> steadily, and we probably will get to using atomic operations instead of
>> the iothread lock to protect the migration dirty bitmap.
> 
> We're currently on 1.4.2.  We're looking at trying out 1.7 to see if
> it's better, but we've got some local patches that would need to get
> ported.

From a quick "git describe", 2.0 is needed.  The patches end at commit
ae2810c (memory: syncronize kvm bitmap using bitmaps operations,
2013-11-05).

Paolo


Re: [PATCH] KVM: PPC: HV: Remove generic instruction emulation

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 15:27, Alexander Graf wrote:
> Now that we have properly split load/store instruction emulation and generic
> instruction emulation, we can move the generic one from kvm.ko to kvm-pr.ko
> on book3s_64.
> 
> This reduces the attack surface and amount of code loaded on HV KVM kernels.

Can emulation races happen on HV KVM like you can have on x86?
Basically one CPU writes to MMIO while the other patches instructions so
that basically anything can end up in the hands of the emulator?  On PPC
it may even happen simply because of a missing icache invalidation, I
think, since it doesn't support self-modifying code without explicit
invalidation.

Paolo

> Signed-off-by: Alexander Graf 
> ---
>  arch/powerpc/kvm/Makefile   |  2 +-
>  arch/powerpc/kvm/trace_pr.h | 20 ++++++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
> index 1ccd7a1..2d590de 100644
> --- a/arch/powerpc/kvm/Makefile
> +++ b/arch/powerpc/kvm/Makefile
> @@ -48,6 +48,7 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) := \
>  
>  kvm-pr-y := \
>   fpu.o \
> + emulate.o \
>   book3s_paired_singles.o \
>   book3s_pr.o \
>   book3s_pr_papr.o \
> @@ -91,7 +92,6 @@ kvm-book3s_64-module-objs += \
>   $(KVM)/kvm_main.o \
>   $(KVM)/eventfd.o \
>   powerpc.o \
> - emulate.o \
>   emulate_loadstore.o \
>   book3s.o \
>   book3s_64_vio.o \
> diff --git a/arch/powerpc/kvm/trace_pr.h b/arch/powerpc/kvm/trace_pr.h
> index e1357cd..a674f09 100644
> --- a/arch/powerpc/kvm/trace_pr.h
> +++ b/arch/powerpc/kvm/trace_pr.h
> @@ -291,6 +291,26 @@ TRACE_EVENT(kvm_unmap_hva,
>   TP_printk("unmap hva 0x%lx\n", __entry->hva)
>  );
>  
> +TRACE_EVENT(kvm_ppc_instr,
> + TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
> + TP_ARGS(inst, _pc, emulate),
> +
> + TP_STRUCT__entry(
> + __field(unsigned int,   inst)
> + __field(unsigned long,  pc  )
> + __field(unsigned int,   emulate )
> + ),
> +
> + TP_fast_assign(
> + __entry->inst   = inst;
> + __entry->pc = _pc;
> + __entry->emulate= emulate;
> + ),
> +
> + TP_printk("inst %u pc 0x%lx emulate %u\n",
> +   __entry->inst, __entry->pc, __entry->emulate)
> +);
> +
>  #endif /* _TRACE_KVM_H */
>  
>  /* This part must be outside protection */
> 



Re: [PATCH] KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table

2014-07-30 Thread Alex Williamson
On Wed, 2014-07-30 at 18:12 +0200, Paolo Bonzini wrote:
> Currently, the EOI exit bitmap (used for APICv) does not include
> interrupts that are masked.  However, this can cause a bug that manifests
> as an interrupt storm inside the guest.  Alex Williamson reported the
> bug and is the one who really debugged this; I only wrote the patch. :)
> 
> The scenario involves a multi-function PCI device with OHCI and EHCI
> USB functions and an audio function, all assigned to the guest, where
> both USB functions use legacy INTx interrupts.
> 
> As soon as the guest boots, interrupts for these devices turn into an
> interrupt storm in the guest; the host does not see the interrupt storm.
> Basically the EOI path does not work, and the guest continues to see the
> interrupt over and over, even after it attempts to mask it at the APIC.
> The bug is only visible with older kernels (RHEL6.5, based on 2.6.32
> with not many changes in the area of APIC/IOAPIC handling).
> 
> Alex then tried forcing bit 59 (corresponding to the USB functions' IRQ)
> on in the eoi_exit_bitmap and TMR, and things then work.  What happens
> is that VFIO asserts IRQ11, then KVM recomputes the EOI exit bitmap.
> Bit 59 is not set there because the RTE was masked, so the IOAPIC
> never sees the EOI and the interrupt continues to fire in the guest.
> 
> Probably, the guest is masking the interrupt in the redirection table in
> the interrupt routine, i.e. while the interrupt is set in a LAPIC's ISR.
> The simplest fix is to ignore the masking state; we would rather have
> an unnecessary exit than a missed IRQ ACK, and anyway IOAPIC
> interrupts are not as performance-sensitive as, for example, MSIs.
> 
> [Thanks to Alex for his precise description of the problem
>  and initial debugging effort.  A lot of the text above is
>  based on emails exchanged with him.]
> 
> Reported-by: Alex Williamson 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Paolo Bonzini 

Thanks Paolo

Tested-by: Alex Williamson 

> ---
>  virt/kvm/ioapic.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index 2458a1dc2ba9..e8ce34c9db32 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -254,10 +254,9 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
>   spin_lock(&ioapic->lock);
>   for (index = 0; index < IOAPIC_NUM_PINS; index++) {
>   e = &ioapic->redirtbl[index];
> - if (!e->fields.mask &&
> - (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
> -  kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
> -  index) || index == RTC_GSI)) {
> + if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
> + kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index) ||
> + index == RTC_GSI) {
>   if (kvm_apic_match_dest(vcpu, NULL, 0,
>   e->fields.dest_id, e->fields.dest_mode)) {
>   __set_bit(e->fields.vector,





Re: [PATCH 2/3] watchdog: control hard lockup detection default

2014-07-30 Thread Don Zickus
On Wed, Jul 30, 2014 at 04:16:38PM +0200, Paolo Bonzini wrote:
> On 30/07/2014 15:43, Don Zickus wrote:
> >> > Nice catch. Looks like this will need a v2. Paolo, do we have a
> >> > consensus on the proc echoing? Or should that be revisited in the v2 as
> >> > well?
> > As discussed privately, how about something like this to handle that case:
> > (applied on top of these patches)
> 
> Don, what do you think about proc?
> 
> My opinion is still what I mentioned earlier in the thread, i.e. that if
> the file says "1", writing "0" and then "1" should not constitute a
> change WRT to the initial state.
> 

I can agree.  The problem is there are two things this proc value
controls: softlockup and hardlockup.  I have always tried to keep both
disabled or enabled together.

This patchset tries to separate them for an edge case.  Hence the proc
value becomes slightly confusing.

I don't know the right way to solve this without introducing more proc
values.

We have /proc/sys/kernel/nmi_watchdog and /proc/sys/kernel/watchdog which
point to the same internal variable.  Do I separate them and have
'nmi_watchdog' just mean hardlockup and 'watchdog' mean softlockup?  Then
we can be clear on what the output is.  Or does 'watchdog' represent a
superset of 'nmi_watchdog' && softlockup?

That is where the confusion lies.

Cheers,
Don



Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest visible register

2014-07-30 Thread Scott Wood
On Wed, 2014-07-30 at 00:21 -0500, Bhushan Bharat-R65777 wrote:
> 
> > -Original Message-
> > From: Wood Scott-B07421
> > Sent: Tuesday, July 29, 2014 3:22 AM
> > To: Bhushan Bharat-R65777
> > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder 
> > Stuart-
> > B08248
> > Subject: Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest
> > visible register
> > 
> > On Fri, 2014-07-11 at 14:08 +0530, Bharat Bhushan wrote:
> > > This is not used and  even I do not remember why this was added in
> > > first place.
> > >
> > > Signed-off-by: Bharat Bhushan 
> > > ---
> > >  arch/powerpc/kvm/booke.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
> > > ab62109..a5ee42c 100644
> > > --- a/arch/powerpc/kvm/booke.c
> > > +++ b/arch/powerpc/kvm/booke.c
> > > @@ -1804,8 +1804,6 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> > >   kvm_guest_protect_msr(vcpu, MSR_DE, true);
> > >   vcpu->guest_debug = dbg->control;
> > >   vcpu->arch.shadow_dbg_reg.dbcr0 = 0;
> > > - /* Set DBCR0_EDM in guest visible DBCR0 register. */
> > > - vcpu->arch.dbg_reg.dbcr0 = DBCR0_EDM;
> > >
> > >   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > >   vcpu->arch.shadow_dbg_reg.dbcr0 |= DBCR0_IDM | DBCR0_IC;
> > 
> > This was intended to let the guest know that the host owns the debug 
> > resources,
> > by analogy to what a JTAG debugger would do.
> > 
> > The Power ISA has this "Virtualized Implementation Note":
> > 
> > It is the responsibility of the hypervisor to ensure that
> > DBCR0[EDM] is consistent with usage of DEP.
> 
> Ok, that means that if MSRP_DEP is set then set DBCR0_EDM, and if MSRP_DEP is
> clear then clear DBCR0_EDM, right?
> We need to implement this.

We should probably clear EDM only when guest debug emulation is working
and enabled (i.e. not until at least patch 6/6).

-Scott




RE: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest visible register

2014-07-30 Thread bharat.bhus...@freescale.com


> -Original Message-
> From: Wood Scott-B07421
> Sent: Wednesday, July 30, 2014 11:18 PM
> To: Bhushan Bharat-R65777
> Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder Stuart-
> B08248
> Subject: Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest
> visible register
> 
> On Wed, 2014-07-30 at 00:21 -0500, Bhushan Bharat-R65777 wrote:
> >
> > > -Original Message-
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, July 29, 2014 3:22 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org;
> > > Yoder Stuart-
> > > B08248
> > > Subject: Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM
> > > in guest visible register
> > >
> > > On Fri, 2014-07-11 at 14:08 +0530, Bharat Bhushan wrote:
> > > > This is not used and  even I do not remember why this was added in
> > > > first place.
> > > >
> > > > Signed-off-by: Bharat Bhushan 
> > > > ---
> > > >  arch/powerpc/kvm/booke.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > > > index ab62109..a5ee42c 100644
> > > > --- a/arch/powerpc/kvm/booke.c
> > > > +++ b/arch/powerpc/kvm/booke.c
> > > > @@ -1804,8 +1804,6 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> > > > kvm_guest_protect_msr(vcpu, MSR_DE, true);
> > > > vcpu->guest_debug = dbg->control;
> > > > vcpu->arch.shadow_dbg_reg.dbcr0 = 0;
> > > > -   /* Set DBCR0_EDM in guest visible DBCR0 register. */
> > > > -   vcpu->arch.dbg_reg.dbcr0 = DBCR0_EDM;
> > > >
> > > > if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > > vcpu->arch.shadow_dbg_reg.dbcr0 |= DBCR0_IDM | DBCR0_IC;
> > >
> > > This was intended to let the guest know that the host owns the debug
> > > resources, by analogy to what a JTAG debugger would do.
> > >
> > > The Power ISA has this "Virtualized Implementation Note":
> > >
> > > It is the responsibility of the hypervisor to ensure that
> > > DBCR0[EDM] is consistent with usage of DEP.
> >
> > Ok, that means that if MSRP_DEP is set then set DBCR0_EDM, and if MSRP_DEP
> > is clear then clear DBCR0_EDM, right?
> > We need to implement this.
> 
> We should probably clear EDM only when guest debug emulation is working and
> enabled (i.e. not until at least patch 6/6).

But if EDM is set, guest debug emulation will not be allowed to start.


Thanks
-Bharat

> 
> -Scott
> 



Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest visible register

2014-07-30 Thread Scott Wood
On Wed, 2014-07-30 at 12:57 -0500, Bhushan Bharat-R65777 wrote:
> 
> > -Original Message-
> > From: Wood Scott-B07421
> > Sent: Wednesday, July 30, 2014 11:18 PM
> > To: Bhushan Bharat-R65777
> > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder 
> > Stuart-
> > B08248
> > Subject: Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM in guest
> > visible register
> > 
> > On Wed, 2014-07-30 at 00:21 -0500, Bhushan Bharat-R65777 wrote:
> > >
> > > > -Original Message-
> > > > From: Wood Scott-B07421
> > > > Sent: Tuesday, July 29, 2014 3:22 AM
> > > > To: Bhushan Bharat-R65777
> > > > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org;
> > > > Yoder Stuart-
> > > > B08248
> > > > Subject: Re: [PATCH 1/6] KVM: PPC: BOOKE: No need to set DBCR0_EDM
> > > > in guest visible register
> > > >
> > > > On Fri, 2014-07-11 at 14:08 +0530, Bharat Bhushan wrote:
> > > > > This is not used and  even I do not remember why this was added in
> > > > > first place.
> > > > >
> > > > > Signed-off-by: Bharat Bhushan 
> > > > > ---
> > > > >  arch/powerpc/kvm/booke.c | 2 --
> > > > >  1 file changed, 2 deletions(-)
> > > > >
> > > > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > > > > index ab62109..a5ee42c 100644
> > > > > --- a/arch/powerpc/kvm/booke.c
> > > > > +++ b/arch/powerpc/kvm/booke.c
> > > > > @@ -1804,8 +1804,6 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> > > > >   kvm_guest_protect_msr(vcpu, MSR_DE, true);
> > > > >   vcpu->guest_debug = dbg->control;
> > > > >   vcpu->arch.shadow_dbg_reg.dbcr0 = 0;
> > > > > - /* Set DBCR0_EDM in guest visible DBCR0 register. */
> > > > > - vcpu->arch.dbg_reg.dbcr0 = DBCR0_EDM;
> > > > >
> > > > >   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > > >   vcpu->arch.shadow_dbg_reg.dbcr0 |= DBCR0_IDM | DBCR0_IC;
> > > >
> > > > This was intended to let the guest know that the host owns the debug
> > > > resources, by analogy to what a JTAG debugger would do.
> > > >
> > > > The Power ISA has this "Virtualized Implementation Note":
> > > >
> > > > It is the responsibility of the hypervisor to ensure that
> > > > DBCR0[EDM] is consistent with usage of DEP.
> > >
> > > Ok, that means that if MSRP_DEP is set then set DBCR0_EDM, and if MSRP_DEP
> > > is clear then clear DBCR0_EDM, right?
> > > We need to implement this.
> > 
> > We should probably clear EDM only when guest debug emulation is working and
> > enabled (i.e. not until at least patch 6/6).
> 
> But if EDM is set, guest debug emulation will not be allowed to start.

I don't mean after the guest tries to write to the registers -- I mean
after the code has been added to KVM to allow it to work.

-Scott




Re: [GIT PULL] KVM/ARM Urgent fix for 3.16

2014-07-30 Thread Christoffer Dall
On Wed, Jul 30, 2014 at 03:34:11PM +0200, Paolo Bonzini wrote:
> On 30/07/2014 14:55, Christoffer Dall wrote:
> > Hi Paolo and Gleb,
> > 
> > Is there any chance you can get this urgent fix (which allows KVM guest
> > to bring down the entire system on some 64K enabled ARM64 hosts) merged
> > for 3.16?
> > 
> > The following changes since commit bb18b526a9d8d4a3fe56f234d5013b9f6036978d:
> > 
> >   Merge tag 'signed-for-3.16' of git://github.com/agraf/linux-2.6 into 
> > kvm-master (2014-07-08 12:08:58 +0200)
> > 
> > are available in the git repository at:
> > 
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git 
> > tags/kvm-arm-for-3.16-rc7
> > 
> > for you to fetch changes up to 63afbe7a0ac184ef8485dac4914e87b211b5bfaa:
> > 
> >   kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform 
> > (2014-07-30 14:35:42 +0200)
> > 
> > ---
> > Will Deacon (1):
> >   kvm: arm64: vgic: fix hyp panic with 64k pages on juno platform
> > 
> >  virt/kvm/arm/vgic.c | 24 
> >  1 file changed, 20 insertions(+), 4 deletions(-)
> > 
> 
> I think Gleb is on vacation now, but unfortunately I've already had
> enough this year.
> 
> I resent the pull request from
> git://git.kernel.org/pub/scm/virt/kvm/kvm.git, even though you had CCed
> Linus here already.
> 
I cc'ed Linus in case you were on vacation and since this is urgent and
last minute.

In any case, thanks to all for dealing with this quickly.

-Christoffer


Re: [PATCH] KVM: PPC: HV: Remove generic instruction emulation

2014-07-30 Thread Alexander Graf


On 30.07.14 18:21, Paolo Bonzini wrote:
> On 30/07/2014 15:27, Alexander Graf wrote:
>> Now that we have properly split load/store instruction emulation and generic
>> instruction emulation, we can move the generic one from kvm.ko to kvm-pr.ko
>> on book3s_64.
>>
>> This reduces the attack surface and amount of code loaded on HV KVM kernels.
>
> Can emulation races happen on HV KVM like you can have on x86?
> Basically one CPU writes to MMIO while the other patches instructions so
> that basically anything can end up in the hands of the emulator?  On PPC
> it may even happen simply because of a missing icache invalidation, I
> think, since it doesn't support self-modifying code without explicit
> invalidation.


Yes, this is perfectly possible. As of my last patch set we will never 
enter the generic emulator for HV KVM, so that race is moot (we just 
inject a PROGRAM interrupt into the guest). With this patch even the 
code to emulate these bits doesn't exist in the kernel anymore if you 
don't modprobe kvm-pr.ko.



Alex



Re: [PATCH] KVM: PPC: HV: Remove generic instruction emulation

2014-07-30 Thread Paolo Bonzini
On 30/07/2014 20:57, Alexander Graf wrote:
> Yes, this is perfectly possible. As of my last patch set we will never
> enter the generic emulator for HV KVM, so that race is moot (we just
> inject a PROGRAM interrupt into the guest). With this patch even the
> code to emulate these bits doesn't exist in the kernel anymore if you
> don't modprobe kvm-pr.ko.

What is a PROGRAM interrupt? :)

Paolo


Re: [PATCH] KVM: PPC: HV: Remove generic instruction emulation

2014-07-30 Thread Alexander Graf


On 30.07.14 21:47, Paolo Bonzini wrote:
> On 30/07/2014 20:57, Alexander Graf wrote:
>> Yes, this is perfectly possible. As of my last patch set we will never
>> enter the generic emulator for HV KVM, so that race is moot (we just
>> inject a PROGRAM interrupt into the guest). With this patch even the
>> code to emulate these bits doesn't exist in the kernel anymore if you
>> don't modprobe kvm-pr.ko.
>
> What is a PROGRAM interrupt? :)


The thing that happens when you invoke an illegal or privileged 
instruction ;)



Alex



[PATCH kvm-unit-tests 0/3] x86: svm: minimal IOIO testing

2014-07-30 Thread Paolo Bonzini
So far the only "multi-stage" test was assembly only, so we have
to implement register save/restore around vmrun.

Paolo

Paolo Bonzini (3):
  x86: svm: load/save all GPRs
  x86: svm: initialize IO bitmap
  x86: svm: IOIO testing

 x86/svm.c | 191 --
 1 file changed, 186 insertions(+), 5 deletions(-)

-- 
1.8.3.1



[PATCH kvm-unit-tests 2/3] x86: svm: initialize IO bitmap

2014-07-30 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 x86/svm.c | 6 ++++++
 1 file changed, 6 insertions(+)
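
For illustration only (not part of the patch): a minimal sketch of the
round-up-to-4K idiom used for io_bitmap. The names PAGE_SIZE_4K and
align_up_4k are invented for this example.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 4096u  /* assumed page size */

/* Round addr up to the next 4 KiB boundary (no-op if already aligned). */
static uintptr_t align_up_4k(uintptr_t addr)
{
    return (addr + PAGE_SIZE_4K - 1) & ~(uintptr_t)(PAGE_SIZE_4K - 1);
}
```

In the worst case the rounding skips 4095 bytes, so a 16384-byte array
still leaves 12289 usable bytes after alignment, enough for three 4 KiB
pages.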

diff --git a/x86/svm.c b/x86/svm.c
index 4b7f06e..2cf5c81 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -36,6 +36,9 @@ u64 latclgi_max;
 u64 latclgi_min;
 u64 runs;
 
+u8 *io_bitmap;
+u8 io_bitmap_area[16384];
+
 static bool npt_supported(void)
 {
 return cpuid(0x8000000A).d & 1;
@@ -53,6 +56,8 @@ static void setup_svm(void)
 
 scratch_page = alloc_page();
 
+io_bitmap = (void *) (((ulong)io_bitmap_area + 4095) & ~4095);
+
 if (!npt_supported())
 return;
 
@@ -149,6 +154,7 @@ static void vmcb_ident(struct vmcb *vmcb)
 save->g_pat = rdmsr(MSR_IA32_CR_PAT);
 save->dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
 ctrl->intercept = (1ULL << INTERCEPT_VMRUN) | (1ULL << INTERCEPT_VMMCALL);
+ctrl->iopm_base_pa = virt_to_phys(io_bitmap);
 
 if (npt_supported()) {
 ctrl->nested_ctl = 1;
-- 
1.8.3.1




[PATCH kvm-unit-tests 3/3] x86: svm: IOIO testing

2014-07-30 Thread Paolo Bonzini
This tests only the bitmap handling so far; it does not cover string
instructions yet.

Signed-off-by: Paolo Bonzini 
---
 x86/svm.c | 126 ++
 1 file changed, 126 insertions(+)
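
For illustration only (not part of the patch): a small model of the
one-bit-per-port addressing the test relies on. The iopm_* helpers are
invented names, and the "any covered port intercepts" rule is how this
sketch models a multi-byte access, as the test expects.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per I/O port: byte index is port / 8, bit index is port % 8. */
static void iopm_set(uint8_t *bitmap, unsigned port)
{
    bitmap[port / 8] |= (uint8_t)(1u << (port % 8));
}

static void iopm_clear(uint8_t *bitmap, unsigned port)
{
    bitmap[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* A size-byte access is intercepted if any covered port has its bit set.
 * The caller must provide a bitmap that covers port + size - 1. */
static bool iopm_intercepts(const uint8_t *bitmap, unsigned port, unsigned size)
{
    for (unsigned i = 0; i < size; i++) {
        unsigned p = port + i;

        if (bitmap[p / 8] & (1u << (p % 8)))
            return true;
    }
    return false;
}
```

A 4-byte access at port 0xFFFF reaches bit 0x10002, which lives in byte
8192; that is why the test also touches io_bitmap[8192].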

diff --git a/x86/svm.c b/x86/svm.c
index 2cf5c81..290c33e 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -6,6 +6,7 @@
 #include "vm.h"
 #include "smp.h"
 #include "types.h"
+#include "io.h"
 
 /* for the nested page table*/
 u64 *pml4e;
@@ -505,6 +506,129 @@ static bool check_mode_switch(struct test *test)
return test->scratch == 2;
 }
 
+static void prepare_ioio(struct test *test)
+{
+test->vmcb->control.intercept |= (1ULL << INTERCEPT_IOIO_PROT);
+test->scratch = 0;
+memset(io_bitmap, 0, 8192);
+io_bitmap[8192] = 0xFF;
+}
+
+int get_test_stage(struct test *test)
+{
+barrier();
+return test->scratch;
+}
+
+void inc_test_stage(struct test *test)
+{
+barrier();
+test->scratch++;
+barrier();
+}
+
+static void test_ioio(struct test *test)
+{
+// stage 0, test IO pass
+inb(0x5000);
+outb(0x0, 0x5000);
+if (get_test_stage(test) != 0)
+goto fail;
+
+// test IO width, in/out
+io_bitmap[0] = 0xFF;
+inc_test_stage(test);
+inb(0x0);
+if (get_test_stage(test) != 2)
+goto fail;
+
+outw(0x0, 0x0);
+if (get_test_stage(test) != 3)
+goto fail;
+
+inl(0x0);
+if (get_test_stage(test) != 4)
+goto fail;
+
+// test low/high IO port
+io_bitmap[0x5000 / 8] = (1 << (0x5000 % 8));
+inb(0x5000);
+if (get_test_stage(test) != 5)
+goto fail;
+
+io_bitmap[0x9000 / 8] = (1 << (0x9000 % 8));
+inw(0x9000);
+if (get_test_stage(test) != 6)
+goto fail;
+
+// test partial pass
+io_bitmap[0x5000 / 8] = (1 << (0x5000 % 8));
+inl(0x4FFF);
+if (get_test_stage(test) != 7)
+goto fail;
+
+// test across pages
+inc_test_stage(test);
+inl(0x7FFF);
+if (get_test_stage(test) != 8)
+goto fail;
+
+inc_test_stage(test);
+io_bitmap[0x8000 / 8] = 1 << (0x8000 % 8);
+inl(0x7FFF);
+if (get_test_stage(test) != 10)
+goto fail;
+
+io_bitmap[0] = 0;
+inl(0xFFFF);
+if (get_test_stage(test) != 11)
+goto fail;
+
+io_bitmap[0] = 0xFF;
+io_bitmap[8192] = 0;
+inl(0xFFFF);
+inc_test_stage(test);
+if (get_test_stage(test) != 12)
+goto fail;
+
+return;
+
+fail:
+printf("test failure, stage %d\n", get_test_stage(test));
+test->scratch = -1;
+}
+
+static bool ioio_finished(struct test *test)
+{
+unsigned port, size;
+
+/* Only expect IOIO intercepts */
+if (test->vmcb->control.exit_code == SVM_EXIT_VMMCALL)
+return true;
+
+if (test->vmcb->control.exit_code != SVM_EXIT_IOIO)
+return true;
+
+/* one step forward */
+test->scratch += 1;
+
+port = test->vmcb->control.exit_info_1 >> 16;
+size = (test->vmcb->control.exit_info_1 >> SVM_IOIO_SIZE_SHIFT) & 7;
+
+while (size--) {
+io_bitmap[port / 8] &= ~(1 << (port & 7));
+port++;
+}
+
+return false;
+}
+
+static bool check_ioio(struct test *test)
+{
+memset(io_bitmap, 0, 8193);
+return test->scratch != -1;
+}
+
 static void prepare_asid_zero(struct test *test)
 {
 test->vmcb->control.asid = 0;
@@ -804,6 +928,8 @@ static struct test tests[] = {
   default_finished, null_check },
 { "vmrun", default_supported, default_prepare, test_vmrun,
default_finished, check_vmrun },
+{ "ioio", default_supported, prepare_ioio, test_ioio,
+   ioio_finished, check_ioio },
 { "vmrun intercept check", default_supported, prepare_no_vmrun_int,
   null_test, default_finished, check_no_vmrun_int },
 { "cr3 read intercept", default_supported, prepare_cr3_intercept,
-- 
1.8.3.1



[PATCH kvm-unit-tests 1/3] x86: svm: load/save all GPRs

2014-07-30 Thread Paolo Bonzini
The cr2 field is unused, but I prefer to keep it the same as vmx (it is
also unused there).

Signed-off-by: Paolo Bonzini 
---
 x86/svm.c | 59 ++-
 1 file changed, 54 insertions(+), 5 deletions(-)
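
For illustration only (not part of the patch): the offsets hard-coded in
the asm have to agree with the struct layout, and a compile-time check
along these lines could catch a mismatch. The name regs_layout and the
field order shown (rax, rbx, rcx, rdx, chosen to match the 0x8/0x10/0x18
offsets in SAVE_GPR_C) are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the register save area, ordered to match the
 * offsets used by the xchg sequence. */
struct regs_layout {
    uint64_t rax, rbx, rcx, rdx;
    uint64_t cr2, rbp, rsi, rdi;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rflags;
};

/* Fail the build if a field moves away from the offset the asm uses. */
_Static_assert(offsetof(struct regs_layout, rbx) == 0x08, "rbx");
_Static_assert(offsetof(struct regs_layout, rcx) == 0x10, "rcx");
_Static_assert(offsetof(struct regs_layout, rdx) == 0x18, "rdx");
_Static_assert(offsetof(struct regs_layout, rbp) == 0x28, "rbp");
_Static_assert(offsetof(struct regs_layout, rdi) == 0x38, "rdi");
_Static_assert(offsetof(struct regs_layout, r15) == 0x78, "r15");
_Static_assert(offsetof(struct regs_layout, rflags) == 0x80, "rflags");
```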

diff --git a/x86/svm.c b/x86/svm.c
index 3e45426..4b7f06e 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -174,6 +174,48 @@ static void test_thunk(struct test *test)
 asm volatile ("vmmcall" : : : "memory");
 }
 
+struct regs {
+u64 rax;
+u64 rbx;
+u64 rcx;
+u64 rdx;
+u64 cr2;
+u64 rbp;
+u64 rsi;
+u64 rdi;
+u64 r8;
+u64 r9;
+u64 r10;
+u64 r11;
+u64 r12;
+u64 r13;
+u64 r14;
+u64 r15;
+u64 rflags;
+};
+
+struct regs regs;
+
+// rax handled specially below
+
+#define SAVE_GPR_C  \
+"xchg %%rbx, regs+0x8\n\t"  \
+"xchg %%rcx, regs+0x10\n\t" \
+"xchg %%rdx, regs+0x18\n\t" \
+"xchg %%rbp, regs+0x28\n\t" \
+"xchg %%rsi, regs+0x30\n\t" \
+"xchg %%rdi, regs+0x38\n\t" \
+"xchg %%r8, regs+0x40\n\t"  \
+"xchg %%r9, regs+0x48\n\t"  \
+"xchg %%r10, regs+0x50\n\t" \
+"xchg %%r11, regs+0x58\n\t" \
+"xchg %%r12, regs+0x60\n\t" \
+"xchg %%r13, regs+0x68\n\t" \
+"xchg %%r14, regs+0x70\n\t" \
+"xchg %%r15, regs+0x78\n\t"
+
+#define LOAD_GPR_C  SAVE_GPR_C
+
 static bool test_run(struct test *test, struct vmcb *vmcb)
 {
 u64 vmcb_phys = virt_to_phys(vmcb);
@@ -184,19 +226,26 @@ static bool test_run(struct test *test, struct vmcb *vmcb)
 test->prepare(test);
 vmcb->save.rip = (ulong)test_thunk;
 vmcb->save.rsp = (ulong)(guest_stack + ARRAY_SIZE(guest_stack));
+regs.rdi = (ulong)test;
 do {
 tsc_start = rdtsc();
 asm volatile (
 "clgi \n\t"
 "vmload \n\t"
-"push %%rbp \n\t"
-"push %1 \n\t"
+"mov regs+0x80, %%r15\n\t"  // rflags
+"mov %%r15, 0x170(%0)\n\t"
+"mov regs, %%r15\n\t"   // rax
+"mov %%r15, 0x1f8(%0)\n\t"
+LOAD_GPR_C
 "vmrun \n\t"
-"pop %1 \n\t"
-"pop %%rbp \n\t"
+SAVE_GPR_C
+"mov 0x170(%0), %%r15\n\t"  // rflags
+"mov %%r15, regs+0x80\n\t"
+"mov 0x1f8(%0), %%r15\n\t"  // rax
+"mov %%r15, regs\n\t"
 "vmsave \n\t"
 "stgi"
-: : "a"(vmcb_phys), "D"(test)
+: : "a"(vmcb_phys)
 : "rbx", "rcx", "rdx", "rsi",
   "r8", "r9", "r10", "r11" , "r12", "r13", "r14", "r15",
   "memory");
-- 
1.8.3.1




[PATCH V2 1/4] x86/kvm: Resolve some missing-initializers warnings

2014-07-30 Thread Mark D Rustad
Resolve some missing-initializers warnings that appear in W=2
builds. They are resolved by adding the name as a parameter to
the macros and having the macro generate all four fields of the
structure.

Signed-off-by: Mark Rustad 
Signed-off-by: Jeff Kirsher 

---
V2: Change macro to supply all four fields instead of using a
designated initializer. Also fix up the array terminator.
---
 arch/x86/kvm/x86.c |   71 ++--
 1 file changed, 36 insertions(+), 35 deletions(-)
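
For illustration only (not part of the patch): a reduced model of the new
macro shape. All demo_* and DEMO_* names are invented stand-ins for the
kernel types; the point is only that the macro now expands to all four
fields of the entry.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct demo_vcpu {
    struct { unsigned long exits, io_exits; } stat;
};

enum demo_kind { DEMO_STAT_VM, DEMO_STAT_VCPU };

struct demo_item {
    const char *name;
    size_t offset;
    enum demo_kind kind;
    void *dentry;
};

/* The macro supplies name, offset, kind and dentry, so no field of the
 * initializer is left implicit. */
#define DEMO_VCPU_STAT(name, x) \
    name, offsetof(struct demo_vcpu, stat.x), DEMO_STAT_VCPU, NULL

static struct demo_item demo_entries[] = {
    { DEMO_VCPU_STAT("exits", exits) },
    { DEMO_VCPU_STAT("io_exits", io_exits) },
    { NULL, 0, DEMO_STAT_VM, NULL }   /* fully spelled-out terminator */
};
```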

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ef432f891d30..623aea52ceba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -82,8 +82,9 @@ u64 __read_mostly efer_reserved_bits = ~((u64)(EFER_SCE | EFER_LME | EFER_LMA));
 static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
 #endif
 
-#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
-#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+#define VM_STAT(name, x) name, offsetof(struct kvm, stat.x), KVM_STAT_VM, NULL
+#define VCPU_STAT(name, x) name, offsetof(struct kvm_vcpu, stat.x), \
+  KVM_STAT_VCPU, NULL
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu);
 static void process_nmi(struct kvm_vcpu *vcpu);
@@ -128,39 +129,39 @@ static struct kvm_shared_msrs_global __read_mostly shared_msrs_global;
 static struct kvm_shared_msrs __percpu *shared_msrs;
 
 struct kvm_stats_debugfs_item debugfs_entries[] = {
-   { "pf_fixed", VCPU_STAT(pf_fixed) },
-   { "pf_guest", VCPU_STAT(pf_guest) },
-   { "tlb_flush", VCPU_STAT(tlb_flush) },
-   { "invlpg", VCPU_STAT(invlpg) },
-   { "exits", VCPU_STAT(exits) },
-   { "io_exits", VCPU_STAT(io_exits) },
-   { "mmio_exits", VCPU_STAT(mmio_exits) },
-   { "signal_exits", VCPU_STAT(signal_exits) },
-   { "irq_window", VCPU_STAT(irq_window_exits) },
-   { "nmi_window", VCPU_STAT(nmi_window_exits) },
-   { "halt_exits", VCPU_STAT(halt_exits) },
-   { "halt_wakeup", VCPU_STAT(halt_wakeup) },
-   { "hypercalls", VCPU_STAT(hypercalls) },
-   { "request_irq", VCPU_STAT(request_irq_exits) },
-   { "irq_exits", VCPU_STAT(irq_exits) },
-   { "host_state_reload", VCPU_STAT(host_state_reload) },
-   { "efer_reload", VCPU_STAT(efer_reload) },
-   { "fpu_reload", VCPU_STAT(fpu_reload) },
-   { "insn_emulation", VCPU_STAT(insn_emulation) },
-   { "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
-   { "irq_injections", VCPU_STAT(irq_injections) },
-   { "nmi_injections", VCPU_STAT(nmi_injections) },
-   { "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
-   { "mmu_pte_write", VM_STAT(mmu_pte_write) },
-   { "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
-   { "mmu_pde_zapped", VM_STAT(mmu_pde_zapped) },
-   { "mmu_flooded", VM_STAT(mmu_flooded) },
-   { "mmu_recycled", VM_STAT(mmu_recycled) },
-   { "mmu_cache_miss", VM_STAT(mmu_cache_miss) },
-   { "mmu_unsync", VM_STAT(mmu_unsync) },
-   { "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
-   { "largepages", VM_STAT(lpages) },
-   { NULL }
+   { VCPU_STAT("pf_fixed", pf_fixed) },
+   { VCPU_STAT("pf_guest", pf_guest) },
+   { VCPU_STAT("tlb_flush", tlb_flush) },
+   { VCPU_STAT("invlpg", invlpg) },
+   { VCPU_STAT("exits", exits) },
+   { VCPU_STAT("io_exits", io_exits) },
+   { VCPU_STAT("mmio_exits", mmio_exits) },
+   { VCPU_STAT("signal_exits", signal_exits) },
+   { VCPU_STAT("irq_window", irq_window_exits) },
+   { VCPU_STAT("nmi_window", nmi_window_exits) },
+   { VCPU_STAT("halt_exits", halt_exits) },
+   { VCPU_STAT("halt_wakeup", halt_wakeup) },
+   { VCPU_STAT("hypercalls", hypercalls) },
+   { VCPU_STAT("request_irq", request_irq_exits) },
+   { VCPU_STAT("irq_exits", irq_exits) },
+   { VCPU_STAT("host_state_reload", host_state_reload) },
+   { VCPU_STAT("efer_reload", efer_reload) },
+   { VCPU_STAT("fpu_reload", fpu_reload) },
+   { VCPU_STAT("insn_emulation", insn_emulation) },
+   { VCPU_STAT("insn_emulation_fail", insn_emulation_fail) },
+   { VCPU_STAT("irq_injections", irq_injections) },
+   { VCPU_STAT("nmi_injections", nmi_injections) },
+   { VM_STAT("mmu_shadow_zapped", mmu_shadow_zapped) },
+   { VM_STAT("mmu_pte_write", mmu_pte_write) },
+   { VM_STAT("mmu_pte_updated", mmu_pte_updated) },
+   { VM_STAT("mmu_pde_zapped", mmu_pde_zapped) },
+   { VM_STAT("mmu_flooded", mmu_flooded) },
+   { VM_STAT("mmu_recycled", mmu_recycled) },
+   { VM_STAT("mmu_cache_miss", mmu_cache_miss) },
+   { VM_STAT("mmu_unsync", mmu_unsync) },
+   { VM_STAT("remote_tlb_flush", remote_tlb_flush) },
+   { VM_STAT("largepages", lpages) },
+   { NULL, 0, 0, NULL }
 };
 
 u64 __read_mostly host_xcr0;
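
For illustration, the macro change above can be sketched outside the kernel
as follows. This is a minimal sketch, not the kernel code: every demo_ and
DEMO_ name is hypothetical, and the four-field layout (name, offset, kind,
dentry) is only an assumption based on the "all four fields" wording and the
{ NULL, 0, 0, NULL } terminator in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct kvm / struct kvm_vcpu and their stats. */
struct demo_vm   { struct { unsigned long remote_tlb_flush; } stat; };
struct demo_vcpu { struct { unsigned long exits, halt_exits; } stat; };

enum demo_stat_kind { DEMO_STAT_VM, DEMO_STAT_VCPU };

/* Four fields, assumed to mirror the shape of the table entry in the patch. */
struct demo_stats_item {
	const char *name;
	int offset;
	enum demo_stat_kind kind;
	void *dentry;
};

/* Each macro expands to all four initializers, so no field is left implicit. */
#define DEMO_VM_STAT(name, x) \
	name, (int)offsetof(struct demo_vm, stat.x), DEMO_STAT_VM, NULL
#define DEMO_VCPU_STAT(name, x) \
	name, (int)offsetof(struct demo_vcpu, stat.x), DEMO_STAT_VCPU, NULL

static const struct demo_stats_item demo_entries[] = {
	{ DEMO_VCPU_STAT("exits", exits) },
	{ DEMO_VCPU_STAT("halt_exits", halt_exits) },
	{ DEMO_VM_STAT("remote_tlb_flush", remote_tlb_flush) },
	{ NULL, 0, 0, NULL }	/* terminator with every field spelled out */
};

/* Walk the table up to the NULL-name terminator; return -1 if not found. */
static int demo_find_offset(const char *name)
{
	const struct demo_stats_item *p;

	for (p = demo_entries; p->name; p++)
		if (strcmp(p->name, name) == 0)
			return p->offset;
	return -1;
}
```

Because the name is now a macro parameter, each table row supplies every
member explicitly, which is the kind of initializer that a
missing-initializers warning does not complain about.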


[PATCH V2 3/4] x86/kvm: Resolve shadow warnings in macro expansion

2014-07-30 Thread Mark D Rustad
Resolve shadow warnings that appear in W=2 builds. Instead of
using ret to hold the return pointer, save the length in a new
variable saved_len and compute the pointer on exit. This also
resolves a very technical error, in that ret was declared as
a const char *, when it really was a char * const, which
theoretically could have allowed the compiler to do something
wrong.

Signed-off-by: Mark Rustad 
Signed-off-by: Jeff Kirsher 

---
Changes in V2:
- Instead of renaming all inner variables, just delete the
  ret variable in favor of the new saved_len variable.
---
 arch/x86/kvm/mmutrace.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index 9d2e0ffcb190..5aaf35641768 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -22,7 +22,7 @@
    __entry->unsync = sp->unsync;
 
 #define KVM_MMU_PAGE_PRINTK() ({   \
-   const char *ret = p->buffer + p->len;   \
+   const u32 saved_len = p->len;   \
    static const char *access_str[] = { \
        "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux"  \
    };  \
@@ -41,7 +41,7 @@
 role.nxe ? "" : "!",   \
 __entry->root_count,   \
 __entry->unsync ? "unsync" : "sync", 0);   \
-   ret;\
+   p->buffer + saved_len;  \
 })
 
 #define kvm_mmu_trace_pferr_flags   \
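
The saved-length idea in the patch above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: struct
demo_seq and both functions are hypothetical stand-ins for the trace buffer
(`p`) and the macro body, and a function is used instead of a statement
expression, so the shadowing itself is not reproduced here.

```c
#include <string.h>

/* Hypothetical stand-in for the trace sequence buffer (`p` in the macro). */
struct demo_seq {
	char buffer[128];
	unsigned int len;
};

/* Append text at the current end of the buffer; truncates when full. */
static void demo_seq_puts(struct demo_seq *p, const char *s)
{
	size_t room = sizeof(p->buffer) - p->len - 1;
	size_t n = strlen(s);

	if (n > room)
		n = room;
	memcpy(p->buffer + p->len, s, n);
	p->len += (unsigned int)n;
	p->buffer[p->len] = '\0';
}

/*
 * Remember where this record starts as a length rather than a pointer,
 * and compute the pointer only on exit.
 */
static const char *demo_page_printk(struct demo_seq *p, int unsync)
{
	const unsigned int saved_len = p->len;

	demo_seq_puts(p, "sp ");
	demo_seq_puts(p, unsync ? "unsync" : "sync");
	return p->buffer + saved_len;
}
```

With a buffer that already holds "hdr:", demo_page_printk() returns a pointer
to the text it appended, while the buffer as a whole keeps the earlier
contents in front of it.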



hang after seabios

2014-07-30 Thread Zetan Drableg
Hi,
Locally I have a Supermicro server running OEL 6.5 with KVM; it can run
virt-sysprep and libguestfs-test-tool with no problem.

Linux 2.6.39-400.215.6.el6uek.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64
seabios-0.6.1.2-28.el6.x86_64

However I have a server in a datacenter (Sun X4-2) running the same
versions, and libguestfs-test-tool hangs when launching KVM.

virt-sysprep also hangs the same way when trying to access a disk image, so
I'm using libguestfs-test-tool as my example:


   [root@kvm]# libguestfs-test-tool

*IMPORTANT NOTICE
*
* When reporting bugs, include the COMPLETE, UNEDITED
* output below in your bug report.
*

   LIBGUESTFS_APPEND=edd=off
   PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
   SELinux: Enforcing
   library version: 1.20.11rhel=6,release=2.el6
   guestfs_get_append: edd=off
   guestfs_get_attach_method: appliance
   guestfs_get_autosync: 1
   guestfs_get_cachedir: /var/tmp
   guestfs_get_direct: 0
   guestfs_get_memsize: 500
   guestfs_get_network: 0
   guestfs_get_path: /usr/lib64/guestfs
   guestfs_get_pgroup: 0
   guestfs_get_qemu: /usr/libexec/qemu-kvm
   guestfs_get_recovery_proc: 1
   guestfs_get_selinux: 0
   guestfs_get_smp: 1
   guestfs_get_tmpdir: /tmp
   guestfs_get_trace: 0
   guestfs_get_verbose: 1
   host_cpu: x86_64
   Launching appliance, timeout set to 600 seconds.
   libguestfs: launch: attach-method=appliance
   libguestfs: launch: tmpdir=/tmp/libguestfspx9994
   libguestfs: launch: umask=0077
   libguestfs: launch: euid=0
   libguestfs: command: run: febootstrap-supermin-helper
   libguestfs: command: run: \ --verbose
   libguestfs: command: run: \ -f checksum
   libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
   libguestfs: command: run: \ x86_64
   supermin helper [0ms] whitelist = (not specified), host_cpu = x86_64, kernel = (null), initrd = (null), appliance = (null)
   supermin helper [0ms] inputs[0] = /usr/lib64/guestfs/supermin.d
   checking modpath /lib/modules/2.6.32-279.el6.x86_64 is a directory
   picked vmlinuz-2.6.32-279.el6.x86_64 because modpath /lib/modules/2.6.32-279.el6.x86_64 exists
   checking modpath /lib/modules/2.6.39-200.24.1.el6uek.x86_64 is a directory
   picked vmlinuz-2.6.39-200.24.1.el6uek.x86_64 because modpath /lib/modules/2.6.39-200.24.1.el6uek.x86_64 exists
   supermin helper [0ms] finished creating kernel
   supermin helper [0ms] visiting /usr/lib64/guestfs/supermin.d
   supermin helper [0ms] visiting /usr/lib64/guestfs/supermin.d/base.img
   supermin helper [0ms] visiting /usr/lib64/guestfs/supermin.d/daemon.img
   supermin helper [0ms] visiting /usr/lib64/guestfs/supermin.d/hostfiles
   supermin helper [00020ms] visiting /usr/lib64/guestfs/supermin.d/init.img
   supermin helper [00020ms] visiting /usr/lib64/guestfs/supermin.d/udev-rules.img
   supermin helper [00020ms] adding kernel modules
   supermin helper [00051ms] finished creating appliance
   libguestfs: checksum of existing appliance: 4805d2b09b84366bd753e62706693476b59c3971f4c1808739426b92f8baa3bf
   libguestfs: [00054ms] begin testing qemu features
   libguestfs: command: run: /usr/libexec/qemu-kvm
   libguestfs: command: run: \ -nographic
   libguestfs: command: run: \ -help
   libguestfs: command: run: /usr/libexec/qemu-kvm
   libguestfs: command: run: \ -nographic
   libguestfs: command: run: \ -version
   libguestfs: qemu version 0.12
   libguestfs: command: run: /usr/libexec/qemu-kvm
   libguestfs: command: run: \ -nographic
   libguestfs: command: run: \ -machine accel=kvm:tcg
   libguestfs: command: run: \ -device ?
   libguestfs: [00182ms] finished testing qemu features
   libguestfs: accept_from_daemon: 0x2266e00 g->state = 1
   [00183ms] /usr/libexec/qemu-kvm \
   -global virtio-blk-pci.scsi=off \
   -nodefconfig \
   -nodefaults \
   -nographic \
   -machine accel=kvm:tcg \
   -cpu host,+kvmclock \
   -m 500 \
   -no-reboot \
   -kernel /var/tmp/.guestfs-0/kernel.47903 \
   -initrd /var/tmp/.guestfs-0/initrd.47903 \
   -device virtio-scsi-pci,id=scsi \
   -drive file=/tmp/libguestfs-test-tool-sda-Iakpwe,cache=none,format=raw,id=hd0,if=none \
   -device scsi-hd,drive=hd0 \
   -drive file=/var/tmp/.guestfs-0/root.47903,snapshot=on,id=appliance,if=none,cache=unsafe \
   -device scsi-hd,drive=appliance \
   -device virtio-serial \
   -serial stdio \
   -device sga \
   -chardev socket,path=/tmp/libguestfspx9994/guestfsd.sock,id=channel0 \
   -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
   -append 'panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0
 



Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception

2014-07-30 Thread Scott Wood
On Wed, 2014-07-30 at 01:43 -0500, Bhushan Bharat-R65777 wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Tuesday, July 29, 2014 3:58 AM
> > To: Bhushan Bharat-R65777
> > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder Stuart-B08248
> > Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception
> > 
> >  Userspace might be interested in
> > the raw value,
> 
> With the current design, If userspace is interested then it will not
> get the DBSR.

Oh, because DBSR isn't currently implemented in sregs or one reg?

>  But why userspace will be interested?

Do you expose all of the hardware's debugging features in your
high-level interface?

> > plus it's a change from the current API semantics.
> 
> Can you please let us know how ?

It looked like it was removing dbsr visibility and the requirement for
userspace to clear dbsr.  I guess the old way was that the value in
vcpu->arch.dbsr didn't matter until the next debug exception, when it
would be overwritten by the new SPRN_DBSR?

> > > + case SPRN_DBCR2:
> > > + /*
> > > +  * If userspace is debugging guest then guest
> > > +  * can not access debug registers.
> > > +  */
> > > + if (vcpu->guest_debug)
> > > + break;
> > > +
> > > + debug_inst = true;
> > > + vcpu->arch.dbg_reg.dbcr2 = spr_val;
> > > + vcpu->arch.shadow_dbg_reg.dbcr2 = spr_val;
> > >   break;
> > 
> > In what circumstances can the architected and shadow registers differ?
> 
> As of now they are same. But I think that if we want to implement other 
> features like "Freeze Timer (FT)" then they can be different.

I don't think we can possibly implement Freeze Timer.
 
> > >   case SPRN_DBSR:
> > > + /*
> > > +  * If userspace is debugging guest then guest
> > > +  * can not access debug registers.
> > > +  */
> > > + if (vcpu->guest_debug)
> > > + break;
> > > +
> > >   vcpu->arch.dbsr &= ~spr_val;
> > > + if (vcpu->arch.dbsr == 0)
> > > + kvmppc_core_dequeue_debug(vcpu);
> > >   break;
> > 
> > Not all DBSR bits cause an exception, e.g. IDE and MRR.
> 
> I am not sure what we should in that case ?
>
> As we are currently emulating a subset of debug events (IAC, DAC, IC,
> BT and TIE --- DBCR0 emulation) then we should expose status of those
> events in guest dbsr and rest should be cleared ?

I'm not saying they need to be exposed to the guest, but I don't see
where you filter out bits like these.

> > > @@ -273,6 +397,10 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
> > >   emulated = EMULATE_FAIL;
> > >   }
> > >
> > > + if (debug_inst) {
> > > + switch_booke_debug_regs(&vcpu->arch.shadow_dbg_reg);
> > > + current->thread.debug = vcpu->arch.shadow_dbg_reg;
> > > + }
> > 
> > Could you explain what's going on with regard to copying the registers
> > into current->thread.debug?  Why is it done after loading the registers
> > into the hardware (is there a race if we get preempted in the middle)?
> 
> Yes, and this was something I was not clear when writing this code.
> Should we have preempt disable-enable around this.

Can they be reordered instead?

-Scott
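
The reordering question above can be illustrated with a toy model. This is a
sketch under loud assumptions, not kernel code: demo_hw stands in for the
hardware debug registers, demo_thread for current->thread.debug, and
demo_preempt_and_resume() assumes that being switched out and back in reloads
the hardware from the thread copy. All demo_ names are hypothetical.

```c
/* Toy register set; the two fields are arbitrary placeholders. */
struct demo_dbg {
	unsigned int dbcr0;
	unsigned int dbcr2;
};

static struct demo_dbg demo_hw;		/* models the hardware debug registers */
static struct demo_dbg demo_thread;	/* models current->thread.debug */

static void demo_reset(void)
{
	demo_hw = (struct demo_dbg){ 0, 0 };
	demo_thread = (struct demo_dbg){ 0, 0 };
}

/* Assumption: resuming after preemption reloads hardware from the thread copy. */
static void demo_preempt_and_resume(void)
{
	demo_hw = demo_thread;
}

/* Order in the quoted hunk: load hardware first, update the copy second. */
static void demo_update_hw_first(const struct demo_dbg *shadow, int preempt)
{
	demo_hw = *shadow;
	if (preempt)
		demo_preempt_and_resume();	/* hardware falls back to stale values */
	demo_thread = *shadow;
}

/* Reordered: update the copy first, then load hardware. */
static void demo_update_copy_first(const struct demo_dbg *shadow, int preempt)
{
	demo_thread = *shadow;
	if (preempt)
		demo_preempt_and_resume();	/* hardware already gets the new values */
	demo_hw = *shadow;
}
```

In this model, a preemption between the two steps leaves stale values in
demo_hw only with the hardware-first order; with the copy-first order both
end up holding the new values. Whether that removes the need for a preempt
disable/enable pair in the real code is not something this sketch can show.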




RE: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception

2014-07-30 Thread bharat.bhus...@freescale.com


> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Thursday, July 31, 2014 8:18 AM
> To: Bhushan Bharat-R65777
> Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder Stuart-B08248
> Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception
> 
> On Wed, 2014-07-30 at 01:43 -0500, Bhushan Bharat-R65777 wrote:
> >
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, July 29, 2014 3:58 AM
> > > To: Bhushan Bharat-R65777
> > > Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder Stuart-B08248
> > > Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception
> > >
> > >  Userspace might be interested in
> > > the raw value,
> >
> > With the current design, If userspace is interested then it will not
> > get the DBSR.
> 
> Oh, because DBSR isn't currently implemented in sregs or one reg?

That is one reason. Another is that if we give userspace visibility of dbsr,
then userspace has to clear dbsr when handling KVM_EXIT_DEBUG. We think
there is no gain in doing that, because
" 
- QEMU cannot inject a debug interrupt into the guest (it does not know whether
the guest is able to handle debug interrupts; MSR_DE), so it will always clear DBSR.
- If QEMU always has to clear DBSR when handling KVM_EXIT_DEBUG, then clearing
dbsr in the kernel avoids doing it via SET_SREGS/set_one_reg().
" This makes dbsr not visible to userspace.

Also, the clearing of dbsr should not be part of this patch; it should be a
separate patch. I will do that in the next version.

> 
> >  But why userspace will be interested?
> 
> Do you expose all of the hardware's debugging features in your high-level
> interface?

We support h/w breakpoints, watchpoints and IC (single stepping), and the status
in the userspace exit provides all the information userspace requires.

> 
> > > plus it's a change from the current API semantics.
> >
> > Can you please let us know how ?
> 
> It looked like it was removing dbsr visibility and the requirement for
> userspace to clear dbsr.  I guess the old way was that the value in
> vcpu->arch.dbsr didn't matter until the next debug exception, when it
> would be overwritten by the new SPRN_DBSR?

But that means the old dbsr will be visible to userspace, which is even worse
than it not being visible, no?

Also, this can lead to the old dbsr being visible to the guest once userspace
releases the debug resources, but this can be solved by clearing dbsr in
kvm_arch_vcpu_ioctl_set_guest_debug() -> "if (!(dbg->control &
KVM_GUESTDBG_ENABLE)) { }".

> 
> > > > +   case SPRN_DBCR2:
> > > > +   /*
> > > > +* If userspace is debugging guest then guest
> > > > +* can not access debug registers.
> > > > +*/
> > > > +   if (vcpu->guest_debug)
> > > > +   break;
> > > > +
> > > > +   debug_inst = true;
> > > > +   vcpu->arch.dbg_reg.dbcr2 = spr_val;
> > > > +   vcpu->arch.shadow_dbg_reg.dbcr2 = spr_val;
> > > > break;
> > >
> > > In what circumstances can the architected and shadow registers differ?
> >
> > As of now they are same. But I think that if we want to implement other
> > features like "Freeze Timer (FT)" then they can be different.
> 
> I don't think we can possibly implement Freeze Timer.

Maybe, but in my opinion we should keep this open.

> 
> > > > case SPRN_DBSR:
> > > > +   /*
> > > > +* If userspace is debugging guest then guest
> > > > +* can not access debug registers.
> > > > +*/
> > > > +   if (vcpu->guest_debug)
> > > > +   break;
> > > > +
> > > > vcpu->arch.dbsr &= ~spr_val;
> > > > +   if (vcpu->arch.dbsr == 0)
> > > > +   kvmppc_core_dequeue_debug(vcpu);
> > > > break;
> > >
> > > Not all DBSR bits cause an exception, e.g. IDE and MRR.
> >
> > I am not sure what we should in that case ?
> >
> > As we are currently emulating a subset of debug events (IAC, DAC, IC,
> > BT and TIE --- DBCR0 emulation) then we should expose status of those
> > events in guest dbsr and rest should be cleared ?
> 
> I'm not saying they need to be exposed to the guest, but I don't see where you
> filter out bits like these.

I am trying to work out which bits should be filtered out: all bits except IACx,
DACx, IC, BT and TIE (the same event set that is filtered when setting DBCR0)?

i.e. should IDE, UDE, MRR, IRPT, RET, CIRPT and CRET be filtered out?
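
The filtering being discussed could look roughly like the sketch below. It is
only an illustration: the DEMO_DBSR_* bit positions are made up for this
example and are not the real DBSR layout, struct demo_dbg_vcpu and both
functions are hypothetical, and the emulated set (IACx, DACx, IC, BT, TIE) is
taken from the wording of this thread rather than from the patch.

```c
#include <stdint.h>

/* Made-up bit positions for this sketch only; NOT the real DBSR layout. */
#define DEMO_DBSR_IDE	(1u << 0)
#define DEMO_DBSR_MRR	(1u << 1)
#define DEMO_DBSR_IC	(1u << 2)
#define DEMO_DBSR_BT	(1u << 3)
#define DEMO_DBSR_IAC1	(1u << 4)
#define DEMO_DBSR_DAC1	(1u << 5)
#define DEMO_DBSR_TIE	(1u << 6)

/* Events assumed to be emulated for the guest; everything else is dropped. */
#define DEMO_DBSR_EMULATED \
	(DEMO_DBSR_IC | DEMO_DBSR_BT | DEMO_DBSR_IAC1 | \
	 DEMO_DBSR_DAC1 | DEMO_DBSR_TIE)

struct demo_dbg_vcpu {
	uint32_t dbsr;		/* guest-visible status */
	int debug_queued;	/* models a pending debug exception */
};

/* Record a hardware status for the guest, keeping only emulated events. */
static void demo_set_guest_dbsr(struct demo_dbg_vcpu *v, uint32_t hw_dbsr)
{
	v->dbsr = hw_dbsr & DEMO_DBSR_EMULATED;
	v->debug_queued = (v->dbsr != 0);
}

/* Guest mtspr to DBSR: writing 1 clears a bit, as in the quoted hunk. */
static void demo_mtspr_dbsr(struct demo_dbg_vcpu *v, uint32_t spr_val)
{
	v->dbsr &= ~spr_val;
	if (v->dbsr == 0)
		v->debug_queued = 0;
}
```

With the mask applied when the status is recorded, a bit such as IDE or MRR
never reaches the guest copy, so the dbsr == 0 test used for dequeueing only
has to consider events that can actually raise an exception.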

> 
> > > > @@ -273,6 +397,10 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
> > > > emulated = EMULATE_FAIL;
> > > > }
> > > >
> > > > +   if (debug_inst) {
> > > > +   switch_booke_debug_regs(&vcpu->arch.shadow_dbg_reg);
> > > > +   current->thread.debug = vcpu->arch.shadow_dbg_reg;
> >