date:20140801

Re: [PATCH] KVM: nVMX: nested TPR shadow/threshold emulation

2014-08-01 Thread Paolo Bonzini

On 01/08/2014 02:57, Zhang, Yang Z wrote:
>> TPR_THRESHOLD will likely be written as zero, but the processor will
>> never use it anyway.  It's just a small optimization because
>> nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW) will almost always be true.
>
> Theoretically, you are right. But we should not expect all VMMs
> to follow it. It is not worth violating the SDM just to save two or
> three instructions' cost.

Yes, you do need an if (cpu_has_vmx_tpr_shadow()) around the
vmcs_write32.  But still, checking nested_cpu_has is not strictly
necessary.  Right now they both are a single AND, but I have plans to
change all of the cpu_has_*() checks to static keys.

Paolo
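
A minimal sketch of the guard discussed in this reply, written as a standalone user-space model rather than kernel code. The struct `fake_cpu` and the helper `write_tpr_threshold()` are hypothetical stand-ins for cpu_has_vmx_tpr_shadow() and vmcs_write32(); only the shape of the check follows the discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the host CPU and one VMCS field; not kernel code. */
struct fake_cpu {
	bool tpr_shadow_supported;	/* stands in for cpu_has_vmx_tpr_shadow() */
	uint32_t tpr_threshold;		/* stands in for the TPR_THRESHOLD field */
	unsigned int writes;		/* counts writes that reached the field */
};

/*
 * Write the threshold only when the host supports the TPR shadow.
 * The nested (vmcs12) control is deliberately not checked: per the
 * discussion, the processor ignores the field when the control is off.
 */
static bool write_tpr_threshold(struct fake_cpu *cpu, uint32_t value)
{
	if (!cpu->tpr_shadow_supported)
		return false;
	cpu->tpr_threshold = value;
	cpu->writes++;
	return true;
}
```

In this model, a host without TPR shadow support never sees the write, which is the property the host-side check is meant to provide.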
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH] KVM: nVMX: nested TPR shadow/threshold emulation

2014-08-01 Thread Zhang, Yang Z

Paolo Bonzini wrote on 2014-08-01:
> On 01/08/2014 02:57, Zhang, Yang Z wrote:
>>> TPR_THRESHOLD will likely be written as zero, but the processor
>>> will never use it anyway.  It's just a small optimization because
>>> nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW) will almost always
>>> be true.
>>
>> Theoretically, you are right. But we should not expect all VMMs
>> to follow it. It is not worth violating the SDM just to save two or
>> three instructions' cost.
>
> Yes, you do need an if (cpu_has_vmx_tpr_shadow()) around the
> vmcs_write32.  But still, checking nested_cpu_has is not strictly necessary.
> Right now they both are a single AND, but I have plans to change all
> of the cpu_has_*() checks to static keys.

See v2 patch. It isn't a problem anymore.

>
> Paolo


Best regards,
Yang



Re: Integrity in untrusted environments

2014-08-01 Thread Paolo Bonzini

On 31/07/2014 23:25, Shiva V wrote:
> Hello,
>  I am exploring ideas to implement a service inside a virtual machine on
> untrusted hypervisors under current cloud infrastructures.
>  Particularly, I am interested in how one can verify the integrity of the
> service in an environment where the hypervisor is not trusted. This is my setup.
>
> 1. I have two virtual machines. (Normal client VMs).
> 2. VM-A is executing a service and VM-B wants to verify its integrity.
> 3. Both are executing on an untrusted hypervisor.
>
> Though Intel SGX will solve this by using the concept of enclaves, it's not
> publicly available yet.
>
> One could also use SMM to verify the integrity. But since this is a time-based
> approach, one could easily exploit the time window in between.
>
> I was drilling down into this idea: we know the Write xor Execute memory
> protection scheme. Using this idea, if we could lock down the VM-A memory pages
> where the service is running and also the corresponding page-table entries, then
> have handler code that temporarily unlocks them for legitimate updates, then
> one could verify the integrity of the running service.

You can make a malicious hypervisor that makes all executable pages also
writable, but hides that fact from the running process.  But really, if you
control the hypervisor you can just write to guest memory as you wish.

SMM will be emulated by the hypervisor.

If the hypervisor is untrusted, you cannot solve _everything_.  For the
third time, what attacks are you trying to protect from?

Paolo

[PATCH v2] KVM: nVMX: nested TPR shadow/threshold emulation

2014-08-01 Thread Wanpeng Li

This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=61411

The TPR shadow/threshold feature is important for speeding up Windows guests.
Besides, it is a required feature for certain VMMs.

We map the virtual APIC page address and TPR threshold from the L1 VMCS. If
a TPR_BELOW_THRESHOLD VM exit is triggered by the L2 guest and L1 is interested
in it, we inject it into the L1 VMM for handling.

Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
---
v1 -> v2:
 * don't take L0's virtualize APIC accesses setting into account
 * virtual_apic_page does exactly the same thing that is done for apic_access_page
 * add the TPR threshold field to the read-write fields for shadow VMCS

 arch/x86/kvm/vmx.c | 33 +++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a3845b8..0e6e95e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -379,6 +379,7 @@ struct nested_vmx {
 	 * we must keep them pinned while L2 runs.
 	 */
 	struct page *apic_access_page;
+	struct page *virtual_apic_page;
 	u64 msr_ia32_feature_control;
 
 	struct hrtimer preemption_timer;
@@ -533,6 +534,7 @@ static int max_shadow_read_only_fields =
 	ARRAY_SIZE(shadow_read_only_fields);
 
 static unsigned long shadow_read_write_fields[] = {
+	TPR_THRESHOLD,
 	GUEST_RIP,
 	GUEST_RSP,
 	GUEST_CR0,
@@ -2331,7 +2333,7 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 		CPU_BASED_MOV_DR_EXITING | CPU_BASED_UNCOND_IO_EXITING |
 		CPU_BASED_USE_IO_BITMAPS | CPU_BASED_MONITOR_EXITING |
 		CPU_BASED_RDPMC_EXITING | CPU_BASED_RDTSC_EXITING |
-		CPU_BASED_PAUSE_EXITING |
+		CPU_BASED_PAUSE_EXITING | CPU_BASED_TPR_SHADOW |
 		CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
 	/*
 	 * We can allow some features even when not supported by the
@@ -6149,6 +6151,10 @@ static void free_nested(struct vcpu_vmx *vmx)
 		nested_release_page(vmx->nested.apic_access_page);
 		vmx->nested.apic_access_page = 0;
 	}
+	if (vmx->nested.virtual_apic_page) {
+		nested_release_page(vmx->nested.virtual_apic_page);
+		vmx->nested.virtual_apic_page = 0;
+	}
 
 	nested_free_all_saved_vmcss(vmx);
 }
@@ -6937,7 +6943,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	case EXIT_REASON_MCE_DURING_VMENTRY:
 		return 0;
 	case EXIT_REASON_TPR_BELOW_THRESHOLD:
-		return 1;
+		return nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW);
 	case EXIT_REASON_APIC_ACCESS:
 		return nested_cpu_has2(vmcs12,
 			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
@@ -7058,6 +7064,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
+	if (is_guest_mode(vcpu))
+		return;
+
 	if (irr == -1 || tpr < irr) {
 		vmcs_write32(TPR_THRESHOLD, 0);
 		return;
@@ -8025,6 +8034,22 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
 	exec_control &= ~CPU_BASED_TPR_SHADOW;
 	exec_control |= vmcs12->cpu_based_vm_exec_control;
+
+	if (exec_control & CPU_BASED_TPR_SHADOW) {
+		if (vmx->nested.virtual_apic_page)
+			nested_release_page(vmx->nested.virtual_apic_page);
+		vmx->nested.virtual_apic_page =
+		   nested_get_page(vcpu, vmcs12->virtual_apic_page_addr);
+		if (!vmx->nested.virtual_apic_page)
+			exec_control &=
+				~CPU_BASED_TPR_SHADOW;
+		else
+			vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
+				page_to_phys(vmx->nested.virtual_apic_page));
+
+		vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
+	}
+
 	/*
 	 * Merging of IO and MSR bitmaps not currently supported.
 	 * Rather, exit every time.
@@ -8793,6 +8818,10 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		nested_release_page(vmx->nested.apic_access_page);
 		vmx->nested.apic_access_page = 0;
 	}
+	if (vmx->nested.virtual_apic_page) {
+		nested_release_page(vmx->nested.virtual_apic_page);
+		vmx->nested.virtual_apic_page = 0;
+	}
 
 	/*
 	 * Exiting from L2 to L1, we're now back to L1 which thinks it just
-- 
1.9.1
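
The two behavioural changes in this patch (reflecting the TPR_BELOW_THRESHOLD exit to L1 only when L1 enabled the TPR shadow, and leaving TPR_THRESHOLD alone while L2 runs) can be sketched as a standalone model. The struct `mini_vmcs12` and both helper names are hypothetical; this is an illustration, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 21 of the primary processor-based VM-execution controls. */
#define CPU_BASED_TPR_SHADOW 0x00200000u

/* Hypothetical, trimmed-down stand-in for struct vmcs12. */
struct mini_vmcs12 {
	uint32_t cpu_based_vm_exec_control;
	uint32_t tpr_threshold;
};

/*
 * Mirrors the patched case in nested_vmx_exit_handled(): the exit is
 * reflected to L1 only if L1 asked for the TPR shadow.
 */
static bool tpr_exit_goes_to_l1(const struct mini_vmcs12 *v)
{
	return (v->cpu_based_vm_exec_control & CPU_BASED_TPR_SHADOW) != 0;
}

/*
 * Mirrors the early return added to update_cr8_intercept(): while L2
 * runs, L0 does not overwrite the threshold that came from vmcs12.
 */
static bool l0_updates_threshold(bool in_guest_mode)
{
	return !in_guest_mode;
}
```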


[PATCH 1/2] KVM: nVMX: Fix nested vmexit ack intr before load vmcs01

2014-08-01 Thread Wanpeng Li

An external interrupt causes an L1 vmexit with reason "external interrupt" while
L2 is running. L1 then picks up the interrupt through vmcs12 if it has set the
ack interrupt bit. Commit 77b0f5d (KVM: nVMX: Ack and write vector info to
intr_info if L1 asks us to) gets the interrupt that belongs to L1 before loading
vmcs01, which is wrong. In particular, this makes the L1 ack behave weirdly with
APICv, since APICv is for L1 rather than L2. This patch fixes it by acking the
interrupt after loading vmcs01.

Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
---
 arch/x86/kvm/vmx.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e618f34..b8122b3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8754,14 +8754,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 		       exit_qualification);
 
-	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
-	    && nested_exit_intr_ack_set(vcpu)) {
-		int irq = kvm_cpu_get_interrupt(vcpu);
-		WARN_ON(irq < 0);
-		vmcs12->vm_exit_intr_info = irq |
-			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
-	}
-
 	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
 				       vmcs12->exit_qualification,
 				       vmcs12->idt_vectoring_info_field,
@@ -8771,6 +8763,14 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 
 	vmx_load_vmcs01(vcpu);
 
+	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+	    && nested_exit_intr_ack_set(vcpu)) {
+		int irq = kvm_cpu_get_interrupt(vcpu);
+		WARN_ON(irq < 0);
+		vmcs12->vm_exit_intr_info = irq |
+			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
+	}
+
 	vm_entry_controls_init(vmx, vmcs_read32(VM_ENTRY_CONTROLS));
 	vm_exit_controls_init(vmx, vmcs_read32(VM_EXIT_CONTROLS));
 	vmx_segment_cache_clear(vmx);
-- 
1.9.1


[PATCH 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use

2014-08-01 Thread Wanpeng Li

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), the "acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM asks the APIC for the interrupt vector during a
nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set.  With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:

Call Trace:
 [81493563] dump_stack+0x49/0x5e
 [8103f0eb] warn_slowpath_common+0x7c/0x96
 [a059709a] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [8103f11a] warn_slowpath_null+0x15/0x17
 [a059709a] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [a0594295] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [a0537931] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [a05972ec] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [a051ebe9] inject_pending_event+0xd0/0x16e [kvm]
 [a051efa0] vcpu_enter_guest+0x319/0x704 [kvm]

With APIC-v enabled, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, an external interrupt causes an L1 vmexit with
reason "external interrupt", and L1 then picks up the interrupt through
vmcs12. When L1 acks the interrupt, the APIC-v hardware still performs the
vEOI update, because APIC-v is enabled while L1 is running. The problem
is that the interrupt was not delivered through the APIC-v hardware, so
SVI/RVI/vPPR are not set, yet the hardware requires them when doing the vEOI
update. The solution is that, when L1 picks up the interrupt from vmcs12,
the hypervisor updates SVI/RVI/vPPR so that the following vEOI and vPPR
updates work correctly.

Also, since the interrupt is delivered through vmcs12, the APIC-v hardware
will not clear vIRR, so the hypervisor needs to clear it before L1 runs.

Suggested-by: Paolo Bonzini pbonz...@redhat.com
Suggested-by: Zhang, Yang Z yang.z.zh...@intel.com
Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
---
 arch/x86/kvm/lapic.c | 18 ++
 arch/x86/kvm/lapic.h |  1 +
 arch/x86/kvm/vmx.c   | 10 ++
 3 files changed, 29 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3855103..06942b9 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -534,6 +534,24 @@ static void apic_set_tpr(struct kvm_lapic *apic, u32 tpr)
 	apic_update_ppr(apic);
 }
 
+int kvm_lapic_ack_apicv(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	int vec;
+
+	vec = kvm_apic_has_interrupt(vcpu);
+
+	if (vec == -1)
+		return vec;
+
+	apic_set_vector(vec, apic->regs + APIC_ISR);
+	apic_update_ppr(apic);
+	apic_clear_vector(vec, apic->regs + APIC_IRR);
+
+	return vec;
+}
+EXPORT_SYMBOL_GPL(kvm_lapic_ack_apicv);
+
 int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest)
 {
 	return dest == 0xff || kvm_apic_id(apic) == dest;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 6a11845..ead1392 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -169,5 +169,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
 }
 
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
+int kvm_lapic_ack_apicv(struct kvm_vcpu *vcpu);
 
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b8122b3..c604f3c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8766,6 +8766,16 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
 	    && nested_exit_intr_ack_set(vcpu)) {
 		int irq = kvm_cpu_get_interrupt(vcpu);
+
+		if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm)) {
+			irq = kvm_lapic_ack_apicv(vcpu);
+			if (irq >= 0) {
+				vmx_hwapic_isr_update(vcpu->kvm, irq);
+				/* try to update RVI */
+				kvm_make_request(KVM_REQ_EVENT, vcpu);
+			}
+		}
+
 		WARN_ON(irq < 0);
 		vmcs12->vm_exit_intr_info = irq |
 			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
-- 
1.9.1
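
The software acknowledge that this patch adds (set the vector in ISR, refresh PPR, clear it from IRR) can be sketched on a small bitmap model. The struct `mini_apic` and every helper below are hypothetical, and the lookup is simplified: it takes the highest pending vector without the priority comparison a real APIC would apply.

```c
#include <stdint.h>

/* Hypothetical 256-vector APIC model; not the kernel's register page. */
struct mini_apic {
	uint32_t irr[8];	/* pending interrupt requests */
	uint32_t isr[8];	/* vectors currently in service */
	uint8_t tpr;
	uint8_t ppr;
};

static void set_vec(uint32_t *map, int vec)
{
	map[vec / 32] |= 1u << (vec % 32);
}

static void clear_vec(uint32_t *map, int vec)
{
	map[vec / 32] &= ~(1u << (vec % 32));
}

/* Highest set vector in a 256-bit map, or -1 if the map is empty. */
static int highest_vec(const uint32_t *map)
{
	for (int vec = 255; vec >= 0; vec--)
		if (map[vec / 32] & (1u << (vec % 32)))
			return vec;
	return -1;
}

/* PPR follows the TPR or the in-service priority class, whichever class is higher. */
static void update_ppr(struct mini_apic *apic)
{
	int isrv = highest_vec(apic->isr);
	uint8_t isr_class = isrv < 0 ? 0 : (uint8_t)(isrv & 0xf0);

	apic->ppr = ((apic->tpr & 0xf0) >= isr_class) ? apic->tpr : isr_class;
}

/* Software ack in the order used by the patch: set ISR, refresh PPR, clear IRR. */
static int ack_highest_pending(struct mini_apic *apic)
{
	int vec = highest_vec(apic->irr);

	if (vec == -1)
		return vec;
	set_vec(apic->isr, vec);
	update_ppr(apic);
	clear_vec(apic->irr, vec);
	return vec;
}
```

With vectors 0x31 and 0x52 pending, the model acknowledges 0x52, leaves 0x31 pending, and raises PPR to the 0x50 class.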


Re: [PATCH 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use

2014-08-01 Thread Wanpeng Li

Please ignore this duplicate one.
On 14-8-1 at 4:13 PM, Wanpeng Li wrote:
> [snip: full quote of the duplicate patch]

Re: [PATCH 1/2] x86: AMD: mark TSC unstable on APU family 15h models 10h-1fh

2014-08-01 Thread Borislav Petkov

On Thu, Jul 31, 2014 at 09:47:12AM +, Igor Mammedov wrote:
> Due to erratum #778 from
> Revision Guide for AMD Family 15h Models 10h-1Fh Processors,
>  Publication # 48931, Issue Date: May 2013, Revision: 3.10
>
> On an affected processor, the TSC of a core may drift under certain
> conditions, which causes initially synchronized TSCs to become unsynchronized.
>
> As a result, the TSC clocksource becomes unsuitable for use as a wallclock,
> and it breaks pvclock when it's running with the PVCLOCK_TSC_STABLE_BIT
> flag set.
> That causes backwards clock jumps when pvclock is first read on a
> CPU with a drifted TSC and then on a CPU where the TSC was stable or had
> a lower drift rate.
>
> To fix the issue, mark the TSC as unstable on affected CPUs so it won't
> be used as a clocksource. That in turn disables the master_clock
> mechanism in KVM and forces pvclock to use the global clock counter,
> which can't go backwards.
>
> Signed-off-by: Igor Mammedov imamm...@redhat.com

Acked-by: Borislav Petkov b...@suse.de
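
The backwards jump described in the quoted commit message can be illustrated with a toy model. Everything here is an assumption for illustration: `percpu_read()` models a per-CPU clock as a shared base time plus that CPU's drift, and `global_read()` models a global clock by clamping to the last value returned. It is not how KVM or pvclock is implemented.

```c
#include <stdint.h>

/* Hypothetical per-CPU clock: shared base time plus this CPU's drift, in ns. */
static uint64_t percpu_read(uint64_t base_ns, int64_t drift_ns)
{
	return base_ns + (uint64_t)drift_ns;
}

/* Hypothetical global clock state: the last value handed out. */
struct global_clock {
	uint64_t last_ns;
};

/* Never returns less than a previously returned value. */
static uint64_t global_read(struct global_clock *gc, uint64_t raw_ns)
{
	if (raw_ns > gc->last_ns)
		gc->last_ns = raw_ns;
	return gc->last_ns;
}
```

Reading first on a CPU that has drifted ahead and then, slightly later, on a stable CPU yields a smaller second value from `percpu_read()`, while `global_read()` stays monotonic for the same pair of raw readings.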

Re: [PATCH v2] KVM: nVMX: nested TPR shadow/threshold emulation

2014-08-01 Thread Paolo Bonzini

On 01/08/2014 10:09, Wanpeng Li wrote:
> This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=61411
>
> The TPR shadow/threshold feature is important for speeding up Windows guests.
> Besides, it is a required feature for certain VMMs.
>
> We map the virtual APIC page address and TPR threshold from the L1 VMCS. If
> a TPR_BELOW_THRESHOLD VM exit is triggered by the L2 guest and L1 is interested
> in it, we inject it into the L1 VMM for handling.
>
> Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
> ---
> v1 -> v2:
>  * don't take L0's virtualize APIC accesses setting into account
>  * virtual_apic_page does exactly the same thing that is done for
>    apic_access_page
>  * add the TPR threshold field to the read-write fields for shadow VMCS
>
>  arch/x86/kvm/vmx.c | 33 +++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a3845b8..0e6e95e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -379,6 +379,7 @@ struct nested_vmx {
>  	 * we must keep them pinned while L2 runs.
>  	 */
>  	struct page *apic_access_page;
> +	struct page *virtual_apic_page;
>  	u64 msr_ia32_feature_control;
>  
>  	struct hrtimer preemption_timer;
> @@ -533,6 +534,7 @@ static int max_shadow_read_only_fields =
>  	ARRAY_SIZE(shadow_read_only_fields);
>  
>  static unsigned long shadow_read_write_fields[] = {
> +	TPR_THRESHOLD,
>  	GUEST_RIP,
>  	GUEST_RSP,
>  	GUEST_CR0,
> @@ -2331,7 +2333,7 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>  		CPU_BASED_MOV_DR_EXITING | CPU_BASED_UNCOND_IO_EXITING |
>  		CPU_BASED_USE_IO_BITMAPS | CPU_BASED_MONITOR_EXITING |
>  		CPU_BASED_RDPMC_EXITING | CPU_BASED_RDTSC_EXITING |
> -		CPU_BASED_PAUSE_EXITING |
> +		CPU_BASED_PAUSE_EXITING | CPU_BASED_TPR_SHADOW |
>  		CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
>  	/*
>  	 * We can allow some features even when not supported by the
> @@ -6149,6 +6151,10 @@ static void free_nested(struct vcpu_vmx *vmx)
>  		nested_release_page(vmx->nested.apic_access_page);
>  		vmx->nested.apic_access_page = 0;
>  	}
> +	if (vmx->nested.virtual_apic_page) {
> +		nested_release_page(vmx->nested.virtual_apic_page);
> +		vmx->nested.virtual_apic_page = 0;
> +	}
>  
>  	nested_free_all_saved_vmcss(vmx);
>  }
> @@ -6937,7 +6943,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>  	case EXIT_REASON_MCE_DURING_VMENTRY:
>  		return 0;
>  	case EXIT_REASON_TPR_BELOW_THRESHOLD:
> -		return 1;
> +		return nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW);
>  	case EXIT_REASON_APIC_ACCESS:
>  		return nested_cpu_has2(vmcs12,
>  			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
> @@ -7058,6 +7064,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>  
>  static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
>  {
> +	if (is_guest_mode(vcpu))
> +		return;
> +
>  	if (irr == -1 || tpr < irr) {
>  		vmcs_write32(TPR_THRESHOLD, 0);
>  		return;
> @@ -8025,6 +8034,22 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
>  	exec_control &= ~CPU_BASED_TPR_SHADOW;
>  	exec_control |= vmcs12->cpu_based_vm_exec_control;
> +
> +	if (exec_control & CPU_BASED_TPR_SHADOW) {
> +		if (vmx->nested.virtual_apic_page)
> +			nested_release_page(vmx->nested.virtual_apic_page);
> +		vmx->nested.virtual_apic_page =
> +		   nested_get_page(vcpu, vmcs12->virtual_apic_page_addr);
> +		if (!vmx->nested.virtual_apic_page)
> +			exec_control &=
> +				~CPU_BASED_TPR_SHADOW;

This will cause L1 to miss exits when L2 writes to CR8.  I think the
only sensible thing to do if this happens is fail the vmentry.

The problem is that while the APIC access page field is used to trap
reads/writes to the APIC access page itself, here the processor will
read/write the virtual APIC page when L2 does CR8 accesses.

Paolo

> +		else
> +			vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
> +				page_to_phys(vmx->nested.virtual_apic_page));
> +
> +		vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
> +	}
> +
>  	/*
>  	 * Merging of IO and MSR bitmaps not currently supported.
>  	 * Rather, exit every time.
> @@ -8793,6 +8818,10 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>  		nested_release_page(vmx->nested.apic_access_page);
>  		vmx->nested.apic_access_page = 0;
>  	}
> +	if (vmx->nested.virtual_apic_page) {
> +		nested_release_page(vmx->nested.virtual_apic_page);
> +		vmx->nested.virtual_apic_page = 0;
> +	}
>  
>  	/*
>  	 * Exiting from L2 to L1, we're now back to L1 which thinks
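
A minimal sketch of the alternative raised in the review comment in this message: when L1 asked for the TPR shadow but the virtual APIC page cannot be mapped, fail the entry instead of silently dropping the control. The types `entry_result` and `prepared` and the helper `prepare_tpr_shadow()` are hypothetical, and a NULL page models a failed page lookup; this is not the code that was eventually merged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 21 of the primary processor-based VM-execution controls. */
#define CPU_BASED_TPR_SHADOW 0x00200000u

enum entry_result { ENTRY_OK, ENTRY_FAIL };

/* Hypothetical outcome of preparing the controls for L2. */
struct prepared {
	enum entry_result result;
	uint32_t exec_control;
};

/* page == NULL models the L1-provided page address failing to map. */
static struct prepared prepare_tpr_shadow(uint32_t l1_exec_control,
					  const void *page)
{
	struct prepared p = { ENTRY_OK, l1_exec_control };

	/* Keep the control as L1 set it; refuse the entry if it cannot be honoured. */
	if ((l1_exec_control & CPU_BASED_TPR_SHADOW) && page == NULL)
		p.result = ENTRY_FAIL;
	return p;
}
```

Unlike clearing the bit, this keeps L1's view of the controls and the hardware's behaviour consistent, so L1 cannot miss exits it expects.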

Re: [PATCH] arm64: KVM: export current vcpu-pause state via pseudo regs

2014-08-01 Thread Alex Bennée


Christoffer Dall writes:

> On Thu, Jul 31, 2014 at 04:14:51PM +0100, Alex Bennée wrote:
>>
>> Christoffer Dall writes:
>>
>>> On Wed, Jul 09, 2014 at 02:55:12PM +0100, Alex Bennée wrote:
>>>> To cleanly restore an SMP VM we need to ensure that the current pause
>>>> state of each vcpu is correctly recorded. Things could get confused if
>>>> the CPU starts running after migration restore completes when it was
>>>> paused before its state was captured.
>>>>
>> <snip>
>>>> +/* Power state (PSCI), not real registers */
>>>> +#define KVM_REG_ARM_PSCI (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
>>>> +#define KVM_REG_ARM_PSCI_REG(n) \
>>>> + (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_PSCI | \
>>>> + (n & ~KVM_REG_ARM_COPROC_MASK))
>>>
>>> I don't understand this mask, why isn't this
>>> (n & 0x))
>>
>> I was trying to use the existing masks, but of course if anyone changes
>> that it would be an ABI change so probably not worth it.
>>
>
> the KVM_REG_ARM_COPROC_MASK is part of the uapi IIRC, so that's not the
> issue, but that mask doesn't cover all the upper bits, so it feels weird
> to use that to me.

Yeah I missed that. I could do a:

#define KVM_REG_ARM_COPROC_INDEX_MASK   ((1 << KVM_REG_ARM_COPROC_SHIFT) - 1)

and use that. I generally try to avoid hardcoded numbers but I could
be being a little OCD here ;-)

>>> Can you add the 32-bit counterpart as part of this patch?
>>
>> Same patch? Sure.
>
> really up to you if you want to split it up into two patches, but I
> think it's small enough that you can just create one patch.

Given the similarity of this code between arm and arm64 I'm wondering if
it's worth doing an arch/arm/kvm/guest_common.c or something to reduce
the amount of copy-paste stuff?

-- 
Alex Bennée
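
The difference between the two masks discussed in this reply can be checked with a short sketch. The values of KVM_REG_ARM_COPROC_MASK and KVM_REG_ARM_COPROC_SHIFT are written out here as assumptions taken from the ARM KVM uapi header; the two helper functions are hypothetical names used only for the comparison.

```c
#include <stdint.h>

/* Assumed values from the ARM KVM uapi header. */
#define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000ULL
#define KVM_REG_ARM_COPROC_SHIFT	16

/* The index mask proposed in the reply: every bit below the coprocessor field. */
#define KVM_REG_ARM_COPROC_INDEX_MASK \
	((1ULL << KVM_REG_ARM_COPROC_SHIFT) - 1)

/* Masking as in the original patch: clears only the coprocessor field. */
static uint64_t index_via_inverted_mask(uint64_t n)
{
	return n & ~KVM_REG_ARM_COPROC_MASK;
}

/* Masking with the proposed index mask: keeps only the low 16 bits. */
static uint64_t index_via_index_mask(uint64_t n)
{
	return n & KVM_REG_ARM_COPROC_INDEX_MASK;
}
```

An input with bit 28 set passes through the inverted coprocessor mask unchanged but is cleared by the index mask, which is the "doesn't cover all the upper bits" concern raised in the review.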

[PULL 19/63] KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE

2014-08-01 Thread Alexander Graf

Some data structures are always stored in big endian. Among those are the LPPACA
fields as well as the shadow slb. These structures might be shared with a
hypervisor.

So whenever we access those fields, make sure we do so in big endian byte order.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index e66c1e38..bf5270e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -32,10 +32,6 @@
 
 #define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
 
-#ifdef __LITTLE_ENDIAN__
-#error Need to fix lppaca and SLB shadow accesses in little endian mode
-#endif
-
 /* Values in HSTATE_NAPPING(r13) */
 #define NAPPING_CEDE	1
 #define NAPPING_NOVCPU	2
@@ -595,9 +591,10 @@ kvmppc_got_guest:
 	ld	r3, VCPU_VPA(r4)
 	cmpdi	r3, 0
 	beq	25f
-	lwz	r5, LPPACA_YIELDCOUNT(r3)
+	li	r6, LPPACA_YIELDCOUNT
+	LWZX_BE	r5, r3, r6
 	addi	r5, r5, 1
-	stw	r5, LPPACA_YIELDCOUNT(r3)
+	STWX_BE	r5, r3, r6
 	li	r6, 1
 	stb	r6, VCPU_VPA_DIRTY(r4)
 25:
@@ -1442,9 +1439,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 	ld	r8, VCPU_VPA(r9)	/* do they have a VPA? */
 	cmpdi	r8, 0
 	beq	25f
-	lwz	r3, LPPACA_YIELDCOUNT(r8)
+	li	r4, LPPACA_YIELDCOUNT
+	LWZX_BE	r3, r8, r4
 	addi	r3, r3, 1
-	stw	r3, LPPACA_YIELDCOUNT(r8)
+	STWX_BE	r3, r8, r4
 	li	r3, 1
 	stb	r3, VCPU_VPA_DIRTY(r9)
 25:
@@ -1757,8 +1755,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 33:	ld	r8,PACA_SLBSHADOWPTR(r13)
 
 	.rept	SLB_NUM_BOLTED
-	ld	r5,SLBSHADOW_SAVEAREA(r8)
-	ld	r6,SLBSHADOW_SAVEAREA+8(r8)
+	li	r3, SLBSHADOW_SAVEAREA
+	LDX_BE	r5, r8, r3
+	addi	r3, r3, 8
+	LDX_BE	r6, r8, r3
 	andis.	r7,r5,SLB_ESID_V@h
 	beq	1f
 	slbmte	r6,r5
-- 
1.8.1.4


[PULL 29/63] kvm: ppc: bookehv: Added wrapper macros for shadow registers

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

There are shadow registers such as GSPRG[0-3], GSRR0 and GSRR1 on
BOOKE-HV, and these shadow registers are guest accessible.
So these shadow registers need to be updated on BOOKE-HV.
This patch adds new macros for the get/set helpers of the shadow registers.
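
The pattern can be sketched outside the kernel as follows. This is a minimal model, assuming a fake register array in place of mfspr/mtspr and a plain struct in place of the shared page; fake_spr, struct shared_page, struct vcpu and USE_HW_SHADOW_REGS are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: fake_spr[] models hardware registers reached
 * via mfspr/mtspr, struct shared_page models the guest/host shared area. */
enum { FAKE_GSPRG0, FAKE_GSRR0, FAKE_SPR_COUNT };
static uint64_t fake_spr[FAKE_SPR_COUNT];

struct shared_page { uint64_t sprg0, srr0; };
struct vcpu { struct shared_page shared; };

/* Accessors backed by a (modelled) hardware register. */
#define SPR_WRAPPER(reg, spr)                                   \
static inline uint64_t get_##reg(struct vcpu *v)                \
{ (void)v; return fake_spr[spr]; }                              \
static inline void set_##reg(struct vcpu *v, uint64_t val)      \
{ (void)v; fake_spr[spr] = val; }

/* Accessors backed by the shared page. */
#define SHARED_WRAPPER(reg)                                     \
static inline uint64_t get_##reg(struct vcpu *v)                \
{ return v->shared.reg; }                                       \
static inline void set_##reg(struct vcpu *v, uint64_t val)      \
{ v->shared.reg = val; }

/* Pick the backend once, at compile time; callers see one name. */
#ifdef USE_HW_SHADOW_REGS
#define SHARED_SPR_WRAPPER(reg, spr) SPR_WRAPPER(reg, spr)
#else
#define SHARED_SPR_WRAPPER(reg, spr) SHARED_WRAPPER(reg)
#endif

SHARED_SPR_WRAPPER(sprg0, FAKE_GSPRG0)
SHARED_SPR_WRAPPER(srr0, FAKE_GSRR0)
```

Callers use get_sprg0()/set_sprg0() either way, which is the property the wrapper macros are meant to provide.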

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h | 44 +++---
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e2fd5a1..6520d09 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -472,8 +472,20 @@ static inline bool kvmppc_shared_big_endian(struct 
kvm_vcpu *vcpu)
 #endif
 }
 
+#define SPRNG_WRAPPER_GET(reg, e500hv_spr) \
+static inline ulong kvmppc_get_##reg(struct kvm_vcpu *vcpu)\
+{  \
+   return mfspr(e500hv_spr);   \
+}  \
+
+#define SPRNG_WRAPPER_SET(reg, e500hv_spr) \
+static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, ulong val)  \
+{  \
+   mtspr(e500hv_spr, val); \
+}  \
+
 #define SHARED_WRAPPER_GET(reg, size)  \
-static inline u##size kvmppc_get_##reg(struct kvm_vcpu *vcpu)  \
+static inline u##size kvmppc_get_##reg(struct kvm_vcpu *vcpu)  \
 {  \
if (kvmppc_shared_big_endian(vcpu)) \
   return be##size##_to_cpu(vcpu->arch.shared->reg);\
@@ -494,14 +506,30 @@ static inline void kvmppc_set_##reg(struct kvm_vcpu 
*vcpu, u##size val)   \
SHARED_WRAPPER_GET(reg, size)   \
SHARED_WRAPPER_SET(reg, size)   \
 
+#define SPRNG_WRAPPER(reg, e500hv_spr) \
+   SPRNG_WRAPPER_GET(reg, e500hv_spr)  \
+   SPRNG_WRAPPER_SET(reg, e500hv_spr)  \
+
+#ifdef CONFIG_KVM_BOOKE_HV
+
+#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+   SPRNG_WRAPPER(reg, e500hv_spr)  \
+
+#else
+
+#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+   SHARED_WRAPPER(reg, size)   \
+
+#endif
+
 SHARED_WRAPPER(critical, 64)
-SHARED_WRAPPER(sprg0, 64)
-SHARED_WRAPPER(sprg1, 64)
-SHARED_WRAPPER(sprg2, 64)
-SHARED_WRAPPER(sprg3, 64)
-SHARED_WRAPPER(srr0, 64)
-SHARED_WRAPPER(srr1, 64)
-SHARED_WRAPPER(dar, 64)
+SHARED_SPRNG_WRAPPER(sprg0, 64, SPRN_GSPRG0)
+SHARED_SPRNG_WRAPPER(sprg1, 64, SPRN_GSPRG1)
+SHARED_SPRNG_WRAPPER(sprg2, 64, SPRN_GSPRG2)
+SHARED_SPRNG_WRAPPER(sprg3, 64, SPRN_GSPRG3)
+SHARED_SPRNG_WRAPPER(srr0, 64, SPRN_GSRR0)
+SHARED_SPRNG_WRAPPER(srr1, 64, SPRN_GSRR1)
+SHARED_SPRNG_WRAPPER(dar, 64, SPRN_GDEAR)
 SHARED_WRAPPER_GET(msr, 64)
 static inline void kvmppc_set_msr_fast(struct kvm_vcpu *vcpu, u64 val)
 {
-- 
1.8.1.4


[PULL 07/63] KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue

2014-08-01 Thread Alexander Graf

From: Anton Blanchard an...@samba.org

To establish addressability quickly, ABIv2 requires the target
address of the function being called to be in r12.

Signed-off-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..da1cac5 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1913,8 +1913,8 @@ hcall_try_real_mode:
lwax    r3,r3,r4
cmpwi   r3,0
beq guest_exit_cont
-   add r3,r3,r4
-   mtctr   r3
+   add r12,r3,r4
+   mtctr   r12
mr  r3,r9   /* get vcpu pointer */
ld  r4,VCPU_GPR(R4)(r9)
bctrl
-- 
1.8.1.4


[PULL 52/63] KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page

2014-08-01 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

When calculating the lower bits of AVA field, use the shift
count based on the base page size. Also add the missing segment
size and remove stale comment.
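
To see what changing the shift count does, the two expressions can be compared in isolation. The sketch below lifts only the arithmetic out of compute_tlbie_rb(); the function names are hypothetical and the surrounding logic is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: always 7 bits placed at bit 16, i.e. hard-wired for a
 * 64K base page. */
static uint64_t ava_low_old(uint64_t va_low)
{
    return (va_low & 0x7f) << 16;
}

/* New form: shift by the base page shift, then keep only bits 12..22
 * (mask 0x7ff000). */
static uint64_t ava_low_new(uint64_t va_low, unsigned int base_shift)
{
    assert(base_shift < 64);
    return (va_low << base_shift) & 0x7ff000;
}
```

With a base shift of 16 both forms agree; with a shift of 12 the new form contributes 11 bits, and with a shift of 24 (16MB base page) it contributes none.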

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Acked-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64.h | 6 --
 arch/powerpc/kvm/book3s_hv.c | 6 --
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index e504f88..07cf9df 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -147,6 +147,8 @@ static inline unsigned long compute_tlbie_rb(unsigned long 
v, unsigned long r,
 */
/* This covers 14..54 bits of va*/
rb = (v & ~0x7fUL) << 16;   /* AVA field */
+
+   rb |= v >> (62 - 8);    /*  B field */
/*
 * AVA in v had cleared lower 23 bits. We need to derive
 * that from pteg index
@@ -177,10 +179,10 @@ static inline unsigned long compute_tlbie_rb(unsigned 
long v, unsigned long r,
{
int aval_shift;
/*
-* remaining 7bits of AVA/LP fields
+* remaining bits of AVA/LP fields
 * Also contain the rr bits of LP
 */
-   rb |= (va_low & 0x7f) << 16;
+   rb |= (va_low << mmu_psize_defs[b_psize].shift) & 0x7ff000;
/*
 * Now clear not needed LP bits based on actual psize
 */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c470d55..27cced9 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2064,12 +2064,6 @@ static void kvmppc_add_seg_page_size(struct 
kvm_ppc_one_seg_page_size **sps,
(*sps)->page_shift = def->shift;
(*sps)->slb_enc = def->sllp;
(*sps)->enc[0].page_shift = def->shift;
-   /*
-* Only return base page encoding. We don't want to return
-* all the supporting pte_enc, because our H_ENTER doesn't
-* support MPSS yet. Once they do, we can start passing all
-* support pte_enc here
-*/
(*sps)->enc[0].pte_enc = def->penc[linux_psize];
/*
 * Add 16MB MPSS support if host supports it
-- 
1.8.1.4


[PULL 56/63] KVM: PPC: Use kvm_read_guest in kvmppc_ld

2014-08-01 Thread Alexander Graf

We have a nice and handy helper to read from guest physical address space,
so we should make use of it in kvmppc_ld as we already do for its counterpart
in kvmppc_st.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/powerpc.c | 27 ++-
 1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3d59730..be40886 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -309,19 +309,6 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_emulate_mmio);
 
-static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
-{
-   hva_t hpage;
-
-   hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
-   if (kvm_is_error_hva(hpage))
-   goto err;
-
-   return hpage | (pte->raddr & ~PAGE_MASK);
-err:
-   return KVM_HVA_ERR_BAD;
-}
-
 int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
  bool data)
 {
@@ -351,7 +338,6 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
  bool data)
 {
struct kvmppc_pte pte;
-   hva_t hva = *eaddr;
int rc;
 
vcpu->stat.ld++;
@@ -369,19 +355,10 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
if (!data && !pte.may_execute)
return -ENOEXEC;
 
-   hva = kvmppc_pte_to_hva(vcpu, &pte);
-   if (kvm_is_error_hva(hva))
-   goto mmio;
-
-   if (copy_from_user(ptr, (void __user *)hva, size)) {
-   printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva);
-   goto mmio;
-   }
+   if (kvm_read_guest(vcpu->kvm, pte.raddr, ptr, size))
+   return EMULATE_DO_MMIO;
+   return EMULATE_DO_MMIO;
 
return EMULATE_DONE;
-
-mmio:
-   return EMULATE_DO_MMIO;
 }
 EXPORT_SYMBOL_GPL(kvmppc_ld);
 
-- 
1.8.1.4


[PULL 12/63] KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling

2014-08-01 Thread Alexander Graf

From: Paul Mackerras pau...@samba.org

This provides a way for userspace to control which sPAPR hcalls get
handled in the kernel.  Each hcall can be individually enabled or
disabled for in-kernel handling, except for H_RTAS.  The exception
for H_RTAS is because userspace can already control whether
individual RTAS functions are handled in-kernel or not via the
KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for
H_RTAS is out of the normal sequence of hcall numbers.

Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the
KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM.
The args field of the struct kvm_enable_cap specifies the hcall number
in args[0] and the enable/disable flag in args[1]; 0 means disable
in-kernel handling (so that the hcall will always cause an exit to
userspace) and 1 means enable.  Enabling or disabling in-kernel
handling of an hcall is effective across the whole VM.

The ability for KVM_ENABLE_CAP to be used on a VM file descriptor
on PowerPC is new, added by this commit.  The KVM_CAP_ENABLE_CAP_VM
capability advertises that this ability exists.

When a VM is created, an initial set of hcalls are enabled for
in-kernel handling.  The set that is enabled is the set that has
an in-kernel implementation at this point.  Any new hcall
implementations from this point onwards should not be added to the
default set without a good reason.

No distinction is made between real-mode and virtual-mode hcall
implementations; the one setting controls them both.
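
The per-VM state described above can be pictured as a bitmap with one bit per hcall, indexed by hcall number divided by four. What follows is a self-contained model of that bookkeeping in plain C, not the kernel code: MODEL_MAX_HCALL, struct vm_model and both function names are invented for the sketch, and rejecting enable values other than 0 and 1 is a choice made by the model.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical upper bound on hcall numbers for this model. */
#define MODEL_MAX_HCALL 0x400UL
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_WORDS     ((MODEL_MAX_HCALL / 4) / BITS_PER_WORD + 1)

/* Per-VM state: one bit per hcall, indexed by hcall number / 4. */
struct vm_model {
    unsigned long enabled_hcalls[MODEL_WORDS];
};

/* Follows the args[0]/args[1] convention: hcall number, then 0 or 1.
 * Returns 0 on success, -1 for arguments the model rejects. */
static int model_enable_hcall(struct vm_model *vm, unsigned long hcall,
                              unsigned long enable)
{
    unsigned long bit = hcall / 4;

    if (hcall > MODEL_MAX_HCALL || (hcall & 3) || enable > 1)
        return -1;
    if (enable)
        vm->enabled_hcalls[bit / BITS_PER_WORD] |=
            1UL << (bit % BITS_PER_WORD);
    else
        vm->enabled_hcalls[bit / BITS_PER_WORD] &=
            ~(1UL << (bit % BITS_PER_WORD));
    return 0;
}

/* True if in-kernel handling would be attempted for this hcall. */
static bool model_hcall_in_kernel(const struct vm_model *vm,
                                  unsigned long hcall)
{
    unsigned long bit = hcall / 4;

    if (hcall > MODEL_MAX_HCALL)
        return false;
    return (vm->enabled_hcalls[bit / BITS_PER_WORD] >>
            (bit % BITS_PER_WORD)) & 1UL;
}
```

Because the bitmap lives in the VM rather than in a vcpu, a single call changes the behaviour for every vcpu of that VM, matching the "effective across the whole VM" wording.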

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/virtual/kvm/api.txt   | 41 --
 arch/powerpc/include/asm/kvm_book3s.h   |  1 +
 arch/powerpc/include/asm/kvm_host.h |  2 ++
 arch/powerpc/kernel/asm-offsets.c   |  1 +
 arch/powerpc/kvm/book3s_hv.c| 51 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 +++
 arch/powerpc/kvm/book3s_pr.c|  5 
 arch/powerpc/kvm/book3s_pr_papr.c   | 37 
 arch/powerpc/kvm/powerpc.c  | 45 +
 include/uapi/linux/kvm.h|  1 +
 10 files changed, 193 insertions(+), 2 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 0fe3649..5c54d19 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2863,8 +2863,8 @@ The fields in each entry are defined as follows:
  this function/index combination
 
 
-6. Capabilities that can be enabled
------------------------------------
+6. Capabilities that can be enabled on vCPUs
+--------------------------------------------
 
 There are certain capabilities that change the behavior of the virtual CPU when
 enabled. To enable them, please see section 4.37. Below you can find a list of
@@ -3002,3 +3002,40 @@ Parameters: args[0] is the XICS device fd
 args[1] is the XICS CPU number (server ID) for this vcpu
 
 This capability connects the vcpu to an in-kernel XICS device.
+
+
+7. Capabilities that can be enabled on VMs
+------------------------------------------
+
+There are certain capabilities that change the behavior of the virtual
+machine when enabled. To enable them, please see section 4.37. Below
+you can find a list of capabilities and what their effect on the VM
+is when enabling them.
+
+The following information is provided along with the description:
+
+  Architectures: which instruction set architectures provide this ioctl.
+  x86 includes both i386 and x86_64.
+
+  Parameters: what parameters are accepted by the capability.
+
+  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
+  are not detailed, but errors with specific meanings are.
+
+
+7.1 KVM_CAP_PPC_ENABLE_HCALL
+
+Architectures: ppc
+Parameters: args[0] is the sPAPR hcall number
+   args[1] is 0 to disable, 1 to enable in-kernel handling
+
+This capability controls whether individual sPAPR hypercalls (hcalls)
+get handled by the kernel or not.  Enabling or disabling in-kernel
+handling of an hcall is effective across the VM.  On creation, an
+initial set of hcalls are enabled for in-kernel handling, which
+consists of those hcalls for which in-kernel handlers were implemented
+before this capability was implemented.  If disabled, the kernel will
+not attempt to handle the hcall, but will always exit to userspace
+to handle it.  Note that it may not make sense to enable some and
+disable others of a group of related hcalls, but KVM does not prevent
+userspace from doing that.
diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index a20cc0b..052ab2a 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -187,6 +187,7 @@ extern void kvmppc_hv_entry_trampoline(void);
 extern u32 kvmppc_alignment_dsisr(struct

[PULL 24/63] KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct

2014-08-01 Thread Alexander Graf

When building KVM with a lot of vcores (NR_CPUS is big), we can potentially
get out of the ld immediate range for dereferences inside that struct.

Move the array to the end of our kvm_arch struct. This fixes compilation
issues with NR_CPUS=2048 for me.
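
The effect of the move can be checked with offsetof() on two toy structs. This is only a sketch: the struct layouts and MODEL_MAX_VCORES are invented, and the 32767 limit assumes the load instruction takes a 16-bit signed displacement, which is how I read "ld immediate range" here.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_VCORES 8192   /* hypothetical; large enough to matter */

/* Array in the middle: every later field sits beyond the array. */
struct arch_before {
    int hpt_flag;
    void *vcores[MODEL_MAX_VCORES];
    void *ops;
};

/* Array last: all other fields keep small offsets. */
struct arch_after {
    int hpt_flag;
    void *ops;
    void *vcores[MODEL_MAX_VCORES];
};

/* Assumed limit: a 16-bit signed displacement reaches offsets 0..32767. */
static int fits_ld_displacement(size_t offset)
{
    return offset <= 32767;
}
```

In the "before" layout the field following the array is out of reach of a single displacement; in the "after" layout only the array itself grows with the vcore count.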

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index faf2f0e..855ba4d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -255,7 +255,6 @@ struct kvm_arch {
atomic_t hpte_mod_interest;
spinlock_t slot_phys_lock;
cpumask_t need_tlb_flush;
-   struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
int hpt_cma_alloc;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@@ -273,6 +272,10 @@ struct kvm_arch {
struct kvmppc_xics *xics;
 #endif
struct kvmppc_ops *kvm_ops;
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+   /* This array can grow quite large, keep it at the end */
+   struct kvmppc_vcore *vcores[KVM_MAX_VCORES];
+#endif
 };
 
 /*
-- 
1.8.1.4


[PULL 28/63] KVM: PPC: Book3S: Make magic page properly 4k mappable

2014-08-01 Thread Alexander Graf

The magic page is defined as a 4k page of per-vCPU data that is shared
between the guest and the host to accelerate accesses to privileged
registers.

However, when the host is using 64k page size granularity we weren't quite
as strict about that rule anymore. Instead, we partially treated all of the
upper 64k as magic page and mapped only the uppermost 4k with the actual
magic contents.

This works well enough for Linux which doesn't use any memory in kernel
space in the upper 64k, but Mac OS X got upset. So this patch makes magic
page actually stay in a 4k range even on 64k page size hosts.

This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE
hosts for me.
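
The difference between the two matching rules can be shown with a small model. It assumes a 4k-aligned magic page address; MODEL_PAM and both function names are hypothetical, and the 64k variant is only meant to mimic what comparing at host page granularity does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAM 0x0fffffffffffffffULL  /* hypothetical address mask */

/* Match at a fixed 4k granularity, independent of host page size. */
static bool is_magic_page_4k(uint64_t gpa, uint64_t magic_page_pa)
{
    uint64_t mp_pa = magic_page_pa & MODEL_PAM;

    gpa &= ~0xFFFULL;
    return mp_pa && (gpa & MODEL_PAM) == mp_pa;
}

/* The looser comparison that a 64k host page size would produce. */
static bool is_magic_page_64k(uint64_t gpa, uint64_t magic_page_pa)
{
    uint64_t mask = ~0xFFFFULL & MODEL_PAM;

    return magic_page_pa && (gpa & mask) == (magic_page_pa & mask);
}
```

An address in the same 64k region but a different 4k page matches the loose rule and not the strict one, which is the case a guest using the rest of the upper 64k would hit.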

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/kvm/book3s.c | 12 ++--
 arch/powerpc/kvm/book3s_32_mmu_host.c |  7 +++
 arch/powerpc/kvm/book3s_64_mmu_host.c |  5 +++--
 arch/powerpc/kvm/book3s_pr.c  | 13 ++---
 arch/powerpc/kvm/powerpc.c| 19 +++
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index b1cf18d..20fb6f2 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -158,7 +158,7 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct 
kvmppc_bat *bat,
   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu 
*vcpu);
-extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing,
+extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
unsigned long *rmap, long pte_index, int realmode);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 1d13764..31facfc 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -354,18 +354,18 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_core_prepare_to_enter);
 
-pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing,
+pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
bool *writable)
 {
-   ulong mp_pa = vcpu->arch.magic_page_pa;
+   ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM;
+   gfn_t gfn = gpa >> PAGE_SHIFT;
 
if (!(kvmppc_get_msr(vcpu) & MSR_SF))
mp_pa = (uint32_t)mp_pa;
 
/* Magic page override */
-   if (unlikely(mp_pa) &&
-   unlikely(((gfn << PAGE_SHIFT) & KVM_PAM) ==
-((mp_pa & PAGE_MASK) & KVM_PAM))) {
+   gpa &= ~0xFFFULL;
+   if (unlikely(mp_pa) && unlikely((gpa & KVM_PAM) == mp_pa)) {
ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
pfn_t pfn;
 
@@ -378,7 +378,7 @@ pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, 
bool writing,
 
return gfn_to_pfn_prot(vcpu->kvm, gfn, writing, writable);
 }
-EXPORT_SYMBOL_GPL(kvmppc_gfn_to_pfn);
+EXPORT_SYMBOL_GPL(kvmppc_gpa_to_pfn);
 
 static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
bool iswrite, struct kvmppc_pte *pte)
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c 
b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 678e753..2035d16 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -156,11 +156,10 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct 
kvmppc_pte *orig_pte,
bool writable;
 
/* Get host physical address for gpa */
-   hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT,
-  iswrite, &writable);
+   hpaddr = kvmppc_gpa_to_pfn(vcpu, orig_pte->raddr, iswrite, &writable);
if (is_error_noslot_pfn(hpaddr)) {
-   printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
-orig_pte->eaddr);
+   printk(KERN_INFO "Couldn't get guest page for gpa %lx!\n",
+orig_pte->raddr);
r = -EINVAL;
goto out;
}
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c 
b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 0ac9839..b982d92 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -104,9 +104,10 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct 
kvmppc_pte *orig_pte,
smp_rmb();
 
/* Get host physical address for gpa */
-   pfn = kvmppc_gfn_to_pfn(vcpu, gfn, iswrite, &writable);
+   pfn = kvmppc_gpa_to_pfn(vcpu, orig_pte->raddr, iswrite, &writable);
if (is_error_noslot_pfn(pfn)) {
-   printk(KERN_INFO Couldn't get guest page for gfn

[PULL 53/63] KVM: PPC: Implement kvmppc_xlate for all targets

2014-08-01 Thread Alexander Graf

We have a nice API to find the translated GPAs of a GVA including protection
flags. So far we only use it on Book3S, but there's no reason the same shouldn't
be used on BookE as well.

Implement a kvmppc_xlate() version for BookE and clean it up to make it more
readable in general.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h | 13 ++
 arch/powerpc/kvm/book3s.c  | 12 ++---
 arch/powerpc/kvm/booke.c   | 51 ++
 3 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e381363..1a60af9 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -52,6 +52,16 @@ enum instruction_type {
INST_SC,/* system call */
 };
 
+enum xlate_instdata {
+   XLATE_INST, /* translate instruction address */
+   XLATE_DATA  /* translate data address */
+};
+
+enum xlate_readwrite {
+   XLATE_READ, /* check for read permissions */
+   XLATE_WRITE /* check for write permissions */
+};
+
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern void kvmppc_handler_highmem(void);
@@ -94,6 +104,9 @@ extern gpa_t kvmppc_mmu_xlate(struct kvm_vcpu *vcpu, 
unsigned int gtlb_index,
   gva_t eaddr);
 extern void kvmppc_mmu_dtlb_miss(struct kvm_vcpu *vcpu);
 extern void kvmppc_mmu_itlb_miss(struct kvm_vcpu *vcpu);
+extern int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr,
+   enum xlate_instdata xlid, enum xlate_readwrite xlrw,
+   struct kvmppc_pte *pte);
 
 extern struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm,
 unsigned int id);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index a3cbada..0b6c84e 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -380,9 +380,11 @@ pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, 
bool writing,
 }
 EXPORT_SYMBOL_GPL(kvmppc_gpa_to_pfn);
 
-static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
-   bool iswrite, struct kvmppc_pte *pte)
+int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, enum xlate_instdata xlid,
+enum xlate_readwrite xlrw, struct kvmppc_pte *pte)
 {
+   bool data = (xlid == XLATE_DATA);
+   bool iswrite = (xlrw == XLATE_WRITE);
int relocated = (kvmppc_get_msr(vcpu) & (data ? MSR_DR : MSR_IR));
int r;
 
@@ -434,7 +436,8 @@ int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
 
vcpu->stat.st++;
 
-   r = kvmppc_xlate(vcpu, *eaddr, data, true, &pte);
+   r = kvmppc_xlate(vcpu, *eaddr, data ? XLATE_DATA : XLATE_INST,
+XLATE_WRITE, &pte);
if (r < 0)
return r;
 
@@ -459,7 +462,8 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
 
vcpu->stat.ld++;
 
-   rc = kvmppc_xlate(vcpu, *eaddr, data, false, &pte);
+   rc = kvmppc_xlate(vcpu, *eaddr, data ? XLATE_DATA : XLATE_INST,
+ XLATE_READ, &pte);
if (rc)
return rc;
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 97bcde2..2f697b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1785,6 +1785,57 @@ void kvm_guest_protect_msr(struct kvm_vcpu *vcpu, ulong 
prot_bitmap, bool set)
 #endif
 }
 
+int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, enum xlate_instdata xlid,
+enum xlate_readwrite xlrw, struct kvmppc_pte *pte)
+{
+   int gtlb_index;
+   gpa_t gpaddr;
+
+#ifdef CONFIG_KVM_E500V2
+   if (!(vcpu->arch.shared->msr & MSR_PR) &&
+   (eaddr & PAGE_MASK) == vcpu->arch.magic_page_ea) {
+   pte->eaddr = eaddr;
+   pte->raddr = (vcpu->arch.magic_page_pa & PAGE_MASK) |
+(eaddr & ~PAGE_MASK);
+   pte->vpage = eaddr >> PAGE_SHIFT;
+   pte->may_read = true;
+   pte->may_write = true;
+   pte->may_execute = true;
+
+   return 0;
+   }
+#endif
+
+   /* Check the guest TLB. */
+   switch (xlid) {
+   case XLATE_INST:
+   gtlb_index = kvmppc_mmu_itlb_index(vcpu, eaddr);
+   break;
+   case XLATE_DATA:
+   gtlb_index = kvmppc_mmu_dtlb_index(vcpu, eaddr);
+   break;
+   default:
+   BUG();
+   }
+
+   /* Do we have a TLB entry at all? */
+   if (gtlb_index < 0)
+   return -ENOENT;
+
+   gpaddr = kvmppc_mmu_xlate(vcpu, gtlb_index, eaddr);
+
+   pte->eaddr = eaddr;
+   pte->raddr = (gpaddr & PAGE_MASK) | (eaddr & ~PAGE_MASK);
+   pte->vpage = eaddr

[PULL 03/63] KVM: PPC: BOOK3S: PR: Emulate instruction counter

2014-08-01 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

Writing to IC is not allowed in the privileged mode.
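
Since the guest's counter cannot simply be written into the hardware register, one way to keep a per-vcpu value is to sample the host counter on guest entry and add the difference on exit. The sketch below models only that bookkeeping; the struct and function names are hypothetical, and the host counter is passed in as a plain argument.

```c
#include <assert.h>
#include <stdint.h>

struct vcpu_ic {
    uint64_t ic;        /* guest-visible instruction count */
    uint64_t entry_ic;  /* host counter value sampled at guest entry */
};

/* Remember where the host counter stood when the guest started running. */
static void ic_guest_entry(struct vcpu_ic *v, uint64_t host_ic_now)
{
    v->entry_ic = host_ic_now;
}

/* Credit the guest with whatever the host counter advanced by. */
static void ic_guest_exit(struct vcpu_ic *v, uint64_t host_ic_now)
{
    /* Unsigned subtraction stays correct if the host counter wraps. */
    v->ic += host_ic_now - v->entry_ic;
}
```

Reads of the register by the guest, or by userspace through a register get interface, can then be served from the accumulated value.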

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h | 1 +
 arch/powerpc/kvm/book3s.c   | 6 ++
 arch/powerpc/kvm/book3s_emulate.c   | 3 +++
 arch/powerpc/kvm/book3s_hv.c| 6 --
 arch/powerpc/kvm/book3s_pr.c| 4 
 5 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index bd3caea..f9ae696 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -506,6 +506,7 @@ struct kvm_vcpu_arch {
/* Time base value when we entered the guest */
u64 entry_tb;
u64 entry_vtb;
+   u64 entry_ic;
u32 tcr;
ulong tsr; /* we need to perform set/clr_bits() which requires ulong */
u32 ivor[64];
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index ddce1ea..90aa5c7 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -649,6 +649,9 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
case KVM_REG_PPC_VTB:
val = get_reg_val(reg->id, vcpu->arch.vtb);
break;
+   case KVM_REG_PPC_IC:
+   val = get_reg_val(reg->id, vcpu->arch.ic);
+   break;
default:
r = -EINVAL;
break;
@@ -756,6 +759,9 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
case KVM_REG_PPC_VTB:
vcpu->arch.vtb = set_reg_val(reg->id, val);
break;
+   case KVM_REG_PPC_IC:
+   vcpu->arch.ic = set_reg_val(reg->id, val);
+   break;
default:
r = -EINVAL;
break;
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 1bb16a5..84fddcd 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -580,6 +580,9 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int 
sprn, ulong *spr_val
case SPRN_VTB:
*spr_val = vcpu->arch.vtb;
break;
+   case SPRN_IC:
+   *spr_val = vcpu->arch.ic;
+   break;
case SPRN_GQR0:
case SPRN_GQR1:
case SPRN_GQR2:
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 315e884..1562acf 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -894,9 +894,6 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 
id,
case KVM_REG_PPC_CIABR:
*val = get_reg_val(id, vcpu->arch.ciabr);
break;
-   case KVM_REG_PPC_IC:
-   *val = get_reg_val(id, vcpu->arch.ic);
-   break;
case KVM_REG_PPC_CSIGR:
*val = get_reg_val(id, vcpu->arch.csigr);
break;
@@ -1091,9 +1088,6 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, 
u64 id,
if ((vcpu->arch.ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER)
vcpu->arch.ciabr &= ~CIABR_PRIV; /* disable */
break;
-   case KVM_REG_PPC_IC:
-   vcpu->arch.ic = set_reg_val(id, *val);
-   break;
case KVM_REG_PPC_CSIGR:
vcpu->arch.csigr = set_reg_val(id, *val);
break;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index d2deb9e..3da412e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -126,6 +126,8 @@ void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu 
*svcpu,
 */
vcpu->arch.entry_tb = get_tb();
vcpu->arch.entry_vtb = get_vtb();
+   if (cpu_has_feature(CPU_FTR_ARCH_207S))
+   vcpu->arch.entry_ic = mfspr(SPRN_IC);
svcpu->in_use = true;
 }
 
@@ -178,6 +180,8 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb;
vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb;
vcpu->arch.vtb += get_vtb() - vcpu->arch.entry_vtb;
+   if (cpu_has_feature(CPU_FTR_ARCH_207S))
+   vcpu->arch.ic += mfspr(SPRN_IC) - vcpu->arch.entry_ic;
svcpu->in_use = false;
 
 out:
-- 
1.8.1.4


[PULL 04/63] KVM: PPC: Book3s PR: Disable AIL mode with OPAL

2014-08-01 Thread Alexander Graf

When we're using PR KVM we must not allow the CPU to take interrupts
in virtual mode, as the SLB does not contain host kernel mappings
when running inside the guest context.

To make sure we get good performance for non-KVM tasks but still
properly functioning PR KVM, let's just disable AIL whenever a vcpu
is scheduled in.

This is fundamentally different from how we deal with AIL on pSeries
type machines where we disable AIL for the whole machine as soon as
a single KVM VM is up.

The reason for that is easy - on pSeries we do not have control over
per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
races and AIL configuration, so setting it per CPU is easier and more
flexible.
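
The per-CPU toggle can be modelled with a plain variable standing in for the LPCR. This is a sketch only: the constants are assumed values for a two-bit AIL field, and model_lpcr, model_vcpu_load() and model_vcpu_put() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: a two-bit AIL field; values are illustrative. */
#define MODEL_LPCR_AIL   0x01800000ULL  /* mask of the AIL field */
#define MODEL_LPCR_AIL_3 0x01800000ULL  /* AIL = 3 */

static uint64_t model_lpcr;  /* stands in for this CPU's LPCR */

/* Called when a vcpu is scheduled in on this CPU. */
static void model_vcpu_load(void)
{
    model_lpcr &= ~MODEL_LPCR_AIL;   /* disable AIL while loaded */
}

/* Called when the vcpu is scheduled out again. */
static void model_vcpu_put(void)
{
    model_lpcr |= MODEL_LPCR_AIL_3;  /* restore AIL for host tasks */
}
```

Only the AIL field changes in either direction; the other LPCR bits are preserved by the read-modify-write.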

This patch fixes running PR KVM on POWER8 bare metal for me.

Signed-off-by: Alexander Graf ag...@suse.de
Acked-by: Paul Mackerras pau...@samba.org
---
 arch/powerpc/kvm/book3s_pr.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3da412e..8ea7da4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -71,6 +71,12 @@ static void kvmppc_core_vcpu_load_pr(struct kvm_vcpu *vcpu, 
int cpu)
svcpu->in_use = 0;
svcpu_put(svcpu);
 #endif
+
+   /* Disable AIL if supported */
+   if (cpu_has_feature(CPU_FTR_HVMODE) &&
+   cpu_has_feature(CPU_FTR_ARCH_207S))
+   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~LPCR_AIL);
+
vcpu->cpu = smp_processor_id();
 #ifdef CONFIG_PPC_BOOK3S_32
current->thread.kvm_shadow_vcpu = vcpu->arch.shadow_vcpu;
@@ -91,6 +97,12 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
 
kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+
+   /* Enable AIL if supported */
+   if (cpu_has_feature(CPU_FTR_HVMODE) &&
+   cpu_has_feature(CPU_FTR_ARCH_207S))
+   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_AIL_3);
+
vcpu->cpu = -1;
 }
 
-- 
1.8.1.4


[PULL 55/63] KVM: PPC: Remove kvmppc_bad_hva()

2014-08-01 Thread Alexander Graf

We have a proper define for invalid HVA numbers. Use those instead of the
ppc specific kvmppc_bad_hva().

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/powerpc.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2c5a1c3..3d59730 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -309,11 +309,6 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_emulate_mmio);
 
-static hva_t kvmppc_bad_hva(void)
-{
-   return PAGE_OFFSET;
-}
-
 static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
 {
hva_t hpage;
@@ -324,7 +319,7 @@ static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, 
struct kvmppc_pte *pte)
 
return hpage | (pte->raddr & ~PAGE_MASK);
 err:
-   return kvmppc_bad_hva();
+   return KVM_HVA_ERR_BAD;
 }
 
 int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
-- 
1.8.1.4


[PULL 45/63] KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call

2014-08-01 Thread Alexander Graf

From: Paul Mackerras pau...@samba.org

This does for PR KVM what c9438092cae4 ("KVM: PPC: Book3S HV: Take SRCU
read lock around kvm_read_guest() call") did for HV KVM, that is,
eliminate a "suspicious rcu_dereference_check() usage!" warning by
taking the SRCU lock around the call to kvmppc_rtas_hcall().

It also fixes a return of RESUME_HOST to return EMULATE_FAIL instead,
since kvmppc_h_pr() is supposed to return EMULATE_* values.

Signed-off-by: Paul Mackerras pau...@samba.org
Cc: sta...@vger.kernel.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr_papr.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr_papr.c 
b/arch/powerpc/kvm/book3s_pr_papr.c
index 6d0143f..ce3c893 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -267,6 +267,8 @@ static int kvmppc_h_pr_xics_hcall(struct kvm_vcpu *vcpu, 
u32 cmd)
 
 int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
 {
+   int rc, idx;
+
if (cmd <= MAX_HCALL_OPCODE &&
!test_bit(cmd/4, vcpu->kvm->arch.enabled_hcalls))
return EMULATE_FAIL;
@@ -299,8 +301,11 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
break;
case H_RTAS:
if (list_empty(&vcpu->kvm->arch.rtas_tokens))
-   return RESUME_HOST;
-   if (kvmppc_rtas_hcall(vcpu))
+   break;
+   idx = srcu_read_lock(&vcpu->kvm->srcu);
+   rc = kvmppc_rtas_hcall(vcpu);
+   srcu_read_unlock(&vcpu->kvm->srcu, idx);
+   if (rc)
break;
kvmppc_set_gpr(vcpu, 3, 0);
return EMULATE_DONE;
-- 
1.8.1.4

[PULL 59/63] KVM: PPC: Expose helper functions for data/inst faults

2014-08-01 Thread Alexander Graf

We're going to implement guest code interpretation in KVM for some rare
corner cases. This code needs to be able to inject data and instruction
faults into the guest when it encounters them.

Expose generic APIs to do this in a reasonably subarch agnostic fashion.
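
To illustrate the kind of state update such a helper performs, here is a
small sketch of the book3s instruction storage case: the fault reason bits
are cleared in the MSR image and then replaced by the caller's flags, and
nothing outside that field is touched. The SKETCH_* names and their numeric
values are stand-ins chosen for the example, not definitions taken from the
kernel headers.

```c
#include <stdint.h>

/* Illustrative bit positions for the three ISI reason flags. */
#define SKETCH_ISI_NOPT		UINT64_C(0x40000000)
#define SKETCH_ISI_N_OR_G	UINT64_C(0x10000000)
#define SKETCH_ISI_PROT		UINT64_C(0x08000000)
#define SKETCH_ISI_MASK \
	(SKETCH_ISI_NOPT | SKETCH_ISI_N_OR_G | SKETCH_ISI_PROT)

/* Replace only the reason field of msr with the matching bits of flags. */
static uint64_t sketch_isi_msr(uint64_t msr, uint64_t flags)
{
	msr &= ~SKETCH_ISI_MASK;		/* drop any stale reason bits */
	msr |= flags & SKETCH_ISI_MASK;		/* accept only reason bits */
	return msr;
}
```

Masking the flags on the way in means a caller cannot accidentally flip
unrelated MSR bits through this path.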

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |  8 
 arch/powerpc/kvm/book3s.c  | 17 +
 arch/powerpc/kvm/booke.c   | 16 ++--
 3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 2214ee6..cbee453 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -132,6 +132,14 @@ extern void kvmppc_core_dequeue_dec(struct kvm_vcpu *vcpu);
 extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq);
 extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu);
+extern void kvmppc_core_queue_dtlb_miss(struct kvm_vcpu *vcpu, ulong 
dear_flags,
+   ulong esr_flags);
+extern void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu,
+  ulong dear_flags,
+  ulong esr_flags);
+extern void kvmppc_core_queue_itlb_miss(struct kvm_vcpu *vcpu);
+extern void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu,
+  ulong esr_flags);
 extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
 extern int kvmppc_core_check_requests(struct kvm_vcpu *vcpu);
 
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index de8da33..dd03f6b 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -230,6 +230,23 @@ void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu)
kvmppc_book3s_dequeue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL_LEVEL);
 }
 
+void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu, ulong dar,
+   ulong flags)
+{
+   kvmppc_set_dar(vcpu, dar);
+   kvmppc_set_dsisr(vcpu, flags);
+   kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE);
+}
+
+void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong flags)
+{
+   u64 msr = kvmppc_get_msr(vcpu);
+   msr &= ~(SRR1_ISI_NOPT | SRR1_ISI_N_OR_G | SRR1_ISI_PROT);
+   msr |= flags & (SRR1_ISI_NOPT | SRR1_ISI_N_OR_G | SRR1_ISI_PROT);
+   kvmppc_set_msr_fast(vcpu, msr);
+   kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
+}
+
 int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
 {
int deliver = 1;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 2f697b4..f30948a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -185,24 +185,28 @@ static void kvmppc_booke_queue_irqprio(struct kvm_vcpu 
*vcpu,
set_bit(priority, &vcpu->arch.pending_exceptions);
 }
 
-static void kvmppc_core_queue_dtlb_miss(struct kvm_vcpu *vcpu,
-ulong dear_flags, ulong esr_flags)
+void kvmppc_core_queue_dtlb_miss(struct kvm_vcpu *vcpu,
+ulong dear_flags, ulong esr_flags)
 {
vcpu->arch.queued_dear = dear_flags;
vcpu->arch.queued_esr = esr_flags;
kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DTLB_MISS);
 }
 
-static void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu,
-   ulong dear_flags, ulong esr_flags)
+void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu,
+   ulong dear_flags, ulong esr_flags)
 {
vcpu->arch.queued_dear = dear_flags;
vcpu->arch.queued_esr = esr_flags;
kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DATA_STORAGE);
 }
 
-static void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu,
-   ulong esr_flags)
+void kvmppc_core_queue_itlb_miss(struct kvm_vcpu *vcpu)
+{
+   kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_ITLB_MISS);
+}
+
+void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong esr_flags)
 {
vcpu->arch.queued_esr = esr_flags;
kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_INST_STORAGE);
-- 
1.8.1.4

[PULL 60/63] KVM: PPC: Remove DCR handling

2014-08-01 Thread Alexander Graf

DCR handling was only needed for 440 KVM. Since we removed it, we can also
remove handling of DCR accesses.

Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/virtual/kvm/api.txt   |  6 +++---
 arch/powerpc/include/asm/kvm_host.h |  4 
 arch/powerpc/include/asm/kvm_ppc.h  |  1 -
 arch/powerpc/kvm/booke.c|  5 -
 arch/powerpc/kvm/powerpc.c  | 10 --
 arch/powerpc/kvm/timing.c   |  1 -
 arch/powerpc/kvm/timing.h   |  3 ---
 include/uapi/linux/kvm.h|  4 ++--
 8 files changed, 5 insertions(+), 29 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 8898caf..a21ff22 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2613,8 +2613,8 @@ The 'data' member contains, in its first 'len' bytes, the 
value as it would
 appear if the VCPU performed a load or store of the appropriate width directly
 to the byte array.
 
-NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR,
-  KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
+NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
+  KVM_EXIT_EPR the corresponding
 operations are complete (and guest state is consistent) only after userspace
 has re-entered the kernel with KVM_RUN.  The kernel side will first finish
 incomplete operations and then check for pending signals.  Userspace
@@ -2685,7 +2685,7 @@ Principles of Operation Book in the Chapter for Dynamic 
Address Translation
__u8  is_write;
} dcr;
 
-powerpc specific.
+Deprecated - was used for 440 KVM.
 
/* KVM_EXIT_OSI */
struct {
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 66f5b59..98d9dd5 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -94,7 +94,6 @@ struct kvm_vm_stat {
 struct kvm_vcpu_stat {
u32 sum_exits;
u32 mmio_exits;
-   u32 dcr_exits;
u32 signal_exits;
u32 light_exits;
/* Account for special types of light exits: */
@@ -126,7 +125,6 @@ struct kvm_vcpu_stat {
 
 enum kvm_exit_types {
MMIO_EXITS,
-   DCR_EXITS,
SIGNAL_EXITS,
ITLB_REAL_MISS_EXITS,
ITLB_VIRT_MISS_EXITS,
@@ -601,8 +599,6 @@ struct kvm_vcpu_arch {
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
u8 mmio_sign_extend;
-   u8 dcr_needed;
-   u8 dcr_is_write;
u8 osi_needed;
u8 osi_enabled;
u8 papr_enabled;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index cbee453..8e36c1e 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -41,7 +41,6 @@
 enum emulation_result {
EMULATE_DONE, /* no further processing */
EMULATE_DO_MMIO,  /* kvm_run filled with MMIO request */
-   EMULATE_DO_DCR,   /* kvm_run filled with DCR request */
EMULATE_FAIL, /* can't emulate this instruction */
EMULATE_AGAIN,/* something went wrong. go again */
EMULATE_EXIT_USER,/* emulation requires exit to user-space */
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index f30948a..b4c89fa 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -51,7 +51,6 @@ unsigned long kvmppc_booke_handlers;
 
 struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "mmio",   VCPU_STAT(mmio_exits) },
-   { "dcr",    VCPU_STAT(dcr_exits) },
{ "sig",    VCPU_STAT(signal_exits) },
{ "itlb_r", VCPU_STAT(itlb_real_miss_exits) },
{ "itlb_v", VCPU_STAT(itlb_virt_miss_exits) },
@@ -709,10 +708,6 @@ static int emulation_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
case EMULATE_AGAIN:
return RESUME_GUEST;
 
-   case EMULATE_DO_DCR:
-   run->exit_reason = KVM_EXIT_DCR;
-   return RESUME_HOST;
-
case EMULATE_FAIL:
printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
       __func__, vcpu->arch.pc, vcpu->arch.last_inst);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index c14ed15..288b4bb 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -743,12 +743,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 #endif
 }
 
-static void kvmppc_complete_dcr_load(struct kvm_vcpu *vcpu,
- struct kvm_run *run)
-{
-   kvmppc_set_gpr(vcpu, vcpu->arch.io_gpr, run->dcr.data);
-}
-
 static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
   struct kvm_run *run)
 {
@@ -945,10 +939,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
if (!vcpu->mmio_is_write)
kvmppc_complete_mmio_load(vcpu, run);

[PULL 16/63] PPC: Add asm helpers for BE 32bit load/store

2014-08-01 Thread Alexander Graf

From assembly code we might not only have to explicitly BE access 64bit values,
but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx
in their respective byte-reverse or native form.
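
The same selection can be sketched in plain C for readers who do not have
the assembly in front of them: on a big-endian build the access is a plain
load or store, on a little-endian build the value is byte-reversed. The
sketch_* names are made up for the example, and the endianness test assumes
the GCC/Clang predefined __BYTE_ORDER__ macros; a compiler without them
would fall into the little-endian branch.

```c
#include <stdint.h>
#include <string.h>

/* Portable 32-bit byte reversal (what lwbrx/stwbrx do in hardware). */
static uint32_t sketch_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Pick native or byte-reversed access at compile time. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_TO_FROM_BE32(v)	(v)
#else
#define SKETCH_TO_FROM_BE32(v)	sketch_bswap32(v)
#endif

/* Read a 32-bit big-endian value from memory on any host. */
static uint32_t sketch_load_be32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return SKETCH_TO_FROM_BE32(v);
}

/* Write a 32-bit value to memory in big-endian byte order on any host. */
static void sketch_store_be32(void *p, uint32_t v)
{
	v = SKETCH_TO_FROM_BE32(v);
	memcpy(p, &v, sizeof(v));
}
```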

Signed-off-by: Alexander Graf ag...@suse.de
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 arch/powerpc/include/asm/asm-compat.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/asm-compat.h 
b/arch/powerpc/include/asm/asm-compat.h
index 4b237aa..21be8ae 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -34,10 +34,14 @@
 #define PPC_MIN_STKFRM 112
 
 #ifdef __BIG_ENDIAN__
+#define LWZX_BE    stringify_in_c(lwzx)
 #define LDX_BE     stringify_in_c(ldx)
+#define STWX_BE    stringify_in_c(stwx)
 #define STDX_BE    stringify_in_c(stdx)
 #else
+#define LWZX_BE    stringify_in_c(lwbrx)
 #define LDX_BE     stringify_in_c(ldbrx)
+#define STWX_BE    stringify_in_c(stwbrx)
 #define STDX_BE    stringify_in_c(stdbrx)
 #endif
 
-- 
1.8.1.4

[PULL 50/63] KVM: Allow KVM_CHECK_EXTENSION on the vm fd

2014-08-01 Thread Alexander Graf

KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately,
on PPC some of the capabilities change depending on the way a VM was created.

So instead we need a way to expose capabilities as a VM ioctl, so that we can
see which VM type we're using (HV or PR). To enable this, add the
KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio.
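
A toy model of the resulting dispatch may help: capabilities that are the
same everywhere are answered first, and everything else falls through to an
arch hook that can look at the VM. All sketch_* names, the capability list
and the "HV only" capability are invented for the example; in particular,
treating a NULL vm as "query without a VM" and answering 0 in that case is
just this sketch's policy.

```c
#include <stddef.h>

enum sketch_vm_type { SKETCH_VM_HV, SKETCH_VM_PR };

enum sketch_cap {
	SKETCH_CAP_USER_MEMORY,
	SKETCH_CAP_CHECK_EXTENSION_VM,
	SKETCH_CAP_HV_ONLY,		/* hypothetical VM-type dependent cap */
	SKETCH_CAP_UNKNOWN
};

struct sketch_vm {
	enum sketch_vm_type type;
};

/* Arch hook: the answer may depend on how the VM was created. */
static long sketch_arch_check_extension(const struct sketch_vm *vm,
					enum sketch_cap cap)
{
	if (cap == SKETCH_CAP_HV_ONLY)
		return vm != NULL && vm->type == SKETCH_VM_HV;
	return 0;
}

/* Generic layer: fixed answers first, then defer to the arch hook. */
static long sketch_check_extension_generic(const struct sketch_vm *vm,
					   enum sketch_cap cap)
{
	switch (cap) {
	case SKETCH_CAP_USER_MEMORY:
	case SKETCH_CAP_CHECK_EXTENSION_VM:
		return 1;
	default:
		break;
	}
	return sketch_arch_check_extension(vm, cap);
}
```

With this shape, a caller holding a VM gets an answer specific to that VM,
while a caller without one can only get the conservative answer.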

Signed-off-by: Alexander Graf ag...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
---
 Documentation/virtual/kvm/api.txt |  7 +++--
 include/uapi/linux/kvm.h  |  1 +
 virt/kvm/kvm_main.c   | 58 +--
 3 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 884f819..8898caf 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -148,9 +148,9 @@ of banks, as set via the KVM_X86_SETUP_MCE ioctl.
 
 4.4 KVM_CHECK_EXTENSION
 
-Capability: basic
+Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
 Architectures: all
-Type: system ioctl
+Type: system ioctl, vm ioctl
 Parameters: extension identifier (KVM_CAP_*)
 Returns: 0 if unsupported; 1 (or some other positive integer) if supported
 
@@ -160,6 +160,9 @@ receives an integer that describes the extension 
availability.
 Generally 0 means no and 1 means yes, but some extensions may report
 additional information in the integer return value.
 
+Based on their initialization different VMs may have different capabilities.
+It is thus encouraged to use the vm ioctl to query for capabilities (available
+with KVM_CAP_CHECK_EXTENSION_VM on the vm fd)
 
 4.5 KVM_GET_VCPU_MMAP_SIZE
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0418b74..51776ca 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -759,6 +759,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_PSCI_0_2 102
 #define KVM_CAP_PPC_FIXUP_HCALL 103
 #define KVM_CAP_PPC_ENABLE_HCALL 104
+#define KVM_CAP_CHECK_EXTENSION_VM 105
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e28f3ca..1b95cc9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2324,6 +2324,34 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
return 0;
 }
 
+static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
+{
+   switch (arg) {
+   case KVM_CAP_USER_MEMORY:
+   case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
+   case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS:
+#ifdef CONFIG_KVM_APIC_ARCHITECTURE
+   case KVM_CAP_SET_BOOT_CPU_ID:
+#endif
+   case KVM_CAP_INTERNAL_ERROR_DATA:
+#ifdef CONFIG_HAVE_KVM_MSI
+   case KVM_CAP_SIGNAL_MSI:
+#endif
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
+   case KVM_CAP_CHECK_EXTENSION_VM:
+   return 1;
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+   case KVM_CAP_IRQ_ROUTING:
+   return KVM_MAX_IRQ_ROUTES;
+#endif
+   default:
+   break;
+   }
+   return kvm_vm_ioctl_check_extension(kvm, arg);
+}
+
 static long kvm_vm_ioctl(struct file *filp,
   unsigned int ioctl, unsigned long arg)
 {
@@ -2487,6 +2515,9 @@ static long kvm_vm_ioctl(struct file *filp,
r = 0;
break;
}
+   case KVM_CHECK_EXTENSION:
+   r = kvm_vm_ioctl_check_extension_generic(kvm, arg);
+   break;
default:
r = kvm_arch_vm_ioctl(filp, ioctl, arg);
if (r == -ENOTTY)
@@ -2571,33 +2602,6 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
return r;
 }
 
-static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
-{
-   switch (arg) {
-   case KVM_CAP_USER_MEMORY:
-   case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
-   case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS:
-#ifdef CONFIG_KVM_APIC_ARCHITECTURE
-   case KVM_CAP_SET_BOOT_CPU_ID:
-#endif
-   case KVM_CAP_INTERNAL_ERROR_DATA:
-#ifdef CONFIG_HAVE_KVM_MSI
-   case KVM_CAP_SIGNAL_MSI:
-#endif
-#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
-   case KVM_CAP_IRQFD_RESAMPLE:
-#endif
-   return 1;
-#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
-   case KVM_CAP_IRQ_ROUTING:
-   return KVM_MAX_IRQ_ROUTES;
-#endif
-   default:
-   break;
-   }
-   return kvm_vm_ioctl_check_extension(kvm, arg);
-}
-
 static long kvm_dev_ioctl(struct file *filp,
  unsigned int ioctl, unsigned long arg)
 {
-- 
1.8.1.4

[PULL 47/63] Split out struct kvmppc_vcore creation to separate function

2014-08-01 Thread Alexander Graf

From: Stewart Smith stew...@linux.vnet.ibm.com

No functional changes; just split it out into a function so that the addition
of micro partition prefetch buffer allocation (in a subsequent patch) looks
neater and doesn't require excessive indentation.
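
The refactoring pattern, reduced to a userspace sketch: allocation and
initialisation move into one helper that returns NULL on failure, so the
caller shrinks to a single call and later additions to the setup do not push
the caller's indentation further right. The sketch_vcore structure, its
fields and the helper's parameters are simplified assumptions, not the
kernel's struct kvmppc_vcore.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the per-core state. */
struct sketch_vcore {
	int first_vcpuid;
	unsigned long lpcr;
	void *owner;
};

/* Allocate and fully initialise one vcore; NULL means out of memory. */
static struct sketch_vcore *sketch_vcore_create(void *owner, unsigned long lpcr,
						int core, int threads_per_subcore)
{
	struct sketch_vcore *vcore;

	assert(threads_per_subcore > 0);

	vcore = calloc(1, sizeof(*vcore));	/* zeroed, like kzalloc() */
	if (vcore == NULL)
		return NULL;

	vcore->lpcr = lpcr;
	vcore->first_vcpuid = core * threads_per_subcore;
	vcore->owner = owner;

	return vcore;
}
```

The early return on allocation failure is what keeps the happy path flat;
the caller still has to cope with a NULL result.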

Signed-off-by: Stewart Smith stew...@linux.vnet.ibm.com
Acked-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0c5266e..5042ccc 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1303,6 +1303,26 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, 
u64 id,
return r;
 }
 
+static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
+{
+   struct kvmppc_vcore *vcore;
+
+   vcore = kzalloc(sizeof(struct kvmppc_vcore), GFP_KERNEL);
+
+   if (vcore == NULL)
+   return NULL;
+
+   INIT_LIST_HEAD(&vcore->runnable_threads);
+   spin_lock_init(&vcore->lock);
+   init_waitqueue_head(&vcore->wq);
+   vcore->preempt_tb = TB_NIL;
+   vcore->lpcr = kvm->arch.lpcr;
+   vcore->first_vcpuid = core * threads_per_subcore;
+   vcore->kvm = kvm;
+
+   return vcore;
+}
+
 static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
   unsigned int id)
 {
@@ -1354,16 +1374,7 @@ static struct kvm_vcpu 
*kvmppc_core_vcpu_create_hv(struct kvm *kvm,
mutex_lock(&kvm->lock);
vcore = kvm->arch.vcores[core];
if (!vcore) {
-   vcore = kzalloc(sizeof(struct kvmppc_vcore), GFP_KERNEL);
-   if (vcore) {
-   INIT_LIST_HEAD(&vcore->runnable_threads);
-   spin_lock_init(&vcore->lock);
-   init_waitqueue_head(&vcore->wq);
-   vcore->preempt_tb = TB_NIL;
-   vcore->lpcr = kvm->arch.lpcr;
-   vcore->first_vcpuid = core * threads_per_subcore;
-   vcore->kvm = kvm;
-   }
+   vcore = kvmppc_vcore_create(kvm, core);
kvm->arch.vcores[core] = vcore;
kvm->arch.online_vcores++;
}
-- 
1.8.1.4

[PULL 54/63] KVM: PPC: Move kvmppc_ld/st to common code

2014-08-01 Thread Alexander Graf

We have enough common infrastructure now to resolve GVA->GPA mappings at
runtime. With this we can move our book3s specific helpers to load / store
in guest virtual address space to common code as well.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/include/asm/kvm_host.h   |  4 +-
 arch/powerpc/include/asm/kvm_ppc.h|  4 ++
 arch/powerpc/kvm/book3s.c | 81 ---
 arch/powerpc/kvm/powerpc.c| 81 +++
 5 files changed, 88 insertions(+), 84 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index a86ca65..172fd6d 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -148,8 +148,8 @@ extern void kvmppc_mmu_hpte_sysexit(void);
 extern int kvmppc_mmu_hv_init(void);
 extern int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hc);
 
+/* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
-extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
 extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int 
vec);
 extern void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
  unsigned int vec);
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 11385bb..66f5b59 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -111,15 +111,15 @@ struct kvm_vcpu_stat {
u32 halt_wakeup;
u32 dbell_exits;
u32 gdbell_exits;
+   u32 ld;
+   u32 st;
 #ifdef CONFIG_PPC_BOOK3S
u32 pf_storage;
u32 pf_instruc;
u32 sp_storage;
u32 sp_instruc;
u32 queue_intr;
-   u32 ld;
u32 ld_slow;
-   u32 st;
u32 st_slow;
 #endif
 };
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 1a60af9..17fa277 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -80,6 +80,10 @@ extern int kvmppc_handle_store(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
 enum instruction_type type, u32 *inst);
 
+extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
+bool data);
+extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
+bool data);
 extern int kvmppc_emulate_instruction(struct kvm_run *run,
   struct kvm_vcpu *vcpu);
 extern int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 0b6c84e..de8da33 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -410,87 +410,6 @@ int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, enum 
xlate_instdata xlid,
return r;
 }
 
-static hva_t kvmppc_bad_hva(void)
-{
-   return PAGE_OFFSET;
-}
-
-static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
-{
-   hva_t hpage;
-
-   hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
-   if (kvm_is_error_hva(hpage))
-   goto err;
-
-   return hpage | (pte->raddr & ~PAGE_MASK);
-err:
-   return kvmppc_bad_hva();
-}
-
-int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
- bool data)
-{
-   struct kvmppc_pte pte;
-   int r;
-
-   vcpu->stat.st++;
-
-   r = kvmppc_xlate(vcpu, *eaddr, data ? XLATE_DATA : XLATE_INST,
-                    XLATE_WRITE, &pte);
-   if (r < 0)
-   return r;
-
-   *eaddr = pte.raddr;
-
-   if (!pte.may_write)
-   return -EPERM;
-
-   if (kvm_write_guest(vcpu->kvm, pte.raddr, ptr, size))
-   return EMULATE_DO_MMIO;
-
-   return EMULATE_DONE;
-}
-EXPORT_SYMBOL_GPL(kvmppc_st);
-
-int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
- bool data)
-{
-   struct kvmppc_pte pte;
-   hva_t hva = *eaddr;
-   int rc;
-
-   vcpu->stat.ld++;
-
-   rc = kvmppc_xlate(vcpu, *eaddr, data ? XLATE_DATA : XLATE_INST,
-                     XLATE_READ, &pte);
-   if (rc)
-   return rc;
-
-   *eaddr = pte.raddr;
-
-   if (!pte.may_read)
-   return -EPERM;
-
-   if (!data && !pte.may_execute)
-   return -ENOEXEC;
-
-   hva = kvmppc_pte_to_hva(vcpu, &pte);
-   if (kvm_is_error_hva(hva))
-   goto mmio;
-
-   if (copy_from_user(ptr, (void __user *)hva, size)) {
-   printk(KERN_INFO "kvmppc_ld at 0x%lx failed\n", hva);
-   goto mmio;
-   }
-
-   return

[PULL 58/63] KVM: PPC: Separate loadstore emulation from priv emulation

2014-08-01 Thread Alexander Graf

Today the instruction emulator can get called via 2 separate code paths. It
can either be called by MMIO emulation detection code or by privileged
instruction traps.

This is bad, as both code paths prepare the environment differently. For MMIO
emulation we already know the virtual address we faulted on, so instructions
there don't have to actually fetch that information.

Split out the two separate use cases into separate files.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h   |   1 +
 arch/powerpc/kvm/Makefile|   4 +-
 arch/powerpc/kvm/emulate.c   | 192 +
 arch/powerpc/kvm/emulate_loadstore.c | 272 +++
 arch/powerpc/kvm/powerpc.c   |   2 +-
 5 files changed, 278 insertions(+), 193 deletions(-)
 create mode 100644 arch/powerpc/kvm/emulate_loadstore.c

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 17fa277..2214ee6 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -86,6 +86,7 @@ extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
 bool data);
 extern int kvmppc_emulate_instruction(struct kvm_run *run,
   struct kvm_vcpu *vcpu);
+extern int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu);
 extern int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu);
 extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
 extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 777f894..1ccd7a1 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -13,8 +13,9 @@ common-objs-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
 CFLAGS_e500_mmu.o := -I.
 CFLAGS_e500_mmu_host.o := -I.
 CFLAGS_emulate.o  := -I.
+CFLAGS_emulate_loadstore.o  := -I.
 
-common-objs-y += powerpc.o emulate.o
+common-objs-y += powerpc.o emulate.o emulate_loadstore.o
 obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
 obj-$(CONFIG_KVM_BOOK3S_HANDLER) += book3s_exports.o
 
@@ -91,6 +92,7 @@ kvm-book3s_64-module-objs += \
$(KVM)/eventfd.o \
powerpc.o \
emulate.o \
+   emulate_loadstore.o \
book3s.o \
book3s_64_vio.o \
book3s_rtas.o \
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index c5c64b6..e96b50d 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -207,25 +207,12 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, 
int sprn, int rt)
return emulated;
 }
 
-/* XXX to do:
- * lhax
- * lhaux
- * lswx
- * lswi
- * stswx
- * stswi
- * lha
- * lhau
- * lmw
- * stmw
- *
- */
 /* XXX Should probably auto-generate instruction decoding for a particular core
  * from opcode tables in the future. */
 int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
u32 inst;
-   int ra, rs, rt, sprn;
+   int rs, rt, sprn;
enum emulation_result emulated;
int advance = 1;
 
@@ -238,7 +225,6 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct 
kvm_vcpu *vcpu)
 
pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
 
-   ra = get_ra(inst);
rs = get_rs(inst);
rt = get_rt(inst);
sprn = get_sprn(inst);
@@ -270,200 +256,24 @@ int kvmppc_emulate_instruction(struct kvm_run *run, 
struct kvm_vcpu *vcpu)
 #endif
advance = 0;
break;
-   case OP_31_XOP_LWZX:
-   emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1);
-   break;
-
-   case OP_31_XOP_LBZX:
-   emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
-   break;
-
-   case OP_31_XOP_LBZUX:
-   emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1);
-   kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-   break;
-
-   case OP_31_XOP_STWX:
-   emulated = kvmppc_handle_store(run, vcpu,
-  kvmppc_get_gpr(vcpu, rs),
-  4, 1);
-   break;
-
-   case OP_31_XOP_STBX:
-   emulated = kvmppc_handle_store(run, vcpu,
-  kvmppc_get_gpr(vcpu, rs),
-  1, 1);
-   break;
-
-   case OP_31_XOP_STBUX:
-   emulated = kvmppc_handle_store(run, vcpu,
-  kvmppc_get_gpr(vcpu, rs),
-  1, 1);
-   kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
-   break;
-

[PULL 37/63] KVM: PPC: Book3s: Remove kvmppc_read_inst() function

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

In the context of replacing kvmppc_ld() function calls with a version of
kvmppc_get_last_inst() which is allowed to fail, Alex Graf suggested this:

"If we get EMULATE_AGAIN, we just have to make sure we go back into the guest.
No need to inject an ISI into the guest - it'll do that all by itself.
With an error returning kvmppc_get_last_inst we can just completely
get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead."

As an intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld()
instead.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr.c | 85 ++--
 1 file changed, 34 insertions(+), 51 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e40765f..e76aec3 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -710,42 +710,6 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong 
fac)
 #endif
 }
 
-static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
-{
-   ulong srr0 = kvmppc_get_pc(vcpu);
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
-   int ret;
-
-   ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-   if (ret == -ENOENT) {
-   ulong msr = kvmppc_get_msr(vcpu);
-
-   msr = kvmppc_set_field(msr, 33, 33, 1);
-   msr = kvmppc_set_field(msr, 34, 36, 0);
-   msr = kvmppc_set_field(msr, 42, 47, 0);
-   kvmppc_set_msr_fast(vcpu, msr);
-   kvmppc_book3s_queue_irqprio(vcpu, 
BOOK3S_INTERRUPT_INST_STORAGE);
-   return EMULATE_AGAIN;
-   }
-
-   return EMULATE_DONE;
-}
-
-static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr)
-{
-
-   /* Need to do paired single emulation? */
-   if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
-   return EMULATE_DONE;
-
-   /* Read out the instruction */
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE)
-   /* Need to emulate */
-   return EMULATE_FAIL;
-
-   return EMULATE_AGAIN;
-}
-
 /* Handle external providers (FPU, Altivec, VSX) */
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr)
@@ -1149,31 +1113,49 @@ program_interrupt:
case BOOK3S_INTERRUPT_VSX:
{
int ext_msr = 0;
+   int emul;
+   ulong pc;
+   u32 last_inst;
+
+   if (vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE) {
+   /* Do paired single instruction emulation */
+   pc = kvmppc_get_pc(vcpu);
+   last_inst = kvmppc_get_last_inst(vcpu);
+   emul = kvmppc_ld(vcpu, &pc, sizeof(u32), &last_inst,
+                    false);
+   if (emul == EMULATE_DONE)
+   goto program_interrupt;
+   else
+   r = RESUME_GUEST;
 
-   switch (exit_nr) {
-   case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP;  break;
-   case BOOK3S_INTERRUPT_ALTIVEC:ext_msr = MSR_VEC; break;
-   case BOOK3S_INTERRUPT_VSX:ext_msr = MSR_VSX; break;
+   break;
}
 
-   switch (kvmppc_check_ext(vcpu, exit_nr)) {
-   case EMULATE_DONE:
-   /* everything ok - let's enable the ext */
-   r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr);
+   /* Enable external provider */
+   switch (exit_nr) {
+   case BOOK3S_INTERRUPT_FP_UNAVAIL:
+   ext_msr = MSR_FP;
break;
-   case EMULATE_FAIL:
-   /* we need to emulate this instruction */
-   goto program_interrupt;
+
+   case BOOK3S_INTERRUPT_ALTIVEC:
+   ext_msr = MSR_VEC;
break;
-   default:
-   /* nothing to worry about - go again */
+
+   case BOOK3S_INTERRUPT_VSX:
+   ext_msr = MSR_VSX;
break;
}
+
+   r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr);
break;
}
case BOOK3S_INTERRUPT_ALIGNMENT:
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
+   {
+   ulong pc = kvmppc_get_pc(vcpu);
+   u32 last_inst = kvmppc_get_last_inst(vcpu);
+   int emul = kvmppc_ld(vcpu, &pc, sizeof(u32), &last_inst, false);
+
+   if (emul == EMULATE_DONE) {
u32 dsisr;
u64 dar;
 
@@ -1187,6

[PULL 42/63] KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

Scott Wood pointed out that we are no longer using SPRG1 for the vcpu pointer,
but using SPRN_SPRG_THREAD <=> SPRG3 (thread->vcpu). So this comment
is no longer valid.

Note: SPRN_SPRG3R is not supported (do not see any need as of now),
and if we want to support this in future then we have to shift to using
SPRG1 for VCPU pointer.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/reg.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c8f3381..0ef17ad 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -944,9 +944,6 @@
  *  readable variant for reads, which can avoid a fault
  *  with KVM type virtualization.
  *
- *  (*) Under KVM, the host SPRG1 is used to point to
- *  the current VCPU data structure
- *
  * 32-bit 8xx:
  * - SPRG0 scratch for exception vectors
  * - SPRG1 scratch for exception vectors
-- 
1.8.1.4

[PULL 43/63] KVM: PPC: Remove 440 support

2014-08-01 Thread Alexander Graf

The 440 target hasn't been functioning properly for a few releases, and
before that I was the only one who fixed a very serious bug, which indicates
to me that nobody used it before either.

Furthermore, KVM on 440 is slow to the point of being unusable.

We don't have to carry along completely unused code. Remove 440 and give
us one less thing to worry about.

Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/powerpc/00-INDEX        |   2 -
 Documentation/powerpc/kvm_440.txt     |  41 ---
 arch/powerpc/Kconfig.debug            |   4 +-
 arch/powerpc/configs/ppc44x_defconfig |   1 -
 arch/powerpc/include/asm/kvm_44x.h    |  67 -
 arch/powerpc/include/asm/kvm_asm.h    |   1 -
 arch/powerpc/include/asm/kvm_host.h   |   3 -
 arch/powerpc/kvm/44x.c                | 237 ---
 arch/powerpc/kvm/44x_emulate.c        | 194 -
 arch/powerpc/kvm/44x_tlb.c            | 528 --
 arch/powerpc/kvm/44x_tlb.h            |  86 --
 arch/powerpc/kvm/Kconfig              |  16 +-
 arch/powerpc/kvm/Makefile             |  12 -
 arch/powerpc/kvm/booke.h              |   7 -
 arch/powerpc/kvm/booke_interrupts.S   |   5 -
 arch/powerpc/kvm/bookehv_interrupts.S |   1 -
 arch/powerpc/kvm/powerpc.c            |   1 -
 17 files changed, 2 insertions(+), 1204 deletions(-)
 delete mode 100644 Documentation/powerpc/kvm_440.txt
 delete mode 100644 arch/powerpc/include/asm/kvm_44x.h
 delete mode 100644 arch/powerpc/kvm/44x.c
 delete mode 100644 arch/powerpc/kvm/44x_emulate.c
 delete mode 100644 arch/powerpc/kvm/44x_tlb.c
 delete mode 100644 arch/powerpc/kvm/44x_tlb.h

diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX
index 6db73df..a68784d 100644
--- a/Documentation/powerpc/00-INDEX
+++ b/Documentation/powerpc/00-INDEX
@@ -17,8 +17,6 @@ firmware-assisted-dump.txt
- Documentation on the firmware assisted dump mechanism fadump.
 hvcs.txt
- IBM Hypervisor Virtual Console Server Installation Guide
-kvm_440.txt
-   - Various notes on the implementation of KVM for PowerPC 440.
 mpc52xx.txt
- Linux 2.6.x on MPC52xx family
 pmu-ebb.txt
diff --git a/Documentation/powerpc/kvm_440.txt b/Documentation/powerpc/kvm_440.txt
deleted file mode 100644
index c02a003..000
--- a/Documentation/powerpc/kvm_440.txt
+++ /dev/null
@@ -1,41 +0,0 @@
-Hollis Blanchard holl...@us.ibm.com
-15 Apr 2008
-
-Various notes on the implementation of KVM for PowerPC 440:
-
-To enforce isolation, host userspace, guest kernel, and guest userspace all
-run at user privilege level. Only the host kernel runs in supervisor mode.
-Executing privileged instructions in the guest traps into KVM (in the host
-kernel), where we decode and emulate them. Through this technique, unmodified
-440 Linux kernels can be run (slowly) as guests. Future performance work will
-focus on reducing the overhead and frequency of these traps.
-
-The usual code flow is started from userspace invoking an run ioctl, which
-causes KVM to switch into guest context. We use IVPR to hijack the host
-interrupt vectors while running the guest, which allows us to direct all
-interrupts to kvmppc_handle_interrupt(). At this point, we could either
-- handle the interrupt completely (e.g. emulate mtspr SPRG0), or
-- let the host interrupt handler run (e.g. when the decrementer fires), or
-- return to host userspace (e.g. when the guest performs device MMIO)
-
-Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
-address space (in host or guest), which gives us virtual address space to use
-for guest mappings. While the guest is running, the host kernel remains mapped
-in AS=0, but the guest can only use AS=1 mappings.
-
-TLB entries: The TLB entries covering the host linear mapping remain
-present while running the guest. This reduces the overhead of lightweight
-exits, which are handled by KVM running in the host kernel. We keep three
-copies of the TLB:
- - guest TLB: contents of the TLB as the guest sees it
- - shadow TLB: the TLB that is actually in hardware while guest is running
- - host TLB: to restore TLB state when context switching guest <-> host
-When a TLB miss occurs because a mapping was not present in the shadow TLB,
-but was present in the guest TLB, KVM handles the fault without invoking the
-guest. Large guest pages are backed by multiple 4KB shadow pages through this
-mechanism.
-
-IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
-and block IO, so those drivers must be enabled in the guest. It's possible
-that some qemu device emulation (e.g. e1000 or rtl8139) may also work with
-little effort.
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 790352f..93500f6 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -202,9 +202,7 @@ config PPC_EARLY_DEBUG_BEAT
 
 config PPC_EARLY_DEBUG_44x
bool "Early serial debugging for IBM/AMCC 44x CPUs"
-   # PPC_EARLY_DEBUG on 440 leaves

[PULL 46/63] KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication

2014-08-01 Thread Alexander Graf

From: Paul Mackerras pau...@samba.org

At present, kvmppc_ld calls kvmppc_xlate, and if kvmppc_xlate returns
any error indication, it returns -ENOENT, which is taken to mean an
HPTE not found error.  However, the error could have been a segment
not found (no SLB entry) or a permission error.  Similarly,
kvmppc_pte_to_hva currently does permission checking, but any error
from it is taken by kvmppc_ld to mean that the access is an emulated
MMIO access.  Also, kvmppc_ld does no execute permission checking.

This fixes these problems by (a) returning any error from kvmppc_xlate
directly, (b) moving the permission check from kvmppc_pte_to_hva
into kvmppc_ld, and (c) adding an execute permission check to kvmppc_ld.

This is similar to what was done for kvmppc_st() by commit 82ff911317c3
(KVM: PPC: Deflect page write faults properly in kvmppc_st).
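
To illustrate why distinct return codes matter, here is a minimal sketch of
how a caller could tell the failure cases apart once kvmppc_ld stops
collapsing them into -ENOENT. The enum and the function name
classify_ld_result are hypothetical and exist only for this illustration;
only the three error codes come from the description above.

```c
#include <errno.h>

/* Hypothetical fault classes a caller might derive from the return value. */
enum ld_fault {
	LD_OK,		/* non-negative: an EMULATE_* style status */
	LD_NO_HPTE,	/* -ENOENT: translation (HPTE) not found */
	LD_NO_PERM,	/* -EPERM: no read permission */
	LD_NO_EXEC,	/* -ENOEXEC: instruction fetch from a non-executable page */
	LD_OTHER,	/* any other error passed through from the translation step */
};

static enum ld_fault classify_ld_result(int rc)
{
	if (rc >= 0)
		return LD_OK;

	switch (rc) {
	case -ENOENT:
		return LD_NO_HPTE;
	case -EPERM:
		return LD_NO_PERM;
	case -ENOEXEC:
		return LD_NO_EXEC;
	default:
		return LD_OTHER;
	}
}
```

With a single -ENOENT for every failure, all of these cases would land in
LD_NO_HPTE and the guest would see the wrong kind of fault.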

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 37ca8a0..a3cbada 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -413,17 +413,10 @@ static hva_t kvmppc_bad_hva(void)
return PAGE_OFFSET;
 }
 
-static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte,
-  bool read)
+static hva_t kvmppc_pte_to_hva(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
 {
hva_t hpage;
 
-   if (read && !pte->may_read)
-   goto err;
-
-   if (!read && !pte->may_write)
-   goto err;
-
hpage = gfn_to_hva(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
if (kvm_is_error_hva(hpage))
goto err;
@@ -462,15 +455,23 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
 {
struct kvmppc_pte pte;
hva_t hva = *eaddr;
+   int rc;
 
vcpu->stat.ld++;
 
-   if (kvmppc_xlate(vcpu, *eaddr, data, false, &pte))
-   goto nopte;
+   rc = kvmppc_xlate(vcpu, *eaddr, data, false, &pte);
+   if (rc)
+   return rc;
 
*eaddr = pte.raddr;
 
-   hva = kvmppc_pte_to_hva(vcpu, &pte, true);
+   if (!pte.may_read)
+   return -EPERM;
+
+   if (!data && !pte.may_execute)
+   return -ENOEXEC;
+
+   hva = kvmppc_pte_to_hva(vcpu, &pte);
if (kvm_is_error_hva(hva))
goto mmio;
 
@@ -481,8 +482,6 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
 
return EMULATE_DONE;
 
-nopte:
-   return -ENOENT;
 mmio:
return EMULATE_DO_MMIO;
 }
-- 
1.8.1.4


[PULL 44/63] KVM: PPC: Book3S: Fix LPCR one_reg interface

2014-08-01 Thread Alexander Graf

From: Alexey Kardashevskiy a...@ozlabs.ru

Unfortunately, the LPCR got defined as a 32-bit register in the
one_reg interface.  This is a problem because KVM allows userspace
to control the DPFD (default prefetch depth) field, which is in the
upper 32 bits.  The result is that DPFD always gets set to 0, which
reduces performance in the guest.

We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID,
since that would break existing userspace binaries.  Instead we define
a new KVM_REG_PPC_LPCR_64 id which is 64-bit.  Userspace can still use
the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in
the bottom 32 bits that userspace can modify (ILE, TC and AIL).
If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD
as well.
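
The masking rule described above can be sketched in isolation as follows.
This is only an illustration under assumptions: merge_lpcr and its
parameters are hypothetical names, and the caller is assumed to pass in
the set of bits userspace is allowed to change.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Merge a userspace-supplied LPCR value into the current one.
 * `allowed` is the set of bits userspace may change. When the legacy
 * 32-bit register ID is used, the upper 32 bits are left untouched
 * instead of being cleared.
 */
static uint64_t merge_lpcr(uint64_t cur, uint64_t new_val, uint64_t allowed,
			   bool preserve_top32)
{
	if (preserve_top32)
		allowed &= 0xFFFFFFFFULL;

	return (cur & ~allowed) | (new_val & allowed);
}
```

A write through the old ID then behaves like a write through the new ID
with the upper half of the mask removed, so a field such as DPFD keeps
whatever value it had before.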

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Paul Mackerras pau...@samba.org
Cc: sta...@vger.kernel.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/virtual/kvm/api.txt   |  3 ++-
 arch/powerpc/include/uapi/asm/kvm.h |  1 +
 arch/powerpc/kvm/book3s_hv.c        | 13 +++--
 arch/powerpc/kvm/book3s_pr.c        |  2 ++
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 6955318..884f819 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1869,7 +1869,8 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_PID  | 64
   PPC   | KVM_REG_PPC_ACOP | 64
   PPC   | KVM_REG_PPC_VRSAVE   | 32
-  PPC   | KVM_REG_PPC_LPCR | 64
+  PPC   | KVM_REG_PPC_LPCR | 32
+  PPC   | KVM_REG_PPC_LPCR_64  | 64
   PPC   | KVM_REG_PPC_PPR  | 64
   PPC   | KVM_REG_PPC_ARCH_COMPAT 32
   PPC   | KVM_REG_PPC_DABRX | 32
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 0e56d9e..e0e49db 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -548,6 +548,7 @@ struct kvm_get_htab_header {
 
 #define KVM_REG_PPC_VRSAVE  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb4)
 #define KVM_REG_PPC_LPCR    (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb5)
+#define KVM_REG_PPC_LPCR_64 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb5)
 #define KVM_REG_PPC_PPR     (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb6)
 
 /* Architecture compatibility level */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f1281c4..0c5266e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -863,7 +863,8 @@ static int kvm_arch_vcpu_ioctl_set_sregs_hv(struct kvm_vcpu *vcpu,
return 0;
 }
 
-static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr)
+static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
+   bool preserve_top32)
 {
struct kvmppc_vcore *vc = vcpu->arch.vcore;
u64 mask;
@@ -898,6 +899,10 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr)
mask = LPCR_DPFD | LPCR_ILE | LPCR_TC;
if (cpu_has_feature(CPU_FTR_ARCH_207S))
mask |= LPCR_AIL;
+
+   /* Broken 32-bit version of LPCR must not clear top bits */
+   if (preserve_top32)
+   mask &= 0xFFFFFFFF;
vc->lpcr = (vc->lpcr & ~mask) | (new_lpcr & mask);
spin_unlock(&vc->lock);
 }
@@ -1011,6 +1016,7 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
*val = get_reg_val(id, vcpu->arch.vcore->tb_offset);
break;
case KVM_REG_PPC_LPCR:
+   case KVM_REG_PPC_LPCR_64:
*val = get_reg_val(id, vcpu->arch.vcore->lpcr);
break;
case KVM_REG_PPC_PPR:
@@ -1216,7 +1222,10 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
ALIGN(set_reg_val(id, *val), 1UL << 24);
break;
case KVM_REG_PPC_LPCR:
-   kvmppc_set_lpcr(vcpu, set_reg_val(id, *val));
+   kvmppc_set_lpcr(vcpu, set_reg_val(id, *val), true);
+   break;
+   case KVM_REG_PPC_LPCR_64:
+   kvmppc_set_lpcr(vcpu, set_reg_val(id, *val), false);
break;
case KVM_REG_PPC_PPR:
vcpu->arch.ppr = set_reg_val(id, *val);
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index b18f2d4..e7a1fa2 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1314,6 +1314,7 @@ static int kvmppc_get_one_reg_pr(struct kvm_vcpu *vcpu, u64 id,
*val = get_reg_val(id, to_book3s(vcpu)->hior);
break;
case KVM_REG_PPC_LPCR:
+   case KVM_REG_PPC_LPCR_64:
/*
 * We are only interested in the LPCR_ILE bit
 */
@@ -1349,6 +1350,7 @@ static int kvmppc_set_one_reg_pr(struct kvm_vcpu *vcpu, u64 id,
to_book3s(vcpu)->hior_explicit = true;
break;
case KVM_REG_PPC_LPCR:
+   case KVM_REG_PPC_LPCR_64:

[PULL 62/63] KVM: PPC: HV: Remove generic instruction emulation

2014-08-01 Thread Alexander Graf

Now that we have properly split load/store instruction emulation and generic
instruction emulation, we can move the generic one from kvm.ko to kvm-pr.ko
on book3s_64.

This reduces the attack surface and amount of code loaded on HV KVM kernels.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Makefile   |  2 +-
 arch/powerpc/kvm/trace_pr.h | 20 
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 1ccd7a1..2d590de 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -48,6 +48,7 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) := \
 
 kvm-pr-y := \
fpu.o \
+   emulate.o \
book3s_paired_singles.o \
book3s_pr.o \
book3s_pr_papr.o \
@@ -91,7 +92,6 @@ kvm-book3s_64-module-objs += \
$(KVM)/kvm_main.o \
$(KVM)/eventfd.o \
powerpc.o \
-   emulate.o \
emulate_loadstore.o \
book3s.o \
book3s_64_vio.o \
diff --git a/arch/powerpc/kvm/trace_pr.h b/arch/powerpc/kvm/trace_pr.h
index e1357cd..a674f09 100644
--- a/arch/powerpc/kvm/trace_pr.h
+++ b/arch/powerpc/kvm/trace_pr.h
@@ -291,6 +291,26 @@ TRACE_EVENT(kvm_unmap_hva,
TP_printk("unmap hva 0x%lx\n", __entry->hva)
 );
 
+TRACE_EVENT(kvm_ppc_instr,
+   TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
+   TP_ARGS(inst, _pc, emulate),
+
+   TP_STRUCT__entry(
+   __field(unsigned int,   inst)
+   __field(unsigned long,  pc  )
+   __field(unsigned int,   emulate )
+   ),
+
+   TP_fast_assign(
+   __entry->inst   = inst;
+   __entry->pc = _pc;
+   __entry->emulate= emulate;
+   ),
+
+   TP_printk("inst %u pc 0x%lx emulate %u\n",
+ __entry->inst, __entry->pc, __entry->emulate)
+);
+
 #endif /* _TRACE_KVM_H */
 
 /* This part must be outside protection */
-- 
1.8.1.4


[PULL 41/63] KVM: PPC: Booke-hv: Add one reg interface for SPRG9

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

We now support SPRG9 for the guest, so also add a one_reg interface for it.
Note: Changes are in bookehv code only, as we do not have SPRG9 on booke-pr.
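
The dispatch pattern this relies on (handle the new register locally, hand
everything else to the existing handler) can be sketched as below. All
names here (vcpu_model, REG_SPRG9, the fallback functions) are hypothetical
stand-ins for this illustration, not the kernel's types.

```c
#include <stdint.h>

#define REG_SPRG9 0xbaULL	/* hypothetical short ID used only in this sketch */

struct vcpu_model {
	uint64_t sprg9;
};

/* Stand-ins for the generic handlers that know the remaining registers. */
static int get_one_reg_fallback(struct vcpu_model *v, uint64_t id, uint64_t *val)
{
	(void)v; (void)id; (void)val;
	return -1;	/* unknown register in this sketch */
}

static int set_one_reg_fallback(struct vcpu_model *v, uint64_t id, uint64_t val)
{
	(void)v; (void)id; (void)val;
	return -1;	/* unknown register in this sketch */
}

static int get_one_reg(struct vcpu_model *v, uint64_t id, uint64_t *val)
{
	int r = 0;

	switch (id) {
	case REG_SPRG9:
		*val = v->sprg9;
		break;
	default:
		r = get_one_reg_fallback(v, id, val);
	}
	return r;
}

static int set_one_reg(struct vcpu_model *v, uint64_t id, uint64_t val)
{
	int r = 0;

	switch (id) {
	case REG_SPRG9:
		v->sprg9 = val;
		break;
	default:
		r = set_one_reg_fallback(v, id, val);
	}
	return r;
}
```

Keeping the fallback in the default branch means registers handled by the
older code path keep working unchanged.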

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/uapi/asm/kvm.h |  1 +
 arch/powerpc/kvm/e500mc.c           | 22 --
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 2bc4a94..0e56d9e 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -555,6 +555,7 @@ struct kvm_get_htab_header {
 
 #define KVM_REG_PPC_DABRX  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb8)
 #define KVM_REG_PPC_WORT   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb9)
+#define KVM_REG_PPC_SPRG9  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xba)
 
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 690499d..164bad2 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -267,14 +267,32 @@ static int kvmppc_core_set_sregs_e500mc(struct kvm_vcpu *vcpu,
 static int kvmppc_get_one_reg_e500mc(struct kvm_vcpu *vcpu, u64 id,
  union kvmppc_one_reg *val)
 {
-   int r = kvmppc_get_one_reg_e500_tlb(vcpu, id, val);
+   int r = 0;
+
+   switch (id) {
+   case KVM_REG_PPC_SPRG9:
+   *val = get_reg_val(id, vcpu->arch.sprg9);
+   break;
+   default:
+   r = kvmppc_get_one_reg_e500_tlb(vcpu, id, val);
+   }
+
return r;
 }
 
 static int kvmppc_set_one_reg_e500mc(struct kvm_vcpu *vcpu, u64 id,
  union kvmppc_one_reg *val)
 {
-   int r = kvmppc_set_one_reg_e500_tlb(vcpu, id, val);
+   int r = 0;
+
+   switch (id) {
+   case KVM_REG_PPC_SPRG9:
+   vcpu->arch.sprg9 = set_reg_val(id, *val);
+   break;
+   default:
+   r = kvmppc_set_one_reg_e500_tlb(vcpu, id, val);
+   }
+
return r;
 }
 
-- 
1.8.1.4


[PULL 63/63] KVM: PPC: PR: Handle FSCR feature deselects

2014-08-01 Thread Alexander Graf

We handle FSCR feature bits (well, TAR only really today) lazily when the guest
starts using them. So when a guest activates the bit and later uses that feature
we enable it for real in hardware.

However, when the guest stops using that bit we don't stop setting it in
hardware. That means we can potentially lose a trap that the guest expects to
happen because it thinks a feature is not active.

This patch adds support for dropping TAR when the guest turns it off in FSCR.
While at it, it also restricts FSCR access to 64bit systems - 32bit ones don't
have it.
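
The core of the fix is an edge detection on one bit: act only when TAR goes
from set to clear. A minimal sketch, assuming a simplified vcpu model; the
struct, the tar_live flag and the bit position in FSCR_TAR_BIT are
assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define FSCR_TAR_BIT (1ULL << 8)	/* assumed position of the TAR facility bit */

struct vcpu_model {
	uint64_t fscr;	/* guest-visible FSCR */
	bool tar_live;	/* TAR currently granted in hardware (shadow state) */
};

static void set_fscr(struct vcpu_model *v, uint64_t fscr)
{
	/*
	 * The guest cleared TAR: stop granting it in hardware too, so that
	 * a later use of the facility traps as the guest expects.
	 */
	if ((v->fscr & FSCR_TAR_BIT) && !(fscr & FSCR_TAR_BIT))
		v->tar_live = false;

	v->fscr = fscr;
}
```

Writes that leave the bit set, or that set it for the first time, do not
touch the shadow state; enabling stays lazy as described above.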

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h | 1 +
 arch/powerpc/kvm/book3s_emulate.c     | 6 +++---
 arch/powerpc/kvm/book3s_pr.c          | 9 +
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 6166791..6acf0c2 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -182,6 +182,7 @@ extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
struct kvm_memory_slot *memslot, unsigned long *map);
 extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr,
unsigned long mask);
+extern void kvmppc_set_fscr(struct kvm_vcpu *vcpu, u64 fscr);
 
 extern void kvmppc_entry_trampoline(void);
 extern void kvmppc_hv_entry_trampoline(void);
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 84fddcd..5a2bc4b 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -449,10 +449,10 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
case SPRN_GQR7:
to_book3s(vcpu)->gqr[sprn - SPRN_GQR0] = spr_val;
break;
+#ifdef CONFIG_PPC_BOOK3S_64
case SPRN_FSCR:
-   vcpu->arch.fscr = spr_val;
+   kvmppc_set_fscr(vcpu, spr_val);
break;
-#ifdef CONFIG_PPC_BOOK3S_64
case SPRN_BESCR:
vcpu->arch.bescr = spr_val;
break;
@@ -593,10 +593,10 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
case SPRN_GQR7:
*spr_val = to_book3s(vcpu)->gqr[sprn - SPRN_GQR0];
break;
+#ifdef CONFIG_PPC_BOOK3S_64
case SPRN_FSCR:
*spr_val = vcpu->arch.fscr;
break;
-#ifdef CONFIG_PPC_BOOK3S_64
case SPRN_BESCR:
*spr_val = vcpu->arch.bescr;
break;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e7a1fa2..faffb27 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -871,6 +871,15 @@ static int kvmppc_handle_fac(struct kvm_vcpu *vcpu, ulong fac)
 
return RESUME_GUEST;
 }
+
+void kvmppc_set_fscr(struct kvm_vcpu *vcpu, u64 fscr)
+{
+   if ((vcpu->arch.fscr & FSCR_TAR) && !(fscr & FSCR_TAR)) {
+   /* TAR got dropped, drop it in shadow too */
+   kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+   }
+   vcpu->arch.fscr = fscr;
+}
 #endif
 
 int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
-- 
1.8.1.4


[PULL 51/63] KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode

2014-08-01 Thread Alexander Graf

With Book3S KVM we can create both PR and HV VMs in parallel on the same
machine. That gives us new challenges on the CAPs we return - both have
different capabilities.

When we get asked about CAPs on the kvm fd, there's nothing we can do. We
can try to be smart and assume we're running HV if HV is available, PR
otherwise. However with the newly added VM CHECK_EXTENSION we can now ask
for capabilities directly on a VM which knows whether it's PR or HV.

With this patch I can successfully expose KVM PVINFO data to user space
in the PR case, fixing magic page mapping for PAPR guests.
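
The decision described above, guess when asked on the kvm fd and use the
real answer when asked on a VM, can be sketched as follows. The types and
names (vm_model, hv_module_loaded, effective_hv) are hypothetical and only
model the logic; they are not the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

struct vm_model {
	bool is_hv;	/* whether this particular VM runs in HV mode */
};

/* Models "is the HV module loaded?"; a hypothetical global for this sketch. */
static bool hv_module_loaded;

static bool effective_hv(const struct vm_model *vm)
{
	/* Query on the kvm fd: no VM is known, so fall back to a best guess. */
	bool hv = hv_module_loaded;

	/* Query on a VM fd: the VM knows its own type, no guessing needed. */
	if (vm)
		hv = vm->is_hv;

	return hv;
}
```

A PR VM created on a machine where the HV module is also loaded is the case
where the two answers differ, which is why capabilities should be queried on
the VM when that is possible.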

Signed-off-by: Alexander Graf ag...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
---
 arch/powerpc/kvm/powerpc.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index d870bac..eaa57da 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -394,11 +394,17 @@ void kvm_arch_sync_events(struct kvm *kvm)
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
-   /* FIXME!!
-* Should some of this be vm ioctl ? is it possible now ?
-*/
+   /* Assume we're using HV mode when the HV module is loaded */
int hv_enabled = kvmppc_hv_ops ? 1 : 0;
 
+   if (kvm) {
+   /*
+* Hooray - we know which VM type we're running on. Depend on
+* that rather than the guess above.
+*/
+   hv_enabled = is_kvmppc_hv_enabled(kvm);
+   }
+
switch (ext) {
 #ifdef CONFIG_BOOKE
case KVM_CAP_PPC_BOOKE_SREGS:
-- 
1.8.1.4


[PULL 61/63] KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

These are not specific to e500hv but apply to bookehv in general
(as per a comment from Scott Wood on my patch
"kvm: ppc: bookehv: Added wrapper macros for shadow registers").

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 8e36c1e..fb86a22 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -539,16 +539,16 @@ static inline bool kvmppc_shared_big_endian(struct kvm_vcpu *vcpu)
 #endif
 }
 
-#define SPRNG_WRAPPER_GET(reg, e500hv_spr) \
+#define SPRNG_WRAPPER_GET(reg, bookehv_spr)\
 static inline ulong kvmppc_get_##reg(struct kvm_vcpu *vcpu)\
 {  \
-   return mfspr(e500hv_spr);   \
+   return mfspr(bookehv_spr);  \
 }  \
 
-#define SPRNG_WRAPPER_SET(reg, e500hv_spr) \
+#define SPRNG_WRAPPER_SET(reg, bookehv_spr)\
 static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, ulong val)  \
 {  \
-   mtspr(e500hv_spr, val); \
+   mtspr(bookehv_spr, val); \
 }  \
 
 #define SHARED_WRAPPER_GET(reg, size)  \
@@ -573,18 +573,18 @@ static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, u##size val)   \
SHARED_WRAPPER_GET(reg, size)   \
SHARED_WRAPPER_SET(reg, size)   \
 
-#define SPRNG_WRAPPER(reg, e500hv_spr) \
-   SPRNG_WRAPPER_GET(reg, e500hv_spr)  \
-   SPRNG_WRAPPER_SET(reg, e500hv_spr)  \
+#define SPRNG_WRAPPER(reg, bookehv_spr) \
+   SPRNG_WRAPPER_GET(reg, bookehv_spr) \
+   SPRNG_WRAPPER_SET(reg, bookehv_spr) \
 
 #ifdef CONFIG_KVM_BOOKE_HV
 
-#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
-   SPRNG_WRAPPER(reg, e500hv_spr)  \
+#define SHARED_SPRNG_WRAPPER(reg, size, bookehv_spr)   \
+   SPRNG_WRAPPER(reg, bookehv_spr) \
 
 #else
 
-#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+#define SHARED_SPRNG_WRAPPER(reg, size, bookehv_spr)   \
SHARED_WRAPPER(reg, size)   \
 
 #endif
-- 
1.8.1.4


[PULL 49/63] KVM: Rename and add argument to check_extension

2014-08-01 Thread Alexander Graf

In preparation for making the check_extension function available at VM scope,
we add a struct kvm * argument to the function header and rename the function
accordingly. It will still be called from the /dev/kvm fd, but with a NULL
argument for struct kvm *.

Signed-off-by: Alexander Graf ag...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
---
 arch/arm/kvm/arm.c         | 2 +-
 arch/ia64/kvm/kvm-ia64.c   | 2 +-
 arch/mips/kvm/mips.c       | 2 +-
 arch/powerpc/kvm/powerpc.c | 2 +-
 arch/s390/kvm/kvm-s390.c   | 2 +-
 arch/x86/kvm/x86.c         | 2 +-
 include/linux/kvm_host.h   | 2 +-
 virt/kvm/kvm_main.c        | 6 +++---
 8 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 3c82b37..cb77f999 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -184,7 +184,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
}
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
switch (ext) {
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 6a4309b..0729ba6 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -190,7 +190,7 @@ void kvm_arch_check_processor_compat(void *rtn)
*(int *)rtn = 0;
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 
int r;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index d687c6e..3ca79aa 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -885,7 +885,7 @@ int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
return VM_FAULT_SIGBUS;
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 8e03568..d870bac 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -391,7 +391,7 @@ void kvm_arch_sync_events(struct kvm *kvm)
 {
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
/* FIXME!!
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 2f3e14f..00268ca 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -146,7 +146,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
return -EINVAL;
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a8691b..5a62d91 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2616,7 +2616,7 @@ out:
return r;
 }
 
-int kvm_dev_ioctl_check_extension(long ext)
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
int r;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec4e3bd..5065b95 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -602,7 +602,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg);
 int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
 
-int kvm_dev_ioctl_check_extension(long ext);
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
 
 int kvm_get_dirty_log(struct kvm *kvm,
struct kvm_dirty_log *log, int *is_dirty);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..e28f3ca 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2571,7 +2571,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
return r;
 }
 
-static long kvm_dev_ioctl_check_extension_generic(long arg)
+static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 {
switch (arg) {
case KVM_CAP_USER_MEMORY:
@@ -2595,7 +2595,7 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
default:
break;
}
-   return kvm_dev_ioctl_check_extension(arg);
+   return kvm_vm_ioctl_check_extension(kvm, arg);
 }
 
 static long kvm_dev_ioctl(struct file *filp,
@@ -2614,7 +2614,7 @@ static long kvm_dev_ioctl(struct file *filp,
r = kvm_dev_ioctl_create_vm(arg);
break;
case KVM_CHECK_EXTENSION:
-   r = kvm_dev_ioctl_check_extension_generic(arg);
+   r = kvm_vm_ioctl_check_extension_generic(NULL, arg);
break;
case KVM_GET_VCPU_MMAP_SIZE:
r = -EINVAL;
-- 
1.8.1.4


[PULL 27/63] KVM: PPC: Book3S: Add hack for split real mode

2014-08-01 Thread Alexander Graf

Today we handle split real mode by mapping both instruction and data faults
into a special virtual address space that only exists during the split mode
phase.

This is good enough to catch 32bit Linux guests that use split real mode for
copy_from/to_user. In this case we're always prefixed with 0xc000 for our
instruction pointer and can map the user space process freely below there.

However, that approach fails when we're running KVM inside of KVM. Here the 1st
level last_inst reader may well be in the same virtual page as a 2nd level
interrupt handler.

It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a
kernel copy_from/to_user implementation can easily overlap with user space
addresses.

The architecturally correct way to fix this would be to implement an instruction
interpreter in KVM that kicks in whenever we go into split real mode. This
interpreter however would not receive a great amount of testing and be a lot of
bloat for a reasonably isolated corner case.

So I went back to the drawing board and tried to come up with a way to make
split real mode work with a single flat address space. And then I realized that
we could get away with the same trick that makes it work for Linux:

Whenever we see an instruction address during split real mode that may collide,
we just move it higher up the virtual address space to a place that hopefully
does not collide (keep your fingers crossed!).
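
As a rough sketch of the relocation trick, assuming a simplified vcpu model:
the fixup moves a low instruction address into a high window and remembers
that it did so, and the unfixup undoes it before the address becomes visible
to the guest again. The struct, the function names and the two constants are
assumptions for this illustration (the constants are example values for a
32-bit guest address space), not the patch's definitions.

```c
#include <stdbool.h>

/* Assumed example values for a 32-bit guest address space. */
#define HACK_MASK 0xff000000UL
#define HACK_OFFS 0xfb000000UL

struct vcpu_model {
	unsigned long pc;
	bool split_hack;	/* set while the pc is relocated */
};

static void fixup_split_real(struct vcpu_model *v)
{
	if (v->split_hack)
		return;		/* already relocated */
	if (v->pc & HACK_MASK)
		return;		/* address is not in the range we can relocate */

	v->split_hack = true;
	v->pc |= HACK_OFFS;
}

static void unfixup_split_real(struct vcpu_model *v)
{
	if (!v->split_hack)
		return;

	/* Only strip the offset if the pc still sits in the hack window. */
	if ((v->pc & HACK_MASK) == HACK_OFFS)
		v->pc &= ~HACK_MASK;
	v->split_hack = false;
}
```

The flag is what makes the pair safe to call repeatedly: a second fixup is a
no-op, and an unfixup without a preceding fixup leaves the pc alone.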

That approach does work surprisingly well. I am able to successfully run
Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I
apply a tiny timing probe hack to QEMU. I'd say this is a win over even more
broken split real mode :).

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_asm.h    |  1 +
 arch/powerpc/include/asm/kvm_book3s.h |  3 +++
 arch/powerpc/kvm/book3s.c             | 19 ++
 arch/powerpc/kvm/book3s_pr.c          | 48 +++
 4 files changed, 71 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index 9601741..3f3e530 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -131,6 +131,7 @@
 #define BOOK3S_HFLAG_NATIVE_PS 0x8
 #define BOOK3S_HFLAG_MULTI_PGSIZE  0x10
 #define BOOK3S_HFLAG_NEW_TLBIE 0x20
+#define BOOK3S_HFLAG_SPLIT_HACK 0x40
 
 #define RESUME_FLAG_NV   (1<<0)  /* Reload guest nonvolatile state? */
 #define RESUME_FLAG_HOST (1<<1)  /* Resume host? */
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 8ac5392..b1cf18d 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -324,4 +324,7 @@ static inline bool is_kvmppc_resume_guest(int r)
 /* LPIDs we support with this build -- runtime limit may be lower */
 #define KVMPPC_NR_LPIDS (LPID_RSVD + 1)
 
+#define SPLIT_HACK_MASK 0xff00
+#define SPLIT_HACK_OFFS 0xfb00
+
 #endif /* __ASM_KVM_BOOK3S_H__ */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 9624c56..1d13764 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -72,6 +72,17 @@ void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu)
 {
 }
 
+void kvmppc_unfixup_split_real(struct kvm_vcpu *vcpu)
+{
+   if (vcpu->arch.hflags & BOOK3S_HFLAG_SPLIT_HACK) {
+   ulong pc = kvmppc_get_pc(vcpu);
+   if ((pc & SPLIT_HACK_MASK) == SPLIT_HACK_OFFS)
+   kvmppc_set_pc(vcpu, pc & ~SPLIT_HACK_MASK);
+   vcpu->arch.hflags &= ~BOOK3S_HFLAG_SPLIT_HACK;
+   }
+}
+EXPORT_SYMBOL_GPL(kvmppc_unfixup_split_real);
+
 static inline unsigned long kvmppc_interrupt_offset(struct kvm_vcpu *vcpu)
 {
if (!is_kvmppc_hv_enabled(vcpu->kvm))
@@ -118,6 +129,7 @@ static inline bool kvmppc_critical_section(struct kvm_vcpu *vcpu)
 
 void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags)
 {
+   kvmppc_unfixup_split_real(vcpu);
kvmppc_set_srr0(vcpu, kvmppc_get_pc(vcpu));
kvmppc_set_srr1(vcpu, kvmppc_get_msr(vcpu) | flags);
kvmppc_set_pc(vcpu, kvmppc_interrupt_offset(vcpu) + vec);
@@ -384,6 +396,13 @@ static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
pte->may_write = true;
pte->may_execute = true;
r = 0;
+
+   if ((kvmppc_get_msr(vcpu) & (MSR_IR | MSR_DR)) == MSR_DR &&
+   !data) {
+   if ((vcpu->arch.hflags & BOOK3S_HFLAG_SPLIT_HACK) &&
+   ((eaddr & SPLIT_HACK_MASK) == SPLIT_HACK_OFFS))
+   pte->raddr &= ~SPLIT_HACK_MASK;
+   }
}
 
return r;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 15fd6c2..6125f60 100644
---

[PULL 40/63] kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

SPRN_SPRG9 is used by the debug interrupt handler, so it must be saved and
restored on guest entry and exit for debug support.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h   | 1 +
 arch/powerpc/kernel/asm-offsets.c | 1 +
 arch/powerpc/kvm/bookehv_interrupts.S | 4 
 3 files changed, 6 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 855ba4d..562f685 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -587,6 +587,7 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 eptcfg;
u32 epr;
+   u64 sprg9;
u32 pwrmgtcr0;
u32 crit_save;
/* guest debug registers*/
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 17ffcb4..ab9ae04 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -668,6 +668,7 @@ int main(void)
DEFINE(VCPU_LR, offsetof(struct kvm_vcpu, arch.lr));
DEFINE(VCPU_CTR, offsetof(struct kvm_vcpu, arch.ctr));
DEFINE(VCPU_PC, offsetof(struct kvm_vcpu, arch.pc));
+   DEFINE(VCPU_SPRG9, offsetof(struct kvm_vcpu, arch.sprg9));
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index e000b39..b4f8fba 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -398,6 +398,7 @@ _GLOBAL(kvmppc_resume_host)
 #ifdef CONFIG_64BIT
PPC_LL  r3, PACA_SPRG_VDSO(r13)
 #endif
+   mfspr   r5, SPRN_SPRG9
PPC_STD(r6, VCPU_SHARED_SPRG4, r11)
mfspr   r8, SPRN_SPRG6
PPC_STD(r7, VCPU_SHARED_SPRG5, r11)
@@ -405,6 +406,7 @@ _GLOBAL(kvmppc_resume_host)
 #ifdef CONFIG_64BIT
mtspr   SPRN_SPRG_VDSO_WRITE, r3
 #endif
+   PPC_STD(r5, VCPU_SPRG9, r4)
PPC_STD(r8, VCPU_SHARED_SPRG6, r11)
mfxer   r3
PPC_STD(r9, VCPU_SHARED_SPRG7, r11)
@@ -639,7 +641,9 @@ lightweight_exit:
mtspr   SPRN_SPRG5W, r6
PPC_LD(r8, VCPU_SHARED_SPRG7, r11)
mtspr   SPRN_SPRG6W, r7
+   PPC_LD(r5, VCPU_SPRG9, r4)
mtspr   SPRN_SPRG7W, r8
+   mtspr   SPRN_SPRG9, r5
 
/* Load some guest volatiles. */
PPC_LL  r3, VCPU_LR(r4)
-- 
1.8.1.4


[PULL 48/63] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

2014-08-01 Thread Alexander Graf

From: Stewart Smith stew...@linux.vnet.ibm.com

The POWER8 processor has a Micro Partition Prefetch Engine, which is
a fancy way of saying it has a way to store and load the contents of the L2
or L2+MRU way of L3 cache. We initiate the storing of the log (list of
addresses) using the logmpp instruction and start the restore by writing
to an SPR.

The logmpp instruction takes parameters in a single 64bit register:
- starting address of the table to store log of L2/L2+L3 cache contents
  - 32kb for L2
  - 128kb for L2+L3
  - Aligned relative to maximum size of the table (32kb or 128kb)
- Log control (no-op, L2 only, L2 and L3, abort logout)

We should abort any ongoing logging before initiating one.
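
The alignment rule above (the table is aligned to its own maximum size) can
be checked with a single mask. This is a minimal sketch, not code from the
patch: the helper and macro names are hypothetical, and it assumes "32kb" and
"128kb" mean 32 KiB and 128 KiB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Table sizes quoted in the commit message (assumed to be KiB). */
#define MPP_L2_TABLE_SIZE    (32u * 1024u)   /* L2 only */
#define MPP_L2L3_TABLE_SIZE  (128u * 1024u)  /* L2 + L3 */

/*
 * Hypothetical helper: true if addr is aligned to table_size.
 * table_size must be a power of two, which holds for both sizes above.
 */
static bool mpp_table_aligned(uint64_t addr, uint64_t table_size)
{
	return (addr & (table_size - 1)) == 0;
}
```

For example, an address of 0x10000 satisfies the 32 KiB requirement but not
the 128 KiB one, so a buffer sized for L2-only logging cannot simply be
reused for L2+L3 logging without checking its alignment again.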

To initiate restore, we write to the MPPR SPR. The format of what to write
to the SPR is similar to the logmpp instruction parameter:
- starting address of the table to read from (same alignment requirements)
- table size (no data, until end of table)
- prefetch rate (from fastest possible to slower. about every 8, 16, 24 or
  32 cycles)

The idea behind loading and storing the contents of L2/L3 cache is to
reduce memory latency in a system that is frequently swapping vcores on
a physical CPU.

The best case scenario for doing this is when some vcores are doing very
cache heavy workloads. The worst case is when they have about 0 cache hits,
so we just generate needless memory operations.

This implementation just does L2 store/load. In my benchmarks this proves
to be useful.

Benchmark 1:
 - 16 core POWER8
 - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each
 - No split core/SMT
 - two guests running sysbench memory test.
   sysbench --test=memory --num-threads=8 run
 - one guest running apache bench (of default HTML page)
   ab -n 49 -c 400 http://localhost/

This benchmark aims to measure the performance of a real world application
(apache) where other guests are cache hot with their own workloads. The
sysbench memory benchmark does pointer sized writes to a (small) memory
buffer in a loop.

In this benchmark with this patch I can see an improvement both in requests
per second (~5%) and in mean and median response times (again, about 5%).
The spread of minimum and maximum response times was largely unchanged.

benchmark 2:
 - Same VM config as benchmark 1
 - all three guests running sysbench memory benchmark

This benchmark aims to see if there is a positive or negative effect on this
cache heavy benchmark. Due to the nature of the benchmark (stores) we may not
see a difference in performance, but rather, hopefully, an improvement in
consistency of performance (when a vcore is switched in, it doesn't have to
wait many times for cachelines to be pulled in).

The results of this benchmark are improvements in consistency of performance
rather than performance itself. With this patch, the few outliers in duration
go away and we get more consistent performance in each guest.

benchmark 3:
 - same 3 guests and CPU configuration as benchmark 1 and 2.
 - two idle guests
 - 1 guest running STREAM benchmark

This scenario also saw performance improvement with this patch. On Copy and
Scale workloads from STREAM, I got 5-6% improvement with this patch. For
Add and triad, it was around 10% (or more).

benchmark 4:
 - same 3 guests as previous benchmarks
 - two guests running sysbench --memory, distinctly different cache heavy
   workload
 - one guest running STREAM benchmark.

Similar improvements to benchmark 3.

benchmark 5:
 - 1 guest, 8 VCPUs, Ubuntu 14.04
 - Host configured with split core (SMT8, subcores-per-core=4)
 - STREAM benchmark

In this benchmark, we see a 10-20% performance improvement across the board
of STREAM benchmark results with this patch.

Based on preliminary investigation and microbenchmarks
by Prerna Saxena pre...@linux.vnet.ibm.com

Signed-off-by: Stewart Smith stew...@linux.vnet.ibm.com
Acked-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/cache.h  |  7 +
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/include/asm/ppc-opcode.h | 17 +++
 arch/powerpc/include/asm/reg.h|  1 +
 arch/powerpc/kvm/book3s_hv.c  | 57 ++-
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index ed0afc1..34a05a1 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -3,6 +3,7 @@
 
 #ifdef __KERNEL__
 
+#include <asm/reg.h>
 
 /* bytes per L1 cache line */
 #if defined(CONFIG_8xx) || defined(CONFIG_403GCX)
@@ -39,6 +40,12 @@ struct ppc64_caches {
 };
 
 extern struct ppc64_caches ppc64_caches;
+
+static inline void logmpp(u64 x)
+{
+	asm volatile(PPC_LOGMPP(R1) : : "r" (x));
+}
+
 #endif /* __powerpc64__ && ! __ASSEMBLY__ */
 
 #if defined(__ASSEMBLY__)
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 5fe2b5d..11385bb 100644
---

[PULL 57/63] KVM: PPC: Handle magic page in kvmppc_ld/st

2014-08-01 Thread Alexander Graf

We use kvmppc_ld and kvmppc_st to emulate load/store instructions that may
also access the magic page. Special-case those accesses so that they reach
the magic page correctly.
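
The override in this patch boils down to: if the translated address falls on
the magic page and the guest is not in user mode, copy to or from the shared
page at the same page offset. A minimal sketch of the load side follows. The
struct and function names are hypothetical stand-ins, the KVM_PAM mask and
the kvmppc_supports_magic_page() check are left out for brevity, and the
bounds check is an addition of this sketch rather than something the patch
does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE 4096u
#define SK_PAGE_MASK (~(uint64_t)(SK_PAGE_SIZE - 1))

/* Hypothetical stand-in for the vcpu state the patch consults. */
struct sk_vcpu {
	uint64_t magic_page_pa;        /* guest physical address, 0 = not set */
	bool guest_user_mode;          /* stands in for MSR_PR being set */
	uint8_t shared[SK_PAGE_SIZE];  /* stands in for vcpu->arch.shared */
};

/* Returns true and fills dst if raddr hits the magic page. */
static bool sk_magic_page_load(struct sk_vcpu *v, uint64_t raddr,
			       uint64_t eaddr, void *dst, size_t size)
{
	uint64_t mp_pa = v->magic_page_pa & SK_PAGE_MASK;
	uint64_t off = eaddr & 0xfff;

	if (!mp_pa || (raddr & SK_PAGE_MASK) != mp_pa || v->guest_user_mode)
		return false;
	/* Sketch-only guard: never read past the end of the shared page. */
	if (off + size > SK_PAGE_SIZE)
		return false;
	memcpy(dst, v->shared + off, size);
	return true;
}
```

The store side is symmetric, with the memcpy() arguments swapped. When the
function returns false the caller would fall through to the normal guest
memory access path.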

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  7 +++
 arch/powerpc/include/asm/kvm_booke.h  | 10 ++
 arch/powerpc/kvm/powerpc.c| 22 ++
 3 files changed, 39 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 172fd6d..6166791 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -286,6 +286,13 @@ static inline bool is_kvmppc_resume_guest(int r)
return (r == RESUME_GUEST || r == RESUME_GUEST_NV);
 }
 
+static inline bool is_kvmppc_hv_enabled(struct kvm *kvm);
+static inline bool kvmppc_supports_magic_page(struct kvm_vcpu *vcpu)
+{
+   /* Only PR KVM supports the magic page */
+	return !is_kvmppc_hv_enabled(vcpu->kvm);
+}
+
 /* Magic register values loaded into r3 and r4 before the 'sc' assembly
  * instruction for the OSI hypercalls */
 #define OSI_SC_MAGIC_R3	0x113724FA
diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index cbb1990..f7aa5cc 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -103,4 +103,14 @@ static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu 
*vcpu)
 {
 	return vcpu->arch.fault_dear;
 }
+
+static inline bool kvmppc_supports_magic_page(struct kvm_vcpu *vcpu)
+{
+   /* Magic page is only supported on e500v2 */
+#ifdef CONFIG_KVM_E500V2
+   return true;
+#else
+   return false;
+#endif
+}
 #endif /* __ASM_KVM_BOOKE_H__ */
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index be40886..544d1d3 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -312,6 +312,7 @@ EXPORT_SYMBOL_GPL(kvmppc_emulate_mmio);
 int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
  bool data)
 {
+	ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM & PAGE_MASK;
struct kvmppc_pte pte;
int r;
 
@@ -327,6 +328,16 @@ int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
if (!pte.may_write)
return -EPERM;
 
+   /* Magic page override */
+	if (kvmppc_supports_magic_page(vcpu) && mp_pa &&
+	    ((pte.raddr & KVM_PAM & PAGE_MASK) == mp_pa) &&
+	    !(kvmppc_get_msr(vcpu) & MSR_PR)) {
+		void *magic = vcpu->arch.shared;
+		magic += pte.eaddr & 0xfff;
+		memcpy(magic, ptr, size);
+		return EMULATE_DONE;
+	}
+
 	if (kvm_write_guest(vcpu->kvm, pte.raddr, ptr, size))
return EMULATE_DO_MMIO;
 
@@ -337,6 +348,7 @@ EXPORT_SYMBOL_GPL(kvmppc_st);
 int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
  bool data)
 {
+	ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM & PAGE_MASK;
struct kvmppc_pte pte;
int rc;
 
@@ -355,6 +367,16 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
 	if (!data && !pte.may_execute)
 		return -ENOEXEC;
 
+	/* Magic page override */
+	if (kvmppc_supports_magic_page(vcpu) && mp_pa &&
+	    ((pte.raddr & KVM_PAM & PAGE_MASK) == mp_pa) &&
+	    !(kvmppc_get_msr(vcpu) & MSR_PR)) {
+		void *magic = vcpu->arch.shared;
+		magic += pte.eaddr & 0xfff;
+		memcpy(ptr, magic, size);
+		return EMULATE_DONE;
+	}
+
 	if (kvm_read_guest(vcpu->kvm, pte.raddr, ptr, size))
return EMULATE_DO_MMIO;
 
-- 
1.8.1.4


[PULL 06/63] KVM: PPC: Book3S PR: Handle hyp doorbell exits

2014-08-01 Thread Alexander Graf

If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts.
Handle those the same way we treat normal doorbells.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 8ea7da4..3b82e86 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -988,6 +988,7 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
case BOOK3S_INTERRUPT_DECREMENTER:
case BOOK3S_INTERRUPT_HV_DECREMENTER:
case BOOK3S_INTERRUPT_DOORBELL:
+   case BOOK3S_INTERRUPT_H_DOORBELL:
 		vcpu->stat.dec_exits++;
r = RESUME_GUEST;
break;
-- 
1.8.1.4


[PULL 39/63] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

On book3e, KVM uses the dedicated load external pid (lwepx) instruction to
read the guest's last instruction on the exit path. lwepx exceptions
(DTLB_MISS, DSI and LRAT), generated by loading a guest address, need to be
handled by KVM. These exceptions are generated in a substituted guest
translation context (EPLC[EGS] = 1) from host context (MSR[GS] = 0).

Currently, KVM hooks only interrupts generated from guest context
(MSR[GS] = 1), doing minimal checks on the fast path to avoid host
performance degradation. lwepx exceptions originate from host state
(MSR[GS] = 0), which implies additional checks in the DO_KVM macro (besides
the current MSR[GS] = 1) by looking at the Exception Syndrome Register
(ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]). Doing
this on each Data TLB miss exception is obviously too intrusive for the host.

Read the guest's last instruction from kvmppc_load_last_inst() by searching
for the physical address and kmap it. This addresses the TODO for TLB
eviction and execute-but-not-read entries, and allows us to get rid of lwepx
until we are able to handle failures.

A simple stress benchmark shows a 1% sys performance degradation compared with
previous approach (lwepx without failure handling):

time for i in `seq 1 1`; do /bin/echo > /dev/null; done

real    0m 8.85s
user    0m 4.34s
sys     0m 4.48s

vs

real    0m 8.84s
user    0m 4.36s
sys     0m 4.44s

A solution to use lwepx and to handle its exceptions in KVM would be to
temporarily hijack the interrupt vector from the host. This imposes
additional synchronizations for cores like FSL e6500 that share host IVOR
registers between hardware threads. This optimized solution can be developed
later on top of this patch.
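
When the instruction load fails, kvmppc_resume_inst_load() in this patch
reports the failing instruction to userspace by packing it into
hardware_exit_reason under an all-ones upper half. A minimal sketch of that
encoding; both function names are hypothetical, and the decoder is only an
illustration of what a userspace consumer could do:

```c
#include <stdint.h>

/*
 * Pack a 32-bit instruction word under an all-ones upper half, mirroring
 * the two assignments to hardware_exit_reason in the patch.
 */
static uint64_t sk_encode_failed_inst(uint32_t last_inst)
{
	uint64_t reason = ~0ULL << 32;

	reason |= last_inst;
	return reason;
}

/* Hypothetical userspace-side decoder: the low 32 bits are the instruction. */
static uint32_t sk_decode_failed_inst(uint64_t reason)
{
	return (uint32_t)reason;
}
```

The all-ones upper word makes the value easy to tell apart from a small
hardware exit code when it shows up in a debug log.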

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c  | 44 +
 arch/powerpc/kvm/bookehv_interrupts.S | 37 --
 arch/powerpc/kvm/e500_mmu_host.c  | 92 +++
 3 files changed, 145 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 50df5e3..97bcde2 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -819,6 +819,28 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
}
 }
 
+static int kvmppc_resume_inst_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ enum emulation_result emulated, u32 last_inst)
+{
+   switch (emulated) {
+   case EMULATE_AGAIN:
+   return RESUME_GUEST;
+
+   case EMULATE_FAIL:
+		pr_debug("%s: load instruction from guest address %lx failed\n",
+		       __func__, vcpu->arch.pc);
+		/* For debugging, encode the failing instruction and
+		 * report it to userspace. */
+		run->hw.hardware_exit_reason = ~0ULL << 32;
+		run->hw.hardware_exit_reason |= last_inst;
+   kvmppc_core_queue_program(vcpu, ESR_PIL);
+   return RESUME_HOST;
+
+   default:
+   BUG();
+   }
+}
+
 /**
  * kvmppc_handle_exit
  *
@@ -830,6 +852,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
int r = RESUME_HOST;
int s;
int idx;
+   u32 last_inst = KVM_INST_FETCH_FAILED;
+   enum emulation_result emulated = EMULATE_DONE;
 
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
@@ -837,6 +861,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* restart interrupts if they were meant for the host */
kvmppc_restart_interrupt(vcpu, exit_nr);
 
+   /*
+* get last instruction before beeing preempted
+* TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR  ESR_DATA
+*/
+   switch (exit_nr) {
+   case BOOKE_INTERRUPT_DATA_STORAGE:
+   case BOOKE_INTERRUPT_DTLB_MISS:
+   case BOOKE_INTERRUPT_HV_PRIV:
+		emulated = kvmppc_get_last_inst(vcpu, false, &last_inst);
+   break;
+   default:
+   break;
+   }
+
local_irq_enable();
 
trace_kvm_exit(exit_nr, vcpu);
@@ -845,6 +883,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;
 
+   if (emulated != EMULATE_DONE) {
+   r = kvmppc_resume_inst_load(run, vcpu, emulated, last_inst);
+   goto out;
+   }
+
switch (exit_nr) {
case BOOKE_INTERRUPT_MACHINE_CHECK:
 		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
@@ -1134,6 +1177,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
BUG();
}
 
+out:
/*
 * To avoid clobbering exit_reason, only check for signals if we
 * aren't already

[PULL 31/63] kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

Use the kvmppc_set_dar() and kvmppc_get_dar() helper functions instead of the
local get_guest_dear() and set_guest_dear() wrappers.
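
The local wrappers removed by this patch chose between a hardware SPR and the
shared page at build time. A minimal sketch of that accessor pattern, with
the build-time choice modeled as a runtime flag so it can run anywhere. All
names here are hypothetical, and this does not claim to show how
kvmppc_get_dar() or kvmppc_set_dar() are implemented:

```c
#include <stdbool.h>

struct sk_shared { unsigned long dar; };

struct sk_dar_vcpu {
	bool hv;                  /* stands in for CONFIG_KVM_BOOKE_HV */
	unsigned long gdear;      /* stands in for the SPRN_GDEAR register */
	struct sk_shared shared;  /* stands in for vcpu->arch.shared */
};

/* Read the guest DEAR from whichever location backs it. */
static unsigned long sk_get_dar(const struct sk_dar_vcpu *v)
{
	return v->hv ? v->gdear : v->shared.dar;
}

/* Write the guest DEAR to whichever location backs it. */
static void sk_set_dar(struct sk_dar_vcpu *v, unsigned long dear)
{
	if (v->hv)
		v->gdear = dear;
	else
		v->shared.dar = dear;
}
```

Callers only ever see the get/set pair, which is what lets the call sites in
booke.c drop their own #ifdef blocks.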

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c | 24 +++-
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 3b43adb..8e8b14b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -292,24 +292,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, 
unsigned long srr0, u32 srr1)
 	vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_dear(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GDEAR);
-#else
-	return vcpu->arch.shared->dar;
-#endif
-}
-
-static void set_guest_dear(struct kvm_vcpu *vcpu, unsigned long dear)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GDEAR, dear);
-#else
-	vcpu->arch.shared->dar = dear;
-#endif
-}
-
 static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -447,7 +429,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
 		if (update_esr == true)
 			set_guest_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
-			set_guest_dear(vcpu, vcpu->arch.queued_dear);
+			kvmppc_set_dar(vcpu, vcpu->arch.queued_dear);
 		if (update_epr == true) {
 			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
 				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
@@ -1317,7 +1299,7 @@ static void get_sregs_base(struct kvm_vcpu *vcpu,
 	sregs->u.e.csrr1 = vcpu->arch.csrr1;
 	sregs->u.e.mcsr = vcpu->arch.mcsr;
 	sregs->u.e.esr = get_guest_esr(vcpu);
-	sregs->u.e.dear = get_guest_dear(vcpu);
+	sregs->u.e.dear = kvmppc_get_dar(vcpu);
 	sregs->u.e.tsr = vcpu->arch.tsr;
 	sregs->u.e.tcr = vcpu->arch.tcr;
 	sregs->u.e.dec = kvmppc_get_dec(vcpu, tb);
@@ -1335,7 +1317,7 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
 	vcpu->arch.csrr1 = sregs->u.e.csrr1;
 	vcpu->arch.mcsr = sregs->u.e.mcsr;
 	set_guest_esr(vcpu, sregs->u.e.esr);
-	set_guest_dear(vcpu, sregs->u.e.dear);
+	kvmppc_set_dar(vcpu, sregs->u.e.dear);
 	vcpu->arch.vrsave = sregs->u.e.vrsave;
kvmppc_set_tcr(vcpu, sregs-u.e.tcr);
 
-- 
1.8.1.4


[PULL 30/63] kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ab62109..3b43adb 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -266,13 +266,8 @@ static void kvmppc_core_dequeue_watchdog(struct kvm_vcpu 
*vcpu)
 
 static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
 {
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GSRR0, srr0);
-   mtspr(SPRN_GSRR1, srr1);
-#else
-	vcpu->arch.shared->srr0 = srr0;
-	vcpu->arch.shared->srr1 = srr1;
-#endif
+   kvmppc_set_srr0(vcpu, srr0);
+   kvmppc_set_srr1(vcpu, srr1);
 }
 
 static void set_guest_csrr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
@@ -1265,8 +1260,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
 	regs->lr = vcpu->arch.lr;
 	regs->xer = kvmppc_get_xer(vcpu);
 	regs->msr = vcpu->arch.shared->msr;
-	regs->srr0 = vcpu->arch.shared->srr0;
-	regs->srr1 = vcpu->arch.shared->srr1;
+	regs->srr0 = kvmppc_get_srr0(vcpu);
+	regs->srr1 = kvmppc_get_srr1(vcpu);
 	regs->pid = vcpu->arch.pid;
 	regs->sprg0 = vcpu->arch.shared->sprg0;
 	regs->sprg1 = vcpu->arch.shared->sprg1;
@@ -1293,8 +1288,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
 	vcpu->arch.lr = regs->lr;
 	kvmppc_set_xer(vcpu, regs->xer);
 	kvmppc_set_msr(vcpu, regs->msr);
-	vcpu->arch.shared->srr0 = regs->srr0;
-	vcpu->arch.shared->srr1 = regs->srr1;
+	kvmppc_set_srr0(vcpu, regs->srr0);
+	kvmppc_set_srr1(vcpu, regs->srr1);
 	kvmppc_set_pid(vcpu, regs->pid);
 	vcpu->arch.shared->sprg0 = regs->sprg0;
 	vcpu->arch.shared->sprg1 = regs->sprg1;
-- 
1.8.1.4


[PULL 14/63] KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

2014-08-01 Thread Alexander Graf

From: Michael Neuling mi...@neuling.org

This adds support for the H_SET_MODE hcall.  This hcall is a
multiplexer that has several functions, some of which are called
rarely, and some which are potentially called very frequently.
Here we add support for the functions that set the debug registers
CIABR (Completed Instruction Address Breakpoint Register) and
DAWR/DAWRX (Data Address Watchpoint Register and eXtension),
since they could be updated by the guest as often as every context
switch.

This also adds a kvmppc_power8_compatible() function to test to see
if a guest is compatible with POWER8 or not.  The CIABR and DAWR/X
only exist on POWER8.
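
The handler added below is a validate-then-store multiplexer: each resource
checks its arguments in a fixed order, stores the value on success, and
anything it does not handle is passed on to userspace. A minimal standalone
sketch of that control flow follows. The return-code enum, the bit masks and
the struct are hypothetical placeholders whose numeric values are NOT the
kernel's definitions; only the resource numbers 1 and 2 come from the patch.

```c
#include <stdbool.h>

/* Resource numbers as defined by the patch in hvcall.h. */
enum { SK_RES_SET_CIABR = 1, SK_RES_SET_DAWR = 2 };

/* Placeholder return codes; they stand in for H_SUCCESS, H_P2, and so on. */
enum sk_rc { SK_SUCCESS, SK_P2, SK_P3, SK_P4, SK_BAD_FLAGS, SK_TOO_HARD };

#define SK_CIABR_PRIV        0x3UL  /* placeholder for CIABR_PRIV */
#define SK_CIABR_PRIV_HYPER  0x3UL  /* placeholder for CIABR_PRIV_HYPER */
#define SK_DABRX_HYP         0x4UL  /* placeholder for DABRX_HYP */

struct sk_dbg_regs { unsigned long ciabr, dawr, dawrx; };

static enum sk_rc sk_h_set_mode(struct sk_dbg_regs *regs, bool power8_compat,
				unsigned long mflags, unsigned long resource,
				unsigned long value1, unsigned long value2)
{
	switch (resource) {
	case SK_RES_SET_CIABR:
		if (!power8_compat)
			return SK_P2;
		if (value2)
			return SK_P4;
		if (mflags)
			return SK_BAD_FLAGS;
		/* Guests can't breakpoint the hypervisor. */
		if ((value1 & SK_CIABR_PRIV) == SK_CIABR_PRIV_HYPER)
			return SK_P3;
		regs->ciabr = value1;
		return SK_SUCCESS;
	case SK_RES_SET_DAWR:
		if (!power8_compat)
			return SK_P2;
		if (mflags)
			return SK_BAD_FLAGS;
		if (value2 & SK_DABRX_HYP)
			return SK_P4;
		regs->dawr = value1;
		regs->dawrx = value2;
		return SK_SUCCESS;
	default:
		/* Everything else is left for userspace to handle. */
		return SK_TOO_HARD;
	}
}
```

Keeping the two frequently used resources in the kernel and returning the
"too hard" code for the rest matches the split described above: rare
functions can take the slow path through userspace.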

Signed-off-by: Michael Neuling mi...@neuling.org
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/hvcall.h |  6 +
 arch/powerpc/kvm/book3s_hv.c  | 52 ++-
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index 5dbbb29..85bc8c0 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -279,6 +279,12 @@
 #define H_GET_24X7_DATA		0xF07C
 #define H_GET_PERF_COUNTER_INFO	0xF080
 
+/* Values for 2nd argument to H_SET_MODE */
+#define H_SET_MODE_RESOURCE_SET_CIABR		1
+#define H_SET_MODE_RESOURCE_SET_DAWR		2
+#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3
+#define H_SET_MODE_RESOURCE_LE			4
+
 #ifndef __ASSEMBLY__
 
 /**
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c4377c7..7db9df2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -557,6 +557,48 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	vcpu->arch.dtl.dirty = true;
 }
 
+static bool kvmppc_power8_compatible(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->arch.vcore->arch_compat >= PVR_ARCH_207)
+		return true;
+	if ((!vcpu->arch.vcore->arch_compat) &&
+	    cpu_has_feature(CPU_FTR_ARCH_207S))
+   return true;
+   return false;
+}
+
+static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
+unsigned long resource, unsigned long value1,
+unsigned long value2)
+{
+   switch (resource) {
+   case H_SET_MODE_RESOURCE_SET_CIABR:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (value2)
+   return H_P4;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+   /* Guests can't breakpoint the hypervisor */
+		if ((value1 & CIABR_PRIV) == CIABR_PRIV_HYPER)
+			return H_P3;
+		vcpu->arch.ciabr  = value1;
+   return H_SUCCESS;
+   case H_SET_MODE_RESOURCE_SET_DAWR:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+		if (value2 & DABRX_HYP)
+			return H_P4;
+		vcpu->arch.dawr  = value1;
+		vcpu->arch.dawrx = value2;
+   return H_SUCCESS;
+   default:
+   return H_TOO_HARD;
+   }
+}
+
 int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -626,7 +668,14 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 
/* Send the error out to userspace via KVM_RUN */
return rc;
-
+   case H_SET_MODE:
+   ret = kvmppc_h_set_mode(vcpu, kvmppc_get_gpr(vcpu, 4),
+   kvmppc_get_gpr(vcpu, 5),
+   kvmppc_get_gpr(vcpu, 6),
+   kvmppc_get_gpr(vcpu, 7));
+   if (ret == H_TOO_HARD)
+   return RESUME_HOST;
+   break;
case H_XIRR:
case H_CPPR:
case H_EOI:
@@ -652,6 +701,7 @@ static int kvmppc_hcall_impl_hv(unsigned long cmd)
case H_PROD:
case H_CONFER:
case H_REGISTER_VPA:
+   case H_SET_MODE:
 #ifdef CONFIG_KVM_XICS
case H_XIRR:
case H_CPPR:
-- 
1.8.1.4


[PULL 09/63] KVM: PPC: Book3S PR: Fix ABIv2 on LE

2014-08-01 Thread Alexander Graf

We have now switched to ABIv2 on Little Endian systems, which gets rid of the
dotted function names. Branch to the actual functions when we see such
a system.
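
The selection this patch makes can be sketched in plain C by turning the
chosen symbol into a string. The _CALL_ELF test is the one the patch uses;
the stringify helpers and branch_target() are hypothetical additions for
illustration only. Note that any compiler that does not define _CALL_ELF as
2, including non-PowerPC ones, takes the dotted branch here.

```c
/* Stringify helpers so the chosen symbol name can be inspected. */
#define SK_STR_(x) #x
#define SK_STR(x)  SK_STR_(x)

#if defined(_CALL_ELF) && _CALL_ELF == 2
#define SK_FUNC(name) SK_STR(name)       /* ABIv2: plain symbol name */
#else
#define SK_FUNC(name) "." SK_STR(name)   /* ABIv1: dotted entry point */
#endif

/* Returns the symbol an assembly `bl FUNC(...)` would target. */
static const char *branch_target(void)
{
	return SK_FUNC(kvmppc_handle_exit_pr);
}
```

On an ABIv2 toolchain branch_target() yields the plain name, otherwise the
name with a leading dot, which is the same distinction FUNC() encodes for the
assembler.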

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_interrupts.S | 4 
 arch/powerpc/kvm/book3s_rmhandlers.S | 4 
 2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_interrupts.S 
b/arch/powerpc/kvm/book3s_interrupts.S
index e2c29e3..d044b8b 100644
--- a/arch/powerpc/kvm/book3s_interrupts.S
+++ b/arch/powerpc/kvm/book3s_interrupts.S
@@ -25,7 +25,11 @@
 #include <asm/exception-64s.h>
 
 #if defined(CONFIG_PPC_BOOK3S_64)
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+#define FUNC(name) name
+#else
 #define FUNC(name) GLUE(.,name)
+#endif
 #define GET_SHADOW_VCPU(reg)    addi	reg, r13, PACA_SVCPU
 
 #elif defined(CONFIG_PPC_BOOK3S_32)
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S 
b/arch/powerpc/kvm/book3s_rmhandlers.S
index 4850a22..16c4d88 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -36,7 +36,11 @@
 
 #if defined(CONFIG_PPC_BOOK3S_64)
 
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+#define FUNC(name) name
+#else
 #define FUNC(name) GLUE(.,name)
+#endif
 
 #elif defined(CONFIG_PPC_BOOK3S_32)
 
-- 
1.8.1.4


[PULL 21/63] KVM: PPC: Book3S HV: Fix ABIv2 on LE

2014-08-01 Thread Alexander Graf

For code that doesn't live in modules we can just branch to the real function
names, giving us compatibility with ABIv1 and ABIv2.

Do this for the compiled-in code of HV KVM.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 364ca0c..855521e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -668,9 +668,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
mr  r31, r4
 	addi	r3, r31, VCPU_FPRS_TM
-	bl	.load_fp_state
+	bl	load_fp_state
 	addi	r3, r31, VCPU_VRS_TM
-   bl  .load_vr_state
+   bl  load_vr_state
mr  r4, r31
lwz r7, VCPU_VRSAVE_TM(r4)
mtspr   SPRN_VRSAVE, r7
@@ -1414,9 +1414,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
/* Save FP/VSX. */
 	addi	r3, r9, VCPU_FPRS_TM
-	bl	.store_fp_state
+	bl	store_fp_state
 	addi	r3, r9, VCPU_VRS_TM
-   bl  .store_vr_state
+   bl  store_vr_state
mfspr   r6, SPRN_VRSAVE
stw r6, VCPU_VRSAVE_TM(r9)
 1:
@@ -2430,11 +2430,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
mtmsrd  r8
isync
 	addi	r3,r3,VCPU_FPRS
-	bl	.store_fp_state
+	bl	store_fp_state
 #ifdef CONFIG_ALTIVEC
 BEGIN_FTR_SECTION
 	addi	r3,r31,VCPU_VRS
-   bl  .store_vr_state
+   bl  store_vr_state
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
mfspr   r6,SPRN_VRSAVE
@@ -2466,11 +2466,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
mtmsrd  r8
isync
 	addi	r3,r4,VCPU_FPRS
-	bl	.load_fp_state
+	bl	load_fp_state
 #ifdef CONFIG_ALTIVEC
 BEGIN_FTR_SECTION
 	addi	r3,r31,VCPU_VRS
-   bl  .load_vr_state
+   bl  load_vr_state
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
lwz r7,VCPU_VRSAVE(r31)
-- 
1.8.1.4


[PULL 33/63] kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

Use kvmppc_set_sprg[0-7]() and kvmppc_get_sprg[0-7]() helper
functions

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c | 32 
 arch/powerpc/kvm/booke_emulate.c |  8 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 25a7e70..34562d4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1227,14 +1227,14 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
 	regs->srr0 = kvmppc_get_srr0(vcpu);
 	regs->srr1 = kvmppc_get_srr1(vcpu);
 	regs->pid = vcpu->arch.pid;
-	regs->sprg0 = vcpu->arch.shared->sprg0;
-	regs->sprg1 = vcpu->arch.shared->sprg1;
-	regs->sprg2 = vcpu->arch.shared->sprg2;
-	regs->sprg3 = vcpu->arch.shared->sprg3;
-	regs->sprg4 = vcpu->arch.shared->sprg4;
-	regs->sprg5 = vcpu->arch.shared->sprg5;
-	regs->sprg6 = vcpu->arch.shared->sprg6;
-	regs->sprg7 = vcpu->arch.shared->sprg7;
+	regs->sprg0 = kvmppc_get_sprg0(vcpu);
+	regs->sprg1 = kvmppc_get_sprg1(vcpu);
+	regs->sprg2 = kvmppc_get_sprg2(vcpu);
+	regs->sprg3 = kvmppc_get_sprg3(vcpu);
+	regs->sprg4 = kvmppc_get_sprg4(vcpu);
+	regs->sprg5 = kvmppc_get_sprg5(vcpu);
+	regs->sprg6 = kvmppc_get_sprg6(vcpu);
+	regs->sprg7 = kvmppc_get_sprg7(vcpu);
 
 	for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
 		regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
@@ -1255,14 +1255,14 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
 	kvmppc_set_srr0(vcpu, regs->srr0);
 	kvmppc_set_srr1(vcpu, regs->srr1);
 	kvmppc_set_pid(vcpu, regs->pid);
-	vcpu->arch.shared->sprg0 = regs->sprg0;
-	vcpu->arch.shared->sprg1 = regs->sprg1;
-	vcpu->arch.shared->sprg2 = regs->sprg2;
-	vcpu->arch.shared->sprg3 = regs->sprg3;
-	vcpu->arch.shared->sprg4 = regs->sprg4;
-	vcpu->arch.shared->sprg5 = regs->sprg5;
-	vcpu->arch.shared->sprg6 = regs->sprg6;
-	vcpu->arch.shared->sprg7 = regs->sprg7;
+	kvmppc_set_sprg0(vcpu, regs->sprg0);
+	kvmppc_set_sprg1(vcpu, regs->sprg1);
+	kvmppc_set_sprg2(vcpu, regs->sprg2);
+	kvmppc_set_sprg3(vcpu, regs->sprg3);
+	kvmppc_set_sprg4(vcpu, regs->sprg4);
+	kvmppc_set_sprg5(vcpu, regs->sprg5);
+	kvmppc_set_sprg6(vcpu, regs->sprg6);
+	kvmppc_set_sprg7(vcpu, regs->sprg7);
 
 	for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
 		kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 27a4b28..28c1588 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -165,16 +165,16 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int 
sprn, ulong spr_val)
 * guest (PR-mode only).
 */
case SPRN_SPRG4:
-		vcpu->arch.shared->sprg4 = spr_val;
+		kvmppc_set_sprg4(vcpu, spr_val);
 		break;
 	case SPRN_SPRG5:
-		vcpu->arch.shared->sprg5 = spr_val;
+		kvmppc_set_sprg5(vcpu, spr_val);
 		break;
 	case SPRN_SPRG6:
-		vcpu->arch.shared->sprg6 = spr_val;
+		kvmppc_set_sprg6(vcpu, spr_val);
 		break;
 	case SPRN_SPRG7:
-		vcpu->arch.shared->sprg7 = spr_val;
+		kvmppc_set_sprg7(vcpu, spr_val);
break;
 
case SPRN_IVPR:
-- 
1.8.1.4


[PULL 36/63] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

Add missing defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/mmu-book3e.h | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h 
b/arch/powerpc/include/asm/mmu-book3e.h
index 8d24f78..cd4f04a 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,9 +40,11 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL_MASK        0x30000000
-#define MAS0_TLBSEL_SHIFT       28
-#define MAS0_TLBSEL(x)  (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
+#define MAS0_TLBSEL_MASK   0x30000000
+#define MAS0_TLBSEL_SHIFT  28
+#define MAS0_TLBSEL(x) (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
+#define MAS0_GET_TLBSEL(mas0)  (((mas0) & MAS0_TLBSEL_MASK) >> \
+   MAS0_TLBSEL_SHIFT)
 #define MAS0_ESEL_MASK 0x0FFF0000
 #define MAS0_ESEL_SHIFT        16
 #define MAS0_ESEL(x)   (((x) << MAS0_ESEL_SHIFT) & MAS0_ESEL_MASK)
@@ -60,6 +62,7 @@
 #define MAS1_TSIZE_MASK        0x00000f80
 #define MAS1_TSIZE_SHIFT   7
 #define MAS1_TSIZE(x)  (((x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK)
+#define MAS1_GET_TSIZE(mas1)   (((mas1) & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT)
 
 #define MAS2_EPN   (~0xFFFUL)
 #define MAS2_X00x0040
-- 
1.8.1.4
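
For readers following the MAS0/MAS1 field macros in the patch above, here is a minimal user-space sketch of the encode/extract pairing. It assumes the TLBSEL mask is 0x30000000 (two bits at shift 28) and the TSIZE mask is 0x00000f80 (five bits at shift 7); the `uint32_t` casts and the helper `mas0_with_tlbsel()` are hypothetical additions for illustration and are not part of the patch.

```c
#include <stdint.h>

/* TLBSEL: two bits at shift 28 of MAS0 (assumed mask 0x30000000). */
#define MAS0_TLBSEL_MASK   0x30000000u
#define MAS0_TLBSEL_SHIFT  28
#define MAS0_TLBSEL(x)     (((uint32_t)(x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
#define MAS0_GET_TLBSEL(m) (((m) & MAS0_TLBSEL_MASK) >> MAS0_TLBSEL_SHIFT)

/* TSIZE: five bits at shift 7 of MAS1 (assumed mask 0x00000f80). */
#define MAS1_TSIZE_MASK    0x00000f80u
#define MAS1_TSIZE_SHIFT   7
#define MAS1_TSIZE(x)      (((uint32_t)(x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK)
#define MAS1_GET_TSIZE(m)  (((m) & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT)

/* Hypothetical helper: replace only the TLBSEL field of a MAS0 image. */
static uint32_t mas0_with_tlbsel(uint32_t mas0, unsigned int tlbsel)
{
	return (mas0 & ~MAS0_TLBSEL_MASK) | MAS0_TLBSEL(tlbsel);
}
```

The setter macro shifts then masks, and the getter masks then shifts back, so a value that fits the field survives a round trip unchanged.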


[PULL 15/63] KVM: PPC: e500: Fix default tlb for victim hint

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

The TLB search operation used for the victim hint relies on the default TLB set
by the host. When hardware tablewalk support is enabled in the host, the default
TLB is TLB1, which leads KVM to evict the bolted entry. Set and restore the
default TLB when searching for the victim hint.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Reviewed-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/mmu-book3e.h | 5 -
 arch/powerpc/kvm/e500_mmu_host.c  | 4 
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h 
b/arch/powerpc/include/asm/mmu-book3e.h
index d0918e0..8d24f78 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,7 +40,9 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL(x) (((x) << 28) & 0x30000000)
+#define MAS0_TLBSEL_MASK        0x30000000
+#define MAS0_TLBSEL_SHIFT       28
+#define MAS0_TLBSEL(x)  (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
 #define MAS0_ESEL_MASK 0x0FFF0000
 #define MAS0_ESEL_SHIFT        16
 #define MAS0_ESEL(x)   (((x) << MAS0_ESEL_SHIFT) & MAS0_ESEL_MASK)
@@ -86,6 +88,7 @@
 #define MAS3_SPSIZE    0x0000003e
 #define MAS3_SPSIZE_SHIFT  1
 
+#define MAS4_TLBSEL_MASK   MAS0_TLBSEL_MASK
 #define MAS4_TLBSELD(x)        MAS0_TLBSEL(x)
 #define MAS4_INDD  0x00008000  /* Default IND */
 #define MAS4_TSIZED(x) MAS1_TSIZE(x)
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index dd2cc03..79677d7 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -107,11 +107,15 @@ static u32 get_host_mas0(unsigned long eaddr)
 {
unsigned long flags;
u32 mas0;
+   u32 mas4;
 
local_irq_save(flags);
mtspr(SPRN_MAS6, 0);
+   mas4 = mfspr(SPRN_MAS4);
+   mtspr(SPRN_MAS4, mas4 & ~MAS4_TLBSEL_MASK);
asm volatile("tlbsx 0, %0" : : "b" (eaddr & ~CONFIG_PAGE_OFFSET));
mas0 = mfspr(SPRN_MAS0);
+   mtspr(SPRN_MAS4, mas4);
local_irq_restore(flags);
 
return mas0;
-- 
1.8.1.4
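
The save, override, and restore sequence in get_host_mas0() can be modelled in user space as below. This is only a sketch: the `fake_spr` array, `read_spr()`/`write_spr()`, and `fake_tlb_search_miss()` are hypothetical stand-ins for mfspr()/mtspr() and the tlbsx instruction, and the simulated search assumes that a miss loads the TLB selector in MAS0 from the default in MAS4.

```c
#include <stdint.h>

#define MAS4_TLBSEL_MASK 0x30000000u

/* Hypothetical stand-ins for the hardware registers. */
enum { SPR_MAS0, SPR_MAS4, SPR_COUNT };
static uint32_t fake_spr[SPR_COUNT];

static uint32_t read_spr(int n)              { return fake_spr[n]; }
static void     write_spr(int n, uint32_t v) { fake_spr[n] = v; }

/* Assumed miss behaviour: MAS0 takes its TLB selector from the MAS4 default. */
static void fake_tlb_search_miss(void)
{
	write_spr(SPR_MAS0, read_spr(SPR_MAS4) & MAS4_TLBSEL_MASK);
}

static uint32_t get_victim_hint(void)
{
	uint32_t mas4 = read_spr(SPR_MAS4);              /* save the host default */
	uint32_t mas0;

	write_spr(SPR_MAS4, mas4 & ~MAS4_TLBSEL_MASK);   /* select TLB0 for the search */
	fake_tlb_search_miss();
	mas0 = read_spr(SPR_MAS0);
	write_spr(SPR_MAS4, mas4);                       /* restore the host default */
	return mas0;
}
```

In the kernel the whole sequence runs with interrupts disabled, which is what keeps the temporary MAS4 value from being observed by other host code; the model above omits that.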


[PULL 23/63] KVM: PPC: e500: Emulate power management control SPR

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

On the FSL e6500 core the kernel uses the power management control register
(PWRMGTCR0) to enable idle power down for cores and devices by setting up the
idle count period at boot time. With the host already controlling the power
management configuration, the guest can simply benefit from it, so emulate the
guest request as a general store.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/e500_emulate.c | 12 
 2 files changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 62b2cee..faf2f0e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -584,6 +584,7 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 eptcfg;
u32 epr;
+   u32 pwrmgtcr0;
u32 crit_save;
/* guest debug registers*/
struct debug_reg dbg_reg;
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 002d517..c99c40e 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -250,6 +250,14 @@ int kvmppc_core_emulate_mtspr_e500(struct kvm_vcpu *vcpu, 
int sprn, ulong spr_va
spr_val);
break;
 
+   case SPRN_PWRMGTCR0:
+   /*
+* Guest relies on host power management configurations
+* Treat the request as a general store
+*/
+   vcpu->arch.pwrmgtcr0 = spr_val;
+   break;
+
/* extra exceptions */
case SPRN_IVOR32:
vcpu->arch.ivor[BOOKE_IRQPRIO_SPE_UNAVAIL] = spr_val;
@@ -368,6 +376,10 @@ int kvmppc_core_emulate_mfspr_e500(struct kvm_vcpu *vcpu, 
int sprn, ulong *spr_v
*spr_val = vcpu->arch.eptcfg;
break;
 
+   case SPRN_PWRMGTCR0:
+   *spr_val = vcpu->arch.pwrmgtcr0;
+   break;
+
/* extra exceptions */
case SPRN_IVOR32:
*spr_val = vcpu->arch.ivor[BOOKE_IRQPRIO_SPE_UNAVAIL];
-- 
1.8.1.4
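
"Emulate as a general store" means the written value is kept in per-vcpu state and handed back on reads, without ever touching the hardware register. A stripped-down sketch of that pattern follows; `struct demo_vcpu`, the `DEMO_*` constants, and both functions are hypothetical names used only for illustration, not the kernel's.

```c
#include <stdint.h>

/* Hypothetical register number and return codes for the sketch. */
enum { DEMO_SPRN_PWRMGTCR0 = 1 };
enum { DEMO_EMULATE_DONE = 0, DEMO_EMULATE_FAIL = 1 };

struct demo_vcpu {
	uint32_t pwrmgtcr0;   /* shadow copy; the host keeps control of the real SPR */
};

static int demo_emulate_mtspr(struct demo_vcpu *vcpu, int sprn, unsigned long val)
{
	switch (sprn) {
	case DEMO_SPRN_PWRMGTCR0:
		vcpu->pwrmgtcr0 = (uint32_t)val;   /* remember, do not apply */
		return DEMO_EMULATE_DONE;
	default:
		return DEMO_EMULATE_FAIL;
	}
}

static int demo_emulate_mfspr(struct demo_vcpu *vcpu, int sprn, unsigned long *val)
{
	switch (sprn) {
	case DEMO_SPRN_PWRMGTCR0:
		*val = vcpu->pwrmgtcr0;            /* guest reads back what it wrote */
		return DEMO_EMULATE_DONE;
	default:
		return DEMO_EMULATE_FAIL;
	}
}
```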


[PULL 22/63] KVM: PPC: Book3S HV: Enable for little endian hosts

2014-08-01 Thread Alexander Graf

Now that we've fixed all the issues that HV KVM code had on little endian
hosts, we can enable it in the kernel configuration for users to play with.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index d6a53b9..8aeeda1 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -75,7 +75,6 @@ config KVM_BOOK3S_64
 config KVM_BOOK3S_64_HV
tristate "KVM support for POWER7 and PPC970 using hypervisor mode in host"
depends on KVM_BOOK3S_64
-   depends on !CPU_LITTLE_ENDIAN
select KVM_BOOK3S_HV_POSSIBLE
select MMU_NOTIFIER
select CMA
-- 
1.8.1.4


[PULL 17/63] KVM: PPC: Book3S HV: Make HTAB code LE host aware

2014-08-01 Thread Alexander Graf

When running on an LE host all data structures are kept in little endian
byte order. However, the HTAB still needs to be maintained in big endian.

So every time we access any HTAB we need to make sure we do so in the right
byte order. Fix up all accesses to manually byte swap.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h|   4 +-
 arch/powerpc/include/asm/kvm_book3s_64.h |  15 +++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 128 ++-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 146 ++-
 4 files changed, 164 insertions(+), 129 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index ceb70aa..8ac5392 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -162,9 +162,9 @@ extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t 
gfn, bool writing,
bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
unsigned long *rmap, long pte_index, int realmode);
-extern void kvmppc_invalidate_hpte(struct kvm *kvm, unsigned long *hptep,
+extern void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
-void kvmppc_clear_ref_hpte(struct kvm *kvm, unsigned long *hptep,
+void kvmppc_clear_ref_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
 extern void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long addr,
unsigned long *nb_ret);
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index c7871f3..e504f88 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -59,20 +59,29 @@ extern unsigned long kvm_rma_pages;
 /* These bits are reserved in the guest view of the HPTE */
 #define HPTE_GR_RESERVED   HPTE_GR_MODIFIED
 
-static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
+static inline long try_lock_hpte(__be64 *hpte, unsigned long bits)
 {
unsigned long tmp, old;
+   __be64 be_lockbit, be_bits;
+
+   /*
+* We load/store in native endian, but the HTAB is in big endian. If
+* we byte swap all data we apply on the PTE we're implicitly correct
+* again.
+*/
+   be_lockbit = cpu_to_be64(HPTE_V_HVLOCK);
+   be_bits = cpu_to_be64(bits);
 
 asm volatile("  ldarx   %0,0,%2\n"
              "  and.    %1,%0,%3\n"
              "  bne     2f\n"
-             "  ori     %0,%0,%4\n"
+             "  or      %0,%0,%4\n"
              "  stdcx.  %0,0,%2\n"
              "  beq+    2f\n"
              "  mr      %1,%3\n"
              "2:        isync"
              : "=&r" (tmp), "=&r" (old)
-             : "r" (hpte), "r" (bits), "i" (HPTE_V_HVLOCK)
+             : "r" (hpte), "r" (be_bits), "r" (be_lockbit)
              : "cc", "memory");
return old == 0;
 }
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 8056107..2d154d9 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -450,7 +450,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
unsigned long slb_v;
unsigned long pp, key;
unsigned long v, gr;
-   unsigned long *hptep;
+   __be64 *hptep;
int index;
int virtmode = vcpu->arch.shregs.msr & (data ? MSR_DR : MSR_IR);
 
@@ -473,13 +473,13 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
preempt_enable();
return -ENOENT;
}
-   hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
-   v = hptep[0] & ~HPTE_V_HVLOCK;
+   hptep = (__be64 *)(kvm->arch.hpt_virt + (index << 4));
+   v = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
gr = kvm->arch.revmap[index].guest_rpte;
 
/* Unlock the HPTE */
asm volatile("lwsync" : : : "memory");
-   hptep[0] = v;
+   hptep[0] = cpu_to_be64(v);
preempt_enable();
 
gpte-eaddr = eaddr;
@@ -583,7 +583,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
unsigned long ea, unsigned long dsisr)
 {
struct kvm *kvm = vcpu->kvm;
-   unsigned long *hptep, hpte[3], r;
+   unsigned long hpte[3], r;
+   __be64 *hptep;
unsigned long mmu_seq, psize, pte_size;
unsigned long gpa_base, gfn_base;
unsigned long gpa, gfn, hva, pfn;
@@ -606,16 +607,16 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
if (ea != vcpu->arch.pgfault_addr)
return RESUME_GUEST;
index = vcpu-arch.pgfault_index;
-   hptep = (unsigned long *)(kvm-arch.hpt_virt + (index
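
The rule the HTAB patch applies (keep the table big endian, swap at every access) can be sketched in portable user-space C. Everything below is a simplified model under stated assumptions: the `demo_*` functions are hypothetical, the lock is not atomic (the kernel uses ldarx/stdcx.), and the lock bit value 0x40 is used for illustration.

```c
#include <stdint.h>
#include <string.h>

#define HPTE_V_HVLOCK 0x40ULL   /* illustrative lock bit */

/* Host order to big endian, independent of the host's own endianness. */
static uint64_t demo_cpu_to_be64(uint64_t v)
{
	unsigned char b[8];
	uint64_t out;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (unsigned char)(v >> (56 - 8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Big endian to host order. */
static uint64_t demo_be64_to_cpu(uint64_t v)
{
	unsigned char b[8];
	uint64_t out = 0;
	int i;

	memcpy(b, &v, sizeof(b));
	for (i = 0; i < 8; i++)
		out = (out << 8) | b[i];
	return out;
}

/*
 * Non-atomic model of try_lock_hpte(): swap the constants once, then
 * operate directly on the big-endian word.
 */
static int demo_try_lock_hpte(uint64_t *hpte_be, uint64_t bits)
{
	uint64_t be_bits = demo_cpu_to_be64(bits);
	uint64_t be_lockbit = demo_cpu_to_be64(HPTE_V_HVLOCK);

	if (*hpte_be & be_bits)
		return 0;               /* one of the tested bits is already set */
	*hpte_be |= be_lockbit;
	return 1;
}

/* Model of the unlock path: convert, clear the bit, convert back. */
static void demo_unlock_hpte(uint64_t *hpte_be)
{
	uint64_t v = demo_be64_to_cpu(*hpte_be) & ~HPTE_V_HVLOCK;

	*hpte_be = demo_cpu_to_be64(v);
}
```

Swapping the mask constants instead of the loaded word is what lets the lock path stay a plain load, test, or, store sequence: bitwise operations commute with a byte swap as long as both operands are swapped the same way.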

[PULL 25/63] KVM: PPC: Deflect page write faults properly in kvmppc_st

2014-08-01 Thread Alexander Graf

When we have a page that we're not allowed to write to, xlate() will already
tell us -EPERM on lookup of that page. With the code as is we change it into
a page missing error which a guest may get confused about. Instead, just
tell the caller about the -EPERM directly.

This fixes Mac OS X guests when run with DCBZ32 emulation.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index bd75902..9624c56 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -418,11 +418,13 @@ int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int 
size, void *ptr,
  bool data)
 {
struct kvmppc_pte pte;
+   int r;
 
vcpu->stat.st++;
 
-   if (kvmppc_xlate(vcpu, *eaddr, data, true, &pte))
-   return -ENOENT;
+   r = kvmppc_xlate(vcpu, *eaddr, data, true, &pte);
+   if (r < 0)
+   return r;
 
*eaddr = pte.raddr;
 
-- 
1.8.1.4
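
The behavioural difference in kvmppc_st() is easiest to see side by side. The sketch below uses a hypothetical `demo_xlate()` stub with made-up address ranges; only the two calling patterns mirror the patch.

```c
#include <errno.h>

/*
 * Hypothetical translation stub: 0 on success, negative errno on failure.
 * Addresses below 0x1000 are writable, 0x1000..0x1fff are read-only,
 * everything else is unmapped.
 */
static int demo_xlate(unsigned long eaddr, int iswrite)
{
	if (eaddr >= 0x2000)
		return -ENOENT;
	if (iswrite && eaddr >= 0x1000)
		return -EPERM;
	return 0;
}

/* Old pattern: every failure is reported as a missing page. */
static int demo_st_old(unsigned long eaddr)
{
	if (demo_xlate(eaddr, 1))
		return -ENOENT;
	return 0;
}

/* New pattern: the translation's own error code reaches the caller. */
static int demo_st_new(unsigned long eaddr)
{
	int r = demo_xlate(eaddr, 1);

	if (r < 0)
		return r;
	return 0;
}
```

With the new pattern a caller can tell a protection fault from a missing mapping and inject the matching interrupt into the guest.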


[PULL 05/63] KVM: PPC: Book3s HV: Fix tlbie compile error

2014-08-01 Thread Alexander Graf

Some compilers complain about uninitialized variables in the compute_tlbie_rb
function. When you follow the code path you'll realize that we'll never get
to that point, but the compiler isn't all that smart.

So just default to 4k page sizes for everything, making the compiler happy
and the code slightly easier to read.

Signed-off-by: Alexander Graf ag...@suse.de
Acked-by: Paul Mackerras pau...@samba.org
---
 arch/powerpc/include/asm/kvm_book3s_64.h | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index fddb72b..c7871f3 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -110,16 +110,12 @@ static inline int __hpte_actual_psize(unsigned int lp, 
int psize)
 static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
 unsigned long pte_index)
 {
-   int b_psize, a_psize;
+   int b_psize = MMU_PAGE_4K, a_psize = MMU_PAGE_4K;
unsigned int penc;
unsigned long rb = 0, va_low, sllp;
unsigned int lp = (r >> LP_SHIFT) & ((1 << LP_BITS) - 1);
 
-   if (!(v & HPTE_V_LARGE)) {
-   /* both base and actual psize is 4k */
-   b_psize = MMU_PAGE_4K;
-   a_psize = MMU_PAGE_4K;
-   } else {
+   if (v & HPTE_V_LARGE) {
for (b_psize = 0; b_psize < MMU_PAGE_COUNT; b_psize++) {
 
/* valid entries have a shift value */
-- 
1.8.1.4


[PULL 08/63] KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()

2014-08-01 Thread Alexander Graf

From: Anton Blanchard an...@samba.org

Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are
assembly functions that are exported to modules and also require
a valid r2.

As such we need to use _GLOBAL_TOC so we provide a global entry
point that establishes the TOC (r2).

Signed-off-by: Anton Blanchard an...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
 arch/powerpc/kvm/book3s_rmhandlers.S| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index da1cac5..64ac56f 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -48,7 +48,7 @@
  *
  * LR = return address to continue at after eventually re-enabling MMU
  */
-_GLOBAL(kvmppc_hv_entry_trampoline)
+_GLOBAL_TOC(kvmppc_hv_entry_trampoline)
mflr    r0
std r0, PPC_LR_STKOFF(r1)
stdu    r1, -112(r1)
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S 
b/arch/powerpc/kvm/book3s_rmhandlers.S
index 9eec675..4850a22 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -146,7 +146,7 @@ kvmppc_handler_skip_ins:
  * On entry, r4 contains the guest shadow MSR
  * MSR.EE has to be 0 when calling this function
  */
-_GLOBAL(kvmppc_entry_trampoline)
+_GLOBAL_TOC(kvmppc_entry_trampoline)
mfmsr   r5
LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
toreal(r7)
-- 
1.8.1.4


[PULL 02/63] KVM: PPC: BOOK3S: PR: Emulate virtual timebase register

2014-08-01 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

The virtual time base register is a per-VM, per-CPU register that needs
to be saved and restored on VM exit and entry. Writing to VTB is not
allowed in the privileged mode.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
[agraf: fix compile error]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h | 1 +
 arch/powerpc/include/asm/reg.h  | 9 +
 arch/powerpc/include/asm/time.h | 9 +
 arch/powerpc/kvm/book3s.c   | 6 ++
 arch/powerpc/kvm/book3s_emulate.c   | 3 +++
 arch/powerpc/kvm/book3s_hv.c| 6 --
 arch/powerpc/kvm/book3s_pr.c| 3 ++-
 7 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 4a58731..bd3caea 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -505,6 +505,7 @@ struct kvm_vcpu_arch {
 #endif
/* Time base value when we entered the guest */
u64 entry_tb;
+   u64 entry_vtb;
u32 tcr;
ulong tsr; /* we need to perform set/clr_bits() which requires ulong */
u32 ivor[64];
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index bffd89d..c8f3381 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1203,6 +1203,15 @@
 : "r" ((unsigned long)(v)) \
 : "memory")
 
+static inline unsigned long mfvtb (void)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+   if (cpu_has_feature(CPU_FTR_ARCH_207S))
+   return mfspr(SPRN_VTB);
+#endif
+   return 0;
+}
+
 #ifdef __powerpc64__
 #if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
 #define mftb() ({unsigned long rval;   \
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 1d428e60..03cbada 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -102,6 +102,15 @@ static inline u64 get_rtc(void)
return (u64)hi * 1000000000 + lo;
 }
 
+static inline u64 get_vtb(void)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+   if (cpu_has_feature(CPU_FTR_ARCH_207S))
+   return mfvtb();
+#endif
+   return 0;
+}
+
 #ifdef CONFIG_PPC64
 static inline u64 get_tb(void)
 {
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index c254c27..ddce1ea 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -646,6 +646,9 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
case KVM_REG_PPC_BESCR:
val = get_reg_val(reg->id, vcpu->arch.bescr);
break;
+   case KVM_REG_PPC_VTB:
+   val = get_reg_val(reg->id, vcpu->arch.vtb);
+   break;
default:
r = -EINVAL;
break;
@@ -750,6 +753,9 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
case KVM_REG_PPC_BESCR:
vcpu->arch.bescr = set_reg_val(reg->id, val);
break;
+   case KVM_REG_PPC_VTB:
+   vcpu->arch.vtb = set_reg_val(reg->id, val);
+   break;
default:
r = -EINVAL;
break;
diff --git a/arch/powerpc/kvm/book3s_emulate.c 
b/arch/powerpc/kvm/book3s_emulate.c
index 3565e77..1bb16a5 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -577,6 +577,9 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int 
sprn, ulong *spr_val
 */
*spr_val = vcpu->arch.spurr;
break;
+   case SPRN_VTB:
+   *spr_val = vcpu->arch.vtb;
+   break;
case SPRN_GQR0:
case SPRN_GQR1:
case SPRN_GQR2:
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7a12edb..315e884 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -897,9 +897,6 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 
id,
case KVM_REG_PPC_IC:
*val = get_reg_val(id, vcpu->arch.ic);
break;
-   case KVM_REG_PPC_VTB:
-   *val = get_reg_val(id, vcpu->arch.vtb);
-   break;
case KVM_REG_PPC_CSIGR:
*val = get_reg_val(id, vcpu->arch.csigr);
break;
@@ -1097,9 +1094,6 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, 
u64 id,
case KVM_REG_PPC_IC:
vcpu->arch.ic = set_reg_val(id, *val);
break;
-   case KVM_REG_PPC_VTB:
-   vcpu->arch.vtb = set_reg_val(id, *val);
-   break;
case KVM_REG_PPC_CSIGR:
vcpu-arch.csigr =
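
The rest of this patch is cut off in the archive, so the following is only a guess at how an `entry_vtb` snapshot can be used to maintain a per-vcpu virtual timebase: sample the counter on guest entry and add the elapsed delta on exit. All `demo_*` names are hypothetical and the counter is a plain variable standing in for the hardware register.

```c
#include <stdint.h>

struct demo_vcpu {
	uint64_t vtb;         /* accumulated guest-visible virtual timebase */
	uint64_t entry_vtb;   /* counter value sampled at guest entry */
};

static uint64_t demo_host_counter;   /* stand-in for the hardware counter */

static uint64_t demo_get_vtb(void)
{
	return demo_host_counter;
}

/* On entry: remember where the counter stood. */
static void demo_guest_enter(struct demo_vcpu *vcpu)
{
	vcpu->entry_vtb = demo_get_vtb();
}

/* On exit: credit the guest only with the time it actually ran. */
static void demo_guest_exit(struct demo_vcpu *vcpu)
{
	vcpu->vtb += demo_get_vtb() - vcpu->entry_vtb;
}
```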

[PULL 13/63] KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled

2014-08-01 Thread Alexander Graf

From: Paul Mackerras pau...@samba.org

This adds code to check that, when the KVM_CAP_PPC_ENABLE_HCALL
capability is used to enable or disable in-kernel handling of an
hcall, the hcall is actually implemented by the kernel.
If it is not, an EINVAL error is returned.

This also checks the default-enabled list of hcalls and prints a
warning if any hcall there is not actually implemented.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/virtual/kvm/api.txt   |  4 
 arch/powerpc/include/asm/kvm_book3s.h   |  3 +++
 arch/powerpc/include/asm/kvm_ppc.h  |  2 +-
 arch/powerpc/kvm/book3s.c   |  5 +
 arch/powerpc/kvm/book3s_hv.c| 31 +--
 arch/powerpc/kvm/book3s_hv_builtin.c| 13 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  1 +
 arch/powerpc/kvm/book3s_pr.c|  3 +++
 arch/powerpc/kvm/book3s_pr_papr.c   | 29 +++--
 arch/powerpc/kvm/powerpc.c  |  2 ++
 10 files changed, 88 insertions(+), 5 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 5c54d19..6955318 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3039,3 +3039,7 @@ not to attempt to handle the hcall, but will always exit 
to userspace
 to handle it.  Note that it may not make sense to enable some and
 disable others of a group of related hcalls, but KVM does not prevent
 userspace from doing that.
+
+If the hcall number specified is not one that has an in-kernel
+implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
+error.
diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 052ab2a..ceb70aa 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -146,6 +146,7 @@ extern void kvmppc_mmu_invalidate_pte(struct kvm_vcpu 
*vcpu, struct hpte_cache *
 extern int kvmppc_mmu_hpte_sysinit(void);
 extern void kvmppc_mmu_hpte_sysexit(void);
 extern int kvmppc_mmu_hv_init(void);
+extern int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hc);
 
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
 extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
@@ -188,6 +189,8 @@ extern u32 kvmppc_alignment_dsisr(struct kvm_vcpu *vcpu, 
unsigned int inst);
 extern ulong kvmppc_alignment_dar(struct kvm_vcpu *vcpu, unsigned int inst);
 extern int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd);
 extern void kvmppc_pr_init_default_hcalls(struct kvm *kvm);
+extern int kvmppc_hcall_impl_pr(unsigned long cmd);
+extern int kvmppc_hcall_impl_hv_realmode(unsigned long cmd);
 extern void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
 struct kvm_vcpu *vcpu);
 extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 9c89cdd..e2fd5a1 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -228,7 +228,7 @@ struct kvmppc_ops {
void (*fast_vcpu_kick)(struct kvm_vcpu *vcpu);
long (*arch_vm_ioctl)(struct file *filp, unsigned int ioctl,
  unsigned long arg);
-
+   int (*hcall_implemented)(unsigned long hcall);
 };
 
 extern struct kvmppc_ops *kvmppc_hv_ops;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 90aa5c7..bd75902 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -925,6 +925,11 @@ int kvmppc_core_check_processor_compat(void)
return 0;
 }
 
+int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hcall)
+{
+   return kvm->arch.kvm_ops->hcall_implemented(hcall);
+}
+
 static int kvmppc_book3s_init(void)
 {
int r;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index cf445d2..c4377c7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -645,6 +645,28 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
return RESUME_GUEST;
 }
 
+static int kvmppc_hcall_impl_hv(unsigned long cmd)
+{
+   switch (cmd) {
+   case H_CEDE:
+   case H_PROD:
+   case H_CONFER:
+   case H_REGISTER_VPA:
+#ifdef CONFIG_KVM_XICS
+   case H_XIRR:
+   case H_CPPR:
+   case H_EOI:
+   case H_IPI:
+   case H_IPOLL:
+   case H_XIRR_X:
+#endif
+   return 1;
+   }
+
+   /* See if it's in the real-mode table */
+   return kvmppc_hcall_impl_hv_realmode(cmd);
+}
+
 static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
 struct task_struct *tsk)
 {
@@ -2451,9 +2473,13 @@ static unsigned int default_hcall_list[] = {
 static void init_default_hcalls(void)
 {
int i;
+
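
The two-stage "is this hcall implemented" check from the hunk above (a switch for calls handled in virtual mode, then a fallback to the real-mode table) can be sketched as follows. The hcall numbers, the table contents, and every `demo_*` name are placeholders chosen for illustration; they are not the PAPR values or the kernel's data.

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder hcall numbers for the sketch. */
enum { DEMO_H_CEDE = 1, DEMO_H_PROD = 2, DEMO_H_ENTER = 3, DEMO_H_UNKNOWN = 99 };

/* Hypothetical list of hcalls that have a real-mode handler. */
static const unsigned long demo_realmode_table[] = { DEMO_H_ENTER };

static int demo_hcall_impl_realmode(unsigned long cmd)
{
	size_t i;

	for (i = 0; i < sizeof(demo_realmode_table) / sizeof(demo_realmode_table[0]); i++)
		if (demo_realmode_table[i] == cmd)
			return 1;
	return 0;
}

static int demo_hcall_impl(unsigned long cmd)
{
	switch (cmd) {
	case DEMO_H_CEDE:
	case DEMO_H_PROD:
		return 1;
	}

	/* Not handled in virtual mode: see if it is in the real-mode table. */
	return demo_hcall_impl_realmode(cmd);
}

/* Enabling or disabling an hcall that nobody implements is rejected. */
static int demo_enable_hcall(unsigned long cmd)
{
	if (!demo_hcall_impl(cmd))
		return -EINVAL;
	return 0;
}
```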

[PULL 34/63] kvm: ppc: Add SPRN_EPR get helper function

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

kvmppc_set_epr() is already defined in asm/kvm_ppc.h, so
rename the get_epr helper function and move it to the same file.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
[agraf: remove duplicate return]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h | 11 +++
 arch/powerpc/kvm/booke.c   | 11 +--
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index c95bdbd..246fb9a 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -392,6 +392,17 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, 
u32 cmd)
{ return 0; }
 #endif
 
+static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_KVM_BOOKE_HV
+   return mfspr(SPRN_GEPR);
+#elif defined(CONFIG_BOOKE)
+   return vcpu->arch.epr;
+#else
+   return 0;
+#endif
+}
+
 static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 34562d4..a06ef6b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -292,15 +292,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, 
unsigned long srr0, u32 srr1)
vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_epr(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GEPR);
-#else
-   return vcpu->arch.epr;
-#endif
-}
-
 /* Deliver the interrupt of the corresponding priority, if possible. */
 static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 unsigned int priority)
@@ -1452,7 +1443,7 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac2);
break;
case KVM_REG_PPC_EPR: {
-   u32 epr = get_guest_epr(vcpu);
+   u32 epr = kvmppc_get_epr(vcpu);
val = get_reg_val(reg->id, epr);
break;
}
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PULL 35/63] KVM: PPC: e500mc: Revert add load inst fixup

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

The commit 1d628af7 "add load inst fixup" made an attempt to handle
failures generated by reading the guest current instruction. The fixup
code that was added works by chance, hiding the real issue.

Load external pid (lwepx) instruction, used by KVM to read guest
instructions, is executed in a substituted guest translation context
(EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
interrupts need to be handled by KVM, even though these interrupts
are generated from host context (MSR[GS] = 0) where lwepx is executed.

Currently, KVM hooks only interrupts generated from guest context
(MSR[GS] = 1), doing minimal checks on the fast path to avoid host
performance degradation. As a result, the host kernel handles lwepx
faults searching the faulting guest data address (loaded in DEAR) in
its own Logical Partition ID (LPID) 0 context. In case a host translation
is found the execution returns to the lwepx instruction instead of the
fixup, the host ending up in an infinite loop.

Revert the commit "add load inst fixup". The lwepx issue will be addressed
in a subsequent patch without needing fixup code.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/bookehv_interrupts.S | 26 +-
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index a1712b8..6ff4480 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -29,7 +29,6 @@
 #include <asm/asm-compat.h>
 #include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
-#include <asm/thread_info.h>
 
 #ifdef CONFIG_64BIT
 #include <asm/exception-64e.h>
@@ -164,32 +163,9 @@
PPC_STL r30, VCPU_GPR(R30)(r4)
PPC_STL r31, VCPU_GPR(R31)(r4)
mtspr   SPRN_EPLC, r8
-
-   /* disable preemption, so we are sure we hit the fixup handler */
-   CURRENT_THREAD_INFO(r8, r1)
-   li  r7, 1
-   stw r7, TI_PREEMPT(r8)
-
isync
-
-   /*
-* In case the read goes wrong, we catch it and write an invalid value
-* in LAST_INST instead.
-*/
-1: lwepx   r9, 0, r5
-2:
-.section .fixup, "ax"
-3: li  r9, KVM_INST_FETCH_FAILED
-   b   2b
-.previous
-.section __ex_table,"a"
-   PPC_LONG_ALIGN
-   PPC_LONG 1b,3b
-.previous
-
+   lwepx   r9, 0, r5
mtspr   SPRN_EPLC, r3
-   li  r7, 0
-   stw r7, TI_PREEMPT(r8)
stw r9, VCPU_LAST_INST(r4)
.endif
 
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PULL 11/63] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu become polluted (vs stale) only when a different
vcpu within the same logical partition runs in-between. Optimize the tlb
invalidation condition keeping last_vcpu per logical partition id.

With the new invalidation condition, a guest shows 4% performance improvement
on P5020DS while running a memory stress application with the cpu
oversubscribed, the other guest running a cpu intensive workload.

Guest - old invalidation condition
  real 3.89
  user 3.87
  sys 0.01

Guest - enhanced invalidation condition
  real 3.75
  user 3.73
  sys 0.01

Host
  real 3.70
  user 1.85
  sys 0.00

The memory stress application accesses 4KB pages backed by 75% of available
TLB0 entries:

char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));

int main()
{
	char bar;
	int i, j;

	for (i = 0; i < ITERATIONS; i++)
		for (j = 0; j < ENTRIES; j++)
			bar = foo[j][0];

	return 0;
}

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Reviewed-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500mc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..690499d 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -110,7 +110,7 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 
old_msr)
 {
 }
 
-static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(struct kvm_vcpu *[KVMPPC_NR_LPIDS], last_vcpu_of_lpid);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
@@ -141,9 +141,9 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-   __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+   __get_cpu_var(last_vcpu_of_lpid)[vcpu->kvm->arch.lpid] != vcpu) {
kvmppc_e500_tlbil_all(vcpu_e500);
-   __get_cpu_var(last_vcpu_on_cpu) = vcpu;
+   __get_cpu_var(last_vcpu_of_lpid)[vcpu->kvm->arch.lpid] = vcpu;
}
 
kvmppc_load_guest_fp(vcpu);
-- 
1.8.1.4


[PULL 20/63] KVM: PPC: Book3S HV: Access XICS in BE

2014-08-01 Thread Alexander Graf

On the exit path from the guest we check what type of interrupt we received
if we received one. This means we're doing hardware access to the XICS interrupt
controller.

However, when running on a little endian system, this access is byte reversed.

So let's make sure to swizzle the bytes back again and virtually make XICS
accesses big endian.
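
A small sketch of what "byte reversed" means here, in portable C. The helper names load_be32() and swab32() are illustrative stand-ins, not the assembly macros the patch uses: a register that is defined as big endian has to be assembled most-significant byte first, and a little endian host that loaded it natively gets the swapped value instead.

```c
#include <stdint.h>

/* Assemble a 32-bit value from 4 bytes stored most-significant first. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}
```

Applying swab32() to the natively loaded value on an LE host yields the same number load_be32() would have produced, which is the effect the store-reversed/load-native sequence in the patch achieves.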

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bf5270e..364ca0c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2350,7 +2350,18 @@ kvmppc_read_intr:
cmpdi   r6, 0
beq-    1f
lwzcix  r0, r6, r7
-   rlwinm. r3, r0, 0, 0xff
+   /*
+* Save XIRR for later. Since we get in in reverse endian on LE
+* systems, save it byte reversed and fetch it back in host endian.
+*/
+   li  r3, HSTATE_SAVED_XIRR
+   STWX_BE r0, r3, r13
+#ifdef __LITTLE_ENDIAN__
+   lwz r3, HSTATE_SAVED_XIRR(r13)
+#else
+   mr  r3, r0
+#endif
+   rlwinm. r3, r3, 0, 0xff
sync
beq 1f  /* if nothing pending in the ICP */
 
@@ -2382,10 +2393,9 @@ kvmppc_read_intr:
li  r3, -1
 1: blr
 
-42:/* It's not an IPI and it's for the host, stash it in the PACA
-* before exit, it will be picked up by the host ICP driver
+42:/* It's not an IPI and it's for the host. We saved a copy of XIRR in
+* the PACA earlier, it will be picked up by the host ICP driver
 */
-   stw r0, HSTATE_SAVED_XIRR(r13)
li  r3, 1
b   1b
 
-- 
1.8.1.4


[PULL 10/63] KVM: PPC: Book3S PR: Fix sparse endian checks

2014-08-01 Thread Alexander Graf

While running sparse with endian checks over the code base, it flagged some
places that were missing casts or had wrong types. Fix them up.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr_papr.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c
index 52a63bf..f7c25c6 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -40,8 +40,9 @@ static int kvmppc_h_pr_enter(struct kvm_vcpu *vcpu)
 {
long flags = kvmppc_get_gpr(vcpu, 4);
long pte_index = kvmppc_get_gpr(vcpu, 5);
-   unsigned long pteg[2 * 8];
-   unsigned long pteg_addr, i, *hpte;
+   __be64 pteg[2 * 8];
+   __be64 *hpte;
+   unsigned long pteg_addr, i;
long int ret;
 
i = pte_index & 7;
@@ -93,8 +94,8 @@ static int kvmppc_h_pr_remove(struct kvm_vcpu *vcpu)
pteg = get_pteg_addr(vcpu, pte_index);
mutex_lock(&vcpu->kvm->arch.hpt_mutex);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
ret = H_NOT_FOUND;
if ((pte[0] & HPTE_V_VALID) == 0 ||
@@ -171,8 +172,8 @@ static int kvmppc_h_pr_bulk_remove(struct kvm_vcpu *vcpu)
 
pteg = get_pteg_addr(vcpu, tsh & H_BULK_REMOVE_PTEX);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
/* tsl = AVPN */
flags = (tsh & H_BULK_REMOVE_FLAGS) >> 26;
@@ -211,8 +212,8 @@ static int kvmppc_h_pr_protect(struct kvm_vcpu *vcpu)
pteg = get_pteg_addr(vcpu, pte_index);
mutex_lock(&vcpu->kvm->arch.hpt_mutex);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
ret = H_NOT_FOUND;
if ((pte[0] & HPTE_V_VALID) == 0 ||
@@ -231,8 +232,8 @@ static int kvmppc_h_pr_protect(struct kvm_vcpu *vcpu)
 
rb = compute_tlbie_rb(v, r, pte_index);
vcpu->arch.mmu.tlbie(vcpu, rb, rb & 1 ? true : false);
-   pte[0] = cpu_to_be64(pte[0]);
-   pte[1] = cpu_to_be64(pte[1]);
+   pte[0] = (__force u64)cpu_to_be64(pte[0]);
+   pte[1] = (__force u64)cpu_to_be64(pte[1]);
copy_to_user((void __user *)pteg, pte, sizeof(pte));
ret = H_SUCCESS;
 
-- 
1.8.1.4


[PULL 38/63] KVM: PPC: Allow kvmppc_get_last_inst() to fail

2014-08-01 Thread Alexander Graf

From: Mihai Caraman mihai.cara...@freescale.com

On book3e, the guest's last instruction is read on the exit path using the
dedicated load external pid (lwepx) instruction. This load operation may fail
due to TLB eviction and execute-but-not-read entries.

This patch lays down the path for an alternative solution to read the guest
last instruction, by allowing the kvmppc_get_last_inst() function to fail.
Architecture-specific implementations of kvmppc_load_last_inst() may read
the last guest instruction and instruct the emulation layer to re-execute the
guest in case of failure.

Make kvmppc_get_last_inst() definition common between architectures.
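
A rough sketch of the calling convention this introduces: the instruction is returned through an out-parameter and the return value says whether emulation can proceed. Everything below is a simplified stand-in; the struct, the INST_FETCH_FAILED marker value and the load_ok/guest_inst fields are hypothetical and only model "the architecture hook may fail".

```c
#include <stdint.h>

/* Subset of the emulation results; numeric values are assumed here. */
enum emulation_result { EMULATE_DONE, EMULATE_AGAIN };

#define INST_FETCH_FAILED 0xffffffffu	/* hypothetical "nothing cached" marker */

struct vcpu {
	uint32_t last_inst;	/* instruction cached on the exit path, or marker */
	int      load_ok;	/* stand-in: would a manual reload succeed? */
	uint32_t guest_inst;	/* stand-in for guest memory at the faulting PC */
};

/* Stand-in for the architecture hook that re-reads the guest instruction. */
static int load_last_inst(struct vcpu *v, uint32_t *inst)
{
	if (!v->load_ok)
		return EMULATE_AGAIN;	/* caller should re-enter the guest */
	*inst = v->guest_inst;
	return EMULATE_DONE;
}

/* Only hands back an instruction when one is actually available. */
static int get_last_inst(struct vcpu *v, uint32_t *inst)
{
	int ret = EMULATE_DONE;

	if (v->last_inst == INST_FETCH_FAILED)
		ret = load_last_inst(v, &v->last_inst);

	if (ret == EMULATE_DONE)
		*inst = v->last_inst;
	return ret;
}
```

Callers then check the return value before decoding, instead of assuming a valid opcode was fetched.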

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h| 26 --
 arch/powerpc/include/asm/kvm_booke.h |  5 
 arch/powerpc/include/asm/kvm_ppc.h   | 31 ++
 arch/powerpc/kvm/book3s.c| 17 
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 17 
 arch/powerpc/kvm/book3s_paired_singles.c | 38 +--
 arch/powerpc/kvm/book3s_pr.c | 45 +++-
 arch/powerpc/kvm/booke.c |  3 +++
 arch/powerpc/kvm/e500_mmu_host.c |  6 +
 arch/powerpc/kvm/emulate.c   | 18 -
 arch/powerpc/kvm/powerpc.c   | 11 ++--
 11 files changed, 140 insertions(+), 77 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 20fb6f2..a86ca65 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -276,32 +276,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
return (kvmppc_get_msr(vcpu) & MSR_LE) != (MSR_KERNEL & MSR_LE);
 }
 
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
-{
-   /* Load the instruction manually if it failed to do so in the
-* exit path */
-   if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
-   kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-   return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-   vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index c7aed61..cbb1990 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
return false;
 }
 
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-   return vcpu->arch.last_inst;
-}
-
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 246fb9a..e381363 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -47,6 +47,11 @@ enum emulation_result {
EMULATE_EXIT_USER,/* emulation requires exit to user-space */
 };
 
+enum instruction_type {
+   INST_GENERIC,
+   INST_SC,/* system call */
+};
+
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern void kvmppc_handler_highmem(void);
@@ -62,6 +67,9 @@ extern int kvmppc_handle_store(struct kvm_run *run, struct kvm_vcpu *vcpu,
   u64 val, unsigned int bytes,
   int is_default_endian);
 
+extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
+enum instruction_type type, u32 *inst);
+
 extern int kvmppc_emulate_instruction(struct kvm_run *run,
   struct kvm_vcpu *vcpu);
 extern int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu);
@@ -234,6 +242,29 @@ struct kvmppc_ops {
 extern struct kvmppc_ops *kvmppc_hv_ops;
 extern struct kvmppc_ops *kvmppc_pr_ops;
 
+static inline int kvmppc_get_last_inst(struct kvm_vcpu *vcpu,
+   enum instruction_type type, u32 *inst)
+{
+   int ret = EMULATE_DONE;
+   u32 fetched_inst;
+
+   /* Load the instruction manually if it failed to do so in the
+* exit path */
+   if

[PULL 32/63] kvm: ppc: booke: Add shared struct helpers of SPRN_ESR

2014-08-01 Thread Alexander Graf

From: Bharat Bhushan bharat.bhus...@freescale.com

Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_ppc.h |  1 +
 arch/powerpc/kvm/booke.c   | 24 +++-
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 6520d09..c95bdbd 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -530,6 +530,7 @@ SHARED_SPRNG_WRAPPER(sprg3, 64, SPRN_GSPRG3)
 SHARED_SPRNG_WRAPPER(srr0, 64, SPRN_GSRR0)
 SHARED_SPRNG_WRAPPER(srr1, 64, SPRN_GSRR1)
 SHARED_SPRNG_WRAPPER(dar, 64, SPRN_GDEAR)
+SHARED_SPRNG_WRAPPER(esr, 64, SPRN_GESR)
 SHARED_WRAPPER_GET(msr, 64)
 static inline void kvmppc_set_msr_fast(struct kvm_vcpu *vcpu, u64 val)
 {
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 8e8b14b..25a7e70 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -292,24 +292,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GESR);
-#else
-   return vcpu->arch.shared->esr;
-#endif
-}
-
-static void set_guest_esr(struct kvm_vcpu *vcpu, u32 esr)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GESR, esr);
-#else
-   vcpu->arch.shared->esr = esr;
-#endif
-}
-
 static unsigned long get_guest_epr(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -427,7 +409,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 
vcpu->arch.pc = vcpu->arch.ivpr | vcpu->arch.ivor[priority];
if (update_esr == true)
-   set_guest_esr(vcpu, vcpu->arch.queued_esr);
+   kvmppc_set_esr(vcpu, vcpu->arch.queued_esr);
if (update_dear == true)
kvmppc_set_dar(vcpu, vcpu->arch.queued_dear);
if (update_epr == true) {
@@ -1298,7 +1280,7 @@ static void get_sregs_base(struct kvm_vcpu *vcpu,
sregs->u.e.csrr0 = vcpu->arch.csrr0;
sregs->u.e.csrr1 = vcpu->arch.csrr1;
sregs->u.e.mcsr = vcpu->arch.mcsr;
-   sregs->u.e.esr = get_guest_esr(vcpu);
+   sregs->u.e.esr = kvmppc_get_esr(vcpu);
sregs->u.e.dear = kvmppc_get_dar(vcpu);
sregs->u.e.tsr = vcpu->arch.tsr;
sregs->u.e.tcr = vcpu->arch.tcr;
@@ -1316,7 +1298,7 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
vcpu->arch.csrr0 = sregs->u.e.csrr0;
vcpu->arch.csrr1 = sregs->u.e.csrr1;
vcpu->arch.mcsr = sregs->u.e.mcsr;
-   set_guest_esr(vcpu, sregs->u.e.esr);
+   kvmppc_set_esr(vcpu, sregs->u.e.esr);
kvmppc_set_dar(vcpu, sregs->u.e.dear);
vcpu->arch.vrsave = sregs->u.e.vrsave;
kvmppc_set_tcr(vcpu, sregs->u.e.tcr);
-- 
1.8.1.4


[PULL 01/63] KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation

2014-08-01 Thread Alexander Graf

From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

We use time base for PURR and SPURR emulation with PR KVM since we
are emulating a single threaded core. When using time base
we need to make sure that we don't accumulate time spent in the host
in PURR and SPURR value.

Also, we don't need to emulate mtspr because both registers are
hypervisor resources.
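
A minimal sketch of the accounting idea, assuming a caller that passes in time base samples; the struct and the guest_enter()/guest_exit() names are illustrative only. The counter advances only by the time between an entry and the matching exit, so host time in between is never added.

```c
#include <stdint.h>

struct vcpu {
	uint64_t entry_tb;	/* time base sampled at guest entry */
	uint64_t purr;		/* accumulated guest-only time */
};

/* Remember when the guest started running. */
static void guest_enter(struct vcpu *v, uint64_t tb_now)
{
	v->entry_tb = tb_now;
}

/* Credit only the interval spent in the guest. */
static void guest_exit(struct vcpu *v, uint64_t tb_now)
{
	v->purr += tb_now - v->entry_tb;
}
```

For example, entering at time base 100 and exiting at 150, then entering again at 1000 and exiting at 1010, leaves purr at 60 even though 910 ticks passed overall.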

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 --
 arch/powerpc/include/asm/kvm_host.h   |  4 ++--
 arch/powerpc/kvm/book3s_emulate.c | 16 
 arch/powerpc/kvm/book3s_pr.c  | 11 +++
 4 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index f52f656..a20cc0b 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -83,8 +83,6 @@ struct kvmppc_vcpu_book3s {
u64 sdr1;
u64 hior;
u64 msr_mask;
-   u64 purr_offset;
-   u64 spurr_offset;
 #ifdef CONFIG_PPC_BOOK3S_32
u32 vsid_pool[VSID_POOL_SIZE];
u32 vsid_next;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index bb66d8b..4a58731 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -503,8 +503,8 @@ struct kvm_vcpu_arch {
 #ifdef CONFIG_BOOKE
u32 decar;
 #endif
-   u32 tbl;
-   u32 tbu;
+   /* Time base value when we entered the guest */
+   u64 entry_tb;
u32 tcr;
ulong tsr; /* we need to perform set/clr_bits() which requires ulong */
u32 ivor[64];
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 3f29526..3565e77 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -439,12 +439,6 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
(mfmsr() & MSR_HV))
vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
break;
-   case SPRN_PURR:
-   to_book3s(vcpu)->purr_offset = spr_val - get_tb();
-   break;
-   case SPRN_SPURR:
-   to_book3s(vcpu)->spurr_offset = spr_val - get_tb();
-   break;
case SPRN_GQR0:
case SPRN_GQR1:
case SPRN_GQR2:
@@ -572,10 +566,16 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
*spr_val = 0;
break;
case SPRN_PURR:
-   *spr_val = get_tb() + to_book3s(vcpu)->purr_offset;
+   /*
+* On exit we would have updated purr
+*/
+   *spr_val = vcpu->arch.purr;
break;
case SPRN_SPURR:
-   *spr_val = get_tb() + to_book3s(vcpu)->purr_offset;
+   /*
+* On exit we would have updated spurr
+*/
+   *spr_val = vcpu->arch.spurr;
break;
case SPRN_GQR0:
case SPRN_GQR1:
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 8eef1e5..671f5c92 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -120,6 +120,11 @@ void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
 #ifdef CONFIG_PPC_BOOK3S_64
svcpu->shadow_fscr = vcpu->arch.shadow_fscr;
 #endif
+   /*
+* Now also save the current time base value. We use this
+* to find the guest purr and spurr value.
+*/
+   vcpu->arch.entry_tb = get_tb();
svcpu->in_use = true;
 }
 
@@ -166,6 +171,12 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
 #ifdef CONFIG_PPC_BOOK3S_64
vcpu->arch.shadow_fscr = svcpu->shadow_fscr;
 #endif
+   /*
+* Update purr and spurr using time base on exit.
+*/
+   vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb;
+   vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb;
+
svcpu->in_use = false;
 
 out:
-- 
1.8.1.4


[PULL 00/63] ppc patch queue 2014-08-01

2014-08-01 Thread Alexander Graf

Hi Paolo / Marcelo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit 9f6226a762c7ae02f6a23a3d4fc552dafa57ea23:

  arch: x86: kvm: x86.c: Cleaning up variable is set more than once (2014-06-30 16:52:04 +0200)

are available in the git repository at:

  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-next

for you to fetch changes up to 8e6afa36e754be84b468d7df9e5aa71cf4003f3b:

  KVM: PPC: PR: Handle FSCR feature deselects (2014-07-31 10:23:46 +0200)


Patch queue for ppc - 2014-08-01

Highlights in this release include:

  - BookE: Rework instruction fetch, not racy anymore now
  - BookE HV: Fix ONE_REG accessors for some in-hardware registers
  - Book3S: Good number of LE host fixes, enable HV on LE
  - Book3S: Some misc bug fixes
  - Book3S HV: Add in-guest debug support
  - Book3S HV: Preload cache lines on context switch
  - Remove 440 support

Alexander Graf (31):
  KVM: PPC: Book3s PR: Disable AIL mode with OPAL
  KVM: PPC: Book3s HV: Fix tlbie compile error
  KVM: PPC: Book3S PR: Handle hyp doorbell exits
  KVM: PPC: Book3S PR: Fix ABIv2 on LE
  KVM: PPC: Book3S PR: Fix sparse endian checks
  PPC: Add asm helpers for BE 32bit load/store
  KVM: PPC: Book3S HV: Make HTAB code LE host aware
  KVM: PPC: Book3S HV: Access guest VPA in BE
  KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
  KVM: PPC: Book3S HV: Access XICS in BE
  KVM: PPC: Book3S HV: Fix ABIv2 on LE
  KVM: PPC: Book3S HV: Enable for little endian hosts
  KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
  KVM: PPC: Deflect page write faults properly in kvmppc_st
  KVM: PPC: Book3S: Stop PTE lookup on write errors
  KVM: PPC: Book3S: Add hack for split real mode
  KVM: PPC: Book3S: Make magic page properly 4k mappable
  KVM: PPC: Remove 440 support
  KVM: Rename and add argument to check_extension
  KVM: Allow KVM_CHECK_EXTENSION on the vm fd
  KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
  KVM: PPC: Implement kvmppc_xlate for all targets
  KVM: PPC: Move kvmppc_ld/st to common code
  KVM: PPC: Remove kvmppc_bad_hva()
  KVM: PPC: Use kvm_read_guest in kvmppc_ld
  KVM: PPC: Handle magic page in kvmppc_ld/st
  KVM: PPC: Separate loadstore emulation from priv emulation
  KVM: PPC: Expose helper functions for data/inst faults
  KVM: PPC: Remove DCR handling
  KVM: PPC: HV: Remove generic instruction emulation
  KVM: PPC: PR: Handle FSCR feature deselects

Alexey Kardashevskiy (1):
  KVM: PPC: Book3S: Fix LPCR one_reg interface

Aneesh Kumar K.V (4):
  KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
  KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
  KVM: PPC: BOOK3S: PR: Emulate instruction counter
  KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page

Anton Blanchard (2):
  KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
  KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()

Bharat Bhushan (10):
  kvm: ppc: bookehv: Added wrapper macros for shadow registers
  kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
  kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
  kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
  kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
  kvm: ppc: Add SPRN_EPR get helper function
  kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
  KVM: PPC: Booke-hv: Add one reg interface for SPRG9
  KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
  KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

Michael Neuling (1):
  KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

Mihai Caraman (8):
  KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
  KVM: PPC: e500: Fix default tlb for victim hint
  KVM: PPC: e500: Emulate power management control SPR
  KVM: PPC: e500mc: Revert add load inst fixup
  KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
  KVM: PPC: Book3s: Remove kvmppc_read_inst() function
  KVM: PPC: Allow kvmppc_get_last_inst() to fail
  KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

Paul Mackerras (4):
  KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
  KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
  KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
  KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication

Stewart Smith (2):
  Split out struct kvmppc_vcore creation to separate function
  Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8


Alexander Graf (31):
  KVM: PPC:

[PULL 26/63] KVM: PPC: Book3S: Stop PTE lookup on write errors

2014-08-01 Thread Alexander Graf

When a page lookup failed because we're not allowed to write to the page, we
should not overwrite that value with another lookup on the second PTEG which
will return page not found. Instead, we should just tell the caller that we
had a permission problem.
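
A small sketch of the control flow, with the PTEG search abstracted into a callback. The callback type, the two stub lookups and the use of -EPERM for "found but not permitted" are assumptions made for illustration; only the -ENOENT check comes from the patch.

```c
#include <errno.h>

/*
 * Stand-in for a PTEG search: returns 0 on success, -ENOENT when no
 * entry matches, or -EPERM (assumed) when an entry matches but the
 * access is not allowed.
 */
typedef int (*pteg_lookup_fn)(int primary);

static int xlate(pteg_lookup_fn lookup)
{
	int r = lookup(1);		/* primary PTEG */

	if (r == -ENOENT)		/* only a plain miss falls through */
		r = lookup(0);		/* secondary PTEG */
	return r;
}

/* Stub: primary PTEG has the page but denies the write. */
static int primary_denied(int primary)
{
	return primary ? -EPERM : -ENOENT;
}

/* Stub: the page is only found in the secondary PTEG. */
static int only_secondary(int primary)
{
	return primary ? -ENOENT : 0;
}
```

With the old "any error" check, xlate(primary_denied) would have ended up reporting -ENOENT from the secondary lookup, hiding the permission failure from the caller.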

This fixes Mac OS X guests looping endlessly in page lookup code for me.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_32_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c
index 93503bb..cd0b073 100644
--- a/arch/powerpc/kvm/book3s_32_mmu.c
+++ b/arch/powerpc/kvm/book3s_32_mmu.c
@@ -335,7 +335,7 @@ static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
if (r < 0)
r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte,
   data, iswrite, true);
-   if (r < 0)
+   if (r == -ENOENT)
r = kvmppc_mmu_book3s_32_xlate_pte(vcpu, eaddr, pte,
   data, iswrite, false);
 
-- 
1.8.1.4


[PULL 18/63] KVM: PPC: Book3S HV: Access guest VPA in BE

2014-08-01 Thread Alexander Graf

There are a few shared data structures between the host and the guest. Most
of them get registered through the VPA interface.

These data structures are defined to always be in big endian byte order, so
let's make sure we always access them in big endian.
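
As a rough illustration of "always access them in big endian", the helpers below write and read fields byte by byte so the result is the same on either host byte order. The names store_be32() and load_be16() are illustrative; the patch itself uses the kernel's cpu_to_be*/be*_to_cpu conversions.

```c
#include <stdint.h>

/* Store a 32-bit value most-significant byte first. */
static void store_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Read a 16-bit field that is stored most-significant byte first. */
static uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}
```

A yield count of 1 is therefore laid out as 00 00 00 01 in the shared area regardless of whether the host is LE or BE.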

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv.c | 22 +++---
 arch/powerpc/kvm/book3s_hv_ras.c |  6 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7db9df2..f1281c4 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -272,7 +272,7 @@ struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 static void init_vpa(struct kvm_vcpu *vcpu, struct lppaca *vpa)
 {
vpa->__old_status |= LPPACA_OLD_SHARED_PROC;
-   vpa->yield_count = 1;
+   vpa->yield_count = cpu_to_be32(1);
 }
 
 static int set_vpa(struct kvm_vcpu *vcpu, struct kvmppc_vpa *v,
@@ -295,8 +295,8 @@ static int set_vpa(struct kvm_vcpu *vcpu, struct kvmppc_vpa *v,
 struct reg_vpa {
u32 dummy;
union {
-   u16 hword;
-   u32 word;
+   __be16 hword;
+   __be32 word;
} length;
 };
 
@@ -335,9 +335,9 @@ static unsigned long do_h_register_vpa(struct kvm_vcpu *vcpu,
if (va == NULL)
return H_PARAMETER;
if (subfunc == H_VPA_REG_VPA)
-   len = ((struct reg_vpa *)va)->length.hword;
+   len = be16_to_cpu(((struct reg_vpa *)va)->length.hword);
else
-   len = ((struct reg_vpa *)va)->length.word;
+   len = be32_to_cpu(((struct reg_vpa *)va)->length.word);
kvmppc_unpin_guest_page(kvm, va, vpa, false);
 
/* Check length */
@@ -542,18 +542,18 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
return;
memset(dt, 0, sizeof(struct dtl_entry));
dt->dispatch_reason = 7;
-   dt->processor_id = vc->pcpu + vcpu->arch.ptid;
-   dt->timebase = now + vc->tb_offset;
-   dt->enqueue_to_dispatch_time = stolen;
-   dt->srr0 = kvmppc_get_pc(vcpu);
-   dt->srr1 = vcpu->arch.shregs.msr;
+   dt->processor_id = cpu_to_be16(vc->pcpu + vcpu->arch.ptid);
+   dt->timebase = cpu_to_be64(now + vc->tb_offset);
+   dt->enqueue_to_dispatch_time = cpu_to_be32(stolen);
+   dt->srr0 = cpu_to_be64(kvmppc_get_pc(vcpu));
+   dt->srr1 = cpu_to_be64(vcpu->arch.shregs.msr);
++dt;
if (dt == vcpu->arch.dtl.pinned_end)
dt = vcpu->arch.dtl.pinned_addr;
vcpu->arch.dtl_ptr = dt;
/* order writing *dt vs. writing vpa->dtl_idx */
smp_wmb();
-   vpa->dtl_idx = ++vcpu->arch.dtl_index;
+   vpa->dtl_idx = cpu_to_be64(++vcpu->arch.dtl_index);
vcpu->arch.dtl.dirty = true;
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv_ras.c b/arch/powerpc/kvm/book3s_hv_ras.c
index 3a5c568..d562c8e 100644
--- a/arch/powerpc/kvm/book3s_hv_ras.c
+++ b/arch/powerpc/kvm/book3s_hv_ras.c
@@ -45,14 +45,14 @@ static void reload_slb(struct kvm_vcpu *vcpu)
return;
 
/* Sanity check */
-   n = min_t(u32, slb->persistent, SLB_MIN_SIZE);
+   n = min_t(u32, be32_to_cpu(slb->persistent), SLB_MIN_SIZE);
if ((void *) &slb->save_area[n] > vcpu->arch.slb_shadow.pinned_end)
return;
 
/* Load up the SLB from that */
for (i = 0; i < n; ++i) {
-   unsigned long rb = slb->save_area[i].esid;
-   unsigned long rs = slb->save_area[i].vsid;
+   unsigned long rb = be64_to_cpu(slb->save_area[i].esid);
+   unsigned long rs = be64_to_cpu(slb->save_area[i].vsid);
 
rb = (rb & ~0xFFFul) | i;   /* insert entry number */
asm volatile("slbmte %0,%1" : : "r" (rs), "r" (rb));
-- 
1.8.1.4


RE: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception

2014-08-01 Thread bharat.bhus...@freescale.com

 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, August 01, 2014 2:16 AM
 To: Bhushan Bharat-R65777
 Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder Stuart-
 B08248
 Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and 
 exception

 On Thu, 2014-07-31 at 01:15 -0500, Bhushan Bharat-R65777 wrote:

   -Original Message-
   From: Wood Scott-B07421
   Sent: Thursday, July 31, 2014 8:18 AM
   To: Bhushan Bharat-R65777
   Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Yoder
 Stuart-
   B08248
   Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and
 exception

   On Wed, 2014-07-30 at 01:43 -0500, Bhushan Bharat-R65777 wrote:

 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, July 29, 2014 3:58 AM
 To: Bhushan Bharat-R65777
 Cc: ag...@suse.de; kvm-...@vger.kernel.org; kvm@vger.kernel.org;
 Yoder Stuart-
 B08248
 Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers
 and exception

  Userspace might be interested in
 the raw value,

With the current design, If userspace is interested then it will not
get the DBSR.

   Oh, because DBSR isn't currently implemented in sregs or one reg?

  That is one reason. Another is that if we give dbsr visibility to
  userspace then userspace have to clear dbsr in handling KVM_EXIT_DEBUG.

 Right -- since I didn't realize DBSR wasn't already exposed, I thought
 userspace already had this responsibility.

   It looked like it was removing dbsr visibility and the requirement for
 userspace
   to clear dbsr.  I guess the old way was that the value in
   vcpu->arch.dbsr didn't matter until the next debug exception, when it
   would be overwritten by the new SPRN_DBSR?

  But that means old dbsr will be visibility to userspace, which is even bad
 than not visible, no?

  Also this can lead to old dbsr visible to guest once userspace releases
  debug resources, but this can be solved by clearing dbsr in
  kvm_arch_vcpu_ioctl_set_guest_debug() ->  if (!(dbg->control &
  KVM_GUESTDBG_ENABLE)) { }.

 I wasn't suggesting that you keep it that way, just clarifying my
 understanding of the current code.

  +   case SPRN_DBCR2:
  +   /*
  +* If userspace is debugging guest then guest
  +* can not access debug registers.
  +*/
  +   if (vcpu->guest_debug)
  +   break;
  +
  +   debug_inst = true;
  +   vcpu->arch.dbg_reg.dbcr2 = spr_val;
  +   vcpu->arch.shadow_dbg_reg.dbcr2 = spr_val;
  break;

 In what circumstances can the architected and shadow registers differ?

As of now they are same. But I think that if we want to implement other
   features like Freeze Timer (FT) then they can be different.

   I don't think we can possibly implement Freeze Timer.

  May be, but in my opinion we should keep this open.

 We're not talking about API here -- the implementation should be kept
 simple if there's no imminent need for shadow registers.

I am not sure what we should in that case ?

As we are currently emulating a subset of debug events (IAC, DAC, IC,
BT and TIE --- DBCR0 emulation) then we should expose status of those
events in guest dbsr and rest should be cleared ?

   I'm not saying they need to be exposed to the guest, but I don't see where
 you
   filter out bits like these.

  I am trying to get what all bits should be filtered out, all bits
  except IACx, DACx, IC, BT and TIE (same as event set filtering done
  when setting DBCR0) ?

  i.e IDE, UDE, MRR, IRPT, RET, CIRPT, CRET should be filtered out?

 Bits like IRPT and RET don't really matter, as you shouldn't see them
 happen.  Likewise MRR if you're sure you've cleared it since boot.

We can clear the MRR bits when updating vcpu->arch.dbsr with SPRN_DBSR in the
kvm debug handler.

  But
 IDE could be set any time an asynchronous exception happens.  I don't
 think you should filter it out, but instead make sure that it doesn't
 cause an exception to be delivered.

So this means that in kvmppc_handle_debug(), if DBSR_IDE is set, then do not
inject the debug interrupt

 and

on dbsr write emulation, dequeue the debug interrupt even if DBSR_IDE is set.

case SPRN_DBSR:

	vcpu->arch.dbsr &= ~spr_val;
	if (!(vcpu->arch.dbsr & ~DBSR_IDE))
		kvmppc_core_dequeue_debug(vcpu);
	break;

or
	vcpu->arch.dbsr &= ~(spr_val | DBSR_IDE);
	if (!vcpu->arch.dbsr)
		kvmppc_core_dequeue_debug(vcpu);
	break;
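
To make the difference between the two variants concrete, here is a small stand-alone model of write-one-to-clear handling. The bit values for DBSR_IDE and DBSR_IAC1 and the helper names are assumptions for illustration only; each helper returns true when the pending debug interrupt would be dequeued.

```c
#include <stdint.h>
#include <stdbool.h>

#define DBSR_IDE  0x80000000u	/* imprecise debug event; value assumed */
#define DBSR_IAC1 0x00800000u	/* instruction address compare 1; value assumed */

/* Variant 1: IDE stays in the emulated DBSR but is ignored for dequeueing. */
static bool dbsr_write_v1(uint32_t *dbsr, uint32_t spr_val)
{
	*dbsr &= ~spr_val;			/* write-one-to-clear */
	return !(*dbsr & ~DBSR_IDE);
}

/* Variant 2: every write also clears IDE. */
static bool dbsr_write_v2(uint32_t *dbsr, uint32_t spr_val)
{
	*dbsr &= ~(spr_val | DBSR_IDE);
	return !*dbsr;
}
```

Both dequeue once the guest has cleared IAC1; they differ in whether the guest can still read IDE afterwards (variant 1) or not (variant 2).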

Thanks
-Bharat

 -Scott

Re: [PATCH] arm64: KVM: export current vcpu-pause state via pseudo regs

2014-08-01 Thread Alex Bennée


Christoffer Dall writes:

 On Thu, Jul 31, 2014 at 05:45:28PM +0100, Peter Maydell wrote:
 On 31 July 2014 17:38, Christoffer Dall christoffer.d...@linaro.org wrote:
   If we are not complaining when setting the pause value to false if it
   was true before, then we probably also need to wake up the thread in
   case this is called from another thread, right?
  
   or perhaps we should just return an error if you're trying to un-pause a
   CPU through this interface, h.
 
  Wouldn't it be an error to mess with any register when the system is not
  in a quiescent state? I was assuming that the wake state is dealt with
  when the run loop finally restarts.
 
 
  The ABI doesn't really define it as an error (the ABI doesn't enforce
  anything right now) so the question is, does it ever make sense to clear
  the pause flag through this ioctl?  If not, I think we should just err
  on the side of caution and specify in the docs that this is not
  supported and return an error.
 
 Consider the case where the reset state of the system is
 CPU 0 running, CPUs 1..N stopped, and we're doing an
 incoming migration to a state where all CPUs are running.
 In that case we'll be using this ioctl to clear the pause flag,
 right? (We'll also obviously need to set the PC and other
 register state correctly before resuming the guest.)
 
 Doh, you're right, I somehow had it in my mind that when you send the
 thread a signal, the pause flag would be cleared, but that goes against
 the whole idea of a CPU being turned off for KVM.

 But wouldn't we then have to also wake up the thread when clearing the
 pause flag?  It feels strange that the ioctl can clear the pause flag,
 but keep the thread on a wake-queue, and then userspace has to send the
 thread a signal of some sort to wake it up?
<snip>

Isn't the vCPU off the wait-queue by definition if the ioctl exits and
you go through the KVM_SET_ONE_REG stuff?

Once you re-enter the KVM_RUN ioctl, it sees the pause flag as cleared
and falls straight through into kvm_guest_enter(); otherwise it will
again wait on wait_event_interruptible(*wq, !vcpu->arch.pause).
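
A toy model of that argument, assuming the flag is only examined at the top of
the run loop. All names here are hypothetical and nothing blocks for real; the
"would block" result stands in for sleeping on the wait-queue.

```c
#include <stdbool.h>

struct sketch_vcpu {
    bool pause;   /* models vcpu->arch.pause */
};

enum sketch_run_result { SKETCH_ENTER_GUEST, SKETCH_WOULD_BLOCK };

/* Models userspace changing the pause state via the pseudo register. */
static void sketch_set_pause(struct sketch_vcpu *vcpu, bool pause)
{
    vcpu->pause = pause;
}

/* Models the top of the run loop: either sleep on the flag or enter the guest. */
static enum sketch_run_result sketch_kvm_run(const struct sketch_vcpu *vcpu)
{
    return vcpu->pause ? SKETCH_WOULD_BLOCK : SKETCH_ENTER_GUEST;
}
```

The model only covers the case where the vCPU thread itself is back in
userspace when the flag is cleared; a thread already asleep in the kernel is
the case Christoffer is asking about.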

-- 
Alex Bennée

Re: [PULL 16/63] PPC: Add asm helpers for BE 32bit load/store

2014-08-01 Thread Benjamin Herrenschmidt

On Fri, 2014-08-01 at 11:17 +0200, Alexander Graf wrote:
 From assembly code we might not only have to explicitly BE access 64bit 
 values,
 but sometimes also 32bit ones. Add helpers that allow for easy use of 
 lwzx/stwx
 in their respective byte-reverse or native form.
 
 Signed-off-by: Alexander Graf ag...@suse.de

Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org
 ---
  arch/powerpc/include/asm/asm-compat.h | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/arch/powerpc/include/asm/asm-compat.h 
 b/arch/powerpc/include/asm/asm-compat.h
 index 4b237aa..21be8ae 100644
 --- a/arch/powerpc/include/asm/asm-compat.h
 +++ b/arch/powerpc/include/asm/asm-compat.h
 @@ -34,10 +34,14 @@
  #define PPC_MIN_STKFRM   112
  
  #ifdef __BIG_ENDIAN__
 +#define LWZX_BE  stringify_in_c(lwzx)
  #define LDX_BE   stringify_in_c(ldx)
 +#define STWX_BE  stringify_in_c(stwx)
  #define STDX_BE  stringify_in_c(stdx)
  #else
 +#define LWZX_BE  stringify_in_c(lwbrx)
  #define LDX_BE   stringify_in_c(ldbrx)
 +#define STWX_BE  stringify_in_c(stwbrx)
  #define STDX_BE  stringify_in_c(stdbrx)
  #endif
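
For readers less familiar with the byte-reverse forms: the effect of picking
lwzx or lwbrx (and stwx or stwbrx) by host endianness is that the access is
always big-endian. A portable C sketch of the same semantics follows; the
function names are made up for this illustration.

```c
#include <stdint.h>

/* Load a 32-bit big-endian value from memory, whatever the host byte order. */
static uint32_t sketch_load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value to memory in big-endian byte order. */
static void sketch_store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```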
  



[PATCH] arm64: KVM: fix 64bit CP15 VM access for 32bit guests

2014-08-01 Thread Marc Zyngier

Commit f0a3eaff71b8 (ARM64: KVM: fix big endian issue in
access_vm_reg for 32bit guest) changed the way we handle CP15
VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.

This looks like a good idea as it solves endianness issues in an
elegant way, except for one small detail: the register index
doesn't refer to the same array! We end up corrupting some random
data structure instead.

Fix this by reverting to the original code, except for the introduction
of a vcpu_cp15_64_high macro that deals with the endianness thing.
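
As a rough standalone illustration of the high/low split (not the kernel code:
the struct, the array size and the explicit endianness parameter are
assumptions made so the sketch can be compiled anywhere):

```c
#include <stdint.h>

#define SKETCH_NR_COPRO 8

struct sketch_ctxt {
    uint32_t copro[SKETCH_NR_COPRO];   /* 32-bit view of the register file */
};

/*
 * Store a 64-bit CP15 value into two adjacent 32-bit slots starting at r.
 * On a big-endian host the high word takes the lower index, on a
 * little-endian host the low word does.
 */
static void sketch_write_cp15_64(struct sketch_ctxt *c, unsigned r,
                                 uint64_t val, int host_big_endian)
{
    unsigned hi = host_big_endian ? r : r + 1;
    unsigned lo = host_big_endian ? r + 1 : r;

    c->copro[hi] = (uint32_t)(val >> 32);
    c->copro[lo] = (uint32_t)(val & 0xffffffffUL);
}
```

The index swap is what lets the pair be read back as one 64-bit quantity in
host byte order if the two slots overlay a 64-bit register.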

Tested on Juno with 32bit SMP guests.

Cc: Victor Kamensky victor.kamen...@linaro.org
Cc: Christoffer Dall christoffer.d...@linaro.org
Signed-off-by: Marc Zyngier marc.zyng...@arm.com
---
Christoffer, can you please have a look at this one and queue it if
you find it acceptable?

Thanks,

M.

 arch/arm64/include/asm/kvm_host.h | 6 --
 arch/arm64/kvm/sys_regs.c | 7 +--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 79812be..e10c45a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -149,9 +149,11 @@ struct kvm_vcpu_arch {
 #define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)])
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define vcpu_cp15_64_low(v,r) ((v)->arch.ctxt.copro[((r) + 1)])
+#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r))
+#define vcpu_cp15_64_low(v,r)  vcpu_cp15((v),(r) + 1)
 #else
-#define vcpu_cp15_64_low(v,r) ((v)->arch.ctxt.copro[((r) + 0)])
+#define vcpu_cp15_64_high(v,r) vcpu_cp15((v),(r) + 1)
+#define vcpu_cp15_64_low(v,r)  vcpu_cp15((v),(r))
 #endif
 
 struct kvm_vm_stat {
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a4fd526..5805e7c 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -135,10 +135,13 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
        BUG_ON(!p->is_write);
 
        val = *vcpu_reg(vcpu, p->Rt);
-       if (!p->is_aarch32 || !p->is_32bit)
+       if (!p->is_aarch32) {
                vcpu_sys_reg(vcpu, r->reg) = val;
-       else
+       } else {
+               if (!p->is_32bit)
+                       vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
                vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
+       }
 
return true;
 }
-- 
2.0.0


Re: [PATCH] arm64: KVM: fix 64bit CP15 VM access for 32bit guests

2014-08-01 Thread Christoffer Dall

On Fri, Aug 01, 2014 at 12:00:36PM +0100, Marc Zyngier wrote:
 Commit f0a3eaff71b8 (ARM64: KVM: fix big endian issue in
 access_vm_reg for 32bit guest) changed the way we handle CP15
 VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.
 
 This looks like a good idea as it solves endianness issues in an
 elegant way, except for one small detail: the register index
 doesn't refer to the same array! We end up corrupting some random
 data structure instead.

Ouch!

 
 Fix this by reverting to the original code, except for the introduction
 of a vcpu_cp15_64_high macro that deals with the endianness thing.
 
 Tested on Juno with 32bit SMP guests.
 
 Cc: Victor Kamensky victor.kamen...@linaro.org
 Cc: Christoffer Dall christoffer.d...@linaro.org
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 ---
 Christoffer, can you please have a look at this one and queue it if
 you find it acceptable?
 

Good catch, it looks good, I'll queue it on kvmarm/next right away.

Reviewed-by: Christoffer Dall christoffer.d...@linaro.org

-Christoffer

Re: [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 migrate colo info to migration target to tell the target colo is
 enabled.

If I understand this correctly this means that you send a 'colo info' device
information for migrations that don't have COLO enabled; that's bad because
it breaks migration unless the destination has it; I guess it's OK if you
were to guard it with a thing so it didn't do it for old machine-types.

You could use the QEMU_VM_COMMAND sections I've created for postcopy;
( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00889.html ) and 
add a QEMU_VM_CMD_COLO to indicate you want the destination to become an SVM,
  then check the capability near the start of migration and send the command.

Or perhaps there's a way to add the colo-info device on the command line so it's
not always there.

Dave

 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 ---
  Makefile.objs  |  1 +
  include/migration/migration-colo.h |  3 ++
  migration-colo-comm.c  | 68 
 ++
  vl.c   |  4 +++
  4 files changed, 76 insertions(+)
  create mode 100644 migration-colo-comm.c
 
 diff --git a/Makefile.objs b/Makefile.objs
 index cab5824..1836a68 100644
 --- a/Makefile.objs
 +++ b/Makefile.objs
 @@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
  common-obj-$(CONFIG_LINUX) += fsdev/
  
  common-obj-y += migration.o migration-tcp.o
 +common-obj-y += migration-colo-comm.o
  common-obj-$(CONFIG_COLO) += migration-colo.o
  common-obj-y += vmstate.o
  common-obj-y += qemu-file.o
 diff --git a/include/migration/migration-colo.h 
 b/include/migration/migration-colo.h
 index 35b384c..e3735d8 100644
 --- a/include/migration/migration-colo.h
 +++ b/include/migration/migration-colo.h
 @@ -12,6 +12,9 @@
  #define QEMU_MIGRATION_COLO_H
  
  #include "qemu-common.h"
 +#include "migration/migration.h"
 +
 +void colo_info_mig_init(void);
  
  bool colo_supported(void);
  
 diff --git a/migration-colo-comm.c b/migration-colo-comm.c
 new file mode 100644
 index 000..ccbc246
 --- /dev/null
 +++ b/migration-colo-comm.c
 @@ -0,0 +1,68 @@
 +/*
 + *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
 + *  (a.k.a. Fault Tolerance or Continuous Replication)
 + *
 + *  Copyright (C) 2014 FUJITSU LIMITED
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or
 + * later.  See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include "migration/migration-colo.h"
 +
 +#define DEBUG_COLO
 +
 +#ifdef DEBUG_COLO
 +#define DPRINTF(fmt, ...) \
 +do { fprintf(stdout, "COLO: " fmt, ## __VA_ARGS__); } while (0)
 +#else
 +#define DPRINTF(fmt, ...) \
 +do { } while (0)
 +#endif
 +
 +static bool colo_requested;
 +
 +/* save */
 +
 +static bool migrate_use_colo(void)
 +{
 +MigrationState *s = migrate_get_current();
 +return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
 +}
 +
 +static void colo_info_save(QEMUFile *f, void *opaque)
 +{
 +qemu_put_byte(f, migrate_use_colo());
 +}
 +
 +/* restore */
 +
 +static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
 +{
 +int value = qemu_get_byte(f);
 +
 +if (value && !colo_supported()) {
 +fprintf(stderr, "COLO is not supported\n");
 +return -EINVAL;
 +}
 +
 +if (value && !colo_requested) {
 +DPRINTF("COLO requested!\n");
 +}
 +
 +colo_requested = value;
 +
 +return 0;
 +}
 +
 +static SaveVMHandlers savevm_colo_info_handlers = {
 +.save_state = colo_info_save,
 +.load_state = colo_info_load,
 +};
 +
 +void colo_info_mig_init(void)
 +{
 +register_savevm_live(NULL, "colo info", -1, 1,
 + &savevm_colo_info_handlers, NULL);
 +}
 diff --git a/vl.c b/vl.c
 index fe451aa..1a282d8 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -89,6 +89,7 @@ int main(int argc, char **argv)
  #include "sysemu/dma.h"
  #include "audio/audio.h"
  #include "migration/migration.h"
 +#include "migration/migration-colo.h"
  #include "sysemu/kvm.h"
  #include "qapi/qmp/qjson.h"
  #include "qemu/option.h"
 @@ -4339,6 +4340,9 @@ int main(int argc, char **argv, char **envp)
  
  blk_mig_init();
  ram_mig_init();
 +if (colo_supported()) {
 +colo_info_mig_init();
 +}
  
  /* open the virtual block devices */
  if (snapshot)
 -- 
 1.9.1
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 05/17] COLO save: integrate COLO checkpointed save into qemu migration

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
   Integrate COLO checkpointed save flow into qemu migration.
   Add a migrate state: MIG_STATE_COLO, enter this migrate state
 after the first live migration successfully finished.
   Create a colo thread to do the checkpointed save.

In postcopy I added a 'migration_already_active' function
to merge all the different places that check for ACTIVE/SETUP etc.
( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00850.html )

 +/*TODO: COLO checkpointed save loop*/
 +
 +if (s->state != MIG_STATE_ERROR) {
 +migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
 +}

I thought migrate_set_state only changed the state if the old state
matched the 1st value - i.e. I think it'll only change to COMPLETED
if the state is COLO; so I don't think you need the if.
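
In other words, I'm assuming compare-and-set semantics. A minimal model of
that assumption is below; the names are invented for the sketch, and real code
would want the compare and the store done as one atomic operation.

```c
#include <stdbool.h>

enum sketch_mig_state {
    SKETCH_MIG_ACTIVE,
    SKETCH_MIG_COLO,
    SKETCH_MIG_ERROR,
    SKETCH_MIG_COMPLETED
};

struct sketch_migration {
    int state;
};

/* Change state only if it currently equals old_state; report whether it did. */
static bool sketch_set_state(struct sketch_migration *s,
                             int old_state, int new_state)
{
    if (s->state != old_state)
        return false;        /* e.g. already in ERROR: left untouched */
    s->state = new_state;
    return true;
}
```

If that holds, a state of ERROR is never overwritten by the COLO to COMPLETED
transition, so the explicit check adds nothing.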

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 We need a buffer to store migration data.
 
 On save side:
   all saved data is written into the colo buffer first, so that we know
 the total size of the migration data. This also separates the data
 transmission from the colo control data; we use colo control data over
 the socket fd to synchronize both sides' state.
 
 On restore side:
   all migration data is read into the colo buffer first, and then loaded
 from the buffer. If a network error happens during data transmission,
 the slave can still function because the migration data has not yet been
 loaded.

This is very similar to the QEMUSizedBuffer based QEMUFile's that Stefan Berger
wrote and that I use in both my postcopy and BER patchsets:

 http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00846.html

 (and to the similar code from Isaku Yamahata).

I think we should be able to use a shared version even if we need some changes.

 
 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 ---
  migration-colo.c | 112 
 +++
  1 file changed, 112 insertions(+)
 
 diff --git a/migration-colo.c b/migration-colo.c
 index d566b9d..b90d9b6 100644
 --- a/migration-colo.c
 +++ b/migration-colo.c
 @@ -11,6 +11,7 @@
  #include "qemu/main-loop.h"
  #include "qemu/thread.h"
  #include "block/coroutine.h"
 +#include "qemu/error-report.h"
  #include "migration/migration-colo.h"
  
  static QEMUBH *colo_bh;
 @@ -20,14 +21,122 @@ bool colo_supported(void)
  return true;
  }
  
 +/* colo buffer */
 +
 +#define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
 +#define COLO_BUFFER_MAX_SIZE (1000*1000*1000*10ULL)

Powers of 2 are nicer!
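
Something along these lines, as a sketch only: the 4 MiB and 8 GiB values are
my pick of powers of two near the sizes in the patch, and the macro and helper
names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_COLO_BUFFER_BASE_SIZE (1ULL << 22)   /* 4 MiB */
#define SKETCH_COLO_BUFFER_MAX_SIZE  (1ULL << 33)   /* 8 GiB */

/*
 * Grow a buffer size by doubling until it can hold `need` bytes.
 * Returns 0 if `need` does not fit below the maximum size.
 */
static uint64_t sketch_next_size(uint64_t cur, uint64_t need)
{
    while (cur < need && cur < SKETCH_COLO_BUFFER_MAX_SIZE)
        cur <<= 1;
    return cur < need ? 0 : cur;
}
```

Starting from a power of two and doubling means every intermediate size stays
a power of two and the growth lands exactly on the maximum.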

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 10/17] COLO ctl: introduce is_slave() and is_master()

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 is_slave determines whether the QEMU instance is a
 slave (migration target) at runtime.
 is_master determines whether the QEMU instance is a
 master (migration starter) at runtime.
 These 2 APIs will be used later.

Since the names are made global in patch 15, I think it's best to
do it here, but also use a more specific name for them, like
colo_is_master.

Dave

 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 ---
  migration-colo.c | 11 +++
  1 file changed, 11 insertions(+)
 
 diff --git a/migration-colo.c b/migration-colo.c
 index 802f8b0..2699e77 100644
 --- a/migration-colo.c
 +++ b/migration-colo.c
 @@ -187,6 +187,12 @@ static const QEMUFileOps colo_read_ops = {
  
  /* save */
  
 +static __attribute__((unused)) bool is_master(void)
 +{
 +MigrationState *s = migrate_get_current();
 +return (s->state == MIG_STATE_COLO);
 +}
 +
  static void *colo_thread(void *opaque)
  {
  MigrationState *s = opaque;
 @@ -275,6 +281,11 @@ void colo_init_checkpointer(MigrationState *s)
  
  static Coroutine *colo;
  
 +static __attribute__((unused)) bool is_slave(void)
 +{
 +return colo != NULL;
 +}
 +
  /*
   * return:
   * 0: start a checkpoint
 -- 
 1.9.1
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 implement colo checkpoint protocol.
 
 Checkpoint synchronizing points.
 
                   Primary                 Secondary
   NEW             @
                                           Suspend
   SUSPENDED                               @
                   Suspend&Save state
   SEND            @
                   Send state              Receive state
   RECEIVED                                @
                   Flush network           Load state
   LOADED                                  @
                   Resume                  Resume
 
                   Start Comparing
 NOTE:
  1) '@' who sends the message
  2) Every sync-point is synchronized by two sides with only
     one handshake (single direction) for low latency.
     If stricter synchronization is required, an opposite direction
     sync-point should be added.
  3) Since sync-points are single direction, the remote side may
     go forward a lot when this side just receives the sync-point.
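
The ordering above can be sketched as a small table-driven helper. This is an
illustration, not code from the patch: the enum and function names are
invented, and which side sends each message is my reading of the '@' column
(NEW and SEND from the primary, the rest from the secondary).

```c
enum sketch_side { SKETCH_PRIMARY, SKETCH_SECONDARY };

enum sketch_msg {
    SKETCH_CHECKPOINT_NEW,
    SKETCH_CHECKPOINT_SUSPENDED,
    SKETCH_CHECKPOINT_SEND,
    SKETCH_CHECKPOINT_RECEIVED,
    SKETCH_CHECKPOINT_LOADED,
    SKETCH_CHECKPOINT_COUNT
};

/* Which side emits a given sync-point (assumed from the '@' column). */
static enum sketch_side sketch_sender(enum sketch_msg m)
{
    switch (m) {
    case SKETCH_CHECKPOINT_NEW:
    case SKETCH_CHECKPOINT_SEND:
        return SKETCH_PRIMARY;
    default:
        return SKETCH_SECONDARY;
    }
}

/* Next sync-point in the sequence; LOADED wraps to NEW for the next round. */
static enum sketch_msg sketch_next(enum sketch_msg m)
{
    return (enum sketch_msg)((m + 1) % SKETCH_CHECKPOINT_COUNT);
}
```

Because each sync-point travels in one direction only, the receiver just
checks that the incoming value equals the next expected one.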
 
 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 ---
  migration-colo.c | 268 
 +--
  1 file changed, 262 insertions(+), 6 deletions(-)
 
 diff --git a/migration-colo.c b/migration-colo.c
 index 2699e77..a708872 100644
 --- a/migration-colo.c
 +++ b/migration-colo.c
 @@ -24,6 +24,41 @@
   */
  #define CHKPOINT_TIMER 1
  
 +enum {
 +COLO_READY = 0x46,
 +
 +/*
 + * Checkpoint synchronizing points.
 + *
 + *                  Primary                 Secondary
 + *  NEW             @
 + *                                          Suspend
 + *  SUSPENDED                               @
 + *                  Suspend&Save state
 + *  SEND            @
 + *                  Send state              Receive state
 + *  RECEIVED                                @
 + *                  Flush network           Load state
 + *  LOADED                                  @
 + *                  Resume                  Resume
 + *
 + *                  Start Comparing
 + * NOTE:
 + * 1) '@' who sends the message
 + * 2) Every sync-point is synchronized by two sides with only
 + *    one handshake (single direction) for low latency.
 + *    If stricter synchronization is required, an opposite direction
 + *    sync-point should be added.
 + * 3) Since sync-points are single direction, the remote side may
 + *    go forward a lot when this side just receives the sync-point.
 + */
 +COLO_CHECKPOINT_NEW,
 +COLO_CHECKPOINT_SUSPENDED,
 +COLO_CHECKPOINT_SEND,
 +COLO_CHECKPOINT_RECEIVED,
 +COLO_CHECKPOINT_LOADED,
 +};
 +
  static QEMUBH *colo_bh;
  
  bool colo_supported(void)
 @@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
  .close = colo_close,
  };
  
 +/* colo checkpoint control helper */
 +static bool is_master(void);
 +static bool is_slave(void);
 +
 +static void ctl_error_handler(void *opaque, int err)
 +{
 +if (is_slave()) {
 +/* TODO: determine whether we need to failover */
 +/* FIXME: we will not failover currently, just kill slave */
 +error_report("error: colo transmission failed!\n");
 +exit(1);
 +} else if (is_master()) {
 +/* Master still alive, do not failover */
 +error_report("error: colo transmission failed!\n");
 +return;
 +} else {
 +error_report("COLO: Unexpected error happened!\n");
 +exit(EXIT_FAILURE);
 +}
 +}
 +
 +static int colo_ctl_put(QEMUFile *f, uint64_t request)
 +{
 +int ret = 0;
 +
 +qemu_put_be64(f, request);
 +qemu_fflush(f);
 +
 +ret = qemu_file_get_error(f);
 +if (ret < 0) {
 +ctl_error_handler(f, ret);
 +return 1;
 +}
 +
 +return ret;
 +}
 +
 +static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
 +{
 +int ret = 0;
 +uint64_t temp;
 +
 +temp = qemu_get_be64(f);
 +
 +ret = qemu_file_get_error(f);
 +if (ret < 0) {
 +ctl_error_handler(f, ret);
 +return 1;
 +}
 +
 +*value = temp;
 +return 0;
 +}
 +
 +static int colo_ctl_get(QEMUFile *f, uint64_t require)
 +{
 +int ret;
 +uint64_t value;
 +
 +ret = colo_ctl_get_value(f, &value);
 +if (ret) {
 +return ret;
 +}
 +
 +if (value != require) {
 +error_report("unexpected state received!\n");

I find it useful to print the expected/received state to
be able to figure out what went wrong.
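
For example, something like the following, written against plain snprintf so
it stands alone; the helper name and the exact message text are only
suggestions, and in the patch the string would go to error_report().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a mismatch message that names both the expected and received value. */
static int sketch_format_state_mismatch(char *buf, size_t len,
                                        uint64_t expected, uint64_t received)
{
    return snprintf(buf, len,
                    "unexpected state received: expected 0x%" PRIx64
                    ", got 0x%" PRIx64, expected, received);
}
```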

 +exit(1);
 +}
 +
 +return ret;
 +}
 +
  /* save */
  
 -static __attribute__((unused)) bool is_master(void)
 +static bool is_master(void)
  {
  MigrationState *s = migrate_get_current();
  return (s->state == MIG_STATE_COLO);
  }
  
 +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
 +   QEMUFile *trans)
 +{
 +int ret;
 +
 +ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
 +

Re: [RFC PATCH 13/17] COLO ctl: implement colo save

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 implement colo save

My postcopy 'QEMU_VM_CMD_PACKAGED' does something similar to
parts of this with the QEMUSizedBuffer, we might be able to share some more:
https://lists.nongnu.org/archive/html/qemu-devel/2014-07/msg00886.html

 +/* we send the total size of the vmstate first */
 +ret = colo_ctl_put(s->file, colo_buffer.used);
 +if (ret) {
 +goto out;
 +}
 +
 +qemu_put_buffer_async(s->file, colo_buffer.data, colo_buffer.used);
 +ret = qemu_file_get_error(s->file);
 +if (ret < 0) {
 +goto out;
 +}
 +qemu_fflush(s->file);

Is there a reason to use _async here?  I thought the only gain is
if you were going to do other writes in the shadow of the async, with the fflush
immediately after I'm not sure it helps.

Dave

  
  ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
  if (ret) {
  goto out;
  }
  
 -/* TODO: Flush network etc. */
 +/* Flush network etc. */
 +colo_compare_flush();
  
  ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
  if (ret) {
  goto out;
  }
  
 -/* TODO: resume master */
 +colo_compare_resume();
 +ret = 0;
  
  out:
 +/* resume master */
 +qemu_mutex_lock_iothread();
 +vm_start();
 +qemu_mutex_unlock_iothread();
 +
  return ret;
  }
  
 -- 
 1.9.1
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slaver

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 The ram cache is initially the same as the PVM's memory. At each
 checkpoint, we cache the PVM's dirty memory into the ram cache
 (so that the ram cache is always the same as the PVM's memory at every
 checkpoint), and flush the cached memory to the SVM after we have received
 all of the PVM's dirty memory (we only need to flush memory that was
 both dirty on PVM and SVM since the last checkpoint).

(Typo: 'r' on the end of the title)

I think I understand the need for the cache: to be able to restore pages
that the SVM has modified but the PVM hasn't. However, if I understand
the change here (to host_from_stream_offset), the SVM will load the
snapshot into the ram_cache rather than directly into host memory - why
is this necessary?  If the SVM's CPU is stopped at this point, couldn't
it load snapshot pages directly into host memory, clearing pages in the SVM's
bitmap, so that the only pages that then get copied in flush_cache are
the pages that the SVM modified but the PVM *didn't* include in the snapshot?
I can see that you would need to do it the way you've done it if the
snapshot-load could fail (at the same time the PVM failed) and thus the old SVM
state would be the surviving state, but how could it fail at this point
given the whole stream is in the colo-buffer?
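
To be clear about the alternative I have in mind, here is a toy model of it.
Everything in it is hypothetical (tiny pages, a bool array instead of a real
bitmap, invented names); it is not what the patch does.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGES     4
#define SKETCH_PAGE_SIZE 8

struct sketch_svm {
    unsigned char host[SKETCH_PAGES][SKETCH_PAGE_SIZE];   /* SVM guest memory */
    unsigned char cache[SKETCH_PAGES][SKETCH_PAGE_SIZE];  /* PVM state at last checkpoint */
    bool svm_dirty[SKETCH_PAGES];                         /* pages the SVM wrote */
};

/* A snapshot page goes to the cache and straight into host memory. */
static void sketch_load_page(struct sketch_svm *s, unsigned idx,
                             const unsigned char *data)
{
    memcpy(s->cache[idx], data, SKETCH_PAGE_SIZE);
    memcpy(s->host[idx], data, SKETCH_PAGE_SIZE);
    s->svm_dirty[idx] = false;   /* already up to date, no flush needed */
}

/* Revert only the pages the SVM dirtied that the snapshot did not cover. */
static unsigned sketch_flush_cache(struct sketch_svm *s)
{
    unsigned copied = 0;

    for (unsigned i = 0; i < SKETCH_PAGES; i++) {
        if (s->svm_dirty[i]) {
            memcpy(s->host[i], s->cache[i], SKETCH_PAGE_SIZE);
            s->svm_dirty[i] = false;
            copied++;
        }
    }
    return copied;
}
```

The flush then touches only SVM-only dirty pages, at the cost of not being
able to fall back to the old SVM state if the load were to fail halfway.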


 +static void ram_flush_cache(void);
  static int ram_load(QEMUFile *f, void *opaque, int version_id)
  {
  ram_addr_t addr;
  int flags, ret = 0;
  static uint64_t seq_iter;
 +bool need_flush = false;

Probably better as 'ram_cache_needs_flush'

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC PATCH 15/17] COLO save: reuse migration bitmap under colo checkpoint

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 reuse migration bitmap under colo checkpoint, only send dirty pages
 per-checkpoint.
 
 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 ---
  arch_init.c| 20 +++-
  include/migration/migration-colo.h |  2 ++
  migration-colo.c   |  6 ++
  stubs/migration-colo.c | 10 ++
  4 files changed, 33 insertions(+), 5 deletions(-)
 
 diff --git a/arch_init.c b/arch_init.c
 index 8ddaf35..c84e6c8 100644
 --- a/arch_init.c
 +++ b/arch_init.c
 @@ -52,6 +52,7 @@
  #include "exec/ram_addr.h"
  #include "hw/acpi/acpi.h"
  #include "qemu/host-utils.h"
 +#include "migration/migration-colo.h"
  
  #ifdef DEBUG_ARCH_INIT
  #define DPRINTF(fmt, ...) \
 @@ -769,6 +770,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
  RAMBlock *block;
  int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
  
 +/*
 + * migration has already setup the bitmap, reuse it.
 + */
 +if (is_master()) {
 +qemu_mutex_lock_ramlist();
 +reset_ram_globals();
 +goto out_setup;
 +}
 +
  mig_throttle_on = false;
  dirty_rate_high_cnt = 0;
  bitmap_sync_count = 0;
 @@ -828,6 +838,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
  migration_bitmap_sync();
  qemu_mutex_unlock_iothread();
  
 +out_setup:
  qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
  
  QTAILQ_FOREACH(block, &ram_list.blocks, next) {

Is it necessary to send the block list for each of your snapshots?

Dave

 @@ -937,7 +948,14 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
  }
  
  ram_control_after_iterate(f, RAM_CONTROL_FINISH);
 -migration_end();
 +
 +/*
 + * Since we need to reuse dirty bitmap in colo,
 + * don't cleanup the bitmap.
 + */
 +if (!migrate_use_colo() || migration_has_failed(migrate_get_current())) {
 +migration_end();
 +}
  
  qemu_mutex_unlock_ramlist();
  qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 diff --git a/include/migration/migration-colo.h 
 b/include/migration/migration-colo.h
 index 861fa27..c286a60 100644
 --- a/include/migration/migration-colo.h
 +++ b/include/migration/migration-colo.h
 @@ -21,10 +21,12 @@ bool colo_supported(void);
  /* save */
  bool migrate_use_colo(void);
  void colo_init_checkpointer(MigrationState *s);
 +bool is_master(void);
  
  /* restore */
  bool restore_use_colo(void);
  void restore_exit_colo(void);
 +bool is_slave(void);
  
  void colo_process_incoming_checkpoints(QEMUFile *f);
  
 diff --git a/migration-colo.c b/migration-colo.c
 index 8596845..13a6a57 100644
 --- a/migration-colo.c
 +++ b/migration-colo.c
 @@ -222,8 +222,6 @@ static const QEMUFileOps colo_read_ops = {
  };
  
  /* colo checkpoint control helper */
 -static bool is_master(void);
 -static bool is_slave(void);
  
  static void ctl_error_handler(void *opaque, int err)
  {
 @@ -295,7 +293,7 @@ static int colo_ctl_get(QEMUFile *f, uint64_t require)
  
  /* save */
  
 -static bool is_master(void)
 +bool is_master(void)
  {
  MigrationState *s = migrate_get_current();
  return (s->state == MIG_STATE_COLO);
 @@ -499,7 +497,7 @@ void colo_init_checkpointer(MigrationState *s)
  
  static Coroutine *colo;
  
 -static bool is_slave(void)
 +bool is_slave(void)
  {
  return colo != NULL;
  }
 diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
 index 55f0d37..ef65be6 100644
 --- a/stubs/migration-colo.c
 +++ b/stubs/migration-colo.c
 @@ -22,3 +22,13 @@ void colo_init_checkpointer(MigrationState *s)
  void colo_process_incoming_checkpoints(QEMUFile *f)
  {
  }
 +
 +bool is_master(void)
 +{
 +return false;
 +}
 +
 +bool is_slave(void)
 +{
 +return false;
 +}
 -- 
 1.9.1
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: Integrity in untrusted environments

2014-08-01 Thread Shiva V

Paolo Bonzini pbonzini at redhat.com writes

 Hello,
 I am exploring ideas to implement a service inside a virtual machine on 
 untrusted hypervisors under current cloud infrastructures.
 Particularly, I am interested how one can verify the integrity of the 
 service in an environment where hypervisor is not trusted. This is my 
setup.
 
 1. I have two virtual machines. (Normal client VM's).
 2. VM-A is executing a service and VM-B wants to verify its integrity.
 3. Both are executing on untrusted hypervisor.
 
 Intel SGX would solve this with its concept of enclaves, but it is not
 publicly available yet.
 
 One could also use SMM to verify the integrity. But since this is a
 time-based approach, an attacker could easily exploit the window between checks.

 I was drilling down into this idea: we know the Write-xor-Execute memory
 protection scheme. Using it, if we could lock down the VM-A memory pages where
 the service is running, along with the corresponding page-table entries, and
 have a handler that temporarily unlocks them for legitimate updates, then
 one could verify the integrity of the running service.

 You can make a malicious hypervisor that makes all executable pages also
 writable, but hides the fact to the running process.  But really, if you
 control the hypervisor you can just write to guest memory as you wish.

 SMM will be emulated by the hypervisor.
If the hypervisor is untrusted, you cannot solve _everything_.  For the
third time, what attacks are you trying to protect from?
 
 Paolo


Thanks Paolo. I was considering all critical attacks that a client
virtual machine could face under an untrusted hypervisor, for
example memory-based, hypervisor-based and a few major side-channel attacks. I
am ignoring network-based attacks for the time being.
One more question about your reply. I didn't understand what you were
trying to describe here:
"You can make a malicious hypervisor that makes all executable pages also
writable, but hides the fact to the running process.  But really, if you
control the hypervisor you can just write to guest memory as you wish."

This is my understanding; correct me if I am wrong.
If we lock down the code pages of the genuine hypervisor as I discussed before,
isn't that sufficient? Essentially, the hypervisor is the one that handles
the traps from the virtual machines. So even if the hypervisor
tries to write to the client virtual machine, it will be caught, since the
memory pages of the hypervisor are locked down and the lock-down is essentially
non-bypassable.





Re: [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service

2014-08-01 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 Virtual machine (VM) replication is a well known technique for
 providing application-agnostic software-implemented hardware fault
 tolerance non-stop service. COLO is a high availability solution.
 Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
 receive the same request from client, and generate response in parallel
 too. If the response packets from PVM and SVM are identical, they are
 released immediately. Otherwise, a VM checkpoint (on demand) is
 conducted. The idea is presented in Xen summit 2012, and 2013,
 and academia paper in SOCC 2013. It's also presented in KVM forum
 2013:
 http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
 Please refer to above document for detailed information. 
 Please also refer to previous posted RFC proposal:
 http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html

Hi Yang,
  Thanks for this set of patches (and I've replied to many individually).

 The patchset is also hosted on github:
 https://github.com/macrosheep/qemu/tree/colo_v0.1
 
 This patchset is an RFC: it implements the COLO framework, without
 failover and nic/disk replication. But it is ready to demo
 the COLO idea on top of QEMU-KVM.
 Steps using this patchset to get an overview of COLO:
 1. configure the source with --enable-colo option
 2. compile
 3. just like QEMU's normal migration, run 2 QEMU VM:
- Primary VM 
- Secondary VM with -incoming tcp:[IP]:[PORT] option
 4. on Primary VM's QEMU monitor, run following command:
migrate_set_capability colo on
migrate tcp:[IP]:[PORT]
 5. done
 you will see two running VMs; whenever you make changes to the PVM, the SVM
 will be synced to the PVM's state.
 
 TODO list:
 1. failover
 2. nic replication
 3. disk replication[COLO Disk manager]

I wonder if there are any parts that can be borrowed from other code
to get it going; I notice that the reverse execution patchset
has a network packet record/replay mode:

https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00157.html

What was used for the nic comparison in the 2013 kvm forum paper?

Dave

 
 Any comments and feedback are warmly welcome.
 
 Thanks,
 Yang
 
 Yang Hongyang (17):
   configure: add CONFIG_COLO to switch COLO support
   COLO: introduce an api colo_supported() to indicate COLO support
   COLO migration: add a migration capability 'colo'
   COLO info: use colo info to tell migration target colo is enabled
   COLO save: integrate COLO checkpointed save into qemu migration
   COLO restore: integrate COLO checkpointed restore into qemu restore
   COLO buffer: implement colo buffer as well as QEMUFileOps based on it
   COLO: disable qdev hotplug
   COLO ctl: implement API's that communicate with colo agent
   COLO ctl: introduce is_slave() and is_master()
   COLO ctl: implement colo checkpoint protocol
   COLO ctl: add a RunState RUN_STATE_COLO
   COLO ctl: implement colo save
   COLO ctl: implement colo restore
   COLO save: reuse migration bitmap under colo checkpoint
   COLO ram cache: implement colo ram cache on slaver
   HACK: trigger checkpoint every 500ms
 
  Makefile.objs                      |   2 +
  arch_init.c                        | 174 +-
  configure                          |  14 +
  include/exec/cpu-all.h             |   1 +
  include/migration/migration-colo.h |  36 +++
  include/migration/migration.h      |  13 +
  include/qapi/qmp/qerror.h          |   3 +
  migration-colo-comm.c              |  78 +
  migration-colo.c                   | 643 +
  migration.c                        |  45 ++-
  qapi-schema.json                   |   9 +-
  stubs/Makefile.objs                |   1 +
  stubs/migration-colo.c             |  34 ++
  vl.c                               |  12 +
  14 files changed, 1044 insertions(+), 21 deletions(-)
  create mode 100644 include/migration/migration-colo.h
  create mode 100644 migration-colo-comm.c
  create mode 100644 migration-colo.c
  create mode 100644 stubs/migration-colo.c
 
 -- 
 1.9.1
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Fwd: Question and Performance of Intel's APIC-v on Xeon E5 v2

2014-08-01 Thread William Tu

Hi folks,

I recently got an Intel Xeon E5-2609 v2 machine with APIC-v support. I
ran some performance tests under Linux kernel 3.11 and have some
questions about the new APIC-v feature. I would appreciate any comments;
please correct me if I'm wrong.

My understanding is that APIC-v mainly consists of:
1) Virtual interrupt delivery (the same as posted interrupts), which
saves KVM from injecting vAPIC interrupts manually. In other words, it
posts an interrupt to the guest without sending an IPI, which would
cause an external-interrupt exit.
2) EOI virtualization, so that the guest acknowledging an interrupt
incurs no EOI exit (although it sometimes still exits).
3) Virtualized APIC registers, so that reads and writes don't trap into
the hypervisor. However, some APIC writes still trigger a VM exit, but
the exit becomes trap-like instead of fault-like. (I don't know which
APIC writes cause an exit and which do not.)

=== Experiment A Result ===
1. virtio network with vhost, iperf TCP experiments, APIC-v enabled/disabled

[With APIC-v]
Total number of EXIT rate: 4351.1 exits per second
-- VM EXIT Breakdown --
reason              exit/sec  Avg(us)
IO_INSTRUCTION          1428  81.471931
EXCEPTION_NMI             69   7.906276
EXTERNAL_INTERRUPT      1866   7.317781
MSR_WRITE                970   1.504932

[Without APIC-v]
Total number of EXIT rate: 83510.1 exits per second
-- VM EXIT Breakdown --
reason              exit/sec  Avg(us)
IO_INSTRUCTION         18428  81.471931
EXTERNAL_INTERRUPT     31166   7.317781
MSR_WRITE              30970   1.504932

The VM exit rate drops from 83k/sec to 4.3k/sec because:
- the 31166 EXTERNAL_INTERRUPT exits mainly come from vhost sending
IPIs, which APIC-v's posted interrupts avoid.
- the 30970 MSR_WRITE exits come from EOI, which APIC-v's EOI
virtualization avoids.
- however, APIC-v still shows 1866 EXTERNAL_INTERRUPT and 970 MSR_WRITE
exits, which I found are due to the timer. I confirmed this with the
next experiment.
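
As a quick sanity check on the figures above, the relative reduction can be computed as in the following sketch. The helper names are hypothetical and the inputs are simply copied from the two tables; the total works out to roughly 95%.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which an exit rate drops when going from before to after. */
static double exit_reduction_pct(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}

/* Print the reduction for the total and for each reason in Experiment A. */
static void print_experiment_a(void)
{
    printf("total:              %.2f%%\n", exit_reduction_pct(83510.1, 4351.1));
    printf("IO_INSTRUCTION:     %.2f%%\n", exit_reduction_pct(18428.0, 1428.0));
    printf("EXTERNAL_INTERRUPT: %.2f%%\n", exit_reduction_pct(31166.0, 1866.0));
    printf("MSR_WRITE:          %.2f%%\n", exit_reduction_pct(30970.0, 970.0));
}
```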

=== Experiment B Result ===
I ran cyclictest in the VM and measured the VM exit behavior.
cyclictest was configured to generate 1k timer events per second.
With or without APIC-v, I got similar results, as below:
total number of EXIT 156919 rate: 5225.18 exits per second
-- VM EXIT Breakdown --
reason              exit/sec  Avg(us)
IO_INSTRUCTION            18  47.412613
EXTERNAL_INTERRUPT      3085   3.022567
MSR_WRITE               2095   1.330987

I found that APIC-v does not improve timer interrupt delivery. Do
posted interrupts not work for the LAPIC timer? If not, why? The
3085 EXTERNAL_INTERRUPT exits are due to timer expiration interrupts,
and the 2095 MSR_WRITE exits are due to EOI and to programming TMICT
(one of the APIC registers). Does this contradict APIC-v's premise that
APIC writes complete without a VM exit?


Thank you and any comments are welcome

Regards,
William Tu

kvm-unit-tests failures

2014-08-01 Thread Chris J Arges

Hi,

We are planning on running kvm-unit-tests as part of our test suite; but
I've noticed that many tests fail (even running the latest kvm tip).
After searching I found many BZ entries that seem to point at this
master bug for tracking these issues:
https://bugzilla.redhat.com/show_bug.cgi?id=1079979
However, this bug is private; and cannot be viewed by the public.

I'd like to know how to help report issues that we observe with testing
in order to help fix these tests, or understand any progress being made
to fix them already. Is there a public bug that everybody can view to
track these issues? Should I be reporting new bugs with failures in the
unit tests? Where is the appropriate place to file bugs against
kvm-unit-tests and discuss issues?

Thanks,
--chris j arges

[RFC][PATCH] kvm: x86: fix stale mmio cache bug

2014-08-01 Thread David Matlack

The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
up to userspace:

(1) Guest accesses gpa X without a memory slot. The gfn is cached in
struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
the SPTE write-execute-noread so that future accesses cause
EPT_MISCONFIGs.

(2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
covering the page just accessed.

(3) Guest attempts to read or write to gpa X again. On Intel, this
generates an EPT_MISCONFIG. The memory slot generation number that
was incremented in (2) would normally take care of this but we fast
path mmio faults through quickly_check_mmio_pf(), which only checks
the per-vcpu mmio cache. Since we hit the cache, KVM passes a
KVM_EXIT_MMIO up to userspace.

This patch fixes the issue by clearing the mmio cache in the
KVM_MR_CREATE code path.
 - introduce KVM_REQ_CLEAR_MMIO_CACHE for clearing all vcpu mmio
   caches.
 - extend vcpu_clear_mmio_info to clear mmio_gfn in addition to
   mmio_gva, since both can be used to fast path mmio faults.
 - issue KVM_REQ_CLEAR_MMIO_CACHE during memslot creation to flush
   the mmio cache.
 - in mmu_sync_roots, unconditionally clear the mmio cache since
   even direct_map (e.g. tdp) hosts use it.

Signed-off-by: David Matlack <dmatl...@google.com>
---
 arch/x86/kvm/mmu.c   |  3 ++-
 arch/x86/kvm/x86.c   |  5 +
 arch/x86/kvm/x86.h   |  8 +---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c  | 10 +-
 5 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9314678..8d50b84 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3157,13 +3157,14 @@ static void mmu_sync_roots(struct kvm_vcpu *vcpu)
 	int i;
 	struct kvm_mmu_page *sp;
 
+	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
+
 	if (vcpu->arch.mmu.direct_map)
 		return;
 
 	if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
 		return;
 
-	vcpu_clear_mmio_info(vcpu, ~0ul);
 	kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC);
 	if (vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL) {
 		hpa_t root = vcpu->arch.mmu.root_hpa;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ef432f8..05b5629 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6001,6 +6001,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_deliver_pmi(vcpu);
 		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
 			vcpu_scan_ioapic(vcpu);
+
+		if (kvm_check_request(KVM_REQ_CLEAR_MMIO_CACHE, vcpu))
+			vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
@@ -7281,6 +7284,8 @@ void kvm_arch_memslots_updated(struct kvm *kvm)
 	 * mmio generation may have reached its maximum value.
 	 */
 	kvm_mmu_invalidate_mmio_sptes(kvm);
+
+	kvm_make_all_vcpus_request(kvm, KVM_REQ_CLEAR_MMIO_CACHE);
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8c97bac..41ef197 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -81,15 +81,17 @@ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
 }
 
 /*
- * Clear the mmio cache info for the given gva,
- * specially, if gva is ~0ul, we clear all mmio cache info.
+ * Clear the mmio cache info for the given gva. If gva is MMIO_GVA_ANY,
+ * unconditionally clear the mmio cache.
  */
+#define MMIO_GVA_ANY (~0ul)
 static inline void vcpu_clear_mmio_info(struct kvm_vcpu *vcpu, gva_t gva)
 {
-	if (gva != (~0ul) && vcpu->arch.mmio_gva != (gva & PAGE_MASK))
+	if (gva != MMIO_GVA_ANY && vcpu->arch.mmio_gva != (gva & PAGE_MASK))
 		return;
 
 	vcpu->arch.mmio_gva = 0;
+	vcpu->arch.mmio_gfn = 0;
 }
 
 static inline bool vcpu_match_mmio_gva(struct kvm_vcpu *vcpu, unsigned long gva)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec4e3bd..e4edaff 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
 #define KVM_REQ_ENABLE_IBS23
 #define KVM_REQ_DISABLE_IBS   24
+#define KVM_REQ_CLEAR_MMIO_CACHE  25
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID   1
@@ -591,6 +592,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 
+bool kvm_make_all_vcpus_request(struct kvm *kvm, unsigned int req);
 void kvm_flush_remote_tlbs(struct kvm *kvm);
 void kvm_reload_remote_mmus(struct kvm *kvm);
 void kvm_make_mclock_inprogress_request(struct kvm *kvm);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..d09527a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -152,7

Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and exception

2014-08-01 Thread Scott Wood

On Fri, 2014-08-01 at 04:34 -0500, Bhushan Bharat-R65777 wrote:
 On DBSR write emulation, dequeue the debug interrupt even if DBSR_IDE is set.
 
 case SPRN_DBSR:
 
 	vcpu->arch.dbsr &= ~spr_val;
 	if (!(vcpu->arch.dbsr & ~DBSR_IDE))
 		kvmppc_core_dequeue_debug(vcpu);
 	break;
 
 or
 	vcpu->arch.dbsr &= ~(spr_val | DBSR_IDE);
 	if (!vcpu->arch.dbsr)
 		kvmppc_core_dequeue_debug(vcpu);
 	break;

The first option.  I see no reason to have KVM forcibly clear DBSR[IDE].
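
To make the difference concrete, here is a minimal sketch of both variants against a toy vcpu. The struct, the function names and the bit values are assumptions for illustration only; the real definitions live in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only, not taken from the kernel headers. */
#define DBSR_IDE  0x80000000u
#define DBSR_IAC1 0x00800000u

/* Toy stand-in for the vcpu state touched by the emulation. */
struct toy_vcpu {
    uint32_t dbsr;
    bool debug_queued;
};

/* Option 1: clear only what the guest wrote; ignore IDE when dequeuing. */
static void dbsr_write_opt1(struct toy_vcpu *v, uint32_t spr_val)
{
    assert(v != NULL);
    v->dbsr &= ~spr_val;
    if (!(v->dbsr & ~DBSR_IDE))
        v->debug_queued = false;
}

/* Option 2: additionally force IDE clear on every DBSR write. */
static void dbsr_write_opt2(struct toy_vcpu *v, uint32_t spr_val)
{
    assert(v != NULL);
    v->dbsr &= ~(spr_val | DBSR_IDE);
    if (!v->dbsr)
        v->debug_queued = false;
}
```

Both variants dequeue the interrupt once no event bit other than IDE remains. The only difference is that option 2 clears IDE even though the guest never wrote a 1 to it.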

-Scott



Re: [RFC][PATCH] kvm: x86: fix stale mmio cache bug

2014-08-01 Thread Xiao Guangrong


On Aug 2, 2014, at 7:54 AM, David Matlack <dmatl...@google.com> wrote:

 The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
 up to userspace:
 
 (1) Guest accesses gpa X without a memory slot. The gfn is cached in
 struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
 the SPTE write-execute-noread so that future accesses cause
 EPT_MISCONFIGs.
 
 (2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
 covering the page just accessed.
 
 (3) Guest attempts to read or write to gpa X again. On Intel, this
 generates an EPT_MISCONFIG. The memory slot generation number that
 was incremented in (2) would normally take care of this but we fast
 path mmio faults through quickly_check_mmio_pf(), which only checks
 the per-vcpu mmio cache. Since we hit the cache, KVM passes a
 KVM_EXIT_MMIO up to userspace.
 

Good catch, thank you, David!

 This patch fixes the issue by clearing the mmio cache in the
 KVM_MR_CREATE code path.
 - introduce KVM_REQ_CLEAR_MMIO_CACHE for clearing all vcpu mmio
  caches.
 - extend vcpu_clear_mmio_info to clear mmio_gfn in addition to
  mmio_gva, since both can be used to fast path mmio faults.
 - issue KVM_REQ_CLEAR_MMIO_CACHE during memslot creation to flush
  the mmio cache.
 - in mmu_sync_roots, unconditionally clear the mmio cache since
  even direct_map (e.g. tdp) hosts use it.

I would prefer to also cache the SPTE's generation number and then check
that number in quickly_check_mmio_pf().
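
Roughly, the idea is the following. This is a toy sketch only: all names are hypothetical, and it assumes the cached value is the memslot generation that was current when the entry was filled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these are not KVM's real structures. */
struct toy_vm {
    uint64_t memslots_generation;
};

struct toy_vcpu {
    uint64_t mmio_gfn;   /* cached guest frame number */
    uint64_t mmio_gen;   /* generation seen when the entry was cached */
    bool     mmio_valid;
};

/* Remember an MMIO gfn together with the current generation. */
static void cache_mmio(struct toy_vcpu *vcpu, const struct toy_vm *vm,
                       uint64_t gfn)
{
    assert(vcpu != NULL && vm != NULL);
    vcpu->mmio_gfn = gfn;
    vcpu->mmio_gen = vm->memslots_generation;
    vcpu->mmio_valid = true;
}

/* A cache hit requires the gfn to match and the generation to be current. */
static bool match_mmio_gfn(const struct toy_vcpu *vcpu,
                           const struct toy_vm *vm, uint64_t gfn)
{
    return vcpu->mmio_valid &&
           vcpu->mmio_gen == vm->memslots_generation &&
           vcpu->mmio_gfn == gfn;
}

/* Any memslot update bumps the generation, invalidating every cache. */
static void update_memslots(struct toy_vm *vm)
{
    vm->memslots_generation++;
}
```

With this, a stale entry simply stops matching after step (2), so no extra vcpu request is needed to flush the per-vcpu caches.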


[PULL 04/63] KVM: PPC: Book3s PR: Disable AIL mode with OPAL

2014-08-01 Thread Alexander Graf

When we're using PR KVM we must not allow the CPU to take interrupts
in virtual mode, as the SLB does not contain host kernel mappings
when running inside the guest context.

To make sure we get good performance for non-KVM tasks but still
properly functioning PR KVM, let's just disable AIL whenever a vcpu
is scheduled in.

This is fundamentally different from how we deal with AIL on pSeries
type machines where we disable AIL for the whole machine as soon as
a single KVM VM is up.

The reason for that is easy - on pSeries we do not have control over
per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
races and AIL configuration, so setting it per CPU is easier and more
flexible.

This patch fixes running PR KVM on POWER8 bare metal for me.

Signed-off-by: Alexander Graf <ag...@suse.de>
Acked-by: Paul Mackerras <pau...@samba.org>
---
 arch/powerpc/kvm/book3s_pr.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3da412e..8ea7da4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -71,6 +71,12 @@ static void kvmppc_core_vcpu_load_pr(struct kvm_vcpu *vcpu, int cpu)
 	svcpu->in_use = 0;
 	svcpu_put(svcpu);
 #endif
+
+	/* Disable AIL if supported */
+	if (cpu_has_feature(CPU_FTR_HVMODE) &&
+	    cpu_has_feature(CPU_FTR_ARCH_207S))
+		mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~LPCR_AIL);
+
 	vcpu->cpu = smp_processor_id();
 #ifdef CONFIG_PPC_BOOK3S_32
 	current->thread.kvm_shadow_vcpu = vcpu->arch.shadow_vcpu;
@@ -91,6 +97,12 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
 
 	kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
 	kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+
+	/* Enable AIL if supported */
+	if (cpu_has_feature(CPU_FTR_HVMODE) &&
+	    cpu_has_feature(CPU_FTR_ARCH_207S))
+		mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_AIL_3);
+
 	vcpu->cpu = -1;
 }
 
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PULL 19/63] KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE

2014-08-01 Thread Alexander Graf

Some data structures are always stored in big endian. Among those are the LPPACA
fields as well as the shadow slb. These structures might be shared with a
hypervisor.

So whenever we access those fields, make sure we do so in big endian byte order.
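
In C terms, the kind of access this calls for looks roughly like the sketch below. The helper names are hypothetical; the patch itself uses byte-reversing load/store macros in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 32-bit big-endian value, independent of host byte order. */
static uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value in big-endian byte order. */
static void store_be32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Increment a big-endian counter in place, like the yield count update. */
static void inc_be32(uint8_t *p)
{
    store_be32(p, load_be32(p) + 1);
}
```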

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index e66c1e38..bf5270e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -32,10 +32,6 @@
 
 #define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
 
-#ifdef __LITTLE_ENDIAN__
-#error Need to fix lppaca and SLB shadow accesses in little endian mode
-#endif
-
 /* Values in HSTATE_NAPPING(r13) */
 #define NAPPING_CEDE	1
 #define NAPPING_NOVCPU	2
@@ -595,9 +591,10 @@ kvmppc_got_guest:
 	ld	r3, VCPU_VPA(r4)
 	cmpdi	r3, 0
 	beq	25f
-	lwz	r5, LPPACA_YIELDCOUNT(r3)
+	li	r6, LPPACA_YIELDCOUNT
+	LWZX_BE	r5, r3, r6
 	addi	r5, r5, 1
-	stw	r5, LPPACA_YIELDCOUNT(r3)
+	STWX_BE	r5, r3, r6
 	li	r6, 1
 	stb	r6, VCPU_VPA_DIRTY(r4)
 25:
@@ -1442,9 +1439,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 	ld	r8, VCPU_VPA(r9)	/* do they have a VPA? */
 	cmpdi	r8, 0
 	beq	25f
-	lwz	r3, LPPACA_YIELDCOUNT(r8)
+	li	r4, LPPACA_YIELDCOUNT
+	LWZX_BE	r3, r8, r4
 	addi	r3, r3, 1
-	stw	r3, LPPACA_YIELDCOUNT(r8)
+	STWX_BE	r3, r8, r4
 	li	r3, 1
 	stb	r3, VCPU_VPA_DIRTY(r9)
 25:
@@ -1757,8 +1755,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 33:	ld	r8,PACA_SLBSHADOWPTR(r13)
 
 	.rept	SLB_NUM_BOLTED
-	ld	r5,SLBSHADOW_SAVEAREA(r8)
-	ld	r6,SLBSHADOW_SAVEAREA+8(r8)
+	li	r3, SLBSHADOW_SAVEAREA
+	LDX_BE	r5, r8, r3
+	addi	r3, r3, 8
+	LDX_BE	r6, r8, r3
 	andis.	r7,r5,SLB_ESID_V@h
 	beq	1f
 	slbmte	r6,r5
-- 
1.8.1.4


[PULL 28/63] KVM: PPC: Book3S: Make magic page properly 4k mappable

2014-08-01 Thread Alexander Graf

The magic page is defined as a 4k page of per-vCPU data that is shared
between the guest and the host to accelerate accesses to privileged
registers.

However, when the host is using 64k page size granularity we weren't quite
as strict about that rule anymore. Instead, we partially treated all of the
upper 64k as magic page and mapped only the uppermost 4k with the actual
magic contents.

This works well enough for Linux which doesn't use any memory in kernel
space in the upper 64k, but Mac OS X got upset. So this patch makes magic
page actually stay in a 4k range even on 64k page size hosts.

This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE
hosts for me.
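
The effect of the granularity change can be sketched as follows. This is a simplified illustration that assumes 64k host pages and a 4k-aligned magic page address, and that leaves out the KVM_PAM masking; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HOST_PAGE_SHIFT 16 /* assumed: 64k host pages */
#define HOST_PAGE_MASK  (~((1ULL << HOST_PAGE_SHIFT) - 1))

/* Old check: host page granularity, so the whole 64k page matches. */
static bool is_magic_old(uint64_t gpa, uint64_t mp_pa)
{
    uint64_t gfn = gpa >> HOST_PAGE_SHIFT;

    return mp_pa != 0 &&
           (gfn << HOST_PAGE_SHIFT) == (mp_pa & HOST_PAGE_MASK);
}

/* New check: fixed 4k granularity, independent of the host page size. */
static bool is_magic_new(uint64_t gpa, uint64_t mp_pa)
{
    assert((mp_pa & 0xFFFULL) == 0); /* sketch assumes 4k alignment */
    return mp_pa != 0 && (gpa & ~0xFFFULL) == mp_pa;
}
```

With the magic page in the uppermost 4k of a 32-bit address space (0xFFFFF000), the old check also claims 0xFFFF0000, which is ordinary guest memory in the same 64k page, while the new check matches only addresses inside the 4k magic page itself.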

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 +-
 arch/powerpc/kvm/book3s.c | 12 ++--
 arch/powerpc/kvm/book3s_32_mmu_host.c |  7 +++
 arch/powerpc/kvm/book3s_64_mmu_host.c |  5 +++--
 arch/powerpc/kvm/book3s_pr.c  | 13 ++---
 arch/powerpc/kvm/powerpc.c| 19 +++
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index b1cf18d..20fb6f2 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -158,7 +158,7 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
 			   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
-extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing,
+extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
 			bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
 			unsigned long *rmap, long pte_index, int realmode);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 1d13764..31facfc 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -354,18 +354,18 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_core_prepare_to_enter);
 
-pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing,
+pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
 			bool *writable)
 {
-	ulong mp_pa = vcpu->arch.magic_page_pa;
+	ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM;
+	gfn_t gfn = gpa >> PAGE_SHIFT;
 
 	if (!(kvmppc_get_msr(vcpu) & MSR_SF))
 		mp_pa = (uint32_t)mp_pa;
 
 	/* Magic page override */
-	if (unlikely(mp_pa) &&
-	    unlikely(((gfn << PAGE_SHIFT) & KVM_PAM) ==
-		     ((mp_pa & PAGE_MASK) & KVM_PAM))) {
+	gpa &= ~0xFFFULL;
+	if (unlikely(mp_pa) && unlikely((gpa & KVM_PAM) == mp_pa)) {
 		ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
 		pfn_t pfn;
 
@@ -378,7 +378,7 @@ pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing,
 
 	return gfn_to_pfn_prot(vcpu->kvm, gfn, writing, writable);
 }
-EXPORT_SYMBOL_GPL(kvmppc_gfn_to_pfn);
+EXPORT_SYMBOL_GPL(kvmppc_gpa_to_pfn);
 
 static int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr, bool data,
 			bool iswrite, struct kvmppc_pte *pte)
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 678e753..2035d16 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -156,11 +156,10 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
 	bool writable;
 
 	/* Get host physical address for gpa */
-	hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte->raddr >> PAGE_SHIFT,
-				   iswrite, &writable);
+	hpaddr = kvmppc_gpa_to_pfn(vcpu, orig_pte->raddr, iswrite, &writable);
 	if (is_error_noslot_pfn(hpaddr)) {
-		printk(KERN_INFO "Couldn't get guest page for gfn %lx!\n",
-				 orig_pte->eaddr);
+		printk(KERN_INFO "Couldn't get guest page for gpa %lx!\n",
+				 orig_pte->raddr);
 		r = -EINVAL;
 		goto out;
 	}
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 0ac9839..b982d92 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -104,9 +104,10 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
 	smp_rmb();
 
 	/* Get host physical address for gpa */
-	pfn = kvmppc_gfn_to_pfn(vcpu, gfn, iswrite, &writable);
+	pfn = kvmppc_gpa_to_pfn(vcpu, orig_pte->raddr, iswrite, &writable);
 	if (is_error_noslot_pfn(pfn)) {
-   printk(KERN_INFO Couldn't get guest page for gfn

[PULL 07/63] KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue

2014-08-01 Thread Alexander Graf

From: Anton Blanchard <an...@samba.org>

To establish addressability quickly, ABIv2 requires the target
address of the function being called to be in r12.

Signed-off-by: Anton Blanchard <an...@samba.org>
Signed-off-by: Alexander Graf <ag...@suse.de>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..da1cac5 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1913,8 +1913,8 @@ hcall_try_real_mode:
 	lwax	r3,r3,r4
 	cmpwi	r3,0
 	beq	guest_exit_cont
-	add	r3,r3,r4
-	mtctr	r3
+	add	r12,r3,r4
+	mtctr	r12
 	mr	r3,r9		/* get vcpu pointer */
 	ld	r4,VCPU_GPR(R4)(r9)
 	bctrl
-- 
1.8.1.4
