Re: [Devel] [PATCH 0/6] backporting async_pf injection functionality

2017-09-20 Thread Roman Kagan
On Wed, Sep 20, 2017 at 05:30:59PM +0300, Denis Plotnikov wrote:
> This patch set:
> 1. Replaces Roman's patch that avoids async_pf injection while in guest
>    mode with the equivalent mainstream patch, for kernel code consistency
> 2. Forces a nested vmexit if the injected #PF is an async_pf
> 3. Lets the guest support delivery of async_pf from guest mode
> 
> Denis Plotnikov (1):
>   Revert "kvm/x86: skip async_pf when in guest mode"
> 
> Wanpeng Li (5):
>   KVM: nVMX: Fix exception injection
>   KVM: async_pf: Add L1 guest async_pf #PF vmexit handler
>   KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
>   KVM: async_pf: avoid async pf injection when in guest mode
>   KVM: async_pf: Let guest support delivery of async_pf from guest mode
> 
>  Documentation/virtual/kvm/msr.txt|  5 ++--
>  arch/x86/include/asm/kvm_emulate.h   |  1 +
>  arch/x86/include/asm/kvm_host.h  |  4 +++
>  arch/x86/include/uapi/asm/kvm_para.h |  1 +
>  arch/x86/kernel/kvm.c|  7 -
>  arch/x86/kvm/mmu.c   | 40 +++--
>  arch/x86/kvm/mmu.h   |  4 +++
>  arch/x86/kvm/svm.c   | 50 +++-
>  arch/x86/kvm/vmx.c   | 36 +-
>  arch/x86/kvm/x86.c   | 20 ++-
>  10 files changed, 109 insertions(+), 59 deletions(-)

Briefly skimmed through the series, and it looks OK to me.

Reviewed-by: Roman Kagan 


[Devel] [PATCH 6/6] KVM: async_pf: Let guest support delivery of async_pf from guest mode

2017-09-20 Thread Denis Plotnikov
From: Wanpeng Li 

Adds another flag bit (bit 2) to MSR_KVM_ASYNC_PF_EN. If bit 2 is 1,
async page faults are delivered to L1 as #PF vmexits; if bit 2 is 0,
kvm_can_do_async_pf returns 0 if in guest mode.

This is similar to what svm.c wanted to do all along, but it is only
enabled for Linux as L1 hypervisor.  Foreign hypervisors must never
receive async page faults as vmexits, because they'd probably be very
confused about that.
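
For illustration, a condensed sketch of the guest-side enable sequence this
patch adds to kvm_guest_cpu_init() (per the arch/x86/kernel/kvm.c hunk below;
the percpu accessor shown is the mainline one and may differ in this kernel):

	u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));

	pa |= KVM_ASYNC_PF_ENABLED;
	/* The new bit is optional on the host side: wrmsr_safe() traps the
	 * #GP an older host raises for the unknown bit, so the guest can
	 * fall back to plain async_pf delivery. */
	if (wrmsr_safe(MSR_KVM_ASYNC_PF_EN,
		       (pa | KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT) & 0xffffffff,
		       pa >> 32) < 0)
		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);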

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
Signed-off-by: Radim Krčmář 
(cherry picked from commit 52a5c155cf79f1f059bffebf4d06d0249573e659)
fix #PSBM-56498
Signed-off-by: Denis Plotnikov 
---
 Documentation/virtual/kvm/msr.txt| 5 +++--
 arch/x86/include/asm/kvm_host.h  | 1 +
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 arch/x86/kernel/kvm.c| 7 ++-
 arch/x86/kvm/mmu.c   | 2 +-
 arch/x86/kvm/vmx.c   | 2 +-
 arch/x86/kvm/x86.c   | 5 +++--
 7 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 2a71c8f..2a3452f 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -166,10 +166,11 @@ MSR_KVM_SYSTEM_TIME: 0x12
 MSR_KVM_ASYNC_PF_EN: 0x4b564d02
data: Bits 63-6 hold 64-byte aligned physical address of a
64 byte memory area which must be in guest RAM and must be
-   zeroed. Bits 5-2 are reserved and should be zero. Bit 0 is 1
+   zeroed. Bits 5-3 are reserved and should be zero. Bit 0 is 1
when asynchronous page faults are enabled on the vcpu 0 when
disabled. Bit 1 is 1 if asynchronous page faults can be injected
-   when vcpu is in cpl == 0.
+   when vcpu is in cpl == 0. Bit 2 is 1 if asynchronous page faults
+   are delivered to L1 as #PF vmexits.
 
First 4 byte of 64 byte memory location will be written to by
the hypervisor at the time of asynchronous page fault (APF)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ac1a7c1..e8056a7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -596,6 +596,7 @@ struct kvm_vcpu_arch {
bool send_user_only;
u32 host_apf_reason;
unsigned long nested_apf_token;
+   bool delivery_as_pf_vmexit;
} apf;
 
/* OSVW MSRs (AMD only) */
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 23f966fe..47bee05 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -65,6 +65,7 @@ struct kvm_clock_pairing {
 
 #define KVM_ASYNC_PF_ENABLED   (1 << 0)
 #define KVM_ASYNC_PF_SEND_ALWAYS   (1 << 1)
+#define KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT (1 << 2)
 
 /* Operations for KVM_HC_MMU_OP */
 #define KVM_MMU_OP_WRITE_PTE            1
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 14d04e7..32d5f5a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -333,7 +333,12 @@ static void kvm_guest_cpu_init(void)
 #ifdef CONFIG_PREEMPT
pa |= KVM_ASYNC_PF_SEND_ALWAYS;
 #endif
-   wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
+   pa |= KVM_ASYNC_PF_ENABLED;
+
+   /* Async page fault support for L1 hypervisor is optional */
+   if (wrmsr_safe(MSR_KVM_ASYNC_PF_EN,
+   (pa | KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT) & 0xffffffff,
+   pa >> 32) < 0)
+   wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
printk(KERN_INFO"KVM setup async PF for cpu %d\n",
   smp_processor_id());
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e199f38..bb15151 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3489,7 +3489,7 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
 kvm_event_needs_reinjection(vcpu)))
return false;
 
-   if (is_guest_mode(vcpu))
+   if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
return false;
 
return kvm_x86_ops->interrupt_allowed(vcpu);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2c14d1b..9bdb7e7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7737,7 +7737,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
if (is_nmi(intr_info))
return false;
else if (is_page_fault(intr_info))
-   return enable_ept;
+   return !vmx->vcpu.arch.apf.host_apf_reason && enable_ept;
else if (is_no_device(intr_info) &&
 !(vmcs12->guest_cr0 & 

[Devel] [PATCH 5/6] KVM: async_pf: avoid async pf injection when in guest mode

2017-09-20 Thread Denis Plotnikov
From: Wanpeng Li 

 INFO: task gnome-terminal-:1734 blocked for more than 120 seconds.
   Not tainted 4.12.0-rc4+ #8
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 gnome-terminal- D0  1734   1015 0x
 Call Trace:
  __schedule+0x3cd/0xb30
  schedule+0x40/0x90
  kvm_async_pf_task_wait+0x1cc/0x270
  ? __vfs_read+0x37/0x150
  ? prepare_to_swait+0x22/0x70
  do_async_page_fault+0x77/0xb0
  ? do_async_page_fault+0x77/0xb0
  async_page_fault+0x28/0x30

This is triggered by running both win7 and win2016 on L1 KVM simultaneously
and then stressing memory on L1; I can observe this hang on L1 when at least
~70% of the swap area on L0 is occupied.

This is because an async_pf that should have been injected into L1 was
injected into L2 instead: the L2 guest starts receiving page faults with a
bogus %cr2 (actually the apf token from the host), and the L1 guest starts
accumulating tasks stuck in D state in kvm_async_pf_task_wait(), since the
matching PAGE_READY async_pfs never arrive.

This patch fixes the hang by doing async_pf only while executing the L1 guest.
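
For clarity, the resulting check, condensed from the mmu.c hunk below:

	bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
	{
		if (unlikely(!lapic_in_kernel(vcpu) ||
			     kvm_event_needs_reinjection(vcpu)))
			return false;

		/* Never deliver async_pf while L2 runs: the token would land
		 * in L2's %cr2 and L1 would never see the PAGE_READY pair. */
		if (is_guest_mode(vcpu))
			return false;

		return kvm_x86_ops->interrupt_allowed(vcpu);
	}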

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: sta...@vger.kernel.org
Signed-off-by: Wanpeng Li 
Signed-off-by: Paolo Bonzini 
(cherry picked from commit 9bc1f09f6fa76fdf31eb7d6a4a4df43574725f93)
fix #PSBM-56498
Signed-off-by: Denis Plotnikov 
---
 arch/x86/kvm/mmu.c | 7 +--
 arch/x86/kvm/mmu.h | 1 +
 arch/x86/kvm/x86.c | 3 +--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 91bc5eb..e199f38 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3483,12 +3483,15 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn)
	return kvm_setup_async_pf(vcpu, gva, kvm_vcpu_gfn_to_hva(vcpu, gfn),
  &arch);
 }
 
-static bool can_do_async_pf(struct kvm_vcpu *vcpu)
+bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
 {
if (unlikely(!lapic_in_kernel(vcpu) ||
 kvm_event_needs_reinjection(vcpu)))
return false;
 
+   if (is_guest_mode(vcpu))
+   return false;
+
return kvm_x86_ops->interrupt_allowed(vcpu);
 }
 
@@ -3504,7 +3507,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
if (!async)
return false; /* *pfn has correct page already */
 
-   if (!prefault && can_do_async_pf(vcpu)) {
+   if (!prefault && kvm_can_do_async_pf(vcpu)) {
trace_kvm_try_async_get_page(gva, gfn);
if (kvm_find_async_pf_gfn(vcpu, gfn)) {
trace_kvm_async_pf_doublefault(gva, gfn);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index bfadd00..b3fb4f3 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -74,6 +74,7 @@ enum {
 int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu);
 void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly);
+bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu);
 int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
u64 fault_address, char *insn, int insn_len,
bool need_unprotect);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34eccf9..b67a745 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8458,8 +8458,7 @@ bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
return true;
else
-   return !kvm_event_needs_reinjection(vcpu) &&
-   kvm_x86_ops->interrupt_allowed(vcpu);
+   return kvm_can_do_async_pf(vcpu);
 }
 
 void kvm_arch_start_assignment(struct kvm *kvm)
-- 
2.7.4



[Devel] [PATCH 3/6] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-09-20 Thread Denis Plotnikov
From: Wanpeng Li 

Add a nested_apf field to vcpu->arch.exception to identify an async page
fault, and construct the expected vm-exit information fields. Force a
nested VM exit from nested_vmx_check_exception() if the injected #PF is an
async page fault.
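
Condensed from the vmx.c hunk below: when the queued #PF is an async page
fault, a nested exit is synthesized directly, with the async_pf token carried
as the exit qualification in place of %cr2:

	if (vcpu->arch.exception.nested_apf) {
		vmcs_write32(VM_EXIT_INTR_ERROR_CODE,
			     vcpu->arch.exception.error_code);
		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
				  PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
				  INTR_INFO_DELIVER_CODE_MASK |
				  INTR_INFO_VALID_MASK,
				  vcpu->arch.apf.nested_apf_token);
		return 1;
	}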

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
Signed-off-by: Radim Krčmář 
(cherry picked from commit adfe20fb48785dd73af3bf91407196eb5403c8cf)
fix #PSBM-56498
Signed-off-by: Denis Plotnikov 
---
 arch/x86/include/asm/kvm_emulate.h |  1 +
 arch/x86/include/asm/kvm_host.h|  2 ++
 arch/x86/kvm/svm.c | 16 ++--
 arch/x86/kvm/vmx.c | 17 ++---
 arch/x86/kvm/x86.c |  9 -
 5 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 19d14ac..ad1689b 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -23,6 +23,7 @@ struct x86_exception {
u16 error_code;
bool nested_page_fault;
u64 address; /* cr2 or nested page fault gpa */
+   u8 async_page_fault;
 };
 
 /*
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 36561f25..ac1a7c1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -495,6 +495,7 @@ struct kvm_vcpu_arch {
bool reinject;
u8 nr;
u32 error_code;
+   u8 nested_apf;
} exception;
 
struct kvm_queued_interrupt {
@@ -594,6 +595,7 @@ struct kvm_vcpu_arch {
u32 id;
bool send_user_only;
u32 host_apf_reason;
+   unsigned long nested_apf_token;
} apf;
 
/* OSVW MSRs (AMD only) */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 05e224a..67939a0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2215,15 +2215,19 @@ static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
	if (!is_guest_mode(&svm->vcpu))
return 0;
 
+   vmexit = nested_svm_intercept(svm);
+   if (vmexit != NESTED_EXIT_DONE)
+   return 0;
+
svm->vmcb->control.exit_code = SVM_EXIT_EXCP_BASE + nr;
svm->vmcb->control.exit_code_hi = 0;
svm->vmcb->control.exit_info_1 = error_code;
-   svm->vmcb->control.exit_info_2 = svm->vcpu.arch.cr2;
-
-   vmexit = nested_svm_intercept(svm);
-   if (vmexit == NESTED_EXIT_DONE)
-   svm->nested.exit_required = true;
+   if (svm->vcpu.arch.exception.nested_apf)
+   svm->vmcb->control.exit_info_2 = svm->vcpu.arch.apf.nested_apf_token;
+   else
+   svm->vmcb->control.exit_info_2 = svm->vcpu.arch.cr2;
 
+   svm->nested.exit_required = true;
return vmexit;
 }
 
@@ -2419,7 +2423,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm)
vmexit = NESTED_EXIT_DONE;
/* async page fault always cause vmexit */
else if ((exit_code == SVM_EXIT_EXCP_BASE + PF_VECTOR) &&
-svm->vcpu.arch.apf.host_apf_reason != 0)
+svm->vcpu.arch.exception.nested_apf != 0)
vmexit = NESTED_EXIT_DONE;
break;
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b2c01f8..2c14d1b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2281,13 +2281,24 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
  */
-static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
+static int nested_vmx_check_exception(struct kvm_vcpu *vcpu)
 {
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+   unsigned int nr = vcpu->arch.exception.nr;
 
-   if (!(vmcs12->exception_bitmap & (1u << nr)))
+   if (!((vmcs12->exception_bitmap & (1u << nr)) ||
+   (nr == PF_VECTOR && vcpu->arch.exception.nested_apf)))
return 0;
 
+   if (vcpu->arch.exception.nested_apf) {
+   vmcs_write32(VM_EXIT_INTR_ERROR_CODE, vcpu->arch.exception.error_code);
+   nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
+   PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
+   INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK,
+   vcpu->arch.apf.nested_apf_token);
+   return 1;
+   }
+
nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
  vmcs_read32(VM_EXIT_INTR_INFO),
  vmcs_readl(EXIT_QUALIFICATION));
@@ -2302,7 +2313,7 @@ static void vmx_queue_exception(struct kvm_vcpu 

[Devel] [PATCH 0/6] backporting async_pf injection functionality

2017-09-20 Thread Denis Plotnikov
This patch set:
1. Replaces Roman's patch that avoids async_pf injection while in guest mode
   with the equivalent mainstream patch, for kernel code consistency
2. Forces a nested vmexit if the injected #PF is an async_pf
3. Lets the guest support delivery of async_pf from guest mode

Denis Plotnikov (1):
  Revert "kvm/x86: skip async_pf when in guest mode"

Wanpeng Li (5):
  KVM: nVMX: Fix exception injection
  KVM: async_pf: Add L1 guest async_pf #PF vmexit handler
  KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
  KVM: async_pf: avoid async pf injection when in guest mode
  KVM: async_pf: Let guest support delivery of async_pf from guest mode

 Documentation/virtual/kvm/msr.txt|  5 ++--
 arch/x86/include/asm/kvm_emulate.h   |  1 +
 arch/x86/include/asm/kvm_host.h  |  4 +++
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kernel/kvm.c|  7 -
 arch/x86/kvm/mmu.c   | 40 +++--
 arch/x86/kvm/mmu.h   |  4 +++
 arch/x86/kvm/svm.c   | 50 +++-
 arch/x86/kvm/vmx.c   | 36 +-
 arch/x86/kvm/x86.c   | 20 ++-
 10 files changed, 109 insertions(+), 59 deletions(-)

-- 
2.7.4



[Devel] [PATCH 2/6] KVM: async_pf: Add L1 guest async_pf #PF vmexit handler

2017-09-20 Thread Denis Plotnikov
From: Wanpeng Li 

This patch adds the L1 guest async page fault #PF vmexit handler, so that
async page faults delivered as #PF vmexits are handled by L1 much like
ordinary async page faults.
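
The core of the new common handler, condensed from the mmu.c hunk below
(tracing, the unprotect path and the irq disable/enable around the wait/wake
calls are omitted here for brevity):

	/* inside kvm_handle_page_fault() */
	switch (vcpu->arch.apf.host_apf_reason) {
	default:				/* an ordinary #PF */
		r = kvm_mmu_page_fault(vcpu, fault_address, error_code,
				       insn, insn_len);
		break;
	case KVM_PV_REASON_PAGE_NOT_PRESENT:	/* host is paging it in */
		vcpu->arch.apf.host_apf_reason = 0;
		kvm_async_pf_task_wait(fault_address);
		break;
	case KVM_PV_REASON_PAGE_READY:		/* the awaited page arrived */
		vcpu->arch.apf.host_apf_reason = 0;
		kvm_async_pf_task_wake(fault_address);
		break;
	}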

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
[Passed insn parameters to kvm_mmu_page_fault().]
Signed-off-by: Radim Krčmář 
(cherry picked from commit 1261bfa326f5e903166498628a1894edce0caabc)
fix #PSBM-56498
Signed-off-by: Denis Plotnikov 
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu.c  | 33 +
 arch/x86/kvm/mmu.h  |  3 +++
 arch/x86/kvm/svm.c  | 36 ++--
 arch/x86/kvm/vmx.c  | 15 ---
 5 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 004aff6..36561f25 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -593,6 +593,7 @@ struct kvm_vcpu_arch {
u64 msr_val;
u32 id;
bool send_user_only;
+   u32 host_apf_reason;
} apf;
 
/* OSVW MSRs (AMD only) */
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6fdc4ef..83584eb 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include "trace.h"
 
 /*
  * When setting this variable to true it enables Two-Dimensional-Paging
@@ -3517,6 +3518,38 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
return false;
 }
 
+int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
+   u64 fault_address, char *insn, int insn_len,
+   bool need_unprotect)
+{
+   int r = 1;
+
+   switch (vcpu->arch.apf.host_apf_reason) {
+   default:
+   trace_kvm_page_fault(fault_address, error_code);
+
+   if (need_unprotect && kvm_event_needs_reinjection(vcpu))
+   kvm_mmu_unprotect_page_virt(vcpu, fault_address);
+   r = kvm_mmu_page_fault(vcpu, fault_address, error_code, insn,
+   insn_len);
+   break;
+   case KVM_PV_REASON_PAGE_NOT_PRESENT:
+   vcpu->arch.apf.host_apf_reason = 0;
+   local_irq_disable();
+   kvm_async_pf_task_wait(fault_address);
+   local_irq_enable();
+   break;
+   case KVM_PV_REASON_PAGE_READY:
+   vcpu->arch.apf.host_apf_reason = 0;
+   local_irq_disable();
+   kvm_async_pf_task_wake(fault_address);
+   local_irq_enable();
+   break;
+   }
+   return r;
+}
+EXPORT_SYMBOL_GPL(kvm_handle_page_fault);
+
 static bool
 check_hugepage_cache_consistency(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
 {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 58fe98a..bfadd00 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -74,6 +74,9 @@ enum {
 int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu);
 void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly);
+int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
+   u64 fault_address, char *insn, int insn_len,
+   bool need_unprotect);
 
 static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
 {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 89689b0..05e224a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -173,7 +173,6 @@ struct vcpu_svm {
 
unsigned int3_injected;
unsigned long int3_rip;
-   u32 apf_reason;
 
/* cached guest cpuid flags for faster access */
bool nrips_enabled  : 1;
@@ -1902,34 +1901,11 @@ static void svm_set_dr7(struct kvm_vcpu *vcpu, unsigned long value)
 static int pf_interception(struct vcpu_svm *svm)
 {
u64 fault_address = svm->vmcb->control.exit_info_2;
-   u32 error_code;
-   int r = 1;
+   u64 error_code = svm->vmcb->control.exit_info_1;
 
-   switch (svm->apf_reason) {
-   default:
-   error_code = svm->vmcb->control.exit_info_1;
-
-   trace_kvm_page_fault(fault_address, error_code);
-   if (!npt_enabled && kvm_event_needs_reinjection(&svm->vcpu))
-   kvm_mmu_unprotect_page_virt(&svm->vcpu, fault_address);
-   r = kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code,
+   return kvm_handle_page_fault(&svm->vcpu, error_code, fault_address,
svm->vmcb->control.insn_bytes,
-   svm->vmcb->control.insn_len);
-   break;
-   case KVM_PV_REASON_PAGE_NOT_PRESENT:
-   svm->apf_reason = 0;
-   

[Devel] [PATCH 4/6] Revert "kvm/x86: skip async_pf when in guest mode"

2017-09-20 Thread Denis Plotnikov
This reverts commit 5173f45a28cdf3d5808e236eab882273a760a363.

The commit will be replaced with the mainstream commit which does
the same:
9bc1f09f6f KVM: async_pf: avoid async pf injection when in guest mode

This is done to keep vzkernel close to the mainstream kernel, with all the
benefits that brings, such as easier backporting of future patches.

fix #PSBM-56498

Signed-off-by: Denis Plotnikov 
---
 arch/x86/kvm/mmu.c | 2 +-
 arch/x86/kvm/x86.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 83584eb..91bc5eb 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3504,7 +3504,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
if (!async)
return false; /* *pfn has correct page already */
 
-   if (!prefault && !is_guest_mode(vcpu) && can_do_async_pf(vcpu)) {
+   if (!prefault && can_do_async_pf(vcpu)) {
trace_kvm_try_async_get_page(gva, gfn);
if (kvm_find_async_pf_gfn(vcpu, gfn)) {
trace_kvm_async_pf_doublefault(gva, gfn);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 66cbb9f..34eccf9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6891,8 +6891,7 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
break;
}
 
-   if (!is_guest_mode(vcpu))
-   kvm_check_async_pf_completion(vcpu);
+   kvm_check_async_pf_completion(vcpu);
 
if (signal_pending(current)) {
r = -EINTR;
-- 
2.7.4



[Devel] [PATCH 1/6] KVM: nVMX: Fix exception injection

2017-09-20 Thread Denis Plotnikov
From: Wanpeng Li 

 WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
 CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G   OE   4.12.0-rc3+ #23
 RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
 Call Trace:
  ? kvm_check_async_pf_completion+0xef/0x120 [kvm]
  ? rcu_read_lock_sched_held+0x79/0x80
  vmx_queue_exception+0x104/0x160 [kvm_intel]
  ? vmx_queue_exception+0x104/0x160 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x240 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x240 [kvm]
  kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? __fget+0xf3/0x210
  do_vfs_ioctl+0xa4/0x700
  ? __fget+0x114/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25

This is triggered occasionally by running both win7 and win2016 in L2, with
EPT disabled on both L1 and L2. It can't be reproduced easily.

Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
that "KVM wants to inject page-faults which it got to the guest. This function
assumes it is called with the exit reason in vmcs02 being a #PF exception".
Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery
to L2) allows checking all exceptions for intercept during delivery to L2.
However, there is currently no guarantee that the exit reason is an exception:
an external interrupt may occur on the host (say, a timer interrupt meant for
the host, which should not be injected into the guest) while an exception is
queued somewhere; nested_vmx_check_exception() is then called, the vmexit
emulation code tries to emulate the "Acknowledge interrupt on exit" behavior,
and the warning is triggered.

Reusing the exit reason from the L2->L0 vmexit is wrong in this case: the
reason must always be EXCEPTION_NMI when injecting an exception into L1 as a
nested vmexit.
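
In other words, the fix pins the synthesized exit reason (one-line diff below):

	/* before: reuses whatever reason the L2->L0 exit had (possibly
	 * EXTERNAL_INTERRUPT), which trips the WARN in nested_vmx_vmexit() */
	nested_vmx_vmexit(vcpu, to_vmx(vcpu)->exit_reason,
			  vmcs_read32(VM_EXIT_INTR_INFO),
			  vmcs_readl(EXIT_QUALIFICATION));

	/* after: injecting an exception into L1 is always an exception exit */
	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
			  vmcs_read32(VM_EXIT_INTR_INFO),
			  vmcs_readl(EXIT_QUALIFICATION));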

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
Fixes: e011c663b9c7 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2")
Signed-off-by: Radim Krčmář 
(cherry picked from commit d4912215d1031e4fb3d1038d2e1857218dba0d0a)
fix #PSBM-56498
Signed-off-by: Denis Plotnikov 
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 09b1851..b52ba18 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2288,7 +2288,7 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
if (!(vmcs12->exception_bitmap & (1u << nr)))
return 0;
 
-   nested_vmx_vmexit(vcpu, to_vmx(vcpu)->exit_reason,
+   nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
  vmcs_read32(VM_EXIT_INTR_INFO),
  vmcs_readl(EXIT_QUALIFICATION));
return 1;
-- 
2.7.4



[Devel] [PATCH] connector: bump skb->users before callback invocation

2017-09-20 Thread Stanislav Kinsburskiy
From: Florian Westphal 

Backport of commit 55285bf09427c5abf43ee1d54e892f352092b1f1.

Dmitry reports a memleak with a syzkaller program. The problem is that the
connector bumps the skb usecount but might not invoke the callback.

So move the skb_get() to where we invoke the callback.
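
The receive path after this patch, condensed from the diff below: the extra
reference is taken only when the callback is actually invoked, so an skb that
fails the sanity checks is no longer leaked:

	/* inside cn_rx_skb() */
	if (skb->len >= NLMSG_HDRLEN) {
		nlh = nlmsg_hdr(skb);
		len = nlmsg_len(nlh);

		if (len < (int)sizeof(struct cn_msg) ||
		    skb->len < nlh->nlmsg_len ||
		    len > CONNECTOR_MAX_MSG_SIZE)
			return;				/* no extra ref taken */

		err = cn_call_callback(skb_get(skb));	/* +1 ref for callback */
		if (err < 0)
			kfree_skb(skb);			/* no callback: drop it */
	}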

https://jira.sw.ru/browse/PSBM-71904

Reported-by: Dmitry Vyukov 
Signed-off-by: Florian Westphal 
Signed-off-by: David S. Miller 
Signed-off-by: Stanislav Kinsburskiy 
---
 drivers/connector/connector.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 752c692..57285ba 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -161,26 +161,21 @@ static int cn_call_callback(struct sk_buff *skb)
  *
  * It checks skb, netlink header and msg sizes, and calls callback helper.
  */
-static void cn_rx_skb(struct sk_buff *__skb)
+static void cn_rx_skb(struct sk_buff *skb)
 {
struct nlmsghdr *nlh;
-   struct sk_buff *skb;
int len, err;
 
-   skb = skb_get(__skb);
-
if (skb->len >= NLMSG_HDRLEN) {
nlh = nlmsg_hdr(skb);
len = nlmsg_len(nlh);
 
if (len < (int)sizeof(struct cn_msg) ||
skb->len < nlh->nlmsg_len ||
-   len > CONNECTOR_MAX_MSG_SIZE) {
-   kfree_skb(skb);
+   len > CONNECTOR_MAX_MSG_SIZE)
return;
-   }
 
-   err = cn_call_callback(skb);
+   err = cn_call_callback(skb_get(skb));
if (err < 0)
kfree_skb(skb);
}
