If two page ready notifications happen back to back, the second one is not
delivered, and the only mechanism we currently have to retry is the
kvm_check_async_pf_completion() check in the vcpu_run() loop. That check is
only performed on the next vmexit, which in some cases may take a while.
With interrupt based page ready notification delivery the situation is even
worse: unlike exceptions, interrupts are not handled immediately, so we
must check whether the slot is empty. This is slow and unnecessary.
Introduce a dedicated MSR_KVM_ASYNC_PF_ACK MSR with which the guest
communicates that the slot is free and the host should check its
notification queue. Mandate using it for interrupt based type 2 APF event
delivery.

Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
---
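The handshake described above can be sketched as a small userspace model.
This is an illustration under stated assumptions, not KVM code: struct
host_model, host_page_ready(), host_check_completion(), model_wrmsr() and
guest_handle_irq() are hypothetical names, and the fixed-size queue stands
in for the host's real completion list. Only MSR_KVM_ASYNC_PF_ACK, its
value and the "write 1 to bit 0 after clearing reason" rule come from this
patch.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_KVM_ASYNC_PF_ACK 0x4b564d07 /* value taken from this patch */
#define APF_PAGE_READY 2                /* type 2 APF event */
#define QUEUE_MAX 8                     /* hypothetical model limit */

/* Hypothetical model: one shared slot plus the host's pending queue. */
struct host_model {
	uint32_t queue[QUEUE_MAX]; /* tokens waiting for a free slot */
	int head, tail;
	uint32_t reason;           /* shared slot: 0 means free */
	uint32_t token;            /* shared slot: token being delivered */
	int irq_pending;           /* models the page ready interrupt */
};

/* Deliver the next queued token, but only if the slot is free. */
static void host_check_completion(struct host_model *h)
{
	if (h->reason != 0 || h->head == h->tail)
		return;
	h->token = h->queue[h->head++];
	h->reason = APF_PAGE_READY;
	h->irq_pending = 1;
}

/* A page became available: queue its token and try to deliver. */
static void host_page_ready(struct host_model *h, uint32_t token)
{
	assert(h->tail < QUEUE_MAX);
	h->queue[h->tail++] = token;
	host_check_completion(h);
}

/* Model of the MSR write path: bit 0 set triggers a queue re-scan. */
static void model_wrmsr(struct host_model *h, uint32_t msr, uint64_t data)
{
	if (msr == MSR_KVM_ASYNC_PF_ACK && (data & 0x1))
		host_check_completion(h);
}

/* Guest side: consume the slot, clear reason, then acknowledge. */
static uint32_t guest_handle_irq(struct host_model *h)
{
	uint32_t token = h->token;

	h->irq_pending = 0;
	h->reason = 0; /* must be cleared before the ACK write */
	model_wrmsr(h, MSR_KVM_ASYNC_PF_ACK, 1);
	return token;
}
```

In this model the order inside guest_handle_irq() matters: if the ACK were
written before 'reason' is cleared, host_check_completion() would still see
a busy slot and the second notification would stay queued until some later
re-scan.
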
 Documentation/virt/kvm/msr.rst       | 16 +++++++++++++++-
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kvm/x86.c                   |  9 ++++++++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 7433e55f7184..18db3448db06 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -219,6 +219,11 @@ data:
        If during pagefault APF reason is 0 it means that this is regular
        page fault.
 
+       For interrupt based delivery, the guest has to write '1' to
+       MSR_KVM_ASYNC_PF_ACK every time it clears 'reason' in the shared
+       'struct kvm_vcpu_pv_apf_data'. This forces KVM to re-scan its
+       queue and deliver the next pending notification.
+
        During delivery of type 1 APF cr2 contains a token that will
        be used to notify a guest when missing page becomes
        available. When page becomes available type 2 APF is sent with
@@ -340,4 +345,13 @@ data:
 
        To switch to interrupt based delivery of type 2 APF events guests
        are supposed to enable asynchronous page faults and set bit 3 in
-       MSR_KVM_ASYNC_PF_EN first.
+
+MSR_KVM_ASYNC_PF_ACK:
+       0x4b564d07
+
+data:
+       Asynchronous page fault acknowledgment. When the guest is done
+       processing a type 2 APF event and the 'reason' field in 'struct
+       kvm_vcpu_pv_apf_data' is cleared, it is supposed to write '1' to
+       bit 0 of the MSR. This causes the host to re-scan its queue and
+       check if there are more notifications pending.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 1bbb0b7e062f..5c7449980619 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -51,6 +51,7 @@
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL   0x4b564d05
 #define MSR_KVM_ASYNC_PF2      0x4b564d06
+#define MSR_KVM_ASYNC_PF_ACK   0x4b564d07
 
 struct kvm_steal_time {
        __u64 steal;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 861dce1e7cf5..e3b91ac33bfd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1243,7 +1243,7 @@ static const u32 emulated_msrs_all[] = {
        HV_X64_MSR_TSC_EMULATION_STATUS,
 
        MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-       MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF2,
+       MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF2, MSR_KVM_ASYNC_PF_ACK,
 
        MSR_IA32_TSC_ADJUST,
        MSR_IA32_TSCDEADLINE,
@@ -2915,6 +2915,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                if (kvm_pv_enable_async_pf2(vcpu, data))
                        return 1;
                break;
+       case MSR_KVM_ASYNC_PF_ACK:
+               if (data & 0x1)
+                       kvm_check_async_pf_completion(vcpu);
+               break;
        case MSR_KVM_STEAL_TIME:
 
                if (unlikely(!sched_info_on()))
@@ -3194,6 +3198,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
        case MSR_KVM_ASYNC_PF2:
                msr_info->data = vcpu->arch.apf.msr2_val;
                break;
+       case MSR_KVM_ASYNC_PF_ACK:
+               msr_info->data = 0;
+               break;
        case MSR_KVM_STEAL_TIME:
                msr_info->data = vcpu->arch.st.msr_val;
                break;
-- 
2.25.3
