smp_load_acquire() is enough here and it's cheaper than smp_mb().
Adding a comment about reusing memory barrier of kvm_make_all_cpus_request()
here to keep order between modifications to the page tables and reading mode.

Signed-off-by: Lan Tianyu <tianyu....@intel.com>
---
 virt/kvm/kvm_main.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ec5aa8d..39ebee9a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -191,9 +191,23 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned 
int req)
 #ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
-       long dirty_count = kvm->tlbs_dirty;
+       /*
+        * Read tlbs_dirty before setting KVM_REQ_TLB_FLUSH in
+        * kvm_make_all_cpus_request.
+        */
+       long dirty_count = smp_load_acquire(&kvm->tlbs_dirty);
 
-       smp_mb();
+       /*
+        * We want to publish modifications to the page tables before reading
+        * mode. Pairs with a memory barrier in arch-specific code.
+        * - x86: smp_mb__after_srcu_read_unlock in vcpu_enter_guest
+        * and smp_mb in walk_shadow_page_lockless_begin/end.
+        * - powerpc: smp_mb in kvmppc_prepare_to_enter.
+        *
+        * There is already an smp_mb__after_atomic() before
+        * kvm_make_all_cpus_request() reads vcpu->mode. We reuse that
+        * barrier here.
+        */
        if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
                ++kvm->stat.remote_tlb_flush;
        cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
-- 
1.8.4.rc0.1.g8f6a3e5.dirty

Reply via email to