From: Wanpeng Li <wanpen...@tencent.com>

This patch reverts commit 75437bb304b20 (locking/pvqspinlock: Don't wait if 
vCPU is preempted), we found great regression caused by this commit.

Xeon Skylake box, 2 sockets, 40 cores, 80 threads, three VMs, each is 80 vCPUs.
The score of ebizzy -M can reduce from 13000-14000 records/s to 1700-1800 
records/s with this commit.

          Host                       Guest                score

vanilla + w/o kvm optimizes     vanilla               1700-1800 records/s
vanilla + w/o kvm optimizes     vanilla + revert      13000-14000 records/s
vanilla + w/ kvm optimizes      vanilla               4500-5000 records/s
vanilla + w/ kvm optimizes      vanilla + revert      14000-15500 records/s

Exit from aggressive wait-early mechanism can result in yield premature and 
incur extra scheduling latency in over-subscribe scenario.

kvm optimizes:
[1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts)
[2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock waiter 
preemption)

Tested-by: loobin...@tencent.com
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Waiman Long <long...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Radim Krčmář <rkrc...@redhat.com>
Cc: loobin...@tencent.com
Cc: sta...@vger.kernel.org 
Fixes: 75437bb304b20 (locking/pvqspinlock: Don't wait if vCPU is preempted)
Signed-off-by: Wanpeng Li <wanpen...@tencent.com>
---
 kernel/locking/qspinlock_paravirt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 89bab07..e84d21a 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -269,7 +269,7 @@ pv_wait_early(struct pv_node *prev, int loop)
        if ((loop & PV_PREV_CHECK_MASK) != 0)
                return false;
 
-       return READ_ONCE(prev->state) != vcpu_running || 
vcpu_is_preempted(prev->cpu);
+       return READ_ONCE(prev->state) != vcpu_running;
 }
 
 /*
-- 
2.7.4

Reply via email to