On 07/13/2015 09:48 AM, Peter Zijlstra wrote:
On Sat, Jul 11, 2015 at 04:36:52PM -0400, Waiman Long wrote:
@@ -229,19 +244,42 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
  {
        struct pv_node *pn = (struct pv_node *)node;
        struct __qspinlock *l = (void *)lock;
-       struct qspinlock **lp = NULL;
+       struct qspinlock **lp;
        int loop;

+       /*
+        * Initialize lp to a non-NULL value if it has already been in the
+        * pv_hashed state so that pv_hash() won't be called again.
+        */
+       lp = (READ_ONCE(pn->state) == vcpu_hashed) ? (struct qspinlock **)1
+                                                  : NULL;
        for (;;) {
+               WRITE_ONCE(pn->state, vcpu_running);
                for (loop = SPIN_THRESHOLD; loop; loop--) {
                        if (!READ_ONCE(l->locked))
                                return;
                        cpu_relax();
                }

-               WRITE_ONCE(pn->state, vcpu_halted);
+               /*
+                * Recheck lock value after setting vcpu_hashed state
+                *
+                * [S] state = vcpu_hashed      [S] l->locked = 0
+                *     MB                           MB
+                * [L] l->locked             [L] state == vcpu_hashed
+                *
+                * Matches smp_store_mb() in __pv_queued_spin_unlock()
+                */
+               smp_store_mb(pn->state, vcpu_hashed);
+
+               if (!READ_ONCE(l->locked)) {
+                       WRITE_ONCE(pn->state, vcpu_running);
+                       return;
+               }
+
                if (!lp) { /* ONCE */
                        lp = pv_hash(lock, pn);
+
                        /*
                         * lp must be set before setting _Q_SLOW_VAL
                         *
@@ -305,13 +343,16 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
         * Now that we have a reference to the (likely) blocked pv_node,
         * release the lock.
         */
-       smp_store_release(&l->locked, 0);
+       smp_store_mb(l->locked, 0);

        /*
         * At this point the memory pointed at by lock can be freed/reused,
         * however we can still use the pv_node to kick the CPU.
+        * The other vCPU may not really be halted, but kicking an active
+        * vCPU is harmless other than the additional latency in completing
+        * the unlock.
         */
-       if (READ_ONCE(node->state) == vcpu_halted)
+       if (READ_ONCE(node->state) == vcpu_hashed)
                pv_kick(node->cpu);
  }
I think most of that is not actually required; if we let pv_kick_node()
set vcpu_hashed and avoid writing another value in pv_wait_head(), then
__pv_queued_spin_unlock() has two cases:

  - pv_kick_node() set _SLOW_VAL, which is the same 'thread' and things
    observe program order and we're trivially guaranteed to see
    node->state and the hash state.

I just found out that letting pv_kick_node() wake up vCPUs at locking time can give slightly better performance in some cases. So I am going to keep it, but defer the kicking to unlock time, when we can do multiple kicks. The advantage of kicking at unlock time is that it can be done outside of the critical section. So I am going to keep the current name.
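
For reference, the store/barrier/load pairing described in the patch comment in pv_wait_head() can be sketched in user space. This is only an illustration, not kernel code: it assumes C11 sequentially consistent atomics as a stand-in for smp_store_mb(), and struct demo_lock, demo_waiter_should_halt() and demo_unlock_needs_kick() are hypothetical names that do not exist in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the pv_node states used in the patch. */
enum vcpu_state { vcpu_running = 0, vcpu_hashed };

/* Hypothetical model: one lock byte plus the waiting vCPU's state. */
struct demo_lock {
	atomic_int locked;	/* stands in for l->locked */
	atomic_int state;	/* stands in for pn->state */
};

/*
 * Waiter side:  [S] state = vcpu_hashed;  MB;  [L] l->locked
 * Returns true if the lock still looks held, so the waiter would go on
 * to halt. Otherwise it restores vcpu_running and returns false.
 */
static bool demo_waiter_should_halt(struct demo_lock *d)
{
	/* seq_cst store followed by seq_cst load models smp_store_mb() */
	atomic_store(&d->state, vcpu_hashed);
	if (!atomic_load(&d->locked)) {
		atomic_store(&d->state, vcpu_running);
		return false;
	}
	return true;
}

/*
 * Unlocker side:  [S] l->locked = 0;  MB;  [L] state == vcpu_hashed
 * Returns true if the unlocker would kick the waiting vCPU.
 */
static bool demo_unlock_needs_kick(struct demo_lock *d)
{
	atomic_store(&d->locked, 0);
	return atomic_load(&d->state) == vcpu_hashed;
}
```

With both stores followed by a full barrier, at least one side observes the other's store, so the waiter cannot decide to halt while the unlocker also decides not to kick. The reverse case (a kick sent to a vCPU that ended up not halting) remains possible, which is the harmless extra kick the patch comment mentions.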

Cheers,
Longman