On 07/15/2015 06:01 AM, Peter Zijlstra wrote:
On Tue, Jul 14, 2015 at 10:13:37PM -0400, Waiman Long wrote:
+static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
  {
        struct pv_node *pn = (struct pv_node *)node;
+       struct pv_node *pp = (struct pv_node *)prev;
+       bool wait_early, can_wait_early;
        int loop;

        for (;;) {
-               for (loop = SPIN_THRESHOLD; loop; loop--) {
+               /*
+                * Spin less if the previous vCPU was in the halted state
+                * and it is not the queue head.
+                */
+               can_wait_early = (pn->waithist>  PV_WAITHIST_THRESHOLD);
+               wait_early = can_wait_early&&  !READ_ONCE(prev->locked)&&
+                            (READ_ONCE(pp->state) == vcpu_halted);
+               loop = wait_early ? QNODE_SPIN_THRESHOLD_SHORT
+                                 : QNODE_SPIN_THRESHOLD;
+               for (; loop; loop--, cpu_relax()) {
+                       bool halted;
+
                        if (READ_ONCE(node->locked))
                                return;
-                       cpu_relax();
+
+                       if (!can_wait_early || (loop&  QNODE_SPIN_CHECK_MASK))
+                               continue;
+
+                       /*
+                        * Look for state transition at previous node.
+                        *
+                        * running =>  halted:
+                        *      call pv_wait() now if kick-ahead is enabled
+                        *      or reduce spin threshold to
+                        *      QNODE_SPIN_THRESHOLD_SHORT or less.
+                        * halted =>  running:
+                        *      reset spin threshold to QNODE_SPIN_THRESHOLD
+                        */
+                       halted = (READ_ONCE(pp->state) == vcpu_halted)&&
+                                !READ_ONCE(prev->locked);
+                       if (wait_early == halted)
+                               continue;
+                       wait_early = halted;
+
+                       if (!wait_early)
+                               loop = QNODE_SPIN_THRESHOLD;
+                       else if (pv_kick_ahead)
+                               break;
+                       else if (loop>  QNODE_SPIN_THRESHOLD_SHORT)
+                               loop = QNODE_SPIN_THRESHOLD_SHORT;
                }
+               if (wait_early)
+                       pvstat_inc(pvstat_wait_early);
+
+               /*
+                * A pv_wait while !wait_early has higher weight than when
+                * wait_early is true.
+                */
+               if (pn->waithist<  PV_WAITHIST_MAX)
+                       pn->waithist += wait_early ? 1 : PV_WAIT_INC;
So when you looked at this patch, you didn't go like, OMFG!?

So what was wrong with something like:

static inline int pv_spin_threshold(struct pv_node *prev)
{
        if (READ_ONCE(prev->locked)) /* it can run, wait for it */
                return SPIN_THRESHOLD;

        if (READ_ONCE(prev->state) == vcpu_halted) /* its not running, do not 
wait */
                return 1;

        return SPIN_THRESHOLD;
}

static void pv_wait_head(...)
{
        for (;;) {
                for (loop = pv_spin_threshold(pp); loop; loop--) {
                        if (READ_ONCE(node->locked))
                                return;
                        cpu_relax();
                }

                if (!lp) {
                        ...
                }
                pv_wait(&l->locked, _Q_SLOW_VAL);
        }
}

What part of: "keep it simple" and "gradual complexity" have you still
not grasped?

I confess that I was a bit sloppy in that part of the code. I want to get it out for review ASAP without doing too much fine tuning as I expect at least a few iterations for this patchset. I will certainly change it in the new patch. Anyway, thanks for the great suggestion.

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to