On 07/15/2015 06:01 AM, Peter Zijlstra wrote:
On Tue, Jul 14, 2015 at 10:13:37PM -0400, Waiman Long wrote:
+static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
{
struct pv_node *pn = (struct pv_node *)node;
+ struct pv_node *pp = (struct pv_node *)prev;
+ bool wait_early, can_wait_early;
int loop;
for (;;) {
- for (loop = SPIN_THRESHOLD; loop; loop--) {
+ /*
+ * Spin less if the previous vCPU was in the halted state
+ * and it is not the queue head.
+ */
+ can_wait_early = (pn->waithist> PV_WAITHIST_THRESHOLD);
+ wait_early = can_wait_early&& !READ_ONCE(prev->locked)&&
+ (READ_ONCE(pp->state) == vcpu_halted);
+ loop = wait_early ? QNODE_SPIN_THRESHOLD_SHORT
+ : QNODE_SPIN_THRESHOLD;
+ for (; loop; loop--, cpu_relax()) {
+ bool halted;
+
if (READ_ONCE(node->locked))
return;
- cpu_relax();
+
+ if (!can_wait_early || (loop& QNODE_SPIN_CHECK_MASK))
+ continue;
+
+ /*
+ * Look for state transition at previous node.
+ *
+ * running => halted:
+ * call pv_wait() now if kick-ahead is enabled
+ * or reduce spin threshold to
+ * QNODE_SPIN_THRESHOLD_SHORT or less.
+ * halted => running:
+ * reset spin threshold to QNODE_SPIN_THRESHOLD
+ */
+ halted = (READ_ONCE(pp->state) == vcpu_halted)&&
+ !READ_ONCE(prev->locked);
+ if (wait_early == halted)
+ continue;
+ wait_early = halted;
+
+ if (!wait_early)
+ loop = QNODE_SPIN_THRESHOLD;
+ else if (pv_kick_ahead)
+ break;
+ else if (loop> QNODE_SPIN_THRESHOLD_SHORT)
+ loop = QNODE_SPIN_THRESHOLD_SHORT;
}
+ if (wait_early)
+ pvstat_inc(pvstat_wait_early);
+
+ /*
+ * A pv_wait while !wait_early has higher weight than when
+ * wait_early is true.
+ */
+ if (pn->waithist< PV_WAITHIST_MAX)
+ pn->waithist += wait_early ? 1 : PV_WAIT_INC;
So when you looked at this patch, you didn't go like, OMFG!?
So what was wrong with something like:
static inline int pv_spin_threshold(struct pv_node *prev)
{
if (READ_ONCE(prev->locked)) /* it can run, wait for it */
return SPIN_THRESHOLD;
if (READ_ONCE(prev->state) == vcpu_halted) /* its not running, do not
wait */
return 1;
return SPIN_THRESHOLD;
}
static void pv_wait_head(...)
{
for (;;) {
for (loop = pv_spin_threshold(pp); loop; loop--) {
if (READ_ONCE(node->locked))
return;
cpu_relax();
}
if (!lp) {
...
}
pv_wait(&l->locked, _Q_SLOW_VAL);
}
}
What part of: "keep it simple" and "gradual complexity" have you still
not grasped?
I confess that I was a bit sloppy in that part of the code. I want to
get it out for review ASAP without doing too much fine tuning as I
expect at least a few iterations for this patchset. I will certainly
change it in the new patch. Anyway, thanks for the great suggestion.
Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/