Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-17 Thread Waiman Long

On 09/16/2015 11:01 AM, Peter Zijlstra wrote:

On Tue, Sep 15, 2015 at 11:29:14AM -0400, Waiman Long wrote:


Only the queue head vCPU will be in pv_wait_head() spinning to acquire the
lock.

But what will guarantee fwd progress for the lock that is the head?

Suppose CPU0 becomes head and enters the /* claim the lock */ loop.

Then CPU1 comes in, steals it in pv_wait_head(). CPU1 releases, CPU1
re-acquires and _again_ steals in pv_wait_head(), etc..

All the while CPU0 doesn't go anywhere.



That can't happen. For a given lock, there can only be one queue head
spinning on the lock at any instant in time. If CPU0 is the head,
another CPU cannot become head until CPU0 has got the lock and passed the
MCS lock bit to the next one in the queue. As I said in an earlier mail,
the only place where lock stealing can happen is in the
pv_queued_spin_trylock_unfair() function, where I purposely inserted a
cpu_relax() to give an actively spinning queue head CPU a better chance
of getting the lock. Once a CPU enters the queue, it won't try to
acquire the lock until it becomes the head, and there is one and only one
head.


Cheers,
Longman


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-16 Thread Peter Zijlstra
On Tue, Sep 15, 2015 at 11:29:14AM -0400, Waiman Long wrote:

> Only the queue head vCPU will be in pv_wait_head() spinning to acquire the
> lock.

But what will guarantee fwd progress for the lock that is the head?

Suppose CPU0 becomes head and enters the /* claim the lock */ loop.

Then CPU1 comes in, steals it in pv_wait_head(). CPU1 releases, CPU1
re-acquires and _again_ steals in pv_wait_head(), etc..

All the while CPU0 doesn't go anywhere.



Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-15 Thread Waiman Long

On 09/15/2015 04:24 AM, Peter Zijlstra wrote:

On Mon, Sep 14, 2015 at 03:15:20PM -0400, Waiman Long wrote:

On 09/14/2015 10:00 AM, Peter Zijlstra wrote:

On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:

This patch allows one attempt for the lock waiter to steal the lock

   ^^^


when entering the PV slowpath.  This helps to reduce the performance
penalty caused by lock waiter preemption while not having much of
the downsides of a real unfair lock.
@@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)

 	for (;; waitcnt++) {
 		for (loop = SPIN_THRESHOLD; loop; loop--) {
-			if (!READ_ONCE(l->locked))
-				return;
+			/*
+			 * Try to acquire the lock when it is free.
+			 */
+			if (!READ_ONCE(l->locked) &&
+			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
+				goto gotlock;
 			cpu_relax();
 		}


This isn't _once_, this is once per 'wakeup'. And note that interrupts
unrelated to the kick can equally wake the vCPU up.
void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
{
	:
	/*
	 * We touched a (possibly) cold cacheline in the per-cpu queue node;
	 * attempt the trylock once more in the hope someone let go while we
	 * weren't watching.
	 */
	if (queued_spin_trylock(lock))
		goto release;

This is the only place where I consider that lock stealing happens. Again,
I should add a comment in pv_queued_spin_trylock_unfair() to say where it
will be called.

But you're not adding that..

What you did add is a steal in pv_wait_head(), and it's not even once per
pv_wait_head(); it's inside the spin loop (I read it wrong yesterday).

So that makes the entire Changelog complete crap. There isn't _one_
attempt, and there is absolutely no fairness left.


Only the queue head vCPU will be in pv_wait_head() spinning to acquire
the lock. The other vCPUs in the queue will still be spinning on their
MCS nodes. The only competitors for the lock are those vCPUs that have
just entered the slowpath and execute the queued_spin_trylock() function
once before being queued. That is what I mean by each task having only
one chance of stealing the lock. Maybe the following code changes can
make this point clearer.


Cheers,
Longman



--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -59,7 +59,8 @@ struct pv_node {
 /*
  * Allow one unfair trylock when entering the PV slowpath to reduce the
  * performance impact of lock waiter preemption (either explicitly via
- * pv_wait or implicitly via PLE).
+ * pv_wait or implicitly via PLE). This function will be called once when
+ * a lock waiter enters the slowpath before being queued.
  *
  * A little bit of unfairness here can improve performance without many
  * of the downsides of a real unfair lock.
@@ -72,8 +73,8 @@ static inline bool pv_queued_spin_trylock_unfair(struct qspinl

 	if (READ_ONCE(l->locked))
 		return 0;
 	/*
-	 * Wait a bit here to ensure that an actively spinning vCPU has a fair
-	 * chance of getting the lock.
+	 * Wait a bit here to ensure that an actively spinning queue head vCPU
+	 * has a fair chance of getting the lock.
 	 */
 	cpu_relax();

@@ -504,14 +505,23 @@ static int pv_wait_head_and_lock(struct qspinlock *lock,
 	 */
 	WRITE_ONCE(pn->state, vcpu_running);

-	for (loop = SPIN_THRESHOLD; loop; loop--) {
+	loop = SPIN_THRESHOLD;
+	while (loop) {
 		/*
-		 * Try to acquire the lock when it is free.
+		 * Spin until the lock is free.
 		 */
-		if (!READ_ONCE(l->locked) &&
-		   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
+		for (; loop && READ_ONCE(l->locked); loop--)
+			cpu_relax();
+		/*
+		 * Seeing the lock free, this queue head vCPU is the
+		 * rightful next owner of the lock. However, the lock
+		 * may have just been stolen by another task which has
+		 * entered the slowpath, so we need to use an atomic
+		 * operation to make sure that we really get the lock.
+		 * Otherwise, we have to wait again.
+		 */
+		if (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)
 			goto gotlock;
-		cpu_relax();

Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-15 Thread Peter Zijlstra
On Mon, Sep 14, 2015 at 03:15:20PM -0400, Waiman Long wrote:
> On 09/14/2015 10:00 AM, Peter Zijlstra wrote:
> >On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:
> >>This patch allows one attempt for the lock waiter to steal the lock
  ^^^

> >>when entering the PV slowpath.  This helps to reduce the performance
> >>penalty caused by lock waiter preemption while not having much of
> >>the downsides of a real unfair lock.

> >>@@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
> >>
> >>for (;; waitcnt++) {
> >>for (loop = SPIN_THRESHOLD; loop; loop--) {
> >>-   if (!READ_ONCE(l->locked))
> >>-   return;
> >>+   /*
> >>+* Try to acquire the lock when it is free.
> >>+*/
> >>+   if (!READ_ONCE(l->locked) &&
> >>+  (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
> >>+   goto gotlock;
> >>cpu_relax();
> >>}
> >>
> >This isn't _once_, this is once per 'wakeup'. And note that interrupts
> >unrelated to the kick can equally wake the vCPU up.

> > void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> > {
> > :
> > /*
> > * We touched a (possibly) cold cacheline in the per-cpu queue node;
> > * attempt the trylock once more in the hope someone let go while we
> > * weren't watching.
> > */
> >if (queued_spin_trylock(lock))
> >goto release;
> 
> This is the only place where I consider that lock stealing happens. Again,
> I should add a comment in pv_queued_spin_trylock_unfair() to say where it
> will be called.

But you're not adding that..

What you did add is a steal in pv_wait_head(), and it's not even once per
pv_wait_head(); it's inside the spin loop (I read it wrong yesterday).

So that makes the entire Changelog complete crap. There isn't _one_
attempt, and there is absolutely no fairness left.


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Waiman Long

On 09/14/2015 03:15 PM, Waiman Long wrote:

On 09/14/2015 10:00 AM, Peter Zijlstra wrote:

On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:

This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath.  This helps to reduce the performance
penalty caused by lock waiter preemption while not having much of
the downsides of a real unfair lock.
@@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)

 	for (;; waitcnt++) {
 		for (loop = SPIN_THRESHOLD; loop; loop--) {
-			if (!READ_ONCE(l->locked))
-				return;
+			/*
+			 * Try to acquire the lock when it is free.
+			 */
+			if (!READ_ONCE(l->locked) &&
+			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
+				goto gotlock;
 			cpu_relax();
 		}


This isn't _once_, this is once per 'wakeup'. And note that interrupts
unrelated to the kick can equally wake the vCPU up.



Oh! There is a minor bug here in that I shouldn't need to have a second
READ_ONCE() call.


Oh! I misread the diff; the code was OK.

Cheers,
Longman


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Waiman Long

On 09/14/2015 10:04 AM, Peter Zijlstra wrote:

On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:

This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath.  This helps to reduce the performance
penalty caused by lock waiter preemption while not having much of
the downsides of a real unfair lock.



@@ -416,7 +414,8 @@ queue:
 * does not imply a full barrier.
 *
 */

If it really were once, like the Changelog says it is, then you could
have simply added:

if (pv_try_steal_lock(...))
goto release;


My previous mail has clarified where the lock stealing happens. Will add
the necessary comment to the patch.



here, and not wrecked pv_wait_head() like you did. Note that if you do
it like this, you also do not need to play games with the hash, because
you'll never get into that situation.


-	pv_wait_head(lock, node);
+	if (pv_wait_head_and_lock(lock, node, tail))
+		goto release;
	while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
		cpu_relax();



Because we need to use an atomic op to get the lock, we can't use the
native logic to do the acquire. I know it is kind of hacky, but I don't
have a good alternative here.
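
As a rough illustration of that control flow, here is a standalone
user-space sketch (names like wait_head_and_lock(), lockval and
slowpath_tail() are assumptions for the sketch, not the patch itself):
a true return means the lock was already taken with an atomic op, so
the caller bypasses the native acquire sequence.

#include <stdatomic.h>
#include <stdbool.h>

static _Atomic unsigned lockval;	/* 0 == free, 1 == held */

/* Toy stand-in for pv_wait_head_and_lock(). */
static bool wait_head_and_lock(void)
{
	unsigned expected;

	for (;;) {
		/* wait for the lock word to read free */
		while (atomic_load_explicit(&lockval, memory_order_relaxed))
			;	/* cpu_relax()/pv_wait handling omitted */

		/* claim it atomically; retry if a stealer beat us to it */
		expected = 0;
		if (atomic_compare_exchange_strong(&lockval, &expected, 1))
			return true;
	}
}

/* Toy version of the slowpath tail. */
static void slowpath_tail(void)
{
	/* a true return means the lock is already held via the CAS above,
	 * so the native "wait for 0, then set locked" path is skipped */
	if (wait_head_and_lock())
		return;		/* corresponds to "goto release" in the patch */

	/* native acquire path would run here; unreachable in this sketch */
}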


Cheers,
Longman


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Waiman Long

On 09/14/2015 10:00 AM, Peter Zijlstra wrote:

On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:

This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath.  This helps to reduce the performance
penalty caused by lock waiter preemption while not having much of
the downsides of a real unfair lock.
@@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)

 	for (;; waitcnt++) {
 		for (loop = SPIN_THRESHOLD; loop; loop--) {
-			if (!READ_ONCE(l->locked))
-				return;
+			/*
+			 * Try to acquire the lock when it is free.
+			 */
+			if (!READ_ONCE(l->locked) &&
+			   (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
+				goto gotlock;
 			cpu_relax();
 		}


This isn't _once_, this is once per 'wakeup'. And note that interrupts
unrelated to the kick can equally wake the vCPU up.



Oh! There is a minor bug here in that I shouldn't need to have a second
READ_ONCE() call.


As this is the queue head, finding the lock free entitles the vCPU to
own the lock. However, because of lock stealing, I can't just write a 1
to the lock and assume everything is all set. That is why I need to use
cmpxchg() to make sure that the queue head vCPU actually gets the lock
without it being stolen from underneath. I don't count that as lock
stealing, as the queue head is the rightful owner of the lock.
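
To illustrate why a plain store is not enough there, a small standalone
sketch (illustrative names only, not the kernel code): between the head
seeing the lock word as 0 and publishing its claim, an unfair trylock
from a newly arriving waiter may have taken the lock, so the claim
itself has to be a compare-and-swap that can fail and be retried.

#include <stdatomic.h>
#include <stdbool.h>

static _Atomic unsigned char lock_word;	/* 0 == free, 1 == held */

/*
 * BROKEN variant: after seeing the lock word read 0, the head blindly
 * stores 1.  If an unfair trylock slipped in between the load and the
 * store, two CPUs now both think they own the lock.
 */
static void head_claim_broken(void)
{
	while (atomic_load_explicit(&lock_word, memory_order_relaxed))
		;			/* spin until it reads free */
	atomic_store(&lock_word, 1);	/* may stomp on a stealer's claim */
}

/*
 * Variant matching what the patch does: the claim itself is a
 * compare-and-swap, so a lost race is detected and the head simply
 * resumes spinning instead of sharing the lock with the stealer.
 */
static bool head_claim_safe(void)
{
	unsigned char expected = 0;

	while (atomic_load_explicit(&lock_word, memory_order_relaxed))
		;			/* spin until it reads free */
	return atomic_compare_exchange_strong(&lock_word, &expected, 1);
}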


I am sorry; I should have added a comment to clarify that. Will do so
in the next update.


> void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> {
> 	:
> 	/*
> 	 * We touched a (possibly) cold cacheline in the per-cpu queue node;
> 	 * attempt the trylock once more in the hope someone let go while we
> 	 * weren't watching.
> 	 */
> 	if (queued_spin_trylock(lock))
> 		goto release;

This is the only place where I consider that lock stealing happens. Again,
I should add a comment in pv_queued_spin_trylock_unfair() to say where it
will be called.


Cheers,
Longman


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Waiman Long

On 09/14/2015 09:57 AM, Peter Zijlstra wrote:

On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:

+#define queued_spin_trylock(l) pv_queued_spin_trylock_unfair(l)
+static inline bool pv_queued_spin_trylock_unfair(struct qspinlock *lock)
+{
+   struct __qspinlock *l = (void *)lock;
+
+   if (READ_ONCE(l->locked))
+   return 0;
+   /*
+* Wait a bit here to ensure that an actively spinning vCPU has a fair
+* chance of getting the lock.
+*/
+   cpu_relax();
+
+   return cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0;
+}
+static inline int pvstat_trylock_unfair(struct qspinlock *lock)
+{
+   int ret = pv_queued_spin_trylock_unfair(lock);
+
+   if (ret)
+   pvstat_inc(pvstat_utrylock);
+   return ret;
+}
+#undef  queued_spin_trylock
+#define queued_spin_trylock(l) pvstat_trylock_unfair(l)

These aren't actually ever used...


The pvstat_trylock_unfair() function is within the CONFIG_QUEUED_LOCK_STAT
block. It will only be activated when that config parameter is set.
Otherwise, pv_queued_spin_trylock_unfair() will be used without any counting.

It provides a count of how many unfair trylocks have successfully taken
the lock.
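
Roughly, the selection is just macro remapping. The following is a
paraphrased sketch of the assumed layout (not the literal source): the
unfair trylock replaces queued_spin_trylock(), and with
CONFIG_QUEUED_LOCK_STAT enabled the mapping is redirected once more
through the thin counting wrapper.

#define queued_spin_trylock(l)	pv_queued_spin_trylock_unfair(l)

#ifdef CONFIG_QUEUED_LOCK_STAT
/* same result as the plain trylock, but successful steals are counted */
static inline int pvstat_trylock_unfair(struct qspinlock *lock)
{
	int ret = pv_queued_spin_trylock_unfair(lock);

	if (ret)
		pvstat_inc(pvstat_utrylock);
	return ret;
}

#undef  queued_spin_trylock
#define queued_spin_trylock(l)	pvstat_trylock_unfair(l)
#endif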


Cheers,
Longman


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Peter Zijlstra
On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:
> This patch allows one attempt for the lock waiter to steal the lock
> when entering the PV slowpath.  This helps to reduce the performance
> penalty caused by lock waiter preemption while not having much of
> the downsides of a real unfair lock.

> @@ -415,8 +458,12 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>  
>   for (;; waitcnt++) {
>   for (loop = SPIN_THRESHOLD; loop; loop--) {
> - if (!READ_ONCE(l->locked))
> - return;
> + /*
> +  * Try to acquire the lock when it is free.
> +  */
> + if (!READ_ONCE(l->locked) &&
> +(cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0))
> + goto gotlock;
>   cpu_relax();
>   }
>  

This isn't _once_, this is once per 'wakeup'. And note that interrupts
unrelated to the kick can equally wake the vCPU up.



Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Peter Zijlstra
On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:
> This patch allows one attempt for the lock waiter to steal the lock
> when entering the PV slowpath.  This helps to reduce the performance
> penalty caused by lock waiter preemption while not having much of
> the downsides of a real unfair lock.


> @@ -416,7 +414,8 @@ queue:
>* does not imply a full barrier.
>*
>*/

If it really were once, like the Changelog says it is, then you could
have simply added:

if (pv_try_steal_lock(...))
goto release;

here, and not wrecked pv_wait_head() like you did. Note that if you do
it like this, you also do not need to play games with the hash, because
you'll never get into that situation.

> - pv_wait_head(lock, node);
> + if (pv_wait_head_and_lock(lock, node, tail))
> + goto release;
>   while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
>   cpu_relax();
>  


Re: [PATCH v6 5/6] locking/pvqspinlock: Allow 1 lock stealing attempt

2015-09-14 Thread Peter Zijlstra
On Fri, Sep 11, 2015 at 02:37:37PM -0400, Waiman Long wrote:
> +#define queued_spin_trylock(l)   pv_queued_spin_trylock_unfair(l)
> +static inline bool pv_queued_spin_trylock_unfair(struct qspinlock *lock)
> +{
> + struct __qspinlock *l = (void *)lock;
> +
> + if (READ_ONCE(l->locked))
> + return 0;
> + /*
> +  * Wait a bit here to ensure that an actively spinning vCPU has a fair
> +  * chance of getting the lock.
> +  */
> + cpu_relax();
> +
> + return cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0;
> +}

> +static inline int pvstat_trylock_unfair(struct qspinlock *lock)
> +{
> + int ret = pv_queued_spin_trylock_unfair(lock);
> +
> + if (ret)
> + pvstat_inc(pvstat_utrylock);
> + return ret;
> +}
> +#undef  queued_spin_trylock
> +#define queued_spin_trylock(l)   pvstat_trylock_unfair(l)

These aren't actually ever used...